Posted to user@hive.apache.org by Robin Verlangen <ro...@us2.nl> on 2012/11/08 11:33:03 UTC

Reduce phase hangs

Hi there,

It seems that some of my jobs hang in the reduce phase for a very long time
(for example, days). Is there anything I could tweak? The query is
pretty simple, like:

SELECT SUM(colA), to_date(colB) AS dt FROM table GROUP BY to_date(colB)
ORDER BY dt ASC;

Best regards,

Robin Verlangen
*Software engineer*
W http://www.robinverlangen.nl
E robin@us2.nl

<http://goo.gl/Lt7BC>

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.

Re: Reduce phase hangs

Posted by Mark Grover <gr...@gmail.com>.
It could be the ORDER BY, since all the output records have to go through a
single reducer to be ordered.

Can you remove the ORDER BY and retry the query?

You can read more about this at
https://cwiki.apache.org/Hive/languagemanual-sortby.html
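
As a rough sketch of that suggestion (not something posted in the thread), the two variants below show the query without the global sort, and then with SORT BY, which orders rows within each reducer only and gives no total order across reducers. The names my_table, total and t are placeholders introduced here for illustration; colA and colB come from the original query.

```sql
-- Placeholder names: my_table, total and t are illustrative only.

-- 1. Same aggregation without the global sort. If this finishes quickly,
--    the single-reducer ORDER BY stage is a likely suspect.
SELECT SUM(colA) AS total, to_date(colB) AS dt
FROM my_table
GROUP BY to_date(colB);

-- 2. If ordered output is still wanted, SORT BY sorts within each reducer.
--    The result is ordered per reducer output file, not globally.
SELECT t.total, t.dt
FROM (
  SELECT SUM(colA) AS total, to_date(colB) AS dt
  FROM my_table
  GROUP BY to_date(colB)
) t
SORT BY t.dt ASC;
```

Because the grouped result has only one row per date, the sort itself is usually cheap here, so a hang that persists without ORDER BY points to a cause outside the query.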

Mark

On Thu, Nov 8, 2012 at 5:23 AM, Robin Verlangen <ro...@us2.nl> wrote:

> Hmm, after looking at the JobTracker and TaskTracker web interfaces, it
> seems that one of the new nodes was unable to connect to two of the others.
> This caused the copying of data to "hang" (in fact: timeout after timeout
> after timeout, ...).

Re: Reduce phase hangs

Posted by Robin Verlangen <ro...@us2.nl>.
Hmm, after looking at the JobTracker and TaskTracker web interfaces, it
seems that one of the new nodes was unable to connect to two of the others.
This caused the copying of data to "hang" (in fact: timeout after timeout
after timeout, ...).

On Thu, Nov 8, 2012 at 2:15 PM, Mohammad Tariq <do...@gmail.com> wrote:

> Maybe you need more heap, or that node's health is not fine. Or maybe the
> data block on that node is causing the code to go into a very long loop.
> There can be several reasons. It's tough to say anything confidently
> without knowing the real issue.

Re: Reduce phase hangs

Posted by Mohammad Tariq <do...@gmail.com>.
Maybe you need more heap, or that node's health is not fine. Or maybe the
data block on that node is causing the code to go into a very long loop.
There can be several reasons. It's tough to say anything confidently
without knowing the real issue.

Regards,
    Mohammad Tariq



On Thu, Nov 8, 2012 at 5:39 PM, Robin Verlangen <ro...@us2.nl> wrote:

> It actually seems that only 1 of the 24 reducers hangs at the "copy"
> phase. Any solutions for this?

Re: Reduce phase hangs

Posted by Robin Verlangen <ro...@us2.nl>.
It actually seems that only 1 of the 24 reducers hangs at the "copy" phase.
Any solutions for this?

On Thu, Nov 8, 2012 at 11:47 AM, Robin Verlangen <ro...@us2.nl> wrote:

> Nothing special over there. Most of the jobs complete after quite some
> time. However, it makes no sense to me that it takes that long. It's probably
> just a couple of hundred megabytes.

Re: Reduce phase hangs

Posted by Robin Verlangen <ro...@us2.nl>.
Nothing special over there. Most of the jobs complete after quite some
time. However, it makes no sense to me that it takes that long. It's probably
just a couple of hundred megabytes.

On Thu, Nov 8, 2012 at 11:43 AM, Mohammad Tariq <do...@gmail.com> wrote:

> Hello Robin,
>
>         Look at the TaskTracker logs and see if you find anything
> interesting there.

Re: Reduce phase hangs

Posted by Mohammad Tariq <do...@gmail.com>.
Hello Robin,

        Look at the TaskTracker logs and see if you find anything
interesting there.

Regards,
    Mohammad Tariq



On Thu, Nov 8, 2012 at 4:03 PM, Robin Verlangen <ro...@us2.nl> wrote:

> Hi there,
>
> It seems that some of my jobs hang in the reduce phase for a very long
> time (for example, days). Is there anything I could tweak? The query is
> pretty simple, like:
>
> SELECT SUM(colA), to_date(colB) AS dt FROM table GROUP BY to_date(colB)
> ORDER BY dt ASC;