Posted to mapreduce-user@hadoop.apache.org by himanshu chandola <hi...@yahoo.com> on 2009/11/24 08:48:08 UTC

Maps getting stuck at 100%

Hi,
I use Cloudera's distribution for Hadoop. What I see is that a small fraction of maps get stuck at 100%. They show up as 100% complete but continue running. After a long delay they do finally succeed, but it takes a while, about 10 minutes from the time they first show up as 100%.

We recently reformatted our Hadoop filesystem. Could it be related to that?


Thanks




 Morpheus: Do you believe in fate, Neo?
Neo: No.
Morpheus: Why Not?
Neo: Because I don't like the idea that I'm not in control of my life.



      

Re: Maps getting stuck at 100%

Posted by himanshu chandola <hi...@yahoo.com>.
The data is still the same.

I will check the logs and see if I can find something.

H


Re: Maps getting stuck at 100%

Posted by Rekha Joshi <re...@yahoo-inc.com>.
Even if the code is the same, the behavior can change if the data it processes has changed (e.g., date-related data) or the parameters are different (e.g., sort/spill settings on the map side).
This looks to me like a buffering issue. The detailed logs can show exactly what is happening.

Thanks & Regards,
/R


Re: Maps getting stuck at 100%

Posted by himanshu chandola <hi...@yahoo.com>.
Hi Todd,
It was definitely working fine a week ago and the code hasn't changed much. On my laptop, a pseudo-distributed installation running the same code finishes successive MapReduce iterations quickly enough.

As far as I can see, it is probably due to reformatting the fs, but I can't understand why it would occur this way.

tx

Himanshu


Re: Maps getting stuck at 100%

Posted by Todd Lipcon <to...@cloudera.com>.
Hi Himanshu,

The map progress percentage is calculated based on the input read, rather
than the processing actually done. So, if you're doing a lot of work in your
mapper, or reading ahead of what you've processed, you'll see this behavior
reasonably often. It can also show up in streaming jobs if you are doing a
lot of work per row, since there is more buffering going on between the
counters and your actual mapper work.
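
As a rough illustration of the effect described above, here is a toy model
in Python. It is only a sketch and not Hadoop's actual progress code: the
simulate function, the read_ahead parameter, and the record sizes are all
made up for the example. It tracks "input progress" (bytes read so far) next
to "work progress" (records actually processed) when a reader buffers a few
records ahead of the mapper.

```python
from collections import deque


def simulate(records, read_ahead):
    """Toy model: return (input_progress, work_progress) after each record.

    input_progress is the fraction of bytes read from the input so far;
    work_progress is the fraction of records the "mapper" has finished.
    read_ahead is a hypothetical buffer size, in records.
    """
    total_bytes = sum(len(r) for r in records)
    source = iter(records)
    pending = deque()      # records read but not yet processed
    bytes_read = 0
    processed = 0
    exhausted = False
    history = []

    while True:
        # The reader tops the buffer up before the mapper gets a turn.
        while not exhausted and len(pending) < read_ahead:
            rec = next(source, None)
            if rec is None:
                exhausted = True
            else:
                bytes_read += len(rec)
                pending.append(rec)
        if not pending:
            break
        pending.popleft()  # the mapper processes one buffered record
        processed += 1
        history.append((bytes_read / total_bytes, processed / len(records)))

    return history


history = simulate(["aaaa"] * 10, read_ahead=4)
# Work progress at the first moment input progress reports 100%.
work_at_full_input = next(work for inp, work in history if inp == 1.0)
print(work_at_full_input)
```

In this model the input-based figure reaches 100% while several records are
still waiting in the buffer, which is the same shape as a map task that
shows 100% but keeps running.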

The easiest way to see what the tasks are doing is to drill down to the logs
for an individual task that's stuck at 100%. Adding some logging output to
your program can help. Another trick, if you have the right access, is to
ssh into your tasktracker node and send the SIGQUIT signal to one of your
task pids. This makes the task dump its stack to its stdout log, which you
can then inspect to understand what's going on.
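
For the logging suggestion, a minimal sketch of what this could look like in
a Python streaming mapper follows. The run_mapper function, the "every"
interval, and the placeholder per-record work are assumptions made for the
example. The "reporter:status:" prefix on stderr is the Hadoop Streaming
convention for updating task status; check that your version supports it.
Anything else written to stderr should end up in the task's stderr log.

```python
import sys
import time


def run_mapper(lines, out=sys.stdout, log=sys.stderr, every=1000):
    """Emit (key, 1) pairs and log how many records have been processed.

    `every` is a hypothetical logging interval, in records.
    """
    start = time.time()
    count = 0
    for count, line in enumerate(lines, start=1):
        # Placeholder work: take the first tab-separated field as the key.
        key = line.rstrip("\n").split("\t", 1)[0]
        out.write(f"{key}\t1\n")
        if count % every == 0:
            elapsed = time.time() - start
            # Status line for the framework; it also shows up in the log.
            log.write(
                f"reporter:status:processed {count} records "
                f"in {elapsed:.1f}s\n"
            )
            log.flush()
    log.write(f"mapper done: {count} records\n")
    log.flush()
    return count
```

In a real streaming script you would call run_mapper(sys.stdin) at the
bottom of the file. If the last status line in a stuck task's log shows a
record count well short of the input size, the task is still working through
buffered input rather than hanging.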

Hope that helps
-Todd
