Posted to common-user@hadoop.apache.org by Mathijs Homminga <ma...@knowlogy.nl> on 2007/04/03 11:17:44 UTC

Re-reduce, without re-map

Hi,

We have some troubles with the reduce phase of our job.
Is it possible to re-execute the reduce tasks without the need to do all 
map tasks again?

Thanks!
Mathijs Homminga

Re: Re-reduce, without re-map

Posted by Owen O'Malley <ow...@yahoo-inc.com>.
On Apr 4, 2007, at 2:57 AM, Mathijs Homminga wrote:

>> Your reduce task may fail because there are too many values  
>> associated with
>> some key and it takes more than 10 minutes to process the key.  
>> Please try to
>> let your reduce task explicitly notify the task tracker that "I am  
>> alive" by
>> doing report.setStatus(String) once, for example, every 100 or  
>> 1000 values.

By the way, a fix for this problem just went in yesterday:
https://issues.apache.org/jira/browse/HADOOP-1105

Furthermore, if all you want to do is say "I'm alive", it is better  
to use:
report.progress().
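
[Editor's sketch of the difference between the two calls, against a hypothetical stand-in for the relevant slice of the Reporter interface; the real org.apache.hadoop.mapred.Reporter of this era is assumed to expose both methods, as the posts above describe:]

```java
// Hypothetical stand-in for the relevant slice of Reporter.
interface Reporter {
    void setStatus(String status); // updates the task's status string (also counts as progress)
    void progress();               // a bare keep-alive ping, no status text
}

public class ProgressDemo {
    // Prefer progress() when the status text never changes: it signals
    // liveness without building a new String for every report.
    static void processValues(int n, Reporter reporter) {
        for (int i = 1; i <= n; i++) {
            // ... process value i here ...
            if (i % 1000 == 0) {
                reporter.progress();
            }
        }
    }

    public static void main(String[] args) {
        final int[] pings = {0};
        processValues(3500, new Reporter() {
            public void setStatus(String s) {}
            public void progress() { pings[0]++; }
        });
        System.out.println("pings: " + pings[0]); // 3 keep-alive pings for 3500 values
    }
}
```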

-- Owen

Re: Re-reduce, without re-map

Posted by Mathijs Homminga <ma...@knowlogy.nl>.
Thanks Hairong. I will look into this.

Hairong Kuang wrote:
> Mathijs,
>
> Your reduce task may fail because there are too many values associated with
> some key and it takes more than 10 minutes to process the key. Please try to
> let your reduce task explicitly notify the task tracker that "I am alive" by
> doing report.setStatus(String) once, for example, every 100 or 1000 values. 
>
> Hairong
>
> -----Original Message-----
> From: Mathijs Homminga [mailto:mathijs.homminga@knowlogy.nl] 
> Sent: Tuesday, April 03, 2007 3:27 AM
> To: hadoop-user@lucene.apache.org
> Subject: Re: Re-reduce, without re-map
>
> Each reduce task (Nutch indexing job) gets as far as 66%, and then fails
> with the following error:
>
> "Task failed to report status for 600 seconds. Killing."
>
> In the end, no reduce task completes successfully. 
> Besides solving this issue, I was wondering if I could update the code and
> configuration and start the reduce phase again without the need to redo all
> map tasks (that would save me 2 hours), assuming of course that the output
> of the map tasks has not changed.
>
> Mathijs
>
>
>
> Arun C Murthy wrote:
>   
>> Hi Mathijs,
>>
>> Mathijs Homminga wrote:
>>     
>>> We have some troubles with the reduce phase of our job.
>>> Is it possible to re-execute the reduce tasks without the need to do 
>>> all map tasks again?
>>>
>>>       
>>   The MR framework already does that... you don't have to re-execute
>> the maps for the *failed* reduces. Are you noticing something else?
>>
>>   What are the 'troubles' you allude to? Also, once we get
>> HADOOP-1127 in, you should try turning on 'speculative execution' -
>> that helps when some tasks are very slow w.r.t. other similar tasks.
>>
>> Arun
>>
>>     
>>> Thanks!
>>> Mathijs Homminga
>>>       
>
>
>   


RE: Re-reduce, without re-map

Posted by Hairong Kuang <ha...@yahoo-inc.com>.
Mathijs,

Your reduce task may fail because there are too many values associated with
some key and it takes more than 10 minutes to process the key. Please try to
let your reduce task explicitly notify the task tracker that "I am alive" by
doing report.setStatus(String) once, for example, every 100 or 1000 values. 
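
[Editor's sketch of this keep-alive pattern. The counting logic is shown against a minimal hypothetical stand-in for the Reporter interface so the example is self-contained; in a real job the Reporter would be the one Hadoop passes into reduce():]

```java
// Hypothetical stand-in for org.apache.hadoop.mapred.Reporter
// (the real interface has more methods).
interface Reporter {
    void setStatus(String status);
}

public class KeepAliveReducer {
    // Report status every REPORT_INTERVAL values so the task tracker
    // does not kill the task for being silent too long.
    static final int REPORT_INTERVAL = 1000;

    static int processValues(java.util.Iterator<String> values, Reporter reporter) {
        int count = 0;
        while (values.hasNext()) {
            values.next();          // process the value here
            count++;
            if (count % REPORT_INTERVAL == 0) {
                reporter.setStatus("processed " + count + " values");
            }
        }
        return count;
    }

    public static void main(String[] args) {
        java.util.List<String> vals = java.util.Collections.nCopies(2500, "v");
        final int[] reports = {0};
        int n = processValues(vals.iterator(), s -> reports[0]++);
        System.out.println(n + " values, " + reports[0] + " status reports");
    }
}
```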

Hairong

-----Original Message-----
From: Mathijs Homminga [mailto:mathijs.homminga@knowlogy.nl] 
Sent: Tuesday, April 03, 2007 3:27 AM
To: hadoop-user@lucene.apache.org
Subject: Re: Re-reduce, without re-map

Each reduce task (Nutch indexing job) gets as far as 66%, and then fails
with the following error:

"Task failed to report status for 600 seconds. Killing."

In the end, no reduce task completes successfully. 
Besides solving this issue, I was wondering if I could update the code and
configuration and start the reduce phase again without the need to redo all
map tasks (that would save me 2 hours), assuming of course that the output
of the map tasks has not changed.

Mathijs



Arun C Murthy wrote:
> Hi Mathijs,
>
> Mathijs Homminga wrote:
>>
>> We have some troubles with the reduce phase of our job.
>> Is it possible to re-execute the reduce tasks without the need to do 
>> all map tasks again?
>>
>
>   The MR framework already does that... you don't have to re-execute
> the maps for the *failed* reduces. Are you noticing something else?
>
>   What are the 'troubles' you allude to? Also, once we get
> HADOOP-1127 in, you should try turning on 'speculative execution' -
> that helps when some tasks are very slow w.r.t. other similar tasks.
>
> Arun
>
>> Thanks!
>> Mathijs Homminga
>



Re: Re-reduce, without re-map

Posted by Mathijs Homminga <ma...@knowlogy.nl>.
Each reduce task (Nutch indexing job) gets as far as 66%, and then fails with the following error:

"Task failed to report status for 600 seconds. Killing."

In the end, no reduce task completes successfully. 
Besides solving this issue, I was wondering if I could update the code and configuration and start the reduce phase again without the need to redo all map tasks (that would save me 2 hours), assuming of course that the output of the map tasks has not changed.
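
[Editor's note: the 600 seconds in the error above is the task timeout, which is configurable. A hypothetical hadoop-site.xml override, assuming the mapred.task.timeout property of this era (value in milliseconds), would look like:]

```xml
<property>
  <name>mapred.task.timeout</name>
  <!-- kill a task only after 30 minutes (1800000 ms) without a status report -->
  <value>1800000</value>
</property>
```

Reporting progress from the reducer, as suggested elsewhere in this thread, is the better fix; raising the timeout only masks slow keys.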

Mathijs



Arun C Murthy wrote:
> Hi Mathijs,
>
> Mathijs Homminga wrote:
>>
>> We have some troubles with the reduce phase of our job.
>> Is it possible to re-execute the reduce tasks without the need to do 
>> all map tasks again?
>>
>
>   The MR framework already does that... you don't have to re-execute
> the maps for the *failed* reduces. Are you noticing something else?
>
>   What are the 'troubles' you allude to? Also, once we get
> HADOOP-1127 in, you should try turning on 'speculative execution' -
> that helps when some tasks are very slow w.r.t. other similar tasks.
>
> Arun
>
>> Thanks!
>> Mathijs Homminga
>


Re: Re-reduce, without re-map

Posted by Arun C Murthy <ar...@yahoo-inc.com>.
Hi Mathijs,

Mathijs Homminga wrote:
> 
> We have some troubles with the reduce phase of our job.
> Is it possible to re-execute the reduce tasks without the need to do all 
> map tasks again?
> 

   The MR framework already does that... you don't have to re-execute
the maps for the *failed* reduces. Are you noticing something else?

   What are the 'troubles' you allude to? Also, once we get
HADOOP-1127 in, you should try turning on 'speculative execution' - that
helps when some tasks are very slow w.r.t. other similar tasks.
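
[Editor's note: speculative execution is a site/job configuration switch. A hypothetical hadoop-site.xml entry, assuming the mapred.speculative.execution property of this era, would be:]

```xml
<property>
  <name>mapred.speculative.execution</name>
  <!-- run backup copies of slow tasks and take whichever copy finishes first -->
  <value>true</value>
</property>
```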

Arun

> Thanks!
> Mathijs Homminga