Posted to common-user@hadoop.apache.org by Chris K Wensel <ch...@wensel.net> on 2009/03/12 18:12:19 UTC

Reducers spawned when mapred.reduce.tasks=0

Hey all,

I have some users reporting intermittent spawning of Reducers when  
job.xml shows mapred.reduce.tasks=0, in 0.19.0 and 0.19.1.

This is also confirmed when the jobConf is queried from inside the  
(supposedly ignored) Reducer implementation.

In general this issue would likely go unnoticed, since the default  
reducer is IdentityReducer.

But since the Reducer should be ignored in the Mapper-only case, we  
don't bother to avoid setting a Reducer class, and so the problem comes  
to one's attention rather abruptly.

I'm happy to open a JIRA, but wanted to see if anyone else is  
experiencing this issue.

Note the issue seems to manifest with or without speculative execution.

ckw

--
Chris K Wensel
chris@wensel.net
http://www.cascading.org/
http://www.scaleunlimited.com/


Re: Reducers spawned when mapred.reduce.tasks=0

Posted by Amareshwari Sriramadasu <am...@yahoo-inc.com>.
Instantiation of the Reducer has been moved to the place where reduce() 
is called, in branch 0.19.1. See HADOOP-5002. Hopefully that solves your 
issue with the configure() method.
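
To make the idea concrete, here is a rough sketch of deferring 
instantiation until the reduce path actually runs. This is not Hadoop's 
task code and is not taken from HADOOP-5002; ReducerFactory and 
LazyReduceTask are hypothetical names, and a plain Object stands in for 
the user's Reducer.

```java
// Hypothetical illustration only; not Hadoop's actual task code.
interface ReducerFactory {
    // Stands in for instantiating the user's Reducer and calling configure().
    Object create();
}

class LazyReduceTask {
    private final ReducerFactory factory;
    private Object reducer; // null until the reduce path first needs it

    LazyReduceTask(ReducerFactory factory) {
        this.factory = factory;
    }

    // Cleanup path: does its work without ever touching the reducer.
    void runCleanup() {
        // job/task cleanup work would go here
    }

    // Reduce path: the reducer is created only here, on first use.
    void runReduce() {
        if (reducer == null) {
            reducer = factory.create();
        }
        // calls to the reducer's reduce() would go here
    }

    boolean reducerCreated() {
        return reducer != null;
    }
}
```

With this shape, a task that only runs cleanup never triggers the 
factory, so configure() on the user's class is never reached.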
Thanks
Amareshwari


Re: Reducers spawned when mapred.reduce.tasks=0

Posted by Chris K Wensel <ch...@wensel.net>.
fwiw, we have released a workaround for this issue in Cascading 1.0.5.

http://www.cascading.org/
http://cascading.googlecode.com/files/cascading-1.0.5.tgz

In short, Hadoop 0.19.0 and 0.19.1 instantiate the user's Reducer class  
and subsequently call configure() even when there is no intention to  
use the class (during job/task cleanup tasks).

This can clearly cause havoc for users who use configure() to  
initialize resources used by the reduce() method.

Testing whether jobConf.getNumReduceTasks() is 0 inside the configure()  
method seems to work out well.

branch-0.19 looks like it won't instantiate the Reducer class during  
job/task cleanup tasks, so I expect the fix will make its way into  
future releases.
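
A minimal sketch of that guard, for anyone who wants to apply it in  
their own Reducer. This is not the Cascading code. StubJobConf is a  
hypothetical stand-in for org.apache.hadoop.mapred.JobConf so the  
snippet compiles without Hadoop on the classpath, and GuardedReducer is  
a made-up class; a real one would extend MapReduceBase and implement  
Reducer.

```java
// Hypothetical stand-in for org.apache.hadoop.mapred.JobConf, reduced to the
// one accessor the guard needs.
class StubJobConf {
    private final int numReduceTasks;

    StubJobConf(int numReduceTasks) {
        this.numReduceTasks = numReduceTasks;
    }

    int getNumReduceTasks() {
        return numReduceTasks;
    }
}

// Hypothetical reducer showing only the configure() guard.
class GuardedReducer {
    private Object resource; // stays null when the guard skips setup

    public void configure(StubJobConf job) {
        // Map-only job: reduce() will never be called on this instance,
        // so skip any resource setup.
        if (job.getNumReduceTasks() == 0) {
            return;
        }
        // Placeholder for whatever reduce() actually depends on.
        resource = new Object();
    }

    boolean isInitialized() {
        return resource != null;
    }
}
```

The early return keeps configure() harmless when the class is  
instantiated for a cleanup task in a map-only job, while jobs that do  
have reduce tasks still initialize as before.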

cheers,

ckw



Re: Reducers spawned when mapred.reduce.tasks=0

Posted by Amareshwari Sriramadasu <am...@yahoo-inc.com>.
Are you seeing reducers getting spawned in the web UI? If so, it is a bug.
If not, no reducers are being spawned; it could be a job-setup/job-cleanup 
task that is running on a reduce slot. See HADOOP-3150 and HADOOP-4261.
-Amareshwari


Re: Reducers spawned when mapred.reduce.tasks=0

Posted by Chris K Wensel <ch...@wensel.net>.
I may have found the answer; waiting on confirmation from users.

It turns out 0.19.0 and 0.19.1 instantiate the reducer class when the  
task is actually intended for job/task cleanup.

branch-0.19 looks like it resolves this issue by not instantiating the  
reducer class in this case.

I've got a workaround in the next maintenance release:
http://github.com/cwensel/cascading/tree/wip-1.0.5

ckw
