Posted to common-dev@hadoop.apache.org by 심탁길 <10...@nhncorp.com> on 2008/09/26 08:26:37 UTC

Jobtracker is out of memory with 100,000 dummy map tasks job

 
Hi all
 
Recently I ran a job with 100,000 dummy map tasks on a 100-node cluster (each node: 2GB RAM, dual core, 64-bit, Hadoop 0.16.4).
 
Each map task does nothing but sleep for one minute.
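
A minimal sketch of what such a dummy map task might look like against the old org.apache.hadoop.mapred API that 0.16 ships with (the class name and key/value types below are illustrative only, not the exact code):

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Illustrative only: a mapper that ignores its input and just sleeps,
// so the job exercises the JobTracker without doing any real work.
public class SleepMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, NullWritable, NullWritable> {

  public void map(LongWritable key, Text value,
                  OutputCollector<NullWritable, NullWritable> output,
                  Reporter reporter) throws IOException {
    try {
      Thread.sleep(60 * 1000L);   // hold the task slot for one minute
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
    // emit nothing
  }
}

With 100,000 tiny input splits, each task just sleeps for a minute and exits, so the pressure falls on the JobTracker's bookkeeping rather than on the workers.
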
 
I found that the JobTracker (1GB heap) was consuming about 650MB of heap memory when the job was 50% done.

In the end, the job failed at about 90% progress because the JobTracker hung, apparently after running out of memory.

How do you handle this kind of issue?

Another related issue:

While the above job was running, I clicked "Pending" on jobdetails.jsp in the web UI.

The JobTracker then consumed 100% CPU, and the 100% CPU state lasted a couple of minutes.


Re: Jobtracker is out of memory with 100,000 dummy map tasks job

Posted by Arun C Murthy <ac...@yahoo-inc.com>.
On Sep 25, 2008, at 11:26 PM, 심탁길 wrote:

>
> Hi all
>
> Recently I tried 100,000 dummy map tasks job on the 100ea node(2GB,  
> Dual Core, 64Bit machine, Version: 0.16.4) cluster
>
> Map task does nothing but sleeping one minute
>
> I found that Jobtracker(1GB Heap) consumes about 650MB of heap  
> memory when the job is 50% done.
>
> After all, the job failed at the 90% of progress because Jobtracker  
> hanged up(?) due to out of memory.
>

Funny that you bring this up today... I just now concluded that we
need to port https://issues.apache.org/jira/browse/HADOOP-3670 to the
0.17 branch to fix a sluggish JobTracker that was spending all its
time doing GC.

Please upgrade to 0.18 if you can, immediately! *smile*

Arun

> how do you handle this kind of issue?
>
> another related issue:
>
> while the above job was being processed, I clicked on the "Pending"  
> on jobdatails.jsp of web UI
>
> then, Jobtracker consumed 100% of CPU. and 100% CPU status lasted a  
> couple of minutes
>


Re: Jobtracker is out of memory with 100,000 dummy map tasks job

Posted by Amar Kamat <am...@yahoo-inc.com>.
Amar Kamat wrote:
> Ted Dunning wrote:
>> Why do you try to do 100,000 map tasks?  Also, do you mean that you 
>> had 100
>> nodes, each with 2GB?  If so, that is much too small a machine to try 
>> to run
>> 1000 tasks on.  It is much better to run about the same number of 
>> tasks per
>> machine as you have cores (2-3 in your case).   Then you can easily 
>> split
>> your input into 100,000 pieces which will run in sequence.  For most
>> problems, however, it is better to let the system split your data so 
>> that
>> you get a few tens of seconds of work per split.  It is inefficient 
>> to have
>> very short tasks and it is inconvenient to have long-running tasks.
>>
>> On Thu, Sep 25, 2008 at 11:26 PM, 심탁길 <10...@nhncorp.com> wrote:
>>
>>  
>>> Hi all
>>>
>>> Recently I tried 100,000 dummy map tasks job on the 100ea node(2GB, 
>>> Dual
>>> Core, 64Bit machine, Version: 0.16.4) cluster
>>>     
> I assume you are using hadoop-0.16.4. This issue was fixed in
> hadoop-0.17, where the JT was made a bit more efficient at handling
> a large number of fast-finishing maps. See HADOOP-2119 for more
> details.
I meant http://issues.apache.org/jira/browse/HADOOP-2119.
Amar
> Amar
>>> Map task does nothing but sleeping one minute
>>>
>>> I found that Jobtracker(1GB Heap) consumes about 650MB of heap 
>>> memory when
>>> the job is 50% done.
>>>
>>> After all, the job failed at the 90% of progress because Jobtracker 
>>> hanged
>>> up(?) due to out of memory.
>>>
>>> how do you handle this kind of issue?
>>>
>>> another related issue:
>>>
>>> while the above job was being processed, I clicked on the "Pending" on
>>> jobdatails.jsp of web UI
>>>
>>> then, Jobtracker consumed 100% of CPU. and 100% CPU status lasted a 
>>> couple
>>> of minutes
>>>
>>>
>>>     
>>
>>
>>   
>


Re: Jobtracker is out of memory with 100,000 dummy map tasks job

Posted by Amar Kamat <am...@yahoo-inc.com>.
Ted Dunning wrote:
> Why do you try to do 100,000 map tasks?  Also, do you mean that you had 100
> nodes, each with 2GB?  If so, that is much too small a machine to try to run
> 1000 tasks on.  It is much better to run about the same number of tasks per
> machine as you have cores (2-3 in your case).   Then you can easily split
> your input into 100,000 pieces which will run in sequence.  For most
> problems, however, it is better to let the system split your data so that
> you get a few tens of seconds of work per split.  It is inefficient to have
> very short tasks and it is inconvenient to have long-running tasks.
>
> On Thu, Sep 25, 2008 at 11:26 PM, 심탁길 <10...@nhncorp.com> wrote:
>
>   
>> Hi all
>>
>> Recently I tried 100,000 dummy map tasks job on the 100ea node(2GB, Dual
>> Core, 64Bit machine, Version: 0.16.4) cluster
>>     
I assume you are using hadoop-0.16.4. This issue was fixed in
hadoop-0.17, where the JT was made a bit more efficient at handling a
large number of fast-finishing maps. See HADOOP-2119 for more details.
Amar
>> Map task does nothing but sleeping one minute
>>
>> I found that Jobtracker(1GB Heap) consumes about 650MB of heap memory when
>> the job is 50% done.
>>
>> After all, the job failed at the 90% of progress because Jobtracker hanged
>> up(?) due to out of memory.
>>
>> how do you handle this kind of issue?
>>
>> another related issue:
>>
>> while the above job was being processed, I clicked on the "Pending" on
>> jobdatails.jsp of web UI
>>
>> then, Jobtracker consumed 100% of CPU. and 100% CPU status lasted a couple
>> of minutes
>>
>>
>>     
>
>
>   


Re: Jobtracker is out of memory with 100,000 dummy map tasks job

Posted by Ted Dunning <te...@gmail.com>.
Why are you trying to run 100,000 map tasks? Also, do you mean that you had
100 nodes, each with 2GB? If so, that is much too small a machine to try to
run 1,000 tasks on. It is much better to run about the same number of tasks
per machine as you have cores (2-3 in your case). Then you can easily split
your input into 100,000 pieces which will run in sequence. For most
problems, however, it is better to let the system split your data so that
you get a few tens of seconds of work per split. It is inefficient to have
very short tasks and it is inconvenient to have long-running tasks.
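
As a rough sketch of the knobs involved (class name and values below are examples only, using the mapred API of that era): per-node concurrency comes from the TaskTracker setting mapred.tasktracker.map.tasks.maximum, which is usually set close to the core count, while the job itself can hint at its map count and ask for larger splits:

import org.apache.hadoop.mapred.JobConf;

public class TuningSketch {
  public static JobConf tuned() {
    JobConf conf = new JobConf();

    // Hint at the number of map tasks; the framework may still derive a
    // different count from the actual input splits.
    conf.setNumMapTasks(200);

    // Ask for larger splits (value in bytes) so each map runs for tens
    // of seconds rather than finishing almost immediately.
    conf.set("mapred.min.split.size", String.valueOf(128 * 1024 * 1024));

    return conf;
  }
}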

On Thu, Sep 25, 2008 at 11:26 PM, 심탁길 <10...@nhncorp.com> wrote:

>
> Hi all
>
> Recently I tried 100,000 dummy map tasks job on the 100ea node(2GB, Dual
> Core, 64Bit machine, Version: 0.16.4) cluster
>
> Map task does nothing but sleeping one minute
>
> I found that Jobtracker(1GB Heap) consumes about 650MB of heap memory when
> the job is 50% done.
>
> After all, the job failed at the 90% of progress because Jobtracker hanged
> up(?) due to out of memory.
>
> how do you handle this kind of issue?
>
> another related issue:
>
> while the above job was being processed, I clicked on the "Pending" on
> jobdatails.jsp of web UI
>
> then, Jobtracker consumed 100% of CPU. and 100% CPU status lasted a couple
> of minutes
>
>


-- 
ted