Posted to common-user@hadoop.apache.org by Lili Wu <li...@gmail.com> on 2008/05/01 00:49:27 UTC

Re: OOM error with large # of map tasks

Hi Devaraj,

We don't have any special configuration on the job conf...

We only allow 3 map tasks and 3 reduce tasks on *one* node at any time, so
we are puzzled why there are 572 job confs on *one* node.  From the heap
dump, we see there are 569 MapTask and 3 ReduceTask objects (which
corresponds to 1138 MapTaskStatus and 6 ReduceTaskStatus objects).

We *think* many map tasks were stuck in the COMMIT_PENDING stage, because in
the heap dump we saw a lot of MapTaskStatus objects in either the
"UNASSIGNED" or "COMMIT_PENDING" state (the runState variable in
MapTaskStatus).  We then looked at another node in the UI just now; for a
given task tracker, under "Non-running tasks", there are at least 200 or 300
COMMIT_PENDING tasks.  It appears they are stuck too.

Thanks a lot for your help!

Lili


On Wed, Apr 30, 2008 at 2:14 PM, Devaraj Das <dd...@yahoo-inc.com> wrote:

> Hi Lili, the jobconf memory consumption seems quite high. Could you please
> let us know if you pass anything in the jobconf of the jobs that you run? I
> think you are seeing the 572 objects because a job is running and the
> TaskInProgress objects for tasks of the running job are kept in memory
> (but I need to double-check this).
> Regarding COMMIT_PENDING, yes, it means that the tasktracker has finished
> executing the task but the jobtracker hasn't committed the output yet. In
> 0.16 all tasks necessarily take the transition
> RUNNING->COMMIT_PENDING->SUCCEEDED. This behavior has been improved in 0.17
> (HADOOP-3140) so that only tasks that generate output go through
> COMMIT_PENDING; a task is marked SUCCEEDED directly if it doesn't generate
> any output in its output path.
>
> Devaraj
>
> > -----Original Message-----
> > From: Lili Wu [mailto:liliwu@gmail.com]
> > Sent: Thursday, May 01, 2008 2:09 AM
> > To: core-user@hadoop.apache.org
> > Cc: samr@ning.com
> > Subject: OOM error with large # of map tasks
> >
> > We are using Hadoop 0.16 and are seeing a consistent problem:
> > out of memory errors when we have a large # of map tasks.
> > The specifics of what is submitted when we reproduce this:
> >
> > three large jobs:
> > 1. 20,000 map tasks and 10 reduce tasks
> > 2. 17,000 map tasks and 10 reduce tasks
> > 3. 10,000 map tasks and 10 reduce tasks
> >
> > These are at normal priority, and periodically we swap the
> > priorities around to get some tasks started by each and let
> > them complete.
> > Other smaller jobs come and go every hour or so (no more
> > than 200 map tasks, 4-10 reducers).
> >
> > Our cluster consists of 23 nodes, and we have 69 map task slots
> > and 69 reduce task slots across the cluster.
> > Eventually, we see consistent OOM errors in the task logs, and
> > the task tracker itself goes down on as many as 14 of our nodes.
> >
> > We examined a heap dump after one of these TaskTracker crashes
> > and found something interesting--there were 572
> > instances of JobConf that
> > accounted for 940 MB of String objects.  It seems quite odd
> > that there are
> > so many instances of JobConf.  It seems to correlate with
> > tasks in the COMMIT_PENDING state as shown on the status page for a
> > task tracker node.  Has anyone observed something like this?
> > Can anyone explain what would cause tasks to remain in this
> > state (which also apparently is kept in memory rather than
> > serialized to disk)?  In general, what does
> > COMMIT_PENDING mean?  (Job
> > done, but output not committed to DFS?)
> >
> > Thanks!
> >
>
>

Re: OOM error with large # of map tasks

Posted by Jason Venner <ja...@attributor.com>.
We have a problem with this in our application: in particular, threads
started by the map/reduce class sometimes block the TaskTracker$Child
process from exiting when the map/reduce is done.
JMX is the number 1 cause of this for us, badly behaving JNI tasks are
#2, and MINA is #3.

We modify the TaskTracker$Child main to call System.exit when done, and this
eliminates a very large share of these OOMs for us. The JNI tasks can still
run the machine out of memory.

(Our JNI tasks have a 2.5 GB working set each - don't ask...)
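
Roughly, the shape of the change is like the sketch below. This is an
illustration only, not the actual Hadoop source; runTask() just stands in
for whatever drives the map/reduce work in the child process:

// Sketch: force the child JVM down once the task body returns, instead
// of waiting for every non-daemon thread (JMX, JNI, MINA) to die.
public class ChildExitSketch {
  public static void main(String[] args) {
    int exitCode = 0;
    try {
      runTask(args);            // placeholder for the real task driver
    } catch (Throwable t) {
      t.printStackTrace();
      exitCode = 1;
    } finally {
      // System.exit() tears the JVM down even if JMX/JNI/MINA threads
      // are still alive, which is the whole point of the workaround.
      System.exit(exitCode);
    }
  }

  private static void runTask(String[] args) {
    // stand-in for the actual map/reduce child logic
  }
}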

-- 
Jason Venner
Attributor - Program the Web <http://www.attributor.com/>
Attributor is hiring Hadoop Wranglers and coding wizards, contact if 
interested

RE: OOM error with large # of map tasks

Posted by Devaraj Das <dd...@yahoo-inc.com>.
Long term, we need to see how we can minimize the memory consumed by the
objects corresponding to completed tasks in the tasktracker.



Re: OOM error with large # of map tasks

Posted by sam rash <ra...@gmail.com>.
Hi,
In fact we verified it is our jobconf--we have about 800 KB of input paths
(11k files for a few TB of data).
We'll indeed up the heap size to about 2048 MB, and we can also do some
significant optimizations on the file paths (use wildcards and others; a
rough sketch is at the end of this message).

Is there any plan to make the storage of the JobConf objects more
memory-efficient?  Perhaps they could be serialized to disk, or, if the
jobconf doesn't change per task (i.e., it's inherited and not changed), why
not keep one per job in a tasktracker?  (Or, if it does change, 'share' the
common parts?)  This would greatly help us: we have 3 jobs of 20k tasks, and
if one gets halfway and we bump another job up, we end up with 1000s of
completed tasks (but no completed jobs) per tasktracker.  Even with our
trimming of our jobconf object and increasing the heap size, we'll hit a
limit pretty quickly.
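
For the path optimization, something along these lines -- a sketch only,
with a made-up directory layout; addInputPath is the jobconf call we'd use
on our version:

// A handful of glob patterns instead of ~11k explicit file paths keeps
// the serialized jobconf small. The paths below are made up.
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

public class InputPathSketch {
  public static void addInputs(JobConf conf) {
    // one glob per month instead of one entry per file
    conf.addInputPath(new Path("/data/events/2008/04/*/part-*"));
    conf.addInputPath(new Path("/data/events/2008/05/*/part-*"));
  }
}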

thx,
-sr

On Thu, May 1, 2008 at 12:58 PM, Devaraj Das <dd...@yahoo-inc.com> wrote:

> Hi Lili, sorry that I missed one important detail in my last response -
> tasks that complete successfully on tasktrackers are marked as
> COMMIT_PENDING by the tasktracker itself. The JobTracker takes those
> COMMIT_PENDING tasks, promotes their output (if applicable), and then
> marks
> them as SUCCEEDED. However, tasktrackers are not notified about these and
> the state of the tasks in the tasktrackers don't change, i.e., they remain
> in COMMIT_PENDING state. In short, COMMIT_PENDING at the tasktracker's end
> doesn't necessarily mean the job is stuck.
>
> The tasktracker keeps in its memory the objects corresponding to tasks it
> runs. Those objects are purged on job completion/failure only. This
> explains
> why you see so many tasks in the COMMIT_PENDING state. I believe it will
> create one jobconf for every task it launches.
>
> I am only concerned about the memory consumption by the jobconf objects.
> As
> per your report, it is ~1.6 MB per jobconf.
>
> You could try things out with an increased heap size for the
> tasktrackers/tasks. You could increase the heap size for the tasktracker
> by
> changing the value of HADOOP_HEAPSIZE in hadoop-env.sh, and the tasks'
> heap
> size can be increased by tweaking the value of mapred.child.java.opts in
> the
> hadoop-site.xml for your job.
>
> > -----Original Message-----
> > From: Lili Wu [mailto:liliwu@gmail.com]
> > Sent: Thursday, May 01, 2008 4:19 AM
> > To: core-user@hadoop.apache.org
> > Subject: Re: OOM error with large # of map tasks
> >
> > Hi Devaraj,
> >
> > We don't have any special configuration on the job conf...
> >
> > We only allow 3 map tasks and 3 reduce tasks in *one* node at
> > any time.  So we are puzzled why there are 572 job confs on
> > *one* node?  From the heap dump, we see there are 569 MapTask
> > and 3 ReduceTask, (and that corresponds to 1138 MapTaskStatus
> > and 6 ReduceTaskStatus.)
> >
> > We *think* many Map tasks were stuck in COMMIT_PENDING stage,
> > because in heap dump, we saw a lot of MapTaskStatus objects
> > being in either "UNASSIGNED" or "COMMIT_PENDING" state (the
> > runState variable in
> > MapTaskStatus).   Then we took a look at another node on UI
> > just now,  for a
> > given task tracker, under "Non-runnign tasks", there are at
> > least 200 or 300 COMMIT_PENDING tasks.  It appears they stuck too.
> >
> > Thanks a lot for your help!
> >
> > Lili
> >
> >
> > On Wed, Apr 30, 2008 at 2:14 PM, Devaraj Das
> > <dd...@yahoo-inc.com> wrote:
> >
> > > Hi Lili, the jobconf memory consumption seems quite high. Could you
> > > please let us know if you pass anything in the jobconf of jobs that
> > > you run? I think you are seeing the 572 objects since a job
> > is running
> > > and the TaskInProgress objects for tasks of the running job
> > are kept
> > > in memory (but I need to double check this).
> > > Regarding COMMIT_PENDING, yes it means that tasktracker has
> > finished
> > > executing the task but the jobtracker hasn't committed the
> > output yet.
> > > In
> > > 0.16 all tasks have to necessarily take the transition from
> > > RUNNING->COMMIT_PENDING->SUCCEEDED. This behavior has been
> > improved in
> > > 0.17
> > > (hadoop-3140) to include only tasks that generate output,
> > i.e., a task
> > > is marked as SUCCEEDED if it doesn't generate any output in
> > its output path.
> > >
> > > Devaraj
> > >
> > > > -----Original Message-----
> > > > From: Lili Wu [mailto:liliwu@gmail.com]
> > > > Sent: Thursday, May 01, 2008 2:09 AM
> > > > To: core-user@hadoop.apache.org
> > > > Cc: samr@ning.com
> > > > Subject: OOM error with large # of map tasks
> > > >
> > > > We are using hadoop 0.16 and are seeing a consistent problem:
> > > >  out of memory errors when we have a large # of map tasks.
> > > > The specifics of what is submitted when we reproduce this:
> > > >
> > > > three large jobs:
> > > > 1. 20,000 map tasks and 10 reduce tasks 2. 17,000 map
> > tasks and 10
> > > > reduce tasks 3. 10,000 map tasks and 10 reduce tasks
> > > >
> > > > these are at normal priority and periodically we swap the
> > priorities
> > > > around to get some tasks started by each and let them complete.
> > > > other smaller jobs come  and go every hour or so (no more
> > than 200
> > > > map tasks, 4-10 reducers).
> > > >
> > > > Our cluster consists of 23 nodes and we have 69 map tasks and
> > > > 69 reduce tasks.
> > > > Eventually, we see consistent oom errors in the task logs and the
> > > > task tracker itself goes down on as many as 14 of our nodes.
> > > >
> > > > We examined a heap dump after one of these crashes of a
> > TaskTracker
> > > > and found something interesting--there were 572 instances of
> > > > JobConf's that
> > > > accounted for 940mb of String objects.   This seems quite odd
> > > > that there are
> > > > so many instances of JobConf.  It seems to correlate with task in
> > > > the COMMIT_PENDING state as shown on the status for a
> > task tracker
> > > > node.  Has anyone observed something like this?
> > > > can anyone explain what would cause tasks to remain in
> > this state?
> > > > (which also apparently is in-memory vs
> > > > serialized to disk...).   In general, what does
> > > > COMMIT_PENDING mean?  (job
> > > > done, but output not committed to dfs?)
> > > >
> > > > Thanks!
> > > >
> > >
> > >
> >
>
>

RE: OOM error with large # of map tasks

Posted by Devaraj Das <dd...@yahoo-inc.com>.
Hi Lili, sorry that I missed one important detail in my last response -
tasks that complete successfully on tasktrackers are marked as
COMMIT_PENDING by the tasktracker itself. The JobTracker takes those
COMMIT_PENDING tasks, promotes their output (if applicable), and then marks
them as SUCCEEDED. However, tasktrackers are not notified about this, and
the state of the tasks on the tasktrackers doesn't change, i.e., they remain
in the COMMIT_PENDING state. In short, COMMIT_PENDING at the tasktracker's
end doesn't necessarily mean the job is stuck.

The tasktracker keeps in memory the objects corresponding to the tasks it
runs. Those objects are purged only on job completion/failure. This explains
why you see so many tasks in the COMMIT_PENDING state. I believe it will
create one jobconf for every task it launches.

I am only concerned about the memory consumption by the jobconf objects. As
per your report, it is ~1.6 MB per jobconf.

You could try things out with an increased heap size for the
tasktrackers/tasks. You can increase the heap size for the tasktracker by
changing the value of HADOOP_HEAPSIZE in hadoop-env.sh, and the tasks' heap
size can be increased by tweaking the value of mapred.child.java.opts in the
hadoop-site.xml for your job.
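
As an illustration only (mapred.child.java.opts is the standard property for
the task JVMs, but the value and the class below are made up for this
example), the per-task heap can be set from your job-submission code:

// Sketch: raise the child JVM heap for one job. The 1024 MB value is
// only an example. The TaskTracker daemon's own heap is separate and is
// controlled by HADOOP_HEAPSIZE in conf/hadoop-env.sh.
import org.apache.hadoop.mapred.JobConf;

public class ChildHeapSketch {
  public static void raiseChildHeap(JobConf conf) {
    conf.set("mapred.child.java.opts", "-Xmx1024m");
  }
}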
