Posted to mapreduce-user@hadoop.apache.org by David Rosenstrauch <da...@darose.net> on 2012/06/08 17:26:42 UTC

Out of memory (heap space) errors on job tracker

Our job tracker has been seizing up with Out of Memory (heap space) 
errors for the past 2 nights.  After the first night's crash, I doubled 
the heap space (from the default of 1GB) to 2GB before restarting the 
job.  After last night's crash I doubled it again to 4GB.

This all seems a bit puzzling to me.  I wouldn't have thought that the 
job tracker should require so much memory.  (The NameNode, yes, but not 
the job tracker.)

Just wondering if this behavior sounds reasonable, or if perhaps there 
might be a bigger problem at play here.  Anyone have any thoughts on the 
matter?

Thanks,

DR

Re: Out of memory (heap space) errors on job tracker

Posted by David Rosenstrauch <da...@darose.net>.
I'll give that a shot, thanks.

DR

On 06/10/2012 01:40 AM, Harsh J wrote:
> Hey David,
>
> Primarily, you'd need to lower
> "mapred.jobtracker.completeuserjobs.maximum" in your mapred-site.xml
> to a value below 25. I recommend 5 if you don't need much retention
> of job info per user. This will keep the JT's live memory usage in
> check and stop the crashes, instead of you having to keep raising
> the heap. There's no "leak", but this config's default of 100 causes
> problems for a JT that runs a lot of jobs per day (from several
> users).
>
> Try it out and let us know!
>
> On Sat, Jun 9, 2012 at 12:37 AM, David Rosenstrauch <da...@darose.net> wrote:
>> We're running 0.20.2 (Cloudera cdh3u4).
>>
>> What configs are you referring to?
>>
>> Thanks,
>>
>> DR
>>
>>
>> On 06/08/2012 02:59 PM, Arun C Murthy wrote:
>>>
>>> This shouldn't be happening at all...
>>>
>>> What version of Hadoop are you running? You may be missing some
>>> configs that protect the JT; those should ensure your hadoop-1.x JT
>>> is very reliable.
>>>
>>> Arun
>>>
>>> On Jun 8, 2012, at 8:26 AM, David Rosenstrauch wrote:
>>>
>>>> Our job tracker has been seizing up with Out of Memory (heap space)
>>>> errors for the past 2 nights.  After the first night's crash, I doubled the
>>>> heap space (from the default of 1GB) to 2GB before restarting the job.
>>>>   After last night's crash I doubled it again to 4GB.
>>>>
>>>> This all seems a bit puzzling to me.  I wouldn't have thought that the
>>>> job tracker should require so much memory.  (The NameNode, yes, but not the
>>>> job tracker.)
>>>>
>>>> Just wondering if this behavior sounds reasonable, or if perhaps there
>>>> might be a bigger problem at play here.  Anyone have any thoughts on the
>>>> matter?
>>>>
>>>> Thanks,
>>>>
>>>> DR
>>>
>>>
>>> --
>>> Arun C. Murthy
>>> Hortonworks Inc.
>>> http://hortonworks.com/
>>>
>>>
>>>
>>
>>
>
>
>



Re: Out of memory (heap space) errors on job tracker

Posted by David Rosenstrauch <da...@darose.net>.
On 06/10/2012 08:39 PM, Arun C Murthy wrote:
> Harsh - I'd be inclined to think it's worse than just setting mapred.jobtracker.completeuserjobs.maximum. The only case that would solve is a single user submitting more than 25 *large* jobs (in terms of tasks) within a single 24-hr window.

That's actually the situation we have: one user submitting multiple 
jobs, each with several thousand map tasks.

I'll give that setting a shot and see if it clears it up.

Thanks,

DR

Re: Out of memory (heap space) errors on job tracker

Posted by Arun C Murthy <ac...@hortonworks.com>.
Harsh - I'd be inclined to think it's worse than just setting mapred.jobtracker.completeuserjobs.maximum. The only case that would solve is a single user submitting more than 25 *large* jobs (in terms of tasks) within a single 24-hr window.

David - I'm guessing you aren't using the CapacityScheduler; it would give you more controls and limits on jobs, etc.

More details here: http://hadoop.apache.org/common/docs/r1.0.3/capacity_scheduler.html

In particular, look at the example config there and let us know if you need help understanding any of it.
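
For illustration, a minimal capacity-scheduler.xml along those lines might look like the sketch below. The property names are recalled from the r1.0.3 docs linked above (treat them as assumptions and verify there), and the values are only placeholders:

<configuration>
  <!-- Sketch only: verify these property names against the r1.0.3
       CapacityScheduler documentation before using. -->
  <property>
    <name>mapred.capacity-scheduler.maximum-system-jobs</name>
    <value>3000</value>
    <description>Cap on concurrently initialized jobs across all
    queues, which bounds the job state the JT keeps in memory.
    </description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.default.capacity</name>
    <value>100</value>
    <description>Percentage of cluster slots given to the default
    queue.</description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.default.maximum-initialized-active-tasks-per-user</name>
    <value>100000</value>
    <description>Limit on initialized active tasks per user in the
    queue, so one user's large jobs can't exhaust the JT heap.
    </description>
  </property>
</configuration>

(You'd also need to enable the scheduler itself by setting mapred.jobtracker.taskScheduler to org.apache.hadoop.mapred.CapacityTaskScheduler in mapred-site.xml, per that page, if I remember correctly.)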

Arun

On Jun 9, 2012, at 10:40 PM, Harsh J wrote:

> Hey David,
> 
> Primarily, you'd need to lower
> "mapred.jobtracker.completeuserjobs.maximum" in your mapred-site.xml
> to a value below 25. I recommend 5 if you don't need much retention
> of job info per user. This will keep the JT's live memory usage in
> check and stop the crashes, instead of you having to keep raising
> the heap. There's no "leak", but this config's default of 100 causes
> problems for a JT that runs a lot of jobs per day (from several
> users).
> 
> Try it out and let us know!
> 
> On Sat, Jun 9, 2012 at 12:37 AM, David Rosenstrauch <da...@darose.net> wrote:
>> We're running 0.20.2 (Cloudera cdh3u4).
>> 
>> What configs are you referring to?
>> 
>> Thanks,
>> 
>> DR
>> 
>> 
>> On 06/08/2012 02:59 PM, Arun C Murthy wrote:
>>> 
>>> This shouldn't be happening at all...
>>> 
>>> What version of Hadoop are you running? You may be missing some
>>> configs that protect the JT; those should ensure your hadoop-1.x JT
>>> is very reliable.
>>> 
>>> Arun
>>> 
>>> On Jun 8, 2012, at 8:26 AM, David Rosenstrauch wrote:
>>> 
>>>> Our job tracker has been seizing up with Out of Memory (heap space)
>>>> errors for the past 2 nights.  After the first night's crash, I doubled the
>>>> heap space (from the default of 1GB) to 2GB before restarting the job.
>>>>  After last night's crash I doubled it again to 4GB.
>>>> 
>>>> This all seems a bit puzzling to me.  I wouldn't have thought that the
>>>> job tracker should require so much memory.  (The NameNode, yes, but not the
>>>> job tracker.)
>>>> 
>>>> Just wondering if this behavior sounds reasonable, or if perhaps there
>>>> might be a bigger problem at play here.  Anyone have any thoughts on the
>>>> matter?
>>>> 
>>>> Thanks,
>>>> 
>>>> DR
>>> 
>>> 
>>> --
>>> Arun C. Murthy
>>> Hortonworks Inc.
>>> http://hortonworks.com/
>>> 
>>> 
>>> 
>> 
>> 
> 
> 
> 
> -- 
> Harsh J

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/



Re: Out of memory (heap space) errors on job tracker

Posted by Harsh J <ha...@cloudera.com>.
Hey David,

Primarily, you'd need to lower
"mapred.jobtracker.completeuserjobs.maximum" in your mapred-site.xml
to a value below 25. I recommend 5 if you don't need much retention
of job info per user. This will keep the JT's live memory usage in
check and stop the crashes, instead of you having to keep raising
the heap. There's no "leak", but this config's default of 100 causes
problems for a JT that runs a lot of jobs per day (from several
users).
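
For reference, the corresponding mapred-site.xml entry would be a single property along these lines (a minimal sketch; the value of 5 is just the suggestion above, and the JT needs a restart to pick it up):

<!-- mapred-site.xml on the JobTracker node -->
<property>
  <name>mapred.jobtracker.completeuserjobs.maximum</name>
  <!-- Retain at most 5 completed jobs per user in JT memory
       (default is 100). -->
  <value>5</value>
</property>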

Try it out and let us know!

On Sat, Jun 9, 2012 at 12:37 AM, David Rosenstrauch <da...@darose.net> wrote:
> We're running 0.20.2 (Cloudera cdh3u4).
>
> What configs are you referring to?
>
> Thanks,
>
> DR
>
>
> On 06/08/2012 02:59 PM, Arun C Murthy wrote:
>>
>> This shouldn't be happening at all...
>>
>> What version of Hadoop are you running? You may be missing some
>> configs that protect the JT; those should ensure your hadoop-1.x JT
>> is very reliable.
>>
>> Arun
>>
>> On Jun 8, 2012, at 8:26 AM, David Rosenstrauch wrote:
>>
>>> Our job tracker has been seizing up with Out of Memory (heap space)
>>> errors for the past 2 nights.  After the first night's crash, I doubled the
>>> heap space (from the default of 1GB) to 2GB before restarting the job.
>>>  After last night's crash I doubled it again to 4GB.
>>>
>>> This all seems a bit puzzling to me.  I wouldn't have thought that the
>>> job tracker should require so much memory.  (The NameNode, yes, but not the
>>> job tracker.)
>>>
>>> Just wondering if this behavior sounds reasonable, or if perhaps there
>>> might be a bigger problem at play here.  Anyone have any thoughts on the
>>> matter?
>>>
>>> Thanks,
>>>
>>> DR
>>
>>
>> --
>> Arun C. Murthy
>> Hortonworks Inc.
>> http://hortonworks.com/
>>
>>
>>
>
>



-- 
Harsh J

Re: Out of memory (heap space) errors on job tracker

Posted by David Rosenstrauch <da...@darose.net>.
We're running 0.20.2 (Cloudera cdh3u4).

What configs are you referring to?

Thanks,

DR

On 06/08/2012 02:59 PM, Arun C Murthy wrote:
> This shouldn't be happening at all...
>
> What version of Hadoop are you running? You may be missing some configs that protect the JT; those should ensure your hadoop-1.x JT is very reliable.
>
> Arun
>
> On Jun 8, 2012, at 8:26 AM, David Rosenstrauch wrote:
>
>> Our job tracker has been seizing up with Out of Memory (heap space) errors for the past 2 nights.  After the first night's crash, I doubled the heap space (from the default of 1GB) to 2GB before restarting the job.  After last night's crash I doubled it again to 4GB.
>>
>> This all seems a bit puzzling to me.  I wouldn't have thought that the job tracker should require so much memory.  (The NameNode, yes, but not the job tracker.)
>>
>> Just wondering if this behavior sounds reasonable, or if perhaps there might be a bigger problem at play here.  Anyone have any thoughts on the matter?
>>
>> Thanks,
>>
>> DR
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>
>



Re: Out of memory (heap space) errors on job tracker

Posted by Arun C Murthy <ac...@hortonworks.com>.
This shouldn't be happening at all...

What version of Hadoop are you running? You may be missing some configs that protect the JT; those should ensure your hadoop-1.x JT is very reliable.

Arun

On Jun 8, 2012, at 8:26 AM, David Rosenstrauch wrote:

> Our job tracker has been seizing up with Out of Memory (heap space) errors for the past 2 nights.  After the first night's crash, I doubled the heap space (from the default of 1GB) to 2GB before restarting the job.  After last night's crash I doubled it again to 4GB.
> 
> This all seems a bit puzzling to me.  I wouldn't have thought that the job tracker should require so much memory.  (The NameNode, yes, but not the job tracker.)
> 
> Just wondering if this behavior sounds reasonable, or if perhaps there might be a bigger problem at play here.  Anyone have any thoughts on the matter?
> 
> Thanks,
> 
> DR

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/