Posted to mapreduce-dev@hadoop.apache.org by Maxim Zizin <mz...@gmail.com> on 2011/02/08 17:59:34 UTC

JobTracker memory usage peaks once a day and OOM sometimes

Hi all,

We monitor JT, NN and SNN memory usage and observe the following
behavior in our Hadoop cluster. The JT's heap size is set to 2000m. For
about 18 hours a day it uses ~1GB, but every day, roughly at the minute
it was started, its used memory increases to ~1.5GB and then decreases
back to ~1GB over about 6 hours. Sometimes this takes a bit more than 6
hours, sometimes a bit less. I was wondering whether anyone here knows
what the JT does once a day that makes it use 1.5 times more memory
than normal.

We're so interested in JT memory usage because during the last two
weeks the JT twice ran out of heap space. Both times, right after those
daily used-memory peaks, while usage was going down from 1.5GB to 1GB,
it started increasing again until it got stuck at ~2.2GB. After that
the JT became unresponsive and we had to restart it.

We're using Cloudera's CDH2 version 0.20.1+169.113.

-- 
Regards, Max

Re: JobTracker memory usage peaks once a day and OOM sometimes

Posted by Maxim Zizin <mz...@iponweb.net>.
Arun,

Thanks for the advice. I will definitely consider using that release.

On 2/8/2011 11:56 PM, Arun C Murthy wrote:
> As Allen mentioned, it's really hard for anyone here to help you with 
> CDH. You'll have to ask their user lists.
>
> OTOH, there are significant enhancements we've done to the JobTracker 
> in 0.21 and 0.22... the problem is that there isn't an Apache release 
> yet.
>
> The JT we run at Y! has a *huge* number of fixes for memory and 
> performance (>100% improvement), which we've started putting into a 
> 0.20.100 branch - you might be interested in using that if you want a 
> packaged release: http://people.apache.org/~acmurthy/hadoop-0.20.100-rc0/
>
> Arun
>
> On Feb 8, 2011, at 8:59 AM, Maxim Zizin wrote:
>
>> Hi all,
>>
>> We monitor JT, NN and SNN memory usage and observe the following
>> behavior in our Hadoop cluster. The JT's heap size is set to 2000m. For
>> about 18 hours a day it uses ~1GB, but every day, roughly at the minute
>> it was started, its used memory increases to ~1.5GB and then decreases
>> back to ~1GB over about 6 hours. Sometimes this takes a bit more than 6
>> hours, sometimes a bit less. I was wondering whether anyone here knows
>> what the JT does once a day that makes it use 1.5 times more memory
>> than normal.
>>
>> We're so interested in JT memory usage because during the last two
>> weeks the JT twice ran out of heap space. Both times, right after those
>> daily used-memory peaks, while usage was going down from 1.5GB to 1GB,
>> it started increasing again until it got stuck at ~2.2GB. After that
>> the JT became unresponsive and we had to restart it.
>>
>> We're using Cloudera's CDH2 version 0.20.1+169.113.
>>
>> -- 
>> Regards, Max
>

-- 
Regards, Max


Re: Status on MAPREDUCE-279 and plans forward

Posted by Arun C Murthy <ac...@yahoo-inc.com>.
On Jun 11, 2011, at 6:28 AM, Tom White wrote:

> Hi Arun,
>
> Great to hear that MR2 is nearly ready for trunk.
>
> MR2 is being built using Maven, so step 5 is about getting Maven to
> build all of MR, I presume? The svn unsplit (HADOOP-7106) and
> Mavenization work in HADOOP-6671 should make this approach more
> tractable. Or were you thinking that there would be a hybrid Ant/Maven
> build during a transition period?
>

Good point, Tom. I'm thinking of having a hybrid Ant/Maven transition
period to allow both to proceed independently.

> Also, for step 3, what did you think of the directory layout I
> suggested in http://bit.ly/mr-279-maven-layout?
>

Makes sense; maybe we should stage that too. Let's continue that
discussion on the JIRA.

thanks!
Arun

> Cheers,
> Tom
>
> On Fri, Jun 10, 2011 at 12:44 PM, Arun C Murthy <ac...@yahoo-inc.com>  
> wrote:
>> Another heads-up.
>>
>> Things are going well; we have done scale testing (350 nodes) and
>> continue to do more.
>>
>> We think we are nearly ready to merge MR-279 into trunk (yaay!).
>>
>> A proposed plan forward:
>>
>> 1. I'd like to get MAPREDUCE-2400 committed to both trunk and MR-279.
>> This is a very important patch by Tom which allows submitting jobs to
>> both the JT and the new RM - thanks Tom!
>> 2. Check in the 'yarn' subfolder (just the ResourceManager and
>> NodeManager, i.e. the compute fabric) to trunk.
>> 3. Rework the src tree in trunk to look like the MR-279 branch to
>> ensure no changes to trunk are lost - this will ensure separation
>> between the 'classic' framework (JT/TT) and the MR runtime
>> (map/reduce/sort/shuffle etc.)
>> 4. Re-play the 4-5 patches needed to the MR runtime from the MR-279  
>> branch.
>> 5. Change build scripts to reflect the new reality.
>>
>> In terms of timelines, I'm hoping we can get this done in the coming
>> 2-3 weeks.
>>
>> Thoughts?
>>
>> thanks,
>> Arun
>>


Re: Status on MAPREDUCE-279 and plans forward

Posted by Tom White <to...@cloudera.com>.
Hi Arun,

Great to hear that MR2 is nearly ready for trunk.

MR2 is being built using Maven, so step 5 is about getting Maven to
build all of MR, I presume? The svn unsplit (HADOOP-7106) and
Mavenization work in HADOOP-6671 should make this approach more
tractable. Or were you thinking that there would be a hybrid Ant/Maven
build during a transition period?

Also, for step 3, what did you think of the directory layout I
suggested in http://bit.ly/mr-279-maven-layout?

Cheers,
Tom

On Fri, Jun 10, 2011 at 12:44 PM, Arun C Murthy <ac...@yahoo-inc.com> wrote:
> Another heads-up.
>
> Things are going well; we have done scale testing (350 nodes) and continue
> to do more.
>
> We think we are nearly ready to merge MR-279 into trunk (yaay!).
>
> A proposed plan forward:
>
> 1. I'd like to get MAPREDUCE-2400 committed to both trunk and MR-279. This
> is a very important patch by Tom which allows submitting jobs to both the
> JT and the new RM - thanks Tom!
> 2. Check in the 'yarn' subfolder (just the ResourceManager and
> NodeManager, i.e. the compute fabric) to trunk.
> 3. Rework the src tree in trunk to look like the MR-279 branch to ensure no
> changes to trunk are lost - this will ensure separation between the
> 'classic' framework (JT/TT) and the MR runtime (map/reduce/sort/shuffle
> etc.)
> 4. Re-play the 4-5 patches needed to the MR runtime from the MR-279 branch.
> 5. Change build scripts to reflect the new reality.
>
> In terms of timelines, I'm hoping we can get this done in the coming 2-3
> weeks.
>
> Thoughts?
>
> thanks,
> Arun
>

Status on MAPREDUCE-279 and plans forward

Posted by Arun C Murthy <ac...@yahoo-inc.com>.
Another heads-up.

Things are going well; we have done scale testing (350 nodes) and
continue to do more.

We think we are nearly ready to merge MR-279 into trunk (yaay!).

A proposed plan forward:

1. I'd like to get MAPREDUCE-2400 committed to both trunk and MR-279.
This is a very important patch by Tom which allows submitting jobs to
both the JT and the new RM - thanks Tom!
2. Check in the 'yarn' subfolder (just the ResourceManager and
NodeManager, i.e. the compute fabric) to trunk.
3. Rework the src tree in trunk to look like the MR-279 branch to
ensure no changes to trunk are lost - this will ensure separation
between the 'classic' framework (JT/TT) and the MR runtime
(map/reduce/sort/shuffle etc.)
4. Re-play the 4-5 patches needed to the MR runtime from the MR-279  
branch.
5. Change build scripts to reflect the new reality.

In terms of timelines, I'm hoping we can get this done in the coming  
2-3 weeks.

Thoughts?

thanks,
Arun

Re: Developing on MAPREDUCE-279

Posted by Arun C Murthy <ac...@yahoo-inc.com>.
Folks,

On Mar 21, 2011, Arun C Murthy wrote:
>   We are very happy to have pushed MR-279 to the MR-279 dev-branch
> where we look forward to collaborating with everyone.
>
>   This work is only partially complete. We plan to continue its
> development in the branch and do so in an aggressive manner - we want
> to get it done!

Just thought I'd give a heads-up.
Things are progressing really well on MR-279 - all of the major
features we planned for are complete and checked in, and we are
currently in a developer-testing phase. We would love more eyes and
feedback.
We have put up some test plans at http://wiki.apache.org/hadoop/NextGenMapReduceDevTesting.
I've been asked offline for some more docs and I'll get some of them
onto the MAPREDUCE-279 jira soon (hopefully tomorrow).
thanks,
Arun

Re: JobTracker memory usage peaks once a day and OOM sometimes

Posted by Arun C Murthy <ac...@yahoo-inc.com>.
On Feb 8, 2011, at 12:56 PM, Arun C Murthy wrote:

> As Allen mentioned, it's really hard for anyone here to help you with
> CDH. You'll have to ask their user lists.
>
> OTOH, there are significant enhancements we've done to the JobTracker
> in 0.21 and 0.22... the problem is that there isn't an Apache release
> yet.
>
> The JT we run at Y! has a *huge* number of fixes for memory and
> performance (>100% improvement), which we've started putting into a
> 0.20.100 branch - you might be interested in using that if you want a
> packaged release: http://people.apache.org/~acmurthy/hadoop-0.20.100-rc0/
>

I should point out that this isn't an 'Apache Release', just something  
for you to try if you are interested.

Arun

> Arun
>
> On Feb 8, 2011, at 8:59 AM, Maxim Zizin wrote:
>
>> Hi all,
>>
>> We monitor JT, NN and SNN memory usage and observe the following
>> behavior in our Hadoop cluster. The JT's heap size is set to 2000m.
>> For about 18 hours a day it uses ~1GB, but every day, roughly at the
>> minute it was started, its used memory increases to ~1.5GB and then
>> decreases back to ~1GB over about 6 hours. Sometimes this takes a bit
>> more than 6 hours, sometimes a bit less. I was wondering whether
>> anyone here knows what the JT does once a day that makes it use 1.5
>> times more memory than normal.
>>
>> We're so interested in JT memory usage because during the last two
>> weeks the JT twice ran out of heap space. Both times, right after
>> those daily used-memory peaks, while usage was going down from 1.5GB
>> to 1GB, it started increasing again until it got stuck at ~2.2GB.
>> After that the JT became unresponsive and we had to restart it.
>>
>> We're using Cloudera's CDH2 version 0.20.1+169.113.
>>
>> -- 
>> Regards, Max
>


Re: JobTracker memory usage peaks once a day and OOM sometimes

Posted by Arun C Murthy <ac...@yahoo-inc.com>.
As Allen mentioned, it's really hard for anyone here to help you with
CDH. You'll have to ask their user lists.

OTOH, there are significant enhancements we've done to the JobTracker  
in 0.21 and 0.22... the problem is that there isn't an Apache release  
yet.

The JT we run at Y! has a *huge* number of fixes for memory and
performance (>100% improvement), which we've started putting into a
0.20.100 branch - you might be interested in using that if you want a  
packaged release: http://people.apache.org/~acmurthy/hadoop-0.20.100-rc0/

Arun

On Feb 8, 2011, at 8:59 AM, Maxim Zizin wrote:

> Hi all,
>
> We monitor JT, NN and SNN memory usage and observe the following
> behavior in our Hadoop cluster. The JT's heap size is set to 2000m.
> For about 18 hours a day it uses ~1GB, but every day, roughly at the
> minute it was started, its used memory increases to ~1.5GB and then
> decreases back to ~1GB over about 6 hours. Sometimes this takes a bit
> more than 6 hours, sometimes a bit less. I was wondering whether
> anyone here knows what the JT does once a day that makes it use 1.5
> times more memory than normal.
>
> We're so interested in JT memory usage because during the last two
> weeks the JT twice ran out of heap space. Both times, right after
> those daily used-memory peaks, while usage was going down from 1.5GB
> to 1GB, it started increasing again until it got stuck at ~2.2GB.
> After that the JT became unresponsive and we had to restart it.
>
> We're using Cloudera's CDH2 version 0.20.1+169.113.
>
> -- 
> Regards, Max


Re: JobTracker memory usage peaks once a day and OOM sometimes

Posted by Maxim Zizin <mz...@iponweb.net>.
Todd,

Thanks for your attention.

Re: a few weeks ago -- I might have missed it. Will look for that thread.

Re: garbage -- I thought that would look more like a progressive rise
that starts right after the previous drop and ends with another drop
(like a sawtooth). Instead I see an instant rise by 1.5 times, then a
plateau, then an instant drop (after ~6 hours), then almost no rise
until the next instant rise ~18 hours later. Sorry if this sounds naive
-- I just don't know much about Java and GC. I use streaming and write
mappers and reducers mostly in Perl.
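
(For the curious, a minimal sketch of how such a Perl streaming job is
launched -- the paths and script names here are made up for illustration:

    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*-streaming.jar \
        -input /data/in -output /data/out \
        -mapper mapper.pl -reducer reducer.pl \
        -file mapper.pl -file reducer.pl    # -file ships the scripts to the nodes
)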

Re: how many jobs/tasks -- We have roughly a dozen jobs running at a
time, each having ~5 mappers and 1-2 reducers. Sometimes we also have
jobs with a few hundred mappers and a few tens of reducers, but they
are not daily -- some are hourly, others run four times a day.

On 2/8/2011 11:46 PM, Todd Lipcon wrote:
> Hi Maxim,
>
> I thought I responded to this question already a few weeks ago - maybe not
> :)
>
> Looking at the heap usage of a Java process using default garbage collectors
> is always misleading. In particular, unless you are using the concurrent
> mark and sweep (CMS) GC, a collection won't begin until the old generation
> is actually full. So, you will see a pattern of the heap filling up and then
> dropping back down, like you're describing.
>
> You can hook up JConsole to your daemon and hit the "GC" button to see how
> much actual live data you've got.
>
> In general I'd agree with Allen's assessment that you're probably just
> holding too many tasks in RAM. How many jobs are generally queued or running
> at a time, and how many tasks do each of those jobs contain? If you're in
> the hundreds of thousands there, it's probably just that you need more heap
> allotted to the JT.
>
> -Todd
>
> On Tue, Feb 8, 2011 at 12:35 PM, Maxim Zizin <mz...@iponweb.net> wrote:
>
>> Allen,
>>
>> Thanks for your answer.
>>
>> Re: handful of jobs -- That was our first thought. But we looked at the
>> logs and found nothing strange. Moreover, after a JT restart the time at
>> which the peaks start shifted. When we restarted it one more time it
>> shifted again. In all cases the first peak after a restart starts ~24
>> hours after the restart. So this seems to be some scheduled daily thing
>> that does not depend on the jobs we run.
>>
>> Re: heap size -- We have a cluster of 12 slaves. 2GB seems to be enough,
>> as the JT uses ~1GB normally and ~1.5GB during peaks. Still, we're going
>> to increase the JT's heap size to 3GB tomorrow. That will at least give
>> us more time to pause crons and restart the JT before it runs out of
>> heap space next time. Or am I wrong to think that 2GB of heap is enough
>> just because our JT uses 1-1.5GB?
>>
>>
>> On 2/8/2011 11:16 PM, Allen Wittenauer wrote:
>>
>>> On Feb 8, 2011, at 8:59 AM, Maxim Zizin wrote:
>>>
>>>> Hi all,
>>>>
>>>> We monitor JT, NN and SNN memory usage and observe the following
>>>> behavior in our Hadoop cluster. The JT's heap size is set to 2000m.
>>>> For about 18 hours a day it uses ~1GB, but every day, roughly at the
>>>> minute it was started, its used memory increases to ~1.5GB and then
>>>> decreases back to ~1GB over about 6 hours. Sometimes this takes a bit
>>>> more than 6 hours, sometimes a bit less. I was wondering whether
>>>> anyone here knows what the JT does once a day that makes it use 1.5
>>>> times more memory than normal.
>>>>
>>>> We're so interested in JT memory usage because during the last two
>>>> weeks the JT twice ran out of heap space. Both times, right after
>>>> those daily used-memory peaks, while usage was going down from 1.5GB
>>>> to 1GB, it started increasing again until it got stuck at ~2.2GB.
>>>> After that the JT became unresponsive and we had to restart it.
>>>>
>>>> We're using Cloudera's CDH2 version 0.20.1+169.113.
>>>>
>>>         Who knows what is happening in the CDH release?
>>>
>>>         But in the normal job tracker, keep in mind that memory is consumed
>>> by every individual task listed on the main page.  If you have some jobs
>>> that have extremely high task counts or a lot of counters or really long
>>> names or ..., then that is likely your problem.  Chances are good you have a
>>> handful of jobs that are bad citizens that are getting scrolled off the page
>>> at the same time every day.
>>>
>>>         Also, for any grid of any significant size, 2g of heap is way too
>>> small.
>>>
>> --
>> Regards, Max
>>
>>
>

-- 
Regards, Max


Re: JobTracker memory usage peaks once a day and OOM sometimes

Posted by Todd Lipcon <to...@cloudera.com>.
Hi Maxim,

I thought I responded to this question already a few weeks ago - maybe not
:)

Looking at the heap usage of a Java process using default garbage collectors
is always misleading. In particular, unless you are using the concurrent
mark and sweep (CMS) GC, a collection won't begin until the old generation
is actually full. So, you will see a pattern of the heap filling up and then
dropping back down, like you're describing.

You can hook up JConsole to your daemon and hit the "GC" button to see how
much actual live data you've got.
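
If attaching JConsole is inconvenient, the stock JDK command-line tools
can answer the same question. A rough sketch (the pid-file path is only
an example -- adjust it for your install):

    JT_PID=$(cat /var/run/hadoop/hadoop-hadoop-jobtracker.pid)
    jstat -gcutil $JT_PID 5000            # old-gen occupancy every 5s; a sawtooth is normal
    jmap -histo:live $JT_PID | head -20   # forces a full GC, then lists the largest live classes

(And if the collection pauses themselves become a problem, CMS can be
enabled with -XX:+UseConcMarkSweepGC in the JT's JVM options.)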

In general I'd agree with Allen's assessment that you're probably just
holding too many tasks in RAM. How many jobs are generally queued or running
at a time, and how many tasks do each of those jobs contain? If you're in
the hundreds of thousands there, it's probably just that you need more heap
allotted to the JT.

-Todd

On Tue, Feb 8, 2011 at 12:35 PM, Maxim Zizin <mz...@iponweb.net> wrote:

> Allen,
>
> Thanks for your answer.
>
> Re: handful of jobs -- That was our first thought. But we looked at the
> logs and found nothing strange. Moreover, after a JT restart the time at
> which the peaks start shifted. When we restarted it one more time it
> shifted again. In all cases the first peak after a restart starts ~24
> hours after the restart. So this seems to be some scheduled daily thing
> that does not depend on the jobs we run.
>
> Re: heap size -- We have a cluster of 12 slaves. 2GB seems to be enough,
> as the JT uses ~1GB normally and ~1.5GB during peaks. Still, we're going
> to increase the JT's heap size to 3GB tomorrow. That will at least give
> us more time to pause crons and restart the JT before it runs out of
> heap space next time. Or am I wrong to think that 2GB of heap is enough
> just because our JT uses 1-1.5GB?
>
>
> On 2/8/2011 11:16 PM, Allen Wittenauer wrote:
>
>> On Feb 8, 2011, at 8:59 AM, Maxim Zizin wrote:
>>
>>> Hi all,
>>>
>>> We monitor JT, NN and SNN memory usage and observe the following
>>> behavior in our Hadoop cluster. The JT's heap size is set to 2000m.
>>> For about 18 hours a day it uses ~1GB, but every day, roughly at the
>>> minute it was started, its used memory increases to ~1.5GB and then
>>> decreases back to ~1GB over about 6 hours. Sometimes this takes a bit
>>> more than 6 hours, sometimes a bit less. I was wondering whether
>>> anyone here knows what the JT does once a day that makes it use 1.5
>>> times more memory than normal.
>>>
>>> We're so interested in JT memory usage because during the last two
>>> weeks the JT twice ran out of heap space. Both times, right after
>>> those daily used-memory peaks, while usage was going down from 1.5GB
>>> to 1GB, it started increasing again until it got stuck at ~2.2GB.
>>> After that the JT became unresponsive and we had to restart it.
>>>
>>> We're using Cloudera's CDH2 version 0.20.1+169.113.
>>
>>        Who knows what is happening in the CDH release?
>>
>>        But in the normal job tracker, keep in mind that memory is consumed
>> by every individual task listed on the main page.  If you have some jobs
>> that have extremely high task counts or a lot of counters or really long
>> names or ..., then that is likely your problem.  Chances are good you have a
>> handful of jobs that are bad citizens that are getting scrolled off the page
>> at the same time every day.
>>
>>        Also, for any grid of any significant size, 2g of heap is way too
>> small.
>>
>
> --
> Regards, Max
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera

Re: JobTracker memory usage peaks once a day and OOM sometimes

Posted by Maxim Zizin <mz...@iponweb.net>.
Allen,

Thanks for your answer.

Re: handful of jobs -- That was our first thought. But we looked at the
logs and found nothing strange. Moreover, after a JT restart the time at
which the peaks start shifted. When we restarted it one more time it
shifted again. In all cases the first peak after a restart starts ~24
hours after the restart. So this seems to be some scheduled daily thing
that does not depend on the jobs we run.

Re: heap size -- We have a cluster of 12 slaves. 2GB seems to be enough,
as the JT uses ~1GB normally and ~1.5GB during peaks. Still, we're going
to increase the JT's heap size to 3GB tomorrow. That will at least give
us more time to pause crons and restart the JT before it runs out of
heap space next time. Or am I wrong to think that 2GB of heap is enough
just because our JT uses 1-1.5GB?
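
(For reference, a sketch of the change we're planning in
conf/hadoop-env.sh on the JT node -- this assumes the usual 0.20-style
per-daemon opts variable, and that a later -Xmx on the command line
wins over the HADOOP_HEAPSIZE default:

    # Raise only the JobTracker's heap; other daemons keep HADOOP_HEAPSIZE.
    export HADOOP_JOBTRACKER_OPTS="-Xmx3000m ${HADOOP_JOBTRACKER_OPTS}"
)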

On 2/8/2011 11:16 PM, Allen Wittenauer wrote:
> On Feb 8, 2011, at 8:59 AM, Maxim Zizin wrote:
>
>> Hi all,
>>
>> We monitor JT, NN and SNN memory usage and observe the following behavior in our Hadoop cluster. The JT's heap size is set to 2000m. For about 18 hours a day it uses ~1GB, but every day, roughly at the minute it was started, its used memory increases to ~1.5GB and then decreases back to ~1GB over about 6 hours. Sometimes this takes a bit more than 6 hours, sometimes a bit less. I was wondering whether anyone here knows what the JT does once a day that makes it use 1.5 times more memory than normal.
>>
>> We're so interested in JT memory usage because during the last two weeks the JT twice ran out of heap space. Both times, right after those daily used-memory peaks, while usage was going down from 1.5GB to 1GB, it started increasing again until it got stuck at ~2.2GB. After that the JT became unresponsive and we had to restart it.
>>
>> We're using Cloudera's CDH2 version 0.20.1+169.113.
> 	Who knows what is happening in the CDH release?
>
> 	But in the normal job tracker, keep in mind that memory is consumed by every individual task listed on the main page.  If you have some jobs that have extremely high task counts or a lot of counters or really long names or ..., then that is likely your problem.  Chances are good you have a handful of jobs that are bad citizens that are getting scrolled off the page at the same time every day.
>
> 	Also, for any grid of any significant size, 2g of heap is way too small.

-- 
Regards, Max


Re: JobTracker memory usage peaks once a day and OOM sometimes

Posted by Allen Wittenauer <aw...@linkedin.com>.
On Feb 8, 2011, at 8:59 AM, Maxim Zizin wrote:

> Hi all,
> 
> We monitor JT, NN and SNN memory usage and observe the following behavior in our Hadoop cluster. The JT's heap size is set to 2000m. For about 18 hours a day it uses ~1GB, but every day, roughly at the minute it was started, its used memory increases to ~1.5GB and then decreases back to ~1GB over about 6 hours. Sometimes this takes a bit more than 6 hours, sometimes a bit less. I was wondering whether anyone here knows what the JT does once a day that makes it use 1.5 times more memory than normal.
> 
> We're so interested in JT memory usage because during the last two weeks the JT twice ran out of heap space. Both times, right after those daily used-memory peaks, while usage was going down from 1.5GB to 1GB, it started increasing again until it got stuck at ~2.2GB. After that the JT became unresponsive and we had to restart it.
> 
> We're using Cloudera's CDH2 version 0.20.1+169.113.

	Who knows what is happening in the CDH release?

	But in the normal job tracker, keep in mind that memory is consumed by every individual task listed on the main page.  If you have some jobs that have extremely high task counts or a lot of counters or really long names or ..., then that is likely your problem.  Chances are good you have a handful of jobs that are bad citizens that are getting scrolled off the page at the same time every day.

	Also, for any grid of any significant size, 2g of heap is way too small.
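
	If retained jobs are indeed the culprit, the knob to look at is how
many completed jobs the JT keeps in memory per user. A sketch for
mapred-site.xml -- the property name is as in stock 0.20, so verify it
against your release before relying on it:

    <property>
      <name>mapred.jobtracker.completeuserjobs.maximum</name>
      <!-- default is 100; retaining fewer completed jobs trims JT heap -->
      <value>25</value>
    </property>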