Posted to common-user@hadoop.apache.org by Steve Lewis <lo...@gmail.com> on 2011/10/25 22:38:12 UTC

I need help reducing reducer memory

I have problems with reduce tasks failing with "GC overhead limit exceeded".
My reduce job retains only a small amount of data in memory while processing each
key, and it discards that data once the key is handled.
My mapred.child.java.opts is -Xmx1200m.
I tried the following (see the driver sketch below for how such properties can be set):
 mapred.job.shuffle.input.buffer.percent = 0.20
 mapred.job.reduce.input.buffer.percent = 0.30
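For reference, a minimal sketch of one way such properties can be set in a job driver, using the 0.20-era org.apache.hadoop.mapreduce API (the class and job names here are illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class LowMemoryReduceDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Fraction of the reduce task heap used to buffer map outputs during the shuffle.
        conf.setFloat("mapred.job.shuffle.input.buffer.percent", 0.20f);
        // Fraction of the heap in which map outputs may be retained while the reduce runs.
        conf.setFloat("mapred.job.reduce.input.buffer.percent", 0.30f);
        // Heap size for the child task JVMs.
        conf.set("mapred.child.java.opts", "-Xmx1200m");

        Job job = new Job(conf, "low-memory-reduce");
        // ... set mapper, reducer, input and output paths as usual ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}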

I really don't know what other parameters I can set to lower the memory footprint
of my reducer, and I could use some help.

I am only passing tens of thousands of keys, each with thousands of values, and each
value is maybe 10 KB.
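(Back-of-envelope: a thousand 10 KB values is roughly 10 MB per key, which should fit comfortably in a 1200 MB heap.)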


-- 
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com

Re: I need help reducing reducer memory

Posted by SRINIVAS SURASANI <va...@gmail.com>.
Steve,

I had a similar problem while loading data from HDFS into Teradata with a reducer.
Adding the following switches may help:

hadoop jar *.jar -Dmapred.child.java.opts="-Xmx2400m -XX:-UseGCOverheadLimit" <input> <output>

(use -Xmx1200m or -Xmx2400m depending on how much heap each task can get), and you may
also try -Dmapred.job.reuse.jvm.num.tasks=1 so that each task runs in its own JVM.
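Note that -D switches on the command line only take effect if the driver parses generic options, e.g. by going through ToolRunner. A minimal sketch, assuming a standard Tool-based driver (the class name MyJobDriver is illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJobDriver extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() already includes any -Dproperty=value pairs from the command line.
        Configuration conf = getConf();
        Job job = new Job(conf, "my-job");
        // ... configure mapper, reducer, input and output paths as usual ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new MyJobDriver(), args));
    }
}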

Regards,
Srinivas
