Posted to hdfs-user@hadoop.apache.org by David Parks <da...@yahoo.com> on 2012/12/17 06:36:15 UTC

OutOfMemory in ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory

I've got 15 boxes in a cluster on AWS (m1.large), each with 7.5GB of RAM,
running 1 reducer per node.

I'm occasionally seeing the exception below. It doesn't stop the job from
completing, but it fails 3 or 4 reduce tasks and slows things down:

Error: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1711)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1571)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1412)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1344)

This looks like exactly the problem addressed in this JIRA issue:

https://issues.apache.org/jira/browse/MAPREDUCE-1182

I've talked with AWS support and verified that the patch listed in that JIRA
issue has been applied to the Hadoop 1.0.3 build on AWS.

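For context, here is a rough sketch of how I understand the in-memory shuffle
budget to be sized in Hadoop 1.x: the reducer's max heap multiplied by
mapred.job.shuffle.input.buffer.percent (0.70 by default), with any single map
output larger than a quarter of that budget going to disk instead of memory.
The heap size below is hypothetical, not the value from my job, and the class
and method names are made up for illustration. The real logic in
ReduceTask.java differs in detail.

```java
// Simplified model of the reduce-side in-memory shuffle budget (Hadoop 1.x).
// Names are hypothetical; this is not the actual ReduceTask code.
class ShuffleBudget {
    // Fraction of the budget one map output may occupy and still be
    // shuffled in memory (0.25 in the 1.x source, as far as I can tell).
    static final double MAX_SINGLE_SEGMENT_FRACTION = 0.25;

    // Bytes of heap the shuffle may use to hold map outputs in memory.
    static long inMemoryBudget(long maxHeapBytes, double inputBufferPercent) {
        return (long) (maxHeapBytes * inputBufferPercent);
    }

    // Largest single map output that is still copied into memory.
    static long maxSingleSegment(long budgetBytes) {
        return (long) (budgetBytes * MAX_SINGLE_SEGMENT_FRACTION);
    }

    public static void main(String[] args) {
        long heap = 1024L * 1024 * 1024; // hypothetical -Xmx1024m reducer heap
        double bufferPercent = 0.70;     // mapred.job.shuffle.input.buffer.percent default
        long budget = inMemoryBudget(heap, bufferPercent);
        System.out.println("in-memory shuffle budget: " + budget + " bytes");
        System.out.println("max single in-memory segment: "
                + maxSingleSegment(budget) + " bytes");
    }
}
```

If that reading is right, lowering mapred.job.shuffle.input.buffer.percent
should leave more heap headroom for the rest of the reducer, at the cost of
more map outputs being shuffled to disk.
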
Any thoughts here?