Posted to common-issues@hadoop.apache.org by "Chandra Prakash Bhagtani (JIRA)" <ji...@apache.org> on 2009/11/03 11:22:00 UTC

[jira] Updated: (HADOOP-6357) Reducers fail with OutOfMemoryError while copying Map outputs

     [ https://issues.apache.org/jira/browse/HADOOP-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chandra Prakash Bhagtani updated HADOOP-6357:
---------------------------------------------

    Fix Version/s: 0.20.0
           Status: Patch Available  (was: Open)

The problem was caused by Java int arithmetic in the check inside the reserve method of ReduceTask's ShuffleRamManager:
    // Wait till the request can be fulfilled...
    while ((size + requestedSize) > maxSize) {

The check misbehaves when (size + requestedSize) exceeds Integer.MAX_VALUE: the int sum "wraps around" to a negative value, so the condition evaluates to false and the request is not made to wait. All subsequent requests therefore keep reserving RAM until the JVM finally crashes.

My fix is:  while (((long)size + (long)requestedSize) > maxSize) {

It worked!!!!!
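
To illustrate the wrap-around described above, here is a minimal standalone sketch. It is not the Hadoop source: the class name, the method names and the byte counts are hypothetical and were chosen only so that the int sum overflows.

```java
// Standalone illustration of the overflow; NOT the actual ShuffleRamManager code.
// Class name, method names and values below are hypothetical.
class ReserveCheckDemo {

    // Mirrors the original condition: the sum is computed in int and can overflow.
    static boolean mustWaitInt(int size, int requestedSize, int maxSize) {
        return (size + requestedSize) > maxSize;
    }

    // Mirrors the patched condition: both operands are widened to long before adding.
    static boolean mustWaitLong(int size, int requestedSize, int maxSize) {
        return ((long) size + (long) requestedSize) > maxSize;
    }

    public static void main(String[] args) {
        int maxSize = 1_500_000_000;       // assumed in-memory shuffle limit, in bytes
        int size = 1_400_000_000;          // assumed bytes already reserved
        int requestedSize = 800_000_000;   // assumed size of the new request

        System.out.println("int sum:  " + (size + requestedSize));
        System.out.println("long sum: " + ((long) size + (long) requestedSize));
        System.out.println("int check says wait:  "
                + mustWaitInt(size, requestedSize, maxSize));
        System.out.println("long check says wait: "
                + mustWaitLong(size, requestedSize, maxSize));
    }
}
```

With these values the true sum is 2,200,000,000, which does not fit in an int. The int version sees a negative sum and lets the request through, while the long version correctly reports that the request has to wait.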

> Reducers fail with OutOfMemoryError while copying Map outputs
> -------------------------------------------------------------
>
>                 Key: HADOOP-6357
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6357
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.20.0
>            Reporter: Chandra Prakash Bhagtani
>             Fix For: 0.20.0
>
>
> Reducers fail while copying Map outputs with the following exception:
> java.lang.OutOfMemoryError: Java heap space
>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1539)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1432)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1285)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1216)
> The reducer's memory usage keeps increasing and ultimately exceeds the -Xmx value.
> I even tried giving each reducer -Xmx6.5g, but it still fails.
> While looking into the reducer logs, I found that the reducers were doing shuffleInMemory every time, rather than shuffleOnDisk.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.