Posted to common-issues@hadoop.apache.org by "Chandra Prakash Bhagtani (JIRA)" <ji...@apache.org> on 2009/11/03 11:22:00 UTC
[jira] Updated: (HADOOP-6357) Reducers fail with OutOfMemoryError while copying Map outputs
[ https://issues.apache.org/jira/browse/HADOOP-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chandra Prakash Bhagtani updated HADOOP-6357:
---------------------------------------------
Fix Version/s: 0.20.0
Status: Patch Available (was: Open)
The problem is caused by Java int arithmetic in the check inside ReduceTask's ShuffleRamManager reserve method:
// Wait till the request can be fulfilled...
while ((size + requestedSize) > maxSize) {
The check fails when (size + requestedSize) exceeds Integer.MAX_VALUE and "wraps around" to a negative value, so the condition evaluates to false. All subsequent requests then keep reserving RAM until the JVM finally crashes.
My fix is to widen both operands to long before adding:
while (((long)size + (long)requestedSize) > maxSize) {
It worked!
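The overflow described above can be reproduced outside Hadoop. The following is a minimal, self-contained sketch (not Hadoop code); the variable names mirror the snippet above and the byte counts are illustrative:

```java
// Demonstrates how the int addition in the reserve() check wraps
// around, and how widening to long (as in the patch) restores it.
public class OverflowDemo {
    public static void main(String[] args) {
        int size = 2_000_000_000;        // bytes already reserved (illustrative)
        int requestedSize = 400_000_000; // next map output to shuffle (illustrative)
        int maxSize = 2_100_000_000;     // shuffle buffer limit (illustrative)

        // Broken check: the int sum overflows to a negative value,
        // so the "wait" condition is false and reservation proceeds.
        System.out.println(size + requestedSize);             // prints -1894967296
        System.out.println((size + requestedSize) > maxSize); // prints false

        // Fixed check: widen to long before adding.
        System.out.println(((long) size + (long) requestedSize) > maxSize); // prints true
    }
}
```

With the fixed check, a request that would exceed the buffer correctly blocks instead of silently over-reserving memory.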
> Reducers fail with OutOfMemoryError while copying Map outputs
> -------------------------------------------------------------
>
> Key: HADOOP-6357
> URL: https://issues.apache.org/jira/browse/HADOOP-6357
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 0.20.0
> Reporter: Chandra Prakash Bhagtani
> Fix For: 0.20.0
>
>
> Reducers fail while copying Map outputs with following exception
> java.lang.OutOfMemoryError: Java heap space
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1539)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1432)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1285)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1216)
> The reducer's memory usage keeps increasing and ultimately exceeds the -Xmx value.
> I even tried -Xmx6.5g for each reducer, but it still failed.
> Looking into the reducer logs, I found that the reducers were doing shuffleInMemory every time, rather than shuffleOnDisk.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.