Posted to mapreduce-dev@hadoop.apache.org by saurabh jain <sa...@gmail.com> on 2014/08/02 21:31:29 UTC

Caused by: java.lang.OutOfMemoryError: Java heap space - Copy Phase

Hi folks,

I am getting the exception below while running a MapReduce job, during the
copy phase of the mappers' output.

I googled it and tried all the suggested solutions, but none of them worked
in my case.

I tried to increase the memory available to the JVM with
-Dmapred.child.java.opts=-Xmx8G, but that didn't work.

Then I also tried to increase the memory available to the reducer with
-Dmapreduce.reduce.memory=2048m. No luck there either.

I also tried to lower -Dmapred.job.reduce.input.buffer.percent, so that map
output is moved to disk instead of being kept in memory, but no luck there
either.
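For reference, the flags above were passed roughly like this (a sketch only:
the jar name, main class, paths, and the buffer-percent value are
placeholders, not the exact job I ran):

```shell
# Sketch of the job submission; jar name, main class, paths, and the
# buffer-percent value are placeholders, not the exact job that was run.
hadoop jar my-job.jar com.example.MyJob \
  -Dmapred.child.java.opts=-Xmx8G \
  -Dmapreduce.reduce.memory=2048m \
  -Dmapred.job.reduce.input.buffer.percent=0.0 \
  input/ output/
```

(-D properties are only picked up this way when the job goes through
ToolRunner/GenericOptionsParser.)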

Please advise if I am missing something very basic here.


Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#5
        at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
        at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
        at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.<init>(InMemoryMapOutput.java:63)
        at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.unconditionalReserve(MergeManagerImpl.java:297)
        at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.reserve(MergeManagerImpl.java:287)
        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:411)
        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:341)
        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)

Thanks
Saurabh

Re: Caused by: java.lang.OutOfMemoryError: Java heap space - Copy Phase

Posted by saurabh jain <sa...@gmail.com>.
The value of mapreduce.task.io.sort.mb is 256.

On Mon, Aug 4, 2014 at 12:53 AM, Brahma Reddy Battula <
brahmareddy.battula@huawei.com> wrote:

> Hope you are using the hadoop2 .
>
> How much is configured for mapreduce.task.io.sort.mb..?

RE: Caused by: java.lang.OutOfMemoryError: Java heap space - Copy Phase

Posted by Brahma Reddy Battula <br...@huawei.com>.
I hope you are using Hadoop 2.

How much is configured for mapreduce.task.io.sort.mb?

I also suggest going through the following link on memory configuration, which should guide you:


http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.9.1/bk_installing_manually_book/content/rpm-chap1-11.html

Cluster nodes have 12 CPU cores, 48 GB RAM, and 12 disks.

Reserved Memory = 6 GB reserved for system memory + (if HBase) 8 GB for HBase

Min container size = 2 GB


If there is no HBase:

# of containers = min (2 * CORES, 1.8 * DISKS, (Total RAM - Reserved) / Min container size)
                = min (2 * 12, 1.8 * 12, (48 - 6) / 2) = min (24, 21.6, 21) = 21

RAM-per-container = max (Min container size, (Total RAM - Reserved) / # of containers)
                  = max (2, (48 - 6) / 21) = max (2, 2) = 2 GB
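The arithmetic above can be sanity-checked with a few shell lines (a sketch
only; the node figures are the ones from this example, not a recommendation):

```shell
# Sanity-check of the sizing arithmetic (example node: 48 GB RAM, 12 cores, 12 disks).
TOTAL_GB=48; RESERVED_GB=6; MIN_CONTAINER_GB=2; CORES=12; DISKS=12

# RAM-bound container count: (Total RAM - Reserved) / Min container size
RAM_BOUND=$(( (TOTAL_GB - RESERVED_GB) / MIN_CONTAINER_GB ))

# containers = min(2 * cores, 1.8 * disks, RAM bound); awk handles the 1.8 factor
CONTAINERS=$(awk -v c="$CORES" -v d="$DISKS" -v r="$RAM_BOUND" \
  'BEGIN { m = 2 * c; if (1.8 * d < m) m = 1.8 * d; if (r < m) m = r; printf "%d", m }')

# RAM-per-container = max(Min container size, available RAM / containers)
RAM_PER_CONTAINER=$(( (TOTAL_GB - RESERVED_GB) / CONTAINERS ))
if (( RAM_PER_CONTAINER < MIN_CONTAINER_GB )); then RAM_PER_CONTAINER=$MIN_CONTAINER_GB; fi

echo "$CONTAINERS"                                  # 21
echo "$RAM_PER_CONTAINER"                           # 2 (GB)
echo $(( CONTAINERS * RAM_PER_CONTAINER * 1024 ))   # 43008 -> yarn.nodemanager.resource.memory-mb
```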



Configuration                           Value Calculation
yarn.nodemanager.resource.memory-mb     = 21 * 2 = 42*1024 MB
yarn.scheduler.minimum-allocation-mb    = 2*1024 MB
yarn.scheduler.maximum-allocation-mb    = 21 * 2 = 42*1024 MB
mapreduce.map.memory.mb                 = 2*1024 MB
mapreduce.reduce.memory.mb              = 2 * 2 = 4*1024 MB
mapreduce.map.java.opts                 = 0.8 * 2 = 1.6*1024 MB
mapreduce.reduce.java.opts              = 0.8 * 2 * 2 = 3.2*1024 MB
yarn.app.mapreduce.am.resource.mb       = 2 * 2 = 4*1024 MB
yarn.app.mapreduce.am.command-opts      = 0.8 * 2 * 2 = 3.2*1024 MB



Thanks & Regards
Brahma Reddy Battula



