Posted to user@hadoop.apache.org by Eduard Skaley <e....@gmail.com> on 2012/10/31 16:45:12 UTC

Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space

Hello,

I'm getting this error during job execution:

16:20:26 INFO  [main]                     Job -  map 100% reduce 46%
16:20:27 INFO  [main]                     Job -  map 100% reduce 51%
16:20:29 INFO  [main]                     Job -  map 100% reduce 62%
16:20:30 INFO  [main]                     Job -  map 100% reduce 64%
16:20:32 INFO  [main]                     Job - Task Id : attempt_1351680008718_0018_r_000006_0, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#2
     at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123)
     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:396)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Caused by: java.lang.OutOfMemoryError: Java heap space
     at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:58)
     at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:45)
     at org.apache.hadoop.mapreduce.task.reduce.MapOutput.<init>(MapOutput.java:97)
     at org.apache.hadoop.mapreduce.task.reduce.MergeManager.unconditionalReserve(MergeManager.java:286)
     at org.apache.hadoop.mapreduce.task.reduce.MergeManager.reserve(MergeManager.java:276)
     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:384)
     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:319)
     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:179)

16:20:33 INFO  [main]                     Job -  map 100% reduce 65%
16:20:36 INFO  [main]                     Job -  map 100% reduce 67%
16:20:39 INFO  [main]                     Job -  map 100% reduce 69%
16:20:41 INFO  [main]                     Job -  map 100% reduce 70%
16:20:43 INFO  [main]                     Job -  map 100% reduce 71%

I have no clue what could be causing this. I googled the issue 
and checked several possible solutions, but nothing fits.

I saw this jira entry which could fit: 
https://issues.apache.org/jira/browse/MAPREDUCE-4655.

Here somebody recommends increasing the value of the property 
dfs.datanode.max.xcievers / dfs.datanode.max.receiver.threads to 4096, 
but that is already the value on our cluster.
http://yaseminavcular.blogspot.de/2011/04/common-hadoop-hdfs-exceptions-with.html
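
For reference, a sketch of how that setting is configured (this is what our cluster already has), in hdfs-site.xml on the DataNodes:

    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value>
    </property>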

The too-small-input-files issue doesn't apply, I think, because the 
map phase reads 137 files of roughly 130 MB each. The block size is 128 MB.

The cluster uses version 2.0.0-cdh4.1.1, 
581959ba23e4af85afd8db98b7687662fe9c5f20.

Thx







Re: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space

Posted by Eduard Skaley <e....@gmail.com>.
We increased mapreduce.reduce.memory.mb to 2GB and 
mapreduce.reduce.java.opts to 1.5GB.
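
For reference, the change amounts to roughly this in mapred-site.xml (a sketch; -Xmx1536m is the 1.5 GB reducer heap expressed as a JVM flag):

    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx1536m</value>
    </property>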

Now our jobs hit livelocks: map tasks don't start.

We are using the CapacityScheduler because we already had livelocks with the FifoScheduler.

Does anybody have a clue?
> By the way it happens on Yarn not on MRv1
>> each container gets 1GB at the moment.
>>> can you try increasing memory per reducer  ?
>>>
>>>
>>> On Wed, Oct 31, 2012 at 9:15 PM, Eduard Skaley <e.v.skaley@gmail.com 
>>> <ma...@gmail.com>> wrote:
>>>
>>>     Hello,
>>>
>>>     I'm getting this Error through job execution:
>>>
>>>     16:20:26 INFO  [main]                     Job -  map 100% reduce 46%
>>>     16:20:27 INFO  [main]                     Job -  map 100% reduce 51%
>>>     16:20:29 INFO  [main]                     Job -  map 100% reduce 62%
>>>     16:20:30 INFO  [main]                     Job -  map 100% reduce 64%
>>>     16:20:32 INFO  [main]                     Job - Task Id :
>>>     attempt_1351680008718_0018_r_000006_0, Status : FAILED
>>>     Error:
>>>     org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError:
>>>     error in shuffle in fetcher#2
>>>         at
>>>     org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123)
>>>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
>>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>         at
>>>     org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
>>>     Caused by: java.lang.OutOfMemoryError: Java heap space
>>>         at
>>>     org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:58)
>>>         at
>>>     org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:45)
>>>         at
>>>     org.apache.hadoop.mapreduce.task.reduce.MapOutput.<init>(MapOutput.java:97)
>>>         at
>>>     org.apache.hadoop.mapreduce.task.reduce.MergeManager.unconditionalReserve(MergeManager.java:286)
>>>         at
>>>     org.apache.hadoop.mapreduce.task.reduce.MergeManager.reserve(MergeManager.java:276)
>>>         at
>>>     org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:384)
>>>         at
>>>     org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:319)
>>>         at
>>>     org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:179)
>>>
>>>     16:20:33 INFO  [main]                     Job -  map 100% reduce 65%
>>>     16:20:36 INFO  [main]                     Job -  map 100% reduce 67%
>>>     16:20:39 INFO  [main]                     Job -  map 100% reduce 69%
>>>     16:20:41 INFO  [main]                     Job -  map 100% reduce 70%
>>>     16:20:43 INFO  [main]                     Job -  map 100% reduce 71%
>>>
>>>     I have no clue what could be causing this. I googled the
>>>     issue and checked several possible solutions, but nothing fits.
>>>
>>>     I saw this jira entry which could fit:
>>>     https://issues.apache.org/jira/browse/MAPREDUCE-4655.
>>>
>>>     Here somebody recommends increasing the value of the property
>>>     dfs.datanode.max.xcievers / dfs.datanode.max.receiver.threads to
>>>     4096, but that is already the value on our cluster.
>>>     http://yaseminavcular.blogspot.de/2011/04/common-hadoop-hdfs-exceptions-with.html
>>>
>>>     The too-small-input-files issue doesn't apply, I think,
>>>     because the map phase reads 137 files of roughly 130 MB each.
>>>     The block size is 128 MB.
>>>
>>>     The cluster uses version 2.0.0-cdh4.1.1,
>>>     581959ba23e4af85afd8db98b7687662fe9c5f20.
>>>
>>>     Thx
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> -- 
>>> Nitin Pawar
>>>
>>
>


Re: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Eduard,

Would you try using the following properties in your job invocation?

-D mapreduce.map.java.opts=-Xmx768m
-D mapreduce.reduce.java.opts=-Xmx768m
-D mapreduce.map.memory.mb=2000
-D mapreduce.reduce.memory.mb=3000
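
For example, a sketch of a full invocation (the jar and driver class are placeholders for your actual job; this assumes the driver runs through ToolRunner/GenericOptionsParser so that the -D options are picked up):

    hadoop jar your-job.jar your.package.YourDriver \
      -D mapreduce.map.java.opts=-Xmx768m \
      -D mapreduce.reduce.java.opts=-Xmx768m \
      -D mapreduce.map.memory.mb=2000 \
      -D mapreduce.reduce.memory.mb=3000 \
      /path/to/input /path/to/output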

Thx


On Mon, Nov 5, 2012 at 7:43 AM, Kartashov, Andy <An...@mpac.ca> wrote:
> Your error takes place during the reduce task, when temporary files are written
> to memory/disk. You are clearly running low on resources. Check your memory
> with "$ free -m" and disk space with "$ df -H" as well as "$ hadoop fs -df".
>
>
>
> I remember it took me a couple of days to figure out why I was getting a heap
> size error and nothing worked! It was because I tried to write a 7 GB output file
> onto a disk (in pseudo-distributed mode) that only had 4 GB of free space.
>
>
>
> P.S. Always test your jobs on a small input first (a few lines of input).
>
>
>
> P.P.S. Follow your job execution through the web UI:
> http://<fully-qualified-hostname of your job tracker>:50030
>
>
>
>
>
> From: Eduard Skaley [mailto:e.v.skaley@gmail.com]
> Sent: Monday, November 05, 2012 4:10 AM
> To: user@hadoop.apache.org
> Cc: Nitin Pawar
> Subject: Re: Error:
> org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space
>
>
>
> By the way it happens on Yarn not on MRv1
>
> each container gets 1GB at the moment.
>
> can you try increasing memory per reducer  ?
>
>
>
> On Wed, Oct 31, 2012 at 9:15 PM, Eduard Skaley <e....@gmail.com> wrote:
>
> Hello,
>
> I'm getting this Error through job execution:
>
> 16:20:26 INFO  [main]                     Job -  map 100% reduce 46%
> 16:20:27 INFO  [main]                     Job -  map 100% reduce 51%
> 16:20:29 INFO  [main]                     Job -  map 100% reduce 62%
> 16:20:30 INFO  [main]                     Job -  map 100% reduce 64%
> 16:20:32 INFO  [main]                     Job - Task Id :
> attempt_1351680008718_0018_r_000006_0, Status : FAILED
> Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error
> in shuffle in fetcher#2
>     at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
> Caused by: java.lang.OutOfMemoryError: Java heap space
>     at
> org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:58)
>     at
> org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:45)
>     at
> org.apache.hadoop.mapreduce.task.reduce.MapOutput.<init>(MapOutput.java:97)
>     at
> org.apache.hadoop.mapreduce.task.reduce.MergeManager.unconditionalReserve(MergeManager.java:286)
>     at
> org.apache.hadoop.mapreduce.task.reduce.MergeManager.reserve(MergeManager.java:276)
>     at
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:384)
>     at
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:319)
>     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:179)
>
> 16:20:33 INFO  [main]                     Job -  map 100% reduce 65%
> 16:20:36 INFO  [main]                     Job -  map 100% reduce 67%
> 16:20:39 INFO  [main]                     Job -  map 100% reduce 69%
> 16:20:41 INFO  [main]                     Job -  map 100% reduce 70%
> 16:20:43 INFO  [main]                     Job -  map 100% reduce 71%
>
> I have no clue what could be causing this. I googled the issue and
> checked several possible solutions, but nothing fits.
>
> I saw this jira entry which could fit:
> https://issues.apache.org/jira/browse/MAPREDUCE-4655.
>
> Here somebody recommends increasing the value of the property
> dfs.datanode.max.xcievers / dfs.datanode.max.receiver.threads to 4096, but
> that is already the value on our cluster.
> http://yaseminavcular.blogspot.de/2011/04/common-hadoop-hdfs-exceptions-with.html
>
> The too-small-input-files issue doesn't apply, I think, because the map
> phase reads 137 files of roughly 130 MB each. The block size is 128 MB.
>
> The cluster uses version 2.0.0-cdh4.1.1,
> 581959ba23e4af85afd8db98b7687662fe9c5f20.
>
> Thx
>
>
>
>
>
>
>
>
>
> --
> Nitin Pawar
>
>
>
>
>



-- 
Alejandro

RE: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space

Posted by "Kartashov, Andy" <An...@mpac.ca>.
Your error takes place during the reduce task, when temporary files are written to memory/disk. You are clearly running low on resources. Check your memory with "$ free -m" and disk space with "$ df -H" as well as "$ hadoop fs -df".

I remember it took me a couple of days to figure out why I was getting a heap size error and nothing worked! It was because I tried to write a 7 GB output file onto a disk (in pseudo-distributed mode) that only had 4 GB of free space.

P.S. Always test your jobs on a small input first (a few lines of input).

P.P.S. Follow your job execution through the web UI: http://<fully-qualified-hostname of your job tracker>:50030


From: Eduard Skaley [mailto:e.v.skaley@gmail.com]
Sent: Monday, November 05, 2012 4:10 AM
To: user@hadoop.apache.org
Cc: Nitin Pawar
Subject: Re: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space

By the way it happens on Yarn not on MRv1
each container gets 1GB at the moment.
can you try increasing memory per reducer  ?

On Wed, Oct 31, 2012 at 9:15 PM, Eduard Skaley <e....@gmail.com>> wrote:
Hello,

I'm getting this Error through job execution:

16:20:26 INFO  [main]                     Job -  map 100% reduce 46%
16:20:27 INFO  [main]                     Job -  map 100% reduce 51%
16:20:29 INFO  [main]                     Job -  map 100% reduce 62%
16:20:30 INFO  [main]                     Job -  map 100% reduce 64%
16:20:32 INFO  [main]                     Job - Task Id : attempt_1351680008718_0018_r_000006_0, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#2
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:58)
    at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:45)
    at org.apache.hadoop.mapreduce.task.reduce.MapOutput.<init>(MapOutput.java:97)
    at org.apache.hadoop.mapreduce.task.reduce.MergeManager.unconditionalReserve(MergeManager.java:286)
    at org.apache.hadoop.mapreduce.task.reduce.MergeManager.reserve(MergeManager.java:276)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:384)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:319)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:179)

16:20:33 INFO  [main]                     Job -  map 100% reduce 65%
16:20:36 INFO  [main]                     Job -  map 100% reduce 67%
16:20:39 INFO  [main]                     Job -  map 100% reduce 69%
16:20:41 INFO  [main]                     Job -  map 100% reduce 70%
16:20:43 INFO  [main]                     Job -  map 100% reduce 71%

I have no clue what could be causing this. I googled the issue and checked several possible solutions, but nothing fits.

I saw this jira entry which could fit: https://issues.apache.org/jira/browse/MAPREDUCE-4655.

Here somebody recommends increasing the value of the property dfs.datanode.max.xcievers / dfs.datanode.max.receiver.threads to 4096, but that is already the value on our cluster.
http://yaseminavcular.blogspot.de/2011/04/common-hadoop-hdfs-exceptions-with.html

The too-small-input-files issue doesn't apply, I think, because the map phase reads 137 files of roughly 130 MB each. The block size is 128 MB.

The cluster uses version 2.0.0-cdh4.1.1, 581959ba23e4af85afd8db98b7687662fe9c5f20.

Thx








--
Nitin Pawar



Re: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space

Posted by Eduard Skaley <e....@gmail.com>.
We increased mapreduce.reduce.memory.mb to 2GB and 
mapreduce.reduce.java.opts to 1.5GB.

Now we are getting livelocks for our jobs, map jobs don't start.

We are using CapacityScheduler because we had LiveLocks with FifoScheduler.
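
For reference, a minimal sketch of how those two values can be set on the job itself (with a ToolRunner-based driver the same pairs can also be passed as -D options on the command line); the 2048 MB / -Xmx1536m figures simply mirror what we configured:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ReduceMemorySketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Size of the YARN container requested for each reduce task, in MB.
        conf.setInt("mapreduce.reduce.memory.mb", 2048);

        // JVM heap of the reduce task, kept below the container size so the
        // container still has headroom for non-heap memory.
        conf.set("mapreduce.reduce.java.opts", "-Xmx1536m");

        Job job = Job.getInstance(conf, "reduce-memory-sketch");
        // ... rest of the job setup ...
    }
}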

Does anybody have a clue?
> By the way it happens on Yarn not on MRv1
>> each container gets 1GB at the moment.
>>> can you try increasing memory per reducer  ?
>>>
>>>
>>> On Wed, Oct 31, 2012 at 9:15 PM, Eduard Skaley <e.v.skaley@gmail.com 
>>> <ma...@gmail.com>> wrote:
>>>
>>>     Hello,
>>>
>>>     I'm getting this Error through job execution:
>>>
>>>     16:20:26 INFO  [main]                     Job -  map 100% reduce 46%
>>>     16:20:27 INFO  [main]                     Job -  map 100% reduce 51%
>>>     16:20:29 INFO  [main]                     Job -  map 100% reduce 62%
>>>     16:20:30 INFO  [main]                     Job -  map 100% reduce 64%
>>>     16:20:32 INFO  [main]                     Job - Task Id :
>>>     attempt_1351680008718_0018_r_000006_0, Status : FAILED
>>>     Error:
>>>     org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError:
>>>     error in shuffle in fetcher#2
>>>         at
>>>     org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123)
>>>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
>>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>         at
>>>     org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
>>>     Caused by: java.lang.OutOfMemoryError: Java heap space
>>>         at
>>>     org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:58)
>>>         at
>>>     org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:45)
>>>         at
>>>     org.apache.hadoop.mapreduce.task.reduce.MapOutput.<init>(MapOutput.java:97)
>>>         at
>>>     org.apache.hadoop.mapreduce.task.reduce.MergeManager.unconditionalReserve(MergeManager.java:286)
>>>         at
>>>     org.apache.hadoop.mapreduce.task.reduce.MergeManager.reserve(MergeManager.java:276)
>>>         at
>>>     org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:384)
>>>         at
>>>     org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:319)
>>>         at
>>>     org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:179)
>>>
>>>     16:20:33 INFO  [main]                     Job -  map 100% reduce 65%
>>>     16:20:36 INFO  [main]                     Job -  map 100% reduce 67%
>>>     16:20:39 INFO  [main]                     Job -  map 100% reduce 69%
>>>     16:20:41 INFO  [main]                     Job -  map 100% reduce 70%
>>>     16:20:43 INFO  [main]                     Job -  map 100% reduce 71%
>>>
>>>     I have no clue what the cause could be. I googled this issue and
>>>     checked several sources for possible solutions, but nothing fits.
>>>
>>>     I saw this JIRA entry, which could fit:
>>>     https://issues.apache.org/jira/browse/MAPREDUCE-4655.
>>>
>>>     Here somebody recommends increasing the property
>>>     dfs.datanode.max.xcievers / dfs.datanode.max.receiver.threads to
>>>     4096, but that is already the value on our cluster.
>>>     http://yaseminavcular.blogspot.de/2011/04/common-hadoop-hdfs-exceptions-with.html
>>>
>>>     The issue with too-small input files doesn't fit, I think, because
>>>     the map phase reads 137 files of 130 MB each. The block size is 128 MB.
>>>
>>>     The cluster uses version 2.0.0-cdh4.1.1,
>>>     581959ba23e4af85afd8db98b7687662fe9c5f20.
>>>
>>>     Thx
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> -- 
>>> Nitin Pawar
>>>
>>
>


Re: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space

Posted by Eduard Skaley <e....@gmail.com>.
By the way, it happens on YARN, not on MRv1.
> each container gets 1GB at the moment.
>> can you try increasing memory per reducer  ?
>>
>>
>> On Wed, Oct 31, 2012 at 9:15 PM, Eduard Skaley <e.v.skaley@gmail.com 
>> <ma...@gmail.com>> wrote:
>>
>>     Hello,
>>
>>     I'm getting this Error through job execution:
>>
>>     16:20:26 INFO  [main]                     Job -  map 100% reduce 46%
>>     16:20:27 INFO  [main]                     Job -  map 100% reduce 51%
>>     16:20:29 INFO  [main]                     Job -  map 100% reduce 62%
>>     16:20:30 INFO  [main]                     Job -  map 100% reduce 64%
>>     16:20:32 INFO  [main]                     Job - Task Id :
>>     attempt_1351680008718_0018_r_000006_0, Status : FAILED
>>     Error:
>>     org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError:
>>     error in shuffle in fetcher#2
>>         at
>>     org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123)
>>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at
>>     org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
>>     Caused by: java.lang.OutOfMemoryError: Java heap space
>>         at
>>     org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:58)
>>         at
>>     org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:45)
>>         at
>>     org.apache.hadoop.mapreduce.task.reduce.MapOutput.<init>(MapOutput.java:97)
>>         at
>>     org.apache.hadoop.mapreduce.task.reduce.MergeManager.unconditionalReserve(MergeManager.java:286)
>>         at
>>     org.apache.hadoop.mapreduce.task.reduce.MergeManager.reserve(MergeManager.java:276)
>>         at
>>     org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:384)
>>         at
>>     org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:319)
>>         at
>>     org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:179)
>>
>>     16:20:33 INFO  [main]                     Job -  map 100% reduce 65%
>>     16:20:36 INFO  [main]                     Job -  map 100% reduce 67%
>>     16:20:39 INFO  [main]                     Job -  map 100% reduce 69%
>>     16:20:41 INFO  [main]                     Job -  map 100% reduce 70%
>>     16:20:43 INFO  [main]                     Job -  map 100% reduce 71%
>>
>>     I have no clue what the cause could be. I googled this issue and
>>     checked several sources for possible solutions, but nothing fits.
>>
>>     I saw this JIRA entry, which could fit:
>>     https://issues.apache.org/jira/browse/MAPREDUCE-4655.
>>
>>     Here somebody recommends increasing the property
>>     dfs.datanode.max.xcievers / dfs.datanode.max.receiver.threads to
>>     4096, but that is already the value on our cluster.
>>     http://yaseminavcular.blogspot.de/2011/04/common-hadoop-hdfs-exceptions-with.html
>>
>>     The issue with too-small input files doesn't fit, I think, because
>>     the map phase reads 137 files of 130 MB each. The block size is 128 MB.
>>
>>     The cluster uses version 2.0.0-cdh4.1.1,
>>     581959ba23e4af85afd8db98b7687662fe9c5f20.
>>
>>     Thx
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> -- 
>> Nitin Pawar
>>
>


Re: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space

Posted by Eduard Skaley <e....@gmail.com>.
Each container gets 1 GB at the moment.
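
If it helps to double-check what the cluster-side defaults actually resolve to, here is a small diagnostic sketch (assuming the client classpath carries the cluster's mapred-site.xml, as it does for a normal job submission):

import org.apache.hadoop.mapred.JobConf;

public class PrintReduceMemorySettings {
    public static void main(String[] args) {
        // JobConf pulls mapred-default.xml / mapred-site.xml from the classpath,
        // so these are the values a submitted job would actually see.
        JobConf conf = new JobConf();
        System.out.println("mapreduce.reduce.memory.mb = " + conf.get("mapreduce.reduce.memory.mb"));
        System.out.println("mapreduce.reduce.java.opts = " + conf.get("mapreduce.reduce.java.opts"));
    }
}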
> can you try increasing memory per reducer  ?
>
>
> On Wed, Oct 31, 2012 at 9:15 PM, Eduard Skaley <e.v.skaley@gmail.com 
> <ma...@gmail.com>> wrote:
>
>     Hello,
>
>     I'm getting this Error through job execution:
>
>     16:20:26 INFO  [main]                     Job -  map 100% reduce 46%
>     16:20:27 INFO  [main]                     Job -  map 100% reduce 51%
>     16:20:29 INFO  [main]                     Job -  map 100% reduce 62%
>     16:20:30 INFO  [main]                     Job -  map 100% reduce 64%
>     16:20:32 INFO  [main]                     Job - Task Id :
>     attempt_1351680008718_0018_r_000006_0, Status : FAILED
>     Error:
>     org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError:
>     error in shuffle in fetcher#2
>         at
>     org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at
>     org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
>     Caused by: java.lang.OutOfMemoryError: Java heap space
>         at
>     org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:58)
>         at
>     org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:45)
>         at
>     org.apache.hadoop.mapreduce.task.reduce.MapOutput.<init>(MapOutput.java:97)
>         at
>     org.apache.hadoop.mapreduce.task.reduce.MergeManager.unconditionalReserve(MergeManager.java:286)
>         at
>     org.apache.hadoop.mapreduce.task.reduce.MergeManager.reserve(MergeManager.java:276)
>         at
>     org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:384)
>         at
>     org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:319)
>         at
>     org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:179)
>
>     16:20:33 INFO  [main]                     Job -  map 100% reduce 65%
>     16:20:36 INFO  [main]                     Job -  map 100% reduce 67%
>     16:20:39 INFO  [main]                     Job -  map 100% reduce 69%
>     16:20:41 INFO  [main]                     Job -  map 100% reduce 70%
>     16:20:43 INFO  [main]                     Job -  map 100% reduce 71%
>
>     I have no clue what the cause could be. I googled this issue and
>     checked several sources for possible solutions, but nothing fits.
>
>     I saw this JIRA entry, which could fit:
>     https://issues.apache.org/jira/browse/MAPREDUCE-4655.
>
>     Here somebody recommends increasing the property
>     dfs.datanode.max.xcievers / dfs.datanode.max.receiver.threads to
>     4096, but that is already the value on our cluster.
>     http://yaseminavcular.blogspot.de/2011/04/common-hadoop-hdfs-exceptions-with.html
>
>     The issue with too-small input files doesn't fit, I think, because
>     the map phase reads 137 files of 130 MB each. The block size is 128 MB.
>
>     The cluster uses version 2.0.0-cdh4.1.1,
>     581959ba23e4af85afd8db98b7687662fe9c5f20.
>
>     Thx
>
>
>
>
>
>
>
>
>
> -- 
> Nitin Pawar
>


Re: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space

Posted by Eduard Skaley <e....@gmail.com>.
It was installed through Cloudera Manager, and we took the default value 
for the per-reducer memory.

> Hello,
>
> I'm getting this Error through job execution:
>
> 16:20:26 INFO  [main]                     Job -  map 100% reduce 46%
> 16:20:27 INFO  [main]                     Job -  map 100% reduce 51%
> 16:20:29 INFO  [main]                     Job -  map 100% reduce 62%
> 16:20:30 INFO  [main]                     Job -  map 100% reduce 64%
> 16:20:32 INFO  [main]                     Job - Task Id : 
> attempt_1351680008718_0018_r_000006_0, Status : FAILED
> Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: 
> error in shuffle in fetcher#2
>     at 
> org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
> Caused by: java.lang.OutOfMemoryError: Java heap space
>     at 
> org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:58)
>     at 
> org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:45)
>     at 
> org.apache.hadoop.mapreduce.task.reduce.MapOutput.<init>(MapOutput.java:97)
>     at 
> org.apache.hadoop.mapreduce.task.reduce.MergeManager.unconditionalReserve(MergeManager.java:286)
>     at 
> org.apache.hadoop.mapreduce.task.reduce.MergeManager.reserve(MergeManager.java:276)
>     at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:384)
>     at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:319)
>     at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:179)
>
> 16:20:33 INFO  [main]                     Job -  map 100% reduce 65%
> 16:20:36 INFO  [main]                     Job -  map 100% reduce 67%
> 16:20:39 INFO  [main]                     Job -  map 100% reduce 69%
> 16:20:41 INFO  [main]                     Job -  map 100% reduce 70%
> 16:20:43 INFO  [main]                     Job -  map 100% reduce 71%
>
> I have no clue what the cause could be. I googled this issue and checked 
> several sources for possible solutions, but nothing fits.
>
> I saw this JIRA entry, which could fit: 
> https://issues.apache.org/jira/browse/MAPREDUCE-4655.
>
> Here somebody recommends increasing the property dfs.datanode.max.xcievers / 
> dfs.datanode.max.receiver.threads to 4096, but that is already the value on 
> our cluster.
> http://yaseminavcular.blogspot.de/2011/04/common-hadoop-hdfs-exceptions-with.html
>
> The issue with too-small input files doesn't fit, I think, because the map 
> phase reads 137 files of 130 MB each. The block size is 128 MB.
>
> The cluster uses version 2.0.0-cdh4.1.1, 
> 581959ba23e4af85afd8db98b7687662fe9c5f20.
>
> Thx


Re: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space

Posted by Nitin Pawar <ni...@gmail.com>.
Can you try increasing the memory per reducer?
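
For example (just a sketch; the driver class and numbers are placeholders, not from this thread): if the job driver goes through ToolRunner, the reducer memory can be raised per run from the command line, e.g. -Dmapreduce.reduce.memory.mb=2048 -Dmapreduce.reduce.java.opts=-Xmx1536m, without changing code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ExampleDriver extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() already contains any -D overrides that ToolRunner parsed
        // off the command line, e.g. -Dmapreduce.reduce.memory.mb=2048.
        Job job = Job.getInstance(getConf(), "example");
        // ... mapper/reducer/input/output wiring as usual ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new ExampleDriver(), args));
    }
}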


On Wed, Oct 31, 2012 at 9:15 PM, Eduard Skaley <e....@gmail.com> wrote:

>  Hello,
>
> I'm getting this Error through job execution:
>
> 16:20:26 INFO  [main]                     Job -  map 100% reduce 46%
> 16:20:27 INFO  [main]                     Job -  map 100% reduce 51%
> 16:20:29 INFO  [main]                     Job -  map 100% reduce 62%
> 16:20:30 INFO  [main]                     Job -  map 100% reduce 64%
> 16:20:32 INFO  [main]                     Job - Task Id :
> attempt_1351680008718_0018_r_000006_0, Status : FAILED
> Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error
> in shuffle in fetcher#2
>     at
> org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
> Caused by: java.lang.OutOfMemoryError: Java heap space
>     at
> org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:58)
>     at
> org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:45)
>     at
> org.apache.hadoop.mapreduce.task.reduce.MapOutput.<init>(MapOutput.java:97)
>     at
> org.apache.hadoop.mapreduce.task.reduce.MergeManager.unconditionalReserve(MergeManager.java:286)
>     at
> org.apache.hadoop.mapreduce.task.reduce.MergeManager.reserve(MergeManager.java:276)
>     at
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:384)
>     at
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:319)
>     at
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:179)
>
> 16:20:33 INFO  [main]                     Job -  map 100% reduce 65%
> 16:20:36 INFO  [main]                     Job -  map 100% reduce 67%
> 16:20:39 INFO  [main]                     Job -  map 100% reduce 69%
> 16:20:41 INFO  [main]                     Job -  map 100% reduce 70%
> 16:20:43 INFO  [main]                     Job -  map 100% reduce 71%
>
> I have no clue what the cause could be. I googled this issue and checked
> several sources for possible solutions, but nothing fits.
>
> I saw this JIRA entry, which could fit:
> https://issues.apache.org/jira/browse/MAPREDUCE-4655.
>
> Here somebody recommends increasing the property dfs.datanode.max.xcievers /
> dfs.datanode.max.receiver.threads to 4096, but that is already the value for
> our cluster.
>
> http://yaseminavcular.blogspot.de/2011/04/common-hadoop-hdfs-exceptions-with.html
>
> The issue with too-small input files doesn't fit, I think, because the map
> phase reads 137 files of 130 MB each. The block size is 128 MB.
>
> The cluster uses version 2.0.0-cdh4.1.1,
> 581959ba23e4af85afd8db98b7687662fe9c5f20.
>
> Thx
>
>
>
>
>
>
>


-- 
Nitin Pawar
