Posted to mapreduce-user@hadoop.apache.org by Mohit Anchlia <mo...@gmail.com> on 2014/07/01 21:23:32 UTC
Compressing map output
I am trying to compress map output, but when I add the following code I get
errors. Is there anything wrong that you can point me to?
conf.setBoolean("mapreduce.map.output.compress", true);
conf.setClass("mapreduce.map.output.compress.codec",
    GzipCodec.class, CompressionCodec.class);
14/07/01 22:21:47 INFO mapreduce.Job: Task Id : attempt_1404239414989_0008_r_000000_1, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
    at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:323)
    at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:245)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:347)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
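[Editor's note: the same two settings can also be applied outside the driver code, in mapred-site.xml. A minimal sketch, assuming the Hadoop 2.x property names used in the snippet above:]

```xml
<!-- mapred-site.xml fragment: compress intermediate map output with gzip.
     Property names match the mapreduce.* keys set programmatically above. -->
<property>
  <name>mapreduce.map.output.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.map.output.compress.codec</name>
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
```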
Re: Compressing map output
Posted by Mohit Anchlia <mo...@gmail.com>.
Yes, it goes away when I comment out the map output compression.
On Tue, Jul 1, 2014 at 6:38 PM, M. Dale <me...@yahoo.com> wrote:
> That looks right. Do you consistently get the error below and the total
> job fails? Does it go away when you comment out the map compression?
>
>
> On 07/01/2014 03:23 PM, Mohit Anchlia wrote:
>
> I am trying to compress map output, but when I add the following code I get
> errors. Is there anything wrong that you can point me to?
>
> conf.setBoolean("mapreduce.map.output.compress", true);
> conf.setClass("mapreduce.map.output.compress.codec",
>     GzipCodec.class, CompressionCodec.class);
>
> 14/07/01 22:21:47 INFO mapreduce.Job: Task Id : attempt_1404239414989_0008_r_000000_1, Status : FAILED
> Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
> at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
> at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:323)
> at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:245)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:347)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
Re: Compressing map output
Posted by "M. Dale" <me...@yahoo.com>.
That looks right. Do you consistently get the error below and the total
job fails? Does it go away when you comment out the map compression?
On 07/01/2014 03:23 PM, Mohit Anchlia wrote:
> I am trying to compress map output, but when I add the following code I
> get errors. Is there anything wrong that you can point me to?
>
> conf.setBoolean("mapreduce.map.output.compress", true);
> conf.setClass("mapreduce.map.output.compress.codec",
>     GzipCodec.class, CompressionCodec.class);
>
> 14/07/01 22:21:47 INFO mapreduce.Job: Task Id : attempt_1404239414989_0008_r_000000_1, Status : FAILED
> Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
> at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
> at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:323)
> at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:245)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:347)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)