Posted to mapreduce-user@hadoop.apache.org by Mohit Anchlia <mo...@gmail.com> on 2014/07/01 21:23:32 UTC
Compressing map output
I am trying to compress map output, but when I add the following code I get
errors. Is there anything wrong that you can point me to?
conf.setBoolean("mapreduce.map.output.compress", true);
conf.setClass("mapreduce.map.output.compress.codec",
    GzipCodec.class, CompressionCodec.class);
14/07/01 22:21:47 INFO mapreduce.Job: Task Id : attempt_1404239414989_0008_r_000000_1, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
    at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:323)
    at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:245)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:347)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
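[Editor's note: the same two settings can also be applied outside the driver code, in mapred-site.xml. A minimal sketch, assuming the Hadoop 2.x property names used in the snippet above:]

```xml
<!-- mapred-site.xml fragment: compress intermediate map output with gzip.
     Property names match the mapreduce.* keys set programmatically above. -->
<property>
  <name>mapreduce.map.output.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.map.output.compress.codec</name>
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
```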
Re: Compressing map output
Posted by Mohit Anchlia <mo...@gmail.com>.
Yes, it goes away when I comment out the map output compression.
On Tue, Jul 1, 2014 at 6:38 PM, M. Dale <me...@yahoo.com> wrote:
> That looks right. Do you consistently get the error below and the total
> job fails? Does it go away when you comment out the map compression?
>
>
> On 07/01/2014 03:23 PM, Mohit Anchlia wrote:
>
> I am trying to compress map output, but when I add the following code I get
> errors. Is there anything wrong that you can point me to?
>
> conf.setBoolean("mapreduce.map.output.compress", true);
> conf.setClass("mapreduce.map.output.compress.codec",
>     GzipCodec.class, CompressionCodec.class);
>
> 14/07/01 22:21:47 INFO mapreduce.Job: Task Id : attempt_1404239414989_0008_r_000000_1, Status : FAILED
> Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
> at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
> at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:323)
> at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:245)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:347)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
Re: Compressing map output
Posted by "M. Dale" <me...@yahoo.com>.
That looks right. Do you consistently get the error below and the total
job fails? Does it go away when you comment out the map compression?
On 07/01/2014 03:23 PM, Mohit Anchlia wrote:
> I am trying to compress map output, but when I add the following code I
> get errors. Is there anything wrong that you can point me to?
>
> conf.setBoolean("mapreduce.map.output.compress", true);
> conf.setClass("mapreduce.map.output.compress.codec",
>     GzipCodec.class, CompressionCodec.class);
>
> 14/07/01 22:21:47 INFO mapreduce.Job: Task Id : attempt_1404239414989_0008_r_000000_1, Status : FAILED
> Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
> at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
> at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:323)
> at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:245)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:347)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)