Posted to common-user@hadoop.apache.org by xeonmailinglist-gmail <xe...@gmail.com> on 2015/04/05 15:45:55 UTC

compress data in hadoop

Hi,

I ran the command [1] to create compressed data from my Sequence 
files in the |/user/root/out1| dir, but I got the error [2]. 
How do I compress data in Hadoop?

[1]

|hadoop jar ./share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar -D mapreduce.job.reduces=30 -D mapred.output.compress=true -D mapred.output.compression.codec=com.hadoop.io.compression.BZip2Codec -D mapreduce.output.fileoutputformat.compress.type=BLOCK  -mapper /bin/cat -reducer /bin/cat  -input /user/root/out1 -output /user/root/outcompressed
|

[2]

|15/04/05 09:41:32 INFO mapreduce.Job: Task Id : attempt_1428165800289_0017_r_000004_0, Status : FAILED
Error: java.lang.IllegalArgumentException: Compression codec com.hadoop.io.compression.BZip2Codec was not found.
     at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:100)
     at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:126)
     at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.<init>(ReduceTask.java:484)
     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:414)
     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:416)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.io.compression.BZip2Codec not found
     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1980)
     at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:97)
     ... 9 more
|

​


Thanks,


Re: compress data in hadoop

Posted by Shahab Yunus <sh...@gmail.com>.
The package name of the codec in your command looks wrong.

Have you tried the following package and class?
org.apache.hadoop.io.compress.BZip2Codec

Regards,
Shahab
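
For reference, a minimal sketch of the corrected invocation, assuming the stock Apache Hadoop 2.6.0 jars are what the cluster runs. It is the command from the question with only the codec class name changed to the one suggested above. The script only prints the assembled command so it can be reviewed before running it.

```shell
# Codec class as shipped with Apache Hadoop (package org.apache.hadoop.io.compress).
codec="org.apache.hadoop.io.compress.BZip2Codec"

# Load the command into the positional parameters so every argument
# stays a separate word. All other options are unchanged from the question.
set -- hadoop jar ./share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar \
  -D mapreduce.job.reduces=30 \
  -D mapred.output.compress=true \
  -D mapred.output.compression.codec="$codec" \
  -D mapreduce.output.fileoutputformat.compress.type=BLOCK \
  -mapper /bin/cat \
  -reducer /bin/cat \
  -input /user/root/out1 \
  -output /user/root/outcompressed

# Print the command for review; replace this line with "$@" to actually run it.
printf '%s\n' "$*"
```

The mapred.output.compress and mapred.output.compression.codec names are the older property names; Hadoop 2 also accepts mapreduce.output.fileoutputformat.compress and mapreduce.output.fileoutputformat.compress.codec.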

On Sun, Apr 5, 2015 at 9:45 AM, xeonmailinglist-gmail <
xeonmailinglist@gmail.com> wrote:

>  Hi,
>
> I have run the command [1] to create compressed data from my Sequence
> files that are in the /user/root/out1 dir, but I got the error [2]. How do I
> compress data in Hadoop?
>
> [1]
>
> hadoop jar ./share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar -D mapreduce.job.reduces=30 -D mapred.output.compress=true -D mapred.output.compression.codec=com.hadoop.io.compression.BZip2Codec -D mapreduce.output.fileoutputformat.compress.type=BLOCK  -mapper /bin/cat -reducer /bin/cat  -input /user/root/out1 -output /user/root/outcompressed
>
> [2]
>
> 15/04/05 09:41:32 INFO mapreduce.Job: Task Id : attempt_1428165800289_0017_r_000004_0, Status : FAILED
> Error: java.lang.IllegalArgumentException: Compression codec com.hadoop.io.compression.BZip2Codec was not found.
>     at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:100)
>     at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:126)
>     at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.<init>(ReduceTask.java:484)
>     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:414)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:416)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.ClassNotFoundException: Class com.hadoop.io.compression.BZip2Codec not found
>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1980)
>     at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:97)
>     ... 9 more
>
> ​
>
> --
> --
>
> Thanks,
>
>
