Posted to common-issues@hadoop.apache.org by "Guo Ruijing (JIRA)" <ji...@apache.org> on 2013/12/31 09:32:50 UTC

[jira] [Updated] (HADOOP-10196) Bzip2Codec Compress cannot work

     [ https://issues.apache.org/jira/browse/HADOOP-10196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guo Ruijing updated HADOOP-10196:
---------------------------------

    Summary: Bzip2Codec Compress cannot work  (was: Bzip2Codec Uncompress cannot work)

> Bzip2Codec Compress cannot work
> -------------------------------
>
>                 Key: HADOOP-10196
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10196
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 2.0.2-alpha, 2.0.5-alpha, 2.2.0
>            Reporter: Guo Ruijing
>
> Bzip2Codec compression does not work (decompression works).
> 1. Compress Sample file:
> [hadoop@localhost ~]$ cat StreamCompressor.java
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.io.compress.CompressionOutputStream;
> import org.apache.hadoop.io.compress.CompressionCodec;
> import org.apache.hadoop.io.IOUtils;
> import org.apache.hadoop.util.ReflectionUtils;
> public class StreamCompressor {
>     public static void main(String[] args) throws Exception {
>         String codecClassname = args[0];
>         Class<?> codecClass = Class.forName(codecClassname);
>         Configuration conf = new Configuration();
>         CompressionCodec codec = (CompressionCodec) ReflectionUtils.newInstance(codecClass, conf);
>         CompressionOutputStream out = codec.createOutputStream(System.out);
>         IOUtils.copyBytes(System.in, out, 4096, false);
>         out.finish();
>     }
> }
> 2. Uncompress Sample file:
> [hadoop@localhost ~]$ cat StreamUncompressor.java
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.io.compress.CompressionOutputStream;
> import org.apache.hadoop.io.compress.CompressionInputStream;
> import org.apache.hadoop.io.compress.CompressionCodec;
> import org.apache.hadoop.io.IOUtils;
> import org.apache.hadoop.util.ReflectionUtils;
> public class StreamUncompressor {
>     public static void main(String[] args) throws Exception {
>         String codecClassname = args[0];
>         Class<?> codecClass = Class.forName(codecClassname);
>         Configuration conf = new Configuration();
>         CompressionCodec codec = (CompressionCodec) ReflectionUtils.newInstance(codecClass, conf);
>         CompressionInputStream in = codec.createInputStream(System.in);
>         IOUtils.copyBytes(in, System.out, 4096, false);
>         in.close();
>     }
> }
> 3. How to compile/run
> 1) javac -classpath /usr/lib/gphd/hadoop/hadoop-common-2.0.5-alpha-gphd-2.1.1.0.jar StreamCompressor.java
> 2) javac -classpath /usr/lib/gphd/hadoop/hadoop-common-2.0.5-alpha-gphd-2.1.1.0.jar StreamUncompressor.java
> 3) jar -cvf Stream.jar StreamCompressor.class StreamUncompressor.class
> 4) rm -rf /tmp/my.txt.bz2 && echo abc > /tmp/my.txt && bzip2 /tmp/my.txt && cat /tmp/my.txt.bz2 | hadoop jar ./Stream.jar StreamUncompressor org.apache.hadoop.io.compress.BZip2Codec
> 5) echo "text" | hadoop jar ./Stream.jar StreamCompressor org.apache.hadoop.io.compress.BZip2Codec | bzcat
> 4. Test Result
> In this test, the native bzip2 library fails to load and Hadoop falls back to the pure-Java version. Decompression works, but compression does not produce valid output.
> 1) Hadoop supports bzip2 decompression.
> rm -rf /tmp/my.txt.bz2 && echo abc > /tmp/my.txt && bzip2 /tmp/my.txt && cat /tmp/my.txt.bz2 | hadoop jar ./Stream.jar StreamUncompressor org.apache.hadoop.io.compress.BZip2Codec
> 13/12/17 03:58:20 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version
> abc <<< expected
> 2) bzip2 compression does not work, as follows:
> a) [hadoop@localhost hadoop]$ echo "text" | hadoop jar ./Stream.jar StreamCompressor org.apache.hadoop.io.compress.BZip2Codec
> 13/12/17 04:00:59 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version
> BZ <<<<< not expected
> b) [hadoop@localhost hadoop]$ echo "text" | hadoop jar ./Stream.jar StreamCompressor org.apache.hadoop.io.compress.BZip2Codec | bzcat
> 13/12/17 04:01:31 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version
> bzcat: Compressed file ends unexpectedly;
> perhaps it is corrupted? Possible reason follows.
> bzcat: Invalid argument
> Input file = (stdin), output file = (stdout)
> It is possible that the compressed file(s) have become corrupted.
> You can use the -tvv option to test integrity of such files.
> You can use the `bzip2recover' program to attempt to recover
> data from undamaged sections of corrupted files.
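
One possible explanation worth ruling out (a hypothesis, not confirmed anywhere in this issue): System.out is a buffered PrintStream, and StreamCompressor calls out.finish() without ever flushing or closing it. The stdlib-only Java sketch below uses no Hadoop classes. The class name BufferedSinkDemo and the byte values are made up for illustration, and the 128-byte buffer and the write pattern (one two-byte array write followed by single-byte writes) are assumptions about how the JVM and the codec behave. It only shows that, with that pattern, an auto-flushing PrintStream passes the array write through immediately but holds single non-newline bytes until flush(), which would look like the bare "BZ" seen above.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class BufferedSinkDemo {
    // Returns how many bytes have actually reached the underlying sink.
    static int bytesReachingSink(boolean flushAtEnd) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Assumption: stdout behaves like an auto-flushing PrintStream
        // over a small BufferedOutputStream.
        PrintStream stdoutLike =
            new PrintStream(new BufferedOutputStream(sink, 128), true);

        // Array write: an auto-flushing PrintStream flushes after it.
        byte[] magic = {'B', 'Z'};
        stdoutLike.write(magic, 0, magic.length);

        // Single-byte writes: flushed only when the byte is '\n'.
        // These ten bytes ('A'..'J') are placeholders, not real bzip2 data.
        for (int i = 0; i < 10; i++) {
            stdoutLike.write(0x41 + i);
        }

        if (flushAtEnd) {
            stdoutLike.flush();
        }
        return sink.size();
    }

    public static void main(String[] args) {
        System.out.println("without flush: " + bytesReachingSink(false)); // 2
        System.out.println("with flush:    " + bytesReachingSink(true));  // 12
    }
}
```

If this is what happens in the report, adding System.out.flush() after out.finish() in StreamCompressor should change the output of step 5). That remains to be checked against the real codec.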



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)