Posted to hdfs-user@hadoop.apache.org by xiao li <xe...@outlook.com> on 2013/12/13 02:23:13 UTC

hadoop fs -text OutOfMemoryError

I could view the snappy file with hadoop fs -cat, but when I issue -text it gives me this error even though the file size is really tiny. What have I done wrong? Thanks
hadoop fs -text /test/SinkToHDFS-ip-.us-west-2.compute.internal-6703-22-20131212-0.snappy
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:115)
    at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:95)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:83)
    at java.io.InputStream.read(InputStream.java:82)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:78)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
    at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:86)
    at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:81)
    at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
    at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
    at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
    at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
    at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
    at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
    at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:304)

Re: hadoop fs -text OutOfMemoryError

Posted by Adam Kawa <ka...@gmail.com>.
Since snappy is a non-splittable format (to decompress a snappy file, you
need to read it from the beginning to the end), does the *append* operation
handle it well on a plain text file? I guess that it might be problematic.

Snappy is recommended for use with a container format, like Sequence Files
or Avro, rather than directly on plain text, because a plain text file
compressed by Snappy cannot be processed in parallel.
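
For illustration, a minimal sketch of that container approach, assuming a block-compressed SequenceFile written with Hadoop's SnappyCodec (the output path and the key/value types below are made up for the example):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class SnappySequenceFileWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/test/events.seq"); // hypothetical output path

        // Instantiate the codec through ReflectionUtils so it picks up the conf
        // (and the native snappy library installed on the cluster).
        CompressionCodec codec = ReflectionUtils.newInstance(SnappyCodec.class, conf);

        // Block-compressed SequenceFile: the container keeps sync markers, so the
        // output stays splittable and is readable by "hadoop fs -text".
        SequenceFile.Writer writer = SequenceFile.createWriter(
                fs, conf, path, LongWritable.class, Text.class,
                SequenceFile.CompressionType.BLOCK, codec);
        try {
            writer.append(new LongWritable(System.currentTimeMillis()),
                    new Text("{\"example\":\"json record\"}"));
        } finally {
            writer.close();
        }
    }
}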


2013/12/14 Tao Xiao <xi...@gmail.com>

> hi xiao li,
>    you said "Basically, what I need is a Storm HDFS Bolt to be able to
> write output to hdfs file, in order to get less small files, i use hdfs
> append". Did you configue the "append" property in your configuration file?
> you can search for "append" related issues first
>
>
> 2013/12/14 xiao li <xe...@outlook.com>
>
>> export HADOOP_CLIENT_OPTS="-Xms268435456 -Xmx268435456
>> $HADOOP_CLIENT_OPTS"
>>
>>
>>
>> I guess it is not the memory issue, just the way how i write the snappy
>> compress file to hdfs.
>> Basically, what I need is a Storm HDFS Bolt to be able to write output to
>> hdfs file, in order to get less small files, i use hdfs append.
>>
>> Well I just can't get snappy working or write compressed files to hdfs
>> through Java.
>>
>> I am looking at the flume hdfs sink to get better code. ; )
>>
>>
>> https://github.com/cloudera/flume-ng/blob/cdh4-1.1.0_4.0.0/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java
>>
>> ------------------------------
>> Date: Fri, 13 Dec 2013 22:24:21 +0100
>>
>> Subject: Re: hadoop fs -text OutOfMemoryError
>> From: kawa.adam@gmail.com
>> To: user@hadoop.apache.org
>>
>>
>> Hi,
>>
>> What is the value of HADOOP_CLIENT_OPTS in you hadoop-env.sh file?
>>
>> We had similar problems with running OOM with hadoop fs command (I do not
>> remember if they were exactly related to -text + snappy), when we decreased
>> the heap to some small value. With higher value e.g. 1 or 2 GB, we were
>> fine:
>>
>> # The following applies to multiple commands (fs, dfs, fsck, distcp etc)
>> export HADOOP_CLIENT_OPTS="-Xmx2048m ${HADOOP_CLIENT_OPTS}"
>>
>>
>> 2013/12/13 xiao li <xe...@outlook.com>
>>
>> Hi Tao
>>
>> Thanks for your reply,
>>
>> This is the code, it is pretty simple.
>>
>> '
>>                     fsDataOutputStream.write(Snappy.compress(NEWLINE));
>>                     fsDataOutputStream
>> .write(Snappy.compress(json.getBytes("UTF-8")));'
>>
>>
>> but FSDataOutputStream is actually opened for appending, I guess the I
>> can't simply append to the snappy file(know nothing about it.)
>>
>>
>>
>> ------------------------------
>> Date: Fri, 13 Dec 2013 21:42:38 +0800
>> Subject: Re: hadoop fs -text OutOfMemoryError
>> From: xiaotao.cs.nju@gmail.com
>> To: user@hadoop.apache.org
>>
>>
>> can you describe your problems in more details, for example, was snappy
>> library installed correctly in your cluster, how did you code yout files
>> with snappy, was your file correctly coded with snappy ?
>>
>>
>> 2013/12/13 xiao li <xe...@outlook.com>
>>
>> I could view the snappy file with hadoop fs -cat but when i issue the
>> -text, it gives me this error though the file size is really tiny. what
>> have i done wrong? Thanks
>>
>> hadoop fs -text /test/SinkToHDFS-ip-.us-west-2.compute.internal-6703-22-
>> 20131212-0.snappy
>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>>  at org.apache.hadoop.io.compress.BlockDecompressorStream.
>> getCompressedData(BlockDecompressorStream.java:115)
>>  at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(
>> BlockDecompressorStream.java:95)
>>  at org.apache.hadoop.io.compress.DecompressorStream.read(
>> DecompressorStream.java:83)
>>  at java.io.InputStream.read(InputStream.java:82)
>> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:78)
>>  at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
>>  at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
>> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:86)
>>  at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:81)
>>  at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
>> at org.apache.hadoop.fs.shell.Command.processPathArgument(
>> Command.java:278)
>>  at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
>>  at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
>> at org.apache.hadoop.fs.shell.Command.processRawArguments(
>> Command.java:190)
>>  at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
>>  at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>>  at org.apache.hadoop.fs.FsShell.main(FsShell.java:304)
>>
>>
>>
>>
>

Re: hadoop fs -text OutOfMemoryError

Posted by Tao Xiao <xi...@gmail.com>.
Hi xiao li,
   you said "Basically, what I need is a Storm HDFS Bolt to be able to
write output to hdfs file, in order to get less small files, i use hdfs
append". Did you configure the "append" property in your configuration file?
You can search for "append"-related issues first.
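
As a rough sketch of an append-style write (assuming the property meant here is dfs.support.append, which older releases required to be enabled explicitly; the path is hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsAppendExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Older Hadoop/CDH releases gated append behind a flag; the exact property
        // name is an assumption here, check the version you run.
        conf.setBoolean("dfs.support.append", true);

        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/test/storm-output.txt"); // hypothetical file

        // Append to the file if it already exists, otherwise create it.
        FSDataOutputStream out = fs.exists(path) ? fs.append(path) : fs.create(path);
        try {
            out.write("one json record per line\n".getBytes("UTF-8"));
            out.hflush(); // flush to the pipeline so readers can see the data
        } finally {
            out.close();
        }
    }
}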


2013/12/14 xiao li <xe...@outlook.com>

> export HADOOP_CLIENT_OPTS="-Xms268435456 -Xmx268435456 $HADOOP_CLIENT_OPTS"
>
>
>
> I guess it is not the memory issue, just the way how i write the snappy
> compress file to hdfs.
> Basically, what I need is a Storm HDFS Bolt to be able to write output to
> hdfs file, in order to get less small files, i use hdfs append.
>
> Well I just can't get snappy working or write compressed files to hdfs
> through Java.
>
> I am looking at the flume hdfs sink to get better code. ; )
>
>
> https://github.com/cloudera/flume-ng/blob/cdh4-1.1.0_4.0.0/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java
>
> ------------------------------
> Date: Fri, 13 Dec 2013 22:24:21 +0100
>
> Subject: Re: hadoop fs -text OutOfMemoryError
> From: kawa.adam@gmail.com
> To: user@hadoop.apache.org
>
>
> Hi,
>
> What is the value of HADOOP_CLIENT_OPTS in you hadoop-env.sh file?
>
> We had similar problems with running OOM with hadoop fs command (I do not
> remember if they were exactly related to -text + snappy), when we decreased
> the heap to some small value. With higher value e.g. 1 or 2 GB, we were
> fine:
>
> # The following applies to multiple commands (fs, dfs, fsck, distcp etc)
> export HADOOP_CLIENT_OPTS="-Xmx2048m ${HADOOP_CLIENT_OPTS}"
>
>
> 2013/12/13 xiao li <xe...@outlook.com>
>
> Hi Tao
>
> Thanks for your reply,
>
> This is the code, it is pretty simple.
>
> '
>                     fsDataOutputStream.write(Snappy.compress(NEWLINE));
>                     fsDataOutputStream
> .write(Snappy.compress(json.getBytes("UTF-8")));'
>
>
> but FSDataOutputStream is actually opened for appending, I guess the I
> can't simply append to the snappy file(know nothing about it.)
>
>
>
> ------------------------------
> Date: Fri, 13 Dec 2013 21:42:38 +0800
> Subject: Re: hadoop fs -text OutOfMemoryError
> From: xiaotao.cs.nju@gmail.com
> To: user@hadoop.apache.org
>
>
> can you describe your problems in more details, for example, was snappy
> library installed correctly in your cluster, how did you code yout files
> with snappy, was your file correctly coded with snappy ?
>
>
> 2013/12/13 xiao li <xe...@outlook.com>
>
> I could view the snappy file with hadoop fs -cat but when i issue the
> -text, it gives me this error though the file size is really tiny. what
> have i done wrong? Thanks
>
> hadoop fs -text /test/SinkToHDFS-ip-.us-west-2.compute.internal-6703-22-
> 20131212-0.snappy
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>  at org.apache.hadoop.io.compress.BlockDecompressorStream.
> getCompressedData(BlockDecompressorStream.java:115)
>  at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(
> BlockDecompressorStream.java:95)
>  at org.apache.hadoop.io.compress.DecompressorStream.read(
> DecompressorStream.java:83)
>  at java.io.InputStream.read(InputStream.java:82)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:78)
>  at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
>  at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:86)
>  at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:81)
>  at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
> at org.apache.hadoop.fs.shell.Command.processPathArgument(
> Command.java:278)
>  at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
>  at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
> at org.apache.hadoop.fs.shell.Command.processRawArguments(
> Command.java:190)
>  at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
>  at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>  at org.apache.hadoop.fs.FsShell.main(FsShell.java:304)
>
>
>
>

RE: hadoop fs -text OutOfMemoryError

Posted by xiao li <xe...@outlook.com>.
export HADOOP_CLIENT_OPTS="-Xms268435456 -Xmx268435456 $HADOOP_CLIENT_OPTS"

I guess it is not a memory issue, but rather the way I write the snappy-compressed file to HDFS. Basically, what I need is a Storm HDFS Bolt that can write output to an HDFS file; to get fewer small files, I use HDFS append.
Well, I just can't get snappy working or write compressed files to HDFS through Java.
I am looking at the Flume HDFS sink to get better code. ; )
https://github.com/cloudera/flume-ng/blob/cdh4-1.1.0_4.0.0/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java
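
Roughly what that Flume class does, as a simplified sketch rather than the actual Flume code: wrap the raw FSDataOutputStream in the Hadoop codec's own output stream instead of calling Snappy.compress() per record, so the on-disk framing matches what hadoop fs -text expects (the path below is made up; SnappyCodec needs the native snappy library on the machine):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class CompressedHdfsWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/test/SinkToHDFS-example.snappy"); // hypothetical path

        CompressionCodec codec = ReflectionUtils.newInstance(SnappyCodec.class, conf);

        FSDataOutputStream raw = fs.create(path);
        // Let the Hadoop codec frame the data (block headers and lengths) instead of
        // writing raw Snappy.compress() output; fs -text reads it back with the same codec.
        CompressionOutputStream out = codec.createOutputStream(raw);
        try {
            out.write("{\"example\":\"json record\"}\n".getBytes("UTF-8"));
        } finally {
            out.close(); // finishes the compressed stream and closes the HDFS stream underneath
        }
    }
}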

Date: Fri, 13 Dec 2013 22:24:21 +0100
Subject: Re: hadoop fs -text OutOfMemoryError
From: kawa.adam@gmail.com
To: user@hadoop.apache.org

Hi,
What is the value of HADOOP_CLIENT_OPTS in you hadoop-env.sh file?
We had similar problems with running OOM with hadoop fs command (I do not remember if they were exactly related to -text + snappy), when we decreased the heap to some small value. With higher value e.g. 1 or 2 GB, we were fine:

# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx2048m ${HADOOP_CLIENT_OPTS}"



2013/12/13 xiao li <xe...@outlook.com>




Hi Tao
Thanks for your reply, 
This is the code, it is pretty simple.
'                    fsDataOutputStream.write(Snappy.compress(NEWLINE));
                    fsDataOutputStream.write(Snappy.compress(json.getBytes("UTF-8")));'


but FSDataOutputStream is actually opened for appending, I guess the I can't simply append to the snappy file(know nothing about it.)



Date: Fri, 13 Dec 2013 21:42:38 +0800
Subject: Re: hadoop fs -text OutOfMemoryError
From: xiaotao.cs.nju@gmail.com
To: user@hadoop.apache.org


can you describe your problems in more details, for example, was snappy library installed correctly in your cluster, how did you code yout files with snappy, was your file correctly coded with snappy ?




2013/12/13 xiao li <xe...@outlook.com>




I could view the snappy file with hadoop fs -cat but when i issue the -text, it gives me this error though the file size is really tiny. what have i done wrong? Thanks 


hadoop fs -text /test/SinkToHDFS-ip-.us-west-2.compute.internal-6703-22-20131212-0.snappy
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:115)
    at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:95)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:83)
    at java.io.InputStream.read(InputStream.java:82)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:78)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
    at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:86)
    at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:81)
    at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
    at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
    at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
    at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
    at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
    at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
    at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:304)

Re: hadoop fs -text OutOfMemoryError

Posted by Adam Kawa <ka...@gmail.com>.
Hi,

What is the value of HADOOP_CLIENT_OPTS in your hadoop-env.sh file?

We had similar problems with the hadoop fs command running OOM (I do not
remember if they were exactly related to -text + snappy) when we had decreased
the heap to some small value. With a higher value, e.g. 1 or 2 GB, we were
fine:

# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx2048m ${HADOOP_CLIENT_OPTS}"


2013/12/13 xiao li <xe...@outlook.com>

> Hi Tao
>
> Thanks for your reply,
>
> This is the code, it is pretty simple.
>
> '
>                     fsDataOutputStream.write(Snappy.compress(NEWLINE));
>                     fsDataOutputStream
> .write(Snappy.compress(json.getBytes("UTF-8")));'
>
>
> but FSDataOutputStream is actually opened for appending, I guess the I
> can't simply append to the snappy file(know nothing about it.)
>
>
>
> ------------------------------
> Date: Fri, 13 Dec 2013 21:42:38 +0800
> Subject: Re: hadoop fs -text OutOfMemoryError
> From: xiaotao.cs.nju@gmail.com
> To: user@hadoop.apache.org
>
>
> can you describe your problems in more details, for example, was snappy
> library installed correctly in your cluster, how did you code yout files
> with snappy, was your file correctly coded with snappy ?
>
>
> 2013/12/13 xiao li <xe...@outlook.com>
>
> I could view the snappy file with hadoop fs -cat but when i issue the
> -text, it gives me this error though the file size is really tiny. what
> have i done wrong? Thanks
>
> hadoop fs -text /test/SinkToHDFS-ip-.us-west-2.compute.internal-6703-22-
> 20131212-0.snappy
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>  at org.apache.hadoop.io.compress.BlockDecompressorStream.
> getCompressedData(BlockDecompressorStream.java:115)
>  at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(
> BlockDecompressorStream.java:95)
>  at org.apache.hadoop.io.compress.DecompressorStream.read(
> DecompressorStream.java:83)
>  at java.io.InputStream.read(InputStream.java:82)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:78)
>  at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
>  at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:86)
>  at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:81)
>  at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
> at org.apache.hadoop.fs.shell.Command.processPathArgument(
> Command.java:278)
>  at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
>  at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
> at org.apache.hadoop.fs.shell.Command.processRawArguments(
> Command.java:190)
>  at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
>  at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>  at org.apache.hadoop.fs.FsShell.main(FsShell.java:304)
>
>
>

RE: hadoop fs -text OutOfMemoryError

Posted by xiao li <xe...@outlook.com>.
Hi Tao
Thanks for your reply.
This is the code; it is pretty simple:
                    fsDataOutputStream.write(Snappy.compress(NEWLINE));
                    fsDataOutputStream.write(Snappy.compress(json.getBytes("UTF-8")));


but the FSDataOutputStream is actually opened for appending, so I guess I can't simply append to the snappy file (I know nothing about it).
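
For what it is worth, hadoop fs -text apparently resolves the .snappy
extension to Hadoop's SnappyCodec (that is the BlockDecompressorStream in the
stack trace), which expects block-framed data with 4-byte length prefixes.
Raw Snappy.compress output appended record by record does not follow that
framing, so the first bytes are presumably misread as a huge block length,
which would explain the allocation that blows the heap. Below is a minimal
sketch of writing through the Hadoop codec instead, so that -text can read
the file back; the path and the JSON record are placeholders, and it assumes
the native snappy libraries are available on the client:

    import java.io.OutputStream;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.CompressionCodecFactory;

    public class SnappyHdfsWriter {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Placeholder path; only the .snappy suffix matters here.
            Path path = new Path("/test/example-output.snappy");

            // Resolve the codec from the file extension, the same way
            // "hadoop fs -text" does when it reads the file back.
            // getCodec() returns null if no codec matches the suffix.
            CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(path);

            FSDataOutputStream raw = fs.create(path);
            // Wrap the HDFS stream so every record goes through one
            // block-framed Snappy stream instead of per-record Snappy.compress().
            OutputStream out = codec.createOutputStream(raw);

            String json = "{\"example\": true}"; // placeholder record
            out.write(json.getBytes("UTF-8"));
            out.write('\n');

            // Closing the codec stream flushes the last block and closes
            // the underlying FSDataOutputStream as well.
            out.close();
        }
    }

This sketch writes a new file rather than appending; whether appending more
block-framed data to an already closed .snappy file is safe is a separate
question.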


Date: Fri, 13 Dec 2013 21:42:38 +0800
Subject: Re: hadoop fs -text OutOfMemoryError
From: xiaotao.cs.nju@gmail.com
To: user@hadoop.apache.org

Can you describe your problem in more detail? For example, was the snappy library installed correctly on your cluster, how did you encode your files with snappy, and were they correctly encoded with snappy?



2013/12/13 xiao li <xe...@outlook.com>




I could view the snappy file with hadoop fs -cat, but when I issue -text it gives me this error even though the file is really tiny. What have I done wrong? Thanks

hadoop fs -text /test/SinkToHDFS-ip-.us-west-2.compute.internal-6703-22-20131212-0.snappy
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:115)
	at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:95)
	at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:83)
	at java.io.InputStream.read(InputStream.java:82)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:78)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
	at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:86)
	at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:81)
	at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
	at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
	at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
	at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
	at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
	at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
	at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.hadoop.fs.FsShell.main(FsShell.java:304)

Re: hadoop fs -text OutOfMemoryError

Posted by Tao Xiao <xi...@gmail.com>.
Can you describe your problem in more detail? For example, was the snappy
library installed correctly on your cluster, how did you encode your files
with snappy, and were they correctly encoded with snappy?


2013/12/13 xiao li <xe...@outlook.com>

> I could view the snappy file with hadoop fs -cat, but when I issue -text it
> gives me this error even though the file is really tiny. What have I done
> wrong? Thanks
>
> hadoop fs -text /test/SinkToHDFS-ip-.us-west-2.compute.internal-6703-22-
> 20131212-0.snappy
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> at org.apache.hadoop.io.compress.BlockDecompressorStream.
> getCompressedData(BlockDecompressorStream.java:115)
> at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(
> BlockDecompressorStream.java:95)
> at org.apache.hadoop.io.compress.DecompressorStream.read(
> DecompressorStream.java:83)
> at java.io.InputStream.read(InputStream.java:82)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:78)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:86)
> at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:81)
> at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
> at org.apache.hadoop.fs.shell.Command.processPathArgument(
> Command.java:278)
> at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
> at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
> at org.apache.hadoop.fs.shell.Command.processRawArguments(
> Command.java:190)
> at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
> at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> at org.apache.hadoop.fs.FsShell.main(FsShell.java:304)
>
