Posted to user@mahout.apache.org by satish verma <sa...@gmail.com> on 2012/10/03 06:17:56 UTC

Heap Error: Mahout RowIdJob

I am trying to run the Mahout rowid command as follows:

./exp/mahout/current/bin/mahout rowid -i
/tmp/satish/t2c/60k/vectors_m/vectors_m/ -o
/tmp/satish/t2c/60k/matirx_73731_131093


I always get an OutOfMemoryError (Exception in thread "main" java.lang.OutOfMemoryError: Java
heap space) when I try larger inputs, such as 70,000 vectors, each of
size 130k.

I tried increasing the Mahout heap size in the bin/mahout script, but it does
not help. top shows that memory is not being over-utilized.
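
For reference, the change was roughly along these lines (a sketch only; I am
assuming here that the stock bin/mahout script honours the MAHOUT_HEAPSIZE
environment variable and turns it into an -Xmx setting for the JVM it launches):

  # value is in MB; bin/mahout converts it to -Xmx<value>m
  export MAHOUT_HEAPSIZE=4000
  ./exp/mahout/current/bin/mahout rowid \
    -i /tmp/satish/t2c/60k/vectors_m/vectors_m/ \
    -o /tmp/satish/t2c/60k/matirx_73731_131093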

How can I solve or debug this problem?



12/10/02 23:06:08 INFO common.AbstractJob: Command line arguments:
{--endPhase=[2147483647],
--input=[/tmp/satish/t2c/60k/vectors_m/vectors_m/],
--output=[/tmp/satish/t2c/60k/matirx_73731_131093], --startPhase=[0],
--tempDir=[temp]}
12/10/02 23:06:11 INFO util.NativeCodeLoader: Loaded the native-hadoop
library
12/10/02 23:06:11 INFO zlib.ZlibFactory: Successfully loaded & initialized
native-zlib library
12/10/02 23:06:11 INFO compress.CodecPool: Got brand-new compressor
12/10/02 23:06:28 INFO compress.CodecPool: Got brand-new compressor
12/10/02 23:06:33 INFO compress.CodecPool: Got brand-new decompressor
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.io.compress.DecompressorStream.<init>(DecompressorStream.java:43)
    at org.apache.hadoop.io.compress.DefaultCodec.createInputStream(DefaultCodec.java:71)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1520)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:58)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirIterator$1.apply(SequenceFileDirIterator.java:110)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirIterator$1.apply(SequenceFileDirIterator.java:106)
    at com.google.common.collect.Iterators$8.next(Iterators.java:765)
    at com.google.common.collect.Iterators$5.hasNext(Iterators.java:526)
    at com.google.common.collect.ForwardingIterator.hasNext(ForwardingIterator.java:43)
    at org.apache.mahout.utils.vectors.RowIdJob.run(RowIdJob.java:75)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.mahout.utils.vectors.RowIdJob.main(RowIdJob.java:98)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Re: Heap Error: Mahout RowIdJob

Posted by satish verma <sa...@gmail.com>.
I tried changing HADOOP_HEAPSIZE to 8000 from the default 1000, but I get the same error.
I have around 70k input vectors which I want to write to a matrix using
RowIdJob.
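
One way to confirm whether the bigger heap actually reaches the JVM that dies
(the bottom of the first trace is org.apache.hadoop.util.RunJar.main, i.e. the
JVM started by "hadoop jar") is to look at its flags while the job runs, for
example with jps from the JDK (a debugging sketch only):

  # list running JVMs with main class and JVM arguments; check the effective -Xmx
  jps -lvm | grep RunJar
  # alternative without jps
  ps -ef | grep RunJar | grep -o -- '-Xmx[^ ]*'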

12/10/03 22:42:19 INFO util.NativeCodeLoader: Loaded the native-hadoop
library
12/10/03 22:42:19 INFO zlib.ZlibFactory: Successfully loaded & initialized
native-zlib library
12/10/03 22:42:19 INFO compress.CodecPool: Got brand-new compressor
12/10/03 22:42:19 INFO compress.CodecPool: Got brand-new compressor
12/10/03 22:42:25 INFO compress.CodecPool: Got brand-new decompressor
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.io.compress.DecompressorStream.<init>(DecompressorStream.java:43)
    at org.apache.hadoop.io.compress.DefaultCodec.createInputStream(DefaultCodec.java:71)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1520)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)


On Thu, Oct 4, 2012 at 5:37 AM, Sarang Deshpande <sa...@shopzilla.com> wrote:

> Try changing HADOOP_HEAPSIZE in hadoop-env.sh to something bigger.
>
> ~Sarang
>

RE: Heap Error: Mahout RowIdJob

Posted by Sarang Deshpande <sa...@Shopzilla.com>.
Try changing HADOOP_HEAPSIZE in hadoop-env.sh to something bigger.
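
For example, something along these lines in conf/hadoop-env.sh (illustrative
values only; HADOOP_HEAPSIZE is in MB, and on some Hadoop versions
HADOOP_CLIENT_OPTS is the setting that actually reaches the "hadoop jar"
client JVM, so treat both as things to check for your setup):

  # conf/hadoop-env.sh
  export HADOOP_HEAPSIZE=4000
  export HADOOP_CLIENT_OPTS="-Xmx4g $HADOOP_CLIENT_OPTS"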

~Sarang

-----Original Message-----
From: satish verma [mailto:satish.bigdata@gmail.com] 
Sent: Tuesday, October 02, 2012 9:18 PM
To: user@mahout.apache.org
Subject: Heap Error: Mahout RowIdJob
