Posted to user@mahout.apache.org by pricila rr <pr...@gmail.com> on 2012/08/01 14:29:26 UTC

Re: ERROR: OutOfMemoryError: Java heap space

Even after changing the memory settings, the error continues. What else can I do?
And how can I split the file? The error does not occur with smaller files.
I'm using Mahout and Hadoop on Linux machines, with one master and two
slaves.

Thank you.

2012/7/28 Anandha L Ranganathan <an...@gmail.com>

> You can also increase the memory using a command-line parameter.
>
>  <JAR FILE> -D mapred.child.java.opts=-Xmx2048M <INPUT_PARAMETER>
>
> On Sat, Jul 28, 2012 at 8:30 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:
>
> > I've been changing the hadoop/conf/mapred-site.xml :
> >
> >   <property>
> >      <name>mapred.child.java.opts</name>
> >      <value>-Xmx2048m</value>
> >      <description>map heap size for child task</description>
> >   </property>
> >
> > This ups the task heap to 2G.
> >
> >
> > On 7/26/12 7:12 PM, Lance Norskog wrote:
> >
> >> Increase the memory size or split the file!
> >>
> >> On Thu, Jul 26, 2012 at 5:37 AM, pricila rr <pr...@gmail.com>
> wrote:
> >>
> >>> I'm trying to convert a 1 GB .txt file to a sequence file, and this
> >>> error occurs: OutOfMemoryError: Java heap space
> >>> How can I solve it?
> >>> I am using Hadoop and Mahout.
> >>>
> >>> $MAHOUT_HOME/bin/mahout seqdirectory --input '/home/usuario/Área de
> >>> Trabalho/Dados/base1.txt' --output '/home/usuario/Área de
> >>> Trabalho/seqFile/base1File' -c UTF-8
> >>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> >>> Warning: $HADOOP_HOME is deprecated.
> >>>
> >>> Running on hadoop, using /home/usuario/hadoop/bin/hadoop and
> >>> HADOOP_CONF_DIR=/home/usuario/hadoop/conf
> >>> MAHOUT-JOB:
> >>> /home/usuario/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar
> >>> Warning: $HADOOP_HOME is deprecated.
> >>>
> >>> 12/07/26 09:18:28 INFO common.AbstractJob: Command line arguments:
> >>> {--charset=[UTF-8], --chunkSize=[64], --endPhase=[2147483647],
> >>> --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter],
> >>> --input=[/home/usuario/Área de Trabalho/Dados/base1.txt], --keyPrefix=[],
> >>> --output=[/home/usuario/Área de Trabalho/seqFile/base1File],
> >>> --startPhase=[0], --tempDir=[temp]}
> >>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> >>> at java.util.Arrays.copyOf(Arrays.java:2882)
> >>> at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
> >>> at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
> >>> at java.lang.StringBuilder.append(StringBuilder.java:119)
> >>> at org.apache.mahout.text.PrefixAdditionFilter.process(PrefixAdditionFilter.java:62)
> >>> at org.apache.mahout.text.SequenceFilesFromDirectoryFilter.accept(SequenceFilesFromDirectoryFilter.java:90)
> >>> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:845)
> >>> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:867)
> >>> at org.apache.mahout.text.SequenceFilesFromDirectory.run(SequenceFilesFromDirectory.java:98)
> >>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>> at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:53)
> >>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>> at java.lang.reflect.Method.invoke(Method.java:597)
> >>> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> >>> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> >>> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
> >>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>> at java.lang.reflect.Method.invoke(Method.java:597)
> >>> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> >>>
> >>
> >>
> >>
> >
>
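
A minimal sketch of the command-line form suggested in the quoted thread,
assuming the Mahout driver accepts Hadoop generic options through ToolRunner
(the stack trace above suggests it does); the paths are placeholders, not the
ones from this thread:

    # Pass the child-task heap as a Hadoop generic option, equivalent to the
    # mapred-site.xml property quoted above (Hadoop 1.x property name).
    $MAHOUT_HOME/bin/mahout seqdirectory \
        -Dmapred.child.java.opts=-Xmx2048m \
        --input /path/to/input-dir \
        --output /path/to/output-seqfiles \
        -c UTF-8

Note that the stack trace above shows the OutOfMemoryError in the driver's
main thread rather than in a map or reduce task, so the child-task heap may
not be the setting that matters here; the sketch only illustrates the syntax
being suggested.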

Re: ERROR: OutOfMemoryError: Java heap space

Posted by Lance Norskog <go...@gmail.com>.
If you are on Unix and you want to split your text on line boundaries,
the 'split' program will break it into many files, each with the same
number of lines.
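
For example, a minimal sketch of that approach (the file name and line count
are illustrative placeholders, not taken from this thread):

    # Break the large text file into pieces of 100000 lines each, on line
    # boundaries; pieces are named base1-part-aa, base1-part-ab, and so on.
    split -l 100000 base1.txt base1-part-
    # Move the pieces into a directory and point seqdirectory's --input at it.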

On Wed, Aug 1, 2012 at 5:29 AM, pricila rr <pr...@gmail.com> wrote:
> Even after changing the memory settings, the error continues. What else can I do?
> And how can I split the file? The error does not occur with smaller files.
> I'm using Mahout and Hadoop on Linux machines, with one master and two
> slaves.
>
> Thank you.



-- 
Lance Norskog
goksron@gmail.com