Posted to user@mahout.apache.org by Mohammed Al khooja <mk...@gmail.com> on 2011/11/15 23:12:14 UTC

Mahout heap out of space

Hi,

I'm running Mahout LDA on a cluster, but I'm getting a Java out-of-memory
exception (below). How do I increase the MAHOUT_HEAP variable in
bin\mahout.sh? When I add the line MAHOUT_HEAP=2000, nothing seems to
change. Do I have to reload or rebuild anything for the change to take
effect?

Thanks.

2011-11-15 16:59:39,697 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2011-11-15 16:59:39,836 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2011-11-15 16:59:39,888 WARN org.apache.hadoop.conf.Configuration: /scratch1/hadoop/mapred/local/taskTracker/mkhooja/jobcache/job_201110181830_3456/job.xml:a attempt to override final parameter: dfs.data.dir; Ignoring.
2011-11-15 16:59:39,956 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
2011-11-15 16:59:40,001 INFO org.apache.hadoop.mapred.MapTask: data buffer = 79691776/99614720
2011-11-15 16:59:40,001 INFO org.apache.hadoop.mapred.MapTask: record buffer = 262144/327680
2011-11-15 16:59:48,863 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2011-11-15 16:59:48,866 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.DenseMatrix.<init>(DenseMatrix.java:51)
    at org.apache.mahout.clustering.lda.LDAInference.createPhiMatrix(LDAInference.java:154)
    at org.apache.mahout.clustering.lda.LDAInference.infer(LDAInference.java:93)
    at org.apache.mahout.clustering.lda.LDAMapper.map(LDAMapper.java:48)
    at org.apache.mahout.clustering.lda.LDAMapper.map(LDAMapper.java:36)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)

-- 

M.khouja

Re: Mahout heap out of space

Posted by Mohammed Al khooja <mk...@gmail.com>.
Thanks Jake,

I have 340,000 terms over 20 topics.
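
For a rough sense of scale, the figures in this thread can be turned into a heap estimate. The sketch below is only back-of-the-envelope arithmetic: it assumes the job holds at least one dense topics-by-terms matrix of 8-byte doubles (the stack trace shows a DenseMatrix allocation, but not its dimensions), and the class and method names are hypothetical. The io.sort.mb value comes from the task log above.

```java
// Hypothetical helper: rough heap arithmetic for a dense matrix of doubles.
class LdaHeapEstimate {

    // Bytes for the values of a dense rows x cols matrix of doubles.
    // Object headers and per-row array overhead are ignored.
    static long denseMatrixBytes(long rows, long cols) {
        return rows * cols * Double.BYTES;
    }

    // Converts bytes to mebibytes for readable output.
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long topics = 20;         // from the thread
        long terms = 340_000;     // vocabulary size, from the thread
        long sortBufferMb = 100;  // io.sort.mb, from the task log

        long matrixBytes = denseMatrixBytes(topics, terms);
        System.out.printf("one topics x terms matrix: %d bytes (%.1f MiB)%n",
                matrixBytes, toMiB(matrixBytes));
        System.out.printf("matrix plus sort buffer: about %.0f MiB%n",
                toMiB(matrixBytes) + sortBufferMb);
    }
}
```

With 20 topics and 340,000 terms, one such matrix is 54,400,000 bytes, roughly 52 MiB. Added to the 100 MB sort buffer, that would already strain a task JVM left at Hadoop's stock -Xmx200m; whether this cluster uses that default is not stated in the thread.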

On Tue, Nov 15, 2011 at 5:21 PM, Jake Mannix <ja...@gmail.com> wrote:

> I think you need to set the mapper/reducer memory, not MAHOUT_HEAP:
>
> mapred.map.child.java.opts
>
> needs to include something like "-Xmx3g".
>
> How many terms (vocabulary size) and topics are you running over?
>
>  -jake
>



-- 

M.khouja

Re: Mahout heap out of space

Posted by Jake Mannix <ja...@gmail.com>.
I think you need to set the mapper/reducer memory, not MAHOUT_HEAP:

mapred.map.child.java.opts

needs to include something like "-Xmx3g".

How many terms (vocabulary size) and topics are you running over?

  -jake
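
One way to check whether a new heap setting actually reached a given JVM is to print that JVM's maximum heap. The sketch below is plain Java with hypothetical class and method names; it parses an option string such as the "-Xmx3g" suggested above and reports Runtime.getRuntime().maxMemory() for the process it runs in. Logging the same value from inside a map task (not shown here) would reveal the task's own limit, which, as the reply above notes, is not controlled by MAHOUT_HEAP.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: compare a requested -Xmx value with the JVM's actual limit.
class HeapCheck {

    // Matches options such as -Xmx200m or -Xmx3g; the unit suffix is optional.
    private static final Pattern XMX = Pattern.compile("-Xmx(\\d+)([kKmMgG]?)");

    // Returns the requested maximum heap in bytes, or -1 if no -Xmx option is present.
    static long parseXmxBytes(String javaOpts) {
        Matcher m = XMX.matcher(javaOpts);
        if (!m.find()) {
            return -1L;
        }
        long value = Long.parseLong(m.group(1));
        switch (m.group(2).toLowerCase()) {
            case "k": return value << 10;
            case "m": return value << 20;
            case "g": return value << 30;
            default:  return value; // plain bytes
        }
    }

    // Maximum heap the current JVM will attempt to use, in bytes.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        String opts = args.length > 0 ? args[0] : "-Xmx3g";
        System.out.println("requested: " + parseXmxBytes(opts) + " bytes");
        System.out.println("this JVM:  " + maxHeapBytes() + " bytes");
    }
}
```

The value reported by maxMemory() can be somewhat lower than the -Xmx figure, so an approximate match is what to look for. If the task still reports a small limit after the property is changed, the setting is probably not reaching the task JVM.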

