Posted to user@mahout.apache.org by Yang Sun <so...@gmail.com> on 2011/02/07 20:55:09 UTC

java heap space exception using org.apache.mahout.fpm.pfpgrowth.FPGrowthDriver

Hi,

I'm trying to use parallel FPGrowth on a text-based data set with
about 14K documents. But when I run Mahout, I get the following
exception:

FATAL org.apache.hadoop.mapred.TaskTracker: Error running child :
java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:781)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:524)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)


The command:

hadoop jar mahout-examples-0.5-SNAPSHOT-job.jar
org.apache.mahout.fpm.pfpgrowth.FPGrowthDriver -i tital_tokens -o
patterns -k 3000 -method mapreduce -g 10 -regex '[\ ]' -s 10


Can someone tell me how I can fix this? And is it possible to use the
algorithm on a text-based dataset at all?


Thanks

Yang

RE: java heap space exception using org.apache.mahout.fpm.pfpgrowth.FPGrowthDriver

Posted by pr...@nokia.com.

It looks like the heap size for your mapred child JVMs is too small. What version of Hadoop are you using? On Hadoop 0.20.2, you need to set the mapred.child.java.opts property to something reasonable. 14K documents is a very small dataset, but I'm not sure what your child JVM heap size is currently set to.

Set it to something like -Xmx512m or higher and try again.
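
For reference, a minimal sketch of what that setting could look like, assuming Hadoop 0.20.x and that the property goes in the conf/mapred-site.xml used when the job is submitted (the file location and the 512 MB figure are assumptions, so adjust them for your cluster). The trace fails in the MapOutputBuffer constructor, which is where the map-side sort buffer (io.sort.mb, 100 MB by default in 0.20, as far as I know) is allocated, so the child heap needs to be comfortably larger than that buffer.

```xml
<!-- conf/mapred-site.xml (assumed location): JVM options for each map/reduce child task -->
<property>
  <name>mapred.child.java.opts</name>
  <!-- 512 MB heap is an illustrative value; it should exceed io.sort.mb with room to spare -->
  <value>-Xmx512m</value>
</property>
```

If the cluster marks this property as final, the value from the cluster configuration will win over whatever the client sets.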

Praveen