Posted to user@mahout.apache.org by Rajesh Nikam <ra...@gmail.com> on 2012/10/12 17:43:29 UTC

mahout clusterdump : java.lang.OutOfMemoryError: Java heap space error

Hi,

I have used canopy and k-means clustering to cluster around 1.2 M instances.
The CSV file is around 425 MB. However, when I run the "mahout clusterdump"
command below, I get a Java OutOfMemoryError.

mahout clusterdump -dt sequencefile -i
clean-kmeans-clusters/clusters-1-final/part-r-00000 -n 20 -b 100 -o
cdump-clean.txt -p clean-kmeans-clusters/clusteredPoints/

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.apache.mahout.math.DenseVector.<init>(DenseVector.java:44)
        at org.apache.mahout.math.DenseVector.<init>(DenseVector.java:39)
        at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:99)
        at org.apache.mahout.clustering.classify.WeightedVectorWritable.readFields(WeightedVectorWritable.java:56)

I have switched to 64-bit Ubuntu and even tried setting 4 GB, 8 GB, and 12 GB
of memory for Java.

JAVA_HEAP_MAX=-Xmx4g
JAVA_HEAP_MAX=-Xmx8g
JAVA_HEAP_MAX=-Xmx12g

I am not sure how to increase the memory available to the Java runtime.

How can I check whether this Java on Ubuntu is 64-bit or not?

Thanks
Rajesh

Re: mahout clusterdump : java.lang.OutOfMemoryError: Java heap space error

Posted by Rajesh Nikam <ra...@gmail.com>.
Thanks Paritosh.

The clusterpp command helped to dump the instances per cluster, and I then
used vectordump to convert the vectors to text.

Thanks
Rajesh

On Fri, Oct 12, 2012 at 9:34 PM, paritosh ranjan
<pa...@gmail.com> wrote:

> I think this much memory should fix the problem.
> However, if you still face OOM, try using the clusterpp command instead of
> clusterdump. It does not have memory limitations, as it also has a MapReduce
> version. You can find clusterpp's usage here:
> https://cwiki.apache.org/MAHOUT/top-down-clustering.html

Re: mahout clusterdump : java.lang.OutOfMemoryError: Java heap space error

Posted by paritosh ranjan <pa...@gmail.com>.
I think this much memory should fix the problem.
However, if you still face OOM, try using the clusterpp command instead of
clusterdump. It does not have memory limitations, as it also has a MapReduce
version. You can find clusterpp's usage here:
https://cwiki.apache.org/MAHOUT/top-down-clustering.html
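
On the two open questions in the original post (is the JVM 64-bit, and did the
-Xmx setting actually take effect), a small standalone check along these lines
might help. This is only a sketch: the class name HeapCheck is made up for
illustration, "sun.arch.data.model" is a HotSpot-specific property that other
JVMs may not set, and the "os.arch" fallback is a rough heuristic rather than
a definitive test.

```java
public class HeapCheck {
    /** Maximum heap the JVM will attempt to use, in MiB (this reflects -Xmx). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Best-effort guess at whether this JVM is 64-bit. */
    static boolean is64Bit() {
        // HotSpot-specific property; assumed here, may be null on other JVMs.
        String model = System.getProperty("sun.arch.data.model");
        if (model != null) {
            return model.equals("64");
        }
        // Fallback heuristic: architecture names such as amd64 or aarch64.
        return System.getProperty("os.arch", "").contains("64");
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB): " + maxHeapMiB());
        System.out.println("64-bit JVM: " + is64Bit());
    }
}
```

Compiling it with javac and running it with the same flag you are testing, for
example "java -Xmx4g HeapCheck", should show whether that JVM accepts the
requested heap size. It does not show whether the mahout launcher script
passes the same setting through to the JVM it starts.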
