Posted to user@mahout.apache.org by syed kather <in...@gmail.com> on 2012/10/03 21:14:48 UTC
Heap Space Problem while running in cluster in map reduce
Team,
When I try to run KMeans clustering, the job fails with the following error:
Java heap space
    at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:434)
    at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
    at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:139)
    at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:118)
    at org.apache.mahout.clustering.ClusterObservations.readFields(ClusterObservations.java:59)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
    at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
    at org.apache.hadoop.mapreduce.ReduceContext$ValueIterator.next(ReduceContext.java:163)
    at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:31)
    at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:25)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1502)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2768)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2706)
Can someone tell me what the reason may be?
I have a 5-node cluster:
Master 4 Core with 16GB RAM
slave1 4 Core with 8GB RAM
slave2 4 Core with 8GB RAM
slave3 4 Core with 8GB RAM
slave4 4 Core with 8GB RAM
Let me know if any optimization is required for this.
Thanks in advance.
Thanks and Regards,
S SYED ABDUL KATHER
Re: Heap Space Problem while running in cluster in map reduce
Posted by paritosh ranjan <pa...@gmail.com>.
How many initial clusters are you providing to KMeans?
Try reducing the initial number of clusters to find the breaking
point. A good way to choose the initial number of clusters is Canopy
Clustering: https://cwiki.apache.org/MAHOUT/canopy-clustering.html
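To get a rough feel for why the number of clusters matters, here is a minimal back-of-the-envelope sketch. It assumes the worst case, where every centroid has become fully dense after many points were summed into it, and it counts only 8 bytes per double, ignoring the extra overhead of the hash map behind RandomAccessSparseVector. The class name and the values of k and d are hypothetical, chosen only for illustration; plug in your own numbers.

```java
public class CentroidMemoryEstimate {
    // A Java double occupies 8 bytes.
    static final long BYTES_PER_DOUBLE = 8L;

    /**
     * Lower bound on the heap needed to hold `clusters` fully dense
     * centroid vectors of `dimensions` entries each. Real usage is
     * higher because of object and hash-map overhead.
     */
    static long denseCentroidBytes(long clusters, long dimensions) {
        return clusters * dimensions * BYTES_PER_DOUBLE;
    }

    public static void main(String[] args) {
        long k = 1000;    // hypothetical number of initial clusters
        long d = 100000;  // hypothetical vector dimensionality
        long bytes = denseCentroidBytes(k, d);
        System.out.println(bytes / (1024 * 1024) + " MiB at minimum");
    }
}
```

If the estimate is close to or above the heap given to each task JVM, fewer initial clusters (or lower-dimensional vectors) are likely needed before the job can finish.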
Have you analyzed the nodes of the cluster to see whether they are actually
using the available RAM? If not, then the Hadoop cluster would need some
reconfiguration so that it can use most of it.
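As a sketch of what such a reconfiguration might look like: the class names in the stack trace (ReduceTask$ReduceCopier) suggest a Hadoop 1.x style MapReduce, where the heap of each task JVM is set through mapred.child.java.opts and defaults to a small value (-Xmx200m). The -Xmx value below is only an illustrative assumption, not a recommendation; it has to be chosen so that the number of map and reduce slots per node times the task heap still fits into the 8 GB of each slave.

```xml
<!-- mapred-site.xml (assuming Hadoop 1.x); the -Xmx value is illustrative only -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
```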