Posted to user@mahout.apache.org by Ahmed Elgohary <aa...@gmail.com> on 2012/09/15 02:03:18 UTC

OutOfMemoryError in MatrixMultiplicationJob

Hi,

I am running Mahout's MatrixMultiplicationJob on Amazon EMR to multiply two
matrices of sizes (150k x 8.2m) and (100 x 150k). The cluster consists of
20 m1.large nodes. The reduce tasks fail with an OutOfMemoryError (a rough
sketch of my setup follows the stack trace):

Error: java.lang.OutOfMemoryError: Java heap space
	at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:434)
	at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
	at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:139)
	at org.apache.mahout.math.AbstractVector.assign(AbstractVector.java:560)
	at org.apache.mahout.math.hadoop.MatrixMultiplicationJob$MatrixMultiplicationReducer.reduce(MatrixMultiplicationJob.java:161)
	at org.apache.mahout.math.hadoop.MatrixMultiplicationJob$MatrixMultiplicationReducer.reduce(MatrixMultiplicationJob.java:147)
	at org.apache.hadoop.mapred.Task$OldCombinerRunner.combine(Task.java:1436)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2815)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2753)
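
For reference, my setup is roughly equivalent to the driver sketched below
(class name and S3 paths are placeholders; the dimensions are the ones above,
with the orientation/argument order simplified in this sketch):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.mahout.math.hadoop.DistributedRowMatrix;

// Placeholder driver: multiplying two DistributedRowMatrix inputs launches
// MatrixMultiplicationJob under the hood.
public class MultiplyDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // First matrix: 150k rows x 8.2m columns (path is a placeholder)
    DistributedRowMatrix a = new DistributedRowMatrix(
        new Path("s3://my-bucket/matrixA"), new Path("s3://my-bucket/tmpA"),
        150000, 8200000);
    a.setConf(conf);

    // Second matrix: 100 rows x 150k columns (path is a placeholder)
    DistributedRowMatrix b = new DistributedRowMatrix(
        new Path("s3://my-bucket/matrixB"), new Path("s3://my-bucket/tmpB"),
        100, 150000);
    b.setConf(conf);

    // Runs the distributed multiplication as a MapReduce job;
    // the result is written out under the matrix's temp output path.
    DistributedRowMatrix product = a.times(b);
  }
}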

I increased the reduce task JVM heap to 4 GB, but that did not solve the
problem.
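For reference, the heap change amounts to roughly the following on the job
configuration (the class and method names below are placeholders; on Hadoop
versions that do not honor the per-task property, the combined
mapred.child.java.opts is the one that applies):

import org.apache.hadoop.conf.Configuration;

// Placeholder helper: builds a Configuration with a 4 GB reduce-task heap.
// If the Hadoop version in use does not support the per-task property, set
// the combined mapred.child.java.opts instead.
public class ReducerHeapConfig {
  public static Configuration withLargerReducerHeap() {
    Configuration conf = new Configuration();
    conf.set("mapred.reduce.child.java.opts", "-Xmx4096m");
    return conf;
  }
}
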
Any suggestions on how to get past this would be appreciated.

thanks
--ahmed