Posted to user@giraph.apache.org by Stefan Beskow <St...@sas.com> on 2014/03/27 19:50:11 UTC

Out of memory

Hi.

I'm trying to run the sample connected components algorithm on a large data set on a cluster, but I get a "java.lang.OutOfMemoryError: Java heap space" error. The cluster has 16 nodes, and each node has 24 cores and 96GB of memory. I'm using Hadoop-2.2.0-cdh5.0.0-beta2 and running Giraph 1.1.0-snapshot as an MR2 application.

I tried allocating more memory to the mappers by setting mapreduce.map.java.opts in the Configuration object, but that didn't solve my problem. Any suggestions for something else I could try?
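One way to check whether a heap setting like the one above actually reaches the mapper JVM is sketched below. This is a minimal, hypothetical helper and not part of Giraph or Hadoop: the class name HeapCheck, its method names, and the 20% headroom figure are all assumptions made for illustration. It parses the -Xmx value out of an options string such as the one passed in mapreduce.map.java.opts, compares it with what the running JVM reports, and checks that the heap fits inside a container size such as the one set by mapreduce.map.memory.mb.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper; the names here are illustrative only.
final class HeapCheck {
    private static final Pattern XMX = Pattern.compile("-Xmx(\\d+)([kKmMgG]?)");

    /** Returns the -Xmx value in bytes, or -1 if the option string has none. */
    static long parseXmxBytes(String javaOpts) {
        Matcher m = XMX.matcher(javaOpts);
        long bytes = -1;
        // Assumption: if -Xmx is given more than once, the last one takes effect.
        while (m.find()) {
            long value = Long.parseLong(m.group(1));
            String unit = m.group(2).toLowerCase(Locale.ROOT);
            switch (unit) {
                case "k": bytes = value << 10; break;
                case "m": bytes = value << 20; break;
                case "g": bytes = value << 30; break;
                default:  bytes = value;       break; // plain bytes
            }
        }
        return bytes;
    }

    /**
     * True if the heap fits in a container of containerMb megabytes while
     * leaving headroomPercent of the container for non-heap memory.
     * The headroom value is a rule of thumb, not a Hadoop requirement.
     */
    static boolean fitsContainer(long heapBytes, long containerMb, int headroomPercent) {
        long usable = containerMb * 1024L * 1024L * (100 - headroomPercent) / 100;
        return heapBytes <= usable;
    }

    public static void main(String[] args) {
        // Example value only; substitute the string actually set in the job.
        String opts = args.length > 0 ? args[0] : "-Xmx8g";
        long requested = parseXmxBytes(opts);
        long actual = Runtime.getRuntime().maxMemory();
        System.out.println("requested heap (bytes): " + requested);
        System.out.println("JVM max heap (bytes):   " + actual);
        System.out.println("fits a 10240 MB container with 20% headroom: "
                + fitsContainer(requested, 10240, 20));
    }
}
```

Logging Runtime.getRuntime().maxMemory() from inside the computation would show the heap the worker really got. If it does not match the requested value, the option is probably not being picked up by the job.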

Here is the mapper exception:
Caused by: java.lang.OutOfMemoryError: Java heap space
                    at org.apache.giraph.utils.UnsafeByteArrayOutputStream.<init>(UnsafeByteArrayOutputStream.java:96)
                    at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.createExtendedDataOutput(ImmutableClassesGiraphConfiguration.java:974)
                    at org.apache.giraph.utils.ByteArrayVertexIdData.initialize(ByteArrayVertexIdData.java:85)
                    at org.apache.giraph.utils.ByteArrayVertexIdMessages.initialize(ByteArrayVertexIdMessages.java:88)
                    at org.apache.giraph.comm.SendVertexIdDataCache.getPartitionData(SendVertexIdDataCache.java:124)
                    at org.apache.giraph.comm.SendVertexIdDataCache.addData(SendVertexIdDataCache.java:76)
                    at org.apache.giraph.comm.SendMessageCache.addMessage(SendMessageCache.java:97)
                    at org.apache.giraph.comm.SendMessageCache.sendMessageRequest(SendMessageCache.java:157)
                    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendMessageRequest(NettyWorkerClientRequestProcessor.java:179)
                    at org.apache.giraph.graph.AbstractComputation.sendMessage(AbstractComputation.java:163)
                    at com.sas.analytics.giraph.connectedcomponents.ConnectedComponentsComputation.compute(ConnectedComponentsComputation.java:73)
                    at org.apache.giraph.graph.ComputeCallable.computePartition(ComputeCallable.java:247)
                    at org.apache.giraph.graph.ComputeCallable.call(ComputeCallable.java:168)
                    at org.apache.giraph.graph.ComputeCallable.call(ComputeCallable.java:71)
                    at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
                    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
                    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
                    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
                    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
                    at java.lang.Thread.run(Thread.java:722)


Thanks for your help.
Stefan