Posted to dev@spark.apache.org by Renyi Xiong <re...@gmail.com> on 2016/05/15 18:46:33 UTC
Spark shuffling OutOfMemoryError Java heap space
Hi
I am consistently observing a driver OutOfMemoryError (Java heap space)
during a shuffle operation, as indicated by the log:
…………
16/05/14 21:57:03 INFO MapOutputTrackerMaster: Size of output statuses for
shuffle 2 is 36060250 bytes  <-- shuffle metadata size is big, and the full
metadata will be sent to all workers?
16/05/14 21:57:06 INFO MapOutputTrackerMasterEndpoint: Asked to send map
output locations for shuffle 2 to <host1>:45757
16/05/14 21:57:06 INFO MapOutputTrackerMasterEndpoint: Asked to send map
output locations for shuffle 2 to <host2>:20300
16/05/14 21:57:06 INFO MapOutputTrackerMasterEndpoint: Asked to send map
output locations for shuffle 2 to <host3>:12389
16/05/14 21:57:06 INFO MapOutputTrackerMasterEndpoint: Asked to send map
output locations for shuffle 2 to <host4>:32197
…………
Exception in thread "dispatcher-event-loop-17" Exception in thread
"dispatcher-event-loop-3" Exception in thread "dispatcher-event-loop-6"
16/05/14 21:59:04 INFO MapOutputTrackerMasterEndpoint: Asked to send map
output locations for shuffle 2 to <host5>:19639
Exception in thread "dispatcher-event-loop-21" 16/05/14 21:59:08 INFO
MapOutputTrackerMasterEndpoint: Asked to send map output locations for
shuffle 2 to <host6>:58461
Exception in thread "dispatcher-event-loop-20" Exception in thread
"dispatcher-event-loop-13" Exception in thread "dispatcher-event-loop-9"
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2271)
        at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
        at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:103)  <-- shuffle metadata duplicated (?) when sending to each executor?
        at org.apache.spark.rpc.netty.NettyRpcEnv.serialize(NettyRpcEnv.scala:252)
        at org.apache.spark.rpc.netty.RemoteNettyRpcCallContext.send(NettyRpcCallContext.scala:64)
        at org.apache.spark.rpc.netty.NettyRpcCallContext.reply(NettyRpcCallContext.scala:32)
        at org.apache.spark.MapOutputTrackerMasterEndpoint$$anonfun$receiveAndReply$1.applyOrElse(MapOutputTracker.scala:62)
        at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:104)
        at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
        at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
        at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
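The top two frames of the trace can be reproduced in isolation: ByteArrayOutputStream.toByteArray() always returns a fresh copy of the internal buffer (via Arrays.copyOf), so each serialize() call transiently needs roughly twice the payload size, on top of the buffer doublings while the stream grows. A minimal sketch (the 1 MiB payload is just a stand-in for the real metadata):

```java
import java.io.ByteArrayOutputStream;

public class ToByteArrayCopy {
    public static void main(String[] args) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] payload = new byte[1 << 20]; // 1 MiB stand-in for the 36 MB metadata
        bos.write(payload, 0, payload.length);

        // toByteArray() copies the internal buffer on every call,
        // so two calls yield two distinct arrays of the same content.
        byte[] a = bos.toByteArray();
        byte[] b = bos.toByteArray();
        System.out.println(a != b);   // true: each call allocates a new array
        System.out.println(a.length); // 1048576
    }
}
```

This suggests each reply to a "send map output locations" request pays for at least one full copy of the serialized metadata, independent of any caching.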
I enabled a heap dump and used jhat to analyze it. In the heap histogram, I
found 146 byte-array objects of exactly the same size, 36,060,293 bytes.
I wonder whether these 146 large objects are in fact duplicates of the same
shuffle metadata. *Can experts please help me confirm whether this is true?*
(8 GB of driver memory was specified for the above run, which should be
sufficient for the 36 MB shuffle metadata, but probably not for 146
duplicates of it.)
thanks,
Renyi.