Posted to dev@spark.apache.org by "dan@lumity.com" <da...@lumity.com> on 2015/08/25 03:59:02 UTC
ExternalSorter: Thread *** spilling in-memory map of 352.6 MB to disk (38 times so far)
Hi all,
I posted this on the user list but didn't get any responses, so I was hoping
one of you awesome devs could shed some light:
I'm trying to run a Spark 1.5 job with:
./spark-shell --driver-java-options "-Xdebug
-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=1044 -Xms16g
-Xmx48g -Xss128m"
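For context, the shuffle memory pool these threads are fighting over is governed in Spark 1.5 by spark.shuffle.memoryFraction (default 0.2), which can be set on the same invocation; the 0.4 below is only an illustrative value, not something from my actual run:

```shell
# Illustrative only: raising spark.shuffle.memoryFraction from its 0.2 default
# gives ExternalSorter more heap before it starts spilling to disk.
./spark-shell \
  --driver-java-options "-Xms16g -Xmx48g -Xss128m" \
  --conf spark.shuffle.memoryFraction=0.4
```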
I get lots of messages like:
15/08/24 20:24:33 INFO ExternalSorter: Thread 172 spilling in-memory map of
352.2 MB to disk (40 times so far)
15/08/24 20:24:33 INFO ExternalSorter: Thread 179 spilling in-memory map of
352.2 MB to disk (39 times so far)
15/08/24 20:24:34 INFO ExternalSorter: Thread 197 spilling in-memory map of
352.2 MB to disk (39 times so far)
15/08/24 20:24:34 INFO ExternalSorter: Thread 192 spilling in-memory map of
352.2 MB to disk (39 times so far)
15/08/24 20:24:36 INFO ShuffleMemoryManager: TID 798 waiting for at least
1/2N of shuffle memory pool to be free
15/08/24 20:24:36 INFO ExternalSorter: Thread 170 spilling in-memory map of
352.2 MB to disk (39 times so far)
15/08/24 20:24:36 INFO ExternalSorter: Thread 171 spilling in-memory map of
352.2 MB to disk (40 times so far)
When I force a stack trace with jstack, I get stack traces like this for
each of my 36 cores:
"Executor task launch worker-44" daemon prio=10 tid=0x00007fb9c404d000 nid=0x417f runnable [0x00007fb5e3ffe000]
   java.lang.Thread.State: RUNNABLE
	at scala.collection.mutable.ResizableArray$class.$init$(ResizableArray.scala:32)
	at scala.collection.mutable.ArrayBuffer.<init>(ArrayBuffer.scala:49)
	at scala.collection.mutable.ArrayBuffer.<init>(ArrayBuffer.scala:63)
	at org.apache.spark.util.SizeEstimator$SearchState.<init>(SizeEstimator.scala:154)
	at org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:183)
	at org.apache.spark.util.SizeEstimator$$anonfun$sampleArray$1.apply$mcVI$sp(SizeEstimator.scala:262)
	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:166)
	at org.apache.spark.util.SizeEstimator$.sampleArray(SizeEstimator.scala:254)
	at org.apache.spark.util.SizeEstimator$.visitArray(SizeEstimator.scala:238)
	at org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:194)
	at org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:186)
	at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:54)
	at org.apache.spark.util.collection.SizeTracker$class.takeSample(SizeTracker.scala:78)
	at org.apache.spark.util.collection.SizeTracker$class.afterUpdate(SizeTracker.scala:70)
	at org.apache.spark.util.collection.PartitionedPairBuffer.afterUpdate(PartitionedPairBuffer.scala:30)
	at org.apache.spark.util.collection.PartitionedPairBuffer.insert(PartitionedPairBuffer.scala:53)
	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
	at org.apache.spark.scheduler.Task.run(Task.scala:88)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
jmap shows:
root@ip-172-31-28-227:~# jmap -heap 16421
Attaching to process ID 16421, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 24.79-b02
using thread-local object allocation.
Parallel GC with 25 thread(s)
Heap Configuration:
MinHeapFreeRatio = 0
MaxHeapFreeRatio = 100
MaxHeapSize = 51539607552 (49152.0MB)
NewSize = 1310720 (1.25MB)
MaxNewSize = 17592186044415 MB
OldSize = 5439488 (5.1875MB)
NewRatio = 2
SurvivorRatio = 8
PermSize = 21757952 (20.75MB)
MaxPermSize = 268435456 (256.0MB)
G1HeapRegionSize = 0 (0.0MB)
Heap Usage:
PS Young Generation
Eden Space:
capacity = 15290335232 (14582.0MB)
used = 11958528232 (11404.541236877441MB)
free = 3331807000 (3177.4587631225586MB)
78.20971908433302% used
From Space:
capacity = 869269504 (829.0MB)
used = 868910560 (828.6576843261719MB)
free = 358944 (0.342315673828125MB)
99.9587073976082% used
To Space:
capacity = 965738496 (921.0MB)
used = 0 (0.0MB)
free = 965738496 (921.0MB)
0.0% used
PS Old Generation
capacity = 11453595648 (10923.0MB)
used = 1423152248 (1357.223747253418MB)
free = 10030443400 (9565.776252746582MB)
12.425375329611077% used
PS Perm Generation
capacity = 107479040 (102.5MB)
used = 107107360 (102.14553833007812MB)
free = 371680 (0.354461669921875MB)
99.65418373666158% used
27990 interned Strings occupying 2892496 bytes.
This seems to indicate that the threads are dumping extra data into temp
files on disk, but none of the temp files in /tmp/ are growing as far as
I can tell. I think this means the spilling isn't actually happening, or I
don't know where to look for it if it is.
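For anyone trying to reproduce this: my understanding is that the spill files don't land directly in /tmp, but in per-application block-manager directories under spark.local.dir (default /tmp), as temp_shuffle_* blocks. A mock of that layout (all names below are made up, not real UUIDs) shows the pattern I've been searching for:

```shell
# Mock of the layout spilled blocks should follow (names are illustrative):
#   <spark.local.dir>/spark-<app-uuid>/blockmgr-<uuid>/<subdir>/temp_shuffle_<uuid>
mkdir -p /tmp/demo-local-dir/spark-app/blockmgr-0/00
touch /tmp/demo-local-dir/spark-app/blockmgr-0/00/temp_shuffle_example
# Searching by name, rather than watching file sizes at the top of /tmp,
# is the quicker check for whether spills are really hitting disk:
find /tmp/demo-local-dir -name 'temp_shuffle_*'
```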
I can't tell what's going on here, does anyone have any suggestions on how
to work through this?
As an update, when I reran the job with spark.shuffle.spill = false, the job
moves along very very slowly and I get stack traces like:
Thread 20826: (state = IN_VM)
 - java.util.IdentityHashMap.resize(int) @bci=54, line=469 (Compiled frame)
 - java.util.IdentityHashMap.put(java.lang.Object, java.lang.Object) @bci=118, line=445 (Compiled frame)
 - org.apache.spark.util.SizeEstimator$SearchState.enqueue(java.lang.Object) @bci=21, line=159 (Compiled frame)
 - org.apache.spark.util.SizeEstimator$$anonfun$visitSingleObject$1.apply(java.lang.reflect.Field) @bci=12, line=203 (Compiled frame)
 - org.apache.spark.util.SizeEstimator$$anonfun$visitSingleObject$1.apply(java.lang.Object) @bci=5, line=202 (Compiled frame)
 - scala.collection.immutable.List.foreach(scala.Function1) @bci=15, line=381 (Compiled frame)
 - org.apache.spark.util.SizeEstimator$.visitSingleObject(java.lang.Object, org.apache.spark.util.SizeEstimator$SearchState) @bci=75, line=202 (Compiled frame)
 - org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(java.lang.Object, java.util.IdentityHashMap) @bci=32, line=186 (Compiled frame)
 - org.apache.spark.util.SizeEstimator$$anonfun$sampleArray$1.apply$mcVI$sp(int) @bci=71, line=262 (Compiled frame)
 - scala.collection.immutable.Range.foreach$mVc$sp(scala.Function1) @bci=87, line=166 (Compiled frame)
 - org.apache.spark.util.SizeEstimator$.sampleArray(java.lang.Object, org.apache.spark.util.SizeEstimator$SearchState, java.util.Random, org.apache.spark.util.collection.OpenHashSet, int) @bci=39, line=254 (Compiled frame)
 - org.apache.spark.util.SizeEstimator$.visitArray(java.lang.Object, java.lang.Class, org.apache.spark.util.SizeEstimator$SearchState) @bci=185, line=238 (Compiled frame)
 - org.apache.spark.util.SizeEstimator$.visitSingleObject(java.lang.Object, org.apache.spark.util.SizeEstimator$SearchState) @bci=16, line=194 (Compiled frame)
 - org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(java.lang.Object, java.util.IdentityHashMap) @bci=32, line=186 (Compiled frame)
 - org.apache.spark.util.SizeEstimator$.estimate(java.lang.Object) @bci=9, line=54 (Compiled frame)
 - org.apache.spark.util.collection.SizeTracker$class.takeSample(org.apache.spark.util.collection.SizeTracker) @bci=23, line=78 (Compiled frame)
 - org.apache.spark.util.collection.SizeTracker$class.afterUpdate(org.apache.spark.util.collection.SizeTracker) @bci=31, line=70 (Compiled frame)
 - org.apache.spark.util.collection.PartitionedPairBuffer.afterUpdate() @bci=1, line=30 (Compiled frame)
 - org.apache.spark.util.collection.PartitionedPairBuffer.insert(int, java.lang.Object, java.lang.Object) @bci=63, line=53 (Compiled frame)
 - org.apache.spark.util.collection.ExternalSorter.insertAll(scala.collection.Iterator) @bci=200, line=214 (Compiled frame)
 - org.apache.spark.shuffle.sort.SortShuffleWriter.write(scala.collection.Iterator) @bci=203, line=73 (Interpreted frame)
 - org.apache.spark.scheduler.ShuffleMapTask.runTask(org.apache.spark.TaskContext) @bci=186, line=73 (Interpreted frame)
 - org.apache.spark.scheduler.ShuffleMapTask.runTask(org.apache.spark.TaskContext) @bci=2, line=41 (Interpreted frame)
 - org.apache.spark.scheduler.Task.run(long, int, org.apache.spark.metrics.MetricsSystem) @bci=118, line=88 (Interpreted frame)
 - org.apache.spark.executor.Executor$TaskRunner.run() @bci=334, line=214 (Interpreted frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1145 (Interpreted frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
Would appreciate any advice or suggestions!
Best,
Dan
--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/ExternalSorter-Thread-spilling-in-memory-map-of-352-6-MB-to-disk-38-times-so-far-tp13806.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org