Posted to issues@spark.apache.org by "roncenzhao (JIRA)" <ji...@apache.org> on 2016/07/15 08:15:20 UTC
[jira] [Closed] (SPARK-16564) DeadLock happens when 'StaticMemoryManager' release in-memory block
[ https://issues.apache.org/jira/browse/SPARK-16564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
roncenzhao closed SPARK-16564.
------------------------------
Resolution: Resolved
In the master branch, the release logic has been refactored, so the deadlock problem is solved.
> DeadLock happens when 'StaticMemoryManager' release in-memory block
> -------------------------------------------------------------------
>
> Key: SPARK-16564
> URL: https://issues.apache.org/jira/browse/SPARK-16564
> Project: Spark
> Issue Type: Bug
> Affects Versions: 1.6.1
> Environment: spark.memory.useLegacyMode=true
> Reporter: roncenzhao
>
> The condition that causes the deadlock is:
> Thread 1: 'BlockManagerSlaveEndpoint' receives 'RemoveBroadcast', 'RemoveBlock', etc., and the executor begins to remove the block named 'blockId1'.
> Thread 2: another RDD begins to run and must call 'evictBlocksToFreeSpace()' to obtain more memory; unfortunately, the block chosen for eviction is 'blockId1'.
> As a result:
> Thread 1 is holding blockId1's blockInfo lock (0x000000053ab39f58) and waiting for StaticMemoryManager's lock (0x000000039f04fea8).
> Thread 2 is holding StaticMemoryManager's lock (0x000000039f04fea8) and waiting for blockId1's blockInfo lock (0x000000053ab39f58).
> Each thread waits on the lock the other holds, which causes the deadlock.
> Stack trace:
> Found one Java-level deadlock:
> =============================
> "block-manager-slave-async-thread-pool-24":
> waiting to lock monitor 0x00007f0b004ec278 (object 0x000000039f04fea8, a org.apache.spark.memory.StaticMemoryManager),
> which is held by "Executor task launch worker-11"
> "Executor task launch worker-11":
> waiting to lock monitor 0x00007f0b01354018 (object 0x000000053ab39f58, a org.apache.spark.storage.BlockInfo),
> which is held by "block-manager-slave-async-thread-pool-22"
> "block-manager-slave-async-thread-pool-22":
> waiting to lock monitor 0x00007f0b004ec278 (object 0x000000039f04fea8, a org.apache.spark.memory.StaticMemoryManager),
> which is held by "Executor task launch worker-11"
> Java stack information for the threads listed above:
> ===================================================
> "Executor task launch worker-11":
> at org.apache.spark.storage.BlockManager.dropFromMemory(BlockManager.scala:1032)
> - waiting to lock <0x000000053ab39f58> (a org.apache.spark.storage.BlockInfo)
> at org.apache.spark.storage.BlockManager.dropFromMemory(BlockManager.scala:1009)
> at org.apache.spark.storage.MemoryStore$$anonfun$evictBlocksToFreeSpace$2.apply(MemoryStore.scala:460)
> at org.apache.spark.storage.MemoryStore$$anonfun$evictBlocksToFreeSpace$2.apply(MemoryStore.scala:449)
> at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at org.apache.spark.storage.MemoryStore.evictBlocksToFreeSpace(MemoryStore.scala:449)
> - locked <0x000000039f04fea8> (a org.apache.spark.memory.StaticMemoryManager)
> at org.apache.spark.memory.StorageMemoryPool.acquireMemory(StorageMemoryPool.scala:89)
> - locked <0x000000039f04fea8> (a org.apache.spark.memory.StaticMemoryManager)
> at org.apache.spark.memory.StaticMemoryManager.acquireUnrollMemory(StaticMemoryManager.scala:83)
> - locked <0x000000039f04fea8> (a org.apache.spark.memory.StaticMemoryManager)
> at org.apache.spark.storage.MemoryStore.reserveUnrollMemoryForThisTask(MemoryStore.scala:493)
> - locked <0x000000039f04fea8> (a org.apache.spark.memory.StaticMemoryManager)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:291)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:178)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:85)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> "block-manager-slave-async-thread-pool-22":
> at org.apache.spark.storage.MemoryStore.remove(MemoryStore.scala:216)
> - waiting to lock <0x000000039f04fea8> (a org.apache.spark.memory.StaticMemoryManager)
> at org.apache.spark.storage.BlockManager.removeBlock(BlockManager.scala:1114)
> - locked <0x000000053ab39f58> (a org.apache.spark.storage.BlockInfo)
> at org.apache.spark.storage.BlockManager$$anonfun$removeBroadcast$2.apply(BlockManager.scala:1101)
> at org.apache.spark.storage.BlockManager$$anonfun$removeBroadcast$2.apply(BlockManager.scala:1101)
> at scala.collection.immutable.Set$Set2.foreach(Set.scala:94)
> at org.apache.spark.storage.BlockManager.removeBroadcast(BlockManager.scala:1101)
> at org.apache.spark.storage.BlockManagerSlaveEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$4.apply$mcI$sp(BlockManagerSlaveEndpoint.scala:65)
> at org.apache.spark.storage.BlockManagerSlaveEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$4.apply(BlockManagerSlaveEndpoint.scala:65)
> at org.apache.spark.storage.BlockManagerSlaveEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$4.apply(BlockManagerSlaveEndpoint.scala:65)
> at org.apache.spark.storage.BlockManagerSlaveEndpoint$$anonfun$1.apply(BlockManagerSlaveEndpoint.scala:81)
> at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Found 1 deadlock.
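The report above is a classic lock-ordering deadlock: the eviction path takes the StaticMemoryManager monitor before the BlockInfo monitor, while the remove path takes them in the opposite order. Below is a minimal, self-contained Java sketch (hypothetical names, not Spark code) of the conventional fix: both code paths acquire the two monitors in the same global order, so neither thread can hold one lock while waiting for the other.

```java
// Hypothetical illustration of consistent lock ordering; the lock names
// merely stand in for the two monitors from the stack trace above.
public class LockOrderDemo {
    // Stands in for the StaticMemoryManager monitor (0x...fea8).
    private static final Object memoryManagerLock = new Object();
    // Stands in for blockId1's BlockInfo monitor (0x...9f58).
    private static final Object blockInfoLock = new Object();

    static void evictPath() {
        // Eviction path: memory-manager lock first, then the block's lock.
        synchronized (memoryManagerLock) {
            synchronized (blockInfoLock) {
                // drop the block from memory
            }
        }
    }

    static void removePath() {
        // Remove path, fixed: takes the locks in the SAME order as
        // evictPath(), instead of blockInfo first as in the deadlock.
        synchronized (memoryManagerLock) {
            synchronized (blockInfoLock) {
                // remove the block
            }
        }
    }

    // Runs both paths concurrently; returns true iff both threads finish,
    // i.e. no deadlock occurred within the timeout.
    static boolean runBoth() throws InterruptedException {
        Thread t1 = new Thread(LockOrderDemo::removePath);
        Thread t2 = new Thread(LockOrderDemo::evictPath);
        t1.start();
        t2.start();
        t1.join(2000);  // generous timeout; with consistent ordering
        t2.join(2000);  // both threads terminate almost immediately
        return !t1.isAlive() && !t2.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runBoth() ? "no deadlock" : "deadlock");
    }
}
```

An alternative fix, which is roughly what the refactoring in master did, is to shrink the critical sections so that no thread ever needs to hold both monitors at once.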
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)