Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2016/07/15 08:15:20 UTC

[jira] [Reopened] (SPARK-16564) Deadlock happens when 'StaticMemoryManager' releases an in-memory block

     [ https://issues.apache.org/jira/browse/SPARK-16564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen reopened SPARK-16564:
-------------------------------

Please search JIRA first. It's a duplicate.

> Deadlock happens when 'StaticMemoryManager' releases an in-memory block
> -----------------------------------------------------------------------
>
>                 Key: SPARK-16564
>                 URL: https://issues.apache.org/jira/browse/SPARK-16564
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 1.6.1
>         Environment: spark.memory.useLegacyMode=true
>            Reporter: roncenzhao
>
> The condition that causes the deadlock is:
> Thread 1: 'BlockManagerSlaveEndpoint' receives a 'RemoveBroadcast', 'RemoveBlock', etc. message, and the executor begins to remove the block 'blockId1'.
> Thread 2: another RDD begins to run and must call 'evictBlocksToFreeSpace()' to obtain more memory; unfortunately, the block chosen for eviction is 'blockId1'.
> As a result:
> Thread 1 is holding blockId1's BlockInfo lock (0x000000053ab39f58) and waiting for the StaticMemoryManager's lock (0x000000039f04fea8).
> Thread 2 is holding the StaticMemoryManager's lock (0x000000039f04fea8) and waiting for blockId1's BlockInfo lock (0x000000053ab39f58).
> Because the two threads acquire the same pair of locks in opposite order, neither can proceed, and the result is a deadlock.
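> The cycle can be shown outside Spark with two plain JVM monitors. Below is a minimal illustrative sketch (the object 'DeadlockSketch' and its lock fields are hypothetical stand-ins, not Spark's actual classes) of the inverted lock ordering described above:
>
> 	object DeadlockSketch {
> 	  // Stand-ins for blockId1's BlockInfo monitor and the StaticMemoryManager monitor.
> 	  private val blockInfoLock     = new Object
> 	  private val memoryManagerLock = new Object
>
> 	  // Path taken by the block-manager-slave-async-thread-pool thread:
> 	  // BlockInfo lock first, then the memory manager lock.
> 	  def removeBlock(): Unit = blockInfoLock.synchronized {
> 	    Thread.sleep(10) // widen the race window for the demo
> 	    memoryManagerLock.synchronized {
> 	      // release the block's storage memory
> 	    }
> 	  }
>
> 	  // Path taken by the executor task thread:
> 	  // memory manager lock first, then the BlockInfo lock.
> 	  def evictBlocksToFreeSpace(): Unit = memoryManagerLock.synchronized {
> 	    Thread.sleep(10) // widen the race window for the demo
> 	    blockInfoLock.synchronized {
> 	      // drop the chosen block from memory
> 	    }
> 	  }
>
> 	  def main(args: Array[String]): Unit = {
> 	    // Requires Scala 2.12+ for the lambda-to-Runnable conversion.
> 	    new Thread(() => removeBlock()).start()
> 	    new Thread(() => evictBlocksToFreeSpace()).start()
> 	    // Each thread ends up blocked forever on the monitor the other holds.
> 	  }
> 	}
>
> Running this almost always hangs with each thread parked on the other's monitor, which is exactly the shape of the thread dump below.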
> Stack trace:
> Found one Java-level deadlock:
> =============================
> "block-manager-slave-async-thread-pool-24":
>   waiting to lock monitor 0x00007f0b004ec278 (object 0x000000039f04fea8, a org.apache.spark.memory.StaticMemoryManager),
>   which is held by "Executor task launch worker-11"
> "Executor task launch worker-11":
>   waiting to lock monitor 0x00007f0b01354018 (object 0x000000053ab39f58, a org.apache.spark.storage.BlockInfo),
>   which is held by "block-manager-slave-async-thread-pool-22"
> "block-manager-slave-async-thread-pool-22":
>   waiting to lock monitor 0x00007f0b004ec278 (object 0x000000039f04fea8, a org.apache.spark.memory.StaticMemoryManager),
>   which is held by "Executor task launch worker-11"
> Java stack information for the threads listed above:
> ===================================================
> "Executor task launch worker-11":
> 	at org.apache.spark.storage.BlockManager.dropFromMemory(BlockManager.scala:1032)
> 	- waiting to lock <0x000000053ab39f58> (a org.apache.spark.storage.BlockInfo)
> 	at org.apache.spark.storage.BlockManager.dropFromMemory(BlockManager.scala:1009)
> 	at org.apache.spark.storage.MemoryStore$$anonfun$evictBlocksToFreeSpace$2.apply(MemoryStore.scala:460)
> 	at org.apache.spark.storage.MemoryStore$$anonfun$evictBlocksToFreeSpace$2.apply(MemoryStore.scala:449)
> 	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> 	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> 	at org.apache.spark.storage.MemoryStore.evictBlocksToFreeSpace(MemoryStore.scala:449)
> 	- locked <0x000000039f04fea8> (a org.apache.spark.memory.StaticMemoryManager)
> 	at org.apache.spark.memory.StorageMemoryPool.acquireMemory(StorageMemoryPool.scala:89)
> 	- locked <0x000000039f04fea8> (a org.apache.spark.memory.StaticMemoryManager)
> 	at org.apache.spark.memory.StaticMemoryManager.acquireUnrollMemory(StaticMemoryManager.scala:83)
> 	- locked <0x000000039f04fea8> (a org.apache.spark.memory.StaticMemoryManager)
> 	at org.apache.spark.storage.MemoryStore.reserveUnrollMemoryForThisTask(MemoryStore.scala:493)
> 	- locked <0x000000039f04fea8> (a org.apache.spark.memory.StaticMemoryManager)
> 	at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:291)
> 	at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:178)
> 	at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:85)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:89)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:744)
> "block-manager-slave-async-thread-pool-22":
> 	at org.apache.spark.storage.MemoryStore.remove(MemoryStore.scala:216)
> 	- waiting to lock <0x000000039f04fea8> (a org.apache.spark.memory.StaticMemoryManager)
> 	at org.apache.spark.storage.BlockManager.removeBlock(BlockManager.scala:1114)
> 	- locked <0x000000053ab39f58> (a org.apache.spark.storage.BlockInfo)
> 	at org.apache.spark.storage.BlockManager$$anonfun$removeBroadcast$2.apply(BlockManager.scala:1101)
> 	at org.apache.spark.storage.BlockManager$$anonfun$removeBroadcast$2.apply(BlockManager.scala:1101)
> 	at scala.collection.immutable.Set$Set2.foreach(Set.scala:94)
> 	at org.apache.spark.storage.BlockManager.removeBroadcast(BlockManager.scala:1101)
> 	at org.apache.spark.storage.BlockManagerSlaveEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$4.apply$mcI$sp(BlockManagerSlaveEndpoint.scala:65)
> 	at org.apache.spark.storage.BlockManagerSlaveEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$4.apply(BlockManagerSlaveEndpoint.scala:65)
> 	at org.apache.spark.storage.BlockManagerSlaveEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$4.apply(BlockManagerSlaveEndpoint.scala:65)
> 	at org.apache.spark.storage.BlockManagerSlaveEndpoint$$anonfun$1.apply(BlockManagerSlaveEndpoint.scala:81)
> 	at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> 	at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:744)
> Found 1 deadlock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org