You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@ignite.apache.org by "Sadayuki Furuhashi (JIRA)" <ji...@apache.org> on 2018/02/14 05:08:00 UTC

[jira] [Created] (IGNITE-7696) Deadlock at GridDhtAtomicCache.lockEntries called through GridDhtAtomicCache.updateAllAsyncInternal

Sadayuki Furuhashi created IGNITE-7696:
------------------------------------------

             Summary: Deadlock at GridDhtAtomicCache.lockEntries called through GridDhtAtomicCache.updateAllAsyncInternal
                 Key: IGNITE-7696
                 URL: https://issues.apache.org/jira/browse/IGNITE-7696
             Project: Ignite
          Issue Type: Bug
          Components: cache
    Affects Versions: 2.3
         Environment: * Ignite 2.3
 * OpenJDK version "1.8.0_151"
 * Linux 4.4.0
            Reporter: Sadayuki Furuhashi


We observed that all nodes in a cluster completely stalls and put/get/remove operations to a cache blocks for ever. When it happens, we can see following log in thread dump:
{code:java}
2018-02-14_04:21:33.84410 Found one Java-level deadlock:
2018-02-14_04:21:33.84410 =============================
2018-02-14_04:21:33.84411 "sys-#41%IgniteManager%":
2018-02-14_04:21:33.84411 waiting to lock monitor 0x00007f6d5e41a558 (object 0x0000000781083ef0, a org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry),
2018-02-14_04:21:33.84411 which is held by "sys-stripe-5-#6%IgniteManager%"
2018-02-14_04:21:33.84412 "sys-stripe-5-#6%IgniteManager%":
2018-02-14_04:21:33.84412 waiting to lock monitor 0x00007f6d5e41de68 (object 0x0000000781083e70, a org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry)
2018-02-14_04:21:33.84412 in JNI, which is held by "sys-stripe-2-#3%IgniteManager%"
2018-02-14_04:21:33.84412 "sys-stripe-2-#3%IgniteManager%":
2018-02-14_04:21:33.84413 waiting to lock monitor 0x00007f6d5e41a558 (object 0x0000000781083ef0, a org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry)
2018-02-14_04:21:33.84413 in JNI, which is held by "sys-stripe-5-#6%IgniteManager%"
2018-02-14_04:21:33.84413
2018-02-14_04:21:33.84414 Java stack information for the threads listed above:
2018-02-14_04:21:33.84414 ===================================================
2018-02-14_04:21:33.84416 "sys-#41%IgniteManager%":
2018-02-14_04:21:33.84416 at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.markObsoleteVersion(GridCacheMapEntry.java:2153)
2018-02-14_04:21:33.84417 - waiting to lock <0x0000000781083ef0> (a org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry)
2018-02-14_04:21:33.84417 at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.removeVersionedEntry(GridDhtLocalPartition.java:368)
2018-02-14_04:21:33.84418 at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.cleanupRemoveQueue(GridDhtLocalPartition.java:392)
2018-02-14_04:21:33.84418 at org.apache.ignite.internal.processors.cache.GridCacheProcessor$RemovedItemsCleanupTask$1.run(GridCacheProcessor.java:4051)
2018-02-14_04:21:33.84418 at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6687)
2018-02-14_04:21:33.84419 at org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
2018-02-14_04:21:33.84419 at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
2018-02-14_04:21:33.84419 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
2018-02-14_04:21:33.84420 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
2018-02-14_04:21:33.84421 at java.lang.Thread.run(Thread.java:748)
2018-02-14_04:21:33.84421 "sys-stripe-5-#6%IgniteManager%":
2018-02-14_04:21:33.84421 at sun.misc.Unsafe.monitorEnter(Native Method)
2018-02-14_04:21:33.84421 at org.apache.ignite.internal.util.GridUnsafe.monitorEnter(GridUnsafe.java:1207)
2018-02-14_04:21:33.84422 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.lockEntries(GridDhtAtomicCache.java:2848)
2018-02-14_04:21:33.84422 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1707)
2018-02-14_04:21:33.84423 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1629)
2018-02-14_04:21:33.84423 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3056)
2018-02-14_04:21:33.84424 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:131)
2018-02-14_04:21:33.84424 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:267)
2018-02-14_04:21:33.84425 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:262)
2018-02-14_04:21:33.84425 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
2018-02-14_04:21:33.84425 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
2018-02-14_04:21:33.84426 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
2018-02-14_04:21:33.84426 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
2018-02-14_04:21:33.84426 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
2018-02-14_04:21:33.84427 at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
2018-02-14_04:21:33.84427 at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
2018-02-14_04:21:33.84427 at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
2018-02-14_04:21:33.84428 at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
2018-02-14_04:21:33.84429 at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
2018-02-14_04:21:33.84429 at org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:505)
2018-02-14_04:21:33.84429 at java.lang.Thread.run(Thread.java:748)
2018-02-14_04:21:33.84429 "sys-stripe-2-#3%IgniteManager%":
2018-02-14_04:21:33.84430 at sun.misc.Unsafe.monitorEnter(Native Method)
2018-02-14_04:21:33.84430 at org.apache.ignite.internal.util.GridUnsafe.monitorEnter(GridUnsafe.java:1207)
2018-02-14_04:21:33.84430 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.lockEntries(GridDhtAtomicCache.java:2848)
2018-02-14_04:21:33.84431 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1707)
2018-02-14_04:21:33.84431 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1629)
2018-02-14_04:21:33.84431 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3056)
2018-02-14_04:21:33.84433 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:131)
2018-02-14_04:21:33.84433 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:267)
2018-02-14_04:21:33.84433 at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:262)
2018-02-14_04:21:33.84434 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
2018-02-14_04:21:33.84434 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
2018-02-14_04:21:33.84434 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
2018-02-14_04:21:33.84435 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
2018-02-14_04:21:33.84435 at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
2018-02-14_04:21:33.84436 at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
2018-02-14_04:21:33.84437 at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
2018-02-14_04:21:33.84437 at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
2018-02-14_04:21:33.84437 at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
2018-02-14_04:21:33.84438 at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
2018-02-14_04:21:33.84438 at org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:505)
2018-02-14_04:21:33.84438 at java.lang.Thread.run(Thread.java:748)
2018-02-14_04:21:33.84439
2018-02-14_04:21:33.84439 Found 1 deadlock.
{code}
This indicates that GridDhtAtomicCache.lockEntries (threads sys-stripe-5-... and sys-stripe-7-...) is causing deadlock, and GridCacheMapEntry.markObsoleteVersion (sys-#41...) is involved in it:
 * Thread "sys-stripe-5-..." locked 0x...3ef0, waits for 0x...3e70
 * Thread "sys-stripe-7-..." locked 0x...3e70, waits for 0x...3ef0
 * Thread "sys-#41..." waits for 0x...3ef0

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)