Posted to issues@geode.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2019/03/13 00:09:00 UTC
[jira] [Commented] (GEODE-6517) Race condition exists that a node failed to be shutdown as it is stuck on PRHARedundancyProvider.waitForPersistentBucketRecovery()
[ https://issues.apache.org/jira/browse/GEODE-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791122#comment-16791122 ]
ASF subversion and git services commented on GEODE-6517:
--------------------------------------------------------
Commit 6751717585dd3a6405578013f0d1bea5f289d8e6 in geode's branch refs/heads/feature/GEODE-6517 from eshu
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=6751717 ]
GEODE-6517: Fix a race by counting down the latch.
> Race condition exists that a node failed to be shutdown as it is stuck on PRHARedundancyProvider.waitForPersistentBucketRecovery()
> ----------------------------------------------------------------------------------------------------------------------------------
>
> Key: GEODE-6517
> URL: https://issues.apache.org/jira/browse/GEODE-6517
> Project: Geode
> Issue Type: Bug
> Components: regions
> Affects Versions: 1.1.0
> Reporter: Eric Shu
> Assignee: Eric Shu
> Priority: Major
>
> Stack of the hung thread:
> "Shutdown Disconnector1" #93 prio=10 os_prio=0 tid=0x00007f84b8002800 nid=0x6875 waiting on condition [0x00007f844ee31000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000000f14f0490> (a java.util.concurrent.CountDownLatch$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> at org.apache.geode.internal.cache.PRHARedundancyProvider.waitForPersistentBucketRecovery(PRHARedundancyProvider.java:2019)
> at org.apache.geode.internal.cache.PartitionedRegion.postDestroyRegion(PartitionedRegion.java:7536)
> at org.apache.geode.internal.cache.LocalRegion.recursiveDestroyRegion(LocalRegion.java:2707)
> at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6308)
> at org.apache.geode.internal.cache.LocalRegion.handleCacheClose(LocalRegion.java:7387)
> at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2281)
> - locked <0x00000000f0abeb00> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
> at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1593)
> - locked <0x00000000f0abeb00> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
> at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1255)
> at org.apache.geode.management.internal.cli.functions.ShutDownFunction.lambda$disconnectInNonDaemonThread$0(ShutDownFunction.java:78)
> at org.apache.geode.management.internal.cli.functions.ShutDownFunction$$Lambda$94/665093117.run(Unknown Source)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> The race occurs in recoverPersistentBuckets. Between the point where the following latch is created and the point where it is nulled out, the shutdown thread can obtain a reference to the latch and then wait forever for a countDown that never happens.
>     allBucketsRecoveredFromDisk = new CountDownLatch(proxyBucketArray.length);
>     try {
>       if (proxyBucketArray.length > 0) {
>         this.redundancyLogger = new RedundancyLogger(this);
>         Thread loggingThread = new LoggingThread(
>             "RedundancyLogger for region " + this.prRegion.getName(), false, this.redundancyLogger);
>         loggingThread.start();
>       }
>     } catch (RuntimeException e) {
>       allBucketsRecoveredFromDisk = null;
>       throw e;
>     }
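
The following is a minimal, self-contained sketch of the pattern described above and of one way to read the commit summary ("counting down the latch"). It is not the actual Geode patch. The class LatchRaceSketch, its field allBucketsRecovered, and the methods startRecovery, waitForRecovery, and demo are hypothetical names chosen for illustration; only java.util.concurrent classes are used. The idea is that the failure path drains the latch before clearing the field, so a thread that already holds the reference is released instead of parking forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class LatchRaceSketch {
  // Hypothetical stand-in for the allBucketsRecoveredFromDisk field.
  private volatile CountDownLatch allBucketsRecovered;

  /** Recovery path: publish the latch, run setup, and release waiters if setup fails. */
  void startRecovery(int bucketCount, Runnable setup) {
    CountDownLatch latch = new CountDownLatch(bucketCount);
    allBucketsRecovered = latch;
    try {
      setup.run();
    } catch (RuntimeException e) {
      // A waiter may already hold this reference, so drain the latch
      // before clearing the field; otherwise that waiter parks forever.
      while (latch.getCount() > 0) {
        latch.countDown();
      }
      allBucketsRecovered = null;
      throw e;
    }
  }

  /** Shutdown path: read the field once, then wait. Returns true if released in time. */
  boolean waitForRecovery(long timeout, TimeUnit unit) throws InterruptedException {
    CountDownLatch latch = allBucketsRecovered;
    return latch == null || latch.await(timeout, unit);
  }

  /** Simulates a setup failure while another thread is waiting on the latch. */
  static boolean demo() {
    LatchRaceSketch sketch = new LatchRaceSketch();
    AtomicBoolean released = new AtomicBoolean(false);
    Thread waiter = new Thread(() -> {
      try {
        released.set(sketch.waitForRecovery(5, TimeUnit.SECONDS));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    try {
      sketch.startRecovery(3, () -> {
        waiter.start(); // the waiter races with the failure below
        throw new IllegalStateException("simulated failure during recovery setup");
      });
    } catch (IllegalStateException expected) {
      // startRecovery rethrows after releasing any waiters.
    }
    try {
      waiter.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return released.get();
  }

  public static void main(String[] args) {
    System.out.println("waiter released: " + demo());
  }
}
```

In this sketch the waiter is released whichever way the race goes: if it reads the field before the failure, it waits on a latch that is then drained; if it reads the field afterwards, it sees null and returns immediately. The timeout in waitForRecovery is only a safety net for the demonstration.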
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)