Posted to issues@geode.apache.org by "Igor Barchak (JIRA)" <ji...@apache.org> on 2017/12/05 16:37:00 UTC
[jira] [Created] (GEODE-4051) Two server jvms crashed at same time
and caused some primary and redundant buckets to be cleared. Causing some
buckets to get locked and not able to recover also after bouncing all
servers
Igor Barchak created GEODE-4051:
-----------------------------------
Summary: Two server jvms crashed at same time and caused some primary and redundant buckets to be cleared. Causing some buckets to get locked and not able to recover also after bouncing all servers
Key: GEODE-4051
URL: https://issues.apache.org/jira/browse/GEODE-4051
Project: Geode
Issue Type: Bug
Components: core
Reporter: Igor Barchak
Fix For: 1.2.0
"Pooled Waiting Message Processor 5" tid=0x162
java.lang.Thread.State: TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.CountDownLatch$Sync@1993a5
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:64)
at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:715)
at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:644)
at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:624)
at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:519)
at org.apache.geode.internal.cache.StateFlushOperation.flush(StateFlushOperation.java:243)
at org.apache.geode.internal.cache.InitialImageOperation.getFromOne(InitialImageOperation.java:349)
at org.apache.geode.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1168)
at org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1023)
at org.apache.geode.internal.cache.BucketRegion.initialize(BucketRegion.java:253)
at org.apache.geode.internal.cache.LocalRegion.createSubregion(LocalRegion.java:962)
at org.apache.geode.internal.cache.PartitionedRegionDataStore.createBucketRegion(PartitionedRegionDataStore.java:726)
at org.apache.geode.internal.cache.PartitionedRegionDataStore.grabFreeBucket(PartitionedRegionDataStore.java:414)
- locked org.apache.geode.internal.cache.ProxyBucketRegion@6820a0b6
at org.apache.geode.internal.cache.PartitionedRegionDataStore.grabFreeBucketRecursively(PartitionedRegionDataStore.java:272)
at org.apache.geode.internal.cache.PartitionedRegionDataStore.grabBucket(PartitionedRegionDataStore.java:2815)
at org.apache.geode.internal.cache.partitioned.ManageBackupBucketMessage.operateOnPartitionedRegion(ManageBackupBucketMessage.java:148)
at org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:332)
The problem appears to have been introduced by this commit:
https://github.com/apache/geode/commit/3a1062e245b3ded52ea3f6b6de0aff94ce846fa3?diff=split
See StateMarkerMessage.process:
The first if branch has no finally block, while the else branch does.
The first if branch previously had no 'waitFor' operation; that call was added in this commit.
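
One plausible reading of the observation above is that a failing wait in the branch without a finally block skips the reply, which would leave the requesting member parked in waitForReplies as in the thread dump. The sketch below only illustrates that general pattern in plain Java. It is not the Geode source: the class FinallyReplySketch, the Waiter interface, and the methods processWithoutFinally, processWithFinally and replyArrives are hypothetical names, and a CountDownLatch stands in for the reply the requester waits on.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in Geode.
class FinallyReplySketch {

    /** Stand-in for a blocking wait that can fail, e.g. because a peer crashed. */
    interface Waiter {
        void waitFor() throws Exception;
    }

    /** No finally block: if waitFor() throws, the reply is never sent. */
    static void processWithoutFinally(Waiter waiter, CountDownLatch reply) throws Exception {
        waiter.waitFor();
        reply.countDown(); // skipped when waitFor() throws
    }

    /** With a finally block: the reply is sent whether or not waitFor() throws. */
    static void processWithFinally(Waiter waiter, CountDownLatch reply) throws Exception {
        try {
            waiter.waitFor();
        } finally {
            reply.countDown(); // always runs
        }
    }

    /** Returns true if the simulated requester receives its reply within 100 ms. */
    static boolean replyArrives(boolean useFinally) throws InterruptedException {
        CountDownLatch reply = new CountDownLatch(1);
        Waiter failing = () -> {
            throw new IllegalStateException("simulated failure during wait");
        };
        try {
            if (useFinally) {
                processWithFinally(failing, reply);
            } else {
                processWithoutFinally(failing, reply);
            }
        } catch (Exception expected) {
            // A message-processing thread would typically log this and carry on.
        }
        // The requester's side: a timed wait on the reply latch.
        return reply.await(100, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reply without finally: " + replyArrives(false));
        System.out.println("reply with finally:    " + replyArrives(true));
    }
}
```

In this sketch the requester times out when the handler lacks a finally block and gets its reply when one is present. Whether this is the actual failure mode in StateMarkerMessage.process would need to be confirmed against the commit linked above.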