Posted to issues@geode.apache.org by "Bruce J Schuchardt (Jira)" <ji...@apache.org> on 2020/11/16 15:53:00 UTC

[jira] [Resolved] (GEODE-8697) Propagate ForcedDisconnectException to the user application in a network partition scenario

     [ https://issues.apache.org/jira/browse/GEODE-8697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt resolved GEODE-8697.
---------------------------------------
    Fix Version/s: 1.14.0
       Resolution: Fixed

> Propagate ForcedDisconnectException to the user application in a network partition scenario
> -------------------------------------------------------------------------------------------
>
>                 Key: GEODE-8697
>                 URL: https://issues.apache.org/jira/browse/GEODE-8697
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>    Affects Versions: 1.12.0, 1.13.0
>            Reporter: Kamilla Aslami
>            Assignee: Bruce J Schuchardt
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.14.0
>
>
> During a network partition, we expect the coordinator to close its cluster with a ForcedDisconnectException. In some cases, however, threads end up with a MemberDisconnectedException instead.
> System logs show that a ForcedDisconnect has happened:
> {code:java}
> org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException: Membership coordinator 10.32.111.185(gemfire3_host1_7340:7340:locator)<ec><v0>:41000 has declared that a network partition has occurred
>  at org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:2007)
>  at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1085)
>  at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:1422)
>  at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1327)
>  at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1266)
>  at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
>  at org.jgroups.JChannel.up(JChannel.java:741)
>  at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
>  at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
>  at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
>  at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
>  at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
>  at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
>  at org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:72)
>  at org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:70)
>  at org.jgroups.protocols.TP.passMessageUp(TP.java:1658)
>  at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1876)
>  at org.jgroups.util.DirectExecutor.execute(DirectExecutor.java:10)
>  at org.jgroups.protocols.TP.handleSingleMessage(TP.java:1789)
>  at org.jgroups.protocols.TP.receive(TP.java:1714)
>  at org.apache.geode.distributed.internal.membership.gms.messenger.Transport.receive(Transport.java:159)
>  at org.jgroups.protocols.UDP$PacketReceiver.run(UDP.java:701)
>  at java.lang.Thread.run(Thread.java:748){code}
> But the ForcedDisconnectException is never propagated up to the user application, which sees a MemberDisconnectedException as the cause instead:
> {code:java}
> org.apache.geode.distributed.DistributedSystemDisconnectedException: This connection to a distributed system has been disconnected., caused by org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException: Membership coordinator 10.32.111.185(gemfire3_host1_7340:7340:locator)<ec><v0>:41000 has declared that a network partition has occurred
>  at org.apache.geode.distributed.internal.InternalDistributedSystem.checkConnected(InternalDistributedSystem.java:978)
>  at org.apache.geode.distributed.internal.InternalDistributedSystem.getDistributionManager(InternalDistributedSystem.java:1679)
>  at org.apache.geode.distributed.internal.ReplyProcessor21.getDistributionManager(ReplyProcessor21.java:366)
>  at org.apache.geode.distributed.internal.ReplyProcessor21.postWait(ReplyProcessor21.java:600)
>  at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:824)
>  at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:779)
>  at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:865)
>  at org.apache.geode.internal.cache.partitioned.FetchKeysMessage$FetchKeysResponse.waitForKeys(FetchKeysMessage.java:584)
>  at org.apache.geode.internal.cache.PartitionedRegion.getBucketKeys(PartitionedRegion.java:4463)
>  at org.apache.geode.internal.cache.PartitionedRegionDataView.getBucketKeys(PartitionedRegionDataView.java:118)
>  at org.apache.geode.internal.cache.PartitionedRegion$KeysSet$KeysSetIterator.getNextBucketIter(PartitionedRegion.java:6180)
>  at org.apache.geode.internal.cache.PartitionedRegion$KeysSet$KeysSetIterator.hasNext(PartitionedRegion.java:6146)
>  at org.apache.geode.internal.cache.PartitionedRegion$KeysSet.toArray(PartitionedRegion.java:6251)
>  at org.apache.geode.internal.cache.PartitionedRegion$KeysSet.toArray(PartitionedRegion.java:6245){code}
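
For illustration only, here is a minimal sketch of why the cause type matters to an application that walks the cause chain of a DistributedSystemDisconnectedException to detect a forced disconnect. The three exception classes below are local stand-ins that borrow only the names seen in the traces above, so the sketch compiles without Geode on the classpath; they are not the Geode classes. The class DisconnectCauseCheck and its causedBy helper are hypothetical names, and this is not the change made for this issue.

```java
public class DisconnectCauseCheck {

    // Hypothetical stand-ins: only the names match the exceptions in the traces.
    static class ForcedDisconnectException extends RuntimeException {
        ForcedDisconnectException(String message) {
            super(message);
        }
    }

    static class MemberDisconnectedException extends Exception {
        MemberDisconnectedException(String message) {
            super(message);
        }
    }

    static class DistributedSystemDisconnectedException extends RuntimeException {
        DistributedSystemDisconnectedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Returns true if the throwable itself, or any throwable in its cause
     * chain, is an instance of the given type. The depth bound is a guard
     * against pathological cyclic cause chains.
     */
    static boolean causedBy(Throwable thrown, Class<? extends Throwable> type) {
        Throwable current = thrown;
        for (int depth = 0; current != null && depth < 64; depth++) {
            if (type.isInstance(current)) {
                return true;
            }
            current = current.getCause();
        }
        return false;
    }

    public static void main(String[] args) {
        String partition = "coordinator has declared that a network partition has occurred";

        // Shape of the reported behaviour: the cause is a MemberDisconnectedException.
        Throwable reported = new DistributedSystemDisconnectedException(
            "disconnected", new MemberDisconnectedException(partition));

        // Shape the reporter expected: the cause is a ForcedDisconnectException.
        Throwable expected = new DistributedSystemDisconnectedException(
            "disconnected", new ForcedDisconnectException(partition));

        System.out.println("reported: " + causedBy(reported, ForcedDisconnectException.class));
        System.out.println("expected: " + causedBy(expected, ForcedDisconnectException.class));
    }
}
```

In this sketch, a check for ForcedDisconnectException misses the first chain and matches the second, which is the difference the report describes from the application's point of view.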


