Posted to issues@geode.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2019/04/30 21:27:00 UTC

[jira] [Commented] (GEODE-6724) split brain formed on concurrent locator startup

    [ https://issues.apache.org/jira/browse/GEODE-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830715#comment-16830715 ] 

ASF subversion and git services commented on GEODE-6724:
--------------------------------------------------------

Commit 0f41c5f46fa731423f7b4d895cccbf8418b40da8 in geode's branch refs/heads/feature/GEODE-6724 from Bruce Schuchardt
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=0f41c5f ]

GEODE-6724 split brain formed on concurrent locator startup

Ensure that either all locators have been contacted or a decent
number of attempts to join have occurred before allowing a member to
start its own cluster.

If all locators have been contacted we ought to have a sufficient
registration pool to choose a membership coordinator during concurrent
startup.
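
The guard described in this commit message can be sketched roughly as follows. This is only an illustration, not the code from the commit: the class name, method name, locator names, and the threshold value are all hypothetical, and the commit message does not say how many join attempts count as "decent". The sketch shows a member being allowed to start its own cluster only once every configured locator has been contacted or enough join attempts have been made.

```java
import java.util.Set;

// Hypothetical illustration only; this is not Geode's actual implementation.
final class ConcurrentStartupGuard {

  // Assumed value standing in for the commit's "decent number of attempts".
  static final int ASSUMED_MIN_JOIN_ATTEMPTS = 5;

  /**
   * Returns true if a member may stop searching and become its own
   * membership coordinator: either every configured locator has been
   * contacted, or enough join attempts have already been made.
   */
  static boolean mayStartOwnCluster(Set<String> configuredLocators,
                                    Set<String> contactedLocators,
                                    int joinAttempts) {
    boolean allContacted = contactedLocators.containsAll(configuredLocators);
    boolean enoughAttempts = joinAttempts >= ASSUMED_MIN_JOIN_ATTEMPTS;
    return allContacted || enoughAttempts;
  }

  public static void main(String[] args) {
    // Four locators, as in the failing test; the names are made up.
    Set<String> configured = Set.of("locator1", "locator2", "locator3", "locator4");

    // Only two of four locators contacted on the first attempt: keep searching.
    System.out.println(mayStartOwnCluster(configured, Set.of("locator1", "locator2"), 1));

    // All four locators contacted: the registration pool should be complete.
    System.out.println(mayStartOwnCluster(configured, configured, 1));

    // Some locators unreachable, but the attempt threshold has been reached.
    System.out.println(
        mayStartOwnCluster(configured, Set.of("locator1"), ASSUMED_MIN_JOIN_ATTEMPTS));
  }
}
```

Under this sketch, a locator that has contacted only two of four locators on its first attempt would keep searching rather than declare itself coordinator, which is the situation the first log excerpt in the issue description shows (locatorsContacted=2).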


> split brain formed on concurrent locator startup
> ------------------------------------------------
>
>                 Key: GEODE-6724
>                 URL: https://issues.apache.org/jira/browse/GEODE-6724
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Bruce Schuchardt
>            Assignee: Bruce Schuchardt
>            Priority: Major
>
> In a test with network-partition-detection disabled, four locators were started in parallel and formed two separate clusters. Two servers were then started and joined different clusters, ending up with different data. Consistency checks at the end of the test caught the problem.
> {noformat}
> locatorgemfire_1_2_17088/system.log: [info 2019/04/25 22:00:08.732 PDT <vm_1_thr_1_locator_1_2_host1_17088> tid=0x14] findCoordinator chose rs-FullRegression26040030a0i3large-hydra-client-53(locatorgemfire_1_2_host1_17088:17088:locator)<ec>:41001 out of these possible coordinators: [rs-FullRegression26040030a0i3large-hydra-client-53(locatorgemfire_1_2_host1_17088:17088:locator)<ec>:41001]
> locatorgemfire_1_2_17088/system.log: [info 2019/04/25 22:00:08.733 PDT <vm_1_thr_1_locator_1_2_host1_17088> tid=0x14] Discovery state after looking for membership coordinator is SearchState(locatorsContacted=2; findInViewResponses=0; alreadyTried=[]; registrants=[rs-FullRegression26040030a0i3large-hydra-client-53(locatorgemfire_1_2_host1_17088:17088:locator)<ec>:41001]; possibleCoordinator=rs-FullRegression26040030a0i3large-hydra-client-53(locatorgemfire_1_2_host1_17088:17088:locator)<ec>:41001; viewId=-1; hasContactedAJoinedLocator=false; view=null; responses=[])
> locatorgemfire_1_2_17088/system.log: [info 2019/04/25 22:00:08.733 PDT <vm_1_thr_1_locator_1_2_host1_17088> tid=0x14] found possible coordinator rs-FullRegression26040030a0i3large-hydra-client-53(locatorgemfire_1_2_host1_17088:17088:locator)<ec>:41001
> locatorgemfire_1_2_17088/system.log: [info 2019/04/25 22:00:08.733 PDT <vm_1_thr_1_locator_1_2_host1_17088> tid=0x14] This member is becoming the membership coordinator with address rs-FullRegression26040030a0i3large-hydra-client-53(locatorgemfire_1_2_host1_17088:17088:locator)<ec>:41001
> {noformat}
> {noformat}
> locatorgemfire_1_4_17106/system.log: [info 2019/04/25 22:00:08.762 PDT <vm_3_thr_3_locator_1_4_host1_17106> tid=0x14] findCoordinator chose rs-FullRegression26040030a0i3large-hydra-client-53(locatorgemfire_1_4_host1_17106:17106:locator)<ec>:41000 out of these possible coordinators: [rs-FullRegression26040030a0i3large-hydra-client-53(locatorgemfire_1_4_host1_17106:17106:locator)<ec>:41000]
> locatorgemfire_1_4_17106/system.log: [info 2019/04/25 22:00:08.763 PDT <vm_3_thr_3_locator_1_4_host1_17106> tid=0x14] Discovery state after looking for membership coordinator is SearchState(locatorsContacted=3; findInViewResponses=0; alreadyTried=[]; registrants=[rs-FullRegression26040030a0i3large-hydra-client-53(locatorgemfire_1_2_host1_17088:17088:locator)<ec>:41001, rs-FullRegression26040030a0i3large-hydra-client-53(locatorgemfire_1_4_host1_17106:17106:locator)<ec>:41000, rs-FullRegression26040030a0i3large-hydra-client-53(locatorgemfire_1_3_host1_17100:17100:locator)<ec>:41002]; possibleCoordinator=rs-FullRegression26040030a0i3large-hydra-client-53(locatorgemfire_1_4_host1_17106:17106:locator)<ec>:41000; viewId=-1; hasContactedAJoinedLocator=false; view=null; responses=[])
> locatorgemfire_1_4_17106/system.log: [info 2019/04/25 22:00:08.763 PDT <vm_3_thr_3_locator_1_4_host1_17106> tid=0x14] found possible coordinator rs-FullRegression26040030a0i3large-hydra-client-53(locatorgemfire_1_4_host1_17106:17106:locator)<ec>:41000
> locatorgemfire_1_4_17106/system.log: [info 2019/04/25 22:00:08.763 PDT <vm_3_thr_3_locator_1_4_host1_17106> tid=0x14] This member is becoming the membership coordinator with address rs-FullRegression26040030a0i3large-hydra-client-53(locatorgemfire_1_4_host1_17106:17106:locator)<ec>:41000
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)