Posted to dev@geode.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2016/12/09 22:05:58 UTC

[jira] [Commented] (GEODE-2193) a member is kicked out immediately after joining

    [ https://issues.apache.org/jira/browse/GEODE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15736472#comment-15736472 ] 

ASF subversion and git services commented on GEODE-2193:
--------------------------------------------------------

Commit ef86239f872c12c0aad38d5ae3044e22fd5e87af in geode's branch refs/heads/develop from [~bschuchardt]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=ef86239 ]

GEODE-2193 a member is kicked out immediately after joining

The problem is happening because we send a shutdown message, initiating
election of a new coordinator, but the old ViewCreator is allowed to
send out a view announcing a new member.  The new coordinator manages
to send out a new view before the old ViewCreator sends out the new
member's view.  Other members ignore the old ViewCreator's view
because its view ID is old.  Then they reject the new member because
it has an old view ID and it isn't in their membership view.

initial view ID is x

new coordinator prepares view x+10
old coordinator prepares view x+1
other members install x+10, reject view x+1
new member joins in view x+1 when it receives view-prepare message
new member is rejected by other members because x+1 < x+10
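
The sequence above can be modeled with a small sketch.  This is only an
illustration under simplifying assumptions, not Geode's membership code:
the class names (ViewRaceSketch, View, Member), the member names, and the
"rogue" rule (sender is not in the installed view) are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the race; none of these types exist in Geode. */
public class ViewRaceSketch {

    /** A membership view: an ID plus the names of the members it contains. */
    static final class View {
        final int id;
        final Set<String> members;

        View(int id, String... members) {
            this.id = id;
            this.members = new HashSet<>(Arrays.asList(members));
        }
    }

    /** An existing member that installs only views newer than its current one. */
    static final class Member {
        View current;

        Member(View initial) {
            this.current = initial;
        }

        /** Returns true if the view was installed, false if it was ignored as stale. */
        boolean offer(View v) {
            if (v.id <= current.id) {
                return false; // old view ID: ignore it
            }
            current = v;
            return true;
        }

        /** Assumed rule: a sender missing from the installed view is treated as rogue. */
        boolean isRogue(String sender) {
            return !current.members.contains(sender);
        }
    }

    public static void main(String[] args) {
        int x = 5; // initial view ID (arbitrary value for the example)
        Member other = new Member(new View(x, "oldCoord", "newCoord", "other"));

        // New coordinator's view arrives first and omits the joining member.
        View fromNew = new View(x + 10, "newCoord", "other");
        // Old coordinator's view admits the joining member but has a lower ID.
        View fromOld = new View(x + 1, "oldCoord", "newCoord", "other", "joiner");

        System.out.println(other.offer(fromNew));    // true: x+10 is installed
        System.out.println(other.offer(fromOld));    // false: x+1 < x+10
        System.out.println(other.isRogue("joiner")); // true: joiner is not in x+10
    }
}
```

The joining member, meanwhile, has only seen view x+1 and believes it is
a full member, which is why its startup messages are met with removal
requests.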


> a member is kicked out immediately after joining
> ------------------------------------------------
>
>                 Key: GEODE-2193
>                 URL: https://issues.apache.org/jira/browse/GEODE-2193
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Bruce Schuchardt
>            Assignee: Mark Bretl
>
> We have observed a number of cases where a member is kicked out immediately after joining.  The problem seems to be this:
> 1) the member sends a join request to the current coordinator
> 2) the current coordinator is in the process of shutting down
> 3) the current coordinator sends a view preparation message admitting the new member
> 4) another member receives the current coordinator's shutdown message and initiates becoming the coordinator
> 5) the new coordinator sends out a membership view that does not include the new member
> 6) the new member receives the prepared view and continues with startup
> 7) the new member sends startup messages to other members
> 8) the other members have the new coordinator's view and request removal of the new member as being rogue
> 9) the new coordinator sends a Leave message to the new member, causing it to issue a ForcedDisconnect
> The old coordinator should not initiate a new view if it is shutting down.  It needs to have cancellation & shutdown checks in its view transmission methods.
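
A minimal sketch of what such a check could look like follows.  It is an
assumption-laden illustration, not the change made in the commit: the
class GuardedViewSender and its methods are hypothetical names, and the
actual transmission is replaced by recording the view ID in a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical view sender that refuses to transmit once shutdown has begun. */
public class GuardedViewSender {
    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);
    private final List<Integer> sentViewIds = new ArrayList<>();

    /** Called when the coordinator starts shutting down. */
    public void beginShutdown() {
        shuttingDown.set(true);
    }

    /**
     * Sends a view only if this coordinator is not shutting down and the
     * calling thread has not been cancelled via interruption.
     * Returns true if the view was sent, false if it was suppressed.
     */
    public boolean prepareAndSendView(int viewId) {
        if (shuttingDown.get() || Thread.currentThread().isInterrupted()) {
            return false; // do not announce a view during shutdown
        }
        sentViewIds.add(viewId); // stand-in for the real network send
        return true;
    }

    /** IDs of the views that were actually sent, in order. */
    public List<Integer> sentViewIds() {
        return sentViewIds;
    }
}
```

A flag check like this narrows the window but does not close it entirely:
shutdown can still begin between the check and the send, so a real fix
would also need to coordinate the check with the shutdown path.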


