Posted to commits@cassandra.apache.org by "Kurt Greaves (JIRA)" <ji...@apache.org> on 2017/11/15 05:29:00 UTC

[jira] [Comment Edited] (CASSANDRA-13851) Allow existing nodes to use all peers in shadow round

    [ https://issues.apache.org/jira/browse/CASSANDRA-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16233892#comment-16233892 ] 

Kurt Greaves edited comment on CASSANDRA-13851 at 11/15/17 5:28 AM:
--------------------------------------------------------------------

|[dtest|https://github.com/apache/cassandra-dtest/compare/master...kgreav:13851]|[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...kgreav:3.11-13851]|

The current changes pass whatever peers exist in system.peers through to the shadow round, and use them only if the first round of gossip to the seeds fails. The node still tries the seeds first; peers are included only if the shadow round runs longer than 5 seconds. This allows a node to start when only peers are alive.
It also allows a node that doesn't need to bootstrap to start even if it can't contact any peers or seeds.
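
To make the intended behaviour concrete, here is a rough sketch of the selection logic described above. It is not the code in the linked branch: the class, method names and constant are hypothetical, and the rule that a bootstrapping node still needs a response is my assumption based on the existing behaviour.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the shadow round fallback; not the actual patch. */
final class ShadowRoundSketch {
    // Taken from the comment above: peers are included once the round exceeds 5 seconds.
    static final long PEER_FALLBACK_MILLIS = 5_000;

    /** Endpoints to gossip with, given how long the shadow round has been running. */
    static Set<String> targets(List<String> seeds, List<String> peers, long elapsedMillis) {
        // Seeds are always tried, and come first in iteration order.
        Set<String> result = new LinkedHashSet<>(seeds);
        if (elapsedMillis > PEER_FALLBACK_MILLIS) {
            // Fall back to whatever was stored in system.peers.
            result.addAll(peers);
        }
        return result;
    }

    /**
     * Whether startup may continue after the shadow round.
     * Assumption: a node that must bootstrap still needs at least one response.
     */
    static boolean mayStart(boolean gotResponse, boolean needsBootstrap) {
        return gotResponse || !needsBootstrap;
    }

    public static void main(String[] args) {
        List<String> seeds = Arrays.asList("seed1");
        List<String> peers = Arrays.asList("peer1", "peer2");
        System.out.println(targets(seeds, peers, 1_000));
        System.out.println(targets(seeds, peers, 6_000));
        System.out.println(mayStart(false, false));
    }
}
```

The point of keeping seeds first is that the fallback only changes behaviour when the seeds have already failed to answer within the window.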

From what I can tell this works, but I'm open to ideas on other test cases or obvious things I've missed.

FWIW this also passed all the bootstrap dtests, if that means anything.

edit: updated dtest to actually fail when there is a timeout, rather than throwing an error.


> Allow existing nodes to use all peers in shadow round
> -----------------------------------------------------
>
>                 Key: CASSANDRA-13851
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13851
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Lifecycle
>            Reporter: Kurt Greaves
>            Assignee: Kurt Greaves
>             Fix For: 3.11.x, 4.x
>
>
> In CASSANDRA-10134 we made collision checks necessary on every startup. A side effect is that a node's seeds must now be contacted on every startup. Prior to this change, an existing node could start up regardless of whether it could contact a seed node (because checkForEndpointCollision() was only called for bootstrapping nodes).
> Now if a node's seeds are removed/deleted/fail, it will no longer be able to start up until live seeds are configured (or it is itself made a seed), even though it already knows about the rest of the ring. This is inconvenient for operators and has the potential to cause some nasty surprises and increase downtime.
> One solution would be to use all of a node's existing peers as seeds in the shadow round. I'm not a Gossip guru though, so I'm not sure of the implications.


