Posted to commits@cassandra.apache.org by "Brandon Williams (JIRA)" <ji...@apache.org> on 2015/05/19 22:25:02 UTC

[jira] [Resolved] (CASSANDRA-9401) Better check for gossip stabilization on startup

     [ https://issues.apache.org/jira/browse/CASSANDRA-9401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams resolved CASSANDRA-9401.
-----------------------------------------
       Resolution: Fixed
    Fix Version/s:     (was: 3.x)
                   3.0 beta 1

Committed.

> Better check for gossip stabilization on startup
> ------------------------------------------------
>
>                 Key: CASSANDRA-9401
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9401
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Brandon Williams
>             Fix For: 3.0 beta 1
>
>         Attachments: 9401.txt
>
>
> In CASSANDRA-4288, we started checking the active and pending counts in the Gossip stage on startup.  Once the active + pending counts are zero for three consecutive polls (1 second apart), we consider gossip to have stabilized.
> There are a few problems with this approach.  First, in large clusters it may take a long time for this to happen; there was one report of it taking 10 minutes in a 1700-node cluster.  Second, the polling cycle could happen to align with the gossip cycle (they both use a 1s period), resulting in active + pending being greater than zero for a long time.  Additionally, if CASSANDRA-9206 is committed, seed nodes would receive a lot of gossip traffic, making it very difficult for them to ever have no active or pending requests.
> A better approach would be to simply wait for the number of endpoint states to stabilize.
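
As a rough illustration of the proposal in the description above, the following sketch polls a counter and treats it as stable once the value stops changing for several consecutive polls. It is not the committed patch (9401.txt). The function name, its parameters, the injected `get_count` callable, and the reuse of "three consecutive polls, 1 second apart" from the older check are all assumptions made for the example.

```python
import time
from typing import Callable


def wait_for_stable_count(
    get_count: Callable[[], int],
    required_stable_polls: int = 3,   # assumption: borrowed from the older check
    poll_interval: float = 1.0,       # assumption: 1 second between polls
    max_polls: int = 60,              # assumption: give up after this many polls
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once get_count() is unchanged for
    required_stable_polls consecutive polls, or False on timeout.

    get_count is a hypothetical stand-in for "number of known
    endpoint states"; it is not a real Cassandra API.
    """
    previous = get_count()
    stable_polls = 0
    for _ in range(max_polls):
        sleep(poll_interval)
        current = get_count()
        if current == previous:
            stable_polls += 1
            if stable_polls >= required_stable_polls:
                return True
        else:
            # The count moved, so restart the stability window.
            stable_polls = 0
            previous = current
    return False
```

Unlike the active + pending check, this condition does not require the gossip stage to be idle, so a busy seed node can still pass it as long as it stops learning about new endpoints. The `max_polls` bound is only there so the sketch cannot block forever.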


