Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2013/11/04 21:48:17 UTC

[jira] [Updated] (CASSANDRA-6297) Gossiper blocks when updating tokens and turns node down

     [ https://issues.apache.org/jira/browse/CASSANDRA-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-6297:
--------------------------------------

    Attachment: 6297.txt

Thinking it through, the simplest solution is to just remove the blocking flush.  This does mean a potentially longer window of non-durability for the peer information under the periodic commitlog sync mode, but it does not make things qualitatively worse: if the node had been down entirely during the node addition, we would likewise have to deal with not having the peer information on restart.

I see alternatives to removing the blocking flush as falling into two categories:
# semantically equivalent solutions with more complex implementations (e.g. moving updateTokens into another thread or executor)
# dramatically complex gymnastics that aren't worth the small extra benefit, such as adding a special commitlog sync instead of the blocking flush
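
To make the difference concrete, here is a minimal sketch of the blocking and non-blocking variants. It is not Cassandra's code: GossipFlushSketch, gossipDelayMillis, and both executors are hypothetical stand-ins for GossipStage and the flush path, and the "flush" is just a sleep. It only shows that waiting on the flush future from a single-threaded stage holds up every task queued behind it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: every name here is a hypothetical stand-in,
// not one of Cassandra's actual classes or executors.
final class GossipFlushSketch {

    /**
     * Runs a simulated updateTokens on a single-threaded "gossip stage",
     * then measures how long the next gossip task had to wait (in ms).
     */
    static long gossipDelayMillis(boolean blockOnFlush, long flushMillis) {
        ExecutorService gossipStage = Executors.newSingleThreadExecutor();
        ExecutorService flushExecutor = Executors.newSingleThreadExecutor();
        try {
            long start = System.nanoTime();

            Callable<Void> updateTokens = () -> {
                // Simulated slow flush, e.g. one stuck behind other memtables.
                Callable<Void> slowFlush = () -> {
                    Thread.sleep(flushMillis);
                    return null;
                };
                Future<Void> flush = flushExecutor.submit(slowFlush);
                if (blockOnFlush) {
                    flush.get(); // the gossip thread stalls here until the flush is done
                }
                return null;
            };
            gossipStage.submit(updateTokens);

            // The next gossip task cannot start until updateTokens returns.
            Callable<Long> heartbeat = () -> System.nanoTime() - start;
            return TimeUnit.NANOSECONDS.toMillis(gossipStage.submit(heartbeat).get());
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            gossipStage.shutdownNow();
            flushExecutor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("blocking flush: next gossip task waited ~"
                + gossipDelayMillis(true, 500) + " ms");
        System.out.println("no blocking flush: next gossip task waited ~"
                + gossipDelayMillis(false, 500) + " ms");
    }
}
```

With blockOnFlush set to false the write is still issued and the flush still runs, but the gossip thread no longer waits on it. That is the durability trade-off described above.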

> Gossiper blocks when updating tokens and turns node down
> --------------------------------------------------------
>
>                 Key: CASSANDRA-6297
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6297
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sergio Bossa
>         Attachments: 6297.txt
>
>
> The GossipStage call to SystemTable.updateTokens causes a blocking memtable flush that may get stuck in the postFlushExecutor queue while waiting for other memtables to flush; as a consequence, the Gossiper itself "blocks" and the node is turned down.



--
This message was sent by Atlassian JIRA
(v6.1#6144)