
Posted to commits@cassandra.apache.org by "Brandon Williams (JIRA)" <ji...@apache.org> on 2014/11/06 18:47:34 UTC

[jira] [Commented] (CASSANDRA-8138) replace_address cannot find node to be replaced node after seed node restart

    [ https://issues.apache.org/jira/browse/CASSANDRA-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200541#comment-14200541 ] 

Brandon Williams commented on CASSANDRA-8138:
---------------------------------------------

I think I'd much rather say that the edge case of a node dying followed by a full cluster restart (a rolling restart would still work) is just not supported, rather than make such invasive changes to support replacement under such strange and rare conditions. If that happens, it's time to assassinate the node and bootstrap another one.

> replace_address cannot find node to be replaced node after seed node restart
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8138
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8138
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Oleg Anastasyev
>            Assignee: Brandon Williams
>         Attachments: ReplaceAfterSeedRestart.txt
>
>
> If a node failed and the cluster was then restarted (a common case during massive outages), replace_address fails with
> {code}
> Caused by: java.lang.RuntimeException: Cannot replace_address /172.19.56.97 because it doesn't exist in gossip
> jvm 1    | 	at org.apache.cassandra.service.StorageService.prepareReplacementInfo(StorageService.java:472)
> jvm 1    | 	at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:724)
> jvm 1    | 	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:686)
> jvm 1    | 	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:562)
> {code}
> Although the necessary information is saved in system tables on seed nodes, it is not loaded into gossip on the seed node, so a replacement node cannot get this info.
> The attached patch loads all information from system tables into gossip with generation 0 and fixes some bugs around this info in the shadow gossip round.


