Posted to commits@cassandra.apache.org by "Jens Rantil (JIRA)" <ji...@apache.org> on 2014/11/16 14:56:33 UTC

[jira] [Resolved] (CASSANDRA-8318) Unable to replace a node

     [ https://issues.apache.org/jira/browse/CASSANDRA-8318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jens Rantil resolved CASSANDRA-8318.
------------------------------------
    Resolution: Cannot Reproduce

> Unable to replace a node
> ------------------------
>
>                 Key: CASSANDRA-8318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8318
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: 2.0.8.39 (Datastax DSE 4.5.3)
>            Reporter: Jens Rantil
>         Attachments: X.X.X.56.log
>
>
> Had a hardware failure of a node. I followed the DataStax documentation[1] on how to replace the node X.X.X.51 using a brand new node with the same IP. Since it didn't come up after waiting for ~5 minutes or so, I decided to replace X.X.X.51 with a brand new unused IP X.X.X.56 instead. It now seems like my gossip is in some weird state. When I start the replacement node I see lines like
> {noformat}
>  INFO [GossipStage:1] 2014-11-14 14:57:03,025 Gossiper.java (line 901) InetAddress /X.X.X.51 is now DOWN
>  INFO [GossipStage:1] 2014-11-14 14:57:03,042 Gossiper.java (line 901) InetAddress /X.X.X.56 is now DOWN
> {noformat}
> The latter is somewhat surprising since that is the IP of the actual replacement node. It doesn't surprise me it can't talk to itself if it hasn't started!
> Eventually the replacement node shuts down with
> {noformat}
> ERROR [main] 2014-11-14 14:58:06,031 CassandraDaemon.java (line 513) Exception encountered during startup
> java.lang.UnsupportedOperationException: Cannot replace token -2 which does not exist!
> 	at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:782)
> 	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:614)
> 	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:503)
> 	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378)
> 	at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:374)
> 	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
> 	at com.datastax.bdp.server.DseDaemon.main(DseDaemon.java:615)
>  INFO [Thread-2] 2014-11-14 14:58:06,035 DseDaemon.java (line 461) DSE shutting down...
>  INFO [StorageServiceShutdownHook] 2014-11-14 14:58:06,037 Gossiper.java (line 1307) Announcing shutdown
>  INFO [Thread-2] 2014-11-14 14:58:06,046 PluginManager.java (line 355) All plugins are stopped.
>  INFO [Thread-2] 2014-11-14 14:58:06,047 CassandraDaemon.java (line 463) Cassandra shutting down...
> ERROR [Thread-2] 2014-11-14 14:58:06,047 CassandraDaemon.java (line 199) Exception in thread Thread[Thread-2,5,main]
> java.lang.NullPointerException
> 	at org.apache.cassandra.service.CassandraDaemon.stop(CassandraDaemon.java:464)
> 	at com.datastax.bdp.server.DseDaemon.stop(DseDaemon.java:464)
> 	at com.datastax.bdp.server.DseDaemon$1.run(DseDaemon.java:364)
> {noformat}
> All nodes are showing
> {noformat}
> root@machine-2:~# nodetool status company
> Datacenter: Analytics
> =====================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  X.X.X.50  18.35 GB   1       16.7%             25efdbcd-14d3-4e9c-803a-3db5795d4efa  rack1
> DN  X.X.X.51  195.67 KB  1       16.7%             d97cf86f-bfaf-4488-b716-26d71635a8fc  rack1
> UN  X.X.X.52  18.7 GB    1       16.7%             caa32f68-5a6b-4d87-80bd-baa66a9b61ce  rack1
> UN  X.X.X.53  18.56 GB   1       16.7%             e219321e-a6d5-48c4-9bad-d2e25429b1d2  rack1
> UN  X.X.X.54  19.69 GB   1       16.7%             3cd36895-ee47-41c1-a5f5-41cb0f8526a6  rack1
> UN  X.X.X.55  18.88 GB   1       16.7%             7d3f73c4-724e-45a6-bcf9-d3072dfc157f  rack1
> Datacenter: Cassandra
> =====================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  X.X.X.33  128.95 GB  256     100.0%            871968c9-1d6b-4f06-ba90-8b3a8d92dcf0  rack1
> UN  X.X.X.32  115.3 GB   256     100.0%            d7cacd89-8613-4de5-8a5e-a2c53c41ea45  rack1
> UN  X.X.X.31  130.45 GB  256     100.0%            48cb0782-6c9a-4805-9330-38e192b6b680  rack1
> {noformat}
> but when X.X.X.56 is starting, it shows
> {noformat}
> root@machine-1:/var/lib/cassandra# nodetool status
> Note: Ownership information does not include topology; for complete information, specify a keyspace
> Datacenter: Analytics
> =====================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address     Load       Tokens  Owns   Host ID                               Rack
> UN  X.X.X.50  18.41 GB   1       0.2%   25efdbcd-14d3-4e9c-803a-3db5795d4efa  rack1
> UN  X.X.X.52  19.07 GB   1       0.0%   caa32f68-5a6b-4d87-80bd-baa66a9b61ce  rack1
> UN  X.X.X.53  18.65 GB   1       0.1%   e219321e-a6d5-48c4-9bad-d2e25429b1d2  rack1
> UN  X.X.X.54  19.69 GB   1       0.0%   3cd36895-ee47-41c1-a5f5-41cb0f8526a6  rack1
> UN  X.X.X.55  18.97 GB   1       0.2%   7d3f73c4-724e-45a6-bcf9-d3072dfc157f  rack1
> Datacenter: Cassandra
> =====================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address     Load       Tokens  Owns   Host ID                               Rack
> UN  X.X.X.33  129.72 GB  256     21.7%  871968c9-1d6b-4f06-ba90-8b3a8d92dcf0  rack1
> UN  X.X.X.32  116 GB     256     12.4%  d7cacd89-8613-4de5-8a5e-a2c53c41ea45  rack1
> UN  X.X.X.31  130.62 GB  256     65.3%  48cb0782-6c9a-4805-9330-38e192b6b680  rack1
> {noformat}
> The above cluster state does not seem to replicate to the rest of the cluster (hasn't so far).
> Any input on how I can restore world order is appreciated.
> [1] http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_node_t.html
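Editorial sketch, not part of the original report: the two `nodetool status` listings quoted above disagree about which nodes exist, and that kind of disagreement is easier to spot mechanically than by eye. The Python below is a minimal sketch that assumes output shaped like those listings. The names `parse_nodetool_status`, `down_nodes` and `missing_from` are hypothetical helpers, not part of Cassandra or DSE, and the sample text is abbreviated from the listings in the report.

```python
import re
from typing import List, NamedTuple, Optional, Set


class Node(NamedTuple):
    datacenter: Optional[str]
    status: str   # "U" (up) or "D" (down)
    state: str    # "N", "L", "J" or "M"
    address: str
    host_id: str


# A node row starts with a two-letter status/state code and carries a
# UUID-shaped host ID followed by the rack name. This layout is assumed
# from the listings in the report, not taken from a format specification.
ROW = re.compile(
    r"^([UD])([NLJM])\s+(\S+)\s+.*?"
    r"([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s+\S+\s*$"
)


def parse_nodetool_status(text: str) -> List[Node]:
    """Collect one Node per row, remembering the current datacenter."""
    nodes: List[Node] = []
    datacenter: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Datacenter:"):
            datacenter = line.split(":", 1)[1].strip()
            continue
        match = ROW.match(line)
        if match:
            status, state, address, host_id = match.groups()
            nodes.append(Node(datacenter, status, state, address, host_id))
    return nodes


def down_nodes(nodes: List[Node]) -> List[Node]:
    """Nodes that this view of the ring reports as down."""
    return [n for n in nodes if n.status == "D"]


def missing_from(view_a: List[Node], view_b: List[Node]) -> Set[str]:
    """Host IDs present in view_a but absent from view_b."""
    return {n.host_id for n in view_a} - {n.host_id for n in view_b}


SAMPLE = """\
Datacenter: Analytics
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
UN  X.X.X.50  18.35 GB   1       16.7%             25efdbcd-14d3-4e9c-803a-3db5795d4efa  rack1
DN  X.X.X.51  195.67 KB  1       16.7%             d97cf86f-bfaf-4488-b716-26d71635a8fc  rack1
Datacenter: Cassandra
=====================
UN  X.X.X.33  128.95 GB  256     100.0%            871968c9-1d6b-4f06-ba90-8b3a8d92dcf0  rack1
"""

if __name__ == "__main__":
    for node in down_nodes(parse_nodetool_status(SAMPLE)):
        print(node.datacenter, node.address, node.host_id)
```

Running the parser over the output captured on two different machines and passing both results to `missing_from` would list the host IDs that one node still knows about and the other does not. The host ID, rather than the IP, is also the identifier that `nodetool removenode` expects.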



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)