Posted to commits@cassandra.apache.org by "Joseph Clark (JIRA)" <ji...@apache.org> on 2014/08/27 18:19:59 UTC

[jira] [Comment Edited] (CASSANDRA-7292) Can't seed new node into ring with (public) ip of an old node

    [ https://issues.apache.org/jira/browse/CASSANDRA-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112390#comment-14112390 ] 

Joseph Clark edited comment on CASSANDRA-7292 at 8/27/14 4:18 PM:
------------------------------------------------------------------

I'm running into this same issue using version 1.2.16. 

The seed list does not appear to be an issue as I have no problems bringing a new node into the cluster with a different public IP address using the same seed list.

Furthermore, on the seed node I see an incoming TCP connection from the replacement node when I try to start it up:

Aug 26 21:50:24 <seed node> cassandra:  [Thread-9] DEBUG org.apache.cassandra.net.IncomingTcpConnection  - Connection version 6 from <replacement node public IP>
Aug 26 21:50:24 <seed node> cassandra:  [Thread-9] DEBUG org.apache.cassandra.net.IncomingTcpConnection  - Upgrading incoming connection to be compressed
Aug 26 21:50:24 <seed node> cassandra:  [Thread-9] DEBUG org.apache.cassandra.net.IncomingTcpConnection  - Max version for <replacement node public IP> is 6
Aug 26 21:50:24 <seed node> cassandra:  [Thread-9] DEBUG org.apache.cassandra.net.MessagingService  - Setting version 6 for <replacement node public IP>
Aug 26 21:50:24 <seed node> cassandra:  [Thread-9] DEBUG org.apache.cassandra.net.IncomingTcpConnection  - Set version for <replacement node public IP> to 6 (will use 6)
Aug 26 21:50:24 <seed node> cassandra:  [GossipStage:1] DEBUG org.apache.cassandra.gms.Gossiper  - Shadow request received, adding all states
Aug 26 21:50:50 <seed node> cassandra:  [ScheduledTasks:1] DEBUG org.apache.cassandra.service.LoadBroadcaster  - Disseminating load info ...
Aug 26 21:50:55 <seed node> cassandra:  [Thread-9] DEBUG org.apache.cassandra.net.MessagingService  - Reseting version for <replacement node public IP>

Shortly afterward, the Cassandra service on the replacement node logs the error message reported in the bug description and stops.

Aug 26 21:50:54 <replacement node> java.lang.RuntimeException:  Unable to gossip with any seeds
Aug 26 21:50:54 <replacement node>: at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1124)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.StorageService.prepareReplacementInfo(StorageService.java:396)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:612)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.StorageService.initServer(StorageService.java:591)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.StorageService.initServer(StorageService.java:480)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
Aug 26 21:50:54 <replacement node>:  [StorageServiceShutdownHook] ERROR org.apache.cassandra.service.CassandraDaemon  - Exception in thread Thread[StorageServiceShutdownHook,5,main]
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.gms.Gossiper.stop(Gossiper.java:1193)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:550)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
Aug 26 21:50:54 <replacement node>:      at java.lang.Thread.run(Unknown Source)


was (Author: jw.clark):
I'm running into this same issue using version 1.2.16. 

The seed list does not appear to be an issue as I have no problems bringing a new node into the cluster with a different public IP address using the same seed list.

Furthermore, On the seed node I see an incoming TCP connection from the replacement node when I try start it up:

Aug 26 21:50:24 <seed node> cassandra:  [Thread-9] DEBUG org.apache.cassandra.net.IncomingTcpConnection  - Connection version 6 from <replacement node public IP>
Aug 26 21:50:24 <seed node> cassandra:  [Thread-9] DEBUG org.apache.cassandra.net.IncomingTcpConnection  - Upgrading incoming connection to be compressed
Aug 26 21:50:24 <seed node> cassandra:  [Thread-9] DEBUG org.apache.cassandra.net.IncomingTcpConnection  - Max version for <replacement node public IP> is 6
Aug 26 21:50:24 <seed node> cassandra:  [Thread-9] DEBUG org.apache.cassandra.net.MessagingService  - Setting version 6 for <replacement node public IP>
Aug 26 21:50:24 <seed node> cassandra:  [Thread-9] DEBUG org.apache.cassandra.net.IncomingTcpConnection  - Set version for <replacement node public IP> to 6 (will use 6)
Aug 26 21:50:24 <seed node> cassandra:  [GossipStage:1] DEBUG org.apache.cassandra.gms.Gossiper  - Shadow request received, adding all states
Aug 26 21:50:50 <seed node> cassandra:  [ScheduledTasks:1] DEBUG org.apache.cassandra.service.LoadBroadcaster  - Disseminating load info ...
Aug 26 21:50:55 <seed node> cassandra:  [Thread-9] DEBUG org.apache.cassandra.net.MessagingService  - Reseting version for <replacement node public IP>

Shortly afterward the cassandra service on the replacement node logs the error message reported in the bug description and stops.

Aug 26 21:50:54 <replacement node> java.lang.RuntimeException:  Unable to gossip with any seeds
Aug 26 21:50:54 <replacement node>: at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1124)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.StorageService.prepareReplacementInfo(StorageService.java:396)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:612)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.StorageService.initServer(StorageService.java:591)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.StorageService.initServer(StorageService.java:480)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
Aug 26 21:50:54 <replacement node>:  [StorageServiceShutdownHook] ERROR org.apache.cassandra.service.CassandraDaemon  - Exception in thread Thread[StorageServiceShutdownHook,5,main]
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.gms.Gossiper.stop(Gossiper.java:1193)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:550)
Aug 26 21:50:54 <replacement node>:      at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
Aug 26 21:50:54 <replacement node>:      at java.lang.Thread.run(Unknown Source)

> Can't seed new node into ring with (public) ip of an old node
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-7292
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7292
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Cassandra 2.0.7, Ec2MultiRegionSnitch
>            Reporter: Juho Mäkinen
>              Labels: bootstrap, gossip
>         Attachments: cassandra-replace-address.log
>
>
> This bug prevents a node from bootstrapping back into the cluster with its old IP.
> Scenario: a five-node EC2 cluster spread across three AZs, all in one region. I'm using Ec2MultiRegionSnitch. Nodes are reported with their public IPs (as Ec2MultiRegionSnitch requires).
> I simulated the loss of one node by terminating one instance. nodetool status correctly reported that the node was down. Then I launched a new instance with the old public IP (I'm using Elastic IPs) with "-Dcassandra.replace_address=IP_ADDRESS", but the new node can't join the cluster:
>  INFO 07:20:43,424 Gathering node replacement information for /54.86.191.30
>  INFO 07:20:43,428 Starting Messaging Service on port 9043
>  INFO 07:20:43,489 Handshaking version with /54.86.171.10
>  INFO 07:20:43,491 Handshaking version with /54.86.187.245
> (some delay)
> ERROR 07:21:14,445 Exception encountered during startup
> java.lang.RuntimeException: Unable to gossip with any seeds
> 	at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1193)
> 	at org.apache.cassandra.service.StorageService.prepareReplacementInfo(StorageService.java:419)
> 	at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:650)
> 	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:612)
> 	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:505)
> 	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:362)
> 	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:480)
> 	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:569)
> It does not help if I remove the "-Dcassandra.replace_address=IP_ADDRESS" system property.
> It also does not help to remove the node with "nodetool removenode", with or without the cassandra.replace_address property.
> I think this is because the node information is preserved in the gossip state, as seen in this output of "nodetool gossipinfo":
> /54.86.191.30
>   INTERNAL_IP:172.16.1.231
>   DC:us-east
>   REMOVAL_COORDINATOR:REMOVER,d581309a-8610-40d4-ba30-cb250eda22a8
>   STATUS:removed,19311925-46b5-4fe4-928a-321e8adb731d,1401089960664
>   HOST_ID:19311925-46b5-4fe4-928a-321e8adb731d
>   RPC_ADDRESS:0.0.0.0
>   NET_VERSION:7
>   SCHEMA:226f9315-b4b2-32c1-bfe1-f4bb49fccfd5
>   RACK:1b
>   LOAD:7.075290515E9
>   SEVERITY:0.0
>   RELEASE_VERSION:2.0.7
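
To check whether another cluster still carries a removed endpoint in its gossip state, as in the "nodetool gossipinfo" output quoted above, the text can be scanned for entries whose STATUS begins with "removed". The following is a minimal sketch, not part of the original report. It assumes each endpoint line starts with "/" and each state line has the form KEY:VALUE, as in the excerpt; other Cassandra versions may format the output differently. The function names parse_gossipinfo and removed_endpoints are hypothetical helpers, and SAMPLE is a shortened copy of the quoted excerpt.

```python
def parse_gossipinfo(text):
    """Parse gossipinfo-style text into {endpoint: {state_key: value}}.

    Assumption: endpoint lines start with "/" and state lines look like
    "KEY:VALUE", matching the excerpt quoted in the issue description.
    """
    endpoints = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("/"):
            # New endpoint section, e.g. "/54.86.191.30"
            current = line.lstrip("/")
            endpoints[current] = {}
        elif current is not None and ":" in line:
            # Split on the first colon only; values may contain colons.
            key, value = line.split(":", 1)
            endpoints[current][key.strip()] = value.strip()
    return endpoints


def removed_endpoints(endpoints):
    """Return endpoints whose STATUS value begins with "removed"."""
    return [
        ip
        for ip, state in endpoints.items()
        if state.get("STATUS", "").split(",")[0] == "removed"
    ]


# Shortened copy of the gossipinfo excerpt from the issue description.
SAMPLE = """\
/54.86.191.30
  INTERNAL_IP:172.16.1.231
  STATUS:removed,19311925-46b5-4fe4-928a-321e8adb731d,1401089960664
  HOST_ID:19311925-46b5-4fe4-928a-321e8adb731d
"""

if __name__ == "__main__":
    print(removed_endpoints(parse_gossipinfo(SAMPLE)))
```

In practice the input would be the captured output of "nodetool gossipinfo" from a live node rather than SAMPLE. A non-empty result only shows that the removed state is still being gossiped; it does not by itself confirm that this is the cause of the failed shadow round.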


