Posted to commits@cassandra.apache.org by "Stefania (JIRA)" <ji...@apache.org> on 2015/12/18 16:32:46 UTC

[jira] [Commented] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

    [ https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064093#comment-15064093 ] 

Stefania commented on CASSANDRA-8072:
-------------------------------------

I've reproduced it on 2.1 HEAD with this dtest, which follows the steps of CASSANDRA-8422:

{code}
def decommissioned_wiped_node_can_gossip_to_single_seed_test(self):
        """
        @jira_ticket CASSANDRA-8072
        @jira_ticket CASSANDRA-8422
        Test that if we decommission a node, kill it and wipe its data, it can join a cluster with a single
        seed node.
        """
        cluster = self.cluster
        cluster.populate(1)
        cluster.start(wait_for_binary_proto=True)

        # Add a new node, bootstrap=True ensures that it is not a seed
        node2 = new_node(cluster, bootstrap=True)
        node2.start(wait_for_binary_proto=True, wait_other_notice=True)

        # Decommission the new node and kill it
        node2.decommission()
        node2.stop(gently=False)

        # Wipe its data
        data_dir = os.path.join(node2.get_path(), 'data')
        commitlog_dir = os.path.join(node2.get_path(), 'commitlogs')
        debug("Deleting {} and {}".format(data_dir, commitlog_dir))
        shutil.rmtree(data_dir)
        shutil.rmtree(commitlog_dir)

        # Now start it, it should be allowed to join
        mark = node2.mark_log()
        node2.start(wait_other_notice=False)
        node2.watch_log_for("JOINING:", from_mark=mark)
{code} 

This results in the following exception in node2:

{code}
ERROR [main] 2015-12-18 16:21:07,357 CassandraDaemon.java:581 - Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
        at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1337) ~[main/:na]
        at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:541) ~[main/:na]
        at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:789) ~[main/:na]
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:721) ~[main/:na]
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:612) ~[main/:na]
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:389) [main/:na]
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:564) [main/:na]
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:653) [main/:na]
WARN  [StorageServiceShutdownHook] 2015-12-18 16:21:07,360 Gossiper.java:1454 - No local state or state is in silent shutdown, not announcing shutdown
INFO  [StorageServiceShutdownHook] 2015-12-18 16:21:07,361 MessagingService.java:734 - Waiting for messaging service to quiesce
INFO  [ACCEPT-/127.0.0.2] 2015-12-18 16:21:07,361 MessagingService.java:1018 - MessagingService has terminated the accept() thread
{code}
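
To check node2's log for this failure without reading it by hand, a small scanner along the following lines may help. This is only a sketch: `find_seed_gossip_failure` is a hypothetical helper, not part of cassandra-dtest or ccm, and it assumes the log format shown in the trace above (the exception line followed by indented `at ...` frames).

```python
import re

# Hypothetical helper for illustration; not part of cassandra-dtest or ccm.
SEED_GOSSIP_FAILURE = "java.lang.RuntimeException: Unable to gossip with any seeds"

# Matches a stack frame line such as
# "        at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1337) ~[main/:na]"
# and captures the fully qualified method name.
FRAME = re.compile(r"^\s*at\s+([\w.$]+)\(")


def find_seed_gossip_failure(log_text):
    """Return the method names of the stack frames that follow the first
    'Unable to gossip with any seeds' exception, or None if it is absent."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if SEED_GOSSIP_FAILURE in line:
            frames = []
            for follow in lines[i + 1:]:
                match = FRAME.match(follow)
                if match is None:
                    # First non-frame line ends the stack trace.
                    break
                frames.append(match.group(1))
            return frames
    return None
```

If the first returned frame is `org.apache.cassandra.gms.Gossiper.doShadowRound`, the failure came from the startup shadow round, as in the trace above.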

> Exception during startup: Unable to gossip with any seeds
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-8072
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Lifecycle
>            Reporter: Ryan Springer
>            Assignee: Stefania
>             Fix For: 2.1.x
>
>         Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, casandra-system-log-with-assert-patch.log, screenshot-1.png, trace_logs.tar.bz2
>
>
> When Opscenter 4.1.4 or 5.0.1 tries to provision a 2-node DSC 2.0.10 cluster in either ec2 or locally, one of the nodes sometimes refuses to start C*.  The error in the /var/log/cassandra/system.log is:
> ERROR [main] 2014-10-06 15:54:52,292 CassandraDaemon.java (line 513) Exception encountered during startup
> java.lang.RuntimeException: Unable to gossip with any seeds
>         at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1200)
>         at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:444)
>         at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:655)
>         at org.apache.cassandra.service.StorageService.initServer(StorageService.java:609)
>         at org.apache.cassandra.service.StorageService.initServer(StorageService.java:502)
>         at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378)
>         at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
>         at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
>  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:52,326 Gossiper.java (line 1279) Announcing shutdown
>  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:54,326 MessagingService.java (line 701) Waiting for messaging service to quiesce
>  INFO [ACCEPT-localhost/127.0.0.1] 2014-10-06 15:54:54,327 MessagingService.java (line 941) MessagingService has terminated the accept() thread
> This error does not always occur when provisioning a 2-node cluster; it happens probably around half of the time, and on only one of the nodes.  I haven't been able to reproduce this error with DSC 2.0.9, and there have been no code or definition file changes in Opscenter.
> I can reproduce locally with the above steps.  I'm happy to test any proposed fixes since I'm the only person able to reproduce reliably so far.


