Posted to commits@cassandra.apache.org by "Stefania (JIRA)" <ji...@apache.org> on 2015/12/18 16:32:46 UTC
[jira] [Commented] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds
[ https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064093#comment-15064093 ]
Stefania commented on CASSANDRA-8072:
-------------------------------------
I've reproduced it on 2.1 HEAD with this dtest, which follows the steps of CASSANDRA-8422:
{code}
def decommissioned_wiped_node_can_gossip_to_single_seed_test(self):
    """
    @jira_ticket CASSANDRA-8072
    @jira_ticket CASSANDRA-8422
    Test that if we decommission a node, kill it and wipe its data, it can join a cluster with a single
    seed node.
    """
    cluster = self.cluster
    cluster.populate(1)
    cluster.start(wait_for_binary_proto=True)

    # Add a new node, bootstrap=True ensures that it is not a seed
    node2 = new_node(cluster, bootstrap=True)
    node2.start(wait_for_binary_proto=True, wait_other_notice=True)

    # Decommission the new node and kill it
    node2.decommission()
    node2.stop(gently=False)

    # Wipe its data
    data_dir = os.path.join(node2.get_path(), 'data')
    commitlog_dir = os.path.join(node2.get_path(), 'commitlogs')
    debug("Deleting {}".format(data_dir))
    shutil.rmtree(data_dir)
    shutil.rmtree(commitlog_dir)

    # Now start it, it should be allowed to join
    mark = node2.mark_log()
    node2.start(wait_other_notice=False)
    node2.watch_log_for("JOINING:", from_mark=mark)
{code}
This results in the following exception in node2:
{code}
ERROR [main] 2015-12-18 16:21:07,357 CassandraDaemon.java:581 - Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
    at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1337) ~[main/:na]
    at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:541) ~[main/:na]
    at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:789) ~[main/:na]
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:721) ~[main/:na]
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:612) ~[main/:na]
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:389) [main/:na]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:564) [main/:na]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:653) [main/:na]
WARN [StorageServiceShutdownHook] 2015-12-18 16:21:07,360 Gossiper.java:1454 - No local state or state is in silent shutdown, not announcing shutdown
INFO [StorageServiceShutdownHook] 2015-12-18 16:21:07,361 MessagingService.java:734 - Waiting for messaging service to quiesce
INFO [ACCEPT-/127.0.0.2] 2015-12-18 16:21:07,361 MessagingService.java:1018 - MessagingService has terminated the accept() thread
{code}
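Since the node dies during startup, the final watch_log_for("JOINING:") in the test above never matches. As a rough sketch only, a test could instead scan node2's log text for this specific failure. The function name find_seed_gossip_failure and the constant SEED_GOSSIP_ERROR are hypothetical; they are not part of dtest or ccm, and the parsing assumes the stack frames directly follow the exception line, as in the log above.

```python
# Hypothetical helper, not part of dtest or ccm.
SEED_GOSSIP_ERROR = "Unable to gossip with any seeds"


def find_seed_gossip_failure(log_text):
    """Return the method names of the stack frames that follow the
    seed gossip error, or None if the error is not in the log text."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if SEED_GOSSIP_ERROR not in line:
            continue
        frames = []
        for follow in lines[i + 1:]:
            stripped = follow.strip()
            # Stack frames look like "at pkg.Class.method(File.java:NN) ..."
            if not stripped.startswith("at "):
                break
            # Keep only the fully qualified method name
            frames.append(stripped[3:].split("(", 1)[0])
        return frames
    return None
```

Passing the contents of node2's system.log to this function would return the frames starting at Gossiper.doShadowRound for the failure shown above, and None for a clean startup.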
> Exception during startup: Unable to gossip with any seeds
> ---------------------------------------------------------
>
> Key: CASSANDRA-8072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
> Project: Cassandra
> Issue Type: Bug
> Components: Lifecycle
> Reporter: Ryan Springer
> Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, casandra-system-log-with-assert-patch.log, screenshot-1.png, trace_logs.tar.bz2
>
>
> When Opscenter 4.1.4 or 5.0.1 tries to provision a 2-node DSC 2.0.10 cluster in either ec2 or locally, an error occurs sometimes with one of the nodes refusing to start C*. The error in the /var/log/cassandra/system.log is:
> ERROR [main] 2014-10-06 15:54:52,292 CassandraDaemon.java (line 513) Exception encountered during startup
> java.lang.RuntimeException: Unable to gossip with any seeds
>     at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1200)
>     at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:444)
>     at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:655)
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:609)
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:502)
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378)
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
> INFO [StorageServiceShutdownHook] 2014-10-06 15:54:52,326 Gossiper.java (line 1279) Announcing shutdown
> INFO [StorageServiceShutdownHook] 2014-10-06 15:54:54,326 MessagingService.java (line 701) Waiting for messaging service to quiesce
> INFO [ACCEPT-localhost/127.0.0.1] 2014-10-06 15:54:54,327 MessagingService.java (line 941) MessagingService has terminated the accept() thread
> This error does not always occur when provisioning a 2-node cluster, but probably around half of the time, and on only one of the nodes. I haven't been able to reproduce this error with DSC 2.0.9, and there have been no code or definition file changes in Opscenter.
> I can reproduce locally with the above steps. I'm happy to test any proposed fixes since I'm the only person able to reproduce reliably so far.