Posted to commits@cassandra.apache.org by "Jon Meredith (Jira)" <ji...@apache.org> on 2020/12/04 01:12:00 UTC

[jira] [Commented] (CASSANDRA-16159) Reduce the Severity of Errors Reported in FailureDetector#isAlive()

    [ https://issues.apache.org/jira/browse/CASSANDRA-16159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243618#comment-17243618 ] 

Jon Meredith commented on CASSANDRA-16159:
------------------------------------------

We've seen this a few more times and I've been investigating the root cause. The {{isAlive}} failure is followed by

{code}
ERROR 2020-11-20T23:27:53,218 [GossipStage:1] org.apache.cassandra.service.CassandraDaemon:510 - Exception in thread Thread[GossipStage:1,5,main]
java.lang.RuntimeException: Node /10.20.30.40:65501 is trying to replace node /10.20.30.50:65501 with tokens [] with a different set of tokens [-3074457345618258603].
        at org.apache.cassandra.locator.TokenMetadata.addReplaceTokens(TokenMetadata.java:366) ~[cassandra-4.0.jar:4.0]
        at org.apache.cassandra.service.StorageService.handleStateBootreplacing(StorageService.java:2606) ~[cassandra-4.0.jar:4.0]
        at org.apache.cassandra.service.StorageService.onChange(StorageService.java:2269) ~[cassandra-4.0.jar:4.0]
        at org.apache.cassandra.service.StorageService.onJoin(StorageService.java:3232) ~[cassandra-4.0.jar:4.0]
        at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1268) ~[cassandra-4.0.jar:4.0]
        at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1395) ~[cassandra-4.0.jar:4.0]
        at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:50) ~[cassandra-4.0.jar:4.0]
        at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:77) ~[cassandra-4.0.jar:4.0]
        at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:93) ~[cassandra-4.0.jar:4.0]
        at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:44) ~[cassandra-4.0.jar:4.0]
        at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:890) ~[cassandra-4.0.jar:4.0]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-all-4.1.50.Final.jar:4.1.50.Final]
        at java.lang.Thread.run(Thread.java:834) [?:?]
{code}

At the time multiple nodes are being replaced concurrently and one of the nodes is included in the seed list for the others.

My current hypothesis is that the replacement for the seed node starts up and goes through {{prepareForReplacement}}, doing a successful shadow gossip round with an older running node in the cluster, so the check for the replaced address passes. {{prepareToJoin}} then starts the {{Gossiper}} on the empty replacement and populates it with just the local state (as it was an empty replacement node, there was nothing to load from {{SystemKeyspace}}).
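
As a rough illustration (a hypothetical, simplified model, not the actual {{Gossiper}} code; the class and map here are made up for the sketch, and which address is the seed is assumed from the log), this is roughly what the empty replacement seed is left holding after {{prepareToJoin}}: a single entry for itself in BOOT_REPLACE, and nothing about the node it is replacing or that node's tokens.

{code}
// Hypothetical sketch, not Cassandra source: the only gossip state the empty
// replacement seed holds after prepareToJoin() is its own BOOT_REPLACE entry.
import java.util.Map;

public class ReplacementSeedGossipSketch
{
    public static void main(String[] args)
    {
        // Addresses taken from the log above; /10.20.30.40:65501 is assumed to be
        // the replacement seed and /10.20.30.50:65501 the node it is replacing.
        Map<String, String> endpointStatus =
            Map.of("/10.20.30.40:65501", "BOOT_REPLACE,/10.20.30.50:65501");

        // SystemKeyspace was empty, so no other endpoints or tokens were loaded;
        // this single entry is all the seed can offer to a node that gossips to it.
        System.out.println(endpointStatus);
    }
}
{code}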

The second replacement (the one reporting the two errors above) starts up around the same time as the first, does its own shadow round against an older running node as well, completes {{prepareForReplacement}}, and starts its {{Gossiper}} too, which happens to pick the other replacement node as the first seed to gossip to.

The GossipDigestSyn/Ack exchange then happens, with the replacement seed sending back the single entry for itself in state BOOTSTRAPPING_REPLACE. The non-seed replacement node tries to handle that state but fails with the exception above, because its local token metadata has no record of the replaced node's tokens to check against.
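
Putting that together, a minimal sketch of the failing check (paraphrased from the error message and stack trace; this is not the actual {{TokenMetadata.addReplaceTokens}} implementation, and the names are simplified):

{code}
// Hypothetical paraphrase of the consistency check the stack trace points at;
// not Cassandra source. On the non-seed replacement, the locally known tokens
// for the replaced node are empty, so they can never match the tokens the
// replacement seed advertises in its BOOT_REPLACE state.
import java.util.Collection;
import java.util.List;

public class ReplaceTokensCheckSketch
{
    static void addReplaceTokens(Collection<Long> replacingTokens, String newNode,
                                 String oldNode, Collection<Long> tokensKnownForOldNode)
    {
        if (!tokensKnownForOldNode.containsAll(replacingTokens)
            || !replacingTokens.containsAll(tokensKnownForOldNode))
        {
            throw new RuntimeException(String.format(
                "Node %s is trying to replace node %s with tokens %s with a different set of tokens %s.",
                newNode, oldNode, tokensKnownForOldNode, replacingTokens));
        }
        // otherwise the replacing tokens would be recorded for newNode
    }

    public static void main(String[] args)
    {
        // Reproduces the shape of the failure in the log above.
        addReplaceTokens(List.of(-3074457345618258603L),
                         "/10.20.30.40:65501",
                         "/10.20.30.50:65501",
                         List.of());
    }
}
{code}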


> Reduce the Severity of Errors Reported in FailureDetector#isAlive()
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-16159
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16159
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip
>            Reporter: Caleb Rackliffe
>            Assignee: Jon Meredith
>            Priority: Normal
>             Fix For: 4.0-rc
>
>
> Noticed the following error in the failure detector during a host replacement:
> {noformat}
> java.lang.IllegalArgumentException: Unknown endpoint: 10.38.178.98:7000
> 	at org.apache.cassandra.gms.FailureDetector.isAlive(FailureDetector.java:281)
> 	at org.apache.cassandra.service.StorageService.handleStateBootreplacing(StorageService.java:2502)
> 	at org.apache.cassandra.service.StorageService.onChange(StorageService.java:2182)
> 	at org.apache.cassandra.service.StorageService.onJoin(StorageService.java:3145)
> 	at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1242)
> 	at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1368)
> 	at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:50)
> 	at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:77)
> 	at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:93)
> 	at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:44)
> 	at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:884)
> 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> 	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> {noformat}
> This particular error looks benign, given that even if it occurs, the node continues to handle the {{BOOT_REPLACE}} state. There are two things we might be able to do to improve {{FailureDetector#isAlive()}} though:
> 1.) We don’t short circuit in the case that the endpoint in question is in quarantine after being removed. It may be useful to check for this so we can avoid logging an ERROR when the endpoint is clearly doomed/dead. (Quarantine works great when the gossip message is _from_ a quarantined endpoint, but in this case, that would be the new/replacing and not the old/replaced one.)
> 2.) We can reduce the severity of the logging from ERROR to WARN and provide better context around how to determine whether or not there’s actually a problem. (ex. “If this occurs while trying to determine liveness for a node that is currently being replaced, it can be safely ignored.”)
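
For reference, a rough sketch of what those two suggestions could look like in {{FailureDetector#isAlive()}} (pseudo-patch only, not the committed change; the {{isQuarantined}} helper and the exact log wording are hypothetical):

{code}
// Sketch only -- not the committed fix. isQuarantined(...) is a hypothetical
// helper standing in for "was this endpoint recently removed and quarantined?".
public boolean isAlive(InetAddressAndPort ep)
{
    if (ep.equals(FBUtilities.getBroadcastAddressAndPort()))
        return true;

    EndpointState epState = Gossiper.instance.getEndpointStateForEndpoint(ep);
    if (epState == null)
    {
        // (1) Short-circuit endpoints that were removed and are still quarantined:
        //     they are known to be dead, so there is nothing to report loudly.
        if (Gossiper.instance.isQuarantined(ep)) // hypothetical helper
            return false;

        // (2) Downgrade ERROR to WARN and give the operator some context.
        logger.warn("Unknown endpoint {}. If this occurs while determining liveness of a node " +
                    "that is currently being replaced, it can be safely ignored.", ep);
        return false;
    }
    return epState.isAlive();
}
{code}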



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
