Posted to commits@cassandra.apache.org by "Caleb Rackliffe (Jira)" <ji...@apache.org> on 2021/07/19 23:32:00 UTC

[jira] [Commented] (CASSANDRA-16796) Clear pending ranges for a SHUTDOWN peer

    [ https://issues.apache.org/jira/browse/CASSANDRA-16796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383658#comment-17383658 ] 

Caleb Rackliffe commented on CASSANDRA-16796:
---------------------------------------------

Two questions...

1.) So just to make things explicit (for me, a non-gossip expert), notifying subscribers in {{onChange()}} means we hit {{updateNormalTokens()}}, which removes the endpoint from the "moving endpoints" set and eventually removes the pending ranges?

2.) What guarantees that a {{PendingRangeTask}} has run by the time {{gossipShutdownUpdatesTokenMetadata()}} verifies there are no longer pending ranges for node 2?
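
To make the concern in question 2 concrete, here is a minimal sketch of the race and one way a test could close it. It does not use Cassandra's classes: {{PendingRangeWaitSketch}}, {{onShutdown()}}, {{blockUntilFinished()}} and {{hasPendingRanges()}} are hypothetical names, and the single-threaded executor is only an assumed stand-in for whatever runs the pending range calculation asynchronously. The idea is that an assertion made right after the state change can run before the background task does, unless the test first waits for the queue to drain.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical model, not Cassandra code: pending ranges are recalculated on a
// background thread, so a test has to wait for that work before asserting.
public class PendingRangeWaitSketch {
    // Assumed stand-in for "endpoints that still have pending ranges".
    static final Set<String> pendingEndpoints = ConcurrentHashMap.newKeySet();

    // A single-threaded executor runs tasks in submission order.
    static final ExecutorService calculator = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "pending-range-sketch");
        t.setDaemon(true); // do not keep the JVM alive for this sketch
        return t;
    });

    // Models the SHUTDOWN notification: the cleanup is only scheduled here,
    // it has not necessarily run when this method returns.
    static void onShutdown(String endpoint) {
        calculator.submit(() -> { pendingEndpoints.remove(endpoint); });
    }

    // Submits an empty task and waits for it. Because the executor is
    // single-threaded and FIFO, every earlier task has completed by then.
    static void blockUntilFinished() {
        try {
            calculator.submit(() -> { }).get(10, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new RuntimeException("pending range work did not finish", e);
        }
    }

    static boolean hasPendingRanges(String endpoint) {
        return pendingEndpoints.contains(endpoint);
    }
}
```

In a test built on this model, calling {{blockUntilFinished()}} between {{onShutdown("node2")}} and the {{hasPendingRanges("node2")}} check is what makes the assertion deterministic. Whether the actual test has an equivalent barrier is exactly what the question asks.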

The 3.0 patch looks good, so moving on to 4.0...



> Clear pending ranges for a SHUTDOWN peer
> ----------------------------------------
>
>                 Key: CASSANDRA-16796
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16796
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Membership
>            Reporter: Sam Tunnicliffe
>            Assignee: Sam Tunnicliffe
>            Priority: Normal
>             Fix For: 3.0.x, 3.11.x, 4.0.x
>
>
> If a node involved in a MOVE operation should fail, peers can sometimes maintain pending ranges for it even when it has left the ring and/or been replaced (in practice until the peer is next bounced). This in turn can lead to bogus unavailable responses to clients if a replica for any of the pending ranges should go down.
> If the moving node crashes hard, a subsequent replacement will correctly fail as long as cassandra.consistent.rangemovement is set to true because the new node will learn the MOVING status from the remaining peers. A graceful shutdown, however, causes that status to be replaced with SHUTDOWN, but doesn't update TokenMetadata, so pending ranges remain for the down node, even after it has been removed from the ring.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
