Posted to commits@cassandra.apache.org by "Paulo Motta (Jira)" <ji...@apache.org> on 2020/09/23 14:39:00 UTC

[jira] [Commented] (CASSANDRA-16138) Refactor Local Ring Management

    [ https://issues.apache.org/jira/browse/CASSANDRA-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200863#comment-17200863 ] 

Paulo Motta commented on CASSANDRA-16138:
-----------------------------------------

For those interested, an initial WIP prototype of the refactoring is available on this branch: [pauloricardomg/tokenmetadata-refactor-wip|https://github.com/pauloricardomg/cassandra/tree/tokenmetadata-refactor-wip].

Virtual Node Lifecycle:

!vnode-lifecyle.png!

The refactoring is mostly contained in the [a.o.c.ring package|https://github.com/pauloricardomg/cassandra/tree/tokenmetadata-refactor-wip/src/java/org/apache/cassandra/ring]. Below is a short description of each relevant class:
 * [RingManager|https://github.com/pauloricardomg/cassandra/blob/tokenmetadata-refactor-wip/src/java/org/apache/cassandra/ring/RingManager.java]: receives cluster membership updates from gossip and maintains an immutable snapshot of the current local ring membership view.
 * [RingSnapshot|https://github.com/pauloricardomg/cassandra/blob/tokenmetadata-refactor-wip/src/java/org/apache/cassandra/ring/RingSnapshot.java]: an immutable local view of the current cluster ring membership state.
 * [VirtualNode|https://github.com/pauloricardomg/cassandra/blob/tokenmetadata-refactor-wip/src/java/org/apache/cassandra/ring/token/VirtualNode.java]: an immutable Token-Owner relationship and its current state.
 * [RingOverlay|https://github.com/pauloricardomg/cassandra/blob/tokenmetadata-refactor-wip/src/java/org/apache/cassandra/ring/RingOverlay.java]: a token routing overlay on top of a RingSnapshot.
 * [MultiDataCenterRingOverlay|https://github.com/pauloricardomg/cassandra/blob/tokenmetadata-refactor-wip/src/java/org/apache/cassandra/ring/MultiDatacenterRingOverlay.java]: the RingOverlay generated by NetworkTopologyStrategy.
 * [RingOverlayStableTest|https://github.com/pauloricardomg/cassandra/blob/tokenmetadata-refactor-wip/test/unit/org/apache/cassandra/ring/RingOverlayStableTest.java]: ensures parity between MultiDataCenterRingOverlay and legacy TokenMetadata/NTS on a stable ring.
 * [RingOverlayBootstrapTest|https://github.com/pauloricardomg/cassandra/blob/tokenmetadata-refactor-wip/test/unit/org/apache/cassandra/ring/RingOverlayBootstrapTest.java]: ensures parity between MultiDataCenterRingOverlay and legacy TokenMetadata/NTS on a ring with bootstrapping nodes.
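
As a rough illustration of how the first three classes relate, here is a minimal sketch. It is not code from the branch: every name prefixed with Sketch is hypothetical, the three vnode states are placeholders for the lifecycle in the diagram, and tokens and owners are simplified to a long and a String.

```java
import java.util.Collections;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

/** Placeholder states (assumption); the real lifecycle is the one in the diagram. */
enum SketchVnodeState { JOINING, NORMAL, LEAVING }

/** Immutable token-owner relationship and its current state. */
final class SketchVirtualNode {
    final long token;
    final String owner;
    final SketchVnodeState state;

    SketchVirtualNode(long token, String owner, SketchVnodeState state) {
        this.token = token;
        this.owner = owner;
        this.state = state;
    }
}

/** Immutable view of the ring: every change produces a new instance. */
final class SketchRingSnapshot {
    static final SketchRingSnapshot EMPTY = new SketchRingSnapshot(new TreeMap<>());

    private final NavigableMap<Long, SketchVirtualNode> vnodes;

    private SketchRingSnapshot(NavigableMap<Long, SketchVirtualNode> vnodes) {
        // Callers always pass a fresh copy, so wrapping it is enough.
        this.vnodes = Collections.unmodifiableNavigableMap(vnodes);
    }

    /** Returns a new snapshot with the vnode added or replaced; this one is untouched. */
    SketchRingSnapshot with(SketchVirtualNode vnode) {
        NavigableMap<Long, SketchVirtualNode> copy = new TreeMap<>(vnodes);
        copy.put(vnode.token, vnode);
        return new SketchRingSnapshot(copy);
    }

    /** First vnode at or after the token, wrapping around the ring; null if empty. */
    SketchVirtualNode successor(long token) {
        if (vnodes.isEmpty()) return null;
        Map.Entry<Long, SketchVirtualNode> entry = vnodes.ceilingEntry(token);
        return (entry != null ? entry : vnodes.firstEntry()).getValue();
    }

    int size() { return vnodes.size(); }
}

/** Holds the current snapshot and swaps it atomically on each membership update. */
final class SketchRingManager {
    private final AtomicReference<SketchRingSnapshot> current =
            new AtomicReference<>(SketchRingSnapshot.EMPTY);

    SketchRingSnapshot snapshot() { return current.get(); }

    SketchRingSnapshot apply(SketchVirtualNode vnode) {
        // The update function may be retried under contention, so it must be side-effect free.
        return current.updateAndGet(s -> s.with(vnode));
    }
}
```

A reader that grabbed a snapshot before an update keeps seeing a consistent ring, because updates never mutate an existing snapshot. Copying the whole map on each update is only acceptable for a sketch; a persistent data structure would avoid that cost.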

 

The current TODO is to finish mapping all legacy states to the vnode lifecycle above, and to add tests that ensure parity with the current implementation.

> Refactor Local Ring Management
> ------------------------------
>
>                 Key: CASSANDRA-16138
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16138
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Membership, Feature/Virtual Nodes, Legacy/Distributed Metadata
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>            Priority: Normal
>         Attachments: vnode-lifecyle.png
>
>
> Token ring management is one of the most critical parts of Cassandra, yet one of the most overlooked. Some of the problems include but are not limited to:
> * Complexity (e.g. [pending range calculation|https://github.com/apache/cassandra/blob/8ba163f25a56cb507e621b89b6928c2aef0ecc57/src/java/org/apache/cassandra/locator/TokenMetadata.java#L878])
> * Inefficiency (e.g. [pending range calculation|https://github.com/apache/cassandra/blob/8ba163f25a56cb507e621b89b6928c2aef0ecc57/src/java/org/apache/cassandra/locator/TokenMetadata.java#L878], [AbstractReplicationStrategy.getAddressReplicas|https://github.com/apache/cassandra/blob/33eada06a6dd3529da644377dba180795f522176/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L233])
> * Race conditions (e.g. [here|https://github.com/apache/cassandra/blob/8ba163f25a56cb507e621b89b6928c2aef0ecc57/src/java/org/apache/cassandra/locator/ReplicaLayout.java#L198])
> * Poor modularity and consistency (e.g. natural replicas computed from [NetworkTopologyStrategy|https://github.com/apache/cassandra/blob/8ba163f25a56cb507e621b89b6928c2aef0ecc57/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java] and pending replicas computed from [TokenMetadata|https://github.com/apache/cassandra/blob/8ba163f25a56cb507e621b89b6928c2aef0ecc57/src/java/org/apache/cassandra/locator/TokenMetadata.java#L1271])
> * Insufficient testing (due to complexity and poor modularity)
> These limitations make it difficult to reliably fix bugs such as node replacement with the same IP address (CASSANDRA-12344), to add improvements such as safe ring membership changes and networking via identity instead of IP (CASSANDRA-15823), or to add new features such as dynamic virtual nodes.
> This ticket aims to refactor the ring management sub-module (namely TokenMetadata and related classes) to address most of its current limitations and so support further improvements and new features.
> Some of the requirements of the proposed refactoring are:
> # Make node-local ring representation fully immutable and snapshottable.
> # Add content-based versioning to uniquely identify a ring snapshot throughout the cluster.
> # Make token ring management vnode-centric to support membership operations on individual tokens and simplify token assignment calculations.
> # Primarily identify ring endpoints by node ID to decouple a node’s identity from its IP address.
> # Add a local publish/subscribe mechanism for ring change notifications, so other modules can subscribe to it and receive the newest snapshot of the ring after membership changes.
> # Add a testing framework to verify correctness of ring membership operations.
> # Ensure the refactored sub-module does not change current behavior via comprehensive testing.
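
To make requirements 1, 2 and 5 a bit more concrete, here is a minimal sketch of one possible shape. Everything in it is an assumption for illustration: the names VersionedRing and RingPublisher are hypothetical, the ring is reduced to a token to node ID map, and a SHA-256 digest over the sorted entries stands in for whatever content-based version the refactoring ends up using.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical immutable ring view (token -> node ID) with a content-based version. */
final class VersionedRing {
    private final SortedMap<Long, String> owners;
    private final String version;

    VersionedRing(Map<Long, String> owners) {
        // Defensive sorted copy: insertion order of the input must not affect the version.
        SortedMap<Long, String> copy = new TreeMap<>(owners);
        this.owners = Collections.unmodifiableSortedMap(copy);
        this.version = digest(this.owners);
    }

    SortedMap<Long, String> owners() { return owners; }

    /** Hex SHA-256 of the ring contents; equal contents give equal versions on any node. */
    String version() { return version; }

    private static String digest(SortedMap<Long, String> owners) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (Map.Entry<Long, String> e : owners.entrySet()) {
                // Simplified encoding (assumption): node IDs never contain '=' or newlines.
                String line = e.getKey() + "=" + e.getValue() + "\n";
                md.update(line.getBytes(StandardCharsets.UTF_8));
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex); // SHA-256 is mandatory on every Java platform
        }
    }
}

/** Hypothetical local publish/subscribe hub for ring changes. */
final class RingPublisher {
    private final List<Consumer<VersionedRing>> subscribers = new CopyOnWriteArrayList<>();
    private volatile VersionedRing current = new VersionedRing(new TreeMap<>());

    void subscribe(Consumer<VersionedRing> subscriber) { subscribers.add(subscriber); }

    VersionedRing current() { return current; }

    /** Publishes the snapshot unless its content is identical to the current one. */
    synchronized void publish(VersionedRing next) {
        if (next.version().equals(current.version())) return;
        current = next;
        // Sketch only: notifying inline under the lock keeps ordering simple but blocks publishers.
        for (Consumer<VersionedRing> s : subscribers) s.accept(next);
    }
}
```

Because the version is derived from content rather than from a local counter, two nodes can compare versions to check whether they hold the same ring view, and subscribers are not notified about updates that change nothing.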


