You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Benedict (JIRA)" <ji...@apache.org> on 2014/04/14 14:23:14 UTC

[jira] [Comment Edited] (CASSANDRA-7032) Improve vnode allocation

    [ https://issues.apache.org/jira/browse/CASSANDRA-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968285#comment-13968285 ] 

Benedict edited comment on CASSANDRA-7032 at 4/14/14 12:21 PM:
---------------------------------------------------------------

Attached a very simple algorithm implementation, and some debug code for viewing the resulting cluster token range distribution for various sizes of cluster.

The approach is simple: rank the nodes descending in order of ownership size, and in each node rank each token range in descending order of size.

When a node joins the cluster, each token range for the new node is selected as a suffix of the largest range from the largest node, adopting a certain proportion (the adoption ratio - somewhere between 0.5 and ~0.6; can be configured as constant or random), then the cluster is resorted and the process is repeated until all tokens have been allocated.

An adoption ratio of 0.5 ultimately results in a perfectly allocated cluster when the cluster size is a power of 2, with 1/(clustersize/2) variance at all other times. Selecting a value larger than 0.5 causes the variance at most cluster sizes to diminish but results in a never-quite-perfect distribution. But this is probably preferable.


was (Author: benedict):
Attached a very simple algorithm implementation, and some debug code for viewing the resulting cluster token range distribution for various sizes of cluster.

> Improve vnode allocation
> ------------------------
>
>                 Key: CASSANDRA-7032
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7032
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>              Labels: performance, vnodes
>             Fix For: 3.0
>
>         Attachments: TestVNodeAllocation.java
>
>
> It's been known for a little while that random vnode allocation causes hotspots of ownership. It should be possible to improve dramatically on this with deterministic allocation. I have quickly thrown together a simple greedy algorithm that allocates vnodes efficiently, and will repair hotspots in a randomly allocated cluster gradually as more nodes are added, and also ensures that token ranges are fairly evenly spread between nodes (somewhat tunably so). The allocation still permits slight discrepancies in ownership, but it is bound by the inverse of the size of the cluster (as opposed to random allocation, which strangely gets worse as the cluster size increases). I'm sure there is a decent dynamic programming solution to this that would be even better.
> If on joining the ring a new node were to CAS a shared table where a canonical allocation of token ranges lives after running this (or a similar) algorithm, we could then get guaranteed bounds on the ownership distribution in a cluster. This will also help for CASSANDRA-6696.



--
This message was sent by Atlassian JIRA
(v6.2#6252)