Posted to user@cassandra.apache.org by Chris Shorrock <ch...@shorrockin.com> on 2010/05/18 01:06:25 UTC

unbalanced token assignment with random partitioner

I have a feeling this issue may be more misunderstanding than anything else,
but after searching for an explanation in the wiki and elsewhere my
understanding of token assignments leads me to believe that unbalancing is
bound to occur.

As a relatively simple example, if we take a 2-node Cassandra setup with the
random partitioner (letting Cassandra assign the tokens), we end up with a
ring that looks like:

Address       Status     Load          Range                                      Ring
                                       69518187202527923173412511728767069233
10.10.249.11  Up         1023.44 MB    34433420789685454480210475042362028556     |<--|
10.10.249.12  Up         251.16 MB     69518187202527923173412511728767069233     |-->|


My understanding of how data is distributed is based on the following wiki
statement:

> Each Cassandra server [node] is assigned a unique Token that determines
> what keys it is the first replica for. If you sort all nodes' Tokens, the
> Range of keys each is responsible for is (PreviousToken, MyToken], that is,
> from the previous token (exclusive) to the node's token (inclusive). The
> machine with the lowest Token gets both all keys less than that token, and
> all keys greater than the largest Token; this is called a "wrapping Range."


Applied to the example above, this implies that 10.10.249.11 would serve
keys 0 to 3.4E37 and 6.9E37 to 1.7E38 (the "wrapping Range"), while
10.10.249.12 serves only 3.4E37 to 6.9E37. It therefore seems that
10.10.249.11 would end up serving a disproportionate share of the data.
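
To make the arithmetic concrete, here is a minimal sketch that applies the
(PreviousToken, MyToken] rule to the two tokens from the ring output. It
assumes the RandomPartitioner token space runs from 0 to 2**127 (about
1.7E38); the ownership() helper is a made-up name for illustration, not a
Cassandra API.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space, about 1.7E38


def ownership(tokens):
    """Return each node's share of the ring under (PreviousToken, MyToken]."""
    ordered = sorted(tokens.items(), key=lambda item: item[1])
    shares = {}
    for i, (node, token) in enumerate(ordered):
        # For i == 0 the index -1 picks the largest token: the wrapping range.
        prev = ordered[i - 1][1]
        # A span of 0 only happens with a single node, which owns everything.
        span = (token - prev) % RING_SIZE or RING_SIZE
        shares[node] = span / RING_SIZE
    return shares


tokens = {
    "10.10.249.11": 34433420789685454480210475042362028556,
    "10.10.249.12": 69518187202527923173412511728767069233,
}
shares = ownership(tokens)
for node, share in shares.items():
    print(f"{node}: {share:.1%}")
```

Under these assumptions 10.10.249.11 owns roughly four fifths of the ring
and 10.10.249.12 roughly one fifth, which is close to the ratio of the
reported loads (1023.44 MB versus 251.16 MB).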

This issue would of course be mitigated as the cluster grows, but it seems
like the automatic initial selection of tokens isn't optimal.

Is this a configuration issue, a misunderstanding, a new version of math
I've developed, or?

Re: unbalanced token assignment with random partitioner

Posted by Jonathan Ellis <jb...@gmail.com>.
Yes, if you add nodes when the existing one doesn't have enough data
to guess a good token from the keys it has, it uses a random token.
Created https://issues.apache.org/jira/browse/CASSANDRA-1112 to use
midpoint instead.
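
As a rough illustration of what "use midpoint" could mean (this is a sketch
under stated assumptions, not the actual CASSANDRA-1112 change), the
hypothetical helper below picks the midpoint of the widest range on a
0 to 2**127 ring. One token from the ring output in the original message is
used as the lone existing node purely for illustration; which node actually
joined first is not stated in the thread.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def midpoint_token(existing_tokens):
    """Return the midpoint of the widest range on the ring (hypothetical helper)."""
    ordered = sorted(existing_tokens)
    best_span, best_start = 0, None
    for i, token in enumerate(ordered):
        prev = ordered[i - 1]  # index -1 wraps around to the largest token
        # With a single node the span is the whole ring, not zero.
        span = (token - prev) % RING_SIZE or RING_SIZE
        if span > best_span:
            best_span, best_start = span, prev
    return (best_start + best_span // 2) % RING_SIZE


# Illustrative only: treat this token as the single existing node.
existing = [69518187202527923173412511728767069233]
new_token = midpoint_token(existing)
print(new_token)
```

With one existing node this places the new token exactly half a ring away,
so each of the two nodes would own about 50% of the key space.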




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com