Posted to user@cassandra.apache.org by Baron Schwartz <ba...@xaprb.com> on 2013/02/05 17:43:50 UTC

Clarification on num_tokens setting

As I understand the num_tokens setting, it makes Cassandra do the following
pseudocode when a new node is added:

for 1...num_tokens do
   my_token = rand(0, 2^128-1)
   next_token = min(tokens in cluster where token > my_token)
   my_range = (my_token, next_token - 1)
done

Now the new node owns num_tokens chunks of keys that previously belonged to
other nodes.

My point is, with 1 node in the cluster, the ring is divided into
num_tokens ranges. With N nodes, the ring is divided into N*num_tokens.
Correct? The docs do not make this clear for me.

And another point: the tokens are randomly chosen, so the ranges of keys
are not uniform, although with enough nodes in the cluster there probably
won't be any really large ranges. Correct?
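
The two points above can be checked with a small simulation. This is only a sketch in Python, not Cassandra's implementation: the ring size is taken from the pseudocode above, and add_node, range_sizes and simulate are hypothetical helper names. It models the ring as a set of tokens and measures the ranges between adjacent tokens.

```python
import random

# Ring size as written in the pseudocode above; an assumption for this sketch,
# not necessarily the token space of any particular partitioner.
RING_SIZE = 2**128


def add_node(ring, node, num_tokens, rng):
    """Give `node` num_tokens random tokens; `ring` maps token -> owner."""
    added = 0
    while added < num_tokens:
        token = rng.randrange(RING_SIZE)
        if token not in ring:  # collisions are very unlikely, but skip them
            ring[token] = node
            added += 1


def range_sizes(ring):
    """Sizes of the ranges between adjacent tokens, wrapping around the ring."""
    tokens = sorted(ring)
    # With a single token the difference is 0, meaning the whole ring.
    return [(tokens[i] - tokens[i - 1]) % RING_SIZE or RING_SIZE
            for i in range(len(tokens))]


def simulate(num_nodes, num_tokens, seed=0):
    """Build a ring by adding num_nodes nodes, each with num_tokens tokens."""
    rng = random.Random(seed)
    ring = {}
    for n in range(num_nodes):
        add_node(ring, f"node{n}", num_tokens, rng)
    return ring


if __name__ == "__main__":
    ring = simulate(num_nodes=4, num_tokens=8)
    sizes = range_sizes(ring)
    print("ranges:", len(sizes))  # N * num_tokens ranges
    print("largest / smallest range:", max(sizes) / min(sizes))
```

Running it with more nodes or more tokens per node should show the largest range shrinking as a fraction of the ring, while the ranges stay uneven.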

Re: Clarification on num_tokens setting

Posted by Eric Evans <ee...@acunu.com>.

On Tue, Feb 5, 2013 at 4:19 PM, aaron morton <aa...@thelastpickle.com> wrote:
>> There is always num_tokens tokens in the ring.
>>
>>
> I got this wrong.
> Each node *does* have num_tokens tokens.
>
>>  With N nodes, the ring is divided into N*num_tokens. Correct?
>
> Yes
>
> In other words it is cluster wide parameter. Correct?
>
> Yes.

Actually, num_tokens is a per-node setting.  It might make sense, for
example, to assign different numbers of tokens in a cluster with
heterogeneous hardware, but I would urge caution, as there is currently
no post facto way to increase or decrease a node's token count.
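
As a rough illustration of why differing token counts matter on heterogeneous hardware, the sketch below estimates each node's share of the ring when nodes hold different numbers of random tokens. It is a simplified model in Python, not Cassandra code: the ring size of 2^128 comes from the original question, replication is ignored, and the function name ownership and the node names are hypothetical.

```python
import random
from collections import Counter

# Ring size borrowed from the original question; an assumption for this sketch.
RING_SIZE = 2**128


def ownership(tokens_per_node, seed=0):
    """Return each node's fraction of the ring, given {node: num_tokens}."""
    rng = random.Random(seed)
    ring = {}  # token -> owning node
    for node, num_tokens in tokens_per_node.items():
        added = 0
        while added < num_tokens:
            token = rng.randrange(RING_SIZE)
            if token not in ring:
                ring[token] = node
                added += 1

    tokens = sorted(ring)
    owned = Counter()
    for i, token in enumerate(tokens):
        # A token owns the range from the previous token (exclusive) up to
        # itself (inclusive); index -1 wraps around the ring.
        size = (token - tokens[i - 1]) % RING_SIZE or RING_SIZE
        owned[ring[token]] += size
    return {node: owned[node] / RING_SIZE for node in tokens_per_node}


if __name__ == "__main__":
    # Hypothetical cluster: one node with twice the tokens of the other.
    for node, share in ownership({"small": 128, "big": 256}).items():
        print(f"{node}: {share:.1%} of the ring")
```

In this model a node's share of the ring tracks its share of the tokens, which is the reason a larger machine might be given a larger num_tokens.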

--
Eric Evans
Acunu | http://www.acunu.com | @acunu

Re: Clarification on num_tokens setting

Posted by aaron morton <aa...@thelastpickle.com>.

> There is always num_tokens tokens in the ring.


I got this wrong. 
Each node *does* have num_tokens tokens. 

>>  With N nodes, the ring is divided into N*num_tokens. Correct? 
Yes

> In other words it is cluster wide parameter. Correct?
Yes.

Cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 6/02/2013, at 10:36 AM, Andrey Ilinykh <ai...@gmail.com> wrote:

> 
> 
> 
> On Tue, Feb 5, 2013 at 12:42 PM, aaron morton <aa...@thelastpickle.com> wrote:
>>  With N nodes, the ring is divided into N*num_tokens. Correct? 
> 
> There is always num_tokens tokens in the ring.
> Each node has (num_tokens / N) * RF ranges on it. 
> 
> That means every node should have the same num_token parameter? In other words it is cluster wide parameter. Correct?
> 
> Thank you,
>   Andrey

Re: Clarification on num_tokens setting

Posted by Andrey Ilinykh <ai...@gmail.com>.

On Tue, Feb 5, 2013 at 12:42 PM, aaron morton <aa...@thelastpickle.com> wrote:

>  With N nodes, the ring is divided into N*num_tokens. Correct?
>
> There is always num_tokens tokens in the ring.
> Each node has (num_tokens / N) * RF ranges on it.
>
That means every node should have the same num_tokens parameter? In other
words, it is a cluster-wide parameter. Correct?

Thank you,
  Andrey

Re: Clarification on num_tokens setting

Posted by aaron morton <aa...@thelastpickle.com>.

>  With N nodes, the ring is divided into N*num_tokens. Correct? 
There is always num_tokens tokens in the ring.
Each node has (num_tokens / N) * RF ranges on it. 

> so the ranges of keys are not uniform, although with enough nodes in the cluster there probably won't be any really large ranges. Correct?
Even without vnodes there is no guarantee that nodes have contiguous key ranges.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 6/02/2013, at 5:43 AM, Baron Schwartz <ba...@xaprb.com> wrote:

> As I understand the num_tokens setting, it makes Cassandra do the following pseudocode when a new node is added:
> 
> for 1...num_tokens do
>    my_token = rand(0, 2^128-1)
>    next_token = min(tokens in cluster where token > my_token)
>    my_range = (my_token, next_token - 1)
> done
> 
> Now the new node owns num_tokens chunks of keys that previously belonged to other nodes.
> 
> My point is, with 1 node in the cluster, the ring is divided into num_tokens ranges. With N nodes, the ring is divided into N*num_tokens. Correct? The docs do not make this clear for me.
> 
> And another point: the tokens are randomly chosen, so the ranges of keys are not uniform, although with enough nodes in the cluster there probably won't be any really large ranges. Correct?