Posted to user@cassandra.apache.org by Luca Rondanini <lu...@gmail.com> on 2022/06/15 07:08:04 UTC

more nodes than vnodes

Hi all,

I'm just trying to better understand how Cassandra works.

My understanding is that, once set, the number of vnodes in a cluster does
not change. The partitioner allocates vnodes to nodes, ensuring that
replicas of the same data are not stored on the same node.

But what happens if there are more nodes than vnodes? Say I set num_tokens
to 3 and I have 5 servers. Does the partitioner add vnodes and move data
around? That seems like an extremely expensive operation. I'm sure I'm
missing something, I'm just not quite sure what! :)

Thanks,
Luca

Re: more nodes than vnodes

Posted by Luca Rondanini <lu...@gmail.com>.
Awesome, thank you so much! I completely missed the part "the token range
that it hits will be split", now everything makes sense!

Again, thanks a lot for your help!

Luca



Re: more nodes than vnodes

Posted by Hannu Kröger <hk...@gmail.com>.
Adding a token (which in essence is a vnode) means that the token range that it hits will be split into two, and the part of that range which gets a new owner will be replicated to the new owner node. If there are a lot of tokens (= vnodes) in the cluster, adding some number of vnodes (e.g. num_tokens=16) is going to affect that many (e.g. 16) existing ranges, but with a lot of tokens each range is relatively small and the ranges are distributed across the cluster.
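
A minimal sketch of the split described above, not Cassandra's actual implementation. It assumes a toy ring stored as a sorted list of (token, node) pairs, where a key belongs to the node holding the first token at or after the key's token. The names owner_of and add_token, the node names and the token values are all hypothetical, and replication is ignored.

```python
import bisect

def owner_of(ring, key_token):
    """Owner of key_token: the node holding the first token >= key_token (wrapping)."""
    tokens = [t for t, _ in ring]
    i = bisect.bisect_left(tokens, key_token) % len(ring)
    return ring[i][1]

def add_token(ring, token, node):
    """Insert a token for `node`; return (range_start, range_end, previous_owner)
    describing the one existing range that gets split."""
    tokens = [t for t, _ in ring]
    i = bisect.bisect_left(tokens, token)
    range_end, previous_owner = ring[i % len(ring)]
    range_start = ring[i - 1][0]  # i == 0 wraps around to the last token
    ring.insert(i, (token, node))
    return range_start, range_end, previous_owner

# Hypothetical ring: A owns (200, 0], B owns (0, 100], C owns (100, 200].
ring = [(0, "A"), (100, "B"), (200, "C")]

# Adding token 150 for node D splits only C's range:
# D takes over (100, 150], C keeps (150, 200], A and B are untouched.
split = add_token(ring, 150, "D")
```

Only the range returned in split changes hands; every other range keeps its owner, which is why a single new token does not reshuffle the whole ring.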


A very naive example:
Cluster has 100 nodes and 100 GB of data with replication factor 3 => 300 GB of data altogether. Each node will have ~3 GB of data. Let’s say num_tokens is 256. In the cluster there would be 256*100 = 25600 tokens altogether.
You add one more node and, if we imagine that tokens are perfectly distributed, each node will then contain ~2.97 GB of data (300 GB / 101 nodes).

When that new node is joining, its 256 tokens are (hopefully) distributed evenly, and each of the 100 existing nodes will replicate ~0.03 GB of data to the new node so that it eventually has that 2.97 GB of data. The cluster would have 25856 tokens after the scale-out operation, and only 256 existing token ranges would be changed when the new node joins, not all 25600.

So you see that each existing node only has to replicate about 30 MB to the new node. Not very expensive, right?
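
The arithmetic of this naive example can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch under the same idealised assumption (perfectly even token distribution); the function name scale_out_estimate and the dictionary keys are made up for illustration.

```python
def scale_out_estimate(nodes, data_gb, rf, num_tokens):
    """Estimate the effect of adding one node to a perfectly balanced cluster."""
    total_gb = data_gb * rf                   # all replicas together
    per_node_before = total_gb / nodes
    per_node_after = total_gb / (nodes + 1)
    # The new node's share is collected evenly from the existing nodes.
    streamed_per_node_gb = per_node_after / nodes
    return {
        "per_node_before_gb": per_node_before,
        "per_node_after_gb": per_node_after,
        "streamed_per_existing_node_mb": streamed_per_node_gb * 1000,
        "tokens_before": nodes * num_tokens,
        "tokens_after": (nodes + 1) * num_tokens,
    }

est = scale_out_estimate(nodes=100, data_gb=100, rf=3, num_tokens=256)
for name, value in est.items():
    print(f"{name}: {value:.2f}")
```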

In real life, it’s not so precise and all, but the basic idea is the same.

Cheers,
Hannu



Re: more nodes than vnodes

Posted by Luca Rondanini <lu...@gmail.com>.
Thanks a lot Hannu,

really helpful! But isn't that crazy expensive? Adding a vnode means that
every vnode in the cluster will have a different range of tokens, which
means a lot of data will need to be moved around.

Thanks again,
Luca




Re: more nodes than vnodes

Posted by Hannu Kröger <hk...@gmail.com>.
When a node joins a cluster, it gets (semi-)random tokens based on the num_tokens value.

The total number of vnodes is not fixed. I don’t remember off the top of my head if num_tokens can be different on each node, but whenever you add a node, new vnodes get “created”. Existing token ranges are split, some ranges are allocated to the new node, and data is replicated to the joining node. So if you have num_tokens set to a higher value like 16 or so, adding or removing a single node in a cluster is a standard operation, and although it causes some load on the cluster, that load should be somewhat evenly distributed among the other nodes. If you have just a single token per node, then scaling up or down has somewhat different effects due to balancing issues etc. So there is a reason why the default num_tokens is currently 16.
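
To illustrate why several tokens per node spread the load of a join, here is a small simulation. It is a sketch under simplifying assumptions, not how Cassandra allocates tokens: tokens are drawn uniformly at random from a made-up token space (TOKEN_SPACE), replication is ignored, and the helper name join is hypothetical.

```python
import bisect
import random

TOKEN_SPACE = 2**32  # illustrative token space, not Cassandra's real one

def join(ring, node, num_tokens, rng):
    """Give `node` num_tokens random tokens on the ring (a sorted list of
    (token, node) pairs). Return the set of existing nodes whose ranges
    were split, i.e. the nodes the newcomer takes data from."""
    sources = set()
    for _ in range(num_tokens):
        token = rng.randrange(TOKEN_SPACE)
        tokens = [t for t, _ in ring]
        i = bisect.bisect_left(tokens, token)
        if ring:
            owner = ring[i % len(ring)][1]  # current owner of the range being split
            if owner != node:
                sources.add(owner)
        ring.insert(i, (token, node))
    return sources

rng = random.Random(42)

# Five nodes with 16 tokens each, then a sixth node joins.
ring = []
for n in range(5):
    join(ring, f"node{n}", 16, rng)
vnodes_before = len(ring)
sources = join(ring, "node5", 16, rng)
print(vnodes_before, "->", len(ring), "vnodes; data taken from", sorted(sources))

# Same cluster size with a single token per node.
single = []
for n in range(5):
    join(single, f"node{n}", 1, rng)
single_sources = join(single, "node5", 1, rng)
print("single token: data taken from", sorted(single_sources))
```

With one token per node the newcomer splits exactly one range, so in this simplified model a single existing node supplies all of its data; with 16 tokens the splits usually land on ranges owned by several different nodes.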

Cheers,
Hannu



Re: more nodes than vnodes

Posted by Luca Rondanini <lu...@gmail.com>.
OK, that makes sense, but does the partitioner add vnodes? Is the number of
vnodes fixed in a cluster?


Re: more nodes than vnodes

Posted by Hannu Kröger <hk...@gmail.com>.
Hey,

num_tokens is the number of tokens per node.

So in your case (num_tokens = 3 on 5 servers) you would have 3 * 5 = 15 vnodes altogether.

Cheers,
Hannu
