Posted to user@cassandra.apache.org by Carlos Pérez Miguel <cp...@gmail.com> on 2012/01/13 18:40:27 UTC

About initial token, autobootstraping and load balance

Hello,

I have a question about how the initial token is determined. Cassandra's
documentation says that it is better to configure the initial token
manually for each node in the system, but it also says that if the
initial token is not defined and autobootstrap is true, new nodes
choose an initial token that improves the load balance of the
cluster. But what happens if no initial token is set and
autobootstrap is not enabled? How does each node select its initial
token to balance the ring?

I ask because I am running tests on a 20-node Cassandra cluster
with Cassandra 0.7.9. No node has an initial token set, and
autobootstrap is not enabled. I restart the cluster for each test I run,
and in the end the cluster is always well balanced.

Thanks

Carlos Pérez Miguel

Re: About initial token, autobootstraping and load balance

Posted by Carlos Pérez Miguel <cp...@gmail.com>.
Sorry, my English is not very good on Sundays. By "partage" I meant "to
share", and by "greate" I meant "great".

Anyway, thanks everybody for your answers.

Carlos Pérez Miguel




Re: About initial token, autobootstraping and load balance

Posted by Віталій Тимчишин <ti...@gmail.com>.
Yep, I think I can. Here you are: https://github.com/tivv/cassandra-balancer




-- 
Best regards,
 Vitalii Tymchyshyn

Re: About initial token, autobootstraping and load balance

Posted by Carlos Pérez Miguel <cp...@gmail.com>.
If you can partage it would be greate

Carlos Pérez Miguel




Re: About initial token, autobootstraping and load balance

Posted by Віталій Тимчишин <ti...@gmail.com>.
Yep. I wrote a Groovy script this Friday to perform autobalancing :) I
am going to add it to my Jenkins soon.



-- 
Best regards,
 Vitalii Tymchyshyn

Re: About initial token, autobootstraping and load balance

Posted by Maxim Potekhin <po...@bnl.gov>.
I see. Sure, that's a bit more complicated and you'd have to move tokens 
after adding a machine.

Maxim




Re: About initial token, autobootstraping and load balance

Posted by Віталій Тимчишин <ti...@gmail.com>.
There's nothing wrong with it for 3 nodes. It's a problem for a growing
cluster of 20+ nodes.
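
A rough way to see why this gets painful as the cluster grows is to count
how many existing nodes would have to change token to keep the ring
perfectly even after an expansion. The Python sketch below is only an
illustration under stated assumptions: the token space is taken to be the
RandomPartitioner range of 2**127 values, the nodes are assumed to sit at
evenly spaced tokens before and after the change, and the function names
balanced_tokens and nodes_to_move are made up for this example.

```python
RING = 2 ** 127  # assumed size of the RandomPartitioner token space


def balanced_tokens(node_count):
    """Evenly spaced tokens, with the first node at 0."""
    return [i * RING // node_count for i in range(node_count)]


def nodes_to_move(old_count, new_count):
    """Count existing nodes whose token is absent from the new even layout."""
    wanted = set(balanced_tokens(new_count))
    return sum(1 for tok in balanced_tokens(old_count) if tok not in wanted)


for old, new in ((3, 4), (20, 21), (20, 40)):
    moves = nodes_to_move(old, new)
    print(f"{old} -> {new} nodes: {moves} existing nodes must move")
```

Under these assumptions, adding a single node to a 20-node ring leaves only
the node at token 0 in place, while doubling the cluster lets every existing
node keep its token. Each node that has to change would need its own token
move (for example with nodetool move), which is the manual work being
discussed here.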

-- 
Best regards,
 Vitalii Tymchyshyn

Re: About initial token, autobootstraping and load balance

Posted by Maxim Potekhin <po...@bnl.gov>.
I'm just wondering -- what's wrong with manual specification of tokens? 
I'm so glad I did it and have not had problems with balancing and all.

Before that I was indeed stuck with a 25/25/50 setup in a 3-machine
cluster, and I had to move tokens to make it 33/33/33. I screwed up a
little in that the first token did not start at 0, which is not a good idea.

Maxim



Re: About initial token, autobootstraping and load balance

Posted by Віталій Тимчишин <ti...@gmail.com>.
Actually, it seems to me that "largest" means the node with the most data,
not the widest range, which with replication involved makes the feature
useless.



-- 
Best regards,
 Vitalii Tymchyshyn

Re: About initial token, autobootstraping and load balance

Posted by Carlos Pérez Miguel <cp...@gmail.com>.
Thanks, David, for your explanation. What happens if autobootstrap is
false in the configuration file? The nodes seem to choose the correct
tokens and balance the cluster well. In this case, how does each node
select its initial token?

Carlos Pérez Miguel




Re: About initial token, autobootstraping and load balance

Posted by David McNelis <dm...@gmail.com>.
The documentation for that section needs to be updated...

What happens is that if you just autobootstrap without setting a token it
will by default bisect the range of the largest node.

So if you go through several iterations of adding nodes, then this is what
you would see:

Gen 1:
Node A:  100% of tokens, token range 1-10 (for example)

Gen 2:
Node A: 50% of tokens  (1-5)
Node B: 50% of tokens (6-10)

Gen 3:
Node A: 25% of tokens (1-2.5)
Node B: 50% of tokens (6-10)
Node C: 25% of tokens (2.6-5)

In reality, what you'd want in gen 3 is every node to be 33%, but it would
not be the case without setting the tokens to begin with.
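
The generations above can be reproduced with a small simulation. This is
only a sketch in Python under stated assumptions: the ring is taken to be
the RandomPartitioner token space of 2**127 values, "largest" is read as
the widest token range (as in the example above), and the helper names
range_widths, bisect_largest and simulate are made up for illustration.
It is not Cassandra's actual token selection code.

```python
from fractions import Fraction

RING = 2 ** 127  # assumed size of the RandomPartitioner token space


def range_widths(tokens):
    """Return the sorted tokens and the width of the range each one owns.

    A node owns everything from the previous token (exclusive) up to its
    own token (inclusive), wrapping around the ring.
    """
    ordered = sorted(tokens)
    # ordered[i - 1] wraps to the last token when i == 0;
    # a lone node has width 0 by this formula, so give it the whole ring.
    widths = [((tok - ordered[i - 1]) % RING) or RING
              for i, tok in enumerate(ordered)]
    return ordered, widths


def bisect_largest(tokens):
    """Token for a new node: the midpoint of the widest existing range."""
    ordered, widths = range_widths(tokens)
    widest = max(range(len(ordered)), key=widths.__getitem__)
    return (ordered[widest] - widths[widest] // 2) % RING


def simulate(node_count):
    """Add nodes one at a time and return each node's share of the ring."""
    tokens = [0]
    while len(tokens) < node_count:
        tokens.append(bisect_largest(tokens))
    _, widths = range_widths(tokens)
    return [Fraction(width, RING) for width in widths]


for n in (1, 2, 3, 4):
    print(n, [f"{float(share):.0%}" for share in simulate(n)])
```

With three nodes, one node is left with half of the ring and the other two
with a quarter each, which is the Gen 3 situation. The shares only even out
again once the node count reaches the next power of two.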

You'll notice that there are a couple of scripts available to generate a
list of initial tokens for your particular cluster size. Then, every time
you add a node, you'll need to update all the nodes with new tokens in
order to load balance properly.
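
For reference, such a script can be very short. The following Python
sketch is not one of the scripts mentioned above; it assumes the
RandomPartitioner, whose token space is taken to be 2**127 values, and
simply spaces the tokens evenly starting from 0. The function name
balanced_tokens is made up for this example.

```python
RING = 2 ** 127  # assumed size of the RandomPartitioner token space


def balanced_tokens(node_count):
    """Evenly spaced tokens for node_count nodes, the first one at 0."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * RING // node_count for i in range(node_count)]


if __name__ == "__main__":
    # One line per node; each value would go into that node's initial_token.
    for index, token in enumerate(balanced_tokens(3)):
        print(f"node {index}: initial_token: {token}")
```

Rerunning it with the new cluster size after adding a node gives the
tokens each node would have to be moved to.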

Does this make sense?

Other folks, am I explaining this correctly?

David
