
Posted to user@cassandra.apache.org by "Desimpel, Ignace" <Ig...@nuance.com> on 2014/10/01 17:47:04 UTC

CASSANDRA-7649 : upgrade existing db to 2.0.10

I deploy/distribute the Cassandra database as an embedded service, which lets me create a basic cassandra.yaml file based on the global cluster of machines (seeds, non-seeds, ports, disks, etc.). That allows me to configure and upgrade my own software and the Cassandra software using the same cassandra.yaml. That yaml file has no tokens specified in it, yet I still have a vnode cluster (thanks Cassandra).

In previous versions that was OK, since the Cassandra code simply accepted the tokens it had saved in its own database, disregarding any changes made in the yaml file (there was no test like bootstrapTokens.size() != DatabaseDescriptor.getNumTokens()). I guess there was some logic to that: at that point the system is not bootstrapping and thus should/could use the known token configuration without using the yaml token parameter.

Also, wasn't this small code change of CASSANDRA-7649 inspired by balancing problems when going to vnodes (CASSANDRA-7601) with a random partitioner? In my case I'm using a ByteOrdered partitioner, which forces me to balance/move/add nodes/tokens myself.
And although the description says it was meant to avoid 'changing the number of tokens', that test does a little more (from my point of view).

Well, in short: I would be in favor of removing that test, while clearly leaving a message that the "saved tokens" are used, not the yaml-configured tokens.

Regards,
Ignace
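
Editor's sketch: the check discussed in the message above, and the alternative it proposes, could look roughly like the following. This is a minimal, self-contained illustration and not Cassandra's actual code: the class SavedTokenCheck, both method names, and the local ConfigurationException stand-in (unchecked here for brevity) are hypothetical. Only the compared quantities and the exception message come from the thread.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not taken from the Cassandra code base.
class SavedTokenCheck {

    // Local stand-in for Cassandra's ConfigurationException (assumption:
    // made unchecked here only to keep the sketch short).
    static class ConfigurationException extends RuntimeException {
        ConfigurationException(String message) {
            super(message);
        }
    }

    // Strict variant, as described in the thread: fail when the number of
    // saved tokens differs from the token count configured in the yaml file.
    static List<String> strictTokens(List<String> savedTokens, int configuredNumTokens) {
        if (savedTokens.size() != configuredNumTokens) {
            throw new ConfigurationException("Cannot change the number of tokens from "
                    + savedTokens.size() + " to " + configuredNumTokens);
        }
        return savedTokens;
    }

    // Lenient variant, as proposed in the thread: keep using the saved tokens
    // and only report that the yaml setting is being ignored.
    static List<String> lenientTokens(List<String> savedTokens, int configuredNumTokens) {
        if (savedTokens.size() != configuredNumTokens) {
            System.err.println("Using " + savedTokens.size()
                    + " saved tokens; ignoring configured token count " + configuredNumTokens);
        }
        return savedTokens;
    }

    public static void main(String[] args) {
        List<String> saved = Arrays.asList("00", "40", "80"); // made-up tokens
        System.out.println(lenientTokens(saved, 256));        // warns, returns saved tokens
        try {
            strictTokens(saved, 256);
        } catch (ConfigurationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The two variants differ only in what happens on a mismatch: the strict one refuses to start, the lenient one starts with the saved tokens and says so.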

RE: CASSANDRA-7649 : upgrade existing db to 2.0.10

Posted by "Desimpel, Ignace" <Ig...@nuance.com>.

Below I added some more comments as info; it is not really my goal to push this item.
Thanks Robert!

Ignace

From: Robert Coli [mailto:rcoli@eventbrite.com]
Sent: Thursday, 2 October 2014 09:57
To: user@cassandra.apache.org
Subject: Re: CASSANDRA-7649 : upgrade existing db to 2.0.10

On Wed, Oct 1, 2014 at 8:47 AM, Desimpel, Ignace <Ig...@nuance.com> wrote:
I deploy/distribute the Cassandra database as an embedded service, which lets me create a basic cassandra.yaml file based on the global cluster of machines (seeds, non-seeds, ports, disks, etc.). That allows me to configure and upgrade my own software and the Cassandra software using the same cassandra.yaml. That yaml file has no tokens specified in it, yet I still have a vnode cluster (thanks Cassandra).

IMO, this is the error. Why do you not want to specify your tokens?
>> Only my own practical reasons, related to the way we install/deploy/upgrade. Other than that I do agree.

In previous versions that was OK, since the Cassandra code simply accepted the tokens it had saved in its own database, disregarding any changes made in the yaml file (there was no test like bootstrapTokens.size() != DatabaseDescriptor.getNumTokens()). I guess there was some logic to that: at that point the system is not bootstrapping and thus should/could use the known token configuration without using the yaml token parameter.

I'm not really sure I understand the scenario you are describing. In general if a node has bootstrapped, and you have the system keyspace for that node, it tends to use the stored tokens. Is there a specific exception you're getting?
>> Yes. Before, the Cassandra code used the saved tokens without testing whether the number of saved tokens equals the number of tokens in the yaml file. Now there is the extra test added by CASSANDRA-7649, which throws this exception: throw new ConfigurationException("Cannot change the number of tokens from " + bootstrapTokens.size() + " to " + DatabaseDescriptor.getNumTokens()); The code is in StorageService.class, line 824.
That exception is the reason why I started this email, but I do understand why that test is there (I don’t want to make a big deal of it).

Also, wasn’t this small code change of CASSANDRA-7649 inspired by balancing problems when going to vnodes (CASSANDRA-7601) with a random partitioner? In my case I’m using a ByteOrdered partitioner, which forces me to balance/move/add nodes/tokens myself.

Using BOP is a strong smell of Doing It Wrong. You are probably the only person on Earth using the combination of BOP and Vnodes.
>> I added vnodes only to get the extra ability of a faster rebuild if ever needed. BOP, I know… We did try to solve our problem using the random partitioner, and CQL was not really there yet… but it works quite well thanks to Cassandra and BOP. Damn, I feel lonely now ☺…

And although the description says it was meant to avoid ‘changing the number of tokens’, that test does a little more (from my point of view).

Well, in short: I would be in favor of removing that test, while clearly leaving a message that the “saved tokens” are used, not the yaml-configured tokens.

I am still not sure I completely understand your case, but it seems like you can probably avoid it by simply specifying a comma delimited list of your tokens in initial_token.

Always including your tokens in initial_token is, IMO, a Cassandra Operations best practice. It helps you in various cases and hurts you in almost none. Eventually I will write up a blog post explaining some of these cases.
>> Once again I agree, with the exception that one can still use num_tokens to have the partitioner generate random tokens, without ever setting initial_token (can't one?)

=Rob
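
Editor's sketch: for a deployment that generates cassandra.yaml programmatically, as described in this thread, the comma-delimited initial_token setting suggested above could be produced along these lines. The class and method names are hypothetical and the tokens are made-up placeholders; how a node's current tokens are obtained is outside this sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of Cassandra.
class InitialTokenLine {

    // Joins the node's tokens into a single comma-delimited
    // "initial_token: ..." line for cassandra.yaml.
    static String initialTokenLine(List<String> tokens) {
        return "initial_token: " + String.join(",", tokens);
    }

    public static void main(String[] args) {
        // Made-up example tokens for illustration only.
        List<String> tokens = Arrays.asList("00", "40", "80", "c0");
        System.out.println(initialTokenLine(tokens));
    }
}
```

Writing the same tokens the node already owns keeps the yaml file and the saved state in agreement, which is the point of the suggestion.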

Re: CASSANDRA-7649 : upgrade existing db to 2.0.10

Posted by Robert Coli <rc...@eventbrite.com>.

On Wed, Oct 1, 2014 at 8:47 AM, Desimpel, Ignace <Ignace.Desimpel@nuance.com> wrote:

>  I deploy/distribute the Cassandra database as an embedded service, which
> lets me create a basic cassandra.yaml file based on the global cluster of
> machines (seeds, non-seeds, ports, disks, etc.). That allows me to
> configure and upgrade my own software and the Cassandra software using
> the same cassandra.yaml. That yaml file has no tokens specified in it, yet
> I still have a vnode cluster (thanks Cassandra).
>

IMO, this is the error. Why do you not want to specify your tokens?


>  In previous versions that was OK, since the Cassandra code simply
> accepted the tokens it had saved in its own database, disregarding any
> changes made in the yaml file (there was no test like
> bootstrapTokens.size() != DatabaseDescriptor.getNumTokens()). I guess
> there was some logic to that: at that point the system is not
> bootstrapping and thus should/could use the known token configuration
> without using the yaml token parameter.
>

I'm not really sure I understand the scenario you are describing. In
general if a node has bootstrapped, and you have the system keyspace for
that node, it tends to use the stored tokens. Is there a specific exception
you're getting?


> Also, wasn’t this small code change of CASSANDRA-7649 inspired by
> balancing problems when going to vnodes (CASSANDRA-7601) with a random
> partitioner? In my case I’m using a ByteOrdered partitioner, which forces
> me to balance/move/add nodes/tokens myself.
>

Using BOP is a strong smell of Doing It Wrong. You are probably the only
person on Earth using the combination of BOP and Vnodes.


> And although the description says it was meant to avoid ‘changing the
> number of tokens’, that test does a little more (from my point of view).
>
>
>
> Well, in short: I would be in favor of removing that test, while clearly
> leaving a message that the “saved tokens” are used, not the
> yaml-configured tokens.
>

I am still not sure I completely understand your case, but it seems like
you can probably avoid it by simply specifying a comma delimited list of
your tokens in initial_token.

Always including your tokens in initial_token is, IMO, a Cassandra
Operations best practice. It helps you in various cases and hurts you in
almost none. Eventually I will write up a blog post explaining some of
these cases.

=Rob