Posted to user@cassandra.apache.org by John Buczkowski <jo...@concordusa.com> on 2012/11/30 23:46:55 UTC

Having issues with node token collisions when starting up Cassandra nodes in a cluster on VMWare.

Hi:

Are there any known issues with initial_token collision when adding nodes to a cluster in a VM environment?

I'm working on a 4 node cluster set up on a VM. We're running into issues when we attempt to add nodes to the cluster.

In the cassandra.yaml file, initial_token is left blank.
Since we're running Cassandra > 1.0, auto_bootstrap should be true by default.

It's my understanding that each of the nodes in the cluster should be assigned an initial token at startup.

This is not what we're currently seeing.
We do not want to manually set the value of initial_token for each node (it kind of defeats the goal of being dynamic).
We also have set the partitioner to random:  partitioner: org.apache.cassandra.dht.RandomPartitioner
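For reference, this is a sketch of the relevant cassandra.yaml settings as described above (only the partitioner line is explicitly set; the other two are the defaults being relied on, and auto_bootstrap may not appear in the shipped yaml at all):

```yaml
# cassandra.yaml -- settings relevant to this thread
initial_token:            # left blank, expecting automatic assignment
auto_bootstrap: true      # default for Cassandra > 1.0
partitioner: org.apache.cassandra.dht.RandomPartitioner
```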

I've outlined the steps we follow and results we are seeing below.
Can someone please advise as to what we're missing here?

Here are the detailed steps we are taking:

1) Kill all cassandra instances and delete data & commit log files on each node.

2) Startup Seed Node (S.S.S.S)
---------------------
Starts up fine.

3) Run nodetool -h W.W.W.W  ring and see:
-------------------------------------
Address         DC          Rack        Status State   Load            Effective-Ownership Token
S.S.S.S         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463

4) X.X.X.X Startup
-----------------
 INFO [GossipStage:1] 2012-11-29 21:16:02,194 Gossiper.java (line 850) Node /X.X.X.X is now part of the cluster
 INFO [GossipStage:1] 2012-11-29 21:16:02,194 Gossiper.java (line 816) InetAddress /X.X.X.X is now UP
 INFO [GossipStage:1] 2012-11-29 21:16:02,195 StorageService.java (line 1138) Nodes /X.X.X.X and /Y.Y.Y.Y have the same token 113436792799830839333714191906879955254.  /X.X.X.X is the new owner
 WARN [GossipStage:1] 2012-11-29 21:16:02,195 TokenMetadata.java (line 160) Token 113436792799830839333714191906879955254 changing ownership from /Y.Y.Y.Y to /X.X.X.X

5) Run nodetool -h W.W.W.W  ring and see:
-------------------------------------
Address         DC          Rack        Status State   Load            Effective-Ownership Token
                                                                                           113436792799830839333714191906879955254
S.S.S.S         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463
W.W.W.W         datacenter1 rack1       Up     Normal  123.87 KB       100.00%             113436792799830839333714191906879955254

6) Y.Y.Y.Y Startup
-----------------
 INFO [GossipStage:1] 2012-11-29 21:17:36,458 Gossiper.java (line 850) Node /Y.Y.Y.Y is now part of the cluster
 INFO [GossipStage:1] 2012-11-29 21:17:36,459 Gossiper.java (line 816) InetAddress /Y.Y.Y.Y is now UP
 INFO [GossipStage:1] 2012-11-29 21:17:36,459 StorageService.java (line 1138) Nodes /Y.Y.Y.Y and /X.X.X.X have the same token 113436792799830839333714191906879955254.  /Y.Y.Y.Y is the new owner
 WARN [GossipStage:1] 2012-11-29 21:17:36,459 TokenMetadata.java (line 160) Token 113436792799830839333714191906879955254 changing ownership from /X.X.X.X to /Y.Y.Y.Y

7) Run nodetool -h W.W.W.W  ring and see:
-------------------------------------
Address         DC          Rack        Status State   Load            Effective-Ownership Token
                                                                                           113436792799830839333714191906879955254
S.S.S.S         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463
Y.Y.Y.Y         datacenter1 rack1       Up     Normal  123.87 KB       100.00%             113436792799830839333714191906879955254

8) Z.Z.Z.Z Startup
-----------------
 INFO [GossipStage:1] 2012-11-30 04:52:28,590 Gossiper.java (line 850) Node /Z.Z.Z.Z is now part of the cluster
 INFO [GossipStage:1] 2012-11-30 04:52:28,591 Gossiper.java (line 816) InetAddress /Z.Z.Z.Z is now UP
 INFO [GossipStage:1] 2012-11-30 04:52:28,591 StorageService.java (line 1138) Nodes /Z.Z.Z.Z and /Y.Y.Y.Y have the same token 113436792799830839333714191906879955254.  /Z.Z.Z.Z is the new owner
 WARN [GossipStage:1] 2012-11-30 04:52:28,592 TokenMetadata.java (line 160) Token 113436792799830839333714191906879955254 changing ownership from /Y.Y.Y.Y to /Z.Z.Z.Z

9) Run nodetool -h W.W.W.W  ring and see:
-------------------------------------
Address         DC          Rack        Status State   Load            Effective-Ownership Token
                                                                                           113436792799830839333714191906879955254
W.W.W.W         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463
S.S.S.S         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463
Z.Z.Z.Z         datacenter1 rack1       Up     Normal  123.87 KB       100.00%             113436792799830839333714191906879955254



Thanks in advance.



Re: Having issues with node token collisions when starting up Cassandra nodes in a cluster on VMWare.

Posted by aaron morton <aa...@thelastpickle.com>.
> We do not want to manually set the value for initial_token for each node (kind of defeats the goal of being dynamic..)
You *really* do want to do this. 
Adding without setting a token will result in an unbalanced cluster. 

The 1.1 distro includes a token generator in tools/bin/token-generator. 
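For RandomPartitioner, balanced tokens are just the [0, 2**127) token range divided evenly by the node count. A minimal sketch of the kind of calculation token-generator performs (the helper name is mine, not from the tool):

```python
def generate_tokens(node_count):
    """Evenly spaced initial_token values for RandomPartitioner,
    whose token range is [0, 2**127)."""
    return [i * (2 ** 127) // node_count for i in range(node_count)]

# For the 4-node cluster in this thread, print one initial_token per node:
for i, token in enumerate(generate_tokens(4)):
    print("node %d: initial_token: %d" % (i, token))
```

Each value would then be pasted into that node's cassandra.yaml before first startup.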

> 1) Kill all cassandra instances and delete data & commit log files on each node.
Did you delete the System keyspace data?
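This matters because a node's token is persisted in the system keyspace, which lives under the data directory alongside user keyspaces. A hedged cleanup sketch (the paths and the `wipe_node` helper are illustrative; the real locations come from data_file_directories, commitlog_directory, and saved_caches_directory in cassandra.yaml):

```python
import shutil
from pathlib import Path

# Typical default locations -- adjust to match your cassandra.yaml.
DATA_DIR = Path("/var/lib/cassandra/data")
COMMITLOG_DIR = Path("/var/lib/cassandra/commitlog")
SAVED_CACHES_DIR = Path("/var/lib/cassandra/saved_caches")

def wipe_node(data_dir=DATA_DIR, commitlog_dir=COMMITLOG_DIR,
              saved_caches_dir=SAVED_CACHES_DIR):
    """Remove all keyspace data (including the 'system' keyspace),
    commit logs, and saved caches, so the node bootstraps fresh."""
    if data_dir.exists():
        for keyspace in data_dir.iterdir():
            # 'system' lives here too -- it holds the saved token,
            # so it must not be skipped.
            shutil.rmtree(keyspace)
    for d in (commitlog_dir, saved_caches_dir):
        if d.exists():
            shutil.rmtree(d)
            d.mkdir()  # recreate empty so Cassandra can start cleanly
```

Leaving the system keyspace behind is exactly what makes a node come back up with its old token.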

> 3) Run nodetool -h W.W.W.W  ring and see:
> -------------------------------------
> Address         DC          Rack        Status State   Load            Effective-Ownership Token
> S.S.S.S         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463
Is the W.W.W.W machine a different node running in the cluster? Or was nodetool run on S.S.S.S?

>  INFO [GossipStage:1] 2012-11-29 21:16:02,195 StorageService.java (line 1138) Nodes /X.X.X.X and /Y.Y.Y.Y have the same token 113436792799830839333714191906879955254.  /X.X.X.X is the new owner
This looks like the previous ring state was read from the system keyspace. 
Either by one of the other nodes, which then gossiped it around, or by this one.


When a node automatically selects a token at bootstrap, it logs a message such as "New token will be {} to assume load from {}". Do you see that? If not, the token has been read from the system keyspace.

Hope that helps

 
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com
