Posted to user@cassandra.apache.org by Владимир Рудев <vl...@gmail.com> on 2014/06/04 18:52:37 UTC

Cassandra 2.0 unbalanced ring with vnodes after adding new node

Hello to everyone!

Please, can someone explain where we made a mistake?

We have a cluster of 4 nodes that uses vnodes (256 per node, the default setting); the snitch is the default on every node: SimpleSnitch.
These four nodes have been in the cluster from the beginning.
In this cluster we have a keyspace with these options:
Keyspace: K:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
    Options: [replication_factor:3]
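
For reference, the relevant per-node settings can be checked like this (the cassandra.yaml path below is the usual package-install location and is only an assumption; adjust for your install):

# Check the vnode and snitch configuration on each node
grep -E '^(num_tokens|initial_token|endpoint_snitch)' /etc/cassandra/cassandra.yaml
# Expected for a vnode setup:
#   num_tokens: 256
#   endpoint_snitch: SimpleSnitch
# (initial_token should stay unset/commented out when vnodes are in use)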


Everything was normal, and nodetool status K showed each node owning 75% (effective) of the key range. All 4 nodes are located in the same datacenter and share the same first two bytes of their IP addresses (the other bytes differ).

Then we bought a new server in a different datacenter and added it to the cluster with the same settings as the previous four nodes (the only difference being listen_address), assuming the effective ownership of each node for this keyspace would be about 300/5 = 60%. But 3-5 minutes after startup, nodetool status K showed this:
nodetool status K;
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  N1   6,06 GB    256     50.0%             62f295b3-0da6-4854-a53a-f03d6b424b03  rack1
UN  N2   5,89 GB    256     50.0%             af4e4a23-2610-44dd-9061-09c7a6512a54  rack1
UN  N3   6,02 GB    256     50.0%             0f0e4e78-6fb2-479f-ad76-477006f76795  rack1
UN  N4   5,8 GB     256     50.0%             670344c0-9856-48cf-9ec9-1a98f9a89460  rack1
UN  N5   7,51 GB    256     100.0%            82473d14-9e36-4ae7-86d2-a3e526efb53f  rack1


N5 is the newly added node.

nodetool repair -pr on N5 doesn't change anything

nodetool describering K shows that the new node N5 participates in EVERY range. This is not what we want at all.

It looks like Cassandra added the new node to every range because it is located in a different datacenter, yet all settings and output indicate exactly the opposite (everything is reported as datacenter1).
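
For what it's worth, one way to dig further into which ranges each node actually holds is to dump the raw token assignments (a rough sketch; the token is assumed to be the last column of the 2.0-era nodetool ring output, so adjust if your format differs):

NODE=N1   # substitute each node's real address in turn
# List this node's tokens; whether they are spread evenly around the ring or
# bunched together tells you how its ranges were assigned.
nodetool ring | grep -w "$NODE" | awk '{print $NF}' | sort -n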

Another interesting point: although the snitch is defined as SimpleSnitch in every config file, the output of nodetool describecluster is:
Cluster Information:
        Name: Some Cluster Name
        Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
        Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
        Schema versions:
                26b8fa37-e666-31ed-aa3b-85be75f2aa1a: [N1, N2, N3, N4, N5]


We use Cassandra 2.0.6

Questions we have at this moment:
1. How can we rebalance the ring so that every node owns 60% of the range?
   1a. Is removing the node from the cluster and adding it again a solution?
2. Where could we have made a mistake when adding the new node?
3. If we add a new 6th node to the ring, will it take 50% from N5, or some portion from each node?

Thanks in advance!

--  
С уважением,  
Владимир Рудев
(With regards, Vladimir Rudev)
vladimir.rudev@gmail.com (mailto:vladimir.rudev@gmail.com)



Re: Cassandra 2.0 unbalanced ring with vnodes after adding new node

Posted by Владимир Рудев <vl...@gmail.com>.
Hmm, maybe; actually the cluster was not created by me.
Another interesting thing happened yesterday: for some reason one old node lost an sstable file (how exactly is a separate problem), so we shut that node down, cleaned out all its data, and started it again. After that, the result of nodetool status K was:
nodetool status K;
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  N1    6,06 GB    256     100.0%             62f295b3-0da6-4854-a53a-f03d6b424b03  rack1
UN  N2   5,89 GB    256     25.0%             af4e4a23-2610-44dd-9061-09c7a6512a54  rack1
UN  N3    6,02 GB    256     50.0%             0f0e4e78-6fb2-479f-ad76-477006f76795  rack1
UN  N4    5,8 GB     256     50.0%             670344c0-9856-48cf-9ec9-1a98f9a89460  rack1
UN  N5   7,51 GB    256     100.0%            82473d14-9e36-4ae7-86d2-a3e526efb53f  rack1

So the new node and the cleaned-up node each own 100%. And today we added a 6th node to the cluster, and now the status is:
nodetool status K
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  N1   7,72 GB    256     71.4%             be295bec-8184-4c21-8eaa-5669deed1d73  rack1
UN  N2   5,98 GB    256     25.1%             670344c0-9856-48cf-9ec9-1a98f9a89460  rack1
UN  N3   6,21 GB    256     25.4%             0f0e4e78-6fb2-479f-ad76-477006f76795  rack1
UN  N4   6,17 GB    256     26.4%             af4e4a23-2610-44dd-9061-09c7a6512a54  rack1
UN  N5   15,42 GB   256     75.7%             82473d14-9e36-4ae7-86d2-a3e526efb53f  rack1
UN  N6   5,87 GB    256     76.0%             3181b7ac-5778-499d-8965-2eff86507afc  rack1


So the token ranges are starting to be divided among the newly added nodes; I think decommission + clean + re-add is the solution.
Thank you, Jeremiah!

--  
С уважением,  
Владимир Рудев
(With regards, Vladimir Rudev)
vladimir.rudev@gmail.com (mailto:vladimir.rudev@gmail.com)



On Monday, June 9, 2014 at 21:27, Jeremiah D Jordan wrote:

> That looks like you started the initial nodes with num_tokens=1, then later switched to vnodes by setting num_tokens to 256, and then added the new node with 256 vnodes from the start.  Am I right?
>  
> Since you don't have very much data, the easiest way out of this will be to decommission the original nodes one at a time.  Wipe all the data off.  Then bootstrap them back into the cluster.
>  
> -Jeremiah
>  
> On Jun 4, 2014, at 11:52 AM, Владимир Рудев <vladimir.rudev@gmail.com (mailto:vladimir.rudev@gmail.com)> wrote:
> > [quoted original message snipped]


Re: Cassandra 2.0 unbalanced ring with vnodes after adding new node

Posted by Jeremiah D Jordan <je...@gmail.com>.
That looks like you started the initial nodes with num_tokens=1, then later switched to vnodes by setting num_tokens to 256, and then added the new node with 256 vnodes from the start.  Am I right?

Since you don't have very much data, the easiest way out of this will be to decommission the original nodes one at a time.  Wipe all the data off.  Then bootstrap them back into the cluster.
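
In shell terms, the cycle for each original node would look roughly like this (service commands and data paths are the usual package-install defaults and may differ in your environment):

nodetool decommission            # run on the node being rebuilt; wait for it to complete
sudo service cassandra stop
sudo rm -rf /var/lib/cassandra/data /var/lib/cassandra/commitlog /var/lib/cassandra/saved_caches
# In cassandra.yaml: num_tokens: 256 and initial_token left unset; also make sure
# this node is not in its own seed list, or it will skip bootstrapping.
sudo service cassandra start     # the node re-bootstraps with 256 randomly chosen tokens
nodetool status K                # wait for UN and sensible ownership before the next node
# Once all nodes are done, run `nodetool cleanup` on each node to drop data it no longer owns.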

-Jeremiah

On Jun 4, 2014, at 11:52 AM, Владимир Рудев <vl...@gmail.com> wrote:

> [quoted original message snipped]


Re: Cassandra 2.0 unbalanced ring with vnodes after adding new node

Posted by Marcelo Elias Del Valle <ma...@s1mbi0se.com.br>.
Actually, I have the same question. The same thing happens to me, but I guess
it's because of my lack of knowledge about Cassandra vnodes...

I just added 3 nodes to my old 2-node cluster, so now I have a 5-node
cluster.

As rows should land on a node determined by their key's hash and the number of
nodes, adding a new node should move data from all the other nodes to the new
one, right? Assuming I have a large enough number of distinct row keys.

I noticed that:


   1. Even when reading data with consistency = ALL, I get wrong
   results while the repair is not complete. Should this happen?
   2. I have run nodetool repair on each new node and nodetool cleanup on
   the 2 old nodes. There is some streaming happening, but it's really slow,
   considering my bandwidth and use of SSDs.

What should I do to make the data stream from the old nodes to the new ones
faster?
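
A couple of things worth checking for slow streams, offered only as a sketch (the throttle value below is illustrative; the persistent setting lives in cassandra.yaml as stream_throughput_outbound_megabits_per_sec):

nodetool netstats                  # shows the streaming sessions in flight and their progress
nodetool setstreamthroughput 400   # raise the streaming throttle (same units as the yaml setting); 0 disables throttling
# setstreamthroughput only lasts until restart; edit cassandra.yaml for a permanent value.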

And every time I add new nodes to the cluster, will I have to stop my
processes that read data from Cassandra until the move is complete? Isn't
there any other way?

Best regards,
Marcelo.



2014-06-04 13:52 GMT-03:00 Владимир Рудев <vl...@gmail.com>:

> [quoted original message snipped]