Posted to user@cassandra.apache.org by Jonathan Colby <jo...@gmail.com> on 2011/06/09 14:21:51 UTC
fixing unbalanced cluster !?
I got myself into a situation where one node (10.47.108.100) has a lot more data than the other nodes. In fact, the 1 TB disk on this node is almost full. I added 3 new nodes and let Cassandra calculate new tokens automatically by splitting the ranges of the most heavily loaded nodes. Unfortunately there is still a big token range this node is responsible for (5113... - 85070...). Yes, I know that one option would be to rebalance the entire cluster with nodetool move, but this is an extremely time-consuming and error-prone process because of the amount of data involved.
Our RF = 3 and we read/write at QUORUM. The nodes have been repaired, so I think the data should be in good shape.
Question: Can I get myself out of this mess without installing new nodes? I was thinking of either a decommission or a removetoken to have the cluster "rebalance itself", then re-bootstrapping this node at a new token.
Address        Status  State   Load       Owns     Token
                                                   127605887595351923798765477786913079296
10.46.108.100  Up      Normal  218.52 GB  25.00%   0
10.46.108.101  Up      Normal  260.04 GB  12.50%   21267647932558653966460912964485513216
10.46.108.104  Up      Normal  286.79 GB  17.56%   51138582157040063602728874106478613120
10.47.108.100  Up      Normal  874.91 GB  19.94%   85070591730234615865843651857942052863
10.47.108.102  Up      Normal  302.79 GB   4.16%   92156241323118845370666296304459139297
10.47.108.103  Up      Normal  242.02 GB   4.16%   99241191538897700272878550821956884116
10.47.108.101  Up      Normal  439.9 GB    8.34%   113427455640312821154458202477256070484
10.46.108.103  Up      Normal  304 GB      8.33%   127605887595351923798765477786913079296
Re: fixing unbalanced cluster !?
Posted by Jonathan Colby <jo...@gmail.com>.
Thanks Ben. That's what I was afraid I had to do. I can see how it's a lot easier if you simply double the cluster when adding capacity.
Jon
Re: fixing unbalanced cluster !?
Posted by Benjamin Coverston <be...@datastax.com>.
Because you were able to run repair successfully, you can follow up with
a nodetool cleanup, which will get rid of some of the extraneous data on
that (bigger) node. You're also assured, after running repair, that
entropy between the nodes is minimal.
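For example, a minimal sketch of driving that cleanup from Python
(assuming nodetool is on the PATH; the target host is the oversized node
from the ring output in the original post):

import subprocess

# Drop the data 10.47.108.100 no longer owns. Only safe here because
# repair has already completed across the cluster.
subprocess.check_call(["nodetool", "-h", "10.47.108.100", "cleanup"])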
Assuming you're using the RandomPartitioner: to balance your ring, I
would start by calculating the new token locations, then moving each of
your nodes backwards along its owned range to its new location.
From the script on http://wiki.apache.org/cassandra/Operations, your new
balanced tokens would be (a short sketch of the calculation follows the list):
0
21267647932558653966460912964485513216
42535295865117307932921825928971026432
63802943797675961899382738893456539648
85070591730234615865843651857942052864
106338239662793269832304564822427566080
127605887595351923798765477786913079296
148873535527910577765226390751398592512
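That script just divides the RandomPartitioner's 2**127 token space
evenly among the nodes. A minimal Python equivalent (the helper name is
illustrative, not taken from the wiki) that reproduces the eight tokens
above:

def balanced_tokens(node_count):
    # Space node_count tokens evenly around the 0 .. 2**127 ring
    # used by the RandomPartitioner.
    return [i * (2 ** 127) // node_count for i in range(node_count)]

for token in balanced_tokens(8):
    print(token)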
From this you can see that 10.46.108.{100, 101} are already in the
right places, so you don't have to do anything with those nodes. Proceed
with moving 10.46.108.104 to its new token; the safest way to do this
is nodetool move. Another way would be a removetoken followed by
re-adding the node to the ring at its new location. The risk there is
that if you do not at least repair after re-joining the ring (and before
you move the next node in the ring), some of the data on that node will
be ignored, because it now falls outside the owned range. So it's good
practice to immediately run repair on any node that you do a
removetoken / re-join on.
The rest of your balancing is then an iteration of the above steps,
moving through the rest of the ring.
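To make that iteration concrete, here is a rough sketch that pairs each
node, in ring order, with its balanced target token and prints the
corresponding nodetool move commands. The ring-order pairing is an
assumption, so double-check each move against your actual ring before
running anything; the first two commands are no-ops, since those nodes
already hold their target tokens:

# Nodes in current ring order, taken from the ring output in the
# original post.
ring_order = [
    "10.46.108.100", "10.46.108.101", "10.46.108.104", "10.47.108.100",
    "10.47.108.102", "10.47.108.103", "10.47.108.101", "10.46.108.103",
]

# Balanced targets: divide the 2**127 token space evenly.
targets = [i * (2 ** 127) // len(ring_order) for i in range(len(ring_order))]

for host, token in zip(ring_order, targets):
    print("nodetool -h %s move %d" % (host, token))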
--
Ben Coverston
Director of Operations
DataStax -- The Apache Cassandra Company
http://www.datastax.com/