Posted to user@cassandra.apache.org by Vanger <di...@gmail.com> on 2012/03/12 10:23:13 UTC

Adding node to Cassandra

We have a 4-node Cassandra cluster with RF = 3 (nodes named 'A' through 
'D', initial tokens:
A (25%): 20543402371996174596346065790779111550,
B (25%): 63454860067234500516210522518260948578,
C (25%): 106715317233367107622067286720208938865,
D (25%): 150141183460469231731687303715884105728),
and we want to add a 5th node ('E') with initial token 
164163260474281062972548100673162157075, then rebalance A, D, and E so 
that they own equal percentages of data. All nodes have ~400GB of data 
and around ~300GB of free disk space.
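
For reference, RandomPartitioner tokens live in [0, 2^127), so perfectly 
even tokens for an N-node ring are i * 2^127 / N; a quick sketch with bc 
for N = 5 (our actual ring is offset from zero, so the absolute values 
differ):

    $ # evenly spaced RandomPartitioner tokens for a 5-node ring
    $ for i in 0 1 2 3 4; do echo "($i * 2^127) / 5" | bc; done
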
What we did:
1. Joined the new Cassandra instance (node 'E') to the cluster and waited 
until it had loaded the data for its token range.

2. Moved node 'D' initial token down from 150... to 130... (the nodetool 
calls are sketched below).
Here we ran into a problem. When the "move" started, disk usage on node 
'C' grew from 400 to 750GB. We saw compactions running on node 'D', but 
some compactions failed with "WARN [CompactionExecutor:580] 2012-03-11 
16:57:56,036 CompactionTask.java (line 87) insufficient space to compact 
all requested files SSTableReader", after which we killed the "move" 
process to avoid an "out of disk space" error (when 5GB of free space 
was left). After a restart it freed 100GB of space, and now we have a 
total of 105GB of free disk space on node 'D'. We also noticed disk 
usage grow by ~150GB on node 'B', but it stopped growing before we 
stopped the "move token" operation.
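
The move itself was a single nodetool call per step, roughly like this 
(host and token below are placeholders, not the exact values we used):

    $ # reassign node D's token; data is streamed to/from the affected replicas
    $ nodetool -h <nodeD_host> move <new_token>
    $ # watch ownership and load while it runs
    $ nodetool -h <any_host> ring
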


So now the 5 nodes in the cluster look like this:
Node   Owns%   Load    Init. token
A      16%     400GB   020...
B      25%     520GB   063...
C      25%     400GB   106...
D      25%     640GB   150...
E       9%     300GB   164...

We'll add disk space on all nodes and run some cleanups, but a few 
questions remain:

What is the best next step for us from this point?
What is the correct procedure after all, and what should we expect when 
adding a node to a Cassandra cluster?
We expected disk usage on node 'D' to decrease because we shrank its 
token range, but we saw the opposite. Why did that happen, and is it 
normal behavior?
What if we had 2TB of data on a 2.5TB disk and wanted to add another 
node and move tokens?
Is it possible to automate adding nodes to the cluster and be sure we 
won't run out of space?

Thanks.

Re: Adding node to Cassandra

Posted by aaron morton <aa...@thelastpickle.com>.
> 2. Moved node 'D' initial token down from 150... to 130...
> Here we ran into a problem. When the "move" started, disk usage on node 'C' grew from 400 to 750GB. We saw compactions running on node 'D', but some compactions failed with ...
Did you run out of space on C or D?

> We expected disk usage on node 'D' to decrease because we shrank its token range, but we saw the opposite. Why did that happen, and is it normal behavior?
Remember that node D is also holding replicas of the token ranges assigned to nodes B and C: with RF = 3 and SimpleStrategy placement, each node stores its own primary range plus replicas of the two ranges that precede it on the ring.

At first glance it sounds unusual, but it's hard to tell without knowing more about what happened. How long did it take to build up? What sort of load was the system under? What was in the data directory: were there -tmp files in there, or lots of small files? What did nodetool compactionstats say; was compaction keeping up?
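
For the record, those checks are one-liners; the data path below is the 
default one, adjust for your install:

    $ nodetool -h <host> compactionstats          # a growing pending-task count means compaction is not keeping up
    $ ls -lh /var/lib/cassandra/data/<Keyspace>/ | grep tmp   # leftover -tmp- SSTables from interrupted compactions
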

Moving forward, *if* you see a lot of old files in the data dir you may benefit from running a manual compaction, as it may reduce the amount of data transferred. There are some downsides to this; check the DataStax site, or ask if you do not know what they are.
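
A manual (major) compaction is also a single call; keyspace and column 
family names here are placeholders:

    $ # merge all SSTables of one column family into a single SSTable
    $ nodetool -h <host> compact <Keyspace> <ColumnFamily>
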

Hope that helps


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com



Re: Adding node to Cassandra

Posted by Rustam Aliyev <ru...@code.az>.
It's hard to answer this question because there is a whole bunch of 
operations that can cause disk usage growth: repair, compaction, move, 
etc. Any combination of these operations will only make things worse. 
But let's assume that in your case the only operation increasing disk 
usage was "move".

Simply speaking, "move" does not move data from one node to another, it 
just copies it. Once the data has been copied, you need to clean up the 
data the node is no longer responsible for, using the "cleanup" command.
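
Something like the following, with host and keyspace as placeholders:

    $ # rewrite SSTables, dropping the ranges this node no longer owns
    $ nodetool -h <host> cleanup <Keyspace>
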

If you can't increase storage, maybe you can try moving nodes slowly, 
i.e. instead of moving node D from 150... to 130... in one step, first 
go to 140..., run cleanup, and then move from 140... to 130... (sketched 
below). However, I have never tried this and can't guarantee that it 
will use less disk space.
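
As a sketch, with the intermediate and final tokens as placeholders (let 
each step finish before starting the next):

    $ nodetool -h <nodeD_host> move <intermediate_token>   # e.g. 140..., halfway to the target
    $ nodetool -h <nodeD_host> cleanup
    $ nodetool -h <nodeD_host> move <final_token>
    $ nodetool -h <nodeD_host> cleanup
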

In the past, someone reported a 2.5x increase in disk usage when they 
went from 4 nodes to 5.

--
Rustam.

On 12/03/2012 12:46, Vanger wrote:
> Cassandra v1.0.8
> once again: 4-nodes cluster, RF = 3.

Re: Adding node to Cassandra

Posted by Vanger <di...@gmail.com>.
Cassandra v1.0.8
once again: 4-node cluster, RF = 3.


On 12.03.2012 16:18, Rustam Aliyev wrote:
> What version of Cassandra do you have?


Re: Adding node to Cassandra

Posted by Rustam Aliyev <ru...@code.az>.
What version of Cassandra do you have?

On 12/03/2012 11:38, Vanger wrote:
> We were aware of the compaction overhead, but we still don't understand 
> why it happened: node 'D' was in a stable condition, had been working 
> for at least a month, had all the data for its token range, and was 
> comfortable with its disk space.
> Why does the node suddenly need 2x more space for data it already has? 
> Why doesn't decreasing the token range lead to decreased disk usage?

Re: Adding node to Cassandra

Posted by Vanger <di...@gmail.com>.
We were aware of the compaction overhead, but we still don't understand 
why it happened: node 'D' was in a stable condition, had been working 
for at least a month, had all the data for its token range, and was 
comfortable with its disk space.
Why does the node suddenly need 2x more space for data it already has? 
Why doesn't decreasing the token range lead to decreased disk usage?

On 12.03.2012 15:14, Rustam Aliyev wrote:
> Hi,
>
> If you use SizeTieredCompactionStrategy, you should have 2x disk space 
> to be on the safe side. So if you want to store 2TB of data, you need 
> a partition of at least 4TB. LeveledCompactionStrategy is available in 
> 1.x and is supposed to require less free disk space (but comes at the 
> price of extra I/O).
>
> --
> Rustam.


Re: Adding node to Cassandra

Posted by Rustam Aliyev <ru...@code.az>.
Hi,

If you use SizeTieredCompactionStrategy, you should have 2x disk space 
to be on the safe side. So if you want to store 2TB of data, you need a 
partition of at least 4TB. LeveledCompactionStrategy is available in 1.x 
and is supposed to require less free disk space (but comes at the price 
of extra I/O).
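
If memory serves, switching a column family over looks like this in 
cassandra-cli (keyspace and column family names are placeholders):

    $ cassandra-cli -h <host> -p 9160
    [default@unknown] use <Keyspace>;
    [default@<Keyspace>] update column family <ColumnFamily> with compaction_strategy = 'LeveledCompactionStrategy';
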

--
Rustam.
