Posted to user@cassandra.apache.org by Cameron Gandevia <ca...@globalrelay.net> on 2019/02/25 23:00:05 UTC

Determine disc space that will be freed after expansion cleanup

Hi

After a cluster expansion, some Cassandra nodes can hold rows for 
tokens that those nodes no longer own. This data remains on disk until 
a cleanup compaction is run.

We would like to know the best way to calculate, at least approximately, 
how much of the data on each node is effectively dead, so that we can 
determine how much disk space will be freed once the post-expansion 
cleanup is complete. One possible approach we considered is to scan the 
sstables, identify the rows no longer owned by each node, and add up 
their sizes.
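
The scanning approach could be sketched roughly as follows. This is only an illustration under assumptions: the (token, size) pairs and the list of owned ranges are hypothetical inputs that would have to be extracted from the sstables and the ring by some other means, and the owned ranges must include every range the node holds as a replica, not just its primary ranges. Ranges are treated as (start, end], and a range whose start is not below its end is treated as wrapping around the ring.

```python
from typing import Iterable, List, Tuple

# A token range (start, end]; start >= end means it wraps around the ring.
TokenRange = Tuple[int, int]


def owns(token: int, ranges: List[TokenRange]) -> bool:
    """Return True if the token falls inside any of the given ranges."""
    for start, end in ranges:
        if start < end:
            if start < token <= end:
                return True
        else:
            # Wrapping range: covers everything above start or up to end.
            if token > start or token <= end:
                return True
    return False


def estimate_reclaimable(
    partitions: Iterable[Tuple[int, int]],  # hypothetical (token, size_in_bytes) pairs
    owned_ranges: List[TokenRange],         # ranges this node still replicates
) -> int:
    """Sum the sizes of partitions whose token the node no longer owns."""
    return sum(size for token, size in partitions if not owns(token, owned_ranges))
```

For example, with partitions [(5, 100), (15, 200), (-50, 300)] and a single owned range (0, 10], only the first partition is still owned, so the estimate is 500 bytes. The result is an approximation at best, since on-disk sizes depend on compression and on how many sstables hold copies of the same partition.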




RE: Determine disc space that will be freed after expansion cleanup

Posted by Kenneth Brotman <ke...@yahoo.com.INVALID>.
A quick way to get a rough idea might be to run nodetool status and look at the imbalance in data load across the nodes, assuming all the nodes have the same specs.
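
One way to turn that imbalance into a number is sketched below. It rests on assumptions that are not stated in this thread: every node owns roughly the same share of the ring, and the newly added nodes hold only data they currently own, so their average load approximates what an older node should hold after cleanup. The node names and load figures are hypothetical and would be read off the nodetool status output by hand.

```python
from typing import Dict, Set


def estimate_from_load(load_bytes: Dict[str, float], new_nodes: Set[str]) -> Dict[str, float]:
    """Estimate reclaimable bytes per pre-existing node from per-node load.

    The baseline is the average load of the newly added nodes, which are
    assumed to hold no stale data.
    """
    new_loads = [load_bytes[n] for n in new_nodes]
    baseline = sum(new_loads) / len(new_loads)
    return {
        node: max(0.0, load - baseline)  # never report a negative estimate
        for node, load in load_bytes.items()
        if node not in new_nodes
    }
```

With loads of 900 and 880 on two older nodes and 600 and 620 on two new ones, the baseline is 610, giving estimates of 290 and 270 for the older nodes. If ownership is uneven or the nodes differ in spec, the figures will be correspondingly off.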

 
