Posted to user@cassandra.apache.org by onmstester onmstester <on...@zoho.com.INVALID> on 2019/02/13 04:07:37 UTC

forgot to run nodetool cleanup

Hi,



I should have run cleanup after adding a few nodes to my cluster about 2 months ago. The TTL is 6 months. What happens now? Should I worry about anything catastrophic?

Should I run cleanup now?



Thanks in advance



Re: forgot to run nodetool cleanup

Posted by shalom sagges <sh...@gmail.com>.
Cleanup is a great way to free up disk space.

Just note you might run into
https://issues.apache.org/jira/browse/CASSANDRA-9036 if you use a version
older than 2.0.15.



On Thu, Feb 14, 2019 at 10:20 AM Oleksandr Shulgin <
oleksandr.shulgin@zalando.de> wrote:

> On Wed, Feb 13, 2019 at 6:47 PM Jeff Jirsa <jj...@gmail.com> wrote:
>
>> Depending on how bad data resurrection is, you should run it for any host
>> that loses a range. In vnodes, that's usually all hosts.
>>
>> Cleanup with LCS is very cheap. Cleanup with STCS/TWCS is a bit more work.
>>
>
> Wait, doesn't cleanup just rewrite every SSTable one by one?  Why would
> compaction strategy matter?  Do you mean that after cleanup STCS may pick
> some resulting tables to re-compact them due to the min/max size
> difference, which would not be the case with LCS?
>
>
>> If you're just TTL'ing all data, it may not be worth the effort.
>>
>
> Indeed, but in our case the main reason to scale out is that the nodes are
> running out of disk space, so we really want to get rid of the extra copies.
>
> --
> Alex
>
>

Re: forgot to run nodetool cleanup

Posted by Oleksandr Shulgin <ol...@zalando.de>.
On Thu, Feb 14, 2019 at 4:39 PM Jeff Jirsa <jj...@gmail.com> wrote:

>> Wait, doesn't cleanup just rewrite every SSTable one by one?  Why would
>> compaction strategy matter?  Do you mean that after cleanup STCS may pick
>> some resulting tables to re-compact them due to the min/max size
>> difference, which would not be the case with LCS?
>
> LCS has smaller, non-overlapping files. The upleveling process and
> non-overlapping part makes it very likely (but not guaranteed) that within
> a level, only 2 sstables will overlap a losing range.
>
> Since cleanup only rewrites files if they’re out of range, LCS probably
> only has 5 (levels) * 2 (lower and upper) * number of ranges sstables that
> are going to get rewritten, where TWCS / stcs is probably going to rewrite
> all of them.

Thanks for the explanation!

Still with the default number of vnodes, there is probably not much of a
difference as even a single additional node will touch a lot of ranges?

--
Alex

Re: forgot to run nodetool cleanup

Posted by Jeff Jirsa <jj...@gmail.com>.


> On Feb 14, 2019, at 12:19 AM, Oleksandr Shulgin <ol...@zalando.de> wrote:
> 
>> On Wed, Feb 13, 2019 at 6:47 PM Jeff Jirsa <jj...@gmail.com> wrote:
>> Depending on how bad data resurrection is, you should run it for any host that loses a range. In vnodes, that's usually all hosts. 
>> 
>> Cleanup with LCS is very cheap. Cleanup with STCS/TWCS is a bit more work.
> 
> Wait, doesn't cleanup just rewrite every SSTable one by one?  Why would compaction strategy matter?  Do you mean that after cleanup STCS may pick some resulting tables to re-compact them due to the min/max size difference, which would not be the case with LCS?

LCS has smaller, non-overlapping files. The upleveling process and non-overlapping part makes it very likely (but not guaranteed) that within a level, only 2 sstables will overlap a losing range. 

Since cleanup only rewrites files if they’re out of range, LCS probably only has about 5 (levels) * 2 (lower and upper) * (number of ranges) sstables that are going to get rewritten, whereas TWCS / STCS is probably going to rewrite all of them.
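
As a rough back-of-the-envelope sketch of the estimate above (Python; the function name and the example numbers are illustrative assumptions, not measurements and not anything taken from Cassandra itself):

```python
def lcs_cleanup_estimate(levels: int, lost_ranges: int, total_sstables: int) -> int:
    """Rough upper bound on the sstables cleanup rewrites under LCS.

    Assumes at most 2 sstables per level (the lower and upper edge) overlap
    each range the node lost. L0 and adjacent lost ranges are ignored.
    """
    estimate = levels * 2 * lost_ranges
    # Cleanup can never rewrite more sstables than exist on the node.
    return min(estimate, total_sstables)


# Hypothetical node: 5 levels, 8 lost ranges, 20,000 sstables on disk.
print(lcs_cleanup_estimate(levels=5, lost_ranges=8, total_sstables=20000))     # 80
# With very many lost ranges the estimate hits the cap, i.e. "all of them".
print(lcs_cleanup_estimate(levels=5, lost_ranges=3000, total_sstables=20000))  # 20000
```

The cap matters: once the number of lost ranges is large, the LCS advantage in this estimate shrinks toward the STCS/TWCS case of rewriting everything.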

Re: forgot to run nodetool cleanup

Posted by Oleksandr Shulgin <ol...@zalando.de>.
On Wed, Feb 13, 2019 at 6:47 PM Jeff Jirsa <jj...@gmail.com> wrote:

> Depending on how bad data resurrection is, you should run it for any host
> that loses a range. In vnodes, that's usually all hosts.
>
> Cleanup with LCS is very cheap. Cleanup with STCS/TWCS is a bit more work.
>

Wait, doesn't cleanup just rewrite every SSTable one by one?  Why would
compaction strategy matter?  Do you mean that after cleanup STCS may pick
some resulting tables to re-compact them due to the min/max size
difference, which would not be the case with LCS?


> If you're just TTL'ing all data, it may not be worth the effort.
>

Indeed, but in our case the main reason to scale out is that the nodes are
running out of disk space, so we really want to get rid of the extra copies.

--
Alex

Re: forgot to run nodetool cleanup

Posted by Jeff Jirsa <jj...@gmail.com>.
On Wed, Feb 13, 2019 at 7:47 AM Oleksandr Shulgin <
oleksandr.shulgin@zalando.de> wrote:

> On Wed, Feb 13, 2019 at 4:40 PM Jeff Jirsa <jj...@gmail.com> wrote:
>
>> Some people who add new hosts rebalance the ring afterward - that
>> rebalancing can look a lot like a shrink.
>>
>
> You mean by moving the tokens?  That's only possible if one is not using
> vnodes, correct?
>

Believe that's correct (I'm not sure how to move individual tokens in a
vnode cluster, at least, short of adding a host with the exact token placement
you want and decommissioning another host).

>
>> I also believe, but don’t have time to prove, that enough new hosts can
>> eventually give you a range back (moving it all the way around the ring) -
>> less likely but probably possible.
>>
>> Easiest to just assume that any range movement may resurrect data if you
>> haven’t run cleanup.
>>
>
> Does this mean that it is recommended to run cleanup on all hosts after
> every single node added?  We currently do this after every 3 or 6 nodes (1
> or 2 new per rack), to minimize the number of times we have to rewrite the
> sstable files.  Arguably, we don't do explicit deletes, the data is only
> expiring due to TTL, so this should not be a problem for us, but in general?
>

Depending on how bad data resurrection is, you should run it for any host
that loses a range. In vnodes, that's usually all hosts.

Cleanup with LCS is very cheap. Cleanup with STCS/TWCS is a bit more work.
If you're just TTL'ing all data, it may not be worth the effort.
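
To illustrate why "any host that loses a range" is usually every host with vnodes, here is a minimal Python sketch. It is a toy model, not Cassandra code: it assumes RF=1, a simple ring where a node owns the range ending at each of its tokens, and random tokens; all names are made up for the example.

```python
import bisect
import random


def owner(ring, token):
    """Node owning `token`: the first ring entry at or after it, wrapping around."""
    tokens = [t for t, _ in ring]
    i = bisect.bisect_left(tokens, token) % len(ring)
    return ring[i][1]


def hosts_losing_ranges(existing, new_tokens):
    """Existing hosts that give up part of a range when a node with `new_tokens` joins.

    `existing` maps node name -> list of tokens (toy model, RF=1).
    """
    ring = sorted((t, n) for n, ts in existing.items() for t in ts)
    # Each new token splits the range of whichever node currently owns that token.
    return {owner(ring, t) for t in new_tokens}


rng = random.Random(42)


def random_tokens(n):
    return [rng.randrange(-2**63, 2**63) for _ in range(n)]


single = {f"node{i}": random_tokens(1) for i in range(6)}
vnodes = {f"node{i}": random_tokens(256) for i in range(6)}

# One new token splits exactly one existing range, so one host needs cleanup.
print(len(hosts_losing_ranges(single, random_tokens(1))))
# 256 new tokens land all over the ring, so typically every host loses something.
print(len(hosts_losing_ranges(vnodes, random_tokens(256))))
```

With replication factor above 1 more hosts are affected per new token, so the single-token case is a lower bound rather than a rule.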

Re: forgot to run nodetool cleanup

Posted by Oleksandr Shulgin <ol...@zalando.de>.
On Wed, Feb 13, 2019 at 4:40 PM Jeff Jirsa <jj...@gmail.com> wrote:

> Some people who add new hosts rebalance the ring afterward - that
> rebalancing can look a lot like a shrink.
>

You mean by moving the tokens?  That's only possible if one is not using
vnodes, correct?

> I also believe, but don’t have time to prove, that enough new hosts can
> eventually give you a range back (moving it all the way around the ring) -
> less likely but probably possible.
>
> Easiest to just assume that any range movement may resurrect data if you
> haven’t run cleanup.
>

Does this mean that it is recommended to run cleanup on all hosts after
every single node added?  We currently do this after every 3 or 6 nodes (1
or 2 new per rack), to minimize the number of times we have to rewrite the
sstable files.  Arguably, we don't do explicit deletes, the data is only
expiring due to TTL, so this should not be a problem for us, but in general?

--
Alex

Re: forgot to run nodetool cleanup

Posted by Jeff Jirsa <jj...@gmail.com>.
Some people who add new hosts rebalance the ring afterward - that rebalancing can look a lot like a shrink. 

I also believe, but don’t have time to prove, that enough new hosts can eventually give you a range back (moving it all the way around the ring) - less likely but probably possible.

Easiest to just assume that any range movement may resurrect data if you haven’t run cleanup.
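
A toy Python model of that failure mode, for illustration only. It is not Cassandra code: two hypothetical nodes, one key, RF=1, and the tombstone is assumed to have been purged after gc_grace_seconds before the second range movement.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> value on disk, whether or not the node still owns it


def cleanup(node, owned_keys):
    """Drop data the node no longer owns (the effect of cleanup in this toy model)."""
    node.data = {k: v for k, v in node.data.items() if k in owned_keys}


def scenario(run_cleanup):
    a, b = Node("A"), Node("B")
    a.data["k"] = "v1"  # A owns k and stores it

    # 1. B joins and takes over k's range; k is streamed to B.
    b.data["k"] = a.data["k"]
    if run_cleanup:
        cleanup(a, owned_keys=set())  # A owns nothing of k's range any more

    # 2. k is deleted. Only the current owner (B) sees the delete, and the
    #    tombstone is later purged, leaving nothing for k on B.
    b.data.pop("k")

    # 3. A second range movement hands k's range back to A.
    #    B streams what it has for that range, which is nothing.
    a.data.update(b.data)

    # A now answers reads for k from whatever is on its disk.
    return a.data.get("k")


print(scenario(run_cleanup=False))  # v1   (the deleted value is back)
print(scenario(run_cleanup=True))   # None
```

The stale copy on A is harmless while A does not own the range; it only becomes visible again once ownership returns, which is why cleanup before the next range movement is the safe default.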

-- 
Jeff Jirsa


> On Feb 13, 2019, at 12:34 AM, Oleksandr Shulgin <ol...@zalando.de> wrote:
> 
>> On Wed, Feb 13, 2019 at 5:31 AM Jeff Jirsa <jj...@gmail.com> wrote:
> 
>> The most likely result of not running cleanup is wasted disk space.
>> 
>> The second most likely result is resurrecting deleted data if you do a second range movement (expansion, shrink, etc). 
>> 
>> If this is bad for you, you should run cleanup now. For many use cases, it’s a nonissue. 
>> 
>> If you know you’re going to add more hosts, be very sure you run cleanup before you do so.
> 
> Jeff,
> 
> Could you please expand a little?  Do you mean that adding new hosts can lead to deleted data resurrection if cleanup isn't done prior to that?
> 
> I would only expect this to be a potential problem if one removes nodes, since then range ownership can expand, but not with adding nodes, as then ownership can only shrink.  Or am I missing something bigger?
> 
> --
> Alex
> 

Re: forgot to run nodetool cleanup

Posted by Oleksandr Shulgin <ol...@zalando.de>.
On Wed, Feb 13, 2019 at 5:31 AM Jeff Jirsa <jj...@gmail.com> wrote:

> The most likely result of not running cleanup is wasted disk space.
>
> The second most likely result is resurrecting deleted data if you do a
> second range movement (expansion, shrink, etc).
>
> If this is bad for you, you should run cleanup now. For many use cases,
> it’s a nonissue.
>
> If you know you’re going to add more hosts, be very sure you run cleanup
> before you do so.
>

Jeff,

Could you please expand a little?  Do you mean that adding new hosts can
lead to deleted data resurrection if cleanup isn't done prior to that?

I would only expect this to be a potential problem if one removes nodes,
since then range ownership can expand, but not with adding nodes, as then
ownership can only shrink.  Or am I missing something bigger?

--
Alex

Re: forgot to run nodetool cleanup

Posted by Jeff Jirsa <jj...@gmail.com>.
The most likely result of not running cleanup is wasted disk space.

The second most likely result is resurrecting deleted data if you do a second range movement (expansion, shrink, etc). 

If this is bad for you, you should run cleanup now. For many use cases, it’s a nonissue. 

If you know you’re going to add more hosts, be very sure you run cleanup before you do so. 

There’s no harm in running cleanup now.



-- 
Jeff Jirsa


> On Feb 12, 2019, at 8:07 PM, onmstester onmstester <on...@zoho.com.invalid> wrote:
> 
> Hi,
> 
> I should have run cleanup after adding a few nodes to my cluster about 2 months ago. The TTL is 6 months. What happens now? Should I worry about anything catastrophic?
> Should I run cleanup now?
> 
> Thanks in advance
> 