Posted to user@cassandra.apache.org by Peng Xiao <25...@qq.com> on 2017/09/27 04:25:17 UTC

Re: nodetool cleanup in parallel

Thanks Kurt.




------------------ Original Message ------------------
From: "kurt";<ku...@instaclustr.com>;
Sent: Wednesday, 27 September 2017, 11:57 AM
To: "User"<us...@cassandra.apache.org>;

Subject: Re: nodetool cleanup in parallel



Correct. You can run it in parallel across many nodes if you have the capacity. We generally see about a 10% CPU increase from cleanups, which isn't a big deal if you have the capacity to handle it and the I/O.

On that note, on later versions you can specify -j <num jobs> to run multiple cleanup compactions at the same time on a single node, and you can also increase compaction throughput to speed the process up.
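
As a rough illustration of the two suggestions above (cleanup on several nodes at once, and -j on a single node), here is a minimal Python sketch. It assumes nodetool is on the PATH of the machine running the script and that it can reach each node's JMX port through nodetool's -h option. The host names, the max_parallel limit and the helper names are all hypothetical, and -j only works on versions that support it.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess


def cleanup_command(host, jobs=None, keyspace=None):
    """Build the nodetool argument list for a cleanup on one node."""
    cmd = ["nodetool", "-h", host, "cleanup"]
    if jobs is not None:
        # -j is only available on later versions, as noted above.
        cmd += ["-j", str(jobs)]
    if keyspace:
        cmd.append(keyspace)
    return cmd


def throughput_command(host, mb_per_sec):
    """Build the nodetool call that changes compaction throughput on one node."""
    return ["nodetool", "-h", host, "setcompactionthroughput", str(mb_per_sec)]


def run_cleanup(hosts, jobs=None, max_parallel=3, runner=subprocess.run):
    """Run cleanup on several nodes, at most max_parallel at a time.

    runner is injectable so the function can be dry-run without nodetool.
    """
    commands = [cleanup_command(h, jobs) for h in hosts]
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() preserves input order, so results line up with hosts.
        results = list(pool.map(runner, commands))
    return dict(zip(hosts, results))
```

For a dry run that only prints the commands, something like run_cleanup(["node1", "node2"], jobs=2, runner=print) should do. Keeping max_parallel small is one way to stay within the spare CPU and I/O capacity mentioned above.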


On 27 Sep. 2017 13:20, "Peng Xiao" <25...@qq.com> wrote:
Hi,

nodetool cleanup only removes keys that no longer belong to a node, so theoretically we can run nodetool cleanup in parallel, right? The documentation suggests running it one node at a time, but that's too slow.


Thanks,
Peng Xiao

RE: Re: nodetool cleanup in parallel

Posted by "Steinmaurer, Thomas" <th...@dynatrace.com>.
Side-note: At least with 2.1 (or even later), be aware that you might run into the following issue:
https://issues.apache.org/jira/browse/CASSANDRA-11155

We take hourly, cron-job-based snapshots in production and also tried to run cleanup after extending a cluster from 6 to 9 nodes. This resulted in snapshot creation getting stuck, so we gave up on running cleanup (and reclaiming disk space) in favor of still having up-to-date snapshots in place.

We might find a time window where we disable snapshots, but cleanup may take a while depending on the data volume, so this could mean disabling snapshots for many hours.

Regards,
Thomas


