Posted to user@cassandra.apache.org by Edward Capriolo <ed...@gmail.com> on 2011/12/01 01:45:28 UTC

Re: Cleanup in a write-only environment

Your understanding of nodetool cleanup is not correct. Cleanup is needed only
after rebalancing the cluster, such as adding or removing nodes. It removes
data that no longer belongs on the node (in older versions it removed hints
as well).
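
To illustrate what "data that no longer belongs on the node" means, here is a
toy sketch in Python. It is not Cassandra code: it assumes integer tokens,
ignores replication entirely, and the names owner and cleanup are hypothetical
helpers made up for this example. The only rule it models is that a key
belongs to the first node whose token is greater than or equal to the key's
token, wrapping around the ring.

```python
from bisect import bisect_left


def owner(token, node_tokens):
    """Return the node token that owns `token` in this toy ring.

    Assumption: a key belongs to the first node whose token is >= the
    key's token, wrapping around to the smallest node token.
    """
    ring = sorted(node_tokens)
    index = bisect_left(ring, token)
    return ring[index % len(ring)]


def cleanup(node, stored_tokens, node_tokens):
    """Keep only the keys `node` still owns; the rest is what cleanup drops."""
    return [t for t in stored_tokens if owner(t, node_tokens) == node]


# Two-node ring: the node with token 100 owns keys in (50, 100].
ring_before = [50, 100]
stored_on_100 = [60, 75, 90, 100]

# Topology unchanged: every stored key still belongs here, nothing to remove.
unchanged = cleanup(100, stored_on_100, ring_before)

# A node with token 80 joins and takes over (50, 80]. Keys 60 and 75 are
# still on disk on node 100 but no longer belong to it.
ring_after = [50, 80, 100]
kept = cleanup(100, stored_on_100, ring_after)

print(unchanged)
print(kept)
```

In this model, writes alone never change which keys a node owns, so there is
nothing for cleanup to remove until the ring itself changes.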

What you are really debating is whether you need to run compaction. In a
write-only workload you should let Cassandra do its normal compaction (in
most cases).

On Wednesday, November 30, 2011, David McNelis <dm...@agentisenergy.com>
wrote:
> In my understanding Cleanup is meant to help clear out data that has
> been removed. If you have an environment where data is only ever added
> (the case for the production system I'm working with), is there a point to
> automating cleanup? I understand that if we were to ever purge a segment
> of data from our cluster we'd certainly want to run it, or after adding a
> new node and adjusting the tokens.
> So I want to make sure I'm not missing something here. Are there
> other reasons to run cleanup regularly?
>
> --
> David McNelis
> Lead Software Engineer
> Agentis Energy
> www.agentisenergy.com
> c: 219.384.5143
> A Smart Grid technology company focused on helping consumers of energy
> control an often under-managed resource.
>
>

Re: Cleanup in a write-only environment

Posted by David McNelis <dm...@agentisenergy.com>.
Thanks, folks.

I think I must have read compaction, thought cleanup, and gotten muddled
from there.

David