Posted to user@cassandra.apache.org by Thomas Stets <th...@gmail.com> on 2012/09/26 10:26:56 UTC

Why periodical repairs?

The Cassandra Operations page
(http://wiki.apache.org/cassandra/Operations) says:

> Unless your application performs no deletes, it is vital that production
clusters run nodetool repair periodically on all nodes in the cluster. The
hard requirement for repair frequency is the value used for GCGraceSeconds.
Running nodetool repair often enough to guarantee that all nodes have
performed a repair in a given period GCGraceSeconds long, ensures that
deletes are not "forgotten" in the cluster.

Is it really that common for deletes to be forgotten, or is it just a
precaution against an unlikely-but-hard-to-fix problem?

  regards, Thomas

Re: Why periodical repairs?

Posted by Tyler Hobbs <ty...@datastax.com>.

The DistributedDeletes link in that section explains the root reason for
needing to do this.  It's not that deletes are forgotten; it's that a write
(deletes are basically tombstone writes) didn't get replicated to all
replicas.  For example, at RF=3, write consistency level QUORUM, if one of
the replicas goes down for several hours while you're performing deletes,
then comes back up, it won't necessarily have all of those tombstones.
Hinted handoff will replay some of the deletes, but not all of them if
the replica is down for an extended period of time.  If the other replicas
then purge their tombstones after GCGraceSeconds before a repair has
delivered them to that replica, its old copy of the data looks live again.

Once you have "zombie" data, the only way to get rid of it is to re-run the
delete.
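
To make the failure mode concrete, here is a toy simulation of the scenario
above. It is only a sketch, not how Cassandra is implemented: the replicas are
plain dicts, the timestamps are abstract ticks, and every name in it
(TOMBSTONE, write, compact, repair, simulate) is made up for illustration.

```python
# Toy model only: each replica is a dict mapping key -> (timestamp, value).
TOMBSTONE = object()  # hypothetical marker standing in for a delete


def write(replicas, key, ts, value):
    """Apply a write (or a delete, if value is TOMBSTONE) to the replicas that are up."""
    for r in replicas:
        r[key] = (ts, value)


def compact(replica, now, gc_grace):
    """Drop tombstones older than gc_grace, as compaction eventually does."""
    for key, (ts, value) in list(replica.items()):
        if value is TOMBSTONE and now - ts > gc_grace:
            del replica[key]


def repair(replicas):
    """Give every replica the newest (timestamp, value) cell seen for each key."""
    keys = set().union(*replicas)
    for key in keys:
        newest = max((r[key] for r in replicas if key in r),
                     key=lambda cell: cell[0])
        for r in replicas:
            r[key] = newest


def simulate(repair_before_gc_grace):
    gc_grace = 10
    a, b, c = {}, {}, {}
    write([a, b, c], "row", ts=1, value="data")
    # Replica c is down; the delete still succeeds at QUORUM on a and b.
    write([a, b], "row", ts=2, value=TOMBSTONE)
    if repair_before_gc_grace:
        repair([a, b, c])  # c receives the tombstone in time
    for r in (a, b, c):
        compact(r, now=20, gc_grace=gc_grace)  # gc_grace has elapsed
    repair([a, b, c])  # a repair that runs only after gc_grace
    return [r.get("row") for r in (a, b, c)]


print(simulate(repair_before_gc_grace=True))   # [None, None, None]
print(simulate(repair_before_gc_grace=False))  # [(1, 'data'), (1, 'data'), (1, 'data')]
```

In the second run, replicas a and b have already purged the tombstone, so the
late repair treats c's old cell as the newest data and copies it back to
everyone. That is the "zombie" row.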

On Wed, Sep 26, 2012 at 3:26 AM, Thomas Stets <th...@gmail.com> wrote:

> The Cassandra Operations page (http://wiki.apache.org/cassandra/Operations) says:
>
> > Unless your application performs no deletes, it is vital that production
> clusters run nodetool repair periodically on all nodes in the cluster.
> The hard requirement for repair frequency is the value used for
> GCGraceSeconds. Running nodetool repair often enough to guarantee that all
> nodes have performed a repair in a given period GCGraceSeconds long,
> ensures that deletes are not "forgotten" in the cluster.
>
> Is it really that common for deletes to be forgotten, or is it just a
> precaution against an unlikely-but-hard-to-fix problem?
>
>   regards, Thomas
>
>

-- 
Tyler Hobbs
DataStax <http://datastax.com/>