Posted to user@cassandra.apache.org by Samuru Jackson <sa...@googlemail.com> on 2010/07/12 17:17:07 UTC

Question regarding consistency and deletion

Hi,

I'm fairly new to Cassandra and started to set up a small cluster for
playing around and evaluating it for my potential purposes.

As far as I understand I can't remove whole rows - instead the columns
of a deleted row are removed and a client can decide based on the
row's column count if it treats a part of a returned slice as deleted
or not. Those empty rows are referred to as Tombstones in Cassandra's
terminology - right?

Is there any way to force the sync/garbage collection of the deletion
of such empty rows?

Reading the mailing list, this behaviour is related to the weak
consistency of Cassandra. What I don't understand is, why is it
possible to remove the columns of a row, but not the whole row? Could
you give me some further reading on this topic?

Thanks!

SJ

Re: Question regarding consistency and deletion

Posted by Benjamin Black <b...@b3k.us>.
On Tue, Jul 13, 2010 at 5:47 AM, Samuru Jackson
<sa...@googlemail.com> wrote:
> Thanks for the links.
>
> Actually it is pretty easy to catch those tombstoned keys on the
> client side. However, in certain applications it can generate some
> additional overhead on the network.
>
> I think it would be nice to have a forced garbage collection in the
> API. This would IMHO make it easier to write unit tests.

You can, via JMX.  Given how painful frequent major compactions can
be, I don't see value in putting it in the Thrift API.


b

Re: Question regarding consistency and deletion

Posted by Samuru Jackson <sa...@googlemail.com>.
Thanks for the links.

Actually it is pretty easy to catch those tombstoned keys on the
client side. However, in certain applications it can generate some
additional overhead on the network.
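
A minimal sketch of that client-side check, assuming a range slice comes
back as a list of (row key, columns) pairs. The Row type and the
drop_range_ghosts helper are hypothetical names for illustration, not
part of any Cassandra client API. Note that a row also looks empty if
the slice predicate simply matched none of its columns, so this only
identifies deleted rows when the predicate would match any live row.

```python
from typing import Dict, List, Tuple

# Hypothetical shape of a range-slice result: (row key, {column name: value}).
Row = Tuple[str, Dict[str, bytes]]


def drop_range_ghosts(rows: List[Row]) -> List[Row]:
    """Keep only rows that came back with at least one column."""
    return [(key, columns) for key, columns in rows if columns]


rows = [
    ("user1", {"name": b"alice"}),
    ("user2", {}),  # deleted row: the key is still returned, with no columns
    ("user3", {"name": b"carol"}),
]

live = drop_range_ghosts(rows)
print([key for key, _ in live])
```

The ghost keys still travel over the network before they are dropped,
which is the overhead mentioned above.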

I think it would be nice to have a forced garbage collection in the
API. This would IMHO make it easier to write unit tests.

/SJ

On Mon, Jul 12, 2010 at 6:34 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
> The tombstones are removed after GCGraceSeconds (in storage-conf.xml),
> at the next major compaction
> http://wiki.apache.org/cassandra/MemtableSSTable?highlight=%28tombstones%29
>
> Take a look at http://wiki.apache.org/cassandra/DistributedDeletes  and
> Handling Failure on http://wiki.apache.org/cassandra/Operations
>
> This one explains the internal reason the tombstoned keys are returned
> http://wiki.apache.org/cassandra/FAQ#range_ghosts
>
> You could reduce the GCGraceSeconds. Others may have a better idea how to
> force it.
>
> Aaron

Re: Question regarding consistency and deletion

Posted by Aaron Morton <aa...@thelastpickle.com>.
The tombstones are removed after GCGraceSeconds (in storage-conf.xml), at the next major compaction: http://wiki.apache.org/cassandra/MemtableSSTable?highlight=%28tombstones%29
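
As a rough illustration of that rule (a toy model, not Cassandra's actual compaction code), a tombstone can be dropped during a major compaction only once it is older than GCGraceSeconds. The Tombstone class and both function names below are made up for the sketch; 864000 seconds (10 days) is used as an example grace period.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tombstone:
    key: str
    deleted_at: int  # time of the delete, in seconds since the epoch


def is_purgeable(tombstone: Tombstone, now: int, gc_grace_seconds: int) -> bool:
    """A tombstone may be dropped once it is older than the grace period."""
    return now - tombstone.deleted_at > gc_grace_seconds


def major_compaction(tombstones: List[Tombstone], now: int,
                     gc_grace_seconds: int) -> List[Tombstone]:
    """Return the tombstones that survive a (simulated) major compaction."""
    return [t for t in tombstones if not is_purgeable(t, now, gc_grace_seconds)]


tombstones = [Tombstone("old", deleted_at=0), Tombstone("recent", deleted_at=900_000)]
now = 1_000_000

# With a 10-day grace period only the older tombstone is dropped.
print([t.key for t in major_compaction(tombstones, now, 864_000)])
# With a much shorter grace period both are dropped.
print([t.key for t in major_compaction(tombstones, now, 60)])
```

Lowering the grace period makes tombstones eligible sooner, but nothing is removed until a compaction actually runs.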

Take a look at http://wiki.apache.org/cassandra/DistributedDeletes  and Handling Failure on http://wiki.apache.org/cassandra/Operations

This one explains the internal reason the tombstoned keys are returned http://wiki.apache.org/cassandra/FAQ#range_ghosts

You could reduce the GCGraceSeconds. Others may have a better idea how to force it.

Aaron
