Posted to user@cassandra.apache.org by Jonathan Ellis <jb...@gmail.com> on 2012/04/04 01:10:55 UTC

Re: tombstones problem with 1.0.8

Removing expired columns actually requires two compaction passes: one
to turn the expired column into a tombstone; one to remove the
tombstone after gc_grace_seconds. (See
https://issues.apache.org/jira/browse/CASSANDRA-1537.)
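
The two passes described above can be sketched with a toy model. This is
not Cassandra's compaction code: the Cell class, the compact function,
and the choice to start the tombstone's grace period at the column's
expiry time are assumptions made for illustration. The gc_grace value of
3600 is the one from the original report.

```python
from dataclasses import dataclass
from typing import List, Optional

GC_GRACE_SECONDS = 3600  # value from the original report
DAY = 86400              # one compaction per day, as in the thread


@dataclass
class Cell:
    """Hypothetical stand-in for a column; not a Cassandra class."""
    name: str
    expires_at: Optional[int] = None  # TTL expiry time in seconds, if any
    deleted_at: Optional[int] = None  # set once the cell is a tombstone


def compact(cells: List[Cell], now: int,
            gc_grace: int = GC_GRACE_SECONDS) -> List[Cell]:
    """One toy compaction pass over a list of cells."""
    out = []
    for c in cells:
        if c.deleted_at is not None:
            # Tombstone: dropped only once gc_grace has elapsed.
            if now >= c.deleted_at + gc_grace:
                continue
            out.append(c)
        elif c.expires_at is not None and now >= c.expires_at:
            # Expired column: this pass only converts it to a tombstone.
            # Assumption: the grace period is counted from the expiry time.
            out.append(Cell(c.name, deleted_at=c.expires_at))
        else:
            out.append(c)
    return out


cells = [Cell("c1", expires_at=100), Cell("live")]
pass1 = compact(cells, now=DAY)      # c1 becomes a tombstone
pass2 = compact(pass1, now=2 * DAY)  # the tombstone is dropped
```

In this model the expired column is still present (as a tombstone) after
the first pass, even though gc_grace elapsed long before that pass ran;
only the second pass removes it.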

Perhaps CASSANDRA-2786 was causing things to (erroneously) be cleaned
up early enough that this helped you out in 0.8.2?

On Wed, Mar 21, 2012 at 8:38 PM, Ross Black <ro...@gmail.com> wrote:
> Hi,
>
> We recently moved from 0.8.2 to 1.0.8 and the behaviour seems to have
> changed so that tombstones are now not being deleted.
>
> Our application continually adds and removes columns from Cassandra.  We
> have set a short gc_grace time (3600) since our application would
> automatically delete zombies if they appear.
> Under 0.8.2, the tombstones remained at a relatively constant number.
> Under 1.0.8, the tombstones have been continually increasing so that they
> exceed the size of our real data (at this stage we have over 100G of
> tombstones).
> Even after running a full compact the new compacted SSTable contains a
> massive number of tombstones, many that are several weeks old.
>
> Have I missed some new configuration option to allow deletion of tombstones?
>
> I also noticed that one of the changes between 0.8.2 and 1.0.8 was
> https://issues.apache.org/jira/browse/CASSANDRA-2786 which changed code to
> "avoid dropping tombstones when they might still be needed to shadow data in
> another sstable".
> Could this be having an impact since we continually add and remove columns
> even while a major compact is executing?
>
>
> Thanks,
> Ross
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: tombstones problem with 1.0.8

Posted by Ross Black <ro...@gmail.com>.
Hi Jonathan,

Thanks for your response.

We were running a compaction at least once a day over the keyspace.  The
gc_grace was set to only 1 hour, so from what you said I would expect
tombstones to be deleted after at most 3 days.
When I inspected the data in the SSTables after a compact, some rows
contained millions of tombstones with many having timestamps indicating
they were older than 2 weeks.

We have recently migrated to a new schema design that avoids deleting
columns or rows.
I ran another compact once data was not being added to the new keyspace (it
only ever added new columns, never modified existing or deleted columns).
That compact deleted all of the existing tombstones, reducing our data from
~250G down to ~30G.
I assume there must have been something strange in our keyspace that
prevented tombstones from being deleted just while data was being added.
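
One possible reading, not a confirmed diagnosis, follows from the
CASSANDRA-2786 description quoted earlier in the thread ("avoid dropping
tombstones when they might still be needed to shadow data in another
sstable"). The sketch below is a toy model of that rule only. The
can_purge function, the sstable names, and the representation of an
sstable as a set of row keys are all hypothetical.

```python
from typing import Dict, Set


def can_purge(row_key: str, compacting: Set[str],
              sstables: Dict[str, Set[str]]) -> bool:
    """Toy rule: a tombstone for row_key may be dropped only if no sstable
    outside the current compaction contains that row key."""
    others = (keys for name, keys in sstables.items()
              if name not in compacting)
    return not any(row_key in keys for keys in others)


# "c" stands for an sstable flushed while "a" and "b" are being compacted.
sstables = {
    "a": {"row1", "row2"},
    "b": {"row1"},
    "c": {"row2"},
}
compacting = {"a", "b"}

row1_ok = can_purge("row1", compacting, sstables)  # row1 only in a and b
row2_ok = can_purge("row2", compacting, sstables)  # row2 also in c
```

Under this model, a row that keeps receiving writes always has a copy
outside the compaction, so its tombstones are kept; once writes stop, a
compaction over every sstable can drop them.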

We no longer delete columns, so the issue is no longer critical for us, but
I am still curious about what the issue was and why it occurred, just in
case we start deleting columns again ;-)

Thanks,
Ross


