Posted to user@cassandra.apache.org by Michal Michalski <mi...@opera.com> on 2013/01/02 15:20:06 UTC

Re: Question on TTLs and Tombstones

Yup, I know it was a pretty long mail and it was Christmas time, so I
expected it to go without a reply for a while, but as some time has
passed, I'll try to remind you about my question with a shorter version:

TL;DR version:

WHEN does Cassandra remove expired (because of TTL) data? Which 
operations cause Cassandra to check for TTL and create Tombstones for 
them if needed? It happens during compaction, for sure. How about scrub, 
repair? Others?

Regards,
Michał

On 28.12.2012 09:08, Michal Michalski wrote:
> Hi,
>
> I have a question regarding TTLs and Tombstones with a pretty long
> scenario + solution question. My first, general question is - when does
> Cassandra check the TTL (whether it has expired) and create the Tombstone
> if needed? I know it happens during compaction, but is this the only
> situation? How about checking it on reads? How about the
> "nodetool-based" actions? Scrub? Repair?
>
> The reason for my question is the following scenario - I add the same
> amount of rows to the CF every month. All of them have a TTL of 6 months -
> when I add data from July, data from January should expire. I do NOT
> modify these data later. However, because of SizeTiered compaction and
> large SSTables my old data do not expire in terms of disk usage - they're
> in the biggest/oldest SSTable, which is not going to be compacted any
> time soon.
> I want to get rid of the data I don't need. So my solution is to perform
> a user defined compaction on the single file that contains the oldest
> data (I make an assumption that in my use case it's the biggest / oldest
> SSTable). It works (at least the first compaction - see below), but I
> want to make sure that I'm right and I understand why it happens ;-)
>
> Here's how I understand it works (it's December, my newest data are
> from November, so I want to have nothing older than June):
>
> I have a large SSTable which was compacted in August for the last time
> and it's the oldest SSTable, much larger than the rest, so I can assume
> that it contains:
> (a) some Tombstones for the January data (when it was compacted for the
> last time January was the month to be expired so the Tombstones were
> created) which haven't been removed so far
> (b) some data from February - May which are NOT marked for deletion so
> far because when compaction last occurred they were
> "fresh" enough to stay
> (c) some newer data (June+)
> So I compact it. Tombstones (a) are removed. Expired data (b) are marked
> for deletion by creating Tombstones for them. The rest of data is
> untouched. This reduces the file size by ~10-20%. This is what I checked
> and it worked.
> Then I wait 10 days (gc_grace) and compact it once again. It should
> remove all the Tombstones created during previous compaction, so file
> size should be reduced significantly (let's say it should be like 20% of
> the initial size or so). This is what I wait for.
> Am I right?
>
> How about repair? As compaction is a "per-node" task, I guess I should
> run repair between these two compactions to make sure that Tombstones
> have been transferred to other replicas?
>
> Or maybe - returning to my first question - Cassandra checks TTLs much
> more often (like with every single read?) so they're "spread" among many
> SSTables and they won't get compacted efficiently during compacting the
> oldest SSTable only? Or maybe jobs like scrub check TTLs and create
> Tombstones too? Or repair?
>
> I know that I could check some of these things with new nodetool
> features (like checking % of Tombstones in SSTable), but I run 1.1.1 and
> it's unavailable here. I know that 1.2 (or 1.1.7?) handles Tombstones in
> a better way, but - still - it's not my case unless I upgrade.
>
> Kind regards,
> Michał


Re: Question on TTLs and Tombstones

Posted by Michal Michalski <mi...@opera.com>.
Thanks for your answer. Moreover, the issue you mentioned at the end was
the answer to the question I was going to ask next ;-)

Regards,
Michał

On 02.01.2013 15:42, Sylvain Lebresne wrote:
>> WHEN does Cassandra remove expired (because of TTL) data?
>
> When a compaction reads an expired column, it removes it and replaces it
> with a tombstone (i.e. a deletion marker). So the first compaction after
> the expiration is what actually removes the data, but it won't reclaim all
> the disk space yet because of the tombstone. Said tombstone then follows
> the usual rules for tombstones, i.e. it will be fully removed by
> compaction once gc_grace seconds have elapsed since the tombstone's
> creation. Note that during reads, an expired column is simply ignored by
> the read, but nothing more is done.
>
>> Which operations cause Cassandra to check for TTL and create Tombstones
>> for them if needed?
>
> Only compaction does (for the 'create tombstones' part at least).
> For the rest of the system, an expired column is always handled exactly as
> if it were a tombstone (so reads ignore them, scrub doesn't treat them
> specially and repair doesn't do anything special either). Note that for
> repair this could be a source of inconsistency between nodes; see more
> details at https://issues.apache.org/jira/browse/CASSANDRA-4905.
>
> --
> Sylvain
>


Re: Question on TTLs and Tombstones

Posted by Sylvain Lebresne <sy...@datastax.com>.
> WHEN does Cassandra remove expired (because of TTL) data?

When a compaction reads an expired column, it removes it and replaces it
with a tombstone (i.e. a deletion marker). So the first compaction after
the expiration is what actually removes the data, but it won't reclaim all
the disk space yet because of the tombstone. Said tombstone then follows
the usual rules for tombstones, i.e. it will be fully removed by
compaction once gc_grace seconds have elapsed since the tombstone's
creation. Note that during reads, an expired column is simply ignored by
the read, but nothing more is done.
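
To make the two-step behaviour above concrete, here is a minimal Python
sketch. It is a toy model, not Cassandra code: the Column class and the
compact() and read() functions are hypothetical names invented for this
illustration, and counting gc_grace from the compaction that created the
tombstone is a simplifying assumption that follows the wording above.

```python
from dataclasses import dataclass
from typing import List, Optional

DAY = 24 * 3600
GC_GRACE_SECONDS = 10 * DAY  # 10 days, the value used in this thread


@dataclass
class Column:
    """Toy column: either live data (possibly with a TTL) or a tombstone."""
    name: str
    written_at: int
    ttl: Optional[int] = None           # None means "never expires"
    tombstone_at: Optional[int] = None  # set only for tombstones

    def is_tombstone(self) -> bool:
        return self.tombstone_at is not None

    def is_expired(self, now: int) -> bool:
        return (not self.is_tombstone()
                and self.ttl is not None
                and now >= self.written_at + self.ttl)


def compact(columns: List[Column], now: int,
            gc_grace: int = GC_GRACE_SECONDS) -> List[Column]:
    """Hypothetical single-SSTable compaction following the rules above."""
    result = []
    for col in columns:
        if col.is_tombstone():
            # Tombstones older than gc_grace are dropped for good.
            if now - col.tombstone_at >= gc_grace:
                continue
            result.append(col)
        elif col.is_expired(now):
            # Expired data is replaced by a tombstone (assumption: the
            # grace period starts at this compaction).
            result.append(Column(col.name, col.written_at, tombstone_at=now))
        else:
            result.append(col)
    return result


def read(columns: List[Column], now: int) -> List[str]:
    """Reads skip tombstones and expired columns but change nothing."""
    return [c.name for c in columns
            if not c.is_tombstone() and not c.is_expired(now)]
```

In this model, a column whose TTL has passed is invisible to read() right
away, becomes a tombstone at the first compact() call after expiration, and
disappears only at a later compact() call made at least gc_grace seconds
after that.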

> Which operations cause Cassandra to check for TTL and create Tombstones
> for them if needed?

Only compaction does (for the 'create tombstones' part at least).
For the rest of the system, an expired column is always handled exactly as
if it were a tombstone (so reads ignore them, scrub doesn't treat them
specially and repair doesn't do anything special either). Note that for
repair this could be a source of inconsistency between nodes; see more
details at https://issues.apache.org/jira/browse/CASSANDRA-4905.

--
Sylvain