Posted to user@cassandra.apache.org by Gabriel Giussi <ga...@gmail.com> on 2018/09/17 12:58:07 UTC
Fwd: How default_time_to_live would delete rows without tombstones in Cassandra?
From
https://docs.datastax.com/en/cassandra/3.0/cassandra/dml/dmlAboutDeletes.html
> Cassandra allows you to set a default_time_to_live property for an entire
table. Columns and rows marked with regular TTLs are processed as described
above; but when a record exceeds the table-level TTL, **Cassandra deletes
it immediately, without tombstoning or compaction**.
This is also answered in https://stackoverflow.com/a/50060436/3517383
> If a table has default_time_to_live on it then rows that exceed this
time limit are **deleted immediately without tombstones being written**.
The same is said in a comment on The Last Pickle's post "About Deletes and Tombstones"
(http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html#comment-3949581514)
> Another clue to explore would be to use the TTL as a default value if
that's a good fit. TTLs set at the table level with 'default_time_to_live'
**should not generate any tombstone at all in C*3.0+**. Not tested on my
hand, but I read about this.
I ran the simplest test I could think of, using
`LeveledCompactionStrategy`:

    CREATE KEYSPACE IF NOT EXISTS temp
      WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};

    CREATE TABLE IF NOT EXISTS temp.test_ttl (
      key text,
      value text,
      PRIMARY KEY (key)
    ) WITH compaction = {'class': 'LeveledCompactionStrategy'}
      AND default_time_to_live = 180;
1. `INSERT INTO temp.test_ttl (key,value) VALUES ('k1','v1');`
2. `nodetool flush temp`
3. `sstabledump mc-1-big-Data.db`
   [image: cassandra0.png]
4. Wait for 180 seconds (the `default_time_to_live`).
5. `sstabledump mc-1-big-Data.db`
   [image: cassandra1.png]
   The tombstone has not been created yet.
6. `nodetool compact temp`
7. `sstabledump mc-2-big-Data.db`
   [image: cassandra2.png]
   The **tombstone is created** (and not dropped by the compaction, because
   `gc_grace_seconds` has not elapsed).
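For anyone who wants to repeat the steps above, they can be scripted roughly as follows. This is only a sketch under several assumptions: the `reproduce_ttl_test` function name and its `data_dir` argument are made up for illustration, the SSTable file names assume a fresh table as in the test above, and `cqlsh`, `nodetool` and `sstabledump` are assumed to be on the PATH of a disposable single-node install where the keyspace and table already exist.

```shell
# Sketch of the reproduction steps; intended only for a throwaway test node.
TTL_SECONDS=180   # must match default_time_to_live in the table definition

# Hypothetical helper. Argument: the directory holding the table's SSTable
# files (an assumption; the original message does not give the path).
reproduce_ttl_test() {
  data_dir=$1

  # Steps 1-3: write one row, flush it to disk, dump the first SSTable.
  cqlsh -e "INSERT INTO temp.test_ttl (key, value) VALUES ('k1', 'v1');" || return 1
  nodetool flush temp || return 1
  sstabledump "$data_dir/mc-1-big-Data.db" || return 1

  # Steps 4-5: wait for the default TTL to pass, then dump the same file again.
  sleep "$TTL_SECONDS"
  sstabledump "$data_dir/mc-1-big-Data.db" || return 1

  # Steps 6-7: force a major compaction and dump the resulting SSTable.
  nodetool compact temp || return 1
  sstabledump "$data_dir/mc-2-big-Data.db"
}
```

Nothing runs until the function is called, for example `reproduce_ttl_test /path/to/temp/test_ttl-directory`, where the path is whatever data directory the node actually uses.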
The test was performed with Apache Cassandra 3.0.13.
From this example I conclude that it isn't true that `default_time_to_live`
avoids tombstones, at least in version 3.0.13.
However, this is a very simple test, and I'm forcing a major compaction with
`nodetool compact`, so I may not be recreating the scenario where the
default_time_to_live magic comes into play.
But how would C* delete without tombstones? Why should this be a different
scenario from using a TTL per insert?
I have also asked this on Stack Overflow:
<https://stackoverflow.com/questions/52282517/how-default-time-to-live-would-delete-rows-without-tombstones-in-cassandra>
Re: How default_time_to_live would delete rows without tombstones in Cassandra?
Posted by Gabriel Giussi <ga...@gmail.com>.
Hello Alain,
thanks for clarifying this topic. You did warn that this was something
still to be explored, so there is nothing to apologize for.
I've asked this on Stack Overflow too (
https://stackoverflow.com/q/52282517/3517383), so if you want to answer
there I will mark your answer as the accepted one; if not, I will reference
this message from the mailing list.
Your posts on The Last Pickle are really great, BTW.
Cheers.
On Thu, Sep 27, 2018 at 13:48, Alain RODRIGUEZ (<ar...@gmail.com>) wrote:
> Hello Gabriel,
>
> Sorry for not answering earlier. I should have, given that I contributed
> to spreading this wrong idea. I will also try to edit my comment on the post.
> I was fooled by the piece of documentation you mentioned when answering
> this question on our blog. I probably answered too quickly, even though I
> described it as a thing 'to explore' and even said that I had not tried it
> explicitly.
>
>> Another clue to explore would be to use the TTL as a default value if
>> that's a good fit. TTLs set at the table level with
>> 'default_time_to_live' **should not generate any tombstone at all in
>> C*3.0+**. Not tested on my hand, but I read about this.
>
> So my sentence above is wrong. Basically, the default can be overridden
> by a TTL set at the query level, and I do not see how Cassandra could
> handle this without tombstones.
>
> I spent time on the post and it was reviewed. I believe it is reliable.
> The questions, on the other hand, are answered by me alone, and my answers
> only reflect my opinion at the moment I am asked. Sometimes I find enough
> time and interest to dig into a topic, sometimes a bit less. So this is
> fully on me; my apologies for the inaccuracy. I must say I am always
> afraid, when writing publicly and sharing information, of making this kind
> of mistake and misleading people. I hope the impact of this read was still
> positive for you overall.
>
>> From this example I conclude that it isn't true that
>> `default_time_to_live` avoids tombstones, at least in version 3.0.13.
>
> Also, I am glad to see that you did not simply believe me or the DataStax
> documentation but tried it yourself. This is definitely the right approach.
>
>> But how would C* delete without tombstones? Why should this be a
>> different scenario from using a TTL per insert?
>
> Yes, exactly this.
>
> C*heers.
> -----------------------
> Alain Rodriguez - @arodream - alain@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
Re: How default_time_to_live would delete rows without tombstones in Cassandra?
Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hello Gabriel,
Sorry for not answering earlier. I should have, given that I contributed
to spreading this wrong idea. I will also try to edit my comment on the post.
I was fooled by the piece of documentation you mentioned when answering
this question on our blog. I probably answered too quickly, even though I
described it as a thing 'to explore' and even said that I had not tried it
explicitly.
> Another clue to explore would be to use the TTL as a default value if
> that's a good fit. TTLs set at the table level with
> 'default_time_to_live' **should not generate any tombstone at all in
> C*3.0+**. Not tested on my hand, but I read about this.
So my sentence above is wrong. Basically, the default can be overridden by
a TTL set at the query level, and I do not see how Cassandra could handle
this without tombstones.
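To illustrate the point about the query-level TTL, here is a small sketch that assumes the `temp.test_ttl` table from the original message. The `ttl_override_cql` helper and the key names are hypothetical; the helper only prints the statements, so they can be handed to `cqlsh -e` on a test node.

```shell
# Hypothetical helper: prints CQL that writes one row relying on the table
# default (180 s in the test table) and one row with an explicit per-query
# TTL, then reads back the remaining TTL of each value.
ttl_override_cql() {
  cat <<'CQL'
INSERT INTO temp.test_ttl (key, value) VALUES ('k_default', 'v1');
INSERT INTO temp.test_ttl (key, value) VALUES ('k_override', 'v2') USING TTL 60;
SELECT key, TTL(value) FROM temp.test_ttl;
CQL
}

# On a test node: cqlsh -e "$(ttl_override_cql)"
```

The `SELECT` should report a remaining TTL of at most 180 seconds for the first row and at most 60 for the second, so both rows carry a TTL of their own whether or not it came from the table default.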
I spent time on the post and it was reviewed. I believe it is reliable. The
questions, on the other hand, are answered by me alone, and my answers only
reflect my opinion at the moment I am asked. Sometimes I find enough time
and interest to dig into a topic, sometimes a bit less. So this is fully on
me; my apologies for the inaccuracy. I must say I am always afraid, when
writing publicly and sharing information, of making this kind of mistake
and misleading people. I hope the impact of this read was still positive
for you overall.
> From this example I conclude that it isn't true that
> `default_time_to_live` avoids tombstones, at least in version 3.0.13.
Also, I am glad to see that you did not simply believe me or the DataStax
documentation but tried it yourself. This is definitely the right approach.
> But how would C* delete without tombstones? Why should this be a
> different scenario from using a TTL per insert?
Yes, exactly this.
C*heers.
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France / Spain
The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com