Posted to user@cassandra.apache.org by Anishek Agarwal <an...@gmail.com> on 2015/06/24 08:41:52 UTC

DTCS - nodetool repair - TTL

Hello all,

We are running C* version 2.0.15 on 5 nodes with RF=3. We are using DTCS,
and every insert has a TTL of 30 days. We have no deletes and just one CF.
When I run nodetool repair on a node, I notice a lot of extra SSTables
being created. I think this is because repair is trying to reconcile the
correct values across the different nodes. What I am trying to figure out
now is how this will affect performance once the TTL is reached for rows.
As far as I understood from the Spotify DTCS posts
<https://labs.spotify.com/tag/dtcs/>, DTCS will drop a whole SSTable once
the TTL is reached, because it compacts data inserted around the same time
into the same SSTable. Now, when repair happens, we get these new SSTables
which hold data from earlier in the timeline and hence will have
tombstones alive for some time.

For example, if the machine has been up for 2 weeks and I run repair now
for the first time, the new SSTables might contain data from anywhere in
the previous weeks. So even though the SSTables created during week 1 will
get dropped around the start of the 5th week, because of repair there will
be additional SSTables which hold tombstones until they reach their
eventual drop state a few weeks later.
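
To make the timeline concrete, here is a rough sketch in Python of the
model I have in mind. It is not how Cassandra actually decides to drop an
SSTable: it assumes an SSTable can only be dropped whole once its newest
write has passed its TTL, and it ignores gc_grace_seconds and any checks
against overlapping SSTables. The SSTable class, the day numbers and the
assumption that a repair-created SSTable spans both previous weeks are
all made up for illustration.

```python
from dataclasses import dataclass

TTL_DAYS = 30  # TTL applied on every insert, as described above


@dataclass
class SSTable:
    """Hypothetical stand-in for an SSTable, tracked by write day only."""
    name: str
    oldest_write_day: int  # day of the oldest write it contains
    newest_write_day: int  # day of the newest write it contains


def first_expiry_day(sstable: SSTable, ttl_days: int = TTL_DAYS) -> int:
    """Day on which the oldest data in the SSTable passes its TTL."""
    return sstable.oldest_write_day + ttl_days


def fully_expired_day(sstable: SSTable, ttl_days: int = TTL_DAYS) -> int:
    """Day on which the newest data in the SSTable passes its TTL.

    Simplifying assumption: only from this day on can the whole
    SSTable be dropped without a compaction.
    """
    return sstable.newest_write_day + ttl_days


def days_holding_expired_data(sstable: SSTable, ttl_days: int = TTL_DAYS) -> int:
    """How long the SSTable carries some expired data before it can go."""
    return fully_expired_day(sstable, ttl_days) - first_expiry_day(sstable, ttl_days)


# SSTable written by DTCS during week 1 (days 0..6).
week1 = SSTable("week1", oldest_write_day=0, newest_write_day=6)

# Assumed SSTable created by the first repair on day 14, mixing data
# from both previous weeks (days 0..13).
repaired = SSTable("repair-streamed", oldest_write_day=0, newest_write_day=13)

for table in (week1, repaired):
    print(
        table.name,
        "first expiry on day", first_expiry_day(table),
        "fully expired on day", fully_expired_day(table),
        "holds expired data for", days_holding_expired_data(table), "days",
    )
```

In this model the repair-created SSTable carries expired week 1 data for
about a week longer than the original week 1 SSTable, which is the effect
I am worried about.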

Is my thinking correct?

Does this mean that we might still have a lot of tombstones lying around,
since compaction is less frequent for older SSTables?

thanks
anishek