Posted to commits@cassandra.apache.org by "Marcus Eriksson (JIRA)" <ji...@apache.org> on 2016/01/25 10:52:39 UTC

[jira] [Resolved] (CASSANDRA-11060) Allow DTCS old SSTable filtering to use min timestamp instead of max

     [ https://issues.apache.org/jira/browse/CASSANDRA-11060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson resolved CASSANDRA-11060.
-----------------------------------------
    Resolution: Won't Fix

max_sstable_age_days has been deprecated in favor of limiting the size of the time windows. For background, see CASSANDRA-10280.

> Allow DTCS old SSTable filtering to use min timestamp instead of max
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-11060
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11060
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sam Bisbee
>              Labels: dtcs
>
> We have observed that, when TTLs are used with DTCS, ongoing compaction means SSTables never or only very rarely become fully expired, so expired data gets "stuck" in large, partially expired SSTables.
> This is because compaction filtering is performed on the max timestamp, which continues to grow as SSTables are compacted together. This means they will never move past max_sstable_age_days. With a sufficiently large TTL, like 30 days, this allows old but not expired SSTables to continue combining and never become fully expired, even with a max_sstable_age_days of 1.
> As a result, we have seen expired data hang around in large SSTables for over six months longer than it should have. This wastes space and creates a disk capacity issue.
> To work around this, we have been running an extended version of DTCS called MTCS in some deployments. The only change is that it uses the min timestamp instead of the max for compaction filtering (filterOldSSTables()). This allows SSTables to move beyond max_sstable_age_days and stop compacting, which means the entire SSTable can become fully expired and be dropped from disk as intended.
> You can see and test MTCS here: https://github.com/threatstack/mtcs
> I am not advocating that MTCS be its own standalone compaction strategy. However, I would like to see a configuration option for DTCS that lets you specify whether old SSTables are filtered on min or max timestamp.
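
The difference between filtering on max and min timestamp described in the issue can be illustrated with a minimal sketch. This is not the Cassandra or MTCS code: the class TimestampFilterSketch, the SSTable holder, the FilterBy enum, and the filterOld() and merge() helpers are all hypothetical names made up for the illustration, and timestamps are plain milliseconds. It only assumes that an SSTable stays a compaction candidate while the chosen timestamp is newer than now minus the maximum age.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TimestampFilterSketch {
    static final long DAY_MS = 86_400_000L;

    // Hypothetical stand-in for an SSTable: only the timestamp range matters here.
    static final class SSTable {
        final String name;
        final long minTimestampMs;
        final long maxTimestampMs;

        SSTable(String name, long minTimestampMs, long maxTimestampMs) {
            this.name = name;
            this.minTimestampMs = minTimestampMs;
            this.maxTimestampMs = maxTimestampMs;
        }
    }

    // Hypothetical option: which timestamp the age filter looks at.
    enum FilterBy { MIN, MAX }

    // Keep only SSTables whose chosen timestamp is within maxAgeMs of nowMs.
    // SSTables that are filtered out are no longer compaction candidates.
    static List<SSTable> filterOld(List<SSTable> sstables, long maxAgeMs,
                                   long nowMs, FilterBy by) {
        long cutoff = nowMs - maxAgeMs;
        List<SSTable> candidates = new ArrayList<>();
        for (SSTable s : sstables) {
            long ts = (by == FilterBy.MIN) ? s.minTimestampMs : s.maxTimestampMs;
            if (ts >= cutoff) {
                candidates.add(s);
            }
        }
        return candidates;
    }

    // Compacting two SSTables yields one covering the union of their ranges,
    // so the max timestamp of the result is the newer of the two inputs.
    static SSTable merge(SSTable a, SSTable b) {
        return new SSTable(a.name + "+" + b.name,
                Math.min(a.minTimestampMs, b.minTimestampMs),
                Math.max(a.maxTimestampMs, b.maxTimestampMs));
    }

    public static void main(String[] args) {
        long now = 10 * DAY_MS;
        long maxAge = 1 * DAY_MS; // analogous to max_sstable_age_days = 1

        SSTable old = new SSTable("old", 0, 8 * DAY_MS);
        SSTable fresh = new SSTable("fresh", 9 * DAY_MS + DAY_MS / 2, now);
        SSTable merged = merge(old, fresh);
        List<SSTable> tables = Arrays.asList(merged);

        // Filtering on max: the merged SSTable still looks recent.
        System.out.println("candidates by MAX: "
                + filterOld(tables, maxAge, now, FilterBy.MAX).size());
        // Filtering on min: the merged SSTable is treated as old.
        System.out.println("candidates by MIN: "
                + filterOld(tables, maxAge, now, FilterBy.MIN).size());
    }
}
```

In this sketch, once an old SSTable has been merged with a recent one, filtering on the max timestamp keeps the result in the candidate set, while filtering on the min timestamp drops it from further compaction.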


