Posted to commits@cassandra.apache.org by "mck (JIRA)" <ji...@apache.org> on 2017/08/24 04:30:02 UTC

[jira] [Commented] (CASSANDRA-10496) Make DTCS/TWCS split partitions based on time during compaction

    [ https://issues.apache.org/jira/browse/CASSANDRA-10496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139536#comment-16139536 ] 

mck commented on CASSANDRA-10496:
---------------------------------

While completely forgetting about this ticket, I did the following (completely untested) experiment: https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.11_twcs-sstable-size

This took a slightly different approach in that it splits compacted sstables based on the 'sstable_size_in_mb' option (which is really only used as a hint), but still sorts partitions over the splits by time. It wouldn't isolate specific "old" data into separate old sstables as the ticket description describes, but it helps in the situation where different TTLs are used within the same TWCS table, and would partially help in the ticket's described use-case. In the same line of thinking, it's worth noting that CASSANDRA-10540 can help in some situations (depending on the extent and distribution of the problem).
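
To illustrate the idea only (this is not the code in the branch above), here is a minimal Python sketch of "split by size, but keep the splits ordered by time". Everything in it is an assumption made for illustration: partitions are modelled as hypothetical (timestamp, size_mb) pairs, the function name is made up, and the size target is treated as a soft limit in the same spirit as 'sstable_size_in_mb' being a hint.

```python
def split_by_size_sorted_by_time(partitions, target_size_mb):
    """Group partitions into output 'sstables' of roughly target_size_mb.

    partitions: list of (timestamp, size_mb) tuples (hypothetical model).
    Partitions are ordered by timestamp first, so each output holds a
    contiguous slice of time rather than a mix of old and new data.
    """
    ordered = sorted(partitions, key=lambda p: p[0])  # oldest first

    outputs = []
    current, current_size = [], 0
    for ts, size_mb in ordered:
        # Start a new output once adding this partition would exceed the
        # target. A single oversized partition still gets its own output,
        # which is why the target is only a hint.
        if current and current_size + size_mb > target_size_mb:
            outputs.append(current)
            current, current_size = [], 0
        current.append((ts, size_mb))
        current_size += size_mb

    if current:
        outputs.append(current)
    return outputs


if __name__ == "__main__":
    parts = [(50, 40), (100, 60), (10, 50), (75, 30)]
    for sstable in split_by_size_sorted_by_time(parts, target_size_mb=100):
        print(sstable)
```

Because the outputs are cut by size rather than at window boundaries, an output can still straddle what the ticket would call "old" and "current" data, which is why this only partially addresses the ticket's use-case.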

> Make DTCS/TWCS split partitions based on time during compaction
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-10496
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10496
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>              Labels: dtcs
>             Fix For: 4.x
>
>
> To avoid getting old data in new time windows with DTCS (or related, like [TWCS|CASSANDRA-9666]), we need to split out old data into its own sstable during compaction.
> My initial idea is to just create two sstables, when we create the compaction task we state the start and end times for the window, and any data older than the window will be put in its own sstable.
> By creating a single sstable with old data, we will incrementally get the windows correct - say we have an sstable with these timestamps:
> {{[100, 99, 98, 97, 75, 50, 10]}}
> and we are compacting in window {{[100, 80]}} - we would create two sstables:
> {{[100, 99, 98, 97]}}, {{[75, 50, 10]}}, and the first window is now 'correct'. The next compaction would compact in window {{[80, 60]}} and create sstables {{[75]}}, {{[50, 10]}} etc.
> We will probably also want to base the windows on the newest data in the sstables so that we actually have older data than the window.
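
The worked example in the ticket description can be reproduced with a small Python sketch. This is an illustration under stated assumptions, not Cassandra code: an sstable is modelled as a plain list of timestamps, the function name is hypothetical, and the lower window bound is assumed to be inclusive (the description does not say which side a timestamp equal to the boundary falls on).

```python
def split_old_data(timestamps, window_oldest):
    """Split one compaction's data into (in_window, older).

    timestamps: list of write timestamps (hypothetical sstable model).
    window_oldest: lower bound of the window being compacted, assumed
    inclusive. Anything strictly older goes to its own 'sstable'.
    """
    in_window = [ts for ts in timestamps if ts >= window_oldest]
    older = [ts for ts in timestamps if ts < window_oldest]
    return in_window, older


if __name__ == "__main__":
    sstable = [100, 99, 98, 97, 75, 50, 10]

    # First compaction, window [100, 80]
    current, old = split_old_data(sstable, window_oldest=80)
    print(current, old)

    # Next compaction, window [80, 60], runs over the leftover old data
    current, old = split_old_data(old, window_oldest=60)
    print(current, old)
```

Each pass leaves one more window holding only its own data, which is the incremental correction the description refers to.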


