Posted to commits@cassandra.apache.org by "Marcus Eriksson (JIRA)" <ji...@apache.org> on 2014/09/25 16:56:35 UTC

[jira] [Comment Edited] (CASSANDRA-7019) Major tombstone compaction

    [ https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147823#comment-14147823 ] 

Marcus Eriksson edited comment on CASSANDRA-7019 at 9/25/14 2:56 PM:
---------------------------------------------------------------------

branch here: https://github.com/krummas/cassandra/commits/marcuse/7019-2

triggered with nodetool compact -o <ks> <cf>

It writes fully compacted partitions, so each partition ends up in a single sstable. My first idea was to put the cells back into the files where they were found (minus tombstones), but it felt wrong not to write out the fully compacted partition when we already have it.

LCS:
* creates an 'optimal' leveling: it takes all existing sstables, compacts them, and fills each level starting from L0 up
** note that (if we have token range 0 -> 1000) L1 will get tokens 0 -> 10, L2 tokens 11 -> 100 and L3 tokens 101 -> 1000. I have not thought much about whether this is good or bad for future compactions.
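The level split above can be sketched like this (a minimal illustration assuming a fanout of 10; the function name and signature are made up for this sketch and are not taken from the patch):

```python
def lcs_level_upper_bounds(range_start, range_end, levels, fanout=10):
    """Illustrative sketch of the leveling split described above: the top
    level covers most of the token range, and each lower level covers one
    fanout-th of the level above it. For tokens 0..1000 and 3 levels this
    yields upper bounds 10, 100, 1000, matching the example in the comment.
    Not Cassandra's actual code."""
    total = range_end - range_start
    # level i (1-based) ends at total / fanout^(levels - i) past the start
    return [range_start + total // fanout ** (levels - i)
            for i in range(1, levels + 1)]
```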

STCS:
* calculates an 'optimal' distribution of sstables; currently it makes them 50%, 25%, 12.5%, ... of the total data size until the smallest sstable would be below 50MB, then puts all the rest in the last sstable. If anyone has a better sstable distribution, please let me know
** the sstables will be non-overlapping; it writes the biggest sstable first and moves on to the rest once 50% of the data is in it
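The size distribution above can be sketched as follows (a minimal illustration; the function name is invented for this sketch and is not from the patch):

```python
def stcs_optimal_sizes(total_bytes, min_size=50 * 1024 * 1024):
    """Illustrative sketch of the size-tiered distribution described above:
    halve what remains repeatedly (giving 50%, 25%, 12.5%, ... of the
    total) until the next sstable would drop below min_size (50MB), then
    fold everything left into the last sstable. Not Cassandra's actual
    code."""
    sizes = []
    remaining = total_bytes
    while remaining // 2 >= min_size:
        sizes.append(remaining // 2)
        remaining -= remaining // 2
    sizes.append(remaining)  # the rest goes into the final sstable
    return sizes
```

For 1GB of data this yields sstables of 512MB, 256MB, 128MB, 64MB and a final 64MB, since the next halving (32MB) would fall below the 50MB floor.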



> Major tombstone compaction
> --------------------------
>
>                 Key: CASSANDRA-7019
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>              Labels: compaction
>
> It should be possible to do a "major" tombstone compaction by including all sstables, but writing them out 1:1, meaning that if you have 10 sstables before, you will have 10 sstables after the compaction with the same data, minus all the expired tombstones.
> We could do this in two ways:
> # a nodetool command that includes _all_ sstables
> # once we detect that an sstable has more than x% (20%?) expired tombstones, we start one of these compactions, and include all overlapping sstables that contain older data.
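Option 2 in the quoted description could be sketched like this (an illustrative outline only; the `SSTable` stand-in and all names are invented for this sketch and are not Cassandra's API):

```python
from dataclasses import dataclass

@dataclass
class SSTable:
    # illustrative stand-in for an sstable's metadata, not SSTableReader
    name: str
    min_token: int
    max_token: int
    min_timestamp: int
    expired_tombstone_ratio: float

def pick_tombstone_compaction(sstables, threshold=0.20):
    """Sketch of option 2 above: once an sstable's estimated ratio of
    expired tombstones exceeds the threshold (the 20% floated in the
    description), compact it together with every overlapping sstable
    that contains older data."""
    for s in sstables:
        if s.expired_tombstone_ratio > threshold:
            overlapping_older = [
                o for o in sstables
                if o is not s
                and o.min_token <= s.max_token  # token ranges overlap
                and o.max_token >= s.min_token
                and o.min_timestamp < s.min_timestamp  # holds older data
            ]
            return [s] + overlapping_older
    return []  # nothing crossed the threshold
```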



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)