Posted to user@cassandra.apache.org by San Luoji <sa...@gmail.com> on 2015/10/10 18:24:03 UTC

A number of questions on LeveledCompactionStrategy

Hi,

I've got a number of questions when looking into LCS in Cassandra. Could
somebody help to enlighten me?

1. Will LCS always strive to clean up L0 sstables? I.e., whenever a new L0
sstable shows up, will it trigger an LCS compaction to promote that sstable
to a higher level? If that's what's happening, which sstable(s) are involved
in this compaction: just the new L0 sstable, or will more sstables be involved?
2. If L[N] still hasn’t reached its threshold (10^N sstables) then no
sstable will be placed in L[N+1], correct?
3. How do the partitions in an sstable move up the levels? Assume I've got
a 500MB L0 sstable in which half of the partitions have new partition keys,
25% match existing partition keys in L1, and the remaining 25% match
existing partition keys in L2. Also assume L0 only has this one sstable,
L1 already has 10 sstables, and L2 already has 100 sstables. When LCS
compacts this 500MB L0 sstable, how does it decide where to move the
content in the higher levels? And what about a situation where this 500MB
L0 sstable has all new partition keys (never inserted into this CQL table
before) and L1 and L2 are already at their thresholds of 10 and 100
respectively?

I’ve read this blog
http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra,
but it seems that the blog only specifies the rule “Within each level,
sstables are guaranteed to be non-overlapping. Each level is ten times as
large as the previous”, and the answers to the above questions are left to
implementation details.

Lastly, I noticed the following statement at the end of that blog: “Leveled
compaction ignores the concurrent_compactors setting. Concurrent compaction
is designed to avoid tiered compaction’s problem of a backlog of small
compaction sets becoming blocked temporarily while the compaction system is
busy with a large set. Leveled compaction does not have this problem, since
all compaction sets are roughly the same size. Leveled compaction does
honor the multithreaded_compaction setting, which allows using one thread
per sstable to speed up compaction.” Is this still accurate? It appears
that multithreaded_compaction has been removed from recent Cassandra versions.

Re: A number of questions on LeveledCompactionStrategy

Posted by Jeff Ferland <jb...@tubularlabs.com>.

> On Oct 10, 2015, at 9:24 AM, San Luoji <sa...@gmail.com> wrote:
> 
> Hi,
> 
> I've got a number of questions when looking into LCS in Cassandra. Could somebody help to enlighten me?
> 
> 1. Will LCS always strive to clean up L0 sstable? i.e. whenever a new L0 sstable shows up, it will trigger LCS compaction to upgrade it into higher level? If that’s what’s happening, what sstable(s) are involved in this compaction? Just this new L0 sstable or more sstables will be involved?

Cassandra will try to dump L0 into L1 once L0 reaches roughly 600 MB in size (4 * the default max sstable size of 160mb); until it reaches that size, nothing is promoted. It will also revert to using size-tiered compaction within L0 if there are more than 32 sstables there, until it can compact L0 back below that number. That is a notable issue when you dump in data faster than it can percolate to higher levels.
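A hedged sketch of that L0 decision logic, using the defaults mentioned above (160 MB sstables, a 4-table trigger, a 32-sstable size-tiered fallback); the function and constant names are illustrative, not Cassandra's actual API:

```python
# Illustrative constants based on the defaults discussed in this thread.
MAX_SSTABLE_SIZE_MB = 160                       # default sstable_size_in_mb
L0_SIZE_TRIGGER_MB = 4 * MAX_SSTABLE_SIZE_MB    # ~640 MB before L0 -> L1
MAX_L0_SSTABLES_FOR_LCS = 32                    # above this, fall back to STCS

def choose_l0_strategy(l0_sstable_sizes_mb):
    """Decide what to do with L0 given the sizes (in MB) of its sstables."""
    if len(l0_sstable_sizes_mb) > MAX_L0_SSTABLES_FOR_LCS:
        return "stcs_in_l0"        # size-tiered until the backlog shrinks
    if sum(l0_sstable_sizes_mb) >= L0_SIZE_TRIGGER_MB:
        return "compact_into_l1"   # promote L0 (plus overlapping L1 tables)
    return "wait"                  # not enough data in L0 yet

print(choose_l0_strategy([150] * 40))            # -> stcs_in_l0
print(choose_l0_strategy([160, 160, 160, 160]))  # -> compact_into_l1
print(choose_l0_strategy([100]))                 # -> wait
```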

This is based on the getCandidatesFor function from https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java

> 2. If L[N] still hasn’t reached its threshold (10^N sstables) then no sstable will be placed in L[N+1], correct?

Generally correct. It’s actually sized in bytes rather than in sstable count, so if you have a lot of expiring tombstones, you may find your sstable count growing above 10^N.
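As a rough sketch of that byte-based threshold, assuming the default 160 MB target sstable size and a level fanout of 10 (both illustrative defaults, not something this thread guarantees):

```python
# Each level N (for N >= 1) is allowed ~10^N * 160 MB of data, i.e. roughly
# 10^N fixed-size sstables when nothing is bloated or expiring.
SSTABLE_SIZE_MB = 160
FANOUT = 10

def max_level_size_mb(level):
    """Approximate byte budget (in MB) for level N under the defaults above."""
    return (FANOUT ** level) * SSTABLE_SIZE_MB

print(max_level_size_mb(1))  # 1600 MB, ~10 sstables
print(max_level_size_mb(2))  # 16000 MB, ~100 sstables
```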

> 3. How do the partitions in a sstable move up the levels? Assuming I’ve got a 500MB L0 sstable and half of the partitions in there have new partition keys, the other 25% match existing partition keys in L1, and 25% match existing partition keys in L2, also assuming L0 only has this one sstable, L1 already has 10 sstables, L2 already has 100 sstables, when LCS works on compacting this 500MB L0 sstable, where does it decide to move the content to higher levels? What about a situation where this 500MB L0 sstable has all new partition keys (never inserted into this CQL table before) and L1 and L2 are already at threshold 10 and 100 respectively?

With a single 500MB sstable in L0, nothing happens yet. Let’s say instead you have 1000MB of L0 data in 4 tables:

The oldest table is selected to compact into L1, along with the L0 tables that overlap it (probably all of them). Those L0 tables and all of the L1 tables they overlap (again, probably all of them) are compacted together, with each output table being flushed to disk at 160MB (or whatever other size you configured). L1 will now probably have 15 tables in it ((160 * 10) + (1000 * .75), divided by 160). The five oldest ones will then compact up into L2.
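That table count can be reproduced with the back-of-the-envelope arithmetic below, assuming a 160 MB target sstable size and, as in the example, ~75% of the 1000 MB of L0 data ending up merged into L1 (both assumptions, not measured values):

```python
import math

sstable_mb = 160
existing_l1_mb = 10 * sstable_mb   # L1 already holds 10 full tables: 1600 MB
promoted_mb = 1000 * 0.75          # ~750 MB of the L0 data lands in L1
total_l1_mb = existing_l1_mb + promoted_mb

# Each output table is flushed at 160 MB, so round up to whole tables.
tables_after = math.ceil(total_l1_mb / sstable_mb)
print(tables_after)  # 15, i.e. five tables over L1's target of 10
```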

Overlap doesn’t really matter in this calculation; the final size after compaction is what counts. If the level isn’t L0 and the size of the level is > 10^N * 160MB, then the oldest tables will compact up. If it’s L0, see above.
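That size-based rule for levels above L0 can be sketched as follows; the function name and the assumption of fixed 160 MB tables are illustrative, not Cassandra's actual code:

```python
SSTABLE_MB = 160

def tables_to_promote(level, table_sizes_mb):
    """How many of the oldest tables should move up from this level (N >= 1)."""
    if level == 0:
        raise ValueError("L0 is handled by the separate rules above")
    limit_mb = (10 ** level) * SSTABLE_MB
    excess_mb = sum(table_sizes_mb) - limit_mb
    if excess_mb <= 0:
        return 0
    # Promote enough of the oldest fixed-size tables to get back under the limit.
    return -(-excess_mb // SSTABLE_MB)   # ceiling division

print(tables_to_promote(1, [160] * 15))  # 5: the five oldest move up to L2
print(tables_to_promote(2, [160] * 50))  # 0: L2 is well under its 16000 MB budget
```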

> 
> I’ve read this blog http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra, but it seems that we only specify this rule “Within each level, sstables are guaranteed to be non-overlapping. Each level is ten times as large as the previous”, and the answers to the above questions are subject to implementation details.
> 
> Lastly, I noticed the following statement at the end of that blog: “Leveled compaction ignores the concurrent_compactors setting. Concurrent compaction is designed to avoid tiered compaction’s problem of a backlog of small compaction sets becoming blocked temporarily while the compaction system is busy with a large set. Leveled compaction does not have this problem, since all compaction sets are roughly the same size. Leveled compaction does honor the multithreaded_compaction setting, which allows using one thread per sstable to speed up compaction.” Is it still accurate? It appears that multithreaded_compaction is removed from the recent Cassandra versions.

Probably not accurate, especially given that multithreaded_compaction has been removed. I don’t see any special-casing in https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/CompactionManager.java either, but I won’t try to speak to it definitively.

-Jeff