Posted to dev@cassandra.apache.org by Peter Schuller <pe...@infidyne.com> on 2010/07/18 11:10:54 UTC
Re: GC Storm

(adding dev@)

> (2) Can we implement multi-thread compaction?

I think this is the only way to scale, or at least some form of
concurrent compaction (whether by division into threads or not) across
multiple size classes. As long as worst-case compactions are
significantly slower than best-case compactions, sstables will
presumably accumulate during long compactions. Since having few
sstables is part of the design goal (or so I have assumed; otherwise
you will seek too much on disk when doing e.g. a range query),
situations that trigger this are a performance problem for readers.

I've been thinking about this for a bit and maybe there could be one
tweakable configuration setting which sets the desired machine
concurrency, that the user tweaks in order to make compaction fast
enough in relation to incoming writes. Regardless of the database
size, this is necessary whenever Cassandra is able to take writes
faster than a CPU-bound compaction thread is able to process them.
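
As a back-of-the-envelope illustration of what that setting would have
to be, here is a rough sketch in Python. Everything in it is an
assumption for illustration: the function name, the idea that each
written byte gets rewritten some fixed number of times by compaction,
and the numbers. None of it is measured from Cassandra.

```python
def required_compaction_threads(write_mb_s, rewrites_per_byte, thread_mb_s):
    """Smallest thread count whose combined throughput keeps up with writes.

    write_mb_s        -- sustained incoming write rate (hypothetical)
    rewrites_per_byte -- how often compaction rewrites each byte (assumed)
    thread_mb_s       -- throughput of one CPU-bound compaction thread (assumed)
    """
    compaction_load = write_mb_s * rewrites_per_byte
    # Integer ceiling division; at least one thread is always needed.
    return max(1, -(-compaction_load // thread_mb_s))


# 40 MB/s of writes, each byte rewritten 3 times, 50 MB/s per thread:
# the load is 120 MB/s, which one or two threads cannot sustain.
print(required_compaction_threads(40, 3, 50))
```

With a single compaction thread, any workload where this comes out
above 1 falls behind no matter how small the database is.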

The other thing would be to have an intelligent compaction scheduler
that does something along the lines of scheduling a compaction thread
for every "level" of compaction (i.e., one for log_m(n) = 1, one for
log_m(n) = 2, etc). To avoid inefficiency and huge spikes in CPU
usage, these compaction threads could stop every now and then
(something reasonable; say every 100 MB compacted or something) and
yield to other compaction threads.
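
Roughly what I have in mind, sketched in Python for brevity. This is
not Cassandra code: the class, the function, and the shared semaphore
that caps how many threads may run at once are all hypothetical, and
the actual merge work is reduced to decrementing a byte counter.

```python
import threading


class LevelCompactor(threading.Thread):
    """One worker per compaction level (hypothetical sketch)."""

    def __init__(self, level, total_bytes, slots, yield_every, log, log_lock):
        super().__init__()
        self.level = level
        self.remaining = total_bytes
        self.slots = slots              # shared semaphore: max runnable workers
        self.yield_every = yield_every  # bytes compacted before yielding
        self.log = log
        self.log_lock = log_lock

    def run(self):
        while self.remaining > 0:
            with self.slots:            # hold a slot for one chunk only
                chunk = min(self.yield_every, self.remaining)
                # Real code would merge `chunk` bytes of sstable data here.
                self.remaining -= chunk
                with self.log_lock:
                    self.log.append((self.level, chunk))
            # Slot released here, so another level may run before we go on.


def run_compactions(work_by_level, concurrency, yield_every):
    """Run one worker per level, at most `concurrency` runnable at a time."""
    slots = threading.Semaphore(concurrency)
    log, log_lock = [], threading.Lock()
    workers = [
        LevelCompactor(level, size, slots, yield_every, log, log_lock)
        for level, size in work_by_level.items()
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return log  # list of (level, bytes compacted in that chunk)


MB = 1024 * 1024
log = run_compactions({1: 250 * MB, 2: 100 * MB}, concurrency=1,
                      yield_every=100 * MB)
print(len(log))  # number of chunks processed across all levels
```

The order in which levels get their turn is left to the semaphore here;
a real scheduler would probably want to prioritize the small levels.
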

This way:

(a) a limited number of threads will be actively runnable at any given
moment, allowing the user to limit the effect background compaction
can have on CPU usage
(b) but on the other hand, it also means that more than one CPU can be
used; whatever is appropriate for the cluster
(c) it should be reasonably easy to implement because each compaction
is just a regular thread doing what it does now already
(d) the synchronization overhead between compaction threads should be
completely irrelevant as long as one selects a high enough
synchronization threshold (100 MB was just a suggestion; might be 1
GB).
(e) log_m(n) will never be large enough for having one thread per
"level" to be a scaling problem
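
To put a number on (e), a small sketch in Python. I am assuming here
that n is the data size measured in units of the smallest sstable and
that m is the number of sstables merged per compaction; the 64 MB
sstable size and m = 4 are illustrative values, not Cassandra
defaults I am vouching for.

```python
def level_count(total_bytes, smallest_sstable_bytes, m):
    """Number of levels needed: ceil(log_m(total / smallest)).

    Computed with integer multiplication to avoid floating-point
    rounding in log().
    """
    levels, size = 0, smallest_sstable_bytes
    while size < total_bytes:
        size *= m       # each level holds sstables m times larger
        levels += 1
    return levels


MB, TB, PB = 2**20, 2**40, 2**50
print(level_count(1 * TB, 64 * MB, 4))  # 7
print(level_count(1 * PB, 64 * MB, 4))  # 12
```

So even going from a terabyte to a petabyte per node only adds a
handful of threads under these assumptions.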

Thoughts?

-- 
/ Peter Schuller