
Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2013/10/03 17:42:49 UTC

[jira] [Commented] (CASSANDRA-6142) Remove multithreaded compaction

    [ https://issues.apache.org/jira/browse/CASSANDRA-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785287#comment-13785287 ] 

Jonathan Ellis commented on CASSANDRA-6142:
-------------------------------------------

I tried parallelizing at the OnDiskAtomIterator level instead (one thread per iterator per partition, buffering into a queue), and for small partitions the performance is ridiculously bad, easily 100x worse than single-threaded mode.
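
For illustration, here is a minimal sketch of the shape of that experiment. It is not the actual patch and uses no Cassandra classes: QueueBufferedSketch, sumDirect, sumViaQueue and smallPartitions are hypothetical names, and plain integers stand in for on-disk atoms. It only shows why a thread and a queue per tiny partition cost far more than the work they wrap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration only; none of these names are Cassandra classes.
final class QueueBufferedSketch {
    // Sentinel marking the end of one partition's stream (cells must never equal it).
    private static final Integer END = Integer.MIN_VALUE;

    // Single-threaded baseline: read every cell of every partition directly.
    static long sumDirect(List<List<Integer>> partitions) {
        long total = 0;
        for (List<Integer> partition : partitions) {
            for (int cell : partition) {
                total += cell;
            }
        }
        return total;
    }

    // One producer thread per partition, buffering cells into a bounded queue
    // that the calling thread drains.
    static long sumViaQueue(List<List<Integer>> partitions) {
        long total = 0;
        try {
            for (List<Integer> partition : partitions) {
                BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(16);
                Thread producer = new Thread(() -> {
                    try {
                        for (Integer cell : partition) {
                            queue.put(cell);
                        }
                        queue.put(END);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                producer.start();
                for (Integer cell = queue.take(); !cell.equals(END); cell = queue.take()) {
                    total += cell;
                }
                producer.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining queue", e);
        }
        return total;
    }

    // Builds `count` partitions, each holding `cellsPerPartition` cells of value 1.
    static List<List<Integer>> smallPartitions(int count, int cellsPerPartition) {
        List<List<Integer>> partitions = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            List<Integer> partition = new ArrayList<>(cellsPerPartition);
            for (int j = 0; j < cellsPerPartition; j++) {
                partition.add(1);
            }
            partitions.add(partition);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = smallPartitions(5000, 3);

        long t0 = System.nanoTime();
        long direct = sumDirect(partitions);
        long t1 = System.nanoTime();
        long queued = sumViaQueue(partitions);
        long t2 = System.nanoTime();

        System.out.println("direct sum=" + direct + " in " + (t1 - t0) / 1000 + " us");
        System.out.println("queued sum=" + queued + " in " + (t2 - t1) / 1000 + " us");
    }
}
```

Both methods return the same total; only the elapsed time differs. The size of the gap depends on the machine and JVM, so the sketch prints its own timings rather than assuming a ratio.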

Any better ideas [~krummas] [~yukim] [~iamaleksey] [~slebresne]? If not, I will post a patch to rip out PCI (ParallelCompactionIterable).

> Remove multithreaded compaction
> -------------------------------
>
>                 Key: CASSANDRA-6142
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6142
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 2.1
>
>
> There is at best a very small sweet spot for multithreaded compaction (ParallelCompactionIterable). For large rows, we stall the pipeline and fall back to a single LCR pass. For small rows, the coordination overhead outweighs the benefits of parallelization (45s to compact 2x1M stress rows with multithreading enabled, vs 35s with it disabled).


