Posted to commits@cassandra.apache.org by "Branimir Lambov (JIRA)" <ji...@apache.org> on 2016/07/04 12:11:10 UTC

[jira] [Commented] (CASSANDRA-12071) Regression in flushing throughput under load after CASSANDRA-6696

    [ https://issues.apache.org/jira/browse/CASSANDRA-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15361225#comment-15361225 ] 

Branimir Lambov commented on CASSANDRA-12071:
---------------------------------------------

+1 to commit.

The growing number of threads in this piece of code is a bit worrying, but improving that needs some non-trivial restructuring (which should come soon enough as part of CASSANDRA-8496).

> Regression in flushing throughput under load after CASSANDRA-6696
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-12071
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12071
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Ariel Weisberg
>            Assignee: Marcus Eriksson
>             Fix For: 3.9
>
>
> The way flushing used to work is that a ColumnFamilyStore could have multiple Memtables flushing at once, and multiple ColumnFamilyStores could flush at the same time. The way it works now, only a single flush of any ColumnFamilyStore & Memtable can run in the C* process at a time, and the number of threads applied to that flush is bounded by the number of disks in a JBOD configuration.
> This works OK most of the time, but occasionally flushing will be a little slower, ingest will outstrip it, and writes will then block on available memory. At that point you see several-second stalls that cause timeouts.
> This is a problem for reasonable configurations that don't use JBOD but have access to a fast disk that can handle some I/O queuing (RAID, SSD).
> You can reproduce this on beefy hardware (12 cores / 24 threads, 64 GB of RAM, SSD) if you unthrottle compaction or set it to something like 64 MB/s, run with 8 compaction threads, and stress with the default write workload and a reasonable number of client threads; I tested with 96.
> It started happening after about 60 gigabytes of data was loaded.
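
To make the stall mechanism in the description above concrete: writers reserve memtable space from a fixed memory budget, and only a completed flush returns that space, so a single serialized flush that falls behind blocks every writer. The following is a minimal, hypothetical Java sketch of that back-pressure shape (class and names invented for illustration; this is not Cassandra's actual code):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Semaphore;

    public class FlushBackpressure
    {
        // Fixed memtable memory budget (illustrative number).
        static final int MEMTABLE_BUDGET_MB = 2048;
        static final Semaphore memtableMemory = new Semaphore(MEMTABLE_BUDGET_MB);

        // One flusher thread approximates the serialized post-6696 behaviour;
        // raising the pool size approximates the pre-6696 model where several
        // memtables/ColumnFamilyStores could flush concurrently.
        static final ExecutorService flusher = Executors.newFixedThreadPool(1);

        static void write(int sizeMb) throws InterruptedException
        {
            // Ingest stalls here, for seconds at a time, when flushing lags.
            memtableMemory.acquire(sizeMb);
            // ... append to the active memtable ...
        }

        static void flush(int memtableSizeMb)
        {
            flusher.execute(() -> {
                // ... write the memtable out as an SSTable ...
                memtableMemory.release(memtableSizeMb); // unblocks waiting writers
            });
        }
    }

With a single flusher, any hiccup in flush throughput immediately propagates to writers as acquire() stalls; with multiple concurrent flushes, bursts of ingest can be absorbed.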

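For reference, the reproduction steps quoted above map onto the standard knobs roughly as follows (a sketch, assuming 3.x-era tooling; the stress duration is an arbitrary illustrative choice):

    # Unthrottle compaction (0 = unlimited), or cap it at 64 MB/s:
    nodetool setcompactionthroughput 0    # or: nodetool setcompactionthroughput 64

    # In cassandra.yaml, run 8 compaction threads:
    concurrent_compactors: 8

    # Default write workload with 96 client threads:
    cassandra-stress write duration=30m -rate threads=96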

