Posted to commits@cassandra.apache.org by "Benedict (JIRA)" <ji...@apache.org> on 2013/11/27 01:26:35 UTC

[jira] [Updated] (CASSANDRA-3578) Multithreaded commitlog

     [ https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict updated CASSANDRA-3578:
--------------------------------

    Attachment: oprate.svg
                latency.svg

A patch for this is available for review at [3578-2|https://github.com/belliottsmith/cassandra/tree/iss-3578-2]

Already discussed:
- Chained headers
- Ensures commits are persistent, using the suggested synchronisation scheme (read/write lock)

Further changes:
- Writes are completely non-blocking unless the CLE is behind or you're using Batch CLE
- On activating a new CLS, we trigger a sync() of the log; so now we sync() every time pollInterval elapses, OR commit_log_segment_size_in_mb has been written, whichever condition is met first after the previous sync. This allows us to stay a little ahead of pollInterval, giving us some breathing room during "brief" spikes in write load in excess of what the disk can handle.
- Once we've completely written a CLS we immediately close/unmap the buffer
- On any drop keyspace or column family command, or on a node drain, we force the recycling of any CLS in use at the time of the call (this addresses CASSANDRA-5911; I included it in this ticket as it was easier to think about both at once)
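
As a rough illustration of the sync trigger described in the list above (pollInterval elapsed OR a segment's worth of data written, whichever comes first), here is a minimal single-threaded sketch. It is not taken from the patch: the class SyncPolicy and all of its fields and methods are hypothetical names, and the segment size is assumed to be passed in bytes.

```java
// Hypothetical sketch of the "time OR bytes written" sync trigger.
// Not the patch's code; single-threaded for brevity.
final class SyncPolicy {
    private final long pollIntervalMillis; // stands in for pollInterval
    private final long segmentSizeBytes;   // stands in for commit_log_segment_size_in_mb, converted to bytes
    private long lastSyncMillis;
    private long bytesSinceSync;

    SyncPolicy(long pollIntervalMillis, long segmentSizeBytes, long nowMillis) {
        this.pollIntervalMillis = pollIntervalMillis;
        this.segmentSizeBytes = segmentSizeBytes;
        this.lastSyncMillis = nowMillis;
    }

    // Record bytes appended to the log since the previous sync.
    void recordWrite(long bytes) {
        bytesSinceSync += bytes;
    }

    // A sync is due when either condition is met, whichever happens first.
    boolean syncDue(long nowMillis) {
        return nowMillis - lastSyncMillis >= pollIntervalMillis
            || bytesSinceSync >= segmentSizeBytes;
    }

    // Reset both conditions once a sync has completed.
    void markSynced(long nowMillis) {
        lastSyncMillis = nowMillis;
        bytesSinceSync = 0;
    }
}
```

In the patch itself the size-based sync is triggered by activating a new CLS rather than by counting bytes; the byte counter above is only an approximation of that behaviour.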

Some implementation detail changes:
- We maintain a separate cfDirty and cfClean set now, which we merge on demand, to avoid allocating/deallocating AtomicIntegers all of the time
- We now reject row mutations larger than HALF the size of a CLS, as opposed to rejecting only those larger than a whole segment - this is to stop burning through lots of CLS if we try to switch to a new segment but then are beaten to allocating the first item in it.
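
The size limit in the last bullet comes down to simple arithmetic, sketched below. The class and method names are hypothetical, and whether a mutation of exactly half a segment is accepted is an assumption made here for illustration, not something stated above.

```java
// Hypothetical sketch of the "half a segment" mutation size limit.
final class MutationSizeCheck {
    // Largest mutation accepted: half the segment size (assumed inclusive).
    static long maxMutationSize(long segmentSizeBytes) {
        return segmentSizeBytes / 2;
    }

    // True if the mutation is small enough to be written to the commit log.
    static boolean accepts(long mutationBytes, long segmentSizeBytes) {
        return mutationBytes <= maxMutationSize(segmentSizeBytes);
    }
}
```

With a 32 MB segment, for example, this would accept mutations up to 16 MB, so that a writer that loses the race for the first slot of a fresh segment can still fit its mutation in what remains.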

Some future work:
- Could reasonably easily have a guaranteed non-blocking CL.add method, which yields a Future if blocking becomes necessary; this could allow us to short-circuit the write-path a little to reduce latency in the majority of cases where blocking doesn't happen
- Compressed CL to improve IO
- Need to improve error handling in CL in general
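
To make the first future-work item more concrete, here is one possible shape for an add method that never makes the caller wait and returns a Future that is already complete in the common case. This is only a sketch under assumptions: NonBlockingLog, add(int) and segmentSwitched() are hypothetical names, CompletableFuture requires Java 8, and synchronized is used purely to keep the example short (a real implementation would presumably avoid locks on this path).

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: add() returns immediately; the Future is only
// incomplete when the caller would otherwise have had to block.
final class NonBlockingLog {
    private static final class Pending {
        final int size;
        final CompletableFuture<Void> done = new CompletableFuture<>();
        Pending(int size) { this.size = size; }
    }

    private final int capacity;                  // bytes per segment
    private int used;                            // bytes allocated in the current segment
    private final Queue<Pending> waiting = new ArrayDeque<>();

    NonBlockingLog(int capacity) { this.capacity = capacity; }

    // Fast path: space is available, so the returned Future is already done.
    // Slow path: queue the request and complete it after a segment switch.
    synchronized CompletableFuture<Void> add(int size) {
        if (waiting.isEmpty() && used + size <= capacity) {
            used += size;
            return CompletableFuture.completedFuture(null);
        }
        Pending p = new Pending(size);
        waiting.add(p);
        return p.done;
    }

    // Called when a fresh segment becomes active: admit queued requests in order.
    synchronized void segmentSwitched() {
        used = 0;
        while (!waiting.isEmpty() && used + waiting.peek().size <= capacity) {
            Pending p = waiting.poll();
            used += p.size;
            p.done.complete(null);
        }
    }
}
```

A caller on the write path could check isDone() on the result and skip any hand-off to another thread when it is true, which is the short-circuit the bullet alludes to.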

Note, Vijay, that I briefly switched to a simpler blocking approach to switching in a new segment, as you suggested you preferred the simpler approach, but I decided to revert to non-blocking, due to potential future dividends with this guarantee.

I've attached two graphs to demonstrate the effect of this patch in a real 4-node cluster. Note that the latency graph has a logarithmic y-axis, so this patch looks to be an order of magnitude better on the worst write latency measured; the variance in latency at the tail end is also lower. That is also why there are fewer measurements: the standard error of the measurements was smaller, so stress finished earlier. There is also a roughly 12% increase in maximum throughput on this particular cluster.

> Multithreaded commitlog
> -----------------------
>
>                 Key: CASSANDRA-3578
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: Vijay
>            Priority: Minor
>              Labels: performance
>         Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, Current-CL.png, Multi-Threded-CL.png, latency.svg, oprate.svg, parallel_commit_log_2.patch
>
>
> Brian Aker pointed out a while ago that allowing multiple threads to modify the commitlog simultaneously (reserving space for each with a CAS first, the way we do in the SlabAllocator.Region.allocate) can improve performance, since you're not bottlenecking on a single thread to do all the copying and CRC computation.
> Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes doable.
> (moved from CASSANDRA-622, which was getting a bit muddled.)
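
The CAS-based space reservation mentioned in the issue description can be sketched roughly as follows. This is an illustration of the general technique, not code from the patch or from SlabAllocator: SegmentSketch and its methods are hypothetical, a heap ByteBuffer stands in for the mmap'd segment, and the record layout (length, payload, CRC) is an assumption.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.zip.CRC32;

// Hypothetical sketch: threads reserve disjoint regions with a CAS,
// then copy and checksum their own data in parallel.
final class SegmentSketch {
    private final ByteBuffer buffer;                        // a real segment would be mmap'd
    private final AtomicInteger nextFree = new AtomicInteger(0);

    SegmentSketch(int capacity) {
        this.buffer = ByteBuffer.allocate(capacity);
    }

    // Returns the start offset of the reserved region, or -1 if it does not fit.
    int reserve(int size) {
        while (true) {
            int start = nextFree.get();
            if (size > buffer.capacity() - start) return -1;
            if (nextFree.compareAndSet(start, start + size)) return start;
            // Another thread won the race; retry from the new offset.
        }
    }

    // Writes [length][payload][crc] into a privately reserved region.
    boolean write(byte[] payload) {
        int start = reserve(payload.length + 8);            // 4 bytes length + 4 bytes CRC
        if (start < 0) return false;
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        ByteBuffer slice = buffer.duplicate();              // independent position, shared content
        slice.position(start);
        slice.putInt(payload.length);
        slice.put(payload);
        slice.putInt((int) crc.getValue());
        return true;
    }

    // Total bytes reserved so far.
    int allocated() {
        return nextFree.get();
    }
}
```

Only the offset bump is contended; the copying and CRC computation happen outside the CAS loop, which is where the parallelism comes from.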



--
This message was sent by Atlassian JIRA
(v6.1#6144)