Posted to commits@cassandra.apache.org by "Stefania (JIRA)" <ji...@apache.org> on 2015/07/01 05:47:05 UTC

[jira] [Commented] (CASSANDRA-7404) Use direct i/o for sequential operations (compaction/streaming)

    [ https://issues.apache.org/jira/browse/CASSANDRA-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609521#comment-14609521 ] 

Stefania commented on CASSANDRA-7404:
-------------------------------------

Sure, have a look at CASSANDRA-8897 when you have time. One thing I forgot to mention, and that is probably an issue, is that we only support pooling for buffers of up to 64k, so I believe we would need to support larger sizes to limit disk seeks. There is also a global bound on the pool size, beyond which we allocate directly (on or off heap, depending on another setting).

The executive summary of how the pool works is this: a global pool keeps a set of macro slabs and slices them into 64k slabs that are handed out to thread-local pools. Each 64k slab is divided into 1k units (the minimum allocation unit), and we track which units are in use with a long value used as a bitmask (one bit per unit). When a thread is done with a slab, it gives it back to the global pool. A thread is allowed to release a buffer allocated by another thread. The global pool itself slices the macro slabs into 64 units, and we have a plan to extend this mechanism to support larger or smaller sizes.
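
To make the bitmask bookkeeping concrete, here is a minimal sketch of a single 64k slab split into 1k units and tracked by one long. It is not the CASSANDRA-8897 code: the class name SlabSketch and all of its methods are hypothetical, the first-fit search is an assumption, and the sketch is not thread-safe, so it does not cover a buffer being released by a thread other than the one that allocated it.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: one 64 KiB slab whose 1 KiB units are tracked by a 64-bit mask. */
final class SlabSketch {
    static final int UNIT_SIZE = 1024;              // minimum allocation unit
    static final int UNITS = 64;                    // one bit per unit fits in a long
    static final int SLAB_SIZE = UNIT_SIZE * UNITS; // 64 KiB

    private final ByteBuffer slab = ByteBuffer.allocateDirect(SLAB_SIZE);
    private long inUse = 0L;                        // bit i set => unit i is allocated

    /** Builds a mask with the lowest `units` bits set (handles the full-slab case). */
    private static long maskFor(int units) {
        return units == UNITS ? -1L : (1L << units) - 1;
    }

    private static int unitsFor(int size) {
        return (size + UNIT_SIZE - 1) / UNIT_SIZE;  // round up to whole units
    }

    /** First-fit search for a contiguous run of free units; returns the start unit or -1. */
    int allocate(int size) {
        if (size <= 0 || size > SLAB_SIZE) return -1;
        int units = unitsFor(size);
        long mask = maskFor(units);
        for (int start = 0; start + units <= UNITS; start++) {
            long shifted = mask << start;
            if ((inUse & shifted) == 0) {           // every unit in the run is free
                inUse |= shifted;
                return start;
            }
        }
        return -1;                                  // caller would fall back to direct allocation
    }

    /** Clears the bits for a run previously returned by allocate(size). */
    void release(int start, int size) {
        inUse &= ~(maskFor(unitsFor(size)) << start);
    }

    /** Returns a view over the bytes of an allocated run. */
    ByteBuffer view(int start, int size) {
        ByteBuffer dup = slab.duplicate();
        dup.position(start * UNIT_SIZE);
        dup.limit(start * UNIT_SIZE + size);
        return dup.slice();
    }

    int freeUnits() {
        return UNITS - Long.bitCount(inUse);
    }
}
```

With one bit per unit, a single long can describe at most 64 units, which is why a request larger than 64k cannot be served from a slab in this sketch and would have to be allocated outside the pool.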

I'll try to complete CASSANDRA-8894 so that we are in a better position to decide on how to manage buffer sizing for direct IO.



> Use direct i/o for sequential operations (compaction/streaming)
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-7404
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7404
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jason Brown
>            Assignee: Ariel Weisberg
>              Labels: performance
>             Fix For: 3.x
>
>
> Investigate using Linux's direct i/o for operations where we read sequentially through a file (repair and bootstrap streaming, compaction reads, and so on). Direct i/o does not go through the kernel page cache, so it should leave the hot cache pages used for live reads unaffected.
> Note: by using direct i/o, we will probably take a performance hit on reading the file we're sequentially scanning through (that is, compactions may get slower), but the goal of this ticket is to limit the impact of these background tasks on the main read/write functionality. Of course, I'll measure any perf hit that is incurred, and see if there are any mechanisms to mitigate it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)