Posted to commits@cassandra.apache.org by "Jason Brown (JIRA)" <ji...@apache.org> on 2013/11/21 07:11:37 UTC

[jira] [Comment Edited] (CASSANDRA-1632) Thread workflow and cpu affinity

    [ https://issues.apache.org/jira/browse/CASSANDRA-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828522#comment-13828522 ] 

Jason Brown edited comment on CASSANDRA-1632 at 11/21/13 6:09 AM:
------------------------------------------------------------------

Attached a patch that switches to batch reads from the internal queues in OTC and PeriodicCommitLogExecutorService. I'll hold off on addressing the ExecutorPools until CASSANDRA-4718.


was (Author: jasobrown):
Attached patch switching to a batch read from internal queues to OTC and PeriodicCommitLogExecutorService. I'll wait on the addressing the ExecutorPools until CASSANDRA-4718.

> Thread workflow and cpu affinity
> --------------------------------
>
>                 Key: CASSANDRA-1632
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1632
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chris Goffinet
>            Assignee: Jason Brown
>              Labels: performance
>         Attachments: 1632_batchRead-v1.diff, threadAff_reads.txt, threadAff_writes.txt
>
>
> Here are some thoughts I wanted to write down. We need to run some serious benchmarks to see the benefits:
> 1) All thread pools for our stages use a shared queue per stage. For some stages we could move to a model where each thread has its own queue, which would reduce lock contention on the shared queue. This model only suits stages whose work has no variance; otherwise you run into thread starvation. One stage where this might work: ROW-MUTATION.
> 2) Set cpu affinity for each thread in each stage. If we can pin threads to specific cores and control the workflow of a message from Thrift down to each stage, we should see fewer L1 cache misses. We would need to build a JNI extension to set cpu affinity, as I could not find anywhere in the JDK where it is exposed.
> 3) Batch the delivery of requests across stage boundaries. Peter Schuller hasn't looked deeply enough into the JDK yet, but he thinks there may be significant improvements to be had there, especially in high-throughput situations, if on each consumption you were to consume everything in the queue rather than imply a synchronization point between each pair of requests.
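
For illustration, the batching idea in item 3 might look something like the sketch below. It is not taken from the attached patch; the class name BatchDrainSketch, the method takeBatch, and the MAX_BATCH cap are all hypothetical. It relies only on the standard java.util.concurrent.BlockingQueue methods take() and drainTo(): the consumer blocks for the first element, then pulls whatever else is already queued in a single call instead of synchronizing once per request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of a batch read from a stage queue; not Cassandra code.
class BatchDrainSketch {
    // Assumed upper bound on how many items one batch may hold.
    static final int MAX_BATCH = 128;

    /**
     * Blocks until at least one item is available, then drains any other
     * items already in the queue (up to MAX_BATCH in total) into batch.
     * Returns the number of items added to batch.
     */
    public static <T> int takeBatch(BlockingQueue<T> queue, List<T> batch)
            throws InterruptedException {
        batch.add(queue.take());                        // wait for the first item
        return 1 + queue.drainTo(batch, MAX_BATCH - 1); // grab the rest without blocking
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 5; i++) {
            queue.put("msg-" + i);
        }

        List<String> batch = new ArrayList<>();
        int n = takeBatch(queue, batch);
        // All five queued messages arrive in one batch, in FIFO order.
        System.out.println(n + " messages in one batch: " + batch);
    }
}
```

Whether this helps in practice depends on how often more than one request is waiting when the consumer wakes up, which is what the benchmarks mentioned above would need to show.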



--
This message was sent by Atlassian JIRA
(v6.1#6144)