Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2014/05/06 10:22:15 UTC

[jira] [Commented] (CASSANDRA-6995) Execute local ONE/LOCAL_ONE reads on request thread instead of dispatching to read stage

    [ https://issues.apache.org/jira/browse/CASSANDRA-6995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990416#comment-13990416 ] 

Sylvain Lebresne commented on CASSANDRA-6995:
---------------------------------------------

So what's the status of this particular issue? Is the patch still pending review, or are we saying "we don't want to do that but we might be doing something else"? In the latter case, let's drop the 'patch available' status.

> Execute local ONE/LOCAL_ONE reads on request thread instead of dispatching to read stage
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6995
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6995
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.0.8
>
>         Attachments: 6995-v1.diff, syncread-stress.txt
>
>
> When performing a read local to a coordinator node, AbstractReadExecutor will create a new SP.LocalReadRunnable and drop it into the read stage for asynchronous execution. If you are using a client that intelligently routes read requests to a node holding the data for a given request, and are using CL.ONE/LOCAL_ONE, enqueuing the SP.LocalReadRunnable and waiting for the context switches (and possible NUMA misses) adds unnecessary latency. We can reduce that latency and improve throughput by skipping the queueing and thread context switching and simply executing the SP.LocalReadRunnable synchronously in the request thread. Testing on a three-node cluster (each node with 32 CPUs and 132 GB RAM) yields a ~10% improvement in throughput and a ~20% speedup on the avg/95/99 percentiles (the 99.9th percentile improved by about 5-10%).
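
For readers unfamiliar with the idea in the description above, here is a minimal Java sketch of the difference between dispatching a local read to a separate stage and running it on the request thread. It is not the attached patch and does not use Cassandra's classes: LocalReadSketch, READ_STAGE, localRead, readViaStage, readInline, and the blockFor parameter are all hypothetical names chosen for illustration. The "read" just reports which thread executed it, so the two paths can be told apart.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Illustrative only: hypothetical stand-ins, not Cassandra's actual classes.
final class LocalReadSketch {
    // Hypothetical stand-in for the read stage: a small pool of daemon threads.
    static final ExecutorService READ_STAGE = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r, "read-stage");
        t.setDaemon(true);
        return t;
    });

    // Hypothetical stand-in for a local read task. It returns the name of the
    // thread that executed it instead of real row data.
    static Supplier<String> localRead(String key) {
        return () -> Thread.currentThread().getName();
    }

    // Asynchronous path: enqueue on the stage, then block the caller until the
    // result is ready (queueing plus a thread hand-off in each direction).
    static String readViaStage(String key) {
        return CompletableFuture.supplyAsync(localRead(key), READ_STAGE).join();
    }

    // Synchronous path: run the task directly on the calling (request) thread.
    static String readInline(String key) {
        return localRead(key).get();
    }

    // Assumed policy for this sketch: go inline only when the data is local
    // and a single replica response is enough (as with ONE/LOCAL_ONE).
    static String read(String key, boolean isLocal, int blockFor) {
        return (isLocal && blockFor == 1) ? readInline(key) : readViaStage(key);
    }

    public static void main(String[] args) {
        System.out.println("local, one replica:  " + read("k1", true, 1));
        System.out.println("local, two replicas: " + read("k1", true, 2));
        System.out.println("remote, one replica: " + read("k1", false, 1));
    }
}
```

In this sketch only the first call runs on the caller's thread; the other two are handed to the pool. The trade-off of the inline path is that a slow read occupies the request thread for its full duration, which is why the sketch restricts it to the single-replica local case.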



--
This message was sent by Atlassian JIRA
(v6.2#6252)