Posted to commits@cassandra.apache.org by "Benedict (JIRA)" <ji...@apache.org> on 2014/12/11 19:08:13 UTC
[jira] [Commented] (CASSANDRA-8457) nio MessagingService

    [ https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242859#comment-14242859 ] 

Benedict commented on CASSANDRA-8457:
-------------------------------------

FTR, I strongly doubt _"context switching"_ is actually as much of a problem as we think, although constraining it is never a bad thing. The big hit we have is _thread signalling_ costs, which is a different but related beast. Certainly the talking point that raised this was about system time spent serving "context switches", which would definitely refer to signalling, not the switching itself.

Now, we do use a BlockingQueue for OutboundTcpConnection, which will incur these costs. However, I strongly suspect the impact will be much lower than predicted, especially as the testing that flagged this was done on small clusters with RF=1, where these threads would not be exercised at all. The costs of going to the network itself are likely to exceed the context switching costs, and they naturally permit messages to accumulate in the queue, reducing the number of signals actually needed.
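
To illustrate the accumulation point, here is a minimal Java sketch of a consumer that blocks for one message and then drains whatever else queued up in the meantime. It is not the actual OutboundTcpConnection code; the class and method names (DrainSketch, takeBatch) are hypothetical, and only BlockingQueue.take and BlockingQueue.drainTo are real JDK calls. While the consumer is busy writing a batch it is not parked, so messages arriving during the write need no wakeup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration only; not Cassandra's implementation.
public class DrainSketch {

    // Block until at least one message is available, then grab up to
    // maxBatch messages in total without blocking again.
    static List<String> takeBatch(BlockingQueue<String> queue, int maxBatch) {
        List<String> batch = new ArrayList<>();
        try {
            // The only point where the consumer may park and later need a signal.
            batch.add(queue.take());
        } catch (InterruptedException e) {
            // Restore the interrupt flag and hand back an empty batch.
            Thread.currentThread().interrupt();
            return batch;
        }
        // Non-blocking: collect messages that accumulated while we were busy.
        queue.drainTo(batch, maxBatch - 1);
        return batch;
    }
}
```

The longer each network write takes, the larger each drained batch tends to be, which is why the number of signals can fall well below the number of messages.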

There are also the negative performance implications we have found with small numbers of connections under NIO to consider, so this change could have significant downsides for the majority of deployed clusters (although if we get batching in the client driver we may see these penalties disappear).

To establish whether there is a benefit to exploit, we could most likely refactor this code to use the SharedExecutorPool, with far less change than a rewrite to NIO/Netty. This would reduce the number of threads in flight to those actually serving work on the OTCs, and would show whether such a positive effect is indeed to be had. This wouldn't affect the ITC, but I am dubious of their contribution. We should probably also actually test whether this is indeed a problem on clusters at scale performing in-memory CL>1 reads.
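
As a rough sketch of the idea (an assumption about the general shape, not the SharedExecutorPool API and not a proposed patch), each peer could keep its own backlog and schedule a drain task on a shared executor only when one is not already pending. All names here (PeerQueueSketch, enqueue, sink) are hypothetical, and a plain java.util.concurrent.Executor stands in for the shared pool.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical illustration only; a stand-in Executor replaces the shared pool.
public class PeerQueueSketch {
    private final Queue<String> backlog = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final Executor sharedPool;
    private final Consumer<String> sink; // e.g. writes the message to the socket

    PeerQueueSketch(Executor sharedPool, Consumer<String> sink) {
        this.sharedPool = sharedPool;
        this.sink = sink;
    }

    void enqueue(String message) {
        backlog.add(message);
        trySchedule();
    }

    // Submit a drain task only if none is pending or running for this peer.
    private void trySchedule() {
        if (scheduled.compareAndSet(false, true))
            sharedPool.execute(this::drain);
    }

    private void drain() {
        try {
            String message;
            while ((message = backlog.poll()) != null)
                sink.accept(message);
        } finally {
            scheduled.set(false);
            // A producer may have added a message after the last poll but
            // before the flag was cleared; reschedule so it is not stranded.
            if (!backlog.isEmpty())
                trySchedule();
        }
    }
}
```

With this shape, idle peers hold no thread at all, and at most one task per peer runs at a time, so per-peer message order is preserved.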


> nio MessagingService
> --------------------
>
>                 Key: CASSANDRA-8457
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Ariel Weisberg
>              Labels: performance
>             Fix For: 3.0
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big contributor to context switching, especially for larger clusters.  Let's look at switching to nio, possibly via Netty.


