Posted to commits@cassandra.apache.org by "Jason Brown (JIRA)" <ji...@apache.org> on 2016/07/08 22:32:11 UTC

[jira] [Commented] (CASSANDRA-8457) nio MessagingService

    [ https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15368604#comment-15368604 ] 

Jason Brown commented on CASSANDRA-8457:
----------------------------------------

Here's the first pass at switching internode messaging to netty.
||8457||
|[branch|https://github.com/jasobrown/cassandra/tree/8457]|
|[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-8457-dtest/]|
|[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-8457-testall/]|

I've tried to preserve as much of the functionality/behavior of the existing implementation as I could, though some aspects were a bit tricky to carry over to the non-blocking IO, netty world. I've also documented the code extensively, and I still want to add more high-level docs on 1) the internode protocol itself, and 2) the use of netty in internode messaging. Hopefully the current state of the documentation helps with understanding and reviewing the changes. Here are some high-level notes as points of departure/interest/discussion:

- I've left the existing {{OutboundTcpConnection}} code largely intact for the short term (read on for more detail), so the new and existing behaviors mostly coexist in the code together (though not at run time)
- There is a yaml property to enable/disable using netty for internode messaging. If disabled, we fall back to the existing {{OutboundTcpConnection}} code. Part of this stems from the fact that streaming uses the same socket infrastructure and handshake as internode messaging, and streaming would be broken without the {{OutboundTcpConnection}} implementation. I am knee-deep in switching streaming over to a non-blocking, netty-based solution, but that is a separate ticket/body of work.
- In order to support non-blocking IO, I've altered the internode messaging protocol so that each message is framed, and the frame contains the message size. This protocol change is what forces these changes to land in a major rev, hence 4.0
- Backward compatibility - We will need to handle the case of a cluster upgrade where some nodes are on the previous version of the protocol (not upgraded) and some are upgraded. The upgraded nodes will still need to behave and operate correctly with the older nodes; that functionality is encapsulated and documented in {{LegacyClientHandler}} (for the receive side) and {{MessageOutHandler}} (for the send side).
- Message coalescing - The existing behaviors in {{CoalescingStrategies}} are predicated on parking the thread to allow outbound messages to arrive (and be coalesced). Parking a thread in a non-blocking/netty context is a bad thing, so I've inverted the behavior of message coalescing a bit. Instead of blocking a thread, I've extended the {{CoalescingStrategies.CoalescingStrategy}} implementations to return a 'time to wait' to let messages arrive for sending. I then schedule a task on the netty scheduler to execute that many nanoseconds in the future, queue up incoming messages, and send them out when the scheduled task executes (this is {{CoalescingMessageOutHandler}}). I've also added callback functions to the {{CoalescingStrategies.CoalescingStrategy}} implementations for the non-blocking paradigm to record updates to the strategy (for recalculation of the time window, etc).
- Message flushing - Currently in {{OutboundTcpConnection}}, we only call flush on the output stream when the backlog is empty (there are no more messages to send to the peer). Unfortunately there's no equivalent API in netty to know whether any messages are waiting in the channel to be sent. The solution I've gone with is a counter shared between the outside of the channel ({{InternodeMessagingConnection#outboundCount}}) and the inside ({{CoalescingMessageOutHandler#outboundCounter}}); when {{CoalescingMessageOutHandler}} sees that the counter is zero, it knows it can explicitly call flush. I'm not entirely thrilled with this approach, and there are some potential race/correctness problems (and complexity!) when reconnections occur, so I'm open to suggestions on how to achieve this functionality.
- I've included support for netty's OpenSSL library. The operator will need to deploy an extra netty jar (http://netty.io/wiki/forked-tomcat-native.html) to get the OpenSSL behavior (I'm not sure if we can or want to include it in our distro). {{SSLFactory}} needed to be refactored a bit to support the OpenSSL functionality.
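To make the framing change above concrete, here is a minimal, self-contained sketch of a length-prefixed frame using plain java.nio (the real pipeline would more likely use netty's {{LengthFieldPrepender}}/{{LengthFieldBasedFrameDecoder}}); the 4-byte size header and the class/method names here are illustrative assumptions, not the patch's actual wire layout:

```java
import java.nio.ByteBuffer;

// Illustrative sketch of a length-prefixed frame; the 4-byte size header
// is an assumption for demonstration, not the patch's exact format.
public class Framing
{
    // prepend the payload size so the receiver can read without blocking
    static ByteBuffer encodeFrame(byte[] payload)
    {
        ByteBuffer frame = ByteBuffer.allocate(4 + payload.length);
        frame.putInt(payload.length);
        frame.put(payload);
        frame.flip();
        return frame;
    }

    // returns null if the buffer does not yet hold a complete frame,
    // mirroring how a netty frame decoder waits for more bytes
    static byte[] decodeFrame(ByteBuffer in)
    {
        if (in.remaining() < 4)
            return null;
        in.mark();
        int size = in.getInt();
        if (in.remaining() < size)
        {
            in.reset(); // incomplete frame; wait for more data
            return null;
        }
        byte[] payload = new byte[size];
        in.get(payload);
        return payload;
    }
}
```

The key property for non-blocking IO is that the receive side never has to block mid-message: either a whole frame is available (size header plus payload) and it is decoded, or nothing is consumed.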

I'll be doing some more extensive testing next week (including a more thorough exploration of the backward compatibility).
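For reviewers, here's a rough, hypothetical sketch of the inverted coalescing described above, using a plain {{ScheduledExecutorService}} in place of netty's event-loop scheduler; all names ({{CoalescingSender}}, {{coalesceWindowNanos}}, {{drain}}) are illustrative, and like the real handler it has a small scheduling race window rather than being a definitive implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Sketch only: instead of parking a thread to coalesce, enqueue each message
// and schedule a drain task for the strategy's suggested wait window.
public class CoalescingSender
{
    private final Queue<String> backlog = new ConcurrentLinkedQueue<>();
    private final ScheduledExecutorService scheduler;
    private final Consumer<List<String>> flush;
    private final long coalesceWindowNanos; // 'time to wait' from the strategy
    private volatile boolean drainScheduled;

    CoalescingSender(ScheduledExecutorService scheduler, long coalesceWindowNanos, Consumer<List<String>> flush)
    {
        this.scheduler = scheduler;
        this.coalesceWindowNanos = coalesceWindowNanos;
        this.flush = flush;
    }

    // enqueue a message; the first message in an empty window schedules the drain
    void send(String message)
    {
        backlog.add(message);
        if (!drainScheduled)
        {
            drainScheduled = true;
            scheduler.schedule(this::drain, coalesceWindowNanos, TimeUnit.NANOSECONDS);
        }
    }

    // runs on the scheduler thread: flush everything that arrived in the window
    private void drain()
    {
        List<String> batch = new ArrayList<>();
        String m;
        while ((m = backlog.poll()) != null)
            batch.add(m);
        drainScheduled = false; // note: racy vs a concurrent send(), as in the ticket's caveat
        flush.accept(batch);
    }
}
```

The point of the inversion is that no thread ever parks: the sending thread returns immediately, and the batch goes out when the scheduled task fires, which maps directly onto netty's single-threaded event loop.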

> nio MessagingService
> --------------------
>
>                 Key: CASSANDRA-8457
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>            Priority: Minor
>              Labels: netty, performance
>             Fix For: 4.x
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big contributor to context switching, especially for larger clusters.  Let's look at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)