You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Benedict (JIRA)" <ji...@apache.org> on 2019/06/10 16:14:00 UTC

[jira] [Comment Edited] (CASSANDRA-15066) Improvements to Internode Messaging

    [ https://issues.apache.org/jira/browse/CASSANDRA-15066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860113#comment-16860113 ] 

Benedict edited comment on CASSANDRA-15066 at 6/10/19 4:13 PM:
---------------------------------------------------------------

The patch is now ready to commit from my point of view, with many small fixes and clarity improvements.  On my part, the main work since our last discussion has been to introduce {{ConnectionBurnTest}} and its corresponding {{Verifier}}, that together exercise and verify a wide variety of connection behaviours.  This isn’t completely exhaustive, but it is _close_, and helped shake out many of the bugs we have fixed.  [~ifesdjeen] has also been offering excellent review feedback and bug reports, that have been incorporated into both the stylistic improvements (particularly the {{OutboundConnection}} state machine) and fixes below.

h3. Improvements
* {{OutboundConnection}}: introduce simple state machine for connection status
* Hooks for verification, including {{OutboundMessageCallbacks}} and {{OutboundDebugCallbacks}}
* Log canonical and actual addresses on connection
* Simplify/clarify {{OutboundConnectionSettings}} and {{OutboundConnection.template}} semantics
* Abstract {{MonotonicClock}} source we use, so that users can opt to pay the cost of {{System.nanoTime}} and avoid the trade-offs inherent in using a clock with {{~2ms+scheduler error}} granularity if they prefer.

h3. Fixes
* Event Loop delivery schedules self for later execution if work remains after completing a batch, instead of spinning until link saturated
* Large message delivery did not handle exception recovery correctly
    * Could have attempted to continue delivery after invalidating connection
    * Could have failed to re-schedule after invalidation
* Stop using autoRead to prevent queued reads
* Accurately handle error bounds in when using approximate time 
* Ensure impossible to overflow stack in OMQ.lock by moving to iterative vs recursive evaluation
* Data race between large and event loop thread when closing connection
* Do not immediately propagate connection close in decoder; defer until all received messages have been processed
* {{FrameDecoderLegacy}} should not {{discard}} itself on encountering a corrupt frame; waits for IMH to do so
* {{OutboundConnectionInitiator}} cancellation would leak connection if cancelled after established but before handshake completed
* LARGE_MESSAGE_THRESHOLD includes frame overheads

h3. Unrelated Fixes
* CASSANDRA-10726 introduced an unnecessary digest calculation for CL {{ONE}} reads



was (Author: benedict):
The patch is now ready to commit from my point of view, with many small fixes and clarity improvements.  On my part, the main work since our last discussion has been to introduce {{ConnectionBurnTest}} and its corresponding {{Verifier}}, that together exercise and verify a wide variety of connection behaviours.  This isn’t completely exhaustive, but it is _close_, and helped shake out many of the bugs we have fixed.  [~ifesdjeen] has also been offering excellent review feedback and bug reports, that have been incorporated into both the stylistic improvements (particularly the OutboundConnection state machine) and fixes below.

h3. Improvements
* {{OutboundConnection}}: introduce simple state machine for connection status
* Hooks for verification, including {{OutboundMessageCallbacks}} and {{OutboundDebugCallbacks}}
* Log canonical and actual addresses on connection
* Simplify/clarify {{OutboundConnectionSettings}} and {{OutboundConnection.template}} semantics
* Abstract {{MonotonicClock}} source we use, so that users can opt to pay the cost of {{System.nanoTime}} and avoid the trade-offs inherent in using a clock with {{~2ms+scheduler error}} granularity if they prefer.

h3. Fixes
* Event Loop delivery schedules self for later execution if work remains after completing a batch, instead of spinning until link saturated
* Large message delivery did not handle exception recovery correctly
    * Could have attempted to continue delivery after invalidating connection
    * Could have failed to re-schedule after invalidation
* Stop using autoRead to prevent queued reads
* Accurately handle error bounds in when using approximate time 
* Ensure impossible to overflow stack in OMQ.lock by moving to iterative vs recursive evaluation
* Data race between large and event loop thread when closing connection
* Do not immediately propagate connection close in decoder; defer until all received messages have been processed
* {{FrameDecoderLegacy}} should not {{discard}} itself on encountering a corrupt frame; waits for IMH to do so
* {{OutboundConnectionInitiator}} cancellation would leak connection if cancelled after established but before handshake completed
* LARGE_MESSAGE_THRESHOLD includes frame overheads

h3. Unrelated Fixes
* CASSANDRA-10726 introduced an unnecessary digest calculation for CL {{ONE}} reads


> Improvements to Internode Messaging
> -----------------------------------
>
>                 Key: CASSANDRA-15066
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15066
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Messaging/Internode
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: High
>             Fix For: 4.0
>
>         Attachments: 20k_backfill.png, 60k_RPS.png, 60k_RPS_CPU_bottleneck.png, backfill_cass_perf_ft_msg_tst.svg, baseline_patch_vs_30x.png, increasing_reads_latency.png, many_reads_cass_perf_ft_msg_tst.svg
>
>
> CASSANDRA-8457 introduced asynchronous networking to internode messaging, but there have been several follow-up endeavours to improve some semantic issues.  CASSANDRA-14503 and CASSANDRA-13630 are the latest such efforts, and were combined some months ago into a single overarching refactor of the original work, to address some of the issues that have been discovered.  Given the criticality of this work to the project, we wanted to bring some more eyes to bear to ensure the release goes ahead smoothly.  In doing so, we uncovered a number of issues with messaging, some of which long standing, that we felt needed to be addressed.  This patch widens the scope of CASSANDRA-14503 and CASSANDRA-13630 in an effort to close the book on the messaging service, at least for the foreseeable future.
> The patch includes a number of clarifying refactors that touch outside of the {{net.async}} package, and a number of semantic changes to the {{net.async}} packages itself.  We believe it clarifies the intent and behaviour of the code while improving system stability, which we will outline in comments below.
> https://github.com/belliottsmith/cassandra/tree/messaging-improvements



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org