Posted to commits@cassandra.apache.org by "Aleksei Zotov (Jira)" <ji...@apache.org> on 2021/09/20 16:33:00 UTC

[jira] [Comment Edited] (CASSANDRA-14930) decommission may cause timeout because messaging backlog is cleared

    [ https://issues.apache.org/jira/browse/CASSANDRA-14930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17417719#comment-17417719 ] 

Aleksei Zotov edited comment on CASSANDRA-14930 at 9/20/21, 4:32 PM:
---------------------------------------------------------------------

Sorry guys, I checked the wrong commit! I have deleted my previous comment to avoid any confusion.

However, I have some comments on the actual PR:
 # It looks like a new property ({{cassandra.messaging_destroy_delay_in_ms}}) is going to be introduced. I think it needs to be described in {{cassandra.yaml}} and other documentation. Moreover, as far as I understand, properties should be accessed through {{DatabaseDescriptor}}, not read directly as system properties.
 # 
{code:java}
if (delay <= 0) // opt out{code}
Can this condition actually be true? If so, could you please describe a scenario? We either take the max of positive values or read the value from the property (which should be validated as non-negative or positive).
 # Here I do not really know the flow well, but just to confirm: the node must *not* be among the unreachable endpoints for the connection to be closed?
{code:java}
if (!liveEndpoints.contains(endpoint) && !unreachableEndpoints.containsKey(endpoint)){code}
 # Should we add corresponding tests?
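
To make the question in #3 concrete, here is a minimal sketch of how that condition evaluates. Only the boolean expression comes from the PR; the class name, the method name, and the use of {{String}} endpoints with a {{Map<String, Long>}} are assumptions made for illustration and are not the actual Cassandra types.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Cassandra: isolates the condition quoted in #3.
final class EndpointCloseCheck {
    // True only when the endpoint is neither live nor tracked as unreachable.
    static boolean shouldClose(String endpoint,
                               Set<String> liveEndpoints,
                               Map<String, Long> unreachableEndpoints) {
        return !liveEndpoints.contains(endpoint)
            && !unreachableEndpoints.containsKey(endpoint);
    }

    public static void main(String[] args) {
        Set<String> live = Collections.singleton("10.0.0.1");
        Map<String, Long> unreachable = Collections.singletonMap("10.0.0.2", 0L);

        System.out.println(shouldClose("10.0.0.1", live, unreachable)); // live endpoint
        System.out.println(shouldClose("10.0.0.2", live, unreachable)); // unreachable endpoint
        System.out.println(shouldClose("10.0.0.3", live, unreachable)); // unknown to both
    }
}
```

Under these assumptions, an endpoint that is still listed as unreachable keeps its connection, which is the behaviour I would like to have confirmed.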

 

Given that this is just a bug fix for old versions, #1 and #4 are probably not critical (but I'd still remove the property unless there is a justification for having it). #2 does not seem to affect anything either, even though the check might not be required.
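
If the property does stay, the validation mentioned in #2 could look roughly like the sketch below. The property name is the one from the PR; the class name, the method names, and the default of 1000 ms are hypothetical, and this reads the system property directly rather than going through {{DatabaseDescriptor}}.

```java
// Hypothetical helper, not part of Cassandra: validates the delay once, at read time.
final class MessagingDestroyDelay {
    // Property name as it appears in the PR under review.
    static final String PROPERTY = "cassandra.messaging_destroy_delay_in_ms";
    // Assumed default, chosen only for this example.
    static final long DEFAULT_DELAY_MS = 1000L;

    // Returns the default when the property is unset; rejects non-numeric and negative values.
    static long parse(String raw, long defaultMs) {
        if (raw == null)
            return defaultMs;
        final long value;
        try {
            value = Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                PROPERTY + " must be a number of milliseconds, got: " + raw, e);
        }
        if (value < 0)
            throw new IllegalArgumentException(
                PROPERTY + " must be non-negative, got: " + value);
        return value;
    }

    static long fromSystemProperty() {
        return parse(System.getProperty(PROPERTY), DEFAULT_DELAY_MS);
    }

    public static void main(String[] args) {
        System.out.println(fromSystemProperty());
    }
}
```

With the value validated up front, the {{delay <= 0}} branch can only be reached with an explicit zero, so it would need to be documented as the opt-out if that is the intent.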

 

 



> decommission may cause timeout because messaging backlog is cleared 
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-14930
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14930
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Coordination, Legacy/Core
>            Reporter: Zhao Yang
>            Assignee: Zhao Yang
>            Priority: Normal
>             Fix For: 3.0.x, 3.11.x
>
>
> On a 3-node cluster with RF=2, decommissioning a node may cause a quorum write timeout because the messaging backlog to the decommissioned node is cleared via {{Gossiper#removeEndpoint() -> OutboundTcpConnection#closeSocket()}}.
>  (A timeout is less likely with RF=3, because we can afford one fewer response.)
> {code:java}
> What happened:
> 1. [WriteStage] before the leaving node is removed from tokenmetadata, the write endpoints are generated (the leaving endpoint is included)
> 2. [GossipStage] the leaving node is removed from tokenmetadata; future write handlers will no longer include the leaving endpoint
> 3. [WriteStage] write handlers send messages to the messaging-service backlog
> 4. [GossipStage] the messaging-service backlog is cleared; messages are not sent and the connection is closed
> 5. [WriteStage] the write times out
>  {code}
> |patch|
> |[3.0|https://github.com/jasonstack/cassandra/commits/decommission_timeout_3.0]|
> |[3.11|https://github.com/jasonstack/cassandra/commits/decommission_timeout_3.11]|
> We can avoid it by delaying the destruction of the messaging connection so that messages are sent and responded to. This patch also avoids reopening an already closed connection on {{MessagingService#convict()}}.
>  The new messaging framework rewrite in {{Trunk}} avoids the issue by not clearing the messaging backlog.
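
The race in the description and the effect of delaying the close can be sketched with a toy model. Everything below is hypothetical: {{Connection}} is a stand-in for an outbound connection with a backlog, not {{OutboundTcpConnection}}, and the scheduler-based delay only illustrates the idea rather than reproducing the patch.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Toy model of steps 3 to 5; none of these names exist in Cassandra.
final class DelayedCloseSketch {
    static final class Connection {
        private final Queue<String> backlog = new ArrayDeque<>();
        private int sent;
        private boolean closed;

        synchronized void enqueue(String message) {
            if (!closed) backlog.add(message);
        }

        // Sends everything still queued, unless the connection is already closed.
        synchronized void flush() {
            while (!closed && backlog.poll() != null) sent++;
        }

        // Closing drops whatever is still in the backlog.
        synchronized void close() {
            backlog.clear();
            closed = true;
        }

        synchronized int sentCount() { return sent; }
    }

    // Returns how many messages were sent before the connection was closed.
    static int run(boolean delayClose, long delayMs) {
        Connection connection = new Connection();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            connection.enqueue("quorum-write");   // step 3: message lands in the backlog
            if (delayClose)
                scheduler.schedule(connection::close, delayMs, TimeUnit.MILLISECONDS);
            else
                connection.close();               // step 4: backlog cleared right away
            connection.flush();                   // sender gets its turn
            scheduler.shutdown();                 // a pending delayed close still runs
            scheduler.awaitTermination(delayMs + 5000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return connection.sentCount();
    }

    public static void main(String[] args) {
        System.out.println("immediate close, sent: " + run(false, 0));
        System.out.println("delayed close, sent:   " + run(true, 300));
    }
}
```

In this model the immediate close drops the queued write, which corresponds to the timeout in step 5, while the delayed close gives the sender time to drain the backlog first.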


