Posted to commits@cassandra.apache.org by "Yifan Cai (Jira)" <ji...@apache.org> on 2019/10/29 17:52:00 UTC

[jira] [Commented] (CASSANDRA-15350) Add CAS “uncertainty” and “contention" messages that are currently propagated as a WriteTimeoutException.

    [ https://issues.apache.org/jira/browse/CASSANDRA-15350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16962278#comment-16962278 ] 

Yifan Cai commented on CASSANDRA-15350:
---------------------------------------

In the current {{cas}} implementation, WriteTimeoutExceptions ({{WriteType.CAS}}) are thrown in the following scenarios:
 * The overall {{cas}} operation times out.
 * The PREPARE phase times out.
 ** Multiple retries fail and the operation eventually times out.
 ** RPC requests to nodes time out (networking).
 ** Multiple proposers contend. Each proposer gets promises from a majority and pre-empts the other proposers from proceeding to the PROPOSE phase. When the other proposers (believing they are still the winners when in fact they are not) send their proposals, they get rejections from *ALL* acceptors. The contention continues until time runs out.
 ** This phase also includes a repair attempt:
 *** The proposal that replays the previously accepted update times out.
 *** The commit of that update times out.
 * The PROPOSE phase times out.
 ** RPC requests to nodes time out (networking).
 ** The proposal is sent to *ALL* acceptors and the proposer waits:
 *** If a majority accepts, the proposal succeeds.
 *** If *all* acceptors reject, it is safe for the proposer to re-submit the proposal with a higher ballot.
 *** {color:#ff8b00}If some but *not a quorum* of acceptors accept, the proposal may or may not be replayed by new proposers. (Uncertainty){color}
 **** If a new proposer reaches the acceptors that accepted the old proposal, it replays that proposal, provided it is the most recent in-progress one.
 **** If the new proposer does not reach those acceptors, it is free to choose its own value, which may leave the earlier proposal no longer eligible for replay.
 * The COMMIT phase times out.
 ** Applying the update times out. Note that this is a normal write, so the WriteType is {{SIMPLE}} instead of {{CAS}}.
 ** The only exception is when the timeout comes from the repair attempt in the PREPARE phase. In that case, the WriteType is overridden to {{CAS}}.

Most of the timeouts in the list are genuine; the exception is the one colored {color:#ff8b00}orange{color}.
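
To make the three PROPOSE-phase outcomes above concrete, here is a minimal Java sketch of how a proposer could classify the responses it collected before the deadline. It is not taken from Cassandra's code: the class {{ProposeOutcomeSketch}}, the {{Outcome}} enum and the {{classify}} method are hypothetical names used only for illustration. The {{TIMED_OUT}} branch (no accept observed, but not every acceptor replied) is an assumption of this sketch; the list above does not say how that case should be reported.

```java
// Hypothetical illustration only; these names do not exist in Cassandra.
final class ProposeOutcomeSketch {

    /** Possible results of one PROPOSE round from the proposer's point of view. */
    enum Outcome { ACCEPTED, REJECTED_BY_ALL, UNCERTAIN, TIMED_OUT }

    /** Size of a majority for the given number of acceptors. */
    static int quorum(int acceptors) {
        return acceptors / 2 + 1;
    }

    /**
     * Classifies a PROPOSE round from the responses received before the deadline.
     *
     * @param acceptors total number of acceptors the proposal was sent to
     * @param accepts   number of acceptors that accepted the proposal
     * @param rejects   number of acceptors that rejected the proposal
     */
    static Outcome classify(int acceptors, int accepts, int rejects) {
        if (acceptors <= 0 || accepts < 0 || rejects < 0 || accepts + rejects > acceptors)
            throw new IllegalArgumentException("inconsistent response counts");

        if (accepts >= quorum(acceptors))
            return Outcome.ACCEPTED;          // a majority accepted: success
        if (rejects == acceptors)
            return Outcome.REJECTED_BY_ALL;   // safe to retry with a higher ballot
        if (accepts > 0)
            return Outcome.UNCERTAIN;         // some, but not a quorum, accepted
        // Assumption of this sketch: no accept was observed and some acceptors
        // never replied, so it is reported as an ordinary timeout.
        return Outcome.TIMED_OUT;
    }

    public static void main(String[] args) {
        // Three acceptors: one accepted, two rejected.
        System.out.println(classify(3, 1, 2));
    }
}
```

With three acceptors, one accept and two rejects is classified as {{UNCERTAIN}}: a later proposer that reaches the accepting node may replay the value, while one that reaches only the other two may not.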

> Add CAS “uncertainty” and “contention" messages that are currently propagated as a WriteTimeoutException.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15350
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15350
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Feature/Lightweight Transactions
>            Reporter: Alex Petrov
>            Priority: Normal
>              Labels: client-impacting, protocolv5
>
> Right now, CAS uncertainty introduced in https://issues.apache.org/jira/browse/CASSANDRA-6013 is propagating as WriteTimeout. One of the conditions under which it manifests is when there’s at least one acceptor that has accepted the value, which means that this value _may_ still get accepted during a later round, despite the proposer failure. A similar problem happens with CAS contention, which is also indistinguishable from a “regular” timeout, even though it is correctly visible in metrics.


