Posted to commits@cassandra.apache.org by "Benedict Elliott Smith (Jira)" <ji...@apache.org> on 2020/05/05 12:11:00 UTC

[jira] [Comment Edited] (CASSANDRA-15745) Conflicting LWT transactions may be committed during topology change

    [ https://issues.apache.org/jira/browse/CASSANDRA-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099825#comment-17099825 ] 

Benedict Elliott Smith edited comment on CASSANDRA-15745 at 5/5/20, 12:10 PM:
------------------------------------------------------------------------------

Thanks for the report [~Osipov]. To reformulate (mainly by removing the "Topology change is advertised on A" step, as I don't believe it is necessary):
 1. Topology change starts on C, replacing A with D, only visible on C
 2. CAS1 starts on C with {A, B, C, D}
 3. CAS2 (ballot > CAS1) starts on A with {A, B, C}
 4. CAS1 prepares on {B, C, D} (timeout on A)
 5. CAS2 prepares and accepts on {A, B} (timeout on C); commits on A; terminates
 6. CAS1 accepts on D; terminates
 7. Topology change finishes (A is removed), visible globally
 8. CAS3 prepares with {C, D}, sees accept of CAS1 and re-proposes it (with a newer ballot)
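
To make the quorum arithmetic in the sequence above concrete, here is a minimal Python sketch. It is only an illustration, not Cassandra code: the helper names (quorum_size, is_quorum) and the variable names are hypothetical, a quorum is assumed to be a simple majority of whichever replica set the coordinator believes in, and the post-change replica set {B, C, D} is inferred from "replacing A with D".

```python
def quorum_size(replicas):
    """Simple majority of a replica set (assumed quorum rule for this sketch)."""
    return len(replicas) // 2 + 1


def is_quorum(responders, replicas):
    """True if the responders all belong to the replica set and form a majority of it."""
    return responders <= replicas and len(responders) >= quorum_size(replicas)


# Replica sets as seen by each coordinator in the scenario.
old_view = frozenset("ABC")       # A's view when CAS2 starts
pending_view = frozenset("ABCD")  # C's view when CAS1 starts (replace in progress)
new_view = frozenset("BCD")       # view after A is removed (inferred, used by CAS3)

# Replicas that actually responded in each step.
cas1_prepared_on = frozenset("BCD")
cas1_accepted_on = frozenset("D")
cas2_accepted_on = frozenset("AB")
cas3_prepared_on = frozenset("CD")

# Each coordinator believes it has reached a valid quorum for its own view.
print(is_quorum(cas1_prepared_on, pending_view))  # 3 of 4
print(is_quorum(cas2_accepted_on, old_view))      # 2 of 3
print(is_quorum(cas3_prepared_on, new_view))      # 2 of 3

# CAS3's prepare quorum misses every replica that accepted CAS2,
# but it does include the one replica that accepted CAS1.
print(sorted(cas2_accepted_on & cas3_prepared_on))
print(sorted(cas1_accepted_on & cas3_prepared_on))
```

Under these assumptions all three quorum checks pass, yet the quorum that accepted CAS2 and the quorum CAS3 prepares on do not intersect, so CAS3 can only observe the accept of CAS1 on D.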

Unfortunately this isn't trivial to fix, though there is more than one approach.  I happen to have an incomplete piece of work that should be able to address this issue, but I have no timeline on when I may be able to propose it here as a patch.



> Conflicting LWT transactions may be committed during topology change
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-15745
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15745
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/Lightweight Transactions
>            Reporter: Konstantin
>            Priority: Normal
>
> Let's consider a cluster which consists of replicas A, B and C.
> We're adding replica D which replaces A.
> A scenario is possible in which two conflicting transactions, CAS1 and CAS2, are both committed during the replace:
> CAS2 ballot > CAS1 ballot
> CAS2 and CAS1 conflict on the LWT condition, yet both of them may be committed given the following sequence of events:
> Topology change starts, is advertised on C
> CAS1 starts on node C, uses {A, B, C, D}
> CAS2 starts on node A, still uses {A, B, C}
> Topology change is advertised on A
> CAS1 prepares on {B, C, D}
> CAS2 prepares and accepts on {A, B}, commits on A
> CAS1 accepts on D, then stops
> Streaming starts, topology change finishes, A is removed
> CAS3 prepares using C and D. It sees the accept of CAS1 and replays it
> Both CAS1 and CAS2 are committed.


