Posted to commits@cassandra.apache.org by "David Capwell (Jira)" <ji...@apache.org> on 2020/02/19 23:18:00 UTC

[jira] [Commented] (CASSANDRA-15553) Preview repair should include sstables from finalized incremental repair sessions

    [ https://issues.apache.org/jira/browse/CASSANDRA-15553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040506#comment-17040506 ] 

David Capwell commented on CASSANDRA-15553:
-------------------------------------------

I took a look and had to dig further into the IR (incremental repair) messaging. Here is what I see:

IR messaging follows a fire-and-forget pattern, so any transient failure can cause messages to be lost (the tests in CASSANDRA-15564 show this, and it has been reported as an issue with current repair in CASSANDRA-15566).  This patch relies on FINALIZE_COMMIT_MSG being seen on the coordinator of the IR preview repair in order to detect the conflict, but the message is processed asynchronously: the participants may see it while validation is running, while the coordinator may see it only after it has received all validation responses (so the session is already complete). In that case you have the same issue as reported by this JIRA.

This patch also effectively prevents preview and IR from running on the same range at the same time, since the preview will fail with a conflict*. So IR should not be scheduled while a preview is running, and preview should not be scheduled while IR is running (otherwise we waste resources on validation). In effect, whatever schedules the repairs will have to be enhanced to handle this, which adds complexity for operators.

I actually wonder if we can remove this restriction.  It looks to me like repairedAt is system time (so it could drift, roll backwards, etc.), but we could keep track of the largest value seen and make sure this counter is monotonic.  With a data structure of

* largest contiguous commit (long)
* inFlight (array of long)

We could make sure that we (the coordinator) always produce a repairedAt larger than any we know of, which lets preview take a snapshot of the state at the start of coordination. With this snapshot, we filter for sstables that are repaired and have repairedAt <= the snapshot's largest contiguous commit; this should give preview repair effectively snapshot isolation (assuming compaction also maintains repairedAt).
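
To make the idea concrete, here is a rough sketch of what I have in mind. None of these names exist in Cassandra: RepairedAtTracker, begin, commit, snapshot and visibleToPreview are all made up for illustration. I also use sorted sets instead of an array of long, and I add a second set for commits that are still above the watermark, which is my assumption about how "largest contiguous commit" would be maintained.

```java
import java.util.TreeSet;

// Hypothetical sketch only; not Cassandra code and not part of the patch.
final class RepairedAtTracker {
    // Every session with repairedAt <= this value has been resolved.
    private long largestContiguousCommit = 0L;
    // Largest repairedAt handed out so far; keeps the counter monotonic.
    private long lastIssued = 0L;
    // repairedAt values of sessions that have started but not finished.
    private final TreeSet<Long> inFlight = new TreeSet<>();
    // Committed values that cannot be folded into the watermark yet
    // because an older session is still in flight (an assumption).
    private final TreeSet<Long> committed = new TreeSet<>();

    // Issue a repairedAt that is strictly larger than any issued before,
    // even if the wall clock drifted or rolled backwards.
    synchronized long begin(long wallClockMillis) {
        long repairedAt = Math.max(wallClockMillis, lastIssued + 1);
        lastIssued = repairedAt;
        inFlight.add(repairedAt);
        return repairedAt;
    }

    // Record that the session with this repairedAt was finalized.
    synchronized void commit(long repairedAt) {
        if (!inFlight.remove(repairedAt))
            throw new IllegalArgumentException("unknown session " + repairedAt);
        committed.add(repairedAt);
        advance();
    }

    // Move the watermark over every commit older than the oldest in-flight session.
    private void advance() {
        long limit = inFlight.isEmpty() ? Long.MAX_VALUE : inFlight.first();
        while (!committed.isEmpty() && committed.first() < limit)
            largestContiguousCommit = committed.pollFirst();
    }

    // What a preview repair would capture at the start of coordination.
    synchronized long snapshot() {
        return largestContiguousCommit;
    }

    // Filter applied to each sstable during preview validation.
    static boolean visibleToPreview(boolean repaired, long repairedAt, long snapshot) {
        return repaired && repairedAt <= snapshot;
    }
}
```

With this, a preview that captures snapshot() when it starts ignores any sstable whose repairedAt is above the captured value, so an IR that finalizes mid-preview would not change the set being validated. The sketch leaves out failed sessions and how the state would be shared between nodes.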

* In CASSANDRA-15564 I show that preview doesn't properly check session failures; run [this test|https://github.com/apache/cassandra/pull/446/files#diff-af4a07a2b44695f510dddb0c102e1953R28] and [this one|https://github.com/apache/cassandra/pull/446/files#diff-ca9f3b43ad8ff955d6ddd2ef4d2b6904R28] without the change in the JIRA to see it.  Your tests behave differently because you don't use nodetool and instead monitor notifications directly.

> Preview repair should include sstables from finalized incremental repair sessions
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15553
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15553
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>             Fix For: 4.0-alpha
>
>
> When running a preview repair we currently grab all repaired sstables, problem is that we depend on compaction to move the sstables from pending to repaired so we might have different data marked repaired on different nodes. Including any sstables from finalized incremental repair sessions as repaired will solve this.
> Another problem is that validations don't start at exactly the same time on different nodes, so if an incremental repair finishes while the preview repair is running we might also validate the wrong repaired set. We should fail the preview repair if an intersecting incremental repair finishes during the preview repair.


