Posted to commits@cassandra.apache.org by "Benjamin Roth (JIRA)" <ji...@apache.org> on 2017/03/03 09:42:46 UTC

[jira] [Commented] (CASSANDRA-12888) Incremental repairs broken for MVs and CDC

    [ https://issues.apache.org/jira/browse/CASSANDRA-12888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894006#comment-15894006 ] 

Benjamin Roth commented on CASSANDRA-12888:
-------------------------------------------

I am about to hack a proof of concept for that issue.

Concept:
Each mutation and each partition update carries a "repairedAt" flag. The flag is passed along the whole write path, including MV updates and the serialization of remote MV updates. Repaired and unrepaired mutations then have to be kept apart in memtables and flushed to separate SSTables. From what I can see, it should be easier to maintain one memtable each for repaired and unrepaired data than to track the repair state within a single memtable.
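To make the concept concrete, here is a minimal sketch of routing mutations into two memtables by repair state. All names in it (RepairAwareWriteSketch, Mutation, apply, the two lists standing in for memtables) are hypothetical and are not Cassandra's real classes. It also assumes that a repairedAt of 0 means "unrepaired".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: none of these types exist in Cassandra.
final class RepairAwareWriteSketch {
    // Assumption: 0 marks a mutation as unrepaired.
    static final long UNREPAIRED = 0L;

    // Stand-in for a mutation that carries its repair state along the write path.
    static final class Mutation {
        final String key;
        final long repairedAt;

        Mutation(String key, long repairedAt) {
            this.key = key;
            this.repairedAt = repairedAt;
        }
    }

    // Plain lists stand in for the two memtables; each would be flushed
    // to its own SSTable so repaired and unrepaired data never mix.
    final List<Mutation> repairedMemtable = new ArrayList<>();
    final List<Mutation> unrepairedMemtable = new ArrayList<>();

    // Route a mutation to the memtable that matches its repair state.
    void apply(Mutation m) {
        if (m.repairedAt == UNREPAIRED) {
            unrepairedMemtable.add(m);
        } else {
            repairedMemtable.add(m);
        }
    }
}
```

With this split, the flush path never has to inspect individual rows: the repair state of the resulting SSTable follows from which memtable it came from.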

My question is:
How important is the exact value of "repairedAt"? Is it possible to merge updates with different repair timestamps into a single memtable and finally flush them to an SSTable whose repairedAt is set to the latest or earliest repairedAt timestamp of all mutations in the memtable?
Or would that produce repair inconsistencies or something similar?
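The two candidate policies can be written down as follows. This is only an illustration of the question, not an answer to it: FlushRepairedAtSketch and both method names are hypothetical, and the list of longs stands in for the repairedAt values of all mutations collected in one memtable.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch only: not part of Cassandra.
final class FlushRepairedAtSketch {
    // Policy A: stamp the flushed SSTable with the earliest repairedAt seen.
    // Throws NoSuchElementException if the list is empty.
    static long earliest(List<Long> repairedAts) {
        return Collections.min(repairedAts);
    }

    // Policy B: stamp the flushed SSTable with the latest repairedAt seen.
    // Throws NoSuchElementException if the list is empty.
    static long latest(List<Long> repairedAts) {
        return Collections.max(repairedAts);
    }
}
```

Either policy collapses several repair timestamps into one value per SSTable, which is exactly why the importance of the exact value matters.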

Any feedback?

> Incremental repairs broken for MVs and CDC
> ------------------------------------------
>
>                 Key: CASSANDRA-12888
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12888
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>            Reporter: Stefan Podkowinski
>            Assignee: Benjamin Roth
>            Priority: Critical
>             Fix For: 3.0.x, 3.11.x
>
>
> SSTables streamed during the repair process are first written locally and afterwards either simply added to the pool of existing SSTables or, if MVs exist or CDC is active, replayed on a per-mutation basis.
> As described in {{StreamReceiveTask.OnCompletionRunnable}}:
> {quote}
> We have a special path for views and for CDC.
> For views, since the view requires cleaning up any pre-existing state, we must put all partitions through the same write path as normal mutations. This also ensures any 2is are also updated.
> For CDC-enabled tables, we want to ensure that the mutations are run through the CommitLog so they can be archived by the CDC process on discard.
> {quote}
> Using the regular write path turns out to be an issue for incremental repairs, as we lose the {{repaired_at}} state in the process. Eventually the streamed rows end up in the unrepaired set, whereas the corresponding rows on the sender side are moved to the repaired set. The next repair run will stream the same data back again, causing rows to bounce back and forth between nodes on each repair.
> See the linked dtest for steps to reproduce. An example of reproducing this manually using ccm can be found [here|https://gist.github.com/spodkowinski/2d8e0408516609c7ae701f2bf1e515e8]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)