Posted to commits@cassandra.apache.org by "Stefan Podkowinski (JIRA)" <ji...@apache.org> on 2019/02/18 08:43:00 UTC

[jira] [Commented] (CASSANDRA-15027) Handle IR prepare phase failures less race prone by waiting for all results

    [ https://issues.apache.org/jira/browse/CASSANDRA-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16770869#comment-16770869 ] 

Stefan Podkowinski commented on CASSANDRA-15027:
------------------------------------------------

* [ [trunk|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-15027] ][ [circleci|https://circleci.com/workflow-run/2b027f87-cf45-48ee-8eae-45a563701bc6] ]

> Handle IR prepare phase failures less race prone by waiting for all results
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15027
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15027
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Compaction
>            Reporter: Stefan Podkowinski
>            Assignee: Stefan Podkowinski
>            Priority: Major
>             Fix For: 4.x
>
>
> Handling incremental repairs as a coordinator begins by sending a {{PrepareConsistentRequest}} message to all participants, which may also include the coordinator itself. Participants will run anti-compactions upon receiving such a message and report the result of the operation back to the coordinator.
> As soon as we receive a failure response from any participant, we fail fast in {{CoordinatorSession.handlePrepareResponse()}}, which in turn completes the {{prepareFuture}} that {{RepairRunnable}} is blocking on. The repair command then terminates with an error status, as expected.
> The issue is that when the node is both coordinator and participant, we may end up with a local session and submitted anti-compactions that will be executed without any coordination with the coordinator session (on the same node). As a result, running repair commands one right after another may cause overlapping execution of anti-compactions, which makes the following (misleading) message show up in the logs and causes the repair to fail again:
>  "Prepare phase for incremental repair session %s has failed because it encountered intersecting sstables belonging to another incremental repair session (%s). This is by starting an incremental repair session before a previous one has completed. Check nodetool repair_admin for hung sessions and fix them."
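
To illustrate the idea in the issue title, here is a minimal sketch of waiting for all prepare results instead of failing fast. It is not the actual patch: {{PrepareTracker}}, its constructor, and the use of plain strings as participant identifiers are assumptions made for illustration only, and only the names {{handlePrepareResponse}} and {{prepareFuture}} are borrowed from the description above. The future completes only after every participant has answered, so a local anti-compaction can no longer still be running when the repair command returns.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

/** Hypothetical helper, not part of Cassandra: tracks prepare responses for one repair session. */
final class PrepareTracker {
    // Participants that have not reported a prepare result yet.
    private final Set<String> pending;
    // Remembers whether any participant reported a failure so far.
    private boolean anyFailed = false;
    // Completed with true (all succeeded) or false (at least one failed),
    // but only once every participant has responded.
    private final CompletableFuture<Boolean> prepareFuture = new CompletableFuture<>();

    PrepareTracker(Set<String> participants) {
        this.pending = new HashSet<>(participants);
        if (pending.isEmpty()) {
            prepareFuture.complete(true);
        }
    }

    /** Records one response. A failure is remembered, but does not complete the future early. */
    synchronized void handlePrepareResponse(String participant, boolean success) {
        if (!pending.remove(participant)) {
            return; // unknown participant or duplicate response: ignore
        }
        if (!success) {
            anyFailed = true;
        }
        if (pending.isEmpty()) {
            prepareFuture.complete(!anyFailed);
        }
    }

    CompletableFuture<Boolean> prepareFuture() {
        return prepareFuture;
    }
}
```

A caller blocking on {{prepareFuture()}} would still see the failure, just later: the trade-off is a slower error report in exchange for knowing that no participant is still busy with the prepare phase.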



