Posted to commits@cassandra.apache.org by "Marcus Eriksson (Jira)" <ji...@apache.org> on 2021/11/24 08:21:00 UTC

[jira] [Updated] (CASSANDRA-17168) Don't block gossip when clearing snapshots for failing repairs

     [ https://issues.apache.org/jira/browse/CASSANDRA-17168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-17168:
----------------------------------------
     Bug Category: Parent values: Availability(12983), Level 1 values: Unavailable(12994)
       Complexity: Normal
      Component/s: Consistency/Repair
    Discovered By: Adhoc Test
    Fix Version/s: 4.0.x
                   4.x
        Reviewers: David Capwell
         Severity: Normal
           Status: Open  (was: Triage Needed)

trunk:
https://github.com/apache/cassandra/pull/1340
https://app.circleci.com/pipelines/github/krummas/cassandra?branch=marcuse%2F17168-trunk
4.0:
https://github.com/apache/cassandra/pull/1341 
https://app.circleci.com/pipelines/github/krummas/cassandra?branch=marcuse%2F17168

Note that the trunk version also changes the PREPARE message to include the repair parallelism, instead of setting a flag on ParentRepairSession.

> Don't block gossip when clearing snapshots for failing repairs
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-17168
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17168
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>             Fix For: 4.0.x, 4.x
>
>
> We clear snapshots in the GossipTasks thread when a repair session fails due to a replica shutting down. If many tables or repair sessions are in flight, this can take a long time. With enough tables being repaired at the same time, even checking whether the snapshots exist can take long enough for nodes to be marked down.
> We should clear snapshots in a separate thread and add a flag that tells us whether the repair session can have snapshots, so that we can skip checking whether the directory exists.
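
The approach proposed above can be sketched roughly as follows. This is a minimal illustration only, not the code from the linked pull requests: the class SnapshotCleanupSketch, the RepairSessionInfo holder, the mayHaveSnapshots flag and the onSessionFailed method are all hypothetical names, and a plain single-thread executor stands in for whatever executor the real patch uses. The point is that the latency-sensitive caller (the GossipTasks thread in the ticket) never touches the file system: it either returns immediately because the flag says no snapshot can exist, or it hands the deletion to another thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Stream;

// Hypothetical sketch; none of these names come from the Cassandra code base.
final class SnapshotCleanupSketch {

    // Assumed stand-in for the per-session state a node keeps about a repair.
    static final class RepairSessionInfo {
        final String id;
        final boolean mayHaveSnapshots; // the "flag" suggested in the ticket
        final Path snapshotDir;

        RepairSessionInfo(String id, boolean mayHaveSnapshots, Path snapshotDir) {
            this.id = id;
            this.mayHaveSnapshots = mayHaveSnapshots;
            this.snapshotDir = snapshotDir;
        }
    }

    // Dedicated daemon thread so slow disk I/O cannot stall the caller.
    private final ExecutorService snapshotCleaner = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "snapshot-cleanup");
        t.setDaemon(true);
        return t;
    });

    // Intended to be called from the latency-sensitive thread; does no file I/O itself.
    // The future completes with true if a snapshot directory was found and removed.
    Future<Boolean> onSessionFailed(RepairSessionInfo session) {
        if (!session.mayHaveSnapshots) {
            // The flag lets us skip even the existence check.
            return CompletableFuture.completedFuture(false);
        }
        Callable<Boolean> task = () -> clear(session.snapshotDir);
        return snapshotCleaner.submit(task);
    }

    // Runs on the cleanup thread: recursively deletes the directory if it exists.
    private static boolean clear(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return false;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order visits children before their parent directories.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return true;
    }
}
```

In this sketch the caller can ignore the returned future entirely; it is only there so that tests or logging code can observe whether anything was deleted.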



