Posted to commits@cassandra.apache.org by "Anuj Wadehra (JIRA)" <ji...@apache.org> on 2016/01/20 03:57:40 UTC
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas

    [ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107876#comment-15107876 ] 

Anuj Wadehra commented on CASSANDRA-10446:
------------------------------------------

I think this is an issue with the way we handle the "downed replica" scenario in repairs. We should increase the priority and change the type from Improvement to Bug.

Consider the following scenario and flow of events, which demonstrate the importance of this issue:
Scenario: I have a 20-node cluster with RF=5, QUORUM reads and writes, and a gc grace period of 20 days. My cluster is fault tolerant and can afford 2 node failures.

Suddenly, one node goes down due to a hardware issue. The failed node prevents repair on many nodes in the cluster, as it holds approximately a 5/20 share of the total data: 1/20 which it owns and 4/20 which it stores as a replica of data owned by other nodes. It has now been 10 days since the node went down, most of the nodes are not being repaired, and it is decision time. I am not sure how soon the issue will be fixed, maybe in the next 2 days, i.e. 8 days before gc grace expires, so I shouldn't remove the node early and add it back, as that would cause significant and unnecessary streaming due to token rearrangement. At the same time, if I don't remove the failed node now, at 10 days (well before gc grace), my entire system's health would be in question and it would be a panic situation, as most of the data has not been repaired in the last 10 days and gc grace is approaching. I need sufficient time to repair all nodes.
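
The arithmetic behind this scenario can be checked with a small sketch. It assumes a simplified ring with one token per node, evenly sized ranges, and replicas placed on the next RF-1 nodes clockwise; the function names are illustrative only and are not part of Cassandra.

```python
from fractions import Fraction


def quorum(rf: int) -> int:
    """Number of replicas that must respond for a QUORUM read or write."""
    return rf // 2 + 1


def tolerated_failures(rf: int) -> int:
    """Replicas that can be down while QUORUM operations still succeed."""
    return rf - quorum(rf)


def replicas_for_range(owner: int, rf: int, nodes: int) -> list[int]:
    """Assumed placement: the owner plus the next rf-1 nodes on the ring."""
    return [(owner + k) % nodes for k in range(rf)]


def ranges_blocked_by(down: int, rf: int, nodes: int) -> list[int]:
    """Primary ranges whose replica set includes the down node."""
    return [o for o in range(nodes) if down in replicas_for_range(o, rf, nodes)]


nodes, rf = 20, 5
gc_grace_days, days_down = 20, 10

blocked = ranges_blocked_by(down=7, rf=rf, nodes=nodes)
print("quorum:", quorum(rf))
print("tolerated failures:", tolerated_failures(rf))
print("share of ranges touching the down node:", Fraction(len(blocked), nodes))
print("days left before gc grace:", gc_grace_days - days_down)
```

Under these assumptions, a single down node is a replica for 5 of the 20 ranges, so repair of a quarter of the data cannot complete even though QUORUM reads and writes are unaffected.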
What looked like a fault-tolerant Cassandra cluster that could easily afford 2 node failures required urgent attention and manual decision making when a single node went down. If some replicas are down, we should allow repair to proceed with the remaining replicas. If the failed nodes come back up before the gc grace period expires, we would run repair to fix inconsistencies; otherwise we would discard their data and bootstrap. I think that would make for a really robust, fault-tolerant system.
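
The proposed behaviour could look roughly like the sketch below. This is only an illustration of the idea, not Cassandra's repair code: `plan_repair` and its `force` parameter are hypothetical names, with `force` mirroring the -force option suggested in the issue description.

```python
def plan_repair(replicas: list[str], live: set[str], force: bool = False):
    """Decide which replicas take part in repairing one token range.

    Hypothetical helper: without force, any down replica aborts the repair;
    with force, the repair proceeds with the live replicas only and reports
    the ones that were skipped.
    """
    skipped = [r for r in replicas if r not in live]
    if skipped and not force:
        raise RuntimeError(f"cannot repair, replicas down: {skipped}")
    participants = [r for r in replicas if r in live]
    return participants, skipped


replicas = ["n1", "n2", "n3", "n4", "n5"]
live = {"n1", "n2", "n4", "n5"}  # n3 is the failed node

participants, skipped = plan_repair(replicas, live, force=True)
print("repairing among:", participants)
print("skipped (repair or re-bootstrap later):", skipped)
```

The skipped replicas are the ones that would need a repair once they return within gc grace, or a fresh bootstrap if they do not.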



> Run repair with down replicas
> -----------------------------
>
>                 Key: CASSANDRA-10446
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10446
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Priority: Minor
>             Fix For: 3.x
>
>
> We should have an option of running repair when replicas are down. We can call it -force.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)