Posted to commits@cassandra.apache.org by "Donald Smith (JIRA)" <ji...@apache.org> on 2013/12/30 19:14:57 UTC

[jira] [Comment Edited] (CASSANDRA-5396) Repair process is a joke leading to a downward spiralling and eventually unusable cluster

    [ https://issues.apache.org/jira/browse/CASSANDRA-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858966#comment-13858966 ] 

Donald Smith edited comment on CASSANDRA-5396 at 12/30/13 6:14 PM:
-------------------------------------------------------------------

We ran "nodetool repair -pr" on one node of a three-node cluster running on production-quality hardware, each node holding about 1 TB of data, using Cassandra version 2.0.3. After 5 days it was still running and had apparently frozen. See https://issues.apache.org/jira/browse/CASSANDRA-5220 (Dec 23 comment by Donald Smith) for more detail. We also tried running repair on our smallest column family (about 12 GB of data), and it took 31 hours to complete. We're not yet in production, but we plan on not running repair, since we do very few deletes or updates and since we don't trust it. Also, our data isn't critical.


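[Editor's note] One way operators keep repairs more tractable than a whole-node "nodetool repair -pr" is to drive them one table at a time, so a stuck repair costs one table's worth of work rather than ~1 TB. A minimal sketch (the keyspace and table names are placeholders; the script only prints the commands it would run, it does not execute them):

```shell
#!/bin/sh
# Print one per-table repair command per column family instead of repairing
# the whole node at once. KEYSPACE and TABLES are placeholders for your schema.
KEYSPACE=my_keyspace
TABLES="small_cf medium_cf large_cf"

for table in $TABLES; do
  # -pr repairs only this node's primary ranges; run the same loop on each
  # node in turn to cover the full ring.
  echo "nodetool repair -pr $KEYSPACE $table"
done
```

Running the printed commands serially (smallest table first) also gives a crude per-table duration baseline for estimating later runs.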

> Repair process is a joke leading to a downward spiralling and eventually unusable cluster
> -----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5396
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5396
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.3
>         Environment: all
>            Reporter: David Berkman
>            Priority: Critical
>
> Let's review the repair process...
> 1) It's mandatory to run repair.
> 2) Repair has a high impact and can take hours.
> 3) Repair provides no estimation of completion time and no progress indicator.
> 4) Repair is extremely fragile, and can fail to complete, or become stuck quite easily in real operating environments.
> 5) When repair fails it provides no feedback whatsoever of the problem or possible resolution.
> 6) A failed repair operation saddles the affected nodes with a huge amount of extra data (judging from node size).
> 7) There is no way to rid the node of the extra data associated with a failed repair short of completely rebuilding the node.
> 8) The extra data from a failed repair makes any subsequent repair take longer and increases the likelihood that it will simply become stuck or fail, leading to yet more node corruption.
> 9) Eventually no repair operation will complete successfully, and normal node operations become impacted, leading to a failing cluster.
> Who would design such a system for a service meant to operate as a fault tolerant clustered data store operating on a lot of commodity hardware?
> Solution...
> 1) Repair must be robust.
> 2) Repair must *never* become 'stuck'.
> 3) Failure to complete must result in reasonable feedback.
> 4) Failure to complete must not result in a node whose state is worse than before the operation began.
> 5) Repair must provide some means of determining completion percentage.
> 6) It would be nice if repair could estimate its run time, even if it could do so only based upon previous runs.
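[Editor's note] Pending points 3 and 5 (progress and feedback), operators can approximate a progress signal today: repair's Merkle-tree build phase shows up as "Validation" entries in "nodetool compactionstats". A sketch that counts active validations from captured output (the sample text below is illustrative, not real output from any cluster; on a live node you would pipe "nodetool compactionstats" in instead):

```shell
#!/bin/sh
# Count validation compactions still running, as a rough repair-progress
# signal; 0 suggests the Merkle-tree phase is currently idle.
# The sample stands in for live `nodetool compactionstats` output.
sample='pending tasks: 2
   compaction type   keyspace   table     completed        total   unit   progress
        Validation        ks1     cf1    1061617360   1999906714  bytes     53.08%
        Validation        ks1     cf2      53034247   2268077725  bytes      2.34%'

printf '%s\n' "$sample" \
  | awk '/Validation/ {n++} END {print n+0 " validation compaction(s) running"}'
```

Polling this (and "nodetool netstats" for the streaming phase) in a cron loop gives at least a coarse stuck/not-stuck indicator until repair reports progress itself.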



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)