Posted to commits@cassandra.apache.org by "Stu Hood (JIRA)" <ji...@apache.org> on 2009/09/27 08:01:16 UTC
[jira] Issue Comment Edited: (CASSANDRA-193) Proactive repair
[ https://issues.apache.org/jira/browse/CASSANDRA-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759990#action_12759990 ]
Stu Hood edited comment on CASSANDRA-193 at 9/26/09 11:00 PM:
--------------------------------------------------------------
I've been working on this ticket a bit more in the past few days:
* Added o.a.c.service.AntiEntropyService - Maintains trees for each CF, and accepts invalidations when values change.
Still TODO:
* Implement TreeRequestVerbHandler/TreeResponseVerbHandler - The AEService on the initiating endpoint will periodically wake up and send a TreeRequest to a replica. The replica will handle the TreeRequest by validating one or all of its MerkleTrees and responding with a TreeResponse. On receiving the TreeResponse, the initiating endpoint will validate its local tree and then compare the two trees.
* Validation is the only fuzzy part here: we need to iterate over the keys in each CF (essentially a major compaction, except that we can skip processing for anything that is still valid in the tree).
* Begin implementing the actual repair step - There is no design for this part yet, so any thoughts would be appreciated. The output of the TreeRequest/TreeResponse conversation will be a list of ranges in a given CF that disagree between the two endpoints.
EDIT: The code is still located at: http://github.com/stuhood/cassandra-anti-entropy/
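To make the invalidate/validate/compare cycle above concrete, here is a rough, self-contained sketch. It is not the code from the repository linked above: the class name RangeTreeSketch, the fixed power-of-two leaf count, the use of SHA-256, and the String array standing in for the serialized rows of each range are all assumptions made for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; NOT the AntiEntropyService/MerkleTree classes from the ticket.
final class RangeTreeSketch {
    private final int leaves;      // number of leaf ranges, must be a power of two
    private final byte[][] nodes;  // heap layout: children of i are 2i+1 and 2i+2; null = invalid

    RangeTreeSketch(int leaves) {
        if (leaves < 1 || Integer.bitCount(leaves) != 1)
            throw new IllegalArgumentException("leaves must be a power of two");
        this.leaves = leaves;
        this.nodes = new byte[2 * leaves - 1][];
    }

    // A value changed in this leaf range: drop the leaf hash and every ancestor hash.
    void invalidate(int leaf) {
        int i = leaves - 1 + leaf;
        while (true) {
            nodes[i] = null;
            if (i == 0) break;
            i = (i - 1) / 2;
        }
    }

    // Validation: rehash only the invalid nodes. rows[k] stands in for the
    // serialized contents of leaf range k (an assumption of this sketch).
    void validate(String[] rows) {
        validate(0, rows);
    }

    private byte[] validate(int i, String[] rows) {
        if (nodes[i] != null) return nodes[i];  // still valid, skip the whole subtree
        if (i >= leaves - 1) {
            nodes[i] = sha(rows[i - (leaves - 1)].getBytes(StandardCharsets.UTF_8));
        } else {
            nodes[i] = sha(validate(2 * i + 1, rows), validate(2 * i + 2, rows));
        }
        return nodes[i];
    }

    // Compare two validated trees; returns the leaf ranges [start, end) that disagree.
    static List<int[]> difference(RangeTreeSketch a, RangeTreeSketch b) {
        if (a.leaves != b.leaves) throw new IllegalArgumentException("tree sizes differ");
        if (a.nodes[0] == null || b.nodes[0] == null)
            throw new IllegalStateException("both trees must be validated first");
        List<int[]> out = new ArrayList<>();
        diff(a, b, 0, 0, a.leaves, out);
        return out;
    }

    private static void diff(RangeTreeSketch a, RangeTreeSketch b,
                             int i, int lo, int hi, List<int[]> out) {
        if (Arrays.equals(a.nodes[i], b.nodes[i])) return;  // subtree agrees
        if (hi - lo == 1) {
            // Disagreeing leaf: extend the previous range if it is adjacent.
            if (!out.isEmpty() && out.get(out.size() - 1)[1] == lo) out.get(out.size() - 1)[1] = hi;
            else out.add(new int[] {lo, hi});
            return;
        }
        int mid = (lo + hi) >>> 1;
        diff(a, b, 2 * i + 1, lo, mid, out);
        diff(a, b, 2 * i + 2, mid, hi, out);
    }

    private static byte[] sha(byte[]... parts) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] p : parts) md.update(p);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        RangeTreeSketch local = new RangeTreeSketch(4);
        RangeTreeSketch remote = new RangeTreeSketch(4);
        local.validate(new String[] {"k0=v0", "k1=v1", "k2=v2", "k3=v3"});
        remote.validate(new String[] {"k0=v0", "k1=old", "k2=old", "k3=v3"});
        for (int[] r : difference(local, remote))
            System.out.println("[" + r[0] + ", " + r[1] + ")");  // prints [1, 3)
    }
}
```

In this sketch a write only needs to call invalidate() on the affected leaf, and a later validate() rehashes just the invalid path, which is the "skip anything that is still valid" idea from the TODO list. The ranges returned by difference() would be the input to the (not yet designed) repair step.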
> Proactive repair
> ----------------
>
> Key: CASSANDRA-193
> URL: https://issues.apache.org/jira/browse/CASSANDRA-193
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Stu Hood
> Fix For: 0.5
>
> Attachments: CASSANDRA-193.diff
>
>
> Currently Cassandra supports "read repair," i.e., lazy repair when a read is done. This is better than nothing but is not sufficient for some cases (e.g., catastrophic node failure where you need to rebuild all of a node's data on a new machine).
> Dynamo uses Merkle trees here. This is harder for Cassandra given the CF data model, but I suppose we could just hash the serialized CF value.