Posted to commits@cassandra.apache.org by "Jeff Jirsa (JIRA)" <ji...@apache.org> on 2015/02/06 00:24:37 UTC

[jira] [Comment Edited] (CASSANDRA-8703) incremental repair vs. bitrot

    [ https://issues.apache.org/jira/browse/CASSANDRA-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308231#comment-14308231 ] 

Jeff Jirsa edited comment on CASSANDRA-8703 at 2/5/15 11:24 PM:
----------------------------------------------------------------

I've got a version at https://github.com/jeffjirsa/cassandra/commits/cassandra-8703 that follows the scrub read path and implements nodetool verify / sstableverify. It works for both compressed and uncompressed sstables, but it walks the entire sstable and verifies each on-disk atom, so it is thorough but not very fast.

The faster method will be to check against the Digest.sha1 file (which actually contains an adler32 hash) and skip the full iteration. I'll rebase and work that in, keeping the 'walk all atoms' approach above as an optional extended verify (-e or similar), unless someone objects. I'm also going to rename the DIGEST sstable component to Digest.adler32, since it's definitely not sha1 anymore.
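
To illustrate the digest check described above, here is a minimal Java sketch, not the code from the branch. It assumes the digest component stores the adler32 of the whole data file as decimal text; that format, the class name DigestCheck, and the file names are assumptions made for illustration only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.Adler32;

// Hypothetical helper, not part of Cassandra.
public class DigestCheck {

    // Streams the file through Adler32 and returns the checksum value.
    public static long adler32Of(Path file) throws IOException {
        Adler32 checksum = new Adler32();
        byte[] buffer = new byte[64 * 1024];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                checksum.update(buffer, 0, read);
            }
        }
        return checksum.getValue();
    }

    // ASSUMPTION: the digest file holds the checksum as decimal text.
    public static boolean matchesDigest(Path dataFile, Path digestFile) throws IOException {
        String stored = new String(Files.readAllBytes(digestFile), StandardCharsets.UTF_8).trim();
        return Long.parseLong(stored) == adler32Of(dataFile);
    }

    public static void main(String[] args) throws IOException {
        Path data = Files.createTempFile("demo-Data", ".db");
        Path digest = Files.createTempFile("demo-Digest", ".adler32");
        Files.write(data, "some sstable bytes".getBytes(StandardCharsets.UTF_8));
        Files.write(digest, Long.toString(adler32Of(data)).getBytes(StandardCharsets.UTF_8));
        System.out.println("intact: " + matchesDigest(data, digest));

        // Simulate bitrot by flipping a single bit in the data file.
        byte[] bytes = Files.readAllBytes(data);
        bytes[0] ^= 0x01;
        Files.write(data, bytes);
        System.out.println("after bit flip: " + matchesDigest(data, digest));
    }
}
```

The check is a single sequential read of the file with no deserialization, which is why it should be much cheaper than walking every atom. The trade-off is that a mismatch only says the file changed somewhere, not which row is damaged.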


> incremental repair vs. bitrot
> -----------------------------
>
>                 Key: CASSANDRA-8703
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8703
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Robert Coli
>            Assignee: Jeff Jirsa
>
> Incremental repair is a great improvement in Cassandra, but it lacks a feature that non-incremental repair has: protection against bitrot.
> Scenario:
> 1) repair SSTable, marking it repaired
> 2) cosmic ray hits hard drive, corrupting a record in SSTable
> 3) range is actually unrepaired, but the node still considers it repaired as of the time that SSTable was repaired
> From my understanding, if bitrot is detected (via, e.g., the CRC on the read path) then all SSTables containing the corrupted range need to be marked unrepaired on all replicas. Per marcuse@IRC, the naive/simplest response would be to just trigger a full repair in this case.
> I am concerned about incremental repair as an operational default while it does not handle this case. As an aside, this would also seem to require a new CRC on the uncompressed read path, as otherwise one cannot detect the corruption without periodic checksumming of SSTables. Alternatively, a "nodetool checksum" function that verifies table checksums, marks ranges unrepaired on failure, and can be run every gc_grace_seconds would seem to meet the requirement.


