Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2012/08/02 00:21:04 UTC

[jira] [Commented] (CASSANDRA-3362) allow sub-row repair

    [ https://issues.apache.org/jira/browse/CASSANDRA-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426958#comment-13426958 ] 

Jonathan Ellis commented on CASSANDRA-3362:
-------------------------------------------

Notes from chat:

When we repair [0, 1000], we agree on some level for the Merkle tree, say 2, and we say the Merkle tree leaves will be [0, 250], [250, 500], [500, 750], [750, 1000].
Then each node calculates the hash for those leaves based on its keys, and we compare.
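
A minimal sketch of that fixed-depth comparison, not the actual Cassandra code. The function names, the dict-of-bytes row model, the use of SHA-256, and the half-open [start, end) leaf boundaries are all assumptions made for illustration.

```python
import hashlib

def leaf_ranges(lo, hi, depth):
    """Split [lo, hi) into 2**depth contiguous leaf ranges (hypothetical helper)."""
    n = 2 ** depth
    bounds = [lo + (hi - lo) * i // n for i in range(n + 1)]
    return list(zip(bounds, bounds[1:]))

def leaf_hashes(rows, lo, hi, depth):
    """Hash the keys and values falling in each leaf; rows maps int key -> bytes."""
    hashes = []
    for start, end in leaf_ranges(lo, hi, depth):
        h = hashlib.sha256()
        for key in sorted(k for k in rows if start <= k < end):
            h.update(str(key).encode())
            h.update(rows[key])
        hashes.append(h.hexdigest())
    return hashes

def differing_ranges(rows_a, rows_b, lo, hi, depth):
    """Return the leaf ranges whose hashes disagree between two replicas."""
    ranges = leaf_ranges(lo, hi, depth)
    hashes_a = leaf_hashes(rows_a, lo, hi, depth)
    hashes_b = leaf_hashes(rows_b, lo, hi, depth)
    return [r for r, x, y in zip(ranges, hashes_a, hashes_b) if x != y]

# Two replicas that disagree only on key 300.
replica_a = {10: b"x", 300: b"y", 800: b"z"}
replica_b = {10: b"x", 300: b"y2", 800: b"z"}
print(differing_ranges(replica_a, replica_b, 0, 1000, 2))  # [(250, 500)]
```

Note that the whole leaf [250, 500) is flagged even though a single key differs, which is why a very large row inside one leaf makes repair expensive.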

We could make it a two-step process, where everyone starts with the power-of-2 tree, but then A can say "I have row 10 with a billion columns, let's subdivide [0, 250] into [0, (10, 500000000)] and [(10, 500000000), 250]."
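
One way to picture that subdivision, as a rough sketch only: treat range endpoints as (row, column) positions, so a leaf can be cut in the middle of a wide row. The `subdivide` function, the `column_counts` map, and the `threshold` parameter are hypothetical and not part of any Cassandra API.

```python
def subdivide(start, end, column_counts, threshold):
    """Split a leaf at the midpoint of the first row wider than `threshold`.

    start and end are (row, column) positions; tuples compare
    lexicographically, which orders positions by row and then by column.
    column_counts maps row key -> number of columns in that row.
    """
    for row in sorted(column_counts):
        n = column_counts[row]
        if start[0] <= row < end[0] and n > threshold:
            mid = (row, n // 2)
            return [(start, mid), (mid, end)]
    return [(start, end)]  # no wide row: keep the leaf as it is

# Node A has row 10 with a billion columns inside the leaf [0, 250].
counts = {10: 1_000_000_000, 42: 3}
print(subdivide((0, 0), (250, 0), counts, threshold=1_000_000))
# [((0, 0), (10, 500000000)), ((10, 500000000), (250, 0))]
```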

The drawback then is that you will do a first validation pass to agree on the subdivisions, then another to compute the actual hashes.

Or, we could first do a merkle tree as we do now, then for the ranges that differ, if we know they cover lots of columns (which can be computed easily initially), we could compute smaller hash ranges before streaming anything.  You'd still read everything twice in the worst case, but if most rows are small then you don't need to read much the second time.
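
The second variant could look roughly like this. It is a sketch under simplifying assumptions: cells are keyed by a single integer position (flattening row and column into one number), and `two_pass`, `coarse_parts`, `fine_parts`, and `big_threshold` are made-up names for illustration.

```python
import hashlib

def digest(cells):
    """Hash a sorted list of (position, value) pairs."""
    h = hashlib.sha256()
    for pos, value in cells:
        h.update(repr(pos).encode())
        h.update(value)
    return h.hexdigest()

def split(start, end, parts):
    """Split [start, end) into `parts` contiguous pieces."""
    bounds = [start + (end - start) * i // parts for i in range(parts + 1)]
    return list(zip(bounds, bounds[1:]))

def mismatches(a, b, ranges):
    """Return the ranges whose contents hash differently on a and b."""
    out = []
    for s, e in ranges:
        cells_a = sorted((k, v) for k, v in a.items() if s <= k < e)
        cells_b = sorted((k, v) for k, v in b.items() if s <= k < e)
        if digest(cells_a) != digest(cells_b):
            out.append((s, e))
    return out

def two_pass(a, b, lo, hi, coarse_parts, fine_parts, big_threshold):
    """First pass: coarse tree. Second pass: re-hash only large differing ranges."""
    to_stream = []
    for s, e in mismatches(a, b, split(lo, hi, coarse_parts)):
        size = sum(1 for k in a if s <= k < e)  # cheap to count in the first pass
        if size > big_threshold:
            to_stream.extend(mismatches(a, b, split(s, e, fine_parts)))
        else:
            to_stream.append((s, e))  # small range: stream it whole
    return to_stream

a = {i: b"v" for i in range(250)}
a[600] = b"v"
b = dict(a)
b[37] = b"changed"   # one differing cell inside a large range
b[600] = b"other"    # one differing cell inside a small range
print(two_pass(a, b, 0, 1000, 4, 5, 100))  # [(0, 50), (500, 750)]
```

Only the large differing range is read a second time; the small one is streamed as a whole, which matches the point that little needs re-reading when most rows are small.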

In the meantime, if you can shard your huge rows at the app level instead, that will work better.
                
> allow sub-row repair
> --------------------
>
>                 Key: CASSANDRA-3362
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3362
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>              Labels: repair
>
> With large rows, it would be nice to not have to send an entire row if a small part is out of sync.  Could we use the row index blocks as repair atoms instead of the full row?
