Posted to commits@cassandra.apache.org by "Jonathan Ellis (Created) (JIRA)" <ji...@apache.org> on 2011/10/15 00:42:11 UTC

[jira] [Created] (CASSANDRA-3362) allow sub-row repair

allow sub-row repair
--------------------

                 Key: CASSANDRA-3362
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3362
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Jonathan Ellis
            Assignee: Sylvain Lebresne
            Priority: Minor


With large rows, it would be nice to not have to send an entire row if a small part is out of sync.  Could we use the row index blocks as repair atoms instead of the full row?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3362) allow sub-row repair

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151199#comment-13151199 ] 

Sylvain Lebresne commented on CASSANDRA-3362:
---------------------------------------------

Right now we use token ranges to build the Merkle tree, so we cannot repair anything smaller than a token (that is, a whole row) without major changes to repair.
Besides, repair needs to use the same "atoms" on every node it repairs, so I don't think the row index blocks would qualify, since they differ from node to node.

Overall, I don't see how that can be done with the current repair.
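
To illustrate why per-node index blocks make poor repair atoms, here is a minimal Python sketch. It assumes, for illustration only, that each node cuts a row into blocks of a fixed number of columns; the function name index_blocks and the sample data are hypothetical and not taken from Cassandra.

```python
def index_blocks(column_names, block_size):
    """Cut one node's sorted column names into consecutive blocks.

    Returns (first_name, last_name) for each block. The fixed column count
    per block is an assumption made for this sketch.
    """
    names = sorted(column_names)
    chunks = (names[i:i + block_size] for i in range(0, len(names), block_size))
    return [(chunk[0], chunk[-1]) for chunk in chunks]


# Two replicas of the same row; replica B is missing column 2.
node_a = list(range(12))
node_b = [c for c in range(12) if c != 2]

blocks_a = index_blocks(node_a, 4)
blocks_b = index_blocks(node_b, 4)

# A single missing column shifts every later block boundary, so the two
# nodes end up with no block boundaries in common.
shared = set(blocks_a) & set(blocks_b)
```

In this toy case the two nodes share no block at all, so hashing block by block would report the whole row as different even though only one column is missing.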
                
> allow sub-row repair
> --------------------
>
>                 Key: CASSANDRA-3362
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3362
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>              Labels: repair
>
> With large rows, it would be nice to not have to send an entire row if a small part is out of sync.  Could we use the row index blocks as repair atoms instead of the full row?


        

[jira] [Updated] (CASSANDRA-3362) allow sub-row repair

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-3362:
----------------------------------------

    Assignee:     (was: Sylvain Lebresne)
    
> allow sub-row repair
> --------------------
>
>                 Key: CASSANDRA-3362
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3362
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>              Labels: repair
>
> With large rows, it would be nice to not have to send an entire row if a small part is out of sync.  Could we use the row index blocks as repair atoms instead of the full row?


        

[jira] [Commented] (CASSANDRA-3362) allow sub-row repair

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151353#comment-13151353 ] 

Sylvain Lebresne commented on CASSANDRA-3362:
---------------------------------------------

For the record, I'm certainly not claiming that it's a good thing. I would even add that it would also be a problem if CASSANDRA-1684 goes in.
                
> allow sub-row repair
> --------------------
>
>                 Key: CASSANDRA-3362
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3362
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>              Labels: repair
>
> With large rows, it would be nice to not have to send an entire row if a small part is out of sync.  Could we use the row index blocks as repair atoms instead of the full row?


        

[jira] [Commented] (CASSANDRA-3362) allow sub-row repair

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426958#comment-13426958 ] 

Jonathan Ellis commented on CASSANDRA-3362:
-------------------------------------------

Notes from chat:

When we repair [0, 1000], we agree on some level for the Merkle tree, say 2, and we say the Merkle tree leaves will be [0, 250], [250, 500], [500, 750], [750, 1000].
Then each node calculates the hash for those leaves based on its keys, and we compare.
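
A minimal Python sketch of that comparison follows. The function names, the use of SHA-256, and the choice to treat each leaf as the half-open range (lo, hi] are assumptions for illustration, not the actual repair code.

```python
import hashlib


def leaf_ranges(lo, hi, depth):
    """Split [lo, hi] into 2**depth equal-width leaf ranges."""
    n = 2 ** depth
    width = (hi - lo) // n
    return [(lo + i * width, lo + (i + 1) * width) for i in range(n)]


def leaf_hashes(rows, ranges):
    """Hash the rows falling in each leaf, in token order.

    rows maps token -> row bytes. Each leaf is treated as (lo, hi] so that a
    token sitting on a boundary is counted in exactly one leaf (assumption).
    """
    hashes = []
    for lo, hi in ranges:
        digest = hashlib.sha256()
        for token in sorted(rows):
            if lo < token <= hi:
                digest.update(str(token).encode())
                digest.update(rows[token])
        hashes.append(digest.hexdigest())
    return hashes


def differing_ranges(rows_a, rows_b, ranges):
    """Return the leaf ranges whose hashes differ between two nodes."""
    pairs = zip(ranges, leaf_hashes(rows_a, ranges), leaf_hashes(rows_b, ranges))
    return [rng for rng, ha, hb in pairs if ha != hb]
```

With a single leaf out of sync, everything in that leaf's token range gets streamed, which is the cost this ticket is about when one of those rows is very wide.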

We could make it a two-step process, where everyone starts with the power-of-2 tree, but then A can say "I have row 10 with a billion columns, let's subdivide [0, 250] into [0, (10, 500000000)] and [(10, 500000000), 250]."

The drawback then is that you will do a first validation pass to agree on the subdivisions, then another to compute the actual hashes.
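
One way to picture the subdivision is to treat every boundary as a (token, column) position and compare positions lexicographically. The Python sketch below does that; the END sentinel, the (start, end] convention, and the function names are hypothetical choices for illustration.

```python
import math

# Sentinel column meaning "after the last column of this row" (assumption).
END = math.inf


def subdivide(leaf, split):
    """Split a leaf (start, end] at `split`; all values are (token, column)."""
    start, end = leaf
    if not (start < split < end):
        raise ValueError("split point must fall strictly inside the leaf")
    return [(start, split), (split, end)]


def contains(rng, pos):
    """True if position `pos` falls inside the range (start, end]."""
    start, end = rng
    return start < pos <= end


# The token leaf (0, 250] expressed as positions, then split inside row 10.
leaf = ((0, END), (250, END))
left, right = subdivide(leaf, (10, 500_000_000))
```

Because the split point is agreed between the nodes rather than derived from local storage layout, every replica hashes the same two sub-ranges.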

Or, we could first do a merkle tree as we do now, then for the ranges that differ, if we know they cover lots of columns (which can be computed easily initially), we could compute smaller hash ranges before streaming anything.  You'd still read everything twice in the worst case, but if most rows are small then you don't need to read much the second time.
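
The second pass could look roughly like the Python sketch below, run only on a wide row that the first pass flagged as different. It assumes all replicas agree on a list of column-name split points; bucket_hashes, buckets_to_stream, and the SHA-256 choice are hypothetical.

```python
import hashlib
from bisect import bisect_left


def bucket_hashes(columns, split_points):
    """Hash a row's columns per bucket.

    columns maps column name -> value bytes. split_points is a sorted list of
    column names agreed on by all replicas (assumption). Bucket i holds names
    in (split_points[i-1], split_points[i]]; the last bucket is open-ended.
    """
    digests = [hashlib.sha256() for _ in range(len(split_points) + 1)]
    for name in sorted(columns):
        i = bisect_left(split_points, name)
        digests[i].update(repr(name).encode())
        digests[i].update(columns[name])
    return [d.hexdigest() for d in digests]


def buckets_to_stream(cols_a, cols_b, split_points):
    """Return the indexes of buckets whose hashes differ between two nodes."""
    ha = bucket_hashes(cols_a, split_points)
    hb = bucket_hashes(cols_b, split_points)
    return [i for i, (x, y) in enumerate(zip(ha, hb)) if x != y]
```

Only the buckets returned by buckets_to_stream would need to be streamed, instead of the whole row.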

In the meantime, sharding your huge rows at the application level will work better.
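
As a rough illustration of application-level sharding, the Python sketch below spreads one logical wide row over several physical row keys. The key format, the CRC32 hash, and the function names are assumptions; a time-based bucket would work equally well and keeps columns ordered within each shard.

```python
import zlib


def shard_key(row_key, column_name, num_shards):
    """Pick the physical row key that stores a given column."""
    shard = zlib.crc32(column_name.encode()) % num_shards
    return f"{row_key}:{shard}"


def all_shard_keys(row_key, num_shards):
    """List every physical row key to query when reading the logical row."""
    return [f"{row_key}:{i}" for i in range(num_shards)]
```

Each physical row is then small enough that re-streaming one of them during repair is cheap, at the cost of reading several rows to reassemble the logical one.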
                
> allow sub-row repair
> --------------------
>
>                 Key: CASSANDRA-3362
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3362
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>              Labels: repair
>
> With large rows, it would be nice to not have to send an entire row if a small part is out of sync.  Could we use the row index blocks as repair atoms instead of the full row?


        

[jira] [Commented] (CASSANDRA-3362) allow sub-row repair

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267326#comment-13267326 ] 

Sylvain Lebresne commented on CASSANDRA-3362:
---------------------------------------------

Un-assigning myself for now as I have no idea how to do that. As I said previously, I'm skeptical that our current repair is compatible with this, so IMO this ticket amounts to redoing repair pretty much completely.
                
> allow sub-row repair
> --------------------
>
>                 Key: CASSANDRA-3362
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3362
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>              Labels: repair
>
> With large rows, it would be nice to not have to send an entire row if a small part is out of sync.  Could we use the row index blocks as repair atoms instead of the full row?


        

[jira] [Updated] (CASSANDRA-3362) allow sub-row repair

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3362:
--------------------------------------

    Priority: Major  (was: Minor)
    
> allow sub-row repair
> --------------------
>
>                 Key: CASSANDRA-3362
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3362
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Sylvain Lebresne
>              Labels: repair
>
> With large rows, it would be nice to not have to send an entire row if a small part is out of sync.  Could we use the row index blocks as repair atoms instead of the full row?


        

[jira] [Commented] (CASSANDRA-3362) allow sub-row repair

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151320#comment-13151320 ] 

Jonathan Ellis commented on CASSANDRA-3362:
-------------------------------------------

That's a pretty big ouch for wide-row data models.  If you're doing tens of appends per second to one of those, the odds are pretty good that your merkle trees will be out of sync at any given instant, and you end up streaming the entire row.
                
> allow sub-row repair
> --------------------
>
>                 Key: CASSANDRA-3362
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3362
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>              Labels: repair
>
> With large rows, it would be nice to not have to send an entire row if a small part is out of sync.  Could we use the row index blocks as repair atoms instead of the full row?
