Posted to commits@cassandra.apache.org by "Sandeep Tata (JIRA)" <ji...@apache.org> on 2009/06/12 23:35:07 UTC
[jira] Issue Comment Edited: (CASSANDRA-225) Support mastered writes

    [ https://issues.apache.org/jira/browse/CASSANDRA-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12718984#action_12718984 ] 

Sandeep Tata edited comment on CASSANDRA-225 at 6/12/09 2:33 PM:
-----------------------------------------------------------------

Okay, this is an ugly first cut, but I want to put it out there so you guys have a chance to comment on the design as I hack up the rest of this feature.

Basic idea --- calls that go to the primary (currently defined as the first endpoint in the list) are applied locally and then sent asynchronously to the other replicas. If the node that receives a call is not the primary, it forwards the request to the primary and waits for a response before acking.
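
A rough sketch of that routing, written in Python purely to illustrate the flow. The names here (Node, apply_local, mastered_write) are made up for the example and are not taken from 225.patch; a direct function call stands in for the forwarding RPC, and a thread pool stands in for the asynchronous sends.

```python
from concurrent.futures import ThreadPoolExecutor, wait


class Node:
    """Hypothetical stand-in for a Cassandra node; not a class from 225.patch."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def apply_local(self, key, value):
        self.data[key] = value


def mastered_write(receiver, endpoints, key, value, pool):
    """Route a write through the primary (the first endpoint in the list).

    Returns the futures for the asynchronous replica sends so a caller can
    wait on them if it wants to; the ack itself does not depend on them.
    """
    primary = endpoints[0]
    if receiver is not primary:
        # Non-primary: forward to the primary and block on its response
        # before acking. A direct call stands in for the RPC round trip.
        return mastered_write(primary, endpoints, key, value, pool)
    # Primary: apply locally first, then send to the other replicas
    # without waiting for them.
    primary.apply_local(key, value)
    return [pool.submit(r.apply_local, key, value) for r in endpoints[1:]]


if __name__ == "__main__":
    a, b, c = Node("a"), Node("b"), Node("c")
    with ThreadPoolExecutor(max_workers=2) as pool:
        # The write arrives at b, which is not the primary, so it is forwarded to a.
        pending = mastered_write(b, [a, b, c], "row1", "v1", pool)
        print(a.data)  # the primary has the write as soon as the call returns
        wait(pending)
    print(b.data, c.data)
```

Note that the sketch has no failure handling at all, which matches the state of this first cut.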

1. I didn't add a whole bunch of calls to the interface -- I stole the block=1 value to mean mastered writes for now. This is not unreasonable, since even non-blocking writes give you read-your-writes semantics, so block=1 doesn't really mean much right now. Of course, this is not clean, and I expect to change it once we're happy to expose this in the interface. You need to turn on "MasteredUpdatesForBlockOne" in the conf file to use it.

2. This does not (yet) work in the presence of failures. Some failure scenarios can lead to a state where two nodes both think they're the "master". The easiest way to solve this is a safe leader-election algorithm built on ZooKeeper. That'll have to be in round 2 of the patch.

Of course, if you don't turn on MasteredUpdatesForBlockOne, you never touch this code path.
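
To make the gating concrete, here is a minimal sketch in Python. It assumes the conf file has already been parsed into a dict of booleans; the function name use_mastered_path is hypothetical, and the real check in the patch may look different.

```python
def use_mastered_path(block, conf):
    """Return True only when a write should take the mastered-write path.

    Assumption: `conf` is the parsed conf file as a dict of booleans.
    With the flag off, or with any block value other than 1, the write
    stays on the existing code path.
    """
    return block == 1 and bool(conf.get("MasteredUpdatesForBlockOne", False))


if __name__ == "__main__":
    conf = {"MasteredUpdatesForBlockOne": True}
    for block in (0, 1, 2):
        print(block, use_mastered_path(block, conf))
```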



      was (Author: sandeep_tata):
    Okay, this is an ugly first cut, but I want to put it out there so you guys have a chance to comment on the design as I hack up the rest of this feature.

Basic idea --- calls that go to the primary (currently defined as the first endpoint in the list) are applied locally and then sent asynchronously to the other replicas. If the node that receives a call is not the primary, it forwards the request to the primary and waits for a response before acking.

1. I didn't add a whole bunch of calls to the interface -- I stole the block=1 value to mean mastered writes for now. This is not unreasonable, since even non-blocking writes give you read-your-writes semantics, so block=1 doesn't really mean much right now. Of course, this is not clean, and I expect to change it once we're happy to expose this in the interface. You need to turn on "UseMasteredWritesForBlockOne" in the conf file to use it.

2. This does not (yet) work in the presence of failures. Some failure scenarios can lead to a state where two nodes both think they're the "master". The easiest way to solve this is a safe leader-election algorithm built on ZooKeeper. That'll have to be in round 2 of the patch.

Of course, if you don't turn on UseMasteredWritesForBlockOne, you never touch this code path.


  
> Support mastered writes
> -----------------------
>
>                 Key: CASSANDRA-225
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-225
>             Project: Cassandra
>          Issue Type: New Feature
>    Affects Versions: 0.4
>         Environment: all
>            Reporter: Sandeep Tata
>            Assignee: Sandeep Tata
>             Fix For: 0.4
>
>         Attachments: 225.patch
>
>
> Writes to a row today can be run on any of the replicas that own the row. An additional set of APIs to perform "mastered" writes that funnel through a primary is important if applications have some operations that require higher consistency. Test-and-set is an example of one such operation that requires a higher consistency guarantee.
> To stay true to Cassandra's performance goals, this should be done in a way that does not compromise performance for apps that can deal with lower consistency and never use these APIs. That said, an app that mixes higher consistency calls with lower consistency calls should be careful that they don't operate on the same data.
