Posted to commits@cassandra.apache.org by "David Strauss (JIRA)" <ji...@apache.org> on 2012/09/18 04:52:07 UTC

[jira] [Commented] (CASSANDRA-2338) C* consistency level needs to be pluggable

    [ https://issues.apache.org/jira/browse/CASSANDRA-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457563#comment-13457563 ] 

David Strauss commented on CASSANDRA-2338:
------------------------------------------

Just chiming in with another use case: first data found. We have a large CF indexed by the hash of what's in the row, so any non-empty result is guaranteed to be identical. The common case is for all replicas to be up-to-date, so CL.ONE is almost ideal. But we also write with CL.ONE, which means read-after-write is only eventually consistent. A pluggable CL would allow something closer to ideal.

The ideal would be:
1. Send read to all replicas.
2. On the first non-NotFoundException response, we're done.
3. If all replicas return NotFoundException, then it truly doesn't exist.

CL.ONE just goes with the NotFoundException if it's the first response. CL.ALL waits for all responses, even if the first one is all we need.
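The three steps above can be simulated in a few lines (the function and parameter names here are illustrative only, not Cassandra internals): return on the first non-empty response, and conclude not-found only after every replica has answered empty.

```python
# Hypothetical simulation of a "first data found" read resolver.
# Responses arrive in order; None models a NotFoundException from a replica.

def first_data_found(responses):
    """Return the first non-empty response, or None if every replica
    reported not-found. Short-circuits as soon as data appears."""
    for value in responses:
        if value is not None:
            return value          # step 2: first real data wins
    return None                   # step 3: all replicas agreed it's absent

# Example: first replica is stale (not found), second has the row.
print(first_data_found(iter([None, "row-bytes", None])))  # row-bytes
```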

This may be useful in the reverse, too:
1. Send read to all replicas.
2. On the first NotFoundException response, we're done.
3. If all replicas return data, then use the one with the latest timestamp.
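The reverse policy can be sketched the same way (again with made-up names): stop on the first not-found, otherwise reconcile by timestamp.

```python
# Hypothetical sketch of the reverse policy: any NotFoundException (None)
# short-circuits; if every replica returns data, the latest write wins.
# (value, ts) tuples stand in for timestamped columns.

def first_not_found(responses):
    """Return None as soon as any replica reports not-found; otherwise
    return the value carrying the highest timestamp."""
    newest = None
    for value, ts in responses:
        if value is None:
            return None                          # step 2: first miss wins
        if newest is None or ts > newest[1]:
            newest = (value, ts)
    return newest[0] if newest else None         # step 3: latest timestamp
```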

I also second Peter's request for controlling to what extent read requests get sent to all replicas. In our case, it's usually okay to wait a bit longer as long as the I/O load lands on only one of the replica boxes. This maps well to the algorithms above in higher-latency scenarios.
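One way to get that trade-off (a sketch under assumed names, not an existing Cassandra API): query a single preferred replica first, and only fan out to the rest when it hasn't answered within a latency budget, so steady-state I/O stays on one box.

```python
# Hypothetical sketch: query one replica first, and fan out to the
# remaining replicas only when the preferred one was too slow.
# `replicas` is a list of zero-arg callables standing in for replica reads;
# `timed_out` simulates whether the preferred replica missed its deadline.

def read_with_fallback(replicas, timed_out):
    """Fast path: answer from the preferred replica alone.
    Slow path: escalate to the remaining replicas one by one."""
    preferred, rest = replicas[0], replicas[1:]
    if not timed_out:
        return preferred()               # common case: one box does the I/O
    for replica in rest:                 # escalate only in the slow path
        result = replica()
        if result is not None:
            return result
    return None
```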
                
> C* consistency level needs to be pluggable
> ------------------------------------------
>
>                 Key: CASSANDRA-2338
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2338
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Matthew F. Dennis
>            Priority: Minor
>
> for cases where people want to run C* across multiple DCs for disaster recovery et cetera, where normal operations only happen in the first DC (e.g. no writes/reads happen in the remote DC under normal operation), neither LOCAL_QUORUM nor EACH_QUORUM really suffices.  
> Consider the case with RF of DC1:3 DC2:2
> LOCAL_QUORUM doesn't provide any guarantee that data is in the remote DC.
> EACH_QUORUM requires that both nodes in the remote DC are up.
> It would be useful in some situations to be able to specify a strategy where LOCAL_QUORUM is used for the local DC plus at least one node in a remote DC (and/or at least one node in *each* remote DC).
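The strategy described in the issue — a local quorum plus at least one remote ack — reduces to a predicate over per-DC ack counts. A minimal sketch (all names hypothetical; this is not Cassandra's ConsistencyLevel code):

```python
# Hypothetical consistency predicate: satisfied once a quorum of the local
# DC's replicas have acked AND at least one remote replica has acked.
# rf maps DC name -> replication factor; acks maps DC name -> acks so far.

def local_quorum_plus_one_remote(local_dc, rf, acks, each_remote=False):
    """With RF DC1:3 DC2:2, this passes with 2 acks in DC1 and 1 in DC2,
    without requiring both DC2 nodes to be up (unlike EACH_QUORUM)."""
    local_quorum = rf[local_dc] // 2 + 1
    if acks.get(local_dc, 0) < local_quorum:
        return False                     # local quorum not yet reached
    remotes = [dc for dc in rf if dc != local_dc]
    if each_remote:
        return all(acks.get(dc, 0) >= 1 for dc in remotes)
    return any(acks.get(dc, 0) >= 1 for dc in remotes)
```

For the RF DC1:3 DC2:2 example above, 2 acks in DC1 plus 1 ack in DC2 satisfies the predicate even with one DC2 node down.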

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira