Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2013/01/03 15:30:13 UTC
[jira] [Updated] (CASSANDRA-4858) Coverage analysis for low-CL queries

     [ https://issues.apache.org/jira/browse/CASSANDRA-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-4858:
----------------------------------------

    Attachment: 4858-v3.txt

I'm afraid the endpoint inclusion as done by v2 is not as efficient as it could be. Consider a 5-node, RF=3, no-DC setup queried at CL.ONE. As it happens, the first endpoint for any given range won't be in the list of endpoints for the next range. So we'll end up merging no ranges and doing 5 range queries, even though 2 would be enough to cover the whole range.

So to minimize the number of ranges queried, I'm pretty sure the best option is, for a given range, to consider the intersection of its endpoints with those of the next range, and to merge the two ranges when that intersection can still satisfy the consistency level. I'm attaching a v3 patch that implements what I have in mind.
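
To make the intersection idea concrete, here is a toy sketch (not the code in 4858-v3.txt). It walks consecutive ranges, keeps the running intersection of their endpoints, and keeps merging while that intersection still holds at least `block_for` nodes. The names `simple_ring`, `merge_ranges` and `block_for` are made up for this example, and the ring assumes simple placement (range i is replicated on nodes i, i+1, ...) with every node alive.

```python
def simple_ring(nodes, rf):
    """Endpoints per range for a toy ring: range i lives on nodes i .. i+rf-1."""
    n = len(nodes)
    return [[nodes[(i + k) % n] for k in range(rf)] for i in range(n)]


def merge_ranges(range_endpoints, block_for):
    """Greedily merge consecutive ranges whose common endpoints can meet the CL.

    Returns a list of (range_indexes, shared_endpoints) groups.
    """
    groups = []
    i = 0
    while i < len(range_endpoints):
        indexes = [i]
        shared = list(range_endpoints[i])
        j = i + 1
        while j < len(range_endpoints):
            # Endpoints that serve every range merged so far AND the next one.
            candidate = [e for e in shared if e in range_endpoints[j]]
            if len(candidate) < block_for:
                break  # merging further could no longer satisfy the CL
            shared = candidate
            indexes.append(j)
            j += 1
        groups.append((indexes, shared))
        i = j
    return groups


ring = simple_ring(["A", "B", "C", "D", "E"], rf=3)
for name, block_for in (("ONE", 1), ("QUORUM", 2)):
    groups = merge_ranges(ring, block_for)
    print(name, len(groups), "merged ranges,",
          len(groups) * block_for, "replica requests")
```

On this toy ring the sketch produces 2 merged ranges at ONE and 3 at QUORUM, that is 2 and 6 replica requests if each merged range is sent to exactly `block_for` replicas. A list of 256 ranges that all share a single endpoint collapses into one group.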

I note that this v3 pulls the logic that computes whether a list of live endpoints can fulfill a given consistency level out of ReadCallback and WriteResponseHandler and into the ConsistencyLevel class. The reason is that the patch needs that logic before the ReadCallback has been created. But I think this is a good refactor, as this logic belongs in ConsistencyLevel anyway.
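
A minimal sketch of what "can these live endpoints fulfill the consistency level" amounts to, written in Python rather than Java for brevity. The method names `block_for` and `is_sufficient_live_nodes` are illustrative and not necessarily the signatures used in the patch, and the datacenter-aware levels (LOCAL_QUORUM, EACH_QUORUM) and ANY are deliberately left out.

```python
from enum import Enum


class ConsistencyLevel(Enum):
    ONE = "ONE"
    TWO = "TWO"
    THREE = "THREE"
    QUORUM = "QUORUM"
    ALL = "ALL"

    def block_for(self, replication_factor):
        """Number of replicas that must respond for this level."""
        if self is ConsistencyLevel.QUORUM:
            return replication_factor // 2 + 1
        if self is ConsistencyLevel.ALL:
            return replication_factor
        return {"ONE": 1, "TWO": 2, "THREE": 3}[self.value]

    def is_sufficient_live_nodes(self, replication_factor, live_endpoints):
        """True if the given live endpoints are enough to satisfy this level."""
        return len(live_endpoints) >= self.block_for(replication_factor)


# With RF=3, QUORUM needs 2 replicas.
print(ConsistencyLevel.QUORUM.is_sufficient_live_nodes(3, ["B", "C"]))  # True
print(ConsistencyLevel.QUORUM.is_sufficient_live_nodes(3, ["C"]))       # False
```

Keeping the check on the consistency level itself means it can be called with nothing more than a replication factor and a list of endpoints, before any callback or response handler exists.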

This made me realise there is a complication however, which is that we probably need to take datacenters and maybe even the endpoint latency scores into account. Say a range has replicas [A, B], the next range has replicas [B, C], and CL == ONE. You could merge both ranges and send the request to B, but if, say, B is in a remote datacenter while A and C are in the local one, maybe doing 2 queries to A and C would actually be better. Same if B is local but very, very slow. To try to handle that, the v3 patch moves that decision to the snitch, and the default implementation considers only endpoints in the local DC in the intersection of endpoints used to decide whether we can/should merge two consecutive ranges. We could then have the dynamic snitch do something special, like not considering endpoints with a very bad latency score when computing the intersection, but I haven't implemented that yet, because it's unclear to me where to draw the limit.
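
The [A, B] / [B, C] example can be sketched as follows. This is a simplified illustration of the local-DC filtering described above, not the snitch API from the patch: `mergeable_endpoints`, `should_merge`, the `datacenter_of` mapping and the DC names are all hypothetical.

```python
def mergeable_endpoints(current, following, datacenter_of, local_dc):
    """Endpoints shared by two consecutive ranges, restricted to the local DC."""
    return [e for e in current
            if e in following and datacenter_of[e] == local_dc]


def should_merge(current, following, datacenter_of, local_dc, block_for):
    """Merge only if enough local-DC endpoints serve both ranges."""
    shared_local = mergeable_endpoints(current, following, datacenter_of, local_dc)
    return len(shared_local) >= block_for


# B is the only common replica, but it sits in the remote datacenter.
datacenter_of = {"A": "DC1", "B": "DC2", "C": "DC1"}
print(should_merge(["A", "B"], ["B", "C"], datacenter_of, "DC1", block_for=1))  # False

# If B were local, merging and querying B alone would be acceptable at CL.ONE.
datacenter_of["B"] = "DC1"
print(should_merge(["A", "B"], ["B", "C"], datacenter_of, "DC1", block_for=1))  # True
```

A latency-aware variant would add one more filter inside `mergeable_endpoints`, which is exactly where the open question of the cutoff would have to be answered.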

I've done a few quick tests with this patch. For a 5-node, RF=3, no-DC setup, without the patch we query 5 ranges at CL.ONE and 10 at CL.QUORUM to cover the full ring (SELECT * FROM foo). With the patch, we query 2 ranges at CL.ONE and 6 at CL.QUORUM. And as expected, in the vnodes case in a single-node setup, the same SELECT * requires only 1 internal query instead of 256.

                
> Coverage analysis for low-CL queries
> ------------------------------------
>
>                 Key: CASSANDRA-4858
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4858
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Vijay
>             Fix For: 1.2.1
>
>         Attachments: 0001-CASSANDRA-4858.patch, 0001-CASSANDRA-4858-v2.patch, 4858-v3.txt
>
>
> There are many cases where getRangeSlice creates more
> RangeSliceCommand than it should, because it always creates one for each range
> returned by getRestrictedRange.  Especially for CL.ONE this does not take
> the replication factor into account and is potentially pretty wasteful.
> A range slice at CL.ONE on a 3 node cluster with RF=3 should only
> ever create one RangeSliceCommand.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira