Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2012/07/26 11:44:35 UTC

[jira] [Commented] (CASSANDRA-1337) parallelize fetching rows for low-cardinality indexes

    [ https://issues.apache.org/jira/browse/CASSANDRA-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422986#comment-13422986 ] 

Sylvain Lebresne commented on CASSANDRA-1337:
---------------------------------------------

The patch moves the check for "do we have enough rows for the query" inside the "don't use the local path for that range" branch, which is broken (as in "uselessly inefficient"), because it means that even if we do have enough rows locally, we will still check another range.
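To illustrate the point, a minimal sketch of where the "enough rows" check should live: at the top of the range loop, before fetching the next range, so that a satisfied query never touches another range. All names here (fetch_rows, fetch_range) are illustrative, not Cassandra code.

```python
def fetch_rows(ranges, max_results, fetch_range):
    """Fetch rows range by range until max_results is satisfied.

    fetch_range(rng) is a hypothetical callback that returns the rows
    for one range; in Cassandra this would be the per-range scan.
    """
    rows = []
    for rng in ranges:
        # Check *before* scanning the next range: if what we already
        # gathered satisfies the query, stop here instead of issuing
        # one more (useless) range query.
        if len(rows) >= max_results:
            break
        rows.extend(fetch_range(rng))
    return rows[:max_results]
```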

But probably more importantly, I'm confused about how this patch works.

First, it estimates how much the query of *a range* is likely to yield based on the *total* number of keys on the node. We should at least divide this by the replication factor to get a proper estimate of the number of keys per range.
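In other words, since a node holds roughly RF ranges' worth of data, the per-range estimate should be something like the following sketch (names are illustrative, not actual Cassandra identifiers):

```python
def estimated_keys_per_range(total_keys_on_node, replication_factor):
    """Rough per-range key estimate.

    A node replicates data for ~replication_factor ranges, so the keys
    expected in any single range are the node's total divided by RF.
    """
    return total_keys_on_node / replication_factor
```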

Second, it seems to correctly parallelize "old-style" range_slice (i.e. range_slice for thrift without any IndexExpression), but I think it correctly handles neither secondary indexes (which I believe was the main goal of the patch) nor any kind of range slice when CQL3 is involved:
* For secondary index queries, the number of rows returned doesn't depend at all on the number of row keys the node holds (and thus it follows that estimating the number of parallel queries to do based on that parameter is broken); it depends on how many columns the row for the most selective index contains. So for the concurrencyFactor we should 1) figure out which index will be used and 2) probably use the estimated mean columns count for that index.
* For CQL3 queries, which use the maxIsColumns parameter (for both traditional *and* 2ndary index queries), the number of rows returned shouldn't be directly compared to maxResults (i.e. if the first row we find has enough columns to satisfy maxResults, we're done). In that case it unfortunately becomes more complicated to predict how many results a query might yield in general, because this depends on the column filter. I.e. if the filter is a name filter, or an "identity" slice filter (as in IdentitySliceFilter), we can attempt an estimate (in the latter case, something like maxResults / (estimatedKeys * meanColumnsCountPerKey)), but for other kinds of slice filters I don't think we can estimate much. That being said, it might still be worth using the estimate in those two cases, because at least for 2ndary index queries the column filter will very often be an "identity" slice. And in that "identity" slice case, for 2ndary index and CQL3, the estimate should be something like maxResults / (meanColumnCountPerKey(index used) * meanColumnCountPerKey(parent cf)).
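A sketch of the "identity" slice estimate above, turned into a concurrency factor (how many ranges to query in parallel). Everything here is a hypothetical illustration, assuming each key in a range contributes its mean column count to the results:

```python
import math

def ranges_to_query(max_results, estimated_keys_per_range, mean_columns_per_key):
    """Estimate how many ranges must be scanned to gather max_results.

    With an "identity" slice filter, one range yields roughly
    estimated_keys_per_range * mean_columns_per_key results, so we
    divide and round up; always query at least one range.
    """
    results_per_range = estimated_keys_per_range * mean_columns_per_key
    return max(1, math.ceil(max_results / results_per_range))
```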

I'll note however that in both cases, the meanColumnsCount is not necessarily the perfect estimate to use, as it pretty much implies that half the time we will query one more range than necessary. Instead we could either use the maxColumnsCount (if we really want to be conservative) or add some fudge factor to the mean; that fudge factor could perhaps be based on the difference between the min, mean, and max columns count estimates.
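One possible shape for such a fudge factor, as a hypothetical sketch: pad the mean toward the max by some fraction of the spread, so that underestimation (and the resulting extra round of queries) becomes less likely.

```python
def padded_columns_estimate(mean_count, max_count, fudge=0.5):
    """Pad the mean columns count toward the max by `fudge` of the spread.

    fudge=0.0 reproduces the plain mean (optimistic, wrong half the time);
    fudge=1.0 reproduces the max (fully conservative).
    """
    return mean_count + fudge * (max_count - mean_count)
```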
                
> parallelize fetching rows for low-cardinality indexes
> -----------------------------------------------------
>
>                 Key: CASSANDRA-1337
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1337
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: David Alves
>            Priority: Minor
>             Fix For: 1.2
>
>         Attachments: 0001-CASSANDRA-1337-scan-concurrently-depending-on-num-rows.txt, CASSANDRA-1337.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> currently, we read the indexed rows from the first node (in partitioner order); if that does not have enough matching rows, we read the rows from the next, and so forth.
> we should use the statistics from CASSANDRA-1155 to query multiple nodes in parallel, such that we have a high chance of getting enough rows w/o having to do another round of queries (but, if our estimate is incorrect, we do need to loop and do more rounds until we have enough data or we have fetched from each node).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira