Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2014/04/02 23:19:15 UTC

[jira] [Updated] (CASSANDRA-6933) Optimise Read Comparison Costs in collectTimeOrderedData

     [ https://issues.apache.org/jira/browse/CASSANDRA-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-6933:
--------------------------------------

    Attachment: 6933-v3.txt

I agree that in the best case this is a good optimization; I'm just not convinced that real-world use cases will much resemble the best case.  In particular, in CollationController the container is guaranteed to have only columns the filter is looking for, so we expect a lot of sequential "runs" of matches when compaction is working well.  On the other hand, once we've found "most" matches and are looking for the last handful, there's no particular reason to expect these last ones to be evenly distributed across the container space.  (Sure, they will be "on average," but the variance is high enough to make that useless as a guideline.)

v3 removes the range heuristic and fixes incrementing i on a hit.

> Optimise Read Comparison Costs in collectTimeOrderedData
> --------------------------------------------------------
>
>                 Key: CASSANDRA-6933
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6933
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.1
>
>         Attachments: 6933-v3.txt
>
>
> Introduce a new SearchIterator construct, obtainable from a ColumnFamily, that permits efficiently iterating over a subset of the cells in ascending order. Essentially, it saves the previously visited position and searches from there, but also tries to avoid searching the whole remaining space where possible.
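
For illustration only, the idea in the description above might be sketched roughly as follows. This is not the code from the attached patch: the class name ForwardSearchIterator, its methods, and the use of a plain sorted array are all assumptions made for the example. It remembers the last visited position, checks the very next element first (cheap when matches come in sequential runs), and otherwise binary-searches only the remaining tail.

```java
import java.util.Arrays;
import java.util.Comparator;

/**
 * Hypothetical sketch of a forward-only search over a sorted array.
 * Not the SearchIterator from the patch; names and structure are assumed.
 */
final class ForwardSearchIterator<T> {
    private final T[] sorted;                    // must already be sorted by comparator
    private final Comparator<? super T> comparator;
    private int i = 0;                           // first index not yet ruled out

    ForwardSearchIterator(T[] sorted, Comparator<? super T> comparator) {
        this.sorted = sorted;
        this.comparator = comparator;
    }

    boolean hasNext() {
        return i < sorted.length;
    }

    /**
     * Looks up key, which callers must supply in ascending order.
     * Returns the matching element, or null if it is absent.
     */
    T next(T key) {
        if (i >= sorted.length)
            return null;

        // Fast path: in a sequential run the match is the very next element.
        int c = comparator.compare(key, sorted[i]);
        if (c == 0)
            return sorted[i++];                  // advance past the hit
        if (c < 0)
            return null;                         // key sorts before everything left

        // Slow path: binary search over the remaining tail only.
        int found = Arrays.binarySearch(sorted, i + 1, sorted.length, key, comparator);
        if (found >= 0) {
            i = found + 1;                       // advance past the hit
            return sorted[found];
        }
        i = -found - 1;                          // resume from the insertion point
        return null;
    }

    public static void main(String[] args) {
        Integer[] cells = {1, 3, 4, 5, 9, 12};
        ForwardSearchIterator<Integer> it =
            new ForwardSearchIterator<>(cells, Comparator.<Integer>naturalOrder());
        for (int key : new int[] {3, 4, 7, 9, 12})
            System.out.println(key + " -> " + it.next(key));
    }
}
```

Note that the position is advanced past a hit, so the same element is never compared again on the next lookup.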



--
This message was sent by Atlassian JIRA
(v6.2#6252)