Posted to commits@cassandra.apache.org by "Oded Peer (JIRA)" <ji...@apache.org> on 2014/12/02 11:02:13 UTC

[jira] [Updated] (CASSANDRA-4476) Support 2ndary index queries with only inequality clauses (LT, LTE, GT, GTE)

     [ https://issues.apache.org/jira/browse/CASSANDRA-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oded Peer updated CASSANDRA-4476:
---------------------------------
    Attachment: 4476-3.patch

Added new patch 4476-3.patch

# I switched my development env to avoid these types of errors in the future.
# Right, I fixed it. Instead of comparing to {{EQ}}, I added a method to {{Operator}} that identifies relational operators that carry a notion of order.
# The simple index selection algorithm is described in Sylvain’s comment from 4/Dec/13. I improved the algorithm in this patch to estimate the number of rows the slice operator returns.
# I see it as a trade-off between code complexity and query performance. As Sylvain explained in his earlier comment, ??more than one indexed column means ALLOW FILTERING, for which all bets are off in terms of performance anyway??. While it is good to strive for optimal performance across the board, I think the use case you are describing is rare. Jonathan Ellis described “When Not to Use Secondary Indexes” in a blog post: ??Do not use secondary indexes to query a huge volume of records for a small number of results??. So for the proper use of indexed queries this shouldn't have a significant effect, but it would make the code more complex.
# I added comments.
# I thought the purpose of the test was obvious from the class name. The bug description in the method name was meant to put it into context. I guess it wasn't obvious. I changed the names.
# I chose to use an instance variable to be consistent with {{usePrepared}} usage. I understand your point, and I have made this an explicit parameter of a new {{execute}} method.
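
To illustrate the second point, here is a minimal sketch of the idea, not the code from the patch. The enum constants listed and the method name {{isOrdered}} are assumptions made for illustration only; the actual {{Operator}} class in the patch may differ.

```java
final class OperatorSketch {
    // Hypothetical stand-in for the Operator type; constants are illustrative.
    enum Operator {
        EQ, LT, LTE, GT, GTE, IN;

        // True only for operators that rely on an ordering of the values,
        // so callers ask "is this a slice?" instead of testing "!= EQ".
        boolean isOrdered() {
            return this == LT || this == LTE || this == GT || this == GTE;
        }
    }

    public static void main(String[] args) {
        for (Operator op : Operator.values()) {
            System.out.println(op + " ordered? " + op.isOrdered());
        }
    }
}
```

Asking the operator directly avoids treating every non-{{EQ}} operator (such as {{IN}} in this sketch) as a slice by accident.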

In addition, as part of committing the patch, the wiki page describing Secondary Indexes needs to be changed (http://wiki.apache.org/cassandra/SecondaryIndexes). How are wiki changes usually handled?


> Support 2ndary index queries with only inequality clauses (LT, LTE, GT, GTE)
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4476
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4476
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API, Core
>            Reporter: Sylvain Lebresne
>            Assignee: Oded Peer
>            Priority: Minor
>              Labels: cql
>             Fix For: 3.0
>
>         Attachments: 4476-2.patch, 4476-3.patch, cassandra-trunk-4476.patch
>
>
> Currently, a query that uses 2ndary indexes must have at least one EQ clause (on an indexed column). Given that indexed CFs are local (and use LocalPartitioner, which orders the rows by the type of the indexed column), we should extend 2ndary indexes to allow querying indexed columns even when no EQ clause is provided.
> As far as I can tell, the main problem to solve for this is to update KeysSearcher.highestSelectivityPredicate(), i.e. how do we estimate the selectivity of non-EQ clauses? I note, however, that if we can do that estimate reasonably accurately, this might provide better performance even for index queries that have both EQ and non-EQ clauses, because some non-EQ clauses may have a much better selectivity than EQ ones (say you index both the user country and birth date: for SELECT * FROM users WHERE country = 'US' AND birthdate > 'Jan 2009' AND birthdate < 'July 2009', you'd better use the birthdate index first).
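
The selection idea in the description above can be sketched as follows. This is an illustration under stated assumptions, not the KeysSearcher code: the {{Candidate}} class, the {{mostSelective}} method, and the row estimates are all hypothetical, and how to obtain such estimates for non-EQ clauses is exactly the open question raised in the description.

```java
import java.util.Arrays;
import java.util.List;

final class IndexChoiceSketch {
    // Hypothetical: one predicate on an indexed column, paired with an
    // assumed estimate of how many rows that predicate matches.
    static final class Candidate {
        final String column;
        final long estimatedRows;

        Candidate(String column, long estimatedRows) {
            this.column = column;
            this.estimatedRows = estimatedRows;
        }
    }

    // Returns the candidate with the smallest row estimate,
    // or null when the list is empty.
    static Candidate mostSelective(List<Candidate> candidates) {
        Candidate best = null;
        for (Candidate c : candidates) {
            if (best == null || c.estimatedRows < best.estimatedRows) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up estimates for the example query: the EQ clause on country
        // is assumed to match far more rows than the birthdate slice.
        List<Candidate> candidates = Arrays.asList(
            new Candidate("country", 1000000L),
            new Candidate("birthdate", 20000L));
        System.out.println(mostSelective(candidates).column);
    }
}
```

With these made-up numbers the birthdate slice is chosen over the EQ clause on country, which is the outcome the description argues for.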


