Posted to user@cassandra.apache.org by Mark Kerzner <ma...@gmail.com> on 2011/06/10 04:53:15 UTC

Secondary indices with multiple conditions?

Hi,

If I am using Cassandra's secondary indices, or even if I am building them myself
following Ed Anuff's advice
<http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html>,
can I do multiple slices? That is, how do I imitate a SQL query like

where column_1 > 5 and column_2 < 4 and so on, up to dozens of conditions?

Thank you,
Mark

Re: Secondary indices with multiple conditions?

Posted by aaron morton <aa...@thelastpickle.com>.
The query is resolved server side. 

From the blog post:

"We can perform the range query now that the state column is also indexed, so Cassandra can use the state predicate as the primary and filter on the other with a nested loop."

So if you have 10 terms, the server will use statistics to pick the best "=" term on an indexed column, and then filter all the rows matching that term using the other terms.
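
To make the "primary predicate plus nested-loop filter" idea concrete, here is a minimal in-memory sketch in Python. It is not Cassandra's implementation and does not use any Cassandra or Hector API: the rows, the `build_index` and `indexed_query` helpers, and the use of exact candidate counts in place of the server's statistics are all assumptions made for illustration.

```python
import operator

# Hypothetical stand-in for a column family: row key -> {column: value}.
ROWS = {
    "r1": {"state": "UT", "column_1": 7, "column_2": 1},
    "r2": {"state": "UT", "column_1": 3, "column_2": 2},
    "r3": {"state": "TX", "column_1": 9, "column_2": 0},
    "r4": {"state": "UT", "column_1": 6, "column_2": 9},
}

OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt,
       ">=": operator.ge, "<=": operator.le}


def build_index(rows, column):
    """Map each value of `column` to the set of row keys that hold it."""
    index = {}
    for key, cols in rows.items():
        if column in cols:
            index.setdefault(cols[column], set()).add(key)
    return index


def indexed_query(rows, indexes, clauses):
    """Run a list of (column, op, value) clauses, all ANDed together."""
    # At least one clause must be "=" on an indexed column.
    eq_clauses = [c for c in clauses if c[1] == "=" and c[0] in indexes]
    if not eq_clauses:
        raise ValueError("need at least one '=' clause on an indexed column")

    # Assumption: use the exact number of candidates as the "statistic"
    # and take the equality clause that matches the fewest rows.
    primary = min(eq_clauses,
                  key=lambda c: len(indexes[c[0]].get(c[2], ())))
    candidates = indexes[primary[0]].get(primary[2], set())

    # Nested loop: check every candidate row against every other clause.
    remaining = [c for c in clauses if c is not primary]
    result = []
    for key in sorted(candidates):
        cols = rows[key]
        if all(col in cols and OPS[op](cols[col], value)
               for col, op, value in remaining):
            result.append(key)
    return result


INDEXES = {"state": build_index(ROWS, "state")}
matches = indexed_query(
    ROWS, INDEXES,
    [("state", "=", "UT"), ("column_1", ">", 5), ("column_2", "<", 4)],
)
```

In this sketch the work is driven by how many rows match the primary "=" term, since each of those rows is checked against every remaining term; the total number of rows in the column family matters much less.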

Needing 10 terms suggests the data model could use some more work; that looks more like an RDBMS model than a Cassandra-style model. Consider refactoring the model to better support the read requests, including de-normalising to reduce the number of conditions required to find the right data. 
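
One possible shape for that kind of de-normalisation, sketched in Python under stated assumptions: if the conditions are known at write time, they can be folded into a single derived column that is then queried with one "=" term. The `Store` class, the `match_flag` column, and the thresholds (taken from the example query in this thread) are hypothetical; this is an in-memory illustration, not a Cassandra API.

```python
def derive_match_flag(columns):
    """Hypothetical write-time rule: fold two range checks into one value."""
    ok = ("column_1" in columns and columns["column_1"] > 5
          and "column_2" in columns and columns["column_2"] < 4)
    return "yes" if ok else "no"


class Store:
    """Toy store that maintains the derived column and its index on write."""

    def __init__(self):
        self.rows = {}        # row key -> columns, including match_flag
        self.flag_index = {}  # match_flag value -> set of row keys

    def write(self, key, columns):
        # Drop the old index entry if the row is being overwritten.
        old = self.rows.get(key)
        if old is not None:
            self.flag_index[old["match_flag"]].discard(key)
        stored = dict(columns)
        stored["match_flag"] = derive_match_flag(columns)
        self.rows[key] = stored
        self.flag_index.setdefault(stored["match_flag"], set()).add(key)

    def matching_keys(self):
        # The read path needs a single equality lookup, no filtering.
        return sorted(self.flag_index.get("yes", set()))


store = Store()
store.write("r1", {"column_1": 7, "column_2": 1})
store.write("r2", {"column_1": 3, "column_2": 2})
first = store.matching_keys()

store.write("r1", {"column_1": 2, "column_2": 1})  # r1 no longer qualifies
second = store.matching_keys()
```

The trade-off is that the conditions must be fixed when the data is written, and every write pays for keeping the derived column up to date; whether that fits depends on how the data is actually read.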

Hope that helps. 

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 10 Jun 2011, at 17:53, Mark Kerzner wrote:

> So in Hector I can do, for example, addGtExpression any number of times, correct?
> 
> Internally, how is it implemented? Do we get the subset of data based on an indexed column, and then essentially scan the rest, with the Cassandra API or Hector providing the filtering? So this may be an expensive operation? For example, how long would you expect a query over 1 million rows with 10 conditions to take?
> 
> Thank you very much, very helpful.
> 
> Mark
> 
> On Thu, Jun 9, 2011 at 10:28 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> Yes, with the restriction that at least one of the conditions must be
> = on an indexed column.
> 
> See http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes
> for an example.
> 
> On Thu, Jun 9, 2011 at 9:53 PM, Mark Kerzner <ma...@gmail.com> wrote:
> > Hi,
> > if I am using Cassandra's secondary indices, or even if I am doing it myself
> > following Ed Anuff's advice, can I do multiple slices? That is, how do I
> > imitate a SQL query of
> > where column_1 > 5 and column_2 < 4 and so on, up to dozens of conditions?
> > Thank you,
> > Mark
> 
> 
> 
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
> 


Re: Secondary indices with multiple conditions?

Posted by Mark Kerzner <ma...@gmail.com>.
So in Hector I can do, for example, addGtExpression any number of times,
correct?

Internally, how is it implemented? Do we get the subset of data based on an
indexed column, and then essentially scan the rest, with the Cassandra API or
Hector providing the filtering? So this may be an expensive operation? For
example, how long would you expect a query over 1 million rows with 10
conditions to take?

Thank you very much, very helpful.

Mark


Re: Secondary indices with multiple conditions?

Posted by Jonathan Ellis <jb...@gmail.com>.
Yes, with the restriction that at least one of the conditions must be
= on an indexed column.

See http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes
for an example.


-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com