Posted to user@cassandra.apache.org by Francisco Nogueira Calmon Sobral <fs...@igcorp.com.br> on 2013/09/02 20:37:55 UTC

Re: How to perform range queries efficiently?

We had some problems when using secondary indexes because of three issues:

- The query is a Range Query, which means that it is slow.
- There is an open bug regarding the use of row cache for secondary indexes (CASSANDRA-4973)
- The cardinality of our secondary key was very low (this was bad)

We performed some modifications and created another column family, which maps the secondary index to the key of the original column family. The improvements were very impressive in our case!

Best regards
Francisco


On Aug 28, 2013, at 12:22 PM, Vivek Mishra <mi...@gmail.com> wrote:

> Create a column family with a CompositeType key (or, in CQL, a PRIMARY KEY) of (user_id, age, salary).
> 
> Then you will be able to query with the eq operator over the partition key as well as over the clustering key.
> 
> You may also take salary out of the clustering key (age, salary) and put a secondary index on it instead.
> 
> I am sure that, based on your query usage, you can opt for either a composite key or a mix of composite key and secondary index!
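
As a rough illustration of why the composite key helps, here is a Python sketch that models one partition as a list kept sorted by the clustering columns, so a range over age becomes one contiguous slice. The dictionary, the function names and the sample rows are made up for the example; it only mimics the access pattern and is not how Cassandra is implemented or queried.

```python
import bisect
from collections import defaultdict

# Hypothetical model: partition key (user_id) -> rows kept sorted by the
# clustering columns (age, salary).
partitions = defaultdict(list)

def insert(user_id, age, salary):
    # Keep each partition ordered by (age, salary), like clustering columns.
    bisect.insort(partitions[user_id], (age, salary))

def age_range(user_id, low, high):
    """Rows of one partition with low <= age <= high, as one contiguous slice."""
    rows = partitions.get(user_id, [])
    # (low,) sorts before every (low, salary) tuple, so this finds the first match.
    start = bisect.bisect_left(rows, (low,))
    # (high, inf) sorts after every (high, salary) tuple.
    end = bisect.bisect_right(rows, (high, float("inf")))
    return rows[start:end]

insert("u1", 25, 1000)
insert("u1", 40, 3000)
insert("u1", 31, 2000)
insert("u1", 31, 1500)
insert("u2", 31, 9000)
```

The equality on user_id picks a single partition, and the range on age is answered from the sort order inside it without scanning the other rows.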
> 
> Have a look at: http://www.datastax.com/dev/blog/introduction-to-composite-columns-part-1
> 
> Hope it helps.
> 
> 
> -Vivek
> 
> 
> On Wed, Aug 28, 2013 at 5:49 PM, Sávio Teles <sa...@lupa.inf.ufg.br> wrote:
> I can populate it again. We are still modelling the data! Thanks.
> 
> 
> 2013/8/28 Vivek Mishra <mi...@gmail.com>
> I just saw that you already have data populated, so I guess switching to a composite key may not work for you.
> 
> -Vivek
> 
> 
> On Tue, Aug 27, 2013 at 11:55 PM, Sávio Teles <sa...@lupa.inf.ufg.br> wrote:
> Vivek, using a composite key, what would the query look like?
> 
> 
> 2013/8/27 Vivek Mishra <mi...@gmail.com>
> For such queries, it looks like you could create a composite key of (user_id, age, salary).
> 
> Too much indexing always hurts (irrespective of RDBMS or NoSQL). Remember that every search request on a secondary index is passed to each node in the ring.
> 
> -Vivek
> 
> 
> On Tue, Aug 27, 2013 at 11:11 PM, Sávio Teles <sa...@lupa.inf.ufg.br> wrote:
> Use a database that is designed for efficient range queries? ;D
> 
> Is there no way to do this with Cassandra? For example, using Hive or Solr?
> 
> 
> 2013/8/27 Robert Coli <rc...@eventbrite.com>
> On Fri, Aug 23, 2013 at 5:53 AM, Sávio Teles <sa...@lupa.inf.ufg.br> wrote:
> I need to perform range queries efficiently.
> ... 
> This query takes a long time to run. Any ideas to perform it efficiently?
> 
> Use a database that is designed for efficient range queries? ;D
> 
> =Rob
>  
> 
> 
> 
> -- 
> Best regards,
> Sávio S. Teles de Oliveira
> voice: +55 62 9136 6996
> http://br.linkedin.com/in/savioteles
> Master's student in Computer Science - UFG
> Software Architect
> Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG


Re: How to perform range queries efficiently?

Posted by Francisco Nogueira Calmon Sobral <fs...@igcorp.com.br>.
Sorry, I was not very clear.

We simply created another CF whose row keys were given by the secondary index that we needed. The value of each row in this new CF was the key associated with a row in the first CF (the original one).
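
A minimal sketch of this layout in Python, with plain dictionaries standing in for the two column families. The names (users, users_by_age, put_user, find_by_age) and the sample columns are hypothetical and not taken from the thread; a real setup would issue the two writes through a Cassandra client.

```python
from collections import defaultdict

# Original CF (hypothetical): row key -> columns.
users = {}
# Index CF (hypothetical): indexed value -> row keys of the original CF.
users_by_age = defaultdict(set)

def put_user(row_key, columns):
    """Write to the original CF and keep the index CF in sync."""
    old = users.get(row_key)
    if old is not None:
        # Drop the stale index entry before writing the new one.
        users_by_age[old["age"]].discard(row_key)
    users[row_key] = dict(columns)
    users_by_age[columns["age"]].add(row_key)

def find_by_age(age):
    """One lookup in the index CF, then direct reads by row key."""
    return [users[k] for k in sorted(users_by_age.get(age, ()))]

put_user("u1", {"name": "ana", "age": 30})
put_user("u2", {"name": "bob", "age": 41})
put_user("u3", {"name": "cid", "age": 30})
put_user("u2", {"name": "bob", "age": 30})  # the update moves u2 in the index
```

A read by the indexed value then touches one row of the index CF followed by reads by key, at the cost of a second write per update that the application has to keep consistent itself.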

Francisco


On Sep 2, 2013, at 4:13 PM, Sávio Teles <sa...@lupa.inf.ufg.br> wrote:

> 
> We performed some modifications and created another column family, which maps the secondary index to the key of the original column family. The improvements were very impressive in our case!
> 
> Sorry, I couldn't understand! What changes? Have you built a B-tree?


Re: How to perform range queries efficiently?

Posted by Sávio Teles <sa...@lupa.inf.ufg.br>.
> We performed some modifications and created another column family, which
> maps the secondary index to the key of the original column family. The
> improvements were very impressive in our case!


Sorry, I couldn't understand! What changes? Have you built a B-tree?



-- 
Best regards,
Sávio S. Teles de Oliveira
voice: +55 62 9136 6996
http://br.linkedin.com/in/savioteles
Master's student in Computer Science - UFG
Software Architect
Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG