Posted to user@cassandra.apache.org by Francisco Nogueira Calmon Sobral <fs...@igcorp.com.br> on 2013/04/26 14:27:52 UTC

Slow retrieval using secondary indexes

Hi all!

We are using Cassandra 1.2.1 with an 8-node cluster running on Amazon. We started with 6 nodes and added the other 2 later. When performing some reads in Cassandra, we observed a large difference between gets using the primary key and gets using secondary indexes:


[default@Sessions] get Users where mahoutUserid = 30127944399716352;
-------------------
RowKey: STQ0TTNII2LS211YYJI4GEV80M1SE8
=> (column=mahoutUserid, value=30127944399716352, timestamp=1366820944696000)

1 Row Returned.
Elapsed time: 3508 msec(s).

[default@Sessions] get Users['STQ0TTNII2LS211YYJI4GEV80M1SE8'];
=> (column=mahoutUserid, value=30127944399716352, timestamp=1366820944696000)
Returned 1 results.

Elapsed time: 3.06 msec(s).


In our model the secondary index is also unique, as the primary key is. Is it better, in this case, to create another CF mapping the secondary index to the key?

Best regards,
Francisco Sobral.

Re: Slow retrieval using secondary indexes

Posted by aaron morton <aa...@thelastpickle.com>.
> 
> cqlsh:Sessions> select * from "Items" where "mahoutItemid" = 610866442877251584;
> 
>  key                    | mahoutItemid
> ------------------------+--------------------
>  687474703a2f2f6573706f7| 610866442877251584
> 
> unsupported operand type(s) for /: 'NoneType' and 'float'
Can you put together a process to replicate this and run cqlsh with the --debug option?
If so, please open a ticket at https://issues.apache.org/jira/browse/CASSANDRA
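
For context, the text printed after the result row is a Python TypeError (cqlsh is written in Python), raised when None is divided by a float. The sketch below reproduces the same message. The helper names are hypothetical, and the idea that a missing trace duration is what triggers the error in cqlsh is an assumption for illustration, not something confirmed in this thread or taken from the cqlsh source.

```python
def format_duration_ms(duration_us):
    """Convert a duration in microseconds to a millisecond string.

    Hypothetical helper: it does not guard against a missing value.
    """
    return "%.3f ms" % (duration_us / 1000.0)


def safe_format_duration_ms(duration_us):
    """Same conversion, but tolerate a value that is not available yet."""
    if duration_us is None:
        return "n/a"
    return format_duration_ms(duration_us)


def error_message_for(value):
    """Return the TypeError text raised for `value`, or None if it formats."""
    try:
        format_duration_ms(value)
    except TypeError as exc:
        return str(exc)
    return None


# Assumption: a duration that has not been recorded yet shows up as None.
print(error_message_for(None))
print(safe_format_duration_ms(None))
```

If the real cause is similar, the --debug output should show the traceback and the exact line in cqlsh where the division happens.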

Thanks

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/05/2013, at 12:11 AM, Francisco Nogueira Calmon Sobral <fs...@igcorp.com.br> wrote:

> Thanks!
> 
> The creation of the new CF worked pretty well and fast! Unfortunately, I was unable to trace the request made using secondary indexes:
> 
> cqlsh:Sessions> select * from "Items" where key = '687474703a2f2f6573706f7';
> 
>  key                    | mahoutItemid
> ------------------------+--------------------
>  687474703a2f2f6573706f7| 610866442877251584
> 
> 
> Tracing session: b0240a40-b3e9-11e2-a219-59599925ed5a
> 
>  activity           | timestamp    | source       | source_elapsed
> --------------------+--------------+--------------+----------------
>  execute_cql3_query | 09:05:03,845 | 10.32.63.148 |              0
>   Parsing statement | 09:05:03,845 | 10.32.63.148 |             36
>  Peparing statement | 09:05:03,845 | 10.32.63.148 |            232
>       Row cache hit | 09:05:03,845 | 10.32.63.148 |            577
>    Request complete | 09:05:03,845 | 10.32.63.148 |            785
> 
> cqlsh:Sessions> select * from "Items" where "mahoutItemid" = 610866442877251584;
> 
>  key                    | mahoutItemid
> ------------------------+--------------------
>  687474703a2f2f6573706f7| 610866442877251584
> 
> unsupported operand type(s) for /: 'NoneType' and 'float'
> 
> 
> Regards,
> Francisco Sobral


Re: Slow retrieval using secondary indexes

Posted by Francisco Nogueira Calmon Sobral <fs...@igcorp.com.br>.
Thanks!

The creation of the new CF worked pretty well and fast! Unfortunately, I was unable to trace the request made using secondary indexes:

cqlsh:Sessions> select * from "Items" where key = '687474703a2f2f6573706f7';

 key                    | mahoutItemid
------------------------+--------------------
 687474703a2f2f6573706f7| 610866442877251584


Tracing session: b0240a40-b3e9-11e2-a219-59599925ed5a

 activity           | timestamp    | source       | source_elapsed
--------------------+--------------+--------------+----------------
 execute_cql3_query | 09:05:03,845 | 10.32.63.148 |              0
  Parsing statement | 09:05:03,845 | 10.32.63.148 |             36
 Peparing statement | 09:05:03,845 | 10.32.63.148 |            232
      Row cache hit | 09:05:03,845 | 10.32.63.148 |            577
   Request complete | 09:05:03,845 | 10.32.63.148 |            785
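
To read a trace like the one above, it can help to turn the cumulative source_elapsed column (microseconds since the request started on that source) into the cost of each step. The sketch below parses the pasted rows and prints the deltas. The function names are made up for this example, and the subtraction is only meaningful when all rows come from the same source node, as they do here.

```python
TRACE = """\
 execute_cql3_query | 09:05:03,845 | 10.32.63.148 |              0
  Parsing statement | 09:05:03,845 | 10.32.63.148 |             36
 Peparing statement | 09:05:03,845 | 10.32.63.148 |            232
      Row cache hit | 09:05:03,845 | 10.32.63.148 |            577
   Request complete | 09:05:03,845 | 10.32.63.148 |            785
"""


def parse_trace(text):
    """Return (activity, source_elapsed) pairs from pasted cqlsh trace rows."""
    rows = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) != 4 or not cells[3].isdigit():
            continue  # skip headers, separators, and blank lines
        rows.append((cells[0], int(cells[3])))
    return rows


def step_costs(rows):
    """Microseconds spent reaching each activity since the previous one."""
    costs = []
    previous = 0
    for activity, elapsed in rows:
        costs.append((activity, elapsed - previous))
        previous = elapsed
    return costs


for activity, cost in step_costs(parse_trace(TRACE)):
    print("%-20s +%d us" % (activity, cost))
```

For this primary-key read the whole request finishes in 785 microseconds and is served from the row cache, which is consistent with the millisecond-level timing seen in the CLI.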

cqlsh:Sessions> select * from "Items" where "mahoutItemid" = 610866442877251584;

 key                    | mahoutItemid
------------------------+--------------------
 687474703a2f2f6573706f7| 610866442877251584

unsupported operand type(s) for /: 'NoneType' and 'float'


Regards,
Francisco Sobral


On Apr 28, 2013, at 4:55 PM, aaron morton <aa...@thelastpickle.com> wrote:

> Try the request tracing in 1.2 (http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2); it may point to the difference.
> 
>> In our model the secondary index is also unique, as the primary key is. Is it better, in this case, to create another CF mapping the secondary index to the key?
> IMHO, if you have a request that is frequently used as part of a hot code path, it is still a good idea to support that with a custom CF.
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com


Re: Slow retrieval using secondary indexes

Posted by aaron morton <aa...@thelastpickle.com>.
Try the request tracing in 1.2 (http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2); it may point to the difference.

> In our model the secondary index is also unique, as the primary key is. Is it better, in this case, to create another CF mapping the secondary index to the key?
IMHO, if you have a request that is frequently used as part of a hot code path, it is still a good idea to support that with a custom CF.
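
A minimal sketch of the custom-CF idea, using plain Python dictionaries as stand-ins for the two column families. This is not Cassandra client code: the class and method names are hypothetical, and it assumes, as stated in the question, that each mahoutUserid maps to exactly one row key.

```python
class UserStore:
    """In-memory stand-in for two column families (not a Cassandra client)."""

    def __init__(self):
        self.users = {}             # row key -> columns (like the Users CF)
        self.key_by_mahout_id = {}  # mahoutUserid -> row key (the lookup CF)

    def put_user(self, row_key, mahout_userid):
        # Write both CFs on every insert so the lookup does not go stale.
        self.users[row_key] = {"mahoutUserid": mahout_userid}
        self.key_by_mahout_id[mahout_userid] = row_key

    def get_by_key(self, row_key):
        return self.users.get(row_key)

    def get_by_mahout_id(self, mahout_userid):
        # Two primary-key reads instead of one secondary-index query.
        row_key = self.key_by_mahout_id.get(mahout_userid)
        if row_key is None:
            return None
        return row_key, self.users[row_key]


store = UserStore()
store.put_user("STQ0TTNII2LS211YYJI4GEV80M1SE8", 30127944399716352)
print(store.get_by_mahout_id(30127944399716352))
```

The trade-off is that the application has to keep both writes in step itself; in exchange, each read becomes a direct key lookup like the fast `get Users[...]` case.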

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 27/04/2013, at 12:27 AM, Francisco Nogueira Calmon Sobral <fs...@igcorp.com.br> wrote:

> Hi all!
> 
> We are using Cassandra 1.2.1 with an 8-node cluster running on Amazon. We started with 6 nodes and added the other 2 later. When performing some reads in Cassandra, we observed a large difference between gets using the primary key and gets using secondary indexes:
> 
> 
> [default@Sessions] get Users where mahoutUserid = 30127944399716352;
> -------------------
> RowKey: STQ0TTNII2LS211YYJI4GEV80M1SE8
> => (column=mahoutUserid, value=30127944399716352, timestamp=1366820944696000)
> 
> 1 Row Returned.
> Elapsed time: 3508 msec(s).
> 
> [default@Sessions] get Users['STQ0TTNII2LS211YYJI4GEV80M1SE8'];
> => (column=mahoutUserid, value=30127944399716352, timestamp=1366820944696000)
> Returned 1 results.
> 
> Elapsed time: 3.06 msec(s).
> 
> 
> In our model the secondary index is also unique, as the primary key is. Is it better, in this case, to create another CF mapping the secondary index to the key?
> 
> Best regards,
> Francisco Sobral.