Posted to user@cassandra.apache.org by Francisco Nogueira Calmon Sobral <fs...@igcorp.com.br> on 2013/04/26 14:27:52 UTC
Slow retrieval using secondary indexes
Hi all!
We are using Cassandra 1.2.1 with an 8-node cluster running on Amazon. We started with 6 nodes and added the other 2 later. When performing some reads in Cassandra, we observed a large difference in latency between gets using the primary key and gets using secondary indexes:
[default@Sessions] get Users where mahoutUserid = 30127944399716352;
-------------------
RowKey: STQ0TTNII2LS211YYJI4GEV80M1SE8
=> (column=mahoutUserid, value=30127944399716352, timestamp=1366820944696000)
1 Row Returned.
Elapsed time: 3508 msec(s).
[default@Sessions] get Users['STQ0TTNII2LS211YYJI4GEV80M1SE8'];
=> (column=mahoutUserid, value=30127944399716352, timestamp=1366820944696000)
Returned 1 results.
Elapsed time: 3.06 msec(s).
In our model the secondary index is also unique, as the primary key is. Is it better, in this case, to create another CF mapping the secondary index value to the key?
Best regards,
Francisco Sobral.
Re: Slow retrieval using secondary indexes
Posted by aaron morton <aa...@thelastpickle.com>.
>
> cqlsh:Sessions> select * from "Items" where "mahoutItemid" = 610866442877251584;
>
> key | mahoutItemid
> ------------------------+--------------------
> 687474703a2f2f6573706f7| 610866442877251584
>
> unsupported operand type(s) for /: 'NoneType' and 'float'
Can you put together a process to replicate this and run cqlsh with the --debug option?
If so please write a ticket at https://issues.apache.org/jira/browse/CASSANDRA
Thanks
-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com
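A note on the error text quoted above: it is the message of a Python TypeError, and cqlsh is written in Python, so the failure most likely happens on the client while it processes the response rather than inside Cassandra. The sketch below only reproduces the message. The helper name and the microsecond conversion are assumptions made for illustration; the thread does not show which value was missing.

```python
def elapsed_seconds(elapsed_micros):
    # Hypothetical helper: converts a microsecond count to seconds.
    # If the caller passes None (a missing value), the division fails.
    return elapsed_micros / 1000000.0

try:
    elapsed_seconds(None)
except TypeError as exc:
    # Same text as the error printed by cqlsh in this thread.
    message = str(exc)
    print(message)
```

Running cqlsh with --debug, as requested above, should show the real traceback and therefore the actual expression that failed.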
On 4/05/2013, at 12:11 AM, Francisco Nogueira Calmon Sobral <fs...@igcorp.com.br> wrote:
> Thanks!
>
> The creation of the new CF worked pretty well and fast! Unfortunately, I was unable to trace the request made using secondary indexes:
>
> cqlsh:Sessions> select * from "Items" where key = '687474703a2f2f6573706f7';
>
> key | mahoutItemid
> ------------------------+--------------------
> 687474703a2f2f6573706f7| 610866442877251584
>
>
> Tracing session: b0240a40-b3e9-11e2-a219-59599925ed5a
>
> activity | timestamp | source | source_elapsed
> --------------------+--------------+--------------+----------------
> execute_cql3_query | 09:05:03,845 | 10.32.63.148 | 0
> Parsing statement | 09:05:03,845 | 10.32.63.148 | 36
> Peparing statement | 09:05:03,845 | 10.32.63.148 | 232
> Row cache hit | 09:05:03,845 | 10.32.63.148 | 577
> Request complete | 09:05:03,845 | 10.32.63.148 | 785
>
> cqlsh:Sessions> select * from "Items" where "mahoutItemid" = 610866442877251584;
>
> key | mahoutItemid
> ------------------------+--------------------
> 687474703a2f2f6573706f7| 610866442877251584
>
> unsupported operand type(s) for /: 'NoneType' and 'float'
>
>
> Regards,
> Francisco Sobral
>
>
Re: Slow retrieval using secondary indexes
Posted by Francisco Nogueira Calmon Sobral <fs...@igcorp.com.br>.
Thanks!
The creation of the new CF worked pretty well and fast! Unfortunately, I was unable to trace the request made using secondary indexes:
cqlsh:Sessions> select * from "Items" where key = '687474703a2f2f6573706f7';
key | mahoutItemid
------------------------+--------------------
687474703a2f2f6573706f7| 610866442877251584
Tracing session: b0240a40-b3e9-11e2-a219-59599925ed5a
activity | timestamp | source | source_elapsed
--------------------+--------------+--------------+----------------
execute_cql3_query | 09:05:03,845 | 10.32.63.148 | 0
Parsing statement | 09:05:03,845 | 10.32.63.148 | 36
Peparing statement | 09:05:03,845 | 10.32.63.148 | 232
Row cache hit | 09:05:03,845 | 10.32.63.148 | 577
Request complete | 09:05:03,845 | 10.32.63.148 | 785
cqlsh:Sessions> select * from "Items" where "mahoutItemid" = 610866442877251584;
key | mahoutItemid
------------------------+--------------------
687474703a2f2f6573706f7| 610866442877251584
unsupported operand type(s) for /: 'NoneType' and 'float'
Regards,
Francisco Sobral
On Apr 28, 2013, at 4:55 PM, aaron morton <aa...@thelastpickle.com> wrote:
> Try the request tracing in 1.2 (http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2); it may point to the difference.
>
>> In our model the secondary index is also unique, as the primary key is. Is it better, in this case, to create another CF mapping the secondary index value to the key?
> IMHO if you have a request that is frequently used as part of a hot code path it is still a good idea to support that with a custom CF.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
Re: Slow retrieval using secondary indexes
Posted by aaron morton <aa...@thelastpickle.com>.
Try the request tracing in 1.2 (http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2); it may point to the difference.
> In our model the secondary index is also unique, as the primary key is. Is it better, in this case, to create another CF mapping the secondary index value to the key?
IMHO if you have a request that is frequently used as part of a hot code path it is still a good idea to support that with a custom CF.
Cheers
-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com
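As an illustration of the custom CF suggested above, here is a minimal sketch of the pattern. It uses plain Python dicts as in-memory stand-ins for the two column families and is not a Cassandra client. The names users_by_mahout_id, put_user and get_user_by_mahout_id are hypothetical; only the Users CF, the mahoutUserid column and the sample values come from the thread.

```python
# In-memory stand-ins for two column families (plain dicts, no Cassandra calls).
users = {}               # row key -> columns of the Users CF
users_by_mahout_id = {}  # mahoutUserid -> row key (hypothetical lookup CF)

def put_user(row_key, mahout_userid):
    """Write the user row and its lookup entry together."""
    users[row_key] = {"mahoutUserid": mahout_userid}
    users_by_mahout_id[mahout_userid] = row_key

def get_user_by_mahout_id(mahout_userid):
    """Two lookups by key: the lookup CF first, then the Users row."""
    row_key = users_by_mahout_id.get(mahout_userid)
    if row_key is None:
        return None
    return row_key, users[row_key]

put_user("STQ0TTNII2LS211YYJI4GEV80M1SE8", 30127944399716352)
result = get_user_by_mahout_id(30127944399716352)
print(result)
```

With this layout every read is a get by row key, which is the fast path measured in the original post. The cost is that the application has to write both entries and keep them in step, for example by removing the lookup entry when a user row is deleted.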