Posted to user@cassandra.apache.org by Jacob Rhoden <ja...@me.com> on 2014/11/06 05:41:56 UTC

Why is one query 10 times slower than the other?

Hi Guys,

I have two cassandra 2.0.5 nodes, RF=2. When I do a:

    select * from table1 where clustercolumn='something'

The trace indicates that it only needs to talk to one node, which I would have expected. However when I do a:

    select * from table2

This is a small table with only 20 rows in it, so it should be fully replicated and should be a much quicker query, yet the trace indicates that Cassandra is talking to both nodes. This adds 200ms to the query results and is not necessary for my application (this table might have an amendment once per year, if that), so there's no real need to check both nodes for consistency.

At this point I've not altered anything to do with consistency level. Does this mean that Cassandra attempts to guess/infer what consistency level you need depending on whether your query includes a filter on a particular key or clustering key?

Thanks,
Jacob


CREATE KEYSPACE mykeyspace WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': '2' };

CREATE TABLE organisation (uuid uuid, name text, url text, PRIMARY KEY (uuid));

CREATE TABLE lookup_code (type text, code text, name text, PRIMARY KEY ((type), code));


select * from lookup_code where type='mylist':

 activity                                                                  | timestamp    | source       | source_elapsed
---------------------------------------------------------------------------+--------------+--------------+----------------
                                                        execute_cql3_query | 04:20:15,319 | 74.50.54.123 |              0
 Parsing select * from lookup_code where type='research_area' LIMIT 10000; | 04:20:15,319 | 74.50.54.123 |             64
                                                       Preparing statement | 04:20:15,320 | 74.50.54.123 |            204
                           Executing single-partition query on lookup_code | 04:20:15,320 | 74.50.54.123 |            849
                                              Acquiring sstable references | 04:20:15,320 | 74.50.54.123 |            870
                                               Merging memtable tombstones | 04:20:15,320 | 74.50.54.123 |            894
 Skipped 0/0 non-slice-intersecting sstables, included 0 due to tombstones | 04:20:15,320 | 74.50.54.123 |            958
                                Merging data from memtables and 0 sstables | 04:20:15,320 | 74.50.54.123 |            976
                                      Read 168 live and 0 tombstoned cells | 04:20:15,321 | 74.50.54.123 |           1412
                                                          Request complete | 04:20:15,321 | 74.50.54.123 |           2043


select * from organisation:

 activity                                                                                        | timestamp    | source       | source_elapsed
-------------------------------------------------------------------------------------------------+--------------+--------------+----------------
                                                                              execute_cql3_query | 04:21:03,641 | 74.50.54.123 |              0
                                                 Parsing select * from organisation LIMIT 10000; | 04:21:03,641 | 74.50.54.123 |             68
                                                                             Preparing statement | 04:21:03,641 | 74.50.54.123 |            174
                                                                   Determining replicas to query | 04:21:03,642 | 74.50.54.123 |            307
                                                              Enqueuing request to /72.249.82.85 | 04:21:03,642 | 74.50.54.123 |           1034
                                                                Sending message to /72.249.82.85 | 04:21:03,643 | 74.50.54.123 |           1402
                                                             Message received from /74.50.54.123 | 04:21:03,644 | 72.249.82.85 |             47
 Executing seq scan across 0 sstables for [min(-9223372036854775808), min(-9223372036854775808)] | 04:21:03,644 | 72.249.82.85 |            461
                                                              Read 1 live and 0 tombstoned cells | 04:21:03,644 | 72.249.82.85 |            560
                                                              Read 1 live and 0 tombstoned cells | 04:21:03,644 | 72.249.82.85 |            611

………..etc….....

Re: Why is one query 10 times slower than the other?

Posted by graham sanderson <gr...@vast.com>.

In your “lookup_code” example, “type” is not a clustering column; it is the partition key, and hence the first query only hits one partition.
The second query is a range slice across all possible keys, so the sub-ranges are farmed out to nodes that hold the data.
You are likely at consistency level ONE, so it only needs a response from one node for each sub-range. I guess it has decided (based on the snitch) that it is not unreasonable to share the query across the two nodes.
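
As a rough sketch of how to check this from cqlsh, assuming the keyspace and tables from the original post: CONSISTENCY with no argument prints the level the session is using (cqlsh normally starts at ONE unless it has been changed), and TRACING ON records a trace for each statement, so the two query shapes can be compared side by side. The 'mylist' value is just the placeholder from the original message.

```cql
USE mykeyspace;

-- Print the consistency level this cqlsh session is currently using
CONSISTENCY;

-- Record a trace for every statement that follows
TRACING ON;

-- Partition key is restricted: expect a single-partition read in the trace
SELECT * FROM lookup_code WHERE type = 'mylist';

-- No partition key restriction: expect a range scan over the token ring
SELECT * FROM organisation;

TRACING OFF;
```

If CONSISTENCY reports ONE for both statements, the extra hop in the second trace comes from how the range scan is routed, not from a stricter consistency level being inferred.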
