Posted to user@cassandra.apache.org by Jacob Rhoden <ja...@me.com> on 2014/11/06 05:41:56 UTC
Why is one query 10 times slower than the other?
Hi Guys,
I have two cassandra 2.0.5 nodes, RF=2. When I do a:
select * from table1 where clustercolumn='something'
The trace indicates that it only needs to talk to one node, which I would have expected. However when I do a:
select * from table2
on a small table that has only 20 rows, should be fully replicated, and should be a much quicker query, the trace indicates that Cassandra is talking to both nodes. This adds 200ms to the query results and is not necessary for my application (this table might have an amendment once per year, if that); there's no real need to check both nodes for consistency.
At this point I've not altered anything to do with consistency level. Does this mean that Cassandra attempts to guess/infer what consistency level you need depending on whether your query includes a filter on a particular key or clustering key?
Thanks,
Jacob
CREATE KEYSPACE mykeyspace WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': '2' };
CREATE TABLE organisation (uuid uuid, name text, url text, PRIMARY KEY (uuid));
CREATE TABLE lookup_code (type text, code text, name text, PRIMARY KEY ((type), code));
select * from lookup_code where type='mylist':
activity | timestamp | source | source_elapsed
---------------------------------------------------------------------------+--------------+--------------+----------------
execute_cql3_query | 04:20:15,319 | 74.50.54.123 | 0
Parsing select * from lookup_code where type='research_area' LIMIT 10000; | 04:20:15,319 | 74.50.54.123 | 64
Preparing statement | 04:20:15,320 | 74.50.54.123 | 204
Executing single-partition query on lookup_code | 04:20:15,320 | 74.50.54.123 | 849
Acquiring sstable references | 04:20:15,320 | 74.50.54.123 | 870
Merging memtable tombstones | 04:20:15,320 | 74.50.54.123 | 894
Skipped 0/0 non-slice-intersecting sstables, included 0 due to tombstones | 04:20:15,320 | 74.50.54.123 | 958
Merging data from memtables and 0 sstables | 04:20:15,320 | 74.50.54.123 | 976
Read 168 live and 0 tombstoned cells | 04:20:15,321 | 74.50.54.123 | 1412
Request complete | 04:20:15,321 | 74.50.54.123 | 2043
select * from organisation:
activity | timestamp | source | source_elapsed
-------------------------------------------------------------------------------------------------+--------------+--------------+----------------
execute_cql3_query | 04:21:03,641 | 74.50.54.123 | 0
Parsing select * from organisation LIMIT 10000; | 04:21:03,641 | 74.50.54.123 | 68
Preparing statement | 04:21:03,641 | 74.50.54.123 | 174
Determining replicas to query | 04:21:03,642 | 74.50.54.123 | 307
Enqueuing request to /72.249.82.85 | 04:21:03,642 | 74.50.54.123 | 1034
Sending message to /72.249.82.85 | 04:21:03,643 | 74.50.54.123 | 1402
Message received from /74.50.54.123 | 04:21:03,644 | 72.249.82.85 | 47
Executing seq scan across 0 sstables for [min(-9223372036854775808), min(-9223372036854775808)] | 04:21:03,644 | 72.249.82.85 | 461
Read 1 live and 0 tombstoned cells | 04:21:03,644 | 72.249.82.85 | 560
Read 1 live and 0 tombstoned cells | 04:21:03,644 | 72.249.82.85 | 611
………..etc….....
Re: Why is one query 10 times slower than the other?
Posted by graham sanderson <gr...@vast.com>.
In your “lookup_code” example, “type” is not a clustering column; it is the partition key, and hence the first query only hits one partition.
The second query is a range slice across all possible keys, so the sub-ranges are farmed out to the nodes that hold the data.
You are likely at consistency level ONE, so it only needs a response from one node for each sub-range. I guess it has decided (based on the snitch) that it is not unreasonable to share the query across the two nodes.
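If you want to confirm this yourself, something like the following cqlsh session might help. It is only a sketch: it assumes you are connected with cqlsh to the keyspace from your post, and it uses the cqlsh CONSISTENCY and TRACING commands to show the level in effect and to compare the two query shapes. The 'mylist' value is just the placeholder from your message.

```cql
USE mykeyspace;

-- With no argument, cqlsh prints the consistency level it is currently using.
CONSISTENCY;

-- Set it explicitly so both queries below run at the same level.
CONSISTENCY ONE;

TRACING ON;

-- Restricts the partition key (type): a single-partition read.
SELECT * FROM lookup_code WHERE type = 'mylist';

-- No partition key restriction: a range scan over every token range.
SELECT * FROM organisation;

TRACING OFF;
```

The consistency level stays the same for both statements; what differs between the traces is the kind of read (single-partition versus range scan), not a level that Cassandra inferred from the query.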