Posted to user@cassandra.apache.org by nitin padalia <pa...@gmail.com> on 2015/01/19 09:47:19 UTC
Cassandra fetches complete partition
Hi,
Does Cassandra fetch the complete partition if I include the clustering key
in the where clause?
Or, what is the difference between:
1. Select * from column_family where partition_key = 'somekey' limit 1;
2. Select * from column_family where partition_key = 'somekey' and
clustering_key = 'some_clustering_key';
Thanks in advance.
Nitin Padalia
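As a rough mental model of the two queries above, a partition can be pictured as a list of rows kept sorted by clustering key. The sketch below is only an illustration of which row each query returns. It is not Cassandra's read path, and the keys, values and function names are made up.

```python
from bisect import bisect_left

# Toy model: one partition whose rows are kept sorted by clustering key.
# Keys and values are invented for illustration.
partition = [
    ("k1", "row-1"),
    ("k2", "row-2"),
    ("k3", "row-3"),
]
clustering_keys = [key for key, _ in partition]


def select_limit(rows, limit):
    """Query 1: no clustering restriction, so take the first `limit` rows in clustering order."""
    return rows[:limit]


def select_by_clustering_key(rows, keys, wanted):
    """Query 2: seek to one clustering key and return that row if it exists."""
    i = bisect_left(keys, wanted)
    if i < len(keys) and keys[i] == wanted:
        return [rows[i]]
    return []


first_row = select_limit(partition, 1)
named_row = select_by_clustering_key(partition, clustering_keys, "k2")
print(first_row)
print(named_row)
```

In this picture, query 1 returns whatever row sorts first in the partition, while query 2 returns the one row whose clustering key matches, or nothing.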
Re: Cassandra fetches complete partition
Posted by Eric Stevens <mi...@gmail.com>.
It depends on your version of Cassandra. I would suggest starting with
this post, which describes the row cache differences between 2.0 and 2.1:
http://www.datastax.com/dev/blog/row-caching-in-cassandra-2-1
In particular:
> In previous releases, this cache has required storing the entire
> partition in memory, which meant that if that was larger than the cache
> size, you would never be reading it from the cache. Cassandra 2.1 has
> introduced extra CQL syntax to specify the number of rows to cache per
> partition.
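The per-partition row limit quoted above can be pictured with a small toy model. This is only a sketch, assuming the cache keeps the first rows_per_partition rows of a partition in clustering order. The class and method names are hypothetical, and Cassandra's real hit/miss rules have more conditions than this, so it does not model edge cases such as the boundary behaviour reported elsewhere in this thread.

```python
class HeadOfPartitionCache:
    """Toy cache that keeps only the first `rows_per_partition` rows of one partition."""

    def __init__(self, rows_per_partition):
        self.rows_per_partition = rows_per_partition
        self.cached = {}      # clustering key -> row value
        self.last_key = None  # highest clustering key held in the cache

    def populate(self, sorted_rows):
        """Cache the head of a partition given as (key, value) pairs sorted by key."""
        head = sorted_rows[: self.rows_per_partition]
        self.cached = dict(head)
        self.last_key = head[-1][0] if head else None

    def lookup(self, key):
        """Return ('hit', row) when the key falls inside the cached head, else ('miss', None)."""
        if self.last_key is not None and key <= self.last_key:
            return "hit", self.cached.get(key)
        return "miss", None


# Ten invented rows, with only the first three cached.
rows = [("row-%02d" % i, "value-%d" % i) for i in range(10)]
cache = HeadOfPartitionCache(rows_per_partition=3)
cache.populate(rows)

inside = cache.lookup("row-01")
outside = cache.lookup("row-07")
print(inside, outside)
```

A lookup inside the cached head is answered from memory, and anything past it falls through to the normal read path.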
*However*, the row cache is actually surprisingly dangerous for the
health of a cluster. Practically speaking it's very, very rarely useful.
In particular the OS does a good job of caching recently read data in the page
cache, and Cassandra relies on this heavily for consistent and reliable
performance. When you establish a row cache, you're putting a copy of the
data into off-heap Cassandra memory (a huge win over previous on-heap row
caches), but practically speaking this has little to no real advantage over
the OS level cache of the same data. And it has the downside that it can
hold onto cold data whose memory would be better used for some other
operation.
In Cassandra 2.0, the advice on row caches was "Don't use them. No, seriously!"
(Jonathan Ellis, CTO of DataStax, in the Cassandra Summit '14 keynote). In 2.1
they're better because of the changes mentioned in the article above, but
except for fairly narrow use cases you're usually better off focusing on
something else for performance optimizations first.
Couple this with the strong recommendation that Cassandra 2.1 isn't yet a
good candidate for important production uses (see
https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/),
and you probably should not be concerning yourself with row caches yet.
Re: Cassandra fetches complete partition
Posted by nitin padalia <pa...@gmail.com>.
e.g.
CREATE TABLE usertable_cache (
user_id uuid,
dept_id uuid,
location_id text,
locationmap_id uuid,
PRIMARY KEY ((user_id, dept_id), location_id)
) WITH
bloom_filter_fp_chance=0.010000 AND
caching='{"keys":"ALL", "rows_per_partition":"1000"}' AND
comment='' AND
dclocal_read_repair_chance=0.100000 AND
gc_grace_seconds=864000 AND
read_repair_chance=0.000000 AND
default_time_to_live=0 AND
speculative_retry='99.0PERCENTILE' AND
memtable_flush_period_in_ms=0 AND
compaction={'class': 'SizeTieredCompactionStrategy'} AND
compression={'sstable_compression': 'LZ4Compressor'};

select * from usertable_cache
WHERE user_id = 7bf16edf-b552-40f4-94ac-87b2e878d8c2
  and dept_id = de3ac44f-2078-4321-a47c-de96c615d40d
  and location_id = 'ABC4:1';

 user_id                              | dept_id                              | location_id   | locationmap_id
--------------------------------------+--------------------------------------+---------------+--------------------------------------
 7bf16edf-b552-40f4-94ac-87b2e878d8c2 | de3ac44f-2078-4321-a47c-de96c615d40d | ABC4:1        | 32b97639-ea5b-427f-8c27-8a5016e2ad6e

(1 rows)

Tracing session: c40f9ba0-9fe2-11e4-9522-35de4dc20d00

activity | timestamp | source | source_elapsed
execute_cql3_query | 19:25:02,875 | 10.76.214.80 | 0
Parsing select * from usertable_cache WHERE user_id = 7bf16edf-b552-40f4-94ac-87b2e878d8c2 and dept_id = de3ac44f-2078-4321-a47c-de96c615d40d and location_id = 'ABC4:1' LIMIT 10000; | 19:25:02,875 | 10.76.214.80 | 60
Preparing statement | 19:25:02,875 | 10.76.214.80 | 157
Ignoring row cache as cached value could not satisfy query | 19:25:02,879 | 10.76.214.80 | 3668
Executing single-partition query on userobjectid_by_extn_uri_10k_cache | 19:25:02,879 | 10.76.214.80 | 3690
Acquiring sstable references | 19:25:02,879 | 10.76.214.80 | 3700
Merging memtable tombstones | 19:25:02,879 | 10.76.214.80 | 3755
Key cache hit for sstable 3 | 19:25:02,879 | 10.76.214.80 | 4264
Seeking to partition indexed section in data file | 19:25:02,879 | 10.76.214.80 | 4276
Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | 19:25:02,879 | 10.76.214.80 | 4324
Merging data from memtables and 1 sstables | 19:25:02,879 | 10.76.214.80 | 4337
Read 1 live and 0 tombstoned cells | 19:25:02,883 | 10.76.214.80 | 7596
Request complete | 19:25:02,883 | 10.76.214.80 | 8263

select * from usertable_cache
WHERE user_id = 7bf16edf-b552-40f4-94ac-87b2e878d8c2
  and dept_id = de3ac44f-2078-4321-a47c-de96c615d40d
  and location_id = 'ABC4:2';

 user_id                              | dept_id                              | location_id   | locationmap_id
--------------------------------------+--------------------------------------+---------------+--------------------------------------
 7bf16edf-b552-40f4-94ac-87b2e878d8c2 | de3ac44f-2078-4321-a47c-de96c615d40d | ABC4:2        | 1ddf3188-2642-4f8b-948b-78f220987e54

(1 rows)

Tracing session: 42bfdbe0-9fe3-11e4-9522-35de4dc20d00

activity | timestamp | source | source_elapsed
execute_cql3_query | 19:28:35,423 | 10.76.214.80 | 0
Parsing select * from usertable_cache WHERE user_id = 7bf16edf-b552-40f4-94ac-87b2e878d8c2 and dept_id = de3ac44f-2078-4321-a47c-de96c615d40d and location_id = 'ABC4:2' LIMIT 10000; | 19:28:35,423 | 10.76.214.80 | 56
Preparing statement | 19:28:35,423 | 10.76.214.80 | 147
Row cache hit | 19:28:35,425 | 10.76.214.80 | 2530
Read 1 live and 0 tombstoned cells | 19:28:35,425 | 10.76.214.80 | 2574
Request complete | 19:28:35,425 | 10.76.214.80 | 2943

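To compare the two traces above, the source_elapsed values can be turned into per-step deltas. The sketch below copies the numbers from the traces (activity names are abbreviated) and assumes source_elapsed is in microseconds, as cqlsh tracing reports it. The helper names are hypothetical, and two single requests are not a benchmark.

```python
# (activity, source_elapsed) pairs copied from the two traces in this message.
# Assumption: source_elapsed is in microseconds.
miss_trace = [
    ("execute_cql3_query", 0),
    ("Parsing", 60),
    ("Preparing statement", 157),
    ("Ignoring row cache as cached value could not satisfy query", 3668),
    ("Executing single-partition query", 3690),
    ("Acquiring sstable references", 3700),
    ("Merging memtable tombstones", 3755),
    ("Key cache hit for sstable 3", 4264),
    ("Seeking to partition indexed section in data file", 4276),
    ("Skipped 0/1 non-slice-intersecting sstables", 4324),
    ("Merging data from memtables and 1 sstables", 4337),
    ("Read 1 live and 0 tombstoned cells", 7596),
    ("Request complete", 8263),
]
hit_trace = [
    ("execute_cql3_query", 0),
    ("Parsing", 56),
    ("Preparing statement", 147),
    ("Row cache hit", 2530),
    ("Read 1 live and 0 tombstoned cells", 2574),
    ("Request complete", 2943),
]


def step_deltas(trace):
    """Time between each event and the one before it, labelled by the later event."""
    return [(name, t - prev_t) for (_, prev_t), (name, t) in zip(trace, trace[1:])]


def slowest_step(trace):
    """The event that was logged after the longest gap."""
    return max(step_deltas(trace), key=lambda item: item[1])


miss_total = miss_trace[-1][1]
hit_total = hit_trace[-1][1]
saved_us = miss_total - hit_total

print("miss:", miss_total, "hit:", hit_total, "saved:", saved_us)
print("largest gap on miss:", slowest_step(miss_trace))
print("largest gap on hit:", slowest_step(hit_trace))
```

The deltas only show where the elapsed time was recorded between trace events, so a large gap may include work that the trace does not name.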
--
Nitin Padalia
9999256157
Re: Cassandra fetches complete partition
Posted by nitin padalia <pa...@gmail.com>.
My question is specifically about the row cache. In Cassandra 2.1.2, when I
populate a column family with 1000 rows in one partition and the
rows_per_partition setting is 1000 for that column family, a query for a
specific row reports a cache miss for the first and last rows. If I
increase rows_per_partition to 1002, it is a hit for all rows.