Posted to user@cassandra.apache.org by Oskar Kjellin <os...@gmail.com> on 2017/04/21 08:44:55 UTC

Will query on PK read entire partition?

If I have a table like this:

PRIMARY KEY ((userid),deviceid)

And I query
SELECT * FROM devices where userid= ? and deviceid = ?

Will Cassandra read the entire partition for the userid? So if I have lots of
tombstones for that userid, will they get scanned?

I guess this depends on how the Bloom filter works. Does it contain the
partition key or the full primary key?

We're using 2.0.17 if it matters.

/Oskar

Re: Will query on PK read entire partition?

Posted by Vladimir Yudovin <vl...@winguzone.com>.
Hi,



If you provide the full primary key, C* will not scan the whole partition, but it will use the Bloom filter to determine which SSTables to read:

Cassandra uses Bloom filters to determine whether an SSTable has data for a particular row. Bloom filters are unused for range scans, but are used for index scans.
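
To make the Bloom filter part concrete, here is a minimal Python sketch of a per-SSTable filter. It is not Cassandra's implementation: the class names, bit count, and hash scheme are made up for illustration. It assumes the filter is populated with partition keys only (userid in the table above), which is how Cassandra's Bloom filters are keyed, so the filter can say "this SSTable may hold this partition" but knows nothing about the clustering key (deviceid).

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; sizes and hash scheme are illustrative only."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


class SSTable:
    """Hypothetical stand-in: partition key -> {clustering key: value}."""

    def __init__(self, partitions):
        self.partitions = partitions
        self.bloom = BloomFilter()
        for partition_key in partitions:
            self.bloom.add(partition_key)  # only the partition key is added

    def might_contain_partition(self, partition_key):
        return self.bloom.might_contain(partition_key)


sstable = SSTable({"user-1": {"device-a": "phone", "device-b": "tablet"}})
print(sstable.might_contain_partition("user-1"))
```

A filter built this way never gives a false negative for a partition it holds, but it cannot tell whether a given deviceid exists inside that partition; that is only decided once the SSTable itself is consulted.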






Best regards, Vladimir Yudovin,
Winguzone - Cloud Cassandra Hosting

---- On Fri, 21 Apr 2017 07:56:08 -0400 Alain RODRIGUEZ <arodrime@gmail.com> wrote ----

> Hi Oskar,
>
> My guess (wait for confirmation maybe): when you read with the partition key
> plus a specific clustering key (or a range of clustering keys), Apache
> Cassandra will look for these specific values and not read the entire
> partition. Yet it is important to know that a minimal block size of 64 KB is
> read from the disk (not configurable in C* 2.0). Or, if the table is
> compressed, the minimal read size is a chunk, for which you can manually set
> the size. That's why, when using small rows, it is sometimes worthwhile to
> enable compression, even if you don't care about the data size. This has all
> been improved a bit in 2.1 / 2.2 and greatly in C* 3.0+.
>
> I might write a post about this; if I do, I will let you know. It's an
> interesting topic I have been working on recently.
>
> C*heers,
> -----------------------
> Alain Rodriguez - @arodream - alain@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com

Re: Will query on PK read entire partition?

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hi Oskar,

My guess (wait for confirmation maybe): when you read with the partition key
plus a specific clustering key (or a range of clustering keys), Apache
Cassandra will look for these specific values and not read the entire
partition. Yet it is important to know that a minimal block size of 64 KB is
read from the disk (not configurable in C* 2.0). Or, if the table is
compressed, the minimal read size is a chunk, for which you can manually set
the size. That's why, when using small rows, it is sometimes worthwhile to
enable compression, even if you don't care about the data size. This has all
been improved a bit in 2.1 / 2.2 and greatly in C* 3.0+.
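
As a rough illustration of the "block" idea, here is a simplified Python model of an index over the clustering keys of one partition. It is a sketch under assumptions, not Cassandra's on-disk format: the function names are hypothetical, and blocks are sized by cell count here, whereas the 64 KB figure above is a byte size. The point it shows is that a lookup by clustering key only has to scan the one block that can contain the key, not the whole partition.

```python
from bisect import bisect_left


def build_column_index(clustering_keys, cells_per_block):
    """Split sorted clustering keys into blocks; record (first, last, offset)."""
    index = []
    for start in range(0, len(clustering_keys), cells_per_block):
        block = clustering_keys[start:start + cells_per_block]
        index.append((block[0], block[-1], start))
    return index


def find_block(index, clustering_key):
    """Return the offset of the only block that could hold the key, or None."""
    last_keys = [last for _, last, _ in index]
    i = bisect_left(last_keys, clustering_key)
    if i == len(index) or clustering_key < index[i][0]:
        return None  # key falls outside every block's range
    return index[i][2]


def read_cell(clustering_keys, index, clustering_key, cells_per_block):
    """Read one block and scan it; return (found, cells_scanned)."""
    offset = find_block(index, clustering_key)
    if offset is None:
        return False, 0
    block = clustering_keys[offset:offset + cells_per_block]
    return clustering_key in block, len(block)


# One partition (one userid) with 1000 deviceid clustering keys, sorted.
device_ids = [f"device-{n:04d}" for n in range(1000)]
index = build_column_index(device_ids, cells_per_block=100)
found, scanned = read_cell(device_ids, index, "device-0512", cells_per_block=100)
print(found, scanned)
```

In this model, everything stored in the selected block is still scanned, which is why the minimal read size matters even for a single-key lookup.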

I might write a post about this; if I do, I will let you know. It's an
interesting topic I have been working on recently.

C*heers,
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com



2017-04-21 10:44 GMT+02:00 Oskar Kjellin <os...@gmail.com>:

> If I have a table like this:
>
> PRIMARY KEY ((userid),deviceid)
>
> And I query
> SELECT * FROM devices where userid= ? and deviceid = ?
>
> Will Cassandra read the entire partition for the userid? So if I have lots of
> tombstones for that userid, will they get scanned?
>
> I guess this depends on how the Bloom filter works. Does it contain the
> partition key or the full primary key?
>
> We're using 2.0.17 if it matters.
>
> /Oskar
>