Posted to user@cassandra.apache.org by onmstester onmstester <on...@zoho.com.INVALID> on 2020/01/12 14:04:01 UTC

bug in cluster key push down

Using Apache Cassandra 3.11.2, I defined a table like this:

create table my_table(
    partition text,
    clustering1 int,
    clustering2 text,
    data set<text>,
    primary key (partition, clustering1, clustering2))

and set the slow query threshold to 1 ms in cassandra.yaml to see how queries are passed to Cassandra. The query below:

select * from my_table where partition='a' and clustering1= 1 and clustering2='b'

appears like this in Cassandra's debug.log:

select * from my_table where partition='a' LIMIT 100>

(meaning that the two clustering key restrictions were not pushed down to the storage engine and the whole partition was retrieved)

but this query:

select * from my_table where partition='a' and clustering1= 1

appears as:

select * from my_table where partition='a' and clustering1= 1 LIMIT 100>

(the single clustering key restriction was pushed down to the storage engine)

So it seems to me that we cannot restrict multiple clustering keys in a select, because doing so would retrieve the whole partition?!


Re: [E] bug in cluster key push down

Posted by Hannu Kröger <hk...@gmail.com>.
No, I think it was originally correct.

If the partition key has multiple parts, then you need parentheses around the parts of the partition key.
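
As a minimal sketch of that difference: the two table names below are hypothetical, the column names come from the original post, and a keyspace is assumed to be already selected. In the first definition only partition is the partition key and the other two key columns are clustering columns. The extra inner parentheses in the second definition make the partition key composite.

```cql
-- Form used in the original post:
-- partition key = partition; clustering columns = clustering1, clustering2.
create table my_table_single_pk(
    partition text,
    clustering1 int,
    clustering2 text,
    data set<text>,
    primary key (partition, clustering1, clustering2));

-- Composite partition key: the inner parentheses group its parts.
-- partition key = (partition, clustering1); clustering column = clustering2.
create table my_table_composite_pk(
    partition text,
    clustering1 int,
    clustering2 text,
    data set<text>,
    primary key ((partition, clustering1), clustering2));
```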

Hannu

> On 13. Jan 2020, at 14.30, Saha, Sushanta K <su...@verizonwireless.com.INVALID> wrote:
> 
>> primary key (partition, clustering1, clustering2)
>> 
>> So, the partitioning key has three columns. You need to specify values for all three columns. For clustering columns, you need another parenthesis like primary key (partition, (clustering1, clustering2))
>> 
>> .... Sushanta
> 


Re: [E] Re: bug in cluster key push down

Posted by "Saha, Sushanta K" <su...@verizonwireless.com.INVALID>.
*primary key (partition, clustering1, clustering2)*

So, the partitioning key has three columns. You need to specify values for
all three columns. For clustering columns, you need another parenthesis
like *primary key (partition, (clustering1, clustering2))*

*.... Sushanta*


On Sun, Jan 12, 2020 at 10:52 AM Jeff Jirsa <jj...@gmail.com> wrote:

> Can you open a jira so someone can investigate ? It’s probably just a
> logging / visibility problem, but we should confirm


Re: bug in cluster key push down

Posted by onmstester onmstester <on...@zoho.com.INVALID>.
> It’s probably just a logging / visibility problem, but we should confirm

I think it is.

With tracing on, cqlsh reports "read 1 live rows..." for the query with both clustering keys restricted, whereas reading the whole partition (with no clustering key restriction) reports 12 live rows. So I suppose the clustering key restrictions are being pushed down to the storage engine.
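
The check can be reproduced in cqlsh roughly as follows. This is only a sketch: it assumes the table and example values from the original post, TRACING is a cqlsh session command rather than a CQL statement, and the exact wording of the trace output may differ between versions.

```cql
-- cqlsh session command: print the request trace after each statement
TRACING ON;

-- Both clustering columns restricted (the trace here reported 1 live row)
select * from my_table where partition='a' and clustering1= 1 and clustering2='b';

-- Partition key only (the trace here reported 12 live rows)
select * from my_table where partition='a';

TRACING OFF;
```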



Thanks Jeff
---- On Mon, 13 Jan 2020 08:38:44 +0330 onmstester onmstester <ma...@zoho.com.INVALID> wrote ----



Done.

https://issues.apache.org/jira/browse/CASSANDRA-15500




Re: bug in cluster key push down

Posted by onmstester onmstester <on...@zoho.com.INVALID>.
Done.

https://issues.apache.org/jira/browse/CASSANDRA-15500


---- On Sun, 12 Jan 2020 19:22:33 +0330 Jeff Jirsa <jj...@gmail.com> wrote ----


Can you open a jira so someone can investigate ? It’s probably just a logging / visibility problem, but we should confirm 


Re: bug in cluster key push down

Posted by Jeff Jirsa <jj...@gmail.com>.
Can you open a Jira so someone can investigate? It’s probably just a logging / visibility problem, but we should confirm.
