You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by va...@geodesiconline.com on 2012/01/19 12:18:22 UTC

CQL 'Where' clause ignores secondary index filter

Hi,
 
I've defined a  column family 'Vaibhav' in which every row has few columns and its values. I've declared two column as secondary index so that I can filter the rows on the basis of those column values.
 
Now whenever I execute a CQL with either only rowkey or column name in 'WHERE' clause, it gives me the proper result. But when I use both rowkey and column name filter in 'WHERE' clause it ignores the column filter and gives the result matching with the rowkey.
 
Sample Records from column family 'Vaibhav':
===================================
 
 
RowKey: 1111 

=> (column=Bare_JID, value=mithun@geo.com, timestamp=1326436553009) 
=> (column=Domain, value=geo.com, timestamp=1326436553014) 
=> (column=Full_JID, value=mithun@geo.com/android, timestamp=1326436552979) 
------------------- 
RowKey: 2222
=> (column=Bare_JID, value=venkat@geo.com, timestamp=1326436611316)
=> (column=Domain, value=geo.com, timestamp=1326436611318)
=> (column=Full_JID, value=venkat@geo.com/android, timestamp=1326436611310)
-------------------
RowKey: 3333
=> (column=Bare_JID, value=kondiram@geo.com, timestamp=1326436643841)
=> (column=Domain, value=geo.com, timestamp=1326436643842)
=> (column=Full_JID, value=kondiram@geo.com/android, timestamp=1326436643835)
 
 
Here both 'Bare_JID' and 'Domain' have secondary Index. 
 
CQL : SELECT * from Vaibhav where KEY = '1111' AND Bare_JID='venkat@geo.com' 
 
is returning the first row, but ideally it should return no records as both the condition are not satisfied.
 
 
Kindly help me to resolve this issue.
 
Regards,
Vaibhav S.

Re: CQL 'Where' clause ignores secondary index filter

Posted by Sylvain Lebresne <sy...@datastax.com>.
I think that qualify as a bug. We should either refuse the query if we
don't know how to do this correctly or return a sensible result (i.e,
no result in that case).
Would you mind opening a ticket on
https://issues.apache.org/jira/browse/CASSANDRA?

--
Sylvain

On Fri, Jan 20, 2012 at 6:39 AM,  <va...@geodesiconline.com> wrote:
> Dear Aaron,
>
>
>
> Thanks for the information.
>
>
>
> Actually it's a normal query which works with SQL. I believe there will be
> some mechanism to do so in Cassandra, as first retrieving the records based
> on key and then checking for the column index later will be inefficient.
>
>
>
> Thanks again.
>
>
>
> Regards,
>
> Vaibhav
>
>
>
> -----Original Message-----
> From: "aaron morton" <aa...@thelastpickle.com>
> Sent: Friday, January 20, 2012 12:23am
> To: user@cassandra.apache.org
> Subject: Re: CQL 'Where' clause ignores secondary index filter
>
> It is working as expected.
> Because you have specified a KEY the query returns records that match that
> key(s), and it ignores the other clauses.
> Selecting rows follows one of three paths:
> * selects rows by key(s)
> * select rows by key range, i.e. rows after this key.
> * select rows by (secondary) column index.
> Hope that helps.
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 20/01/2012, at 12:18 AM, vaibhav.s@geodesiconline.com wrote:
>
> Hi,
>
>
>
> I've defined a  column family 'Vaibhav' in which every row has few columns
> and its values. I've declared two column as secondary index so that I can
> filter the rows on the basis of those column values.
>
>
>
> Now whenever I execute a CQL with either only rowkey or column name in
> 'WHERE' clause, it gives me the proper result. But when I use both rowkey
> and column name filter in 'WHERE' clause it ignores the column filter and
> gives the result matching with the rowkey.
>
>
>
> Sample Records from column family 'Vaibhav':
> ===================================
>
>
>
>
>
> RowKey: 1111
>
>
>
> => (column=Bare_JID, value=mithun@geo.com, timestamp=1326436553009)
> => (column=Domain, value=geo.com, timestamp=1326436553014)
> => (column=Full_JID, value=mithun@geo.com/android, timestamp=1326436552979)
> -------------------
> RowKey: 2222
> => (column=Bare_JID, value=venkat@geo.com, timestamp=1326436611316)
> => (column=Domain, value=geo.com, timestamp=1326436611318)
> => (column=Full_JID, value=venkat@geo.com/android, timestamp=1326436611310)
> -------------------
> RowKey: 3333
> => (column=Bare_JID, value=kondiram@geo.com, timestamp=1326436643841)
> => (column=Domain, value=geo.com, timestamp=1326436643842)
> => (column=Full_JID, value=kondiram@geo.com/android,
> timestamp=1326436643835)
>
>
>
>
>
> Here both 'Bare_JID' and 'Domain' have secondary Index.
>
>
>
> CQL : SELECT * from Vaibhav where KEY = '1111' AND Bare_JID='venkat@geo.com'
>
>
>
> is returning the first row, but ideally it should return no records as both
> the condition are not satisfied.
>
>
>
>
>
> Kindly help me to resolve this issue.
>
>
>
> Regards,
> Vaibhav S.

Re: CQL 'Where' clause ignores secondary index filter

Posted by va...@geodesiconline.com.
Dear Aaron,
 
Thanks for the information.
 
Actually it's a normal query which works with SQL. I believe there will be some mechanism to do so in Cassandra, as first retrieving the records based on key and then checking for the column index later will be inefficient.
 
Thanks again.
 
Regards,
Vaibhav
 
-----Original Message-----
From: "aaron morton" <aa...@thelastpickle.com>
Sent: Friday, January 20, 2012 12:23am
To: user@cassandra.apache.org
Subject: Re: CQL 'Where' clause ignores secondary index filter


It is working as expected.
Because you have specified a KEY the query returns records that match that key(s), and it ignores the other clauses.
Selecting rows follows one of three paths:
* selects rows by key(s)
* select rows by key range, i.e. rows after this key.
* select rows by (secondary) column index.
Hope that helps.






-----------------
Aaron Morton
Freelance Developer
@aaronmorton
[http://www.thelastpickle.com] http://www.thelastpickle.com


On 20/01/2012, at 12:18 AM, [mailto:vaibhav.s@geodesiconline.com] vaibhav.s@geodesiconline.com wrote:

Hi,
 
I've defined a  column family 'Vaibhav' in which every row has few columns and its values. I've declared two column as secondary index so that I can filter the rows on the basis of those column values.
 
Now whenever I execute a CQL with either only rowkey or column name in 'WHERE' clause, it gives me the proper result. But when I use both rowkey and column name filter in 'WHERE' clause it ignores the column filter and gives the result matching with the rowkey.
 
Sample Records from column family 'Vaibhav':
===================================
 
 
RowKey: 1111 
 
=> (column=Bare_JID, [mailto:value=mithun@geo.com] value=mithun@geo.com, timestamp=1326436553009) 
=> (column=Domain, value=geo.com, timestamp=1326436553014) 
=> (column=Full_JID, [mailto:value=mithun@geo.com] value=mithun@geo.com/android, timestamp=1326436552979) 
------------------- 
RowKey: 2222
=> (column=Bare_JID, [mailto:value=venkat@geo.com] value=venkat@geo.com, timestamp=1326436611316)
=> (column=Domain, value=geo.com, timestamp=1326436611318)
=> (column=Full_JID, [mailto:value=venkat@geo.com] value=venkat@geo.com/android, timestamp=1326436611310)
-------------------
RowKey: 3333
=> (column=Bare_JID, [mailto:value=kondiram@geo.com] value=kondiram@geo.com, timestamp=1326436643841)
=> (column=Domain, value=geo.com, timestamp=1326436643842)
=> (column=Full_JID, [mailto:value=kondiram@geo.com] value=kondiram@geo.com/android, timestamp=1326436643835)
 
 
Here both 'Bare_JID' and 'Domain' have secondary Index. 
 
CQL : SELECT * from Vaibhav where KEY = '1111' AND [mailto:Bare_JID=] Bare_JID='venkat@geo.com' 
 
is returning the first row, but ideally it should return no records as both the condition are not satisfied.
 
 
Kindly help me to resolve this issue.
 
Regards,
Vaibhav S.

Re: CQL 'Where' clause ignores secondary index filter

Posted by aaron morton <aa...@thelastpickle.com>.
It is working as expected. 

Because you have specified a KEY the query returns records that match that key(s), and it ignores the other clauses. 

Selecting rows follows one of three paths:
* selects rows by key(s)
* select rows by key range, i.e. rows after this key. 
* select rows by (secondary) column index. 

Hope that helps. 
  
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 20/01/2012, at 12:18 AM, vaibhav.s@geodesiconline.com wrote:

> Hi,
>  
> I've defined a  column family 'Vaibhav' in which every row has few columns and its values. I've declared two column as secondary index so that I can filter the rows on the basis of those column values.
>  
> Now whenever I execute a CQL with either only rowkey or column name in 'WHERE' clause, it gives me the proper result. But when I use both rowkey and column name filter in 'WHERE' clause it ignores the column filter and gives the result matching with the rowkey.
>  
> Sample Records from column family 'Vaibhav':
> ===================================
>  
>  
> RowKey: 1111
> => (column=Bare_JID, value=mithun@geo.com, timestamp=1326436553009)
> => (column=Domain, value=geo.com, timestamp=1326436553014)
> => (column=Full_JID, value=mithun@geo.com/android, timestamp=1326436552979)
> -------------------
> RowKey: 2222
> => (column=Bare_JID, value=venkat@geo.com, timestamp=1326436611316)
> => (column=Domain, value=geo.com, timestamp=1326436611318)
> => (column=Full_JID, value=venkat@geo.com/android, timestamp=1326436611310)
> -------------------
> RowKey: 3333
> => (column=Bare_JID, value=kondiram@geo.com, timestamp=1326436643841)
> => (column=Domain, value=geo.com, timestamp=1326436643842)
> => (column=Full_JID, value=kondiram@geo.com/android, timestamp=1326436643835)
>  
>  
> Here both 'Bare_JID' and 'Domain' have secondary Index.
>  
> CQL : SELECT * from Vaibhav where KEY = '1111' AND Bare_JID='venkat@geo.com'
>  
> is returning the first row, but ideally it should return no records as both the condition are not satisfied.
>  
>  
> Kindly help me to resolve this issue.
>  
> Regards,
> Vaibhav S.
>