Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2013/11/20 17:01:35 UTC

[jira] [Commented] (CASSANDRA-6377) ALLOW FILTERING should allow seq scan filtering

    [ https://issues.apache.org/jira/browse/CASSANDRA-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827787#comment-13827787 ] 

Sylvain Lebresne commented on CASSANDRA-6377:
---------------------------------------------

Now that I look more closely, I remember why we don't allow it.

We could allow the example above to work. But currently the code isn't able to handle the same example if there is a clustering column. The reason is that the internal sequential scanning works on the unit of internal rows. In particular, the filtering is done at that level, not at the CQL3 row level. That is fine without clustering columns, because one CQL3 row == one internal row, but it isn't otherwise.
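
To make the mismatch concrete, here is a toy Python model. It is only a sketch under stated assumptions: the dict layout, the "dept1" partition key, and the function names are invented for illustration and are not Cassandra's actual data structures or code. It shows that a filter applied to whole internal rows gives the right answer when one CQL3 row == one internal row, but lets a non-matching CQL3 row through once several CQL3 rows share an internal row.

```python
# Toy model only: layout and names are illustrative assumptions,
# not Cassandra internals.

# No clustering column: one internal row per CQL3 row.
# An internal row is modelled as {column_name: value}.
no_clustering = {
    100: {"firstname": "jane", "b_mon": "oct"},
    101: {"firstname": "john", "b_mon": "jan"},
}

# With a clustering column: one internal row holds several CQL3 rows.
# Each cell name is modelled as (clustering_value, column_name).
with_clustering = {
    "dept1": {
        (100, "firstname"): "jane", (100, "b_mon"): "oct",
        (101, "firstname"): "john", (101, "b_mon"): "jan",
    },
}


def internal_row_matches(cells, column, value):
    """Row-level check: does any cell for `column` in this internal row equal `value`?"""
    for name, cell_value in cells.items():
        col = name[1] if isinstance(name, tuple) else name
        if col == column and cell_value == value:
            return True
    return False


def scan(table, column, value):
    """Sequential scan that keeps or drops whole internal rows."""
    return {key: cells for key, cells in table.items()
            if internal_row_matches(cells, column, value)}


flat_result = scan(no_clustering, "b_mon", "oct")
clustered_result = scan(with_clustering, "b_mon", "oct")
```

In this model, flat_result contains only the row for 100, while clustered_result still carries the cells of CQL3 row 101 (b_mon = 'jan'), because the decision was made for the internal row as a whole.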

It's for a very similar reason that CompositesSearcher basically does the filtering itself and bypasses the filtering done in ColumnFamilyStore.filter(). So to make filtering work with CQL3 tables having a clustering column, we'd need to update sequential scanning in a similar way so that it does the filtering at the CQL3 level. And unfortunately, unless we push the IndexExpression filtering deep into SliceQueryFilter (doable but probably a tad ugly), I don't see an easy way to do that without 2 iterations over each row. On the whole, I'm not entirely sure it's worth getting into that trouble for an ALLOW FILTERING query. We can always revisit when we've entirely refactored the storage engine based on CQL3 rows :P
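
One possible reading of the "2 iterations over each row" cost, again as a toy Python sketch rather than anything resembling the real code path: the (clustering_value, column_name) cell naming and the filter_cql3_rows helper are assumptions made up for this illustration. The first pass groups the cells of an internal row into CQL3 rows, and only the second pass can apply the expression to each CQL3 row.

```python
from collections import defaultdict

# Toy model only: an internal row whose cell names are
# (clustering_value, column_name). Not Cassandra's actual layout.
internal_row = {
    (100, "firstname"): "jane", (100, "b_mon"): "oct",
    (101, "firstname"): "john", (101, "b_mon"): "jan",
}


def filter_cql3_rows(cells, column, value):
    """Hypothetical helper: filter at the CQL3 row level in two passes."""
    # Pass 1: regroup the cells of the internal row by clustering value,
    # so that each group is one CQL3 row.
    cql3_rows = defaultdict(dict)
    for (clustering, col), cell_value in cells.items():
        cql3_rows[clustering][col] = cell_value
    # Pass 2: keep only the CQL3 rows that satisfy the expression.
    return {clustering: row for clustering, row in cql3_rows.items()
            if row.get(column) == value}


matching = filter_cql3_rows(internal_row, "b_mon", "oct")
```

Here matching holds only the CQL3 row with clustering value 100. Folding the check into the first pass is the toy equivalent of pushing the expression down into the slice filter.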

Back to the example in the description: we can just allow that example and keep disallowing the case where there are clustering columns, but I wonder if supporting it only in that special case wouldn't be more confusing than not supporting it at all. But I'm fine writing the trivial patch if you think that's worth it?


> ALLOW FILTERING should allow seq scan filtering
> -----------------------------------------------
>
>                 Key: CASSANDRA-6377
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6377
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API
>            Reporter: Jonathan Ellis
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 1.2.13
>
>
> CREATE TABLE emp_table2 (
>         empID int PRIMARY KEY,
>         firstname text,
>         lastname text,
>         b_mon text,
>         b_day text,
>         b_yr text,
> );
> INSERT INTO emp_table2 (empID,firstname,lastname,b_mon,b_day,b_yr) 
>    VALUES (100,'jane','doe','oct','31','1980');
> INSERT INTO emp_table2 (empID,firstname,lastname,b_mon,b_day,b_yr) 
>    VALUES (101,'john','smith','jan','01','1981');
> INSERT INTO emp_table2 (empID,firstname,lastname,b_mon,b_day,b_yr) 
>    VALUES (102,'mary','jones','apr','15','1982');
> INSERT INTO emp_table2 (empID,firstname,lastname,b_mon,b_day,b_yr) 
>    VALUES (103,'tim','best','oct','25','1982');
>    
> SELECT b_mon,b_day,b_yr,firstname,lastname FROM emp_table2 
>     WHERE b_mon='oct' ALLOW FILTERING;
> Bad Request: No indexed columns present in by-columns clause with Equal operator



--
This message was sent by Atlassian JIRA
(v6.1#6144)