You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Julian Hyde (JIRA)" <ji...@apache.org> on 2018/06/07 21:31:00 UTC

[jira] [Comment Edited] (CALCITE-2329) Enhance SubQueryRemoveRule to rewrite IN operator with the constant from the left side more optimally

    [ https://issues.apache.org/jira/browse/CALCITE-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505249#comment-16505249 ] 

Julian Hyde edited comment on CALCITE-2329 at 6/7/18 9:30 PM:
--------------------------------------------------------------

Reviewing now:
* I notice that at one point you generate {{case when deptno is null then false else true end}}; that is identical to {{deptno is not null}}.

I wrote a couple of test cases that fail:
* "select <constant> IN (select <constant that is sometimes null and sometimes not null> ..." generates a cartesian product and returns 2x more rows than it should;
* "select cast(null as ...) IN (<empty query>)" returns unknown, but should return false.

See my branch https://github.com/julianhyde/calcite/tree/2329-constant-in which contains a few fix-ups and the new test-cases at the end of sub-query.iq.


was (Author: julianhyde):
Reviewing now:
* I notice that at one point you generate {{case when deptno is null then false else true end}}; that is identical to {{deptno is not null}}.

> Enhance SubQueryRemoveRule to rewrite IN operator with the constant from the left side more optimally
> -----------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-2329
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2329
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Volodymyr Vysotskyi
>            Assignee: Julian Hyde
>            Priority: Major
>
> Currently, for queries like this:
> {code:sql}
> select sal from emp where 2 IN (select deptno from dept)
> {code}
> SubQueryRemoveRule rules expand query plan in such a way:
> {noformat}
> LogicalProject(SAL=[$5])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>     LogicalJoin(condition=[=(2, $9)], joinType=[inner])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalProject(DEPTNO=[$0])
>         LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}
> Since join condition is actually a filter condition, it will be pushed into the filter during further planning stages and join with the true condition is left.
> But these types of the queries may be rewritten in the same way as EXISTS queries:
> {code:sql}
> select sal from emp where EXISTS (select deptno from dept where deptno=2)
> {code}
> with the more optimal plan:
> {noformat}
> LogicalProject(SAL=[$5])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>     LogicalJoin(condition=[true], joinType=[inner])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalAggregate(group=[{0}])
>         LogicalProject(i=[true])
>           LogicalFilter(condition=[=($0, 2)])
>             LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)