You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Vineet Garg (JIRA)" <ji...@apache.org> on 2016/11/15 23:55:58 UTC

[jira] [Comment Edited] (CALCITE-1493) Wrong plan for NOT IN correlated queries

    [ https://issues.apache.org/jira/browse/CALCITE-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668787#comment-15668787 ] 

Vineet Garg edited comment on CALCITE-1493 at 11/15/16 11:55 PM:
-----------------------------------------------------------------

Note that I had to hack RelOptTestBase's checkPlanning method to first generate the plan, apply SubqueryRemove rule and then immediately do decorrelation using RelDecorrelator.decorrelateQuery to reproduce above plans. (This is exactly the same way Hive plans to use Calcite for processing subqueries).
I tried adding test in SqlToRelConverterTest to reproduce this but it seems to bypass SubqueryRemove Rule.


was (Author: vgarg):
Note that I had to hack RelOptTestBase's checkPlanning method to first generate the plan, apply SubqueryRemove rule and then immediately do decorrelation using RelDecorrelator.decorrelateQuery to reproduce above plans. (This is exactly the same way Hive plans to use Calcite for processing subqueries)

> Wrong plan for NOT IN correlated queries
> ----------------------------------------
>
>                 Key: CALCITE-1493
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1493
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Vineet Garg
>            Assignee: Julian Hyde
>
> Plan generated by calcite with SubqueryRemoveRule followed by de-correlation for the following query:
> {noformat} select sal from emp where empno NOT IN (select deptno from dept where emp.job = dept.name) {noformat}
> is
> {noformat}
> LogicalProject(SAL=[$5])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>     LogicalFilter(condition=[IS NULL($11)])
>       LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[CAST($9):INTEGER], JOB0=[CAST($10):VARCHAR(10) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f2=[CAST($11):BOOLEAN])
>         LogicalJoin(condition=[AND(=($2, $10), =($0, $9))], joinType=[inner])
>           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>           LogicalProject(DEPTNO=[$0], JOB=[$1], $f2=[true])
>             LogicalAggregate(group=[{0, 1}])
>               LogicalProject(DEPTNO=[$0], JOB=[$2], i=[$1])
>                 LogicalProject(DEPTNO=[$0], i=[true], JOB=[$1])
>                   LogicalProject(DEPTNO=[$0], JOB=[$2])
>                     LogicalJoin(condition=[=($2, $1)], joinType=[inner])
>                       LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>                       LogicalAggregate(group=[{0}])
>                         LogicalProject(JOB=[$2])
>                           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {noformat}
> As you can notice instead of doing Left Outer Join Calcite is doing Inner Join. This will produce wrong results.
> Plan for same query just before SubqueryRemove Rule is:
> {noformat}
> LogicalProject(SAL=[$5])
>   LogicalFilter(condition=[NOT(IN($0, {
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[=($cor0.JOB, $1)])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> }))], variablesSet=[[$cor0]])
>     LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {noformat}
> Plan just after SubqueryRemove Rule:
> {noformat}
> LogicalProject(SAL=[$5])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>     LogicalFilter(condition=[IS NULL($10)])
>       LogicalFilter(condition=[=($0, $9)])
>         LogicalCorrelate(correlation=[$cor0], joinType=[LEFT], requiredColumns=[{2}])
>           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>           LogicalAggregate(group=[{0, 1}])
>             LogicalProject(DEPTNO=[$0], i=[true])
>               LogicalProject(DEPTNO=[$0])
>                 LogicalFilter(condition=[=($cor0.JOB, $1)])
>                   LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}
> Looking at above it seems RelDecorrelator have some issue where it is coming up with Inner Join.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)