You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Stamatis Zampetakis (JIRA)" <ji...@apache.org> on 2019/04/22 12:49:00 UTC
[jira] [Resolved] (CALCITE-1624) Inefficient plan for NOT IN
correlated subqueries
[ https://issues.apache.org/jira/browse/CALCITE-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stamatis Zampetakis resolved CALCITE-1624.
------------------------------------------
Resolution: Not A Problem
Fix Version/s: 1.20.0
I am marking this as resolved rather than closed. If I am not wrong, closed is reserved for the time of the release.
> Inefficient plan for NOT IN correlated subqueries
> -------------------------------------------------
>
> Key: CALCITE-1624
> URL: https://issues.apache.org/jira/browse/CALCITE-1624
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: Vineet Garg
> Assignee: Danny Chan
> Priority: Major
> Fix For: 1.20.0
>
>
> I just noticed that {{NOT IN}} correlated subqueries produces an extra un-neccessary join after de-correlation (this is an addition to un-necessary joins reported in CALCITE-1494)
> Query
> {code:SQL}
> select sal from emp
> where empno NOT IN (
> select deptno from dept
> where emp.job = dept.name)
> {code}
> Plan after subquery remove rule:
> {code}
> LogicalProject(SAL=[$5])
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
> LogicalFilter(condition=[NOT(CASE(=($9, 0), false, IS NOT NULL($12), true, <($10, $9), true, false))])
> LogicalCorrelate(correlation=[$cor0], joinType=[LEFT], requiredColumns=[{2}])
> LogicalCorrelate(correlation=[$cor0], joinType=[LEFT], requiredColumns=[{2}])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{}], c=[COUNT()], ck=[COUNT($0)])
> LogicalProject(DEPTNO=[$0])
> LogicalFilter(condition=[=($cor0.JOB, $1)])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> LogicalFilter(condition=[=($cor0.EMPNO, $0)])
> LogicalAggregate(group=[{0, 1}])
> LogicalProject(DEPTNO=[$0], i=[true])
> LogicalProject(DEPTNO=[$0])
> LogicalFilter(condition=[=($cor0.JOB, $1)])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> Plan after de-correlation
> {code}
> LogicalProject(SAL=[$5])
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
> LogicalFilter(condition=[NOT(CASE(=($10, 0), false, IS NOT NULL($14), true, <($11, $10), true, false))])
> LogicalJoin(condition=[AND(=($0, $15), =($2, $13))], joinType=[left])
> LogicalJoin(condition=[=($2, $9)], joinType=[left])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0}], c=[COUNT()], ck=[COUNT($1)])
> LogicalProject(JOB=[$1], DEPTNO=[$0])
> LogicalProject(DEPTNO=[$0], JOB=[$2])
> LogicalJoin(condition=[=($2, $1)], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> LogicalAggregate(group=[{0}])
> LogicalProject(JOB=[$2])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalJoin(condition=[=($3, $0)], joinType=[inner]) // <== Un-necessary join
> LogicalProject(DEPTNO=[$0], JOB=[$1], $f2=[true])
> LogicalAggregate(group=[{0, 1}])
> LogicalProject(DEPTNO=[$0], JOB=[$2], i=[$1])
> LogicalProject(DEPTNO=[$0], i=[true], JOB=[$1])
> LogicalProject(DEPTNO=[$0], JOB=[$2])
> LogicalJoin(condition=[=($2, $1)], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> LogicalAggregate(group=[{0}])
> LogicalProject(JOB=[$2])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0}])
> LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> As you can see in plan after de-correlation there is an un-necessary inner join.
> This is not reproducible on CALCITE-1494's branch. But since this is a separate issue from CALCITE-1494 I decided to open a separate JIRA.
> Feel free to mark is duplicate or close it if you think otherwise.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)