You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "godfrey he (JIRA)" <ji...@apache.org> on 2019/02/20 08:53:00 UTC

[jira] [Updated] (CALCITE-2857) SemiJoin extends Join directly instead of EquiJoin to support non-equi condition

     [ https://issues.apache.org/jira/browse/CALCITE-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

godfrey he updated CALCITE-2857:
--------------------------------
    Description: 
SemiJoin extends Join directly instead of EquiJoin to support non-equi condition, 
 e.g.
{code:sql}
select * from (select * from dept where dept.deptno in (select emp.deptno from emp where emp.job <> dept.name))R where R.deptno <= 10
{code}

Currently, this query will be converted to
{code}
SemiJoin(condition=[AND(=($1, $3), =($0, $2))], joinType=[inner])
  LogicalFilter(condition=[<=($0, 10)])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
  LogicalProject(DEPTNO=[$7], NAME=[$9])
    LogicalJoin(condition=[<>($2, $9)], joinType=[inner])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalAggregate(group=[{0}])
        LogicalProject(NAME=[$1])
          LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}

however the following plan is more efficient than the above one:
{code}
LogicalProject(DEPTNO=[$0], NAME=[$1])
  LogicalFilter(condition=[<=($0, 10)])
    SemiJoin(condition=[=AND(=($0, $3), <>($1, $2))], joinType=[inner])
      LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
      LogicalProject(JOB=[$2], DEPTNO=[$7])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}

  was:
SemiJoin extends Join directly instead of EquiJoin to support non-equi condition, 
 e.g.
{code:sql}
select * from (select * from dept where dept.deptno in (select emp.deptno from emp where emp.job <> dept.name))R where R.deptno <= 10
{code}

Currently, this query will be converted to
{code}
SemiJoin(condition=[AND(=($1, $3), =($0, $2))], joinType=[inner])
  LogicalFilter(condition=[<=($0, 10)])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
  LogicalProject(DEPTNO=[$7], NAME=[$9])
    LogicalJoin(condition=[<>($2, $9)], joinType=[inner])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalAggregate(group=[{0}])
        LogicalProject(NAME=[$1])
          LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}

however the above plan is not efficient than the following:
{code}
LogicalProject(DEPTNO=[$0], NAME=[$1])
  LogicalFilter(condition=[<=($0, 10)])
    SemiJoin(condition=[=AND(=($0, $3), <>($1, $2))], joinType=[inner])
      LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
      LogicalProject(JOB=[$2], DEPTNO=[$7])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}


> SemiJoin extends Join directly instead of EquiJoin to support non-equi condition
> --------------------------------------------------------------------------------
>
>                 Key: CALCITE-2857
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2857
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: godfrey he
>            Assignee: Julian Hyde
>            Priority: Major
>             Fix For: 1.19.0
>
>
> SemiJoin extends Join directly instead of EquiJoin to support non-equi condition, 
>  e.g.
> {code:sql}
> select * from (select * from dept where dept.deptno in (select emp.deptno from emp where emp.job <> dept.name))R where R.deptno <= 10
> {code}
> Currently, this query will be converted to
> {code}
> SemiJoin(condition=[AND(=($1, $3), =($0, $2))], joinType=[inner])
>   LogicalFilter(condition=[<=($0, 10)])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>   LogicalProject(DEPTNO=[$7], NAME=[$9])
>     LogicalJoin(condition=[<>($2, $9)], joinType=[inner])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalAggregate(group=[{0}])
>         LogicalProject(NAME=[$1])
>           LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> however the following plan is more efficient than the above one:
> {code}
> LogicalProject(DEPTNO=[$0], NAME=[$1])
>   LogicalFilter(condition=[<=($0, 10)])
>     SemiJoin(condition=[=AND(=($0, $3), <>($1, $2))], joinType=[inner])
>       LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>       LogicalProject(JOB=[$2], DEPTNO=[$7])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)