You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Aman Sinha (JIRA)" <ji...@apache.org> on 2015/05/09 03:25:59 UTC

[jira] [Commented] (CALCITE-714) Decorrelation does not push join condition into subquery

    [ https://issues.apache.org/jira/browse/CALCITE-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536085#comment-14536085 ] 

Aman Sinha commented on CALCITE-714:
------------------------------------

I am looking into this.   Here's another example query that does not have a cartesian join in the query but the decorrelated version creates a cartesian join: 
{code}
      SELECT 1 FROM emp e1, dept d1 where e1.deptno = d1.deptno 
            and e1.sal > (select avg(sal) from emp e2 where e1.empno = e2.empno)
{code}

> Decorrelation does not push join condition into subquery
> --------------------------------------------------------
>
>                 Key: CALCITE-714
>                 URL: https://issues.apache.org/jira/browse/CALCITE-714
>             Project: Calcite
>          Issue Type: Bug
>    Affects Versions: 1.1.0-incubating
>            Reporter: Aman Sinha
>            Assignee: Julian Hyde
>
> When decorrelating a scalar aggregate subquery,  we push copies of the tables from the outer query block into the subquery but don't push the join condition, thereby creating a cartesian join.  This seems to be a regression. 
> Query: 
> {code}
> select count(*) 
>     from dfs.`/Users/asinha/data/tpch-sf1/nation` n, dfs.`/Users/asinha/data/tpch-sf1/orders` o 
>          where n.n_nationkey = o.o_orderkey 
>          and n.n_nationkey > (select avg(ps.ps_suppkey) from dfs.`/Users/asinha/data/tpch-sf1/partsupp` ps
>                                where n.n_regionkey = ps.ps_partkey);
> {code}
> Here's the plan on an earlier version of Calcite (I am not sure of the version number but it was before the decorrelation refactoring),  Note the join between nation and orders at the leaf level has a equi-join condition.  
> {code} 
> AggregateRel(group=[{}], EXPR$0=[COUNT()])
>   ProjectRel($f0=[$0])
>     ProjectRel($f0=[0])
>       FilterRel(condition=[>($1, $5)])
>         ProjectRel(*=[$0], n_nationkey=[$1], n_regionkey=[$2], *0=[$3], o_orderkey=[$4], EXPR$0=[$6])
>           JoinRel(condition=[=($2, $5)], joinType=[left])
>             JoinRel(condition=[=($1, $4)], joinType=[inner])
>               EnumerableTableAccessRel(table=[[dfs, /Users/asinha/data/tpch-sf1/nation]])
>               EnumerableTableAccessRel(table=[[dfs, /Users/asinha/data/tpch-sf1/orders]])
>             AggregateRel(group=[{0}], EXPR$0=[AVG($1)])
>               ProjectRel($f0=[$1], ps_suppkey=[$0])
>                 ProjectRel(ps_suppkey=[$2], $f0=[$3])
>                   FilterRel(condition=[=($3, $1)])
>                     JoinRel(condition=[true], joinType=[inner])
>                       EnumerableTableAccessRel(table=[[dfs, /Users/asinha/data/tpch-sf1/partsupp]])
>                       AggregateRel(group=[{0}])
>                         ProjectRel($f0=[$2])
>                           JoinRel(condition=[=($1, $4)], joinType=[inner])
>                             EnumerableTableAccessRel(table=[[dfs, /Users/asinha/data/tpch-sf1/nation]])
>                             EnumerableTableAccessRel(table=[[dfs, /Users/asinha/data/tpch-sf1/orders]])
> {code}
> Here's the new plan (I am on version 1.1.0 but I think the plan has not changed in the latest version).  Note the join between nation and orders at the leaf level does not have any join condition. 
> {code}
> LogicalAggregate(group=[{}], EXPR$0=[COUNT()])
>   LogicalProject($f0=[$0])
>     LogicalProject($f0=[0])
>       LogicalFilter(condition=[AND(=($1, $4), >($1, $5))])
>         LogicalProject(*=[$0], n_nationkey=[$1], n_regionkey=[$2], *0=[$3], o_orderkey=[$4], EXPR$0=[$6])
>           LogicalJoin(condition=[=($2, $5)], joinType=[left])
>             LogicalJoin(condition=[true], joinType=[inner])
>               EnumerableTableScan(table=[[dfs, /Users/asinha/data/tpch-sf1/nation]])
>               EnumerableTableScan(table=[[dfs, /Users/asinha/data/tpch-sf1/orders]])
>             LogicalAggregate(group=[{0}], EXPR$0=[AVG($1)])
>               LogicalProject(n_regionkey=[$1], ps_suppkey=[$0])
>                 LogicalProject(ps_suppkey=[$2], n_regionkey=[$3])
>                   LogicalFilter(condition=[=($3, $1)])
>                     LogicalJoin(condition=[true], joinType=[inner])
>                       EnumerableTableScan(table=[[dfs, /Users/asinha/data/tpch-sf1/partsupp]])
>                       LogicalAggregate(group=[{0}])
>                         LogicalProject(n_regionkey=[$2])
>                           LogicalJoin(condition=[true], joinType=[inner])
>                             EnumerableTableScan(table=[[dfs, /Users/asinha/data/tpch-sf1/nation]])
>                             EnumerableTableScan(table=[[dfs, /Users/asinha/data/tpch-sf1/orders]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)