You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Alessandro Solimando (Jira)" <ji...@apache.org> on 2022/01/11 10:11:00 UTC

[jira] [Commented] (CALCITE-4981) tpch q21 query plan has many cartesian joins after decorrelation

    [ https://issues.apache.org/jira/browse/CALCITE-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472624#comment-17472624 ] 

Alessandro Solimando commented on CALCITE-4981:
-----------------------------------------------

[~yyfyyf123123], would you mind sharing the plan you obtain when using _CoreRules.FILTER_CORRELATE_ (for future reference)? 

If confirmed, I guess that also https://issues.apache.org/jira/browse/CALCITE-320 can be closed, right?

> tpch q21 query plan has many cartesian joins after decorrelation
> ----------------------------------------------------------------
>
>                 Key: CALCITE-4981
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4981
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.27.0
>            Reporter: Yifei Yang
>            Priority: Major
>
> Hi, I found the query plan of tpch q21 has 6 cartesian joins (totally 18 joins) after calling the decorrelator. It's first reported there, but it seems still an issue in 1.27.0. I'm not sure if it's a bug of Calcite or I need to incorporate some additional rules to fix it. Thank you!
> query (tpch q21):
> {quote}{{select}}
> {{  s.s_name,}}
> {{  count(*) as numwait}}
> {{from}}
> {{  supplier s,}}
> {{  lineitem l1,}}
> {{  orders o,}}
> {{  nation n}}
> {{where}}
> {{  s.s_suppkey = l1.l_suppkey}}
> {{  and o.o_orderkey = l1.l_orderkey}}
> {{  and o.o_orderstatus = 'F'}}
> {{  and l1.l_receiptdate > l1.l_commitdate}}
> {{  and exists (}}
> {{    select}}
> {{      *}}
> {{    from}}
> {{      lineitem l2}}
> {{    where}}
> {{      l2.l_orderkey = l1.l_orderkey}}
> {{      and l2.l_suppkey <> l1.l_suppkey}}
> {{  )}}
> {{  and not exists (}}
> {{    select}}
> {{      *}}
> {{    from}}
> {{      lineitem l3}}
> {{    where}}
> {{      l3.l_orderkey = l1.l_orderkey}}
> {{      and l3.l_suppkey <> l1.l_suppkey}}
> {{      and l3.l_receiptdate > l3.l_commitdate}}
> {{  )}}
> {{  and s.s_nationkey = n.n_nationkey}}
> {{  and n.n_name = 'BRAZIL'}}
> {{group by}}
> {{  s.s_name}}
> {{order by}}
> {{  numwait desc,}}
> {{  s.s_name}}
> {{limit 100}}
> {quote}
> Plan generated:
> {quote}EnumerableProject(S_NAME=[$0], NUMWAIT=[$1])
>   EnumerableLimitSort(sort0=[$1], sort1=[$0], dir0=[DESC], dir1=[ASC], fetch=[100])
>     EnumerableAggregate(group=[\{0}], NUMWAIT=[COUNT()])
>       EnumerableProject(s_name=[$3])
>         EnumerableFilter(condition=[IS NULL($39)])
>           EnumerableHashJoin(condition=[AND(=($11, $37), =($15, $38))], joinType=[left])
>             EnumerableProject(s_comment=[$0], s_phone=[$1], s_nationkey=[$2], s_name=[$3], s_address=[$4], s_acctbal=[$5], s_suppkey=[$6], l_returnflag=[$7], l_receiptdate=[$8], l_tax=[$9], l_shipmode=[$10], l_suppkey=[$11], l_shipdate=[$12], l_commitdate=[$13], l_partkey=[$14], l_orderkey=[$15], l_quantity=[$16], l_linestatus=[$17], l_comment=[$18], l_extendedprice=[$19], l_linenumber=[$20], l_discount=[$21], l_shipinstruct=[$22], o_orderstatus=[$23], o_clerk=[$24], o_orderdate=[$25], o_shippriority=[$26], o_custkey=[$27], o_totalprice=[$28], o_orderkey=[$29], o_comment=[$30], o_orderpriority=[$31], n_comment=[$32], n_nationkey=[$33], n_regionkey=[$34], n_name=[$35], $f0=[CAST($38):BOOLEAN])
>               EnumerableHashJoin(condition=[AND(=($11, $36), =($15, $37))], joinType=[inner])
>                 EnumerableProject(s_comment=[$0], s_phone=[$1], s_nationkey=[$2], s_name=[$3], s_address=[$4], s_acctbal=[$5], s_suppkey=[$6], l_returnflag=[$7], l_receiptdate=[$8], l_tax=[$9], l_shipmode=[$10], l_suppkey=[$11], l_shipdate=[$12], l_commitdate=[$13], l_partkey=[$14], l_orderkey=[$15], l_quantity=[$16], l_linestatus=[$17], l_comment=[$18], l_extendedprice=[$19], l_linenumber=[$20], l_discount=[$21], l_shipinstruct=[$22], o_orderstatus=[$27], o_clerk=[$28], o_orderdate=[$29], o_shippriority=[$30], o_custkey=[$31], o_totalprice=[$32], o_orderkey=[$33], o_comment=[$34], o_orderpriority=[$35], n_comment=[$23], n_nationkey=[$24], n_regionkey=[$25], n_name=[$26])
>                   EnumerableHashJoin(condition=[=($15, $33)], joinType=[inner])
>                     EnumerableProject(s_comment=[$0], s_phone=[$1], s_nationkey=[$2], s_name=[$3], s_address=[$4], s_acctbal=[$5], s_suppkey=[$6], l_returnflag=[$11], l_receiptdate=[$12], l_tax=[$13], l_shipmode=[$14], l_suppkey=[$15], l_shipdate=[$16], l_commitdate=[$17], l_partkey=[$18], l_orderkey=[$19], l_quantity=[$20], l_linestatus=[$21], l_comment=[$22], l_extendedprice=[$23], l_linenumber=[$24], l_discount=[$25], l_shipinstruct=[$26], n_comment=[$7], n_nationkey=[$8], n_regionkey=[$9], n_name=[$10])
>                       EnumerableHashJoin(condition=[=($6, $15)], joinType=[inner])
>                         EnumerableHashJoin(condition=[=($2, $8)], joinType=[inner])
>                           EnumerableTableScan(table=[[tpch-sf0.01/csv, supplier]])
>                           EnumerableFilter(condition=[=($3, 'BRAZIL')])
>                             EnumerableTableScan(table=[[tpch-sf0.01/csv, nation]])
>                         EnumerableFilter(condition=[>($1, $6)])
>                           EnumerableTableScan(table=[[tpch-sf0.01/csv, lineitem]])
>                     EnumerableFilter(condition=[=($0, 'F')])
>                       EnumerableTableScan(table=[[tpch-sf0.01/csv, orders]])
>                 EnumerableAggregate(group=[\{0, 1}], agg#0=[MIN($2)])
>                   EnumerableProject(l_suppkey0=[$16], l_orderkey0=[$17], $f0=[true])
>                     EnumerableFilter(condition=[<>($4, $16)])
>                       EnumerableHashJoin(condition=[=($8, $17)], joinType=[inner])
>                         EnumerableTableScan(table=[[tpch-sf0.01/csv, lineitem]])
>                         EnumerableAggregate(group=[\{0, 1}])
>                           EnumerableProject(l_suppkey=[$11], l_orderkey=[$15])
>                             EnumerableNestedLoopJoin(condition=[true], joinType=[inner])
>                               EnumerableNestedLoopJoin(condition=[true], joinType=[inner])
>                                 EnumerableNestedLoopJoin(condition=[true], joinType=[inner])
>                                   EnumerableTableScan(table=[[tpch-sf0.01/csv, supplier]])
>                                   EnumerableTableScan(table=[[tpch-sf0.01/csv, lineitem]])
>                                 EnumerableTableScan(table=[[tpch-sf0.01/csv, orders]])
>                               EnumerableTableScan(table=[[tpch-sf0.01/csv, nation]])
>             EnumerableAggregate(group=[\{0, 1}], agg#0=[MIN($2)])
>               EnumerableProject(l_suppkey0=[$16], l_orderkey0=[$17], $f0=[true])
>                 EnumerableFilter(condition=[<>($4, $16)])
>                   EnumerableHashJoin(condition=[=($8, $17)], joinType=[inner])
>                     EnumerableFilter(condition=[>($1, $6)])
>                       EnumerableTableScan(table=[[tpch-sf0.01/csv, lineitem]])
>                     EnumerableAggregate(group=[\{0, 1}])
>                       EnumerableProject(l_suppkey=[$11], l_orderkey=[$15])
>                         EnumerableHashJoin(condition=[AND(=($11, $36), =($15, $37))], joinType=[semi])
>                           EnumerableProject(s_comment=[$0], s_phone=[$1], s_nationkey=[$2], s_name=[$3], s_address=[$4], s_acctbal=[$5], s_suppkey=[$6], l_returnflag=[$7], l_receiptdate=[$8], l_tax=[$9], l_shipmode=[$10], l_suppkey=[$11], l_shipdate=[$12], l_commitdate=[$13], l_partkey=[$14], l_orderkey=[$15], l_quantity=[$16], l_linestatus=[$17], l_comment=[$18], l_extendedprice=[$19], l_linenumber=[$20], l_discount=[$21], l_shipinstruct=[$22], o_orderstatus=[$27], o_clerk=[$28], o_orderdate=[$29], o_shippriority=[$30], o_custkey=[$31], o_totalprice=[$32], o_orderkey=[$33], o_comment=[$34], o_orderpriority=[$35], n_comment=[$23], n_nationkey=[$24], n_regionkey=[$25], n_name=[$26])
>                             EnumerableHashJoin(condition=[=($15, $33)], joinType=[inner])
>                               EnumerableProject(s_comment=[$0], s_phone=[$1], s_nationkey=[$2], s_name=[$3], s_address=[$4], s_acctbal=[$5], s_suppkey=[$6], l_returnflag=[$11], l_receiptdate=[$12], l_tax=[$13], l_shipmode=[$14], l_suppkey=[$15], l_shipdate=[$16], l_commitdate=[$17], l_partkey=[$18], l_orderkey=[$19], l_quantity=[$20], l_linestatus=[$21], l_comment=[$22], l_extendedprice=[$23], l_linenumber=[$24], l_discount=[$25], l_shipinstruct=[$26], n_comment=[$7], n_nationkey=[$8], n_regionkey=[$9], n_name=[$10])
>                                 EnumerableHashJoin(condition=[=($6, $15)], joinType=[inner])
>                                   EnumerableHashJoin(condition=[=($2, $8)], joinType=[inner])
>                                     EnumerableTableScan(table=[[tpch-sf0.01/csv, supplier]])
>                                     EnumerableFilter(condition=[=($3, 'BRAZIL')])
>                                       EnumerableTableScan(table=[[tpch-sf0.01/csv, nation]])
>                                   EnumerableFilter(condition=[>($1, $6)])
>                                     EnumerableTableScan(table=[[tpch-sf0.01/csv, lineitem]])
>                               EnumerableFilter(condition=[=($0, 'F')])
>                                 EnumerableTableScan(table=[[tpch-sf0.01/csv, orders]])
>                           EnumerableProject(l_suppkey0=[$16], l_orderkey0=[$17], $f0=[true])
>                             EnumerableFilter(condition=[<>($4, $16)])
>                               EnumerableHashJoin(condition=[=($8, $17)], joinType=[inner])
>                                 EnumerableTableScan(table=[[tpch-sf0.01/csv, lineitem]])
>                                 EnumerableAggregate(group=[\{0, 1}])
>                                   EnumerableProject(l_suppkey=[$11], l_orderkey=[$15])
>                                     EnumerableNestedLoopJoin(condition=[true], joinType=[inner])
>                                       EnumerableNestedLoopJoin(condition=[true], joinType=[inner])
>                                         EnumerableNestedLoopJoin(condition=[true], joinType=[inner])
>                                           EnumerableTableScan(table=[[tpch-sf0.01/csv, supplier]])
>                                           EnumerableTableScan(table=[[tpch-sf0.01/csv, lineitem]])
>                                         EnumerableTableScan(table=[[tpch-sf0.01/csv, orders]])
>                                       EnumerableTableScan(table=[[tpch-sf0.01/csv, nation]])
> {quote}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)