You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@asterixdb.apache.org by "Chen Luo (Jira)" <ji...@apache.org> on 2020/09/11 02:46:00 UTC

[jira] [Commented] (ASTERIXDB-2779) Join Condition Is Not Identified for TPC-H Q18

    [ https://issues.apache.org/jira/browse/ASTERIXDB-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193962#comment-17193962 ] 

Chen Luo commented on ASTERIXDB-2779:
-------------------------------------

[~shivaj] This issue may be of your interest if you're running TPC-H workloads.

> Join Condition Is Not Identified for TPC-H Q18
> ----------------------------------------------
>
>                 Key: ASTERIXDB-2779
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2779
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: COMP - Compiler
>            Reporter: Chen Luo
>            Priority: Major
>         Attachments: tpch_ddl.sql
>
>
> The query optimizer fails to identify the join condition for TPC-H Q18, and thus produced a query plan with a cartisian product. The DDLs are attached below.
> The original Q18 is as follows:
>  
> {code:java}
> use tpch;
> WITH tmp AS
> (
>     SELECT l_orderkey l_orderkey, sum(l.l_quantity) t_sum_quantity
>     FROM  LineItem AS l
>     GROUP BY l.l_orderkey as l_orderkey
> )SELECT c_name c_name, c_custkey c_custkey, o_orderkey o_orderkey,
>        o_orderdate o_orderdate, o_totalprice o_totalprice,
>        sum(l.l_quantity) sum_quantity
> FROM  Customer c,
>       Orders o,
>       tmp t,
>       LineItem l
> WHERE c.c_custkey = o.o_custkey AND o.o_orderkey = t.l_orderkey AND t.t_sum_quantity > 300
>       AND l.l_orderkey = t.l_orderkey
> GROUP BY c.c_name AS c_name,c.c_custkey AS c_custkey,
>          o.o_orderkey AS o_orderkey,o.o_orderdate AS o_orderdate,
>          o.o_totalprice AS o_totalprice
> ORDER BY o_totalprice DESC,o_orderdate
> LIMIT 100
> ;
> {code}
> However, the query condition was correctly identified after Q18 is refactored as follows:
>  
>  
> {code:java}
> use tpch;
> WITH tmp AS
> (
>     SELECT l_orderkey, sum(l.l_quantity) t_sum_quantity
>     FROM  LineItem AS l
>     GROUP BY l.l_orderkey as l_orderkey
>     HAVING  sum(l.l_quantity)>300
> )SELECT c_name, c_custkey, o_orderkey,
>        o_orderdate, o_totalprice,
>        sum(l.l_quantity) sum_quantity
> FROM  Customer c JOIN Orders o ON  c.c_custkey = o.o_custkey
>     JOIN tmp t ON o.o_orderkey = t.l_orderkey
>    JOIN LineItem l ON  t.l_orderkey =  l.l_orderkey
> GROUP BY c.c_name AS c_name,c.c_custkey AS c_custkey,
>          o.o_orderkey AS o_orderkey,o.o_orderdate AS o_orderdate,
>          o.o_totalprice AS o_totalprice
> ORDER BY o_totalprice DESC,o_orderdate
> LIMIT 100
> ;
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)