You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/04/30 22:23:00 UTC

[jira] [Commented] (DRILL-7227) TPCDS queries 47, 57, 59 fail to run with Statistics enabled at sf100

    [ https://issues.apache.org/jira/browse/DRILL-7227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830737#comment-16830737 ] 

ASF GitHub Bot commented on DRILL-7227:
---------------------------------------

gparai commented on pull request #1775: DRILL-7227: Fix predicate check in DrillRelOptUtil.analyzeSimpleEquiJoin
URL: https://github.com/apache/drill/pull/1775
 
 
   @amansinha100 can you please review the changes in commit f893ffb? Thanks!
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> TPCDS queries 47, 57, 59 fail to run with Statistics enabled at sf100
> ---------------------------------------------------------------------
>
>                 Key: DRILL-7227
>                 URL: https://issues.apache.org/jira/browse/DRILL-7227
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Metadata
>    Affects Versions: 1.16.0
>            Reporter: Robert Hou
>            Assignee: Gautam Parai
>            Priority: Major
>             Fix For: 1.17.0
>
>         Attachments: 23387ab0-cb1c-cd5e-449a-c9bcefc901c1.sys.drill, 2338ae93-155b-356d-382e-0da949c6f439.sys.drill
>
>
> Here is query 78:
> {noformat}
> WITH ws 
>      AS (SELECT d_year                 AS ws_sold_year, 
>                 ws_item_sk, 
>                 ws_bill_customer_sk    ws_customer_sk, 
>                 Sum(ws_quantity)       ws_qty, 
>                 Sum(ws_wholesale_cost) ws_wc, 
>                 Sum(ws_sales_price)    ws_sp 
>          FROM   web_sales 
>                 LEFT JOIN web_returns 
>                        ON wr_order_number = ws_order_number 
>                           AND ws_item_sk = wr_item_sk 
>                 JOIN date_dim 
>                   ON ws_sold_date_sk = d_date_sk 
>          WHERE  wr_order_number IS NULL 
>          GROUP  BY d_year, 
>                    ws_item_sk, 
>                    ws_bill_customer_sk), 
>      cs 
>      AS (SELECT d_year                 AS cs_sold_year, 
>                 cs_item_sk, 
>                 cs_bill_customer_sk    cs_customer_sk, 
>                 Sum(cs_quantity)       cs_qty, 
>                 Sum(cs_wholesale_cost) cs_wc, 
>                 Sum(cs_sales_price)    cs_sp 
>          FROM   catalog_sales 
>                 LEFT JOIN catalog_returns 
>                        ON cr_order_number = cs_order_number 
>                           AND cs_item_sk = cr_item_sk 
>                 JOIN date_dim 
>                   ON cs_sold_date_sk = d_date_sk 
>          WHERE  cr_order_number IS NULL 
>          GROUP  BY d_year, 
>                    cs_item_sk, 
>                    cs_bill_customer_sk), 
>      ss 
>      AS (SELECT d_year                 AS ss_sold_year, 
>                 ss_item_sk, 
>                 ss_customer_sk, 
>                 Sum(ss_quantity)       ss_qty, 
>                 Sum(ss_wholesale_cost) ss_wc, 
>                 Sum(ss_sales_price)    ss_sp 
>          FROM   store_sales 
>                 LEFT JOIN store_returns 
>                        ON sr_ticket_number = ss_ticket_number 
>                           AND ss_item_sk = sr_item_sk 
>                 JOIN date_dim 
>                   ON ss_sold_date_sk = d_date_sk 
>          WHERE  sr_ticket_number IS NULL 
>          GROUP  BY d_year, 
>                    ss_item_sk, 
>                    ss_customer_sk) 
> SELECT ss_item_sk, 
>                Round(ss_qty / ( COALESCE(ws_qty + cs_qty, 1) ), 2) ratio, 
>                ss_qty                                              store_qty, 
>                ss_wc 
>                store_wholesale_cost, 
>                ss_sp 
>                store_sales_price, 
>                COALESCE(ws_qty, 0) + COALESCE(cs_qty, 0) 
>                other_chan_qty, 
>                COALESCE(ws_wc, 0) + COALESCE(cs_wc, 0) 
>                other_chan_wholesale_cost, 
>                COALESCE(ws_sp, 0) + COALESCE(cs_sp, 0) 
>                other_chan_sales_price 
> FROM   ss 
>        LEFT JOIN ws 
>               ON ( ws_sold_year = ss_sold_year 
>                    AND ws_item_sk = ss_item_sk 
>                    AND ws_customer_sk = ss_customer_sk ) 
>        LEFT JOIN cs 
>               ON ( cs_sold_year = ss_sold_year 
>                    AND cs_item_sk = cs_item_sk 
>                    AND cs_customer_sk = ss_customer_sk ) 
> WHERE  COALESCE(ws_qty, 0) > 0 
>        AND COALESCE(cs_qty, 0) > 0 
>        AND ss_sold_year = 1999 
> ORDER  BY ss_item_sk, 
>           ss_qty DESC, 
>           ss_wc DESC, 
>           ss_sp DESC, 
>           other_chan_qty, 
>           other_chan_wholesale_cost, 
>           other_chan_sales_price, 
>           Round(ss_qty / ( COALESCE(ws_qty + cs_qty, 1) ), 2)
> LIMIT 100; 
> {noformat}
> The profile for the new plan is 2338ae93-155b-356d-382e-0da949c6f439.  Hash partition sender operator (10-00) takes 10-15 minutes.  I am not sure why it takes so long.  It has 10 minor fragments sending to receiver (06-05), which has 62 minor fragments.  But hash partition sender (16-00) has 10 minor fragments sending to receiver (12-06), which has 220 minor fragments, and there is no performance issue.
> The profile for the old plan is 23387ab0-cb1c-cd5e-449a-c9bcefc901c1.  Both plans use the same commit.  The old plan is created by disabling statistics.
> I have not included the plans in the Jira because Jira has a max of 32K.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)