You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by "Robert Hou (JIRA)" <ji...@apache.org> on 2019/04/30 20:53:00 UTC

[jira] [Created] (DRILL-7227) TPCDS queries 47, 57, 59 fail to run with Statistics enabled at sf100

Robert Hou created DRILL-7227:
---------------------------------

             Summary: TPCDS queries 47, 57, 59 fail to run with Statistics enabled at sf100
                 Key: DRILL-7227
                 URL: https://issues.apache.org/jira/browse/DRILL-7227
             Project: Apache Drill
          Issue Type: Bug
          Components: Metadata
    Affects Versions: 1.16.0
            Reporter: Robert Hou
            Assignee: Gautam Parai
             Fix For: 1.17.0
         Attachments: 23387ab0-cb1c-cd5e-449a-c9bcefc901c1.sys.drill, 2338ae93-155b-356d-382e-0da949c6f439.sys.drill

Here is query 78:
{noformat}
WITH ws 
     AS (SELECT d_year                 AS ws_sold_year, 
                ws_item_sk, 
                ws_bill_customer_sk    ws_customer_sk, 
                Sum(ws_quantity)       ws_qty, 
                Sum(ws_wholesale_cost) ws_wc, 
                Sum(ws_sales_price)    ws_sp 
         FROM   web_sales 
                LEFT JOIN web_returns 
                       ON wr_order_number = ws_order_number 
                          AND ws_item_sk = wr_item_sk 
                JOIN date_dim 
                  ON ws_sold_date_sk = d_date_sk 
         WHERE  wr_order_number IS NULL 
         GROUP  BY d_year, 
                   ws_item_sk, 
                   ws_bill_customer_sk), 
     cs 
     AS (SELECT d_year                 AS cs_sold_year, 
                cs_item_sk, 
                cs_bill_customer_sk    cs_customer_sk, 
                Sum(cs_quantity)       cs_qty, 
                Sum(cs_wholesale_cost) cs_wc, 
                Sum(cs_sales_price)    cs_sp 
         FROM   catalog_sales 
                LEFT JOIN catalog_returns 
                       ON cr_order_number = cs_order_number 
                          AND cs_item_sk = cr_item_sk 
                JOIN date_dim 
                  ON cs_sold_date_sk = d_date_sk 
         WHERE  cr_order_number IS NULL 
         GROUP  BY d_year, 
                   cs_item_sk, 
                   cs_bill_customer_sk), 
     ss 
     AS (SELECT d_year                 AS ss_sold_year, 
                ss_item_sk, 
                ss_customer_sk, 
                Sum(ss_quantity)       ss_qty, 
                Sum(ss_wholesale_cost) ss_wc, 
                Sum(ss_sales_price)    ss_sp 
         FROM   store_sales 
                LEFT JOIN store_returns 
                       ON sr_ticket_number = ss_ticket_number 
                          AND ss_item_sk = sr_item_sk 
                JOIN date_dim 
                  ON ss_sold_date_sk = d_date_sk 
         WHERE  sr_ticket_number IS NULL 
         GROUP  BY d_year, 
                   ss_item_sk, 
                   ss_customer_sk) 
SELECT ss_item_sk, 
               Round(ss_qty / ( COALESCE(ws_qty + cs_qty, 1) ), 2) ratio, 
               ss_qty                                              store_qty, 
               ss_wc 
               store_wholesale_cost, 
               ss_sp 
               store_sales_price, 
               COALESCE(ws_qty, 0) + COALESCE(cs_qty, 0) 
               other_chan_qty, 
               COALESCE(ws_wc, 0) + COALESCE(cs_wc, 0) 
               other_chan_wholesale_cost, 
               COALESCE(ws_sp, 0) + COALESCE(cs_sp, 0) 
               other_chan_sales_price 
FROM   ss 
       LEFT JOIN ws 
              ON ( ws_sold_year = ss_sold_year 
                   AND ws_item_sk = ss_item_sk 
                   AND ws_customer_sk = ss_customer_sk ) 
       LEFT JOIN cs 
              ON ( cs_sold_year = ss_sold_year 
                   AND cs_item_sk = cs_item_sk 
                   AND cs_customer_sk = ss_customer_sk ) 
WHERE  COALESCE(ws_qty, 0) > 0 
       AND COALESCE(cs_qty, 0) > 0 
       AND ss_sold_year = 1999 
ORDER  BY ss_item_sk, 
          ss_qty DESC, 
          ss_wc DESC, 
          ss_sp DESC, 
          other_chan_qty, 
          other_chan_wholesale_cost, 
          other_chan_sales_price, 
          Round(ss_qty / ( COALESCE(ws_qty + cs_qty, 1) ), 2)
LIMIT 100; 
{noformat}

The profile for the new plan is 2338ae93-155b-356d-382e-0da949c6f439.  Hash partition sender operator (10-00) takes 10-15 minutes.  I am not sure why it takes so long.  It has 10 minor fragments sending to receiver (06-05), which has 62 minor fragments.  But hash partition sender (16-00) has 10 minor fragments sending to receiver (12-06), which has 220 minor fragments, and there is no performance issue.

The profile for the old plan is 23387ab0-cb1c-cd5e-449a-c9bcefc901c1.  Both plans use the same commit.  The old plan is created by disabling statistics.

I have not included the plans in the Jira because Jira has a max of 32K.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)