You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by "Robert Hou (JIRA)" <ji...@apache.org> on 2019/04/30 20:53:00 UTC
[jira] [Created] (DRILL-7227) TPCDS queries 47, 57, 59 fail to run
with Statistics enabled at sf100
Robert Hou created DRILL-7227:
---------------------------------
Summary: TPCDS queries 47, 57, 59 fail to run with Statistics enabled at sf100
Key: DRILL-7227
URL: https://issues.apache.org/jira/browse/DRILL-7227
Project: Apache Drill
Issue Type: Bug
Components: Metadata
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Gautam Parai
Fix For: 1.17.0
Attachments: 23387ab0-cb1c-cd5e-449a-c9bcefc901c1.sys.drill, 2338ae93-155b-356d-382e-0da949c6f439.sys.drill
Here is query 78:
{noformat}
WITH ws
AS (SELECT d_year AS ws_sold_year,
ws_item_sk,
ws_bill_customer_sk ws_customer_sk,
Sum(ws_quantity) ws_qty,
Sum(ws_wholesale_cost) ws_wc,
Sum(ws_sales_price) ws_sp
FROM web_sales
LEFT JOIN web_returns
ON wr_order_number = ws_order_number
AND ws_item_sk = wr_item_sk
JOIN date_dim
ON ws_sold_date_sk = d_date_sk
WHERE wr_order_number IS NULL
GROUP BY d_year,
ws_item_sk,
ws_bill_customer_sk),
cs
AS (SELECT d_year AS cs_sold_year,
cs_item_sk,
cs_bill_customer_sk cs_customer_sk,
Sum(cs_quantity) cs_qty,
Sum(cs_wholesale_cost) cs_wc,
Sum(cs_sales_price) cs_sp
FROM catalog_sales
LEFT JOIN catalog_returns
ON cr_order_number = cs_order_number
AND cs_item_sk = cr_item_sk
JOIN date_dim
ON cs_sold_date_sk = d_date_sk
WHERE cr_order_number IS NULL
GROUP BY d_year,
cs_item_sk,
cs_bill_customer_sk),
ss
AS (SELECT d_year AS ss_sold_year,
ss_item_sk,
ss_customer_sk,
Sum(ss_quantity) ss_qty,
Sum(ss_wholesale_cost) ss_wc,
Sum(ss_sales_price) ss_sp
FROM store_sales
LEFT JOIN store_returns
ON sr_ticket_number = ss_ticket_number
AND ss_item_sk = sr_item_sk
JOIN date_dim
ON ss_sold_date_sk = d_date_sk
WHERE sr_ticket_number IS NULL
GROUP BY d_year,
ss_item_sk,
ss_customer_sk)
SELECT ss_item_sk,
Round(ss_qty / ( COALESCE(ws_qty + cs_qty, 1) ), 2) ratio,
ss_qty store_qty,
ss_wc
store_wholesale_cost,
ss_sp
store_sales_price,
COALESCE(ws_qty, 0) + COALESCE(cs_qty, 0)
other_chan_qty,
COALESCE(ws_wc, 0) + COALESCE(cs_wc, 0)
other_chan_wholesale_cost,
COALESCE(ws_sp, 0) + COALESCE(cs_sp, 0)
other_chan_sales_price
FROM ss
LEFT JOIN ws
ON ( ws_sold_year = ss_sold_year
AND ws_item_sk = ss_item_sk
AND ws_customer_sk = ss_customer_sk )
LEFT JOIN cs
ON ( cs_sold_year = ss_sold_year
AND cs_item_sk = cs_item_sk
AND cs_customer_sk = ss_customer_sk )
WHERE COALESCE(ws_qty, 0) > 0
AND COALESCE(cs_qty, 0) > 0
AND ss_sold_year = 1999
ORDER BY ss_item_sk,
ss_qty DESC,
ss_wc DESC,
ss_sp DESC,
other_chan_qty,
other_chan_wholesale_cost,
other_chan_sales_price,
Round(ss_qty / ( COALESCE(ws_qty + cs_qty, 1) ), 2)
LIMIT 100;
{noformat}
The profile for the new plan is 2338ae93-155b-356d-382e-0da949c6f439. Hash partition sender operator (10-00) takes 10-15 minutes. I am not sure why it takes so long. It has 10 minor fragments sending to receiver (06-05), which has 62 minor fragments. But hash partition sender (16-00) has 10 minor fragments sending to receiver (12-06), which has 220 minor fragments, and there is no performance issue.
The profile for the old plan is 23387ab0-cb1c-cd5e-449a-c9bcefc901c1. Both plans use the same commit. The old plan is created by disabling statistics.
I have not included the plans in the Jira because Jira has a max of 32K.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)