Posted to issues@drill.apache.org by "Robert Hou (JIRA)" <ji...@apache.org> on 2019/03/19 23:15:00 UTC
[jira] [Created] (DRILL-7121) TPCH 4 takes longer
Robert Hou created DRILL-7121:
---------------------------------
Summary: TPCH 4 takes longer
Key: DRILL-7121
URL: https://issues.apache.org/jira/browse/DRILL-7121
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Gautam Parai
Fix For: 1.16.0
Here is TPCH 4 with sf 100:
{noformat}
select
  o.o_orderpriority,
  count(*) as order_count
from
  orders o
where
  o.o_orderdate >= date '1996-10-01'
  and o.o_orderdate < date '1996-10-01' + interval '3' month
  and exists (
    select
      *
    from
      lineitem l
    where
      l.l_orderkey = o.o_orderkey
      and l.l_commitdate < l.l_receiptdate
  )
group by
  o.o_orderpriority
order by
  o.o_orderpriority;
{noformat}
With statistics disabled, the plan has changed: a Hash Agg and a Broadcast Exchange have been added. Together these two operators expand the number of rows coming from the lineitem table from 137M to 9B, which forces the hash join to use 6 GB of memory instead of 30 MB.
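For a sense of scale, the short sketch below does nothing more than arithmetic on the figures quoted in this report. The variable names are illustrative, and the 1 GB = 1024 MB conversion is an assumption; the row counts and memory sizes are the rounded values given above, not fresh measurements.

```python
# Rounded figures as quoted in the report above.
rows_before = 137_000_000        # lineitem rows reaching the hash join previously
rows_after = 9_000_000_000       # rows after the added Hash Agg + Broadcast Exchange
mem_before_mb = 30               # hash join memory previously, in MB
mem_after_mb = 6 * 1024          # hash join memory now; assumes 1 GB = 1024 MB

# Growth factors implied by those figures.
row_factor = rows_after / rows_before
mem_factor = mem_after_mb / mem_before_mb

print(f"row expansion: ~{row_factor:.1f}x")   # ~65.7x
print(f"memory growth: ~{mem_factor:.1f}x")   # ~204.8x
```

On these numbers the join input grows by roughly 66x and its memory use by roughly 200x.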
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)