Posted to issues@drill.apache.org by "Robert Hou (JIRA)" <ji...@apache.org> on 2019/03/16 00:21:00 UTC

[jira] [Created] (DRILL-7109) Statistics adds external sort, which spills to disk

Robert Hou created DRILL-7109:
---------------------------------

             Summary: Statistics adds external sort, which spills to disk
                 Key: DRILL-7109
                 URL: https://issues.apache.org/jira/browse/DRILL-7109
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.16.0
            Reporter: Robert Hou
            Assignee: Gautam Parai
             Fix For: 1.16.0


TPC-H query 4 at scale factor 100 (sf 100) runs many times slower when statistics are used.  One issue is that the plan now contains an extra external sort, and both external sorts spill to disk.

Also, the hash join sees 100x more data.

Here is the query:
{noformat}
select
  o.o_orderpriority,
  count(*) as order_count
from
  orders o
where
  o.o_orderdate >= date '1996-10-01'
  and o.o_orderdate < date '1996-10-01' + interval '3' month
  and exists (
    select
      *
    from
      lineitem l
    where
      l.l_orderkey = o.o_orderkey
      and l.l_commitdate < l.l_receiptdate
  )
group by
  o.o_orderpriority
order by
  o.o_orderpriority;
{noformat}
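
The extra sort and the hash join input can be checked from the query plan. A minimal sketch, assuming the same session and storage plugin settings as the run above: prefix the query with Drill's EXPLAIN PLAN FOR and compare the output with and without statistics. How the sort and join operators are labeled in the plan text may differ between builds, so treat the comparison as a starting point rather than a definitive diagnosis.
{noformat}
explain plan for
select
  o.o_orderpriority,
  count(*) as order_count
from
  orders o
where
  o.o_orderdate >= date '1996-10-01'
  and o.o_orderdate < date '1996-10-01' + interval '3' month
  and exists (
    select
      *
    from
      lineitem l
    where
      l.l_orderkey = o.o_orderkey
      and l.l_commitdate < l.l_receiptdate
  )
group by
  o.o_orderpriority
order by
  o.o_orderpriority;
{noformat}
The row-count estimates shown for each operator in the plan are the place to look for the 100x difference at the hash join.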



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)