Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/03/28 01:51:00 UTC

[jira] [Commented] (DRILL-7121) TPCH 4 takes longer when Statistics is disabled.

    [ https://issues.apache.org/jira/browse/DRILL-7121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803527#comment-16803527 ] 

ASF GitHub Bot commented on DRILL-7121:
---------------------------------------

gparai commented on pull request #1718: DRILL-7121: Use correct ndv when statistics is disabled
URL: https://github.com/apache/drill/pull/1718
 
 
   @amansinha100 can you please review the PR? Thanks!
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> TPCH 4 takes longer when Statistics is disabled.
> ------------------------------------------------
>
>                 Key: DRILL-7121
>                 URL: https://issues.apache.org/jira/browse/DRILL-7121
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.16.0
>            Reporter: Robert Hou
>            Assignee: Gautam Parai
>            Priority: Blocker
>             Fix For: 1.16.0
>
>
> Here is TPCH 4 with sf 100:
> {noformat}
> select
>   o.o_orderpriority,
>   count(*) as order_count
> from
>   orders o
> where
>   o.o_orderdate >= date '1996-10-01'
>   and o.o_orderdate < date '1996-10-01' + interval '3' month
>   and 
>   exists (
>     select
>       *
>     from
>       lineitem l
>     where
>       l.l_orderkey = o.o_orderkey
>       and l.l_commitdate < l.l_receiptdate
>   )
> group by
>   o.o_orderpriority
> order by
>   o.o_orderpriority;
> {noformat}
> The plan changes when Statistics is disabled: a Hash Agg and a Broadcast Exchange are added. These two operators expand the number of rows from the lineitem table from 137M to 9B, which forces the hash join to use 6 GB of memory instead of 30 MB.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)