You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Robert Hou (JIRA)" <ji...@apache.org> on 2019/04/04 19:09:00 UTC
[jira] [Updated] (DRILL-7154) TPCH query 4, 17 and 18 take longer with sf 1000 when Statistics are disabled

     [ https://issues.apache.org/jira/browse/DRILL-7154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Hou updated DRILL-7154:
------------------------------
    Summary: TPCH query 4, 17 and 18 take longer with sf 1000 when Statistics are disabled  (was: TPCH query 4 and 17 take longer with sf 1000 when Statistics are disabled)

> TPCH query 4, 17 and 18 take longer with sf 1000 when Statistics are disabled
> -----------------------------------------------------------------------------
>
>                 Key: DRILL-7154
>                 URL: https://issues.apache.org/jira/browse/DRILL-7154
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning &amp; Optimization
>    Affects Versions: 1.16.0
>            Reporter: Robert Hou
>            Assignee: Boaz Ben-Zvi
>            Priority: Critical
>             Fix For: 1.16.0
>
>         Attachments: 235a3ed4-e3d1-f3b7-39c5-fc947f56b6d5.sys.drill, 235a471b-aa97-bfb5-207d-3f25b4b5fbbb.sys.drill, hashagg.nostats.log, hashagg.stats.disabled.log
>
>
> Here is TPCH 04 with sf 1000:
> {noformat}
> select
>   o.o_orderpriority,
>   count(*) as order_count
> from
>   orders o
> where
>   o.o_orderdate >= date '1996-10-01'
>   and o.o_orderdate < date '1996-10-01' + interval '3' month
>   and 
>   exists (
>     select
>       *
>     from
>       lineitem l
>     where
>       l.l_orderkey = o.o_orderkey
>       and l.l_commitdate < l.l_receiptdate
>   )
> group by
>   o.o_orderpriority
> order by
>   o.o_orderpriority;
> {noformat}
> TPCH query 4 takes 30% longer.  The plan is the same.  But the Hash Agg operator in the new plan is taking longer.  One possible reason is that the Hash Agg operator in the new plan is not using as many buckets as the old plan did.  The memory usage of the Hash Agg operator in the new plan is using less memory compared to the old plan.
> Here is the old plan:
> {noformat}
> 00-00    Screen : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9163601940441746E10 rows, 9.07316867594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 5645
> 00-01      Project(o_orderpriority=[$0], order_count=[$1]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9163226940441746E10 rows, 9.07313117594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 5644
> 00-02        SingleMergeExchange(sort0=[0]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9159476940441746E10 rows, 9.07238117594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 5643
> 01-01          OrderedMuxExchange(sort0=[0]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9155726940441746E10 rows, 9.0643982838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5642
> 02-01            SelectionVectorRemover : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9151976940441746E10 rows, 9.0640232838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5641
> 02-02              Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9148226940441746E10 rows, 9.0636482838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5640
> 02-03                HashAgg(group=[{0}], order_count=[$SUM0($1)]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9144476940441746E10 rows, 9.030890595055101E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2571985057468002E10 memory}, id = 5639
> 02-04                  HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7, cumulative cost = {1.9106976940441746E10 rows, 8.955890595055101E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.1911985057468002E10 memory}, id = 5638
> 03-01                    HashAgg(group=[{0}], order_count=[COUNT()]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7, cumulative cost = {1.9069476940441746E10 rows, 8.895890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 2.1911985057468002E10 memory}, id = 5637
> 03-02                      Project(o_orderpriority=[$1]) : rowType = RecordType(ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {1.8694476940441746E10 rows, 8.145890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 5636
> 03-03                        Project(o_orderkey=[$1], o_orderpriority=[$2], l_orderkey=[$0]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority, ANY l_orderkey): rowcount = 3.75E8, cumulative cost = {1.8319476940441746E10 rows, 8.108390595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 5635
> 03-04                          HashJoin(condition=[=($1, $0)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_orderkey, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {1.7944476940441746E10 rows, 7.995890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 5634
> 03-05                            HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {4.125E9 rows, 2.2125E10 cpu, 4.5E9 io, 3.072E12 network, 0.0 memory}, id = 5633
> 05-01                              Project(o_orderkey=[$1], o_orderpriority=[$2]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.75E9 rows, 1.6125E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 5632
> 05-02                                SelectionVectorRemover : rowType = RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.375E9 rows, 1.5375E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 5631
> 05-03                                  Filter(condition=[AND(>=($0, 1996-10-01), <($0, 1997-01-01 00:00:00))]) : rowType = RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.0E9 rows, 1.5E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 5630
> 05-04                                    Scan(table=[[dfs, tpchpar1000_micro, orders]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/orders]], selectionRoot=maprfs:/tpchParquet10/SF1000/orders, numFiles=1, numRowGroups=750, usedMetadataFile=false, columns=[`o_orderdate`, `o_orderkey`, `o_orderpriority`]]]) : rowType = RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 1.5E9, cumulative cost = {1.5E9 rows, 4.5E9 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 5629
> 03-06                            HashAgg(group=[{0}]) : rowType = RecordType(ANY l_orderkey): rowcount = 4499992.28175, cumulative cost = {1.3439976948159996E10 rows, 5.477990604317001E10 cpu, 1.7999969127E10 io, 1.8431968386048E11 network, 8.711985057468E9 memory}, id = 5628
> 03-07                              HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative cost = {1.3394977025342497E10 rows, 5.441990666063001E10 cpu, 1.7999969127E10 io, 1.8431968386048E11 network, 7.919986415880001E9 memory}, id = 5627
> 04-01                                HashAgg(group=[{0}]) : rowType = RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative cost = {1.3349977102524998E10 rows, 5.369990789555001E10 cpu, 1.7999969127E10 io, 0.0 network, 7.919986415880001E9 memory}, id = 5626
> 04-02                                  Project(l_orderkey=[$0], i=[true]) : rowType = RecordType(ANY l_orderkey, BOOLEAN i): rowcount = 4.49999228175E8, cumulative cost = {1.2899977874349998E10 rows, 5.009991407015001E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 5625
> 04-03                                    SelectionVectorRemover : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount = 4.49999228175E8, cumulative cost = {1.2449978646175E10 rows, 4.784991792927501E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 5624
> 04-04                                      Filter(condition=[AND(=($0, $0), <($1, $2))]) : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount = 4.49999228175E8, cumulative cost = {1.1999979418E10 rows, 4.7399918701100006E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 5623
> 04-05                                        Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_orderkey`, `l_commitdate`, `l_receiptdate`]]]) : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount = 5.999989709E9, cumulative cost = {5.999989709E9 rows, 1.7999969127E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 5622
> {noformat}
> Here is the new plan:
> {noformat}
> 00-00    Screen : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9163601940441746E10 rows, 9.07316867594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 11739
> 00-01      Project(o_orderpriority=[$0], order_count=[$1]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9163226940441746E10 rows, 9.07313117594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 11738
> 00-02        SingleMergeExchange(sort0=[0]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9159476940441746E10 rows, 9.07238117594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 11737
> 01-01          OrderedMuxExchange(sort0=[0]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9155726940441746E10 rows, 9.0643982838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 11736
> 02-01            SelectionVectorRemover : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9151976940441746E10 rows, 9.0640232838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 11735
> 02-02              Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9148226940441746E10 rows, 9.0636482838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 11734
> 02-03                HashAgg(group=[{0}], order_count=[$SUM0($1)]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9144476940441746E10 rows, 9.030890595055101E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2571985057468002E10 memory}, id = 11733
> 02-04                  HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7, cumulative cost = {1.9106976940441746E10 rows, 8.955890595055101E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.1911985057468002E10 memory}, id = 11732
> 03-01                    HashAgg(group=[{0}], order_count=[COUNT()]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7, cumulative cost = {1.9069476940441746E10 rows, 8.895890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 2.1911985057468002E10 memory}, id = 11731
> 03-02                      Project(o_orderpriority=[$1]) : rowType = RecordType(ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {1.8694476940441746E10 rows, 8.145890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 11730
> 03-03                        Project(o_orderkey=[$1], o_orderpriority=[$2], l_orderkey=[$0]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority, ANY l_orderkey): rowcount = 3.75E8, cumulative cost = {1.8319476940441746E10 rows, 8.108390595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 11729
> 03-04                          HashJoin(condition=[=($1, $0)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_orderkey, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {1.7944476940441746E10 rows, 7.995890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 11728
> 03-05                            HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {4.125E9 rows, 2.2125E10 cpu, 4.5E9 io, 3.072E12 network, 0.0 memory}, id = 11727
> 05-01                              Project(o_orderkey=[$1], o_orderpriority=[$2]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.75E9 rows, 1.6125E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 11726
> 05-02                                SelectionVectorRemover : rowType = RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.375E9 rows, 1.5375E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 11725
> 05-03                                  Filter(condition=[AND(>=($0, 1996-10-01), <($0, 1997-01-01 00:00:00))]) : rowType = RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.0E9 rows, 1.5E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 11724
> 05-04                                    Scan(table=[[dfs, tpchpar1000_micro, orders]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/orders]], selectionRoot=maprfs:/tpchParquet10/SF1000/orders, numFiles=1, numRowGroups=750, usedMetadataFile=false, columns=[`o_orderdate`, `o_orderkey`, `o_orderpriority`]]]) : rowType = RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 1.5E9, cumulative cost = {1.5E9 rows, 4.5E9 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 11723
> 03-06                            HashAgg(group=[{0}]) : rowType = RecordType(ANY l_orderkey): rowcount = 4499992.28175, cumulative cost = {1.3439976948159996E10 rows, 5.477990604317001E10 cpu, 1.7999969127E10 io, 1.8431968386048E11 network, 8.711985057468E9 memory}, id = 11722
> 03-07                              HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative cost = {1.3394977025342497E10 rows, 5.441990666063001E10 cpu, 1.7999969127E10 io, 1.8431968386048E11 network, 7.919986415880001E9 memory}, id = 11721
> 04-01                                HashAgg(group=[{0}]) : rowType = RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative cost = {1.3349977102524998E10 rows, 5.369990789555001E10 cpu, 1.7999969127E10 io, 0.0 network, 7.919986415880001E9 memory}, id = 11720
> 04-02                                  Project(l_orderkey=[$0], i=[true]) : rowType = RecordType(ANY l_orderkey, BOOLEAN i): rowcount = 4.49999228175E8, cumulative cost = {1.2899977874349998E10 rows, 5.009991407015001E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 11719
> 04-03                                    SelectionVectorRemover : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount = 4.49999228175E8, cumulative cost = {1.2449978646175E10 rows, 4.784991792927501E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 11718
> 04-04                                      Filter(condition=[AND(=($0, $0), <($1, $2))]) : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount = 4.49999228175E8, cumulative cost = {1.1999979418E10 rows, 4.7399918701100006E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 11717
> 04-05                                        Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_orderkey`, `l_commitdate`, `l_receiptdate`]]]) : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount = 5.999989709E9, cumulative cost = {5.999989709E9 rows, 1.7999969127E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 11716
> {noformat}
> I have attached two profiles. 235a471b-aa97-bfb5-207d-3f25b4b5fbbb is from commit id 4627973bde9847a4eb2672c44941136c167326a1. This does not have Statistics code and serves as the baseline. It is the commit prior to the Statistics commit. 235a3ed4-e3d1-f3b7-39c5-fc947f56b6d5 is from commit id 5c436dbb028b813e80b7b8fcf045af31f0bcf68b. This has the Statistics code with the fix for disabled Statistics.
> I also pulled the logs from the foreman to show the memory limit for Hash Agg.  With the baseline, HashAgg has a limit of 10GB:
> {noformat}
> 2019-04-04 00:57:27,446 [235a471b-aa97-bfb5-207d-3f25b4b5fbbb:frag:4:142] TRACE o.a.d.e.p.i.aggregate.HashAggregator - 1st phase. Estimated internal row width: 8 Values row width: 0 batch size: 720896  memory limit: 10000000000  max column width: 8
> {noformat}
> With statistics disabled, HashAgg has a limit of 64MB:
> {noformat}
> 2019-04-04 01:32:48,132 [235a3ed4-e3d1-f3b7-39c5-fc947f56b6d5:frag:4:182] TRACE o.a.d.e.p.i.aggregate.HashAggregator - 1st phase. Estimated internal row width: 8 Values row width: 0 batch size: 720896  memory limit: 65075262  max column width: 8
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)