Posted to issues@drill.apache.org by "Gautam Parai (JIRA)" <ji...@apache.org> on 2019/04/02 01:55:00 UTC

[jira] [Created] (DRILL-7148) TPCH query 17 increases execution time with Statistics enabled because join order is changed

Gautam Parai created DRILL-7148:
-----------------------------------

             Summary: TPCH query 17 increases execution time with Statistics enabled because join order is changed
                 Key: DRILL-7148
                 URL: https://issues.apache.org/jira/browse/DRILL-7148
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.16.0
            Reporter: Gautam Parai
            Assignee: Gautam Parai
             Fix For: 1.16.0


With Statistics enabled, TPCH query 17 at SF 1000 runs 45% slower. One issue is that the join order has changed: the build side and the probe side of the join in Major Fragment 01 are flipped.

Here is the query:
select
 sum(l.l_extendedprice) / 7.0 as avg_yearly
from
 lineitem l,
 part p
where
 p.p_partkey = l.l_partkey
 and p.p_brand = 'Brand#13'
 and p.p_container = 'JUMBO CAN'
 and l.l_quantity < (
  select
   0.2 * avg(l2.l_quantity)
  from
   lineitem l2
  where
   l2.l_partkey = p.p_partkey
 );

Here is the original plan:
{noformat}
00-00 Screen : rowType = RecordType(ANY avg_yearly): rowcount = 1.0, cumulative cost = \{7.853786601428E10 rows, 6.6179786770537E11 cpu, 3.0599948545E10 io, 1.083019457355776E14 network, 1.17294998955024E11 memory}, id = 489493
00-01 Project(avg_yearly=[/($0, 7.0)]) : rowType = RecordType(ANY avg_yearly): rowcount = 1.0, cumulative cost = \{7.853786601418E10 rows, 6.6179786770527E11 cpu, 3.0599948545E10 io, 1.083019457355776E14 network, 1.17294998955024E11 memory}, id = 489492
00-02 StreamAgg(group=[{}], agg#0=[SUM($0)]) : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{7.853786601318E10 rows, 6.6179786770127E11 cpu, 3.0599948545E10 io, 1.083019457355776E14 network, 1.17294998955024E11 memory}, id = 489491
00-03 UnionExchange : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{7.853786601218E10 rows, 6.6179786768927E11 cpu, 3.0599948545E10 io, 1.083019457355776E14 network, 1.17294998955024E11 memory}, id = 489490
01-01 StreamAgg(group=[{}], agg#0=[SUM($0)]) : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{7.853786601118E10 rows, 6.6179786768127E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489489
01-02 Project(l_extendedprice=[$1]) : rowType = RecordType(ANY l_extendedprice): rowcount = 2.9999948545E9, cumulative cost = \{7.553787115668E10 rows, 6.2579792942727E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489488
01-03 SelectionVectorRemover : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 2.9999948545E9, cumulative cost = \{7.253787630218E10 rows, 6.2279793457277E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489487
01-04 Filter(condition=[<($0, *(0.2, $4))]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 2.9999948545E9, cumulative cost = \{6.953788144768E10 rows, 6.1979793971827E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489486
01-05 HashJoin(condition=[=($2, $3)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 5.999989709E9, cumulative cost = \{6.353789173867999E10 rows, 5.8379800146427E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489485
01-07 Project(l_quantity=[$0], l_extendedprice=[$1], p_partkey=[$2]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5.999989709E9, cumulative cost = \{4.2417927963E10 rows, 2.71618536905E11 cpu, 1.8599969127E10 io, 9.8471562592256E13 network, 7.92E7 memory}, id = 489476
01-09 HashToRandomExchange(dist0=[[$2]]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E9, cumulative cost = \{3.6417938254E10 rows, 2.53618567778E11 cpu, 1.8599969127E10 io, 9.8471562592256E13 network, 7.92E7 memory}, id = 489475
02-01 UnorderedMuxExchange : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E9, cumulative cost = \{3.0417948545E10 rows, 1.57618732434E11 cpu, 1.8599969127E10 io, 1.677312E11 network, 7.92E7 memory}, id = 489474
04-01 Project(l_quantity=[$0], l_extendedprice=[$1], p_partkey=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2, 1301011)]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E9, cumulative cost = \{2.4417958836E10 rows, 1.51618742725E11 cpu, 1.8599969127E10 io, 1.677312E11 network, 7.92E7 memory}, id = 489473
04-02 Project(l_quantity=[$1], l_extendedprice=[$2], p_partkey=[$3]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5.999989709E9, cumulative cost = \{1.8417969127E10 rows, 1.09618814762E11 cpu, 1.8599969127E10 io, 1.677312E11 network, 7.92E7 memory}, id = 489472
04-03 HashJoin(condition=[=($0, $3)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_partkey, ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5.999989709E9, cumulative cost = \{1.2417979418E10 rows, 9.1618845635E10 cpu, 1.8599969127E10 io, 1.677312E11 network, 7.92E7 memory}, id = 489471
04-05 Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_partkey`, `l_quantity`, `l_extendedprice`]]]) : rowType = RecordType(ANY l_partkey, ANY l_quantity, ANY l_extendedprice): rowcount = 5.999989709E9, cumulative cost = \{5.999989709E9 rows, 1.7999969127E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 489465
04-04 BroadcastExchange : rowType = RecordType(ANY p_partkey): rowcount = 4500000.0, cumulative cost = \{4.135E8 rows, 1.583E9 cpu, 6.0E8 io, 1.677312E11 network, 0.0 memory}, id = 489470
06-01 Project(p_partkey=[$0]) : rowType = RecordType(ANY p_partkey): rowcount = 4500000.0, cumulative cost = \{4.09E8 rows, 1.547E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 489469
06-02 SelectionVectorRemover : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 4500000.0, cumulative cost = \{4.045E8 rows, 1.5425E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 489468
06-03 Filter(condition=[AND(=($1, 'Brand#13'), =($2, 'JUMBO CAN'))]) : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 4500000.0, cumulative cost = \{4.0E8 rows, 1.538E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 489467
06-04 Scan(table=[[dfs, tpchpar1000_micro, part]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/part]], selectionRoot=maprfs:/tpchParquet10/SF1000/part, numFiles=1, numRowGroups=90, usedMetadataFile=false, columns=[`p_partkey`, `p_brand`, `p_container`]]]) : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 2.0E8, cumulative cost = \{2.0E8 rows, 6.0E8 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 489466
01-06 Project(l_partkey=[$0], $f1=[divide(CastHigh(CASE(=($2, 0), null, $1)), $2)]) : rowType = RecordType(ANY l_partkey, ANY $f1): rowcount = 5.9999897089999996E7, cumulative cost = \{1.5059974169589998E10 rows, 2.3969958887455E11 cpu, 1.1999979418E10 io, 9.8303831392256E12 network, 1.1615980076624E11 memory}, id = 489484
01-08 HashAgg(group=[\{0}], agg#0=[$SUM0($1)], agg#1=[$SUM0($2)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 5.9999897089999996E7, cumulative cost = \{1.4999974272499998E10 rows, 2.3939958938909998E11 cpu, 1.1999979418E10 io, 9.8303831392256E12 network, 1.1615980076624E11 memory}, id = 489483
01-10 Project(l_partkey=[$0], $f1=[$1], $f2=[$2]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 5.999989709E8, cumulative cost = \{1.4399975301599998E10 rows, 2.201996223203E11 cpu, 1.1999979418E10 io, 9.8303831392256E12 network, 1.0559981887840001E11 memory}, id = 489482
01-11 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E8, cumulative cost = \{1.3799976330699999E10 rows, 2.1839962540759998E11 cpu, 1.1999979418E10 io, 9.8303831392256E12 network, 1.0559981887840001E11 memory}, id = 489481
03-01 UnorderedMuxExchange : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E8, cumulative cost = \{1.31999773598E10 rows, 2.0879964187319998E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 489480
05-01 Project(l_partkey=[$0], $f1=[$1], $f2=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0, 1301011)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E8, cumulative cost = \{1.25999783889E10 rows, 2.081996429023E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 489479
05-02 HashAgg(group=[\{0}], agg#0=[$SUM0($1)], agg#1=[COUNT($1)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 5.999989709E8, cumulative cost = \{1.1999979418E10 rows, 2.03999650106E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 489478
05-03 Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_partkey`, `l_quantity`]]]) : rowType = RecordType(ANY l_partkey, ANY l_quantity): rowcount = 5.999989709E9, cumulative cost = \{5.999989709E9 rows, 1.1999979418E10 cpu, 1.1999979418E10 io, 0.0 network, 0.0 memory}, id = 489477
{noformat}
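
For anyone who wants to compare the estimates in a plan like the one above without reading every line, here is a minimal Python sketch that pulls the operator id, operator name, estimated row count and cumulative cost out of a single plan line. The function names and the regular expressions are illustrative assumptions, not part of Drill; the sample line is the 04-04 BroadcastExchange line copied from the original plan.

```python
import re

# Operator id (e.g. "04-04"), operator name, and the estimated row count.
PLAN_LINE = re.compile(r"^(\d{2}-\d{2})\s+(\w+).*?rowcount = ([0-9.Ee+-]+),")
# Cumulative cost block; the optional backslash covers the "\{" escaping seen in this mail.
COST = re.compile(r"cumulative cost = \\?\{([^}]*)\}")


def parse_cost(line):
    """Return the cumulative cost as {unit: value}, or {} if no cost block is found."""
    m = COST.search(line)
    if m is None:
        return {}
    pairs = (part.split() for part in m.group(1).split(", "))
    return {unit: float(value) for value, unit in pairs}


def parse_plan_line(line):
    """Return a dict describing one plan line, or None if the line is not an operator."""
    m = PLAN_LINE.match(line.strip())
    if m is None:
        return None
    return {
        "id": m.group(1),
        "name": m.group(2),
        "rowcount": float(m.group(3)),
        "cost": parse_cost(line),
    }


# Line 04-04 of the original plan, copied verbatim (raw string keeps the backslash).
SAMPLE = (
    r"04-04 BroadcastExchange : rowType = RecordType(ANY p_partkey): "
    r"rowcount = 4500000.0, cumulative cost = \{4.135E8 rows, 1.583E9 cpu, "
    r"6.0E8 io, 1.677312E11 network, 0.0 memory}, id = 489470"
)

if __name__ == "__main__":
    print(parse_plan_line(SAMPLE))
```

Running the same function over every line of both plans gives two tables of estimates that can be compared operator by operator.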

Here is the new plan:
{noformat}
00-00 Screen : rowType = RecordType(ANY avg_yearly): rowcount = 1.0, cumulative cost = \{2.589042618686726E10 rows, 3.1133328060351746E11 cpu, 3.0599948545E10 io, 3.4598519346958716E12 network, 1.0931196869447409E11 memory}, id = 62719
00-01 Project(avg_yearly=[/($0, 7.0)]) : rowType = RecordType(ANY avg_yearly): rowcount = 1.0, cumulative cost = \{2.589042618676726E10 rows, 3.113332806034175E11 cpu, 3.0599948545E10 io, 3.4598519346958716E12 network, 1.0931196869447409E11 memory}, id = 62718
00-02 StreamAgg(group=[{}], agg#0=[SUM($0)]) : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{2.589042618576726E10 rows, 3.113332805994175E11 cpu, 3.0599948545E10 io, 3.4598519346958716E12 network, 1.0931196869447409E11 memory}, id = 62717
00-03 UnionExchange : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{2.589042618476726E10 rows, 3.113332805874175E11 cpu, 3.0599948545E10 io, 3.4598519346958716E12 network, 1.0931196869447409E11 memory}, id = 62716
01-01 StreamAgg(group=[{}], agg#0=[SUM($0)]) : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{2.589042618376726E10 rows, 3.113332805794175E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62715
01-02 Project(l_extendedprice=[$1]) : rowType = RecordType(ANY l_extendedprice): rowcount = 2928825.0930136647, cumulative cost = \{2.5887497358674248E10 rows, 3.1129813467830133E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62714
01-03 SelectionVectorRemover : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 2928825.0930136647, cumulative cost = \{2.5884568533581234E10 rows, 3.112952058532083E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62713
01-04 Filter(condition=[<($0, *(0.2, $4))]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 2928825.0930136647, cumulative cost = \{2.588163970848822E10 rows, 3.112922770281153E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62712
01-05 Project(l_quantity=[$2], l_extendedprice=[$3], p_partkey=[$4], l_partkey=[$0], $f1=[$1]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 5857650.186027329, cumulative cost = \{2.5875782058302193E10 rows, 3.1125713112699915E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62711
01-06 HashJoin(condition=[=($4, $0)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_partkey, ANY $f1, ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5857650.186027329, cumulative cost = \{2.5869924408116165E10 rows, 3.1122784287606903E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62710
01-08 Project(l_partkey=[$0], $f1=[divide(CastHigh(CASE(=($2, 0), null, $1)), $2)]) : rowType = RecordType(ANY l_partkey, ANY $f1): rowcount = 2.04859953E8, cumulative cost = \{1.3229139136E10 rows, 2.17110687098E11 cpu, 1.1999979418E10 io, 3.356425469952E12 network, 1.0920535405120001E11 memory}, id = 62697
01-10 HashAgg(group=[\{0}], agg#0=[$SUM0($1)], agg#1=[$SUM0($2)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 2.04859953E8, cumulative cost = \{1.3024279183E10 rows, 2.16086387333E11 cpu, 1.1999979418E10 io, 3.356425469952E12 network, 1.0920535405120001E11 memory}, id = 62696
01-11 Project(l_partkey=[$0], $f1=[$1], $f2=[$2]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 2.04859953E8, cumulative cost = \{1.281941923E10 rows, 2.09530868837E11 cpu, 1.1999979418E10 io, 3.356425469952E12 network, 1.0559981887840001E11 memory}, id = 62695
01-12 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.04859953E8, cumulative cost = \{1.2614559277E10 rows, 2.08916288978E11 cpu, 1.1999979418E10 io, 3.356425469952E12 network, 1.0559981887840001E11 memory}, id = 62694
02-01 UnorderedMuxExchange : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.04859953E8, cumulative cost = \{1.2409699324E10 rows, 2.0563852973E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 62693
04-01 Project(l_partkey=[$0], $f1=[$1], $f2=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0, 1301011)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.04859953E8, cumulative cost = \{1.2204839371E10 rows, 2.05433669777E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 62692
04-02 HashAgg(group=[\{0}], agg#0=[$SUM0($1)], agg#1=[COUNT($1)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 2.04859953E8, cumulative cost = \{1.1999979418E10 rows, 2.03999650106E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 62691
04-03 Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_partkey`, `l_quantity`]]]) : rowType = RecordType(ANY l_partkey, ANY l_quantity): rowcount = 5.999989709E9, cumulative cost = \{5.999989709E9 rows, 1.1999979418E10 cpu, 1.1999979418E10 io, 0.0 network, 0.0 memory}, id = 62690
01-07 Project(l_quantity=[$0], l_extendedprice=[$1], p_partkey=[$2]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5857650.186027329, cumulative cost = \{1.2430067668930138E10 rows, 9.16119751405808E10 cpu, 1.8599969127E10 io, 1.0342646064787177E11 network, 3520000.0000000005 memory}, id = 62709
01-09 HashToRandomExchange(dist0=[[$2]]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5857650.186027329, cumulative cost = \{1.242421001874411E10 rows, 9.159440219002272E10 cpu, 1.8599969127E10 io, 1.0342646064787177E11 network, 3520000.0000000005 memory}, id = 62708
03-01 UnorderedMuxExchange : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5857650.186027329, cumulative cost = \{1.2418352368558083E10 rows, 9.150067978704628E10 cpu, 1.8599969127E10 io, 7.45472E9 network, 3520000.0000000005 memory}, id = 62707
05-01 Project(l_quantity=[$0], l_extendedprice=[$1], p_partkey=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2, 1301011)]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5857650.186027329, cumulative cost = \{1.2412494718372055E10 rows, 9.149482213686026E10 cpu, 1.8599969127E10 io, 7.45472E9 network, 3520000.0000000005 memory}, id = 62706
05-02 Project(l_quantity=[$1], l_extendedprice=[$2], p_partkey=[$3]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5857650.186027329, cumulative cost = \{1.2406637068186028E10 rows, 9.145381858555807E10 cpu, 1.8599969127E10 io, 7.45472E9 network, 3520000.0000000005 memory}, id = 62705
05-03 HashJoin(condition=[=($0, $3)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_partkey, ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5857650.186027329, cumulative cost = \{1.2400779418E10 rows, 9.1436245635E10 cpu, 1.8599969127E10 io, 7.45472E9 network, 3520000.0000000005 memory}, id = 62704
05-05 Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_partkey`, `l_quantity`, `l_extendedprice`]]]) : rowType = RecordType(ANY l_partkey, ANY l_quantity, ANY l_extendedprice): rowcount = 5.999989709E9, cumulative cost = \{5.999989709E9 rows, 1.7999969127E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 62698
05-04 BroadcastExchange : rowType = RecordType(ANY p_partkey): rowcount = 200000.0, cumulative cost = \{4.006E8 rows, 1.4348E9 cpu, 6.0E8 io, 7.45472E9 network, 0.0 memory}, id = 62703
06-01 Project(p_partkey=[$0]) : rowType = RecordType(ANY p_partkey): rowcount = 200000.0, cumulative cost = \{4.004E8 rows, 1.4332E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 62702
06-02 SelectionVectorRemover : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 200000.0, cumulative cost = \{4.002E8 rows, 1.433E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 62701
06-03 Filter(condition=[AND(=($1, 'Brand#13'), =($2, 'JUMBO CAN'))]) : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 200000.0, cumulative cost = \{4.0E8 rows, 1.4328E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 62700
06-04 Scan(table=[[dfs, tpchpar1000_micro, part]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/part]], selectionRoot=maprfs:/tpchParquet10/SF1000/part, numFiles=1, numRowGroups=90, usedMetadataFile=false, columns=[`p_partkey`, `p_brand`, `p_container`]]]) : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 2.0E8, cumulative cost = \{2.0E8 rows, 6.0E8 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 62699
{noformat}
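
The estimates that differ between the two plans can be put side by side with a short Python sketch. The row counts below are copied from the plans above; the dictionary keys and helper names are made up for illustration. The sketch assumes that the second-listed input of a HashJoin in the plan text is the build side, and that the planner tends to put the input with the smaller estimated row count there. Neither assumption is stated in this report.

```python
# Estimated row counts copied from the two plans (operator ids in the comments).
ORIGINAL = {
    "part_scan": 2.0e8,                     # 06-04 Scan of part
    "part_filter": 4500000.0,               # 06-03 Filter on p_brand and p_container
    "lineitem_part_join": 5.999989709e9,    # 04-03 HashJoin (lineitem with filtered part)
    "avg_subquery": 5.9999897089999996e7,   # 01-06 Project over the final HashAgg
}
NEW = {
    "part_scan": 2.0e8,                     # 06-04 Scan of part
    "part_filter": 200000.0,                # 06-03 Filter on p_brand and p_container
    "lineitem_part_join": 5857650.186027329,  # 05-03 HashJoin (lineitem with filtered part)
    "avg_subquery": 2.04859953e8,           # 01-08 Project over the final HashAgg
}


def filter_selectivity(estimates):
    """Fraction of part rows the planner expects to survive the filter."""
    return estimates["part_filter"] / estimates["part_scan"]


def smaller_join_input(estimates):
    """Name of the top-level join input with the smaller estimated row count."""
    candidates = ("lineitem_part_join", "avg_subquery")
    return min(candidates, key=lambda name: estimates[name])


if __name__ == "__main__":
    for label, est in (("original", ORIGINAL), ("new", NEW)):
        print(label, filter_selectivity(est), smaller_join_input(est))
```

Under these assumptions the smaller input is the aggregated subquery in the original plan and the lineitem/part join in the new plan. That matches the order in which the inputs of the top-level HashJoin are listed (01-07 then 01-06 originally, 01-08 then 01-07 in the new plan).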

I have attached two profiles. 2384d66b-1b93-6fe1-8abe-34cc74994138 is from commit id 4627973bde9847a4eb2672c44941136c167326a1, the commit immediately before the Statistics commit. It does not contain the Statistics code and serves as the baseline. 23650ee5-6721-8a8f-7dd3-f5dd09a3a7b0 is from commit id 212e5c0d9656cd572426aa514bf37e0bd002bdd6, which contains the Statistics code and the fix for DRILL-7109.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)