Posted to issues@drill.apache.org by "Gautam Parai (JIRA)" <ji...@apache.org> on 2019/04/07 18:54:00 UTC

[jira] [Updated] (DRILL-7148) TPCH query 17 increases execution time with Statistics enabled because join order is changed

     [ https://issues.apache.org/jira/browse/DRILL-7148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gautam Parai updated DRILL-7148:
--------------------------------
    Fix Version/s:     (was: 1.16.0)
                   1.17.0

> TPCH query 17 increases execution time with Statistics enabled because join order is changed
> --------------------------------------------------------------------------------------------
>
>                 Key: DRILL-7148
>                 URL: https://issues.apache.org/jira/browse/DRILL-7148
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.16.0
>            Reporter: Gautam Parai
>            Assignee: Gautam Parai
>            Priority: Major
>             Fix For: 1.17.0
>
>
> TPCH query 17 at scale factor 1000 runs 45% slower with Statistics enabled. One contributing issue is that the join order has changed: the build side and the probe side of the hash join in Major Fragment 01 are flipped.
> Here is the query:
> select
>   sum(l.l_extendedprice) / 7.0 as avg_yearly
> from
>   lineitem l,
>   part p
> where
>   p.p_partkey = l.l_partkey
>   and p.p_brand = 'Brand#13'
>   and p.p_container = 'JUMBO CAN'
>   and l.l_quantity < (
>     select
>       0.2 * avg(l2.l_quantity)
>     from
>       lineitem l2
>     where
>       l2.l_partkey = p.p_partkey
>   );
> Here is the original plan:
> {noformat}
> 00-00 Screen : rowType = RecordType(ANY avg_yearly): rowcount = 1.0, cumulative cost = \{7.853786601428E10 rows, 6.6179786770537E11 cpu, 3.0599948545E10 io, 1.083019457355776E14 network, 1.17294998955024E11 memory}, id = 489493
> 00-01 Project(avg_yearly=[/($0, 7.0)]) : rowType = RecordType(ANY avg_yearly): rowcount = 1.0, cumulative cost = \{7.853786601418E10 rows, 6.6179786770527E11 cpu, 3.0599948545E10 io, 1.083019457355776E14 network, 1.17294998955024E11 memory}, id = 489492
> 00-02 StreamAgg(group=[{}], agg#0=[SUM($0)]) : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{7.853786601318E10 rows, 6.6179786770127E11 cpu, 3.0599948545E10 io, 1.083019457355776E14 network, 1.17294998955024E11 memory}, id = 489491
> 00-03 UnionExchange : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{7.853786601218E10 rows, 6.6179786768927E11 cpu, 3.0599948545E10 io, 1.083019457355776E14 network, 1.17294998955024E11 memory}, id = 489490
> 01-01 StreamAgg(group=[{}], agg#0=[SUM($0)]) : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{7.853786601118E10 rows, 6.6179786768127E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489489
> 01-02 Project(l_extendedprice=[$1]) : rowType = RecordType(ANY l_extendedprice): rowcount = 2.9999948545E9, cumulative cost = \{7.553787115668E10 rows, 6.2579792942727E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489488
> 01-03 SelectionVectorRemover : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 2.9999948545E9, cumulative cost = \{7.253787630218E10 rows, 6.2279793457277E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489487
> 01-04 Filter(condition=[<($0, *(0.2, $4))]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 2.9999948545E9, cumulative cost = \{6.953788144768E10 rows, 6.1979793971827E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489486
> 01-05 HashJoin(condition=[=($2, $3)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 5.999989709E9, cumulative cost = \{6.353789173867999E10 rows, 5.8379800146427E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489485
> 01-07 Project(l_quantity=[$0], l_extendedprice=[$1], p_partkey=[$2]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5.999989709E9, cumulative cost = \{4.2417927963E10 rows, 2.71618536905E11 cpu, 1.8599969127E10 io, 9.8471562592256E13 network, 7.92E7 memory}, id = 489476
> 01-09 HashToRandomExchange(dist0=[[$2]]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E9, cumulative cost = \{3.6417938254E10 rows, 2.53618567778E11 cpu, 1.8599969127E10 io, 9.8471562592256E13 network, 7.92E7 memory}, id = 489475
> 02-01 UnorderedMuxExchange : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E9, cumulative cost = \{3.0417948545E10 rows, 1.57618732434E11 cpu, 1.8599969127E10 io, 1.677312E11 network, 7.92E7 memory}, id = 489474
> 04-01 Project(l_quantity=[$0], l_extendedprice=[$1], p_partkey=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2, 1301011)]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E9, cumulative cost = \{2.4417958836E10 rows, 1.51618742725E11 cpu, 1.8599969127E10 io, 1.677312E11 network, 7.92E7 memory}, id = 489473
> 04-02 Project(l_quantity=[$1], l_extendedprice=[$2], p_partkey=[$3]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5.999989709E9, cumulative cost = \{1.8417969127E10 rows, 1.09618814762E11 cpu, 1.8599969127E10 io, 1.677312E11 network, 7.92E7 memory}, id = 489472
> 04-03 HashJoin(condition=[=($0, $3)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_partkey, ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5.999989709E9, cumulative cost = \{1.2417979418E10 rows, 9.1618845635E10 cpu, 1.8599969127E10 io, 1.677312E11 network, 7.92E7 memory}, id = 489471
> 04-05 Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_partkey`, `l_quantity`, `l_extendedprice`]]]) : rowType = RecordType(ANY l_partkey, ANY l_quantity, ANY l_extendedprice): rowcount = 5.999989709E9, cumulative cost = \{5.999989709E9 rows, 1.7999969127E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 489465
> 04-04 BroadcastExchange : rowType = RecordType(ANY p_partkey): rowcount = 4500000.0, cumulative cost = \{4.135E8 rows, 1.583E9 cpu, 6.0E8 io, 1.677312E11 network, 0.0 memory}, id = 489470
> 06-01 Project(p_partkey=[$0]) : rowType = RecordType(ANY p_partkey): rowcount = 4500000.0, cumulative cost = \{4.09E8 rows, 1.547E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 489469
> 06-02 SelectionVectorRemover : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 4500000.0, cumulative cost = \{4.045E8 rows, 1.5425E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 489468
> 06-03 Filter(condition=[AND(=($1, 'Brand#13'), =($2, 'JUMBO CAN'))]) : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 4500000.0, cumulative cost = \{4.0E8 rows, 1.538E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 489467
> 06-04 Scan(table=[[dfs, tpchpar1000_micro, part]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/part]], selectionRoot=maprfs:/tpchParquet10/SF1000/part, numFiles=1, numRowGroups=90, usedMetadataFile=false, columns=[`p_partkey`, `p_brand`, `p_container`]]]) : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 2.0E8, cumulative cost = \{2.0E8 rows, 6.0E8 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 489466
> 01-06 Project(l_partkey=[$0], $f1=[divide(CastHigh(CASE(=($2, 0), null, $1)), $2)]) : rowType = RecordType(ANY l_partkey, ANY $f1): rowcount = 5.9999897089999996E7, cumulative cost = \{1.5059974169589998E10 rows, 2.3969958887455E11 cpu, 1.1999979418E10 io, 9.8303831392256E12 network, 1.1615980076624E11 memory}, id = 489484
> 01-08 HashAgg(group=[\{0}], agg#0=[$SUM0($1)], agg#1=[$SUM0($2)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 5.9999897089999996E7, cumulative cost = \{1.4999974272499998E10 rows, 2.3939958938909998E11 cpu, 1.1999979418E10 io, 9.8303831392256E12 network, 1.1615980076624E11 memory}, id = 489483
> 01-10 Project(l_partkey=[$0], $f1=[$1], $f2=[$2]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 5.999989709E8, cumulative cost = \{1.4399975301599998E10 rows, 2.201996223203E11 cpu, 1.1999979418E10 io, 9.8303831392256E12 network, 1.0559981887840001E11 memory}, id = 489482
> 01-11 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E8, cumulative cost = \{1.3799976330699999E10 rows, 2.1839962540759998E11 cpu, 1.1999979418E10 io, 9.8303831392256E12 network, 1.0559981887840001E11 memory}, id = 489481
> 03-01 UnorderedMuxExchange : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E8, cumulative cost = \{1.31999773598E10 rows, 2.0879964187319998E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 489480
> 05-01 Project(l_partkey=[$0], $f1=[$1], $f2=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0, 1301011)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E8, cumulative cost = \{1.25999783889E10 rows, 2.081996429023E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 489479
> 05-02 HashAgg(group=[\{0}], agg#0=[$SUM0($1)], agg#1=[COUNT($1)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 5.999989709E8, cumulative cost = \{1.1999979418E10 rows, 2.03999650106E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 489478
> 05-03 Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_partkey`, `l_quantity`]]]) : rowType = RecordType(ANY l_partkey, ANY l_quantity): rowcount = 5.999989709E9, cumulative cost = \{5.999989709E9 rows, 1.1999979418E10 cpu, 1.1999979418E10 io, 0.0 network, 0.0 memory}, id = 489477
> {noformat}
> Here is the new plan:
> {noformat}
> 00-00 Screen : rowType = RecordType(ANY avg_yearly): rowcount = 1.0, cumulative cost = \{2.589042618686726E10 rows, 3.1133328060351746E11 cpu, 3.0599948545E10 io, 3.4598519346958716E12 network, 1.0931196869447409E11 memory}, id = 62719
> 00-01 Project(avg_yearly=[/($0, 7.0)]) : rowType = RecordType(ANY avg_yearly): rowcount = 1.0, cumulative cost = \{2.589042618676726E10 rows, 3.113332806034175E11 cpu, 3.0599948545E10 io, 3.4598519346958716E12 network, 1.0931196869447409E11 memory}, id = 62718
> 00-02 StreamAgg(group=[{}], agg#0=[SUM($0)]) : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{2.589042618576726E10 rows, 3.113332805994175E11 cpu, 3.0599948545E10 io, 3.4598519346958716E12 network, 1.0931196869447409E11 memory}, id = 62717
> 00-03 UnionExchange : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{2.589042618476726E10 rows, 3.113332805874175E11 cpu, 3.0599948545E10 io, 3.4598519346958716E12 network, 1.0931196869447409E11 memory}, id = 62716
> 01-01 StreamAgg(group=[{}], agg#0=[SUM($0)]) : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{2.589042618376726E10 rows, 3.113332805794175E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62715
> 01-02 Project(l_extendedprice=[$1]) : rowType = RecordType(ANY l_extendedprice): rowcount = 2928825.0930136647, cumulative cost = \{2.5887497358674248E10 rows, 3.1129813467830133E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62714
> 01-03 SelectionVectorRemover : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 2928825.0930136647, cumulative cost = \{2.5884568533581234E10 rows, 3.112952058532083E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62713
> 01-04 Filter(condition=[<($0, *(0.2, $4))]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 2928825.0930136647, cumulative cost = \{2.588163970848822E10 rows, 3.112922770281153E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62712
> 01-05 Project(l_quantity=[$2], l_extendedprice=[$3], p_partkey=[$4], l_partkey=[$0], $f1=[$1]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 5857650.186027329, cumulative cost = \{2.5875782058302193E10 rows, 3.1125713112699915E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62711
> 01-06 HashJoin(condition=[=($4, $0)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_partkey, ANY $f1, ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5857650.186027329, cumulative cost = \{2.5869924408116165E10 rows, 3.1122784287606903E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62710
> 01-08 Project(l_partkey=[$0], $f1=[divide(CastHigh(CASE(=($2, 0), null, $1)), $2)]) : rowType = RecordType(ANY l_partkey, ANY $f1): rowcount = 2.04859953E8, cumulative cost = \{1.3229139136E10 rows, 2.17110687098E11 cpu, 1.1999979418E10 io, 3.356425469952E12 network, 1.0920535405120001E11 memory}, id = 62697
> 01-10 HashAgg(group=[\{0}], agg#0=[$SUM0($1)], agg#1=[$SUM0($2)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 2.04859953E8, cumulative cost = \{1.3024279183E10 rows, 2.16086387333E11 cpu, 1.1999979418E10 io, 3.356425469952E12 network, 1.0920535405120001E11 memory}, id = 62696
> 01-11 Project(l_partkey=[$0], $f1=[$1], $f2=[$2]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 2.04859953E8, cumulative cost = \{1.281941923E10 rows, 2.09530868837E11 cpu, 1.1999979418E10 io, 3.356425469952E12 network, 1.0559981887840001E11 memory}, id = 62695
> 01-12 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.04859953E8, cumulative cost = \{1.2614559277E10 rows, 2.08916288978E11 cpu, 1.1999979418E10 io, 3.356425469952E12 network, 1.0559981887840001E11 memory}, id = 62694
> 02-01 UnorderedMuxExchange : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.04859953E8, cumulative cost = \{1.2409699324E10 rows, 2.0563852973E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 62693
> 04-01 Project(l_partkey=[$0], $f1=[$1], $f2=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0, 1301011)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.04859953E8, cumulative cost = \{1.2204839371E10 rows, 2.05433669777E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 62692
> 04-02 HashAgg(group=[\{0}], agg#0=[$SUM0($1)], agg#1=[COUNT($1)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 2.04859953E8, cumulative cost = \{1.1999979418E10 rows, 2.03999650106E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 62691
> 04-03 Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_partkey`, `l_quantity`]]]) : rowType = RecordType(ANY l_partkey, ANY l_quantity): rowcount = 5.999989709E9, cumulative cost = \{5.999989709E9 rows, 1.1999979418E10 cpu, 1.1999979418E10 io, 0.0 network, 0.0 memory}, id = 62690
> 01-07 Project(l_quantity=[$0], l_extendedprice=[$1], p_partkey=[$2]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5857650.186027329, cumulative cost = \{1.2430067668930138E10 rows, 9.16119751405808E10 cpu, 1.8599969127E10 io, 1.0342646064787177E11 network, 3520000.0000000005 memory}, id = 62709
> 01-09 HashToRandomExchange(dist0=[[$2]]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5857650.186027329, cumulative cost = \{1.242421001874411E10 rows, 9.159440219002272E10 cpu, 1.8599969127E10 io, 1.0342646064787177E11 network, 3520000.0000000005 memory}, id = 62708
> 03-01 UnorderedMuxExchange : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5857650.186027329, cumulative cost = \{1.2418352368558083E10 rows, 9.150067978704628E10 cpu, 1.8599969127E10 io, 7.45472E9 network, 3520000.0000000005 memory}, id = 62707
> 05-01 Project(l_quantity=[$0], l_extendedprice=[$1], p_partkey=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2, 1301011)]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5857650.186027329, cumulative cost = \{1.2412494718372055E10 rows, 9.149482213686026E10 cpu, 1.8599969127E10 io, 7.45472E9 network, 3520000.0000000005 memory}, id = 62706
> 05-02 Project(l_quantity=[$1], l_extendedprice=[$2], p_partkey=[$3]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5857650.186027329, cumulative cost = \{1.2406637068186028E10 rows, 9.145381858555807E10 cpu, 1.8599969127E10 io, 7.45472E9 network, 3520000.0000000005 memory}, id = 62705
> 05-03 HashJoin(condition=[=($0, $3)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_partkey, ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5857650.186027329, cumulative cost = \{1.2400779418E10 rows, 9.1436245635E10 cpu, 1.8599969127E10 io, 7.45472E9 network, 3520000.0000000005 memory}, id = 62704
> 05-05 Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_partkey`, `l_quantity`, `l_extendedprice`]]]) : rowType = RecordType(ANY l_partkey, ANY l_quantity, ANY l_extendedprice): rowcount = 5.999989709E9, cumulative cost = \{5.999989709E9 rows, 1.7999969127E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 62698
> 05-04 BroadcastExchange : rowType = RecordType(ANY p_partkey): rowcount = 200000.0, cumulative cost = \{4.006E8 rows, 1.4348E9 cpu, 6.0E8 io, 7.45472E9 network, 0.0 memory}, id = 62703
> 06-01 Project(p_partkey=[$0]) : rowType = RecordType(ANY p_partkey): rowcount = 200000.0, cumulative cost = \{4.004E8 rows, 1.4332E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 62702
> 06-02 SelectionVectorRemover : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 200000.0, cumulative cost = \{4.002E8 rows, 1.433E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 62701
> 06-03 Filter(condition=[AND(=($1, 'Brand#13'), =($2, 'JUMBO CAN'))]) : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 200000.0, cumulative cost = \{4.0E8 rows, 1.4328E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 62700
> 06-04 Scan(table=[[dfs, tpchpar1000_micro, part]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/part]], selectionRoot=maprfs:/tpchParquet10/SF1000/part, numFiles=1, numRowGroups=90, usedMetadataFile=false, columns=[`p_partkey`, `p_brand`, `p_container`]]]) : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 2.0E8, cumulative cost = \{2.0E8 rows, 6.0E8 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 62699
> {noformat}
> I have attached two profiles. /2384d66b-1b93-6fe1-8abe-34cc74994138 is from commit id 4627973bde9847a4eb2672c44941136c167326a1, the commit immediately prior to the Statistics commit; it does not have the Statistics code and serves as the baseline. 23650ee5-6721-8a8f-7dd3-f5dd09a3a7b0 is from commit id 212e5c0d9656cd572426aa514bf37e0bd002bdd6, which has the Statistics code and the fix for DRILL-7109.
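
The two plans are long, so the flip is easier to see by extracting only the estimated row counts. The following is a minimal sketch, not part of Drill: the helper names `parse_rowcounts` and `smaller_input` are hypothetical, and the sample lines are abbreviated copies of plan lines from the description (row types and costs omitted). It assumes each plan line starts with an operator id such as `01-07` and contains `rowcount = <number>`.

```python
import re

# Matches one line of the text plan, e.g.
#   "> 06-03 Filter(...) : rowType = ...: rowcount = 200000.0, cumulative cost = ..."
# The leading "> " is the mail quoting prefix and is optional.
PLAN_LINE = re.compile(
    r"^(?:>\s*)?(\d{2}-\d{2})\s+(\w+).*?rowcount = ([0-9.eE+-]+)"
)


def parse_rowcounts(plan_text):
    """Return {operator_id: (operator_name, estimated_rowcount)}."""
    estimates = {}
    for line in plan_text.splitlines():
        match = PLAN_LINE.match(line.strip())
        if match:
            op_id, op_name, rowcount = match.groups()
            estimates[op_id] = (op_name, float(rowcount))
    return estimates


def smaller_input(estimates, first_id, second_id):
    """Return the id of the join input with the smaller estimated row count."""
    return min((first_id, second_id), key=lambda op_id: estimates[op_id][1])


# Abbreviated lines from the original plan (no Statistics code).
ORIGINAL_PLAN = """\
> 01-07 Project(l_quantity=[$0]) : rowcount = 5.999989709E9, id = 489476
> 04-04 BroadcastExchange : rowcount = 4500000.0, id = 489470
> 01-06 Project(l_partkey=[$0]) : rowcount = 5.9999897089999996E7, id = 489484
"""

# Abbreviated lines from the new plan (with Statistics code).
NEW_PLAN = """\
> 01-08 Project(l_partkey=[$0]) : rowcount = 2.04859953E8, id = 62697
> 01-07 Project(l_quantity=[$0]) : rowcount = 5857650.186027329, id = 62709
> 05-04 BroadcastExchange : rowcount = 200000.0, id = 62703
"""

original = parse_rowcounts(ORIGINAL_PLAN)
new = parse_rowcounts(NEW_PLAN)

# Inputs of the top-level HashJoin in each plan.
print("original, smaller input:", smaller_input(original, "01-07", "01-06"))
print("new, smaller input:", smaller_input(new, "01-08", "01-07"))

# Estimate for the filtered part table that is broadcast to the lower join.
print("part estimate ratio:", original["04-04"][1] / new["05-04"][1])
```

In the original plan the aggregated lineitem branch (01-06) carries the smaller estimate of the two top-level join inputs, while in the new plan the lineitem-part join branch (01-07) does. The row count estimate for the filtered part table also drops from 4500000 to 200000 between the two plans.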


