Posted to commits@hive.apache.org by ha...@apache.org on 2018/04/09 21:57:48 UTC
[10/13] hive git commit: HIVE-19128 : Update golden files for spark perf tests
http://git-wip-us.apache.org/repos/asf/hive/blob/328d3f93/ql/src/test/results/clientpositive/perf/spark/query40.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/perf/spark/query40.q.out b/ql/src/test/results/clientpositive/perf/spark/query40.q.out
index f286294..5360385 100644
--- a/ql/src/test/results/clientpositive/perf/spark/query40.q.out
+++ b/ql/src/test/results/clientpositive/perf/spark/query40.q.out
@@ -54,8 +54,7 @@ limit 100
POSTHOOK: type: QUERY
STAGE DEPENDENCIES:
Stage-2 is a root stage
- Stage-3 depends on stages: Stage-2
- Stage-1 depends on stages: Stage-3
+ Stage-1 depends on stages: Stage-2
Stage-0 depends on stages: Stage-1
STAGE PLANS:
@@ -63,44 +62,39 @@ STAGE PLANS:
Spark
#### A masked pattern was here ####
Vertices:
- Map 9
+ Map 8
Map Operator Tree:
TableScan
- alias: warehouse
- Statistics: Num rows: 27 Data size: 27802 Basic stats: COMPLETE Column stats: NONE
+ alias: date_dim
+ Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: w_warehouse_sk is not null (type: boolean)
- Statistics: Num rows: 27 Data size: 27802 Basic stats: COMPLETE Column stats: NONE
+ predicate: (CAST( d_date AS TIMESTAMP) BETWEEN TIMESTAMP'1998-03-08 23:00:00.0' AND TIMESTAMP'1998-05-08 00:00:00.0' and d_date_sk is not null) (type: boolean)
+ Statistics: Num rows: 8116 Data size: 9081804 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: w_warehouse_sk (type: int), w_state (type: string)
+ expressions: d_date_sk (type: int), d_date (type: string)
outputColumnNames: _col0, _col1
- Statistics: Num rows: 27 Data size: 27802 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 8116 Data size: 9081804 Basic stats: COMPLETE Column stats: NONE
Spark HashTable Sink Operator
keys:
- 0 _col1 (type: int)
+ 0 _col0 (type: int)
1 _col0 (type: int)
Local Work:
Map Reduce Local Work
-
- Stage: Stage-3
- Spark
-#### A masked pattern was here ####
- Vertices:
- Map 7
+ Map 9
Map Operator Tree:
TableScan
- alias: date_dim
- Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: NONE
+ alias: warehouse
+ Statistics: Num rows: 27 Data size: 27802 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: (CAST( d_date AS TIMESTAMP) BETWEEN TIMESTAMP'1998-03-08 23:00:00.0' AND TIMESTAMP'1998-05-08 00:00:00.0' and d_date_sk is not null) (type: boolean)
- Statistics: Num rows: 8116 Data size: 9081804 Basic stats: COMPLETE Column stats: NONE
+ predicate: w_warehouse_sk is not null (type: boolean)
+ Statistics: Num rows: 27 Data size: 27802 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: d_date_sk (type: int), d_date (type: string)
+ expressions: w_warehouse_sk (type: int), w_state (type: string)
outputColumnNames: _col0, _col1
- Statistics: Num rows: 8116 Data size: 9081804 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 27 Data size: 27802 Basic stats: COMPLETE Column stats: NONE
Spark HashTable Sink Operator
keys:
- 0 _col0 (type: int)
+ 0 _col1 (type: int)
1 _col0 (type: int)
Local Work:
Map Reduce Local Work
@@ -109,7 +103,7 @@ STAGE PLANS:
Spark
Edges:
Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 329), Map 6 (PARTITION-LEVEL SORT, 329)
- Reducer 3 <- Map 8 (PARTITION-LEVEL SORT, 370), Reducer 2 (PARTITION-LEVEL SORT, 370)
+ Reducer 3 <- Map 7 (PARTITION-LEVEL SORT, 336), Reducer 2 (PARTITION-LEVEL SORT, 336)
Reducer 4 <- Reducer 3 (GROUP, 447)
Reducer 5 <- Reducer 4 (SORT, 1)
#### A masked pattern was here ####
@@ -150,7 +144,7 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
Statistics: Num rows: 28798881 Data size: 3057234680 Basic stats: COMPLETE Column stats: NONE
value expressions: _col2 (type: decimal(7,2))
- Map 8
+ Map 7
Map Operator Tree:
TableScan
alias: item
@@ -169,8 +163,6 @@ STAGE PLANS:
Statistics: Num rows: 51333 Data size: 73728460 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: string)
Reducer 2
- Local Work:
- Map Reduce Local Work
Reduce Operator Tree:
Join Operator
condition map:
@@ -180,22 +172,12 @@ STAGE PLANS:
1 _col0 (type: int), _col1 (type: int)
outputColumnNames: _col0, _col1, _col2, _col4, _col7
Statistics: Num rows: 316788826 Data size: 42899570777 Basic stats: COMPLETE Column stats: NONE
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0 _col0 (type: int)
- 1 _col0 (type: int)
- outputColumnNames: _col1, _col2, _col4, _col7, _col9
- input vertices:
- 1 Map 7
- Statistics: Num rows: 348467716 Data size: 47189528877 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col2 (type: int)
- sort order: +
- Map-reduce partition columns: _col2 (type: int)
- Statistics: Num rows: 348467716 Data size: 47189528877 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: int), _col4 (type: decimal(7,2)), _col7 (type: decimal(7,2)), _col9 (type: string)
+ Reduce Output Operator
+ key expressions: _col2 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col2 (type: int)
+ Statistics: Num rows: 316788826 Data size: 42899570777 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col0 (type: int), _col1 (type: int), _col4 (type: decimal(7,2)), _col7 (type: decimal(7,2))
Reducer 3
Local Work:
Map Reduce Local Work
@@ -206,35 +188,45 @@ STAGE PLANS:
keys:
0 _col2 (type: int)
1 _col0 (type: int)
- outputColumnNames: _col1, _col4, _col7, _col9, _col11
- Statistics: Num rows: 383314495 Data size: 51908482889 Basic stats: COMPLETE Column stats: NONE
+ outputColumnNames: _col0, _col1, _col4, _col7, _col9
+ Statistics: Num rows: 348467716 Data size: 47189528877 Basic stats: COMPLETE Column stats: NONE
Map Join Operator
condition map:
Inner Join 0 to 1
keys:
- 0 _col1 (type: int)
+ 0 _col0 (type: int)
1 _col0 (type: int)
- outputColumnNames: _col4, _col7, _col9, _col11, _col14
+ outputColumnNames: _col1, _col4, _col7, _col9, _col12
input vertices:
- 1 Map 9
- Statistics: Num rows: 421645953 Data size: 57099332415 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col14 (type: string), _col11 (type: string), CASE WHEN ((CAST( _col9 AS DATE) < DATE'1998-04-08')) THEN ((_col4 - COALESCE(_col7,0))) ELSE (0) END (type: decimal(13,2)), CASE WHEN ((CAST( _col9 AS DATE) >= DATE'1998-04-08')) THEN ((_col4 - COALESCE(_col7,0))) ELSE (0) END (type: decimal(13,2))
- outputColumnNames: _col0, _col1, _col2, _col3
+ 1 Map 8
+ Statistics: Num rows: 383314495 Data size: 51908482889 Basic stats: COMPLETE Column stats: NONE
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col1 (type: int)
+ 1 _col0 (type: int)
+ outputColumnNames: _col4, _col7, _col9, _col12, _col14
+ input vertices:
+ 1 Map 9
Statistics: Num rows: 421645953 Data size: 57099332415 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- aggregations: sum(_col2), sum(_col3)
- keys: _col0 (type: string), _col1 (type: string)
- mode: hash
+ Select Operator
+ expressions: _col14 (type: string), _col9 (type: string), CASE WHEN ((CAST( _col12 AS DATE) < DATE'1998-04-08')) THEN ((_col4 - COALESCE(_col7,0))) ELSE (0) END (type: decimal(13,2)), CASE WHEN ((CAST( _col12 AS DATE) >= DATE'1998-04-08')) THEN ((_col4 - COALESCE(_col7,0))) ELSE (0) END (type: decimal(13,2))
outputColumnNames: _col0, _col1, _col2, _col3
Statistics: Num rows: 421645953 Data size: 57099332415 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: string), _col1 (type: string)
- sort order: ++
- Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
+ Group By Operator
+ aggregations: sum(_col2), sum(_col3)
+ keys: _col0 (type: string), _col1 (type: string)
+ mode: hash
+ outputColumnNames: _col0, _col1, _col2, _col3
Statistics: Num rows: 421645953 Data size: 57099332415 Basic stats: COMPLETE Column stats: NONE
- TopN Hash Memory Usage: 0.1
- value expressions: _col2 (type: decimal(23,2)), _col3 (type: decimal(23,2))
+ Reduce Output Operator
+ key expressions: _col0 (type: string), _col1 (type: string)
+ sort order: ++
+ Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
+ Statistics: Num rows: 421645953 Data size: 57099332415 Basic stats: COMPLETE Column stats: NONE
+ TopN Hash Memory Usage: 0.1
+ value expressions: _col2 (type: decimal(23,2)), _col3 (type: decimal(23,2))
Reducer 4
Reduce Operator Tree:
Group By Operator
http://git-wip-us.apache.org/repos/asf/hive/blob/328d3f93/ql/src/test/results/clientpositive/perf/spark/query44.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/perf/spark/query44.q.out b/ql/src/test/results/clientpositive/perf/spark/query44.q.out
index 4ca41fb..b432c16 100644
--- a/ql/src/test/results/clientpositive/perf/spark/query44.q.out
+++ b/ql/src/test/results/clientpositive/perf/spark/query44.q.out
@@ -1,5 +1,5 @@
-Warning: Shuffle Join JOIN[36][tables = [$hdt$_2, $hdt$_3, $hdt$_1]] in Work 'Reducer 8' is a cross product
-Warning: Shuffle Join JOIN[81][tables = [$hdt$_4, $hdt$_5, $hdt$_3]] in Work 'Reducer 19' is a cross product
+Warning: Shuffle Join JOIN[33][tables = [$hdt$_1, $hdt$_2, $hdt$_0]] in Work 'Reducer 4' is a cross product
+Warning: Shuffle Join JOIN[78][tables = [$hdt$_3, $hdt$_4, $hdt$_2]] in Work 'Reducer 17' is a cross product
PREHOOK: query: explain
select asceding.rnk, i1.i_product_name best_performing, i2.i_product_name worst_performing
from(select *
@@ -76,65 +76,43 @@ STAGE PLANS:
Stage: Stage-1
Spark
Edges:
- Reducer 11 <- Map 10 (GROUP, 100)
- Reducer 13 <- Map 12 (GROUP, 199)
- Reducer 15 <- Map 14 (PARTITION-LEVEL SORT, 1009), Reducer 20 (PARTITION-LEVEL SORT, 1009)
- Reducer 17 <- Map 16 (GROUP, 100)
- Reducer 18 <- Reducer 17 (GROUP, 1)
- Reducer 19 <- Reducer 18 (PARTITION-LEVEL SORT, 1), Reducer 22 (PARTITION-LEVEL SORT, 1), Reducer 24 (PARTITION-LEVEL SORT, 1)
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1009), Reducer 9 (PARTITION-LEVEL SORT, 1009)
- Reducer 20 <- Reducer 19 (PARTITION-LEVEL SORT, 1009)
- Reducer 22 <- Map 10 (GROUP, 100)
- Reducer 24 <- Map 12 (GROUP, 199)
- Reducer 3 <- Reducer 15 (PARTITION-LEVEL SORT, 1009), Reducer 2 (PARTITION-LEVEL SORT, 1009)
- Reducer 4 <- Reducer 3 (SORT, 1)
- Reducer 8 <- Reducer 11 (PARTITION-LEVEL SORT, 1), Reducer 13 (PARTITION-LEVEL SORT, 1), Reducer 18 (PARTITION-LEVEL SORT, 1)
- Reducer 9 <- Reducer 8 (PARTITION-LEVEL SORT, 1009)
+ Reducer 10 <- Map 20 (GROUP, 100)
+ Reducer 12 <- Map 11 (GROUP, 199)
+ Reducer 15 <- Map 1 (GROUP, 100)
+ Reducer 16 <- Reducer 15 (GROUP, 1)
+ Reducer 17 <- Reducer 16 (PARTITION-LEVEL SORT, 1), Reducer 21 (PARTITION-LEVEL SORT, 1), Reducer 23 (PARTITION-LEVEL SORT, 1)
+ Reducer 18 <- Reducer 17 (PARTITION-LEVEL SORT, 1009)
+ Reducer 19 <- Map 24 (PARTITION-LEVEL SORT, 1009), Reducer 18 (PARTITION-LEVEL SORT, 1009)
+ Reducer 21 <- Map 20 (GROUP, 100)
+ Reducer 23 <- Map 11 (GROUP, 199)
+ Reducer 4 <- Reducer 10 (PARTITION-LEVEL SORT, 1), Reducer 12 (PARTITION-LEVEL SORT, 1), Reducer 16 (PARTITION-LEVEL SORT, 1)
+ Reducer 5 <- Reducer 4 (PARTITION-LEVEL SORT, 1009)
+ Reducer 6 <- Map 13 (PARTITION-LEVEL SORT, 1009), Reducer 5 (PARTITION-LEVEL SORT, 1009)
+ Reducer 7 <- Reducer 19 (PARTITION-LEVEL SORT, 1009), Reducer 6 (PARTITION-LEVEL SORT, 1009)
+ Reducer 8 <- Reducer 7 (SORT, 1)
#### A masked pattern was here ####
Vertices:
Map 1
Map Operator Tree:
TableScan
- alias: i1
- Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: i_item_sk is not null (type: boolean)
- Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: i_item_sk (type: int), i_product_name (type: string)
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: int)
- sort order: +
- Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: string)
- Map 10
- Map Operator Tree:
- TableScan
alias: store_sales
Statistics: Num rows: 575995635 Data size: 50814502088 Basic stats: COMPLETE Column stats: NONE
Filter Operator
predicate: ((ss_store_sk = 410) and ss_hdemo_sk is null) (type: boolean)
Statistics: Num rows: 143998908 Data size: 12703625455 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: ss_net_profit (type: decimal(7,2))
- outputColumnNames: _col1
Statistics: Num rows: 143998908 Data size: 12703625455 Basic stats: COMPLETE Column stats: NONE
Group By Operator
- aggregations: sum(_col1), count(_col1)
keys: 410 (type: int)
mode: hash
- outputColumnNames: _col0, _col1, _col2
+ outputColumnNames: _col0
Statistics: Num rows: 143998908 Data size: 12703625455 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 143998908 Data size: 12703625455 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: decimal(17,2)), _col2 (type: bigint)
- Map 12
+ Map 11
Map Operator Tree:
TableScan
alias: ss1
@@ -158,7 +136,7 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 287997817 Data size: 25407250999 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: decimal(17,2)), _col2 (type: bigint)
- Map 14
+ Map 13
Map Operator Tree:
TableScan
alias: i2
@@ -176,7 +154,7 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: string)
- Map 16
+ Map 20
Map Operator Tree:
TableScan
alias: store_sales
@@ -185,18 +163,40 @@ STAGE PLANS:
predicate: ((ss_store_sk = 410) and ss_hdemo_sk is null) (type: boolean)
Statistics: Num rows: 143998908 Data size: 12703625455 Basic stats: COMPLETE Column stats: NONE
Select Operator
+ expressions: ss_net_profit (type: decimal(7,2))
+ outputColumnNames: _col1
Statistics: Num rows: 143998908 Data size: 12703625455 Basic stats: COMPLETE Column stats: NONE
Group By Operator
+ aggregations: sum(_col1), count(_col1)
keys: 410 (type: int)
mode: hash
- outputColumnNames: _col0
+ outputColumnNames: _col0, _col1, _col2
Statistics: Num rows: 143998908 Data size: 12703625455 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 143998908 Data size: 12703625455 Basic stats: COMPLETE Column stats: NONE
- Reducer 11
+ value expressions: _col1 (type: decimal(17,2)), _col2 (type: bigint)
+ Map 24
+ Map Operator Tree:
+ TableScan
+ alias: i1
+ Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
+ Filter Operator
+ predicate: i_item_sk is not null (type: boolean)
+ Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: i_item_sk (type: int), i_product_name (type: string)
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: int)
+ Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col1 (type: string)
+ Reducer 10
Reduce Operator Tree:
Group By Operator
aggregations: sum(VALUE._col0), count(VALUE._col1)
@@ -212,7 +212,7 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 71999454 Data size: 6351812727 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: decimal(37,22))
- Reducer 13
+ Reducer 12
Reduce Operator Tree:
Group By Operator
aggregations: sum(VALUE._col0), count(VALUE._col1)
@@ -230,22 +230,6 @@ STAGE PLANS:
value expressions: _col0 (type: int), _col1 (type: decimal(37,22))
Reducer 15
Reduce Operator Tree:
- Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0 _col0 (type: int)
- 1 _col0 (type: int)
- outputColumnNames: _col1, _col3
- Statistics: Num rows: 1267180808338276 Data size: 234986744609712256 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col3 (type: int)
- sort order: +
- Map-reduce partition columns: _col3 (type: int)
- Statistics: Num rows: 1267180808338276 Data size: 234986744609712256 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: string)
- Reducer 17
- Reduce Operator Tree:
Group By Operator
keys: KEY._col0 (type: int)
mode: mergepartial
@@ -262,7 +246,7 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: bigint)
- Reducer 18
+ Reducer 16
Reduce Operator Tree:
Group By Operator
aggregations: count(VALUE._col0)
@@ -277,7 +261,7 @@ STAGE PLANS:
Reduce Output Operator
sort order:
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
- Reducer 19
+ Reducer 17
Reduce Operator Tree:
Join Operator
condition map:
@@ -294,28 +278,12 @@ STAGE PLANS:
Statistics: Num rows: 3455947584198744 Data size: 640872925954123264 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: 0 (type: int), _col3 (type: decimal(37,22))
- sort order: +-
+ sort order: ++
Map-reduce partition columns: 0 (type: int)
Statistics: Num rows: 3455947584198744 Data size: 640872925954123264 Basic stats: COMPLETE Column stats: NONE
TopN Hash Memory Usage: 0.1
value expressions: _col2 (type: int)
- Reducer 2
- Reduce Operator Tree:
- Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0 _col0 (type: int)
- 1 _col0 (type: int)
- outputColumnNames: _col1, _col3
- Statistics: Num rows: 1267180808338276 Data size: 234986744609712256 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col3 (type: int)
- sort order: +
- Map-reduce partition columns: _col3 (type: int)
- Statistics: Num rows: 1267180808338276 Data size: 234986744609712256 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: string)
- Reducer 20
+ Reducer 18
Reduce Operator Tree:
Select Operator
expressions: VALUE._col2 (type: int), KEY.reducesinkkey1 (type: decimal(37,22))
@@ -330,7 +298,7 @@ STAGE PLANS:
Windowing table definition
input alias: ptf_1
name: windowingtablefunction
- order by: _col3 DESC NULLS LAST
+ order by: _col3 ASC NULLS FIRST
partition by: 0
raw input shape:
window functions:
@@ -355,7 +323,23 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 1151982528066248 Data size: 213624308651374400 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: int)
- Reducer 22
+ Reducer 19
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col0 (type: int)
+ 1 _col0 (type: int)
+ outputColumnNames: _col1, _col3
+ Statistics: Num rows: 1267180808338276 Data size: 234986744609712256 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col1 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col1 (type: int)
+ Statistics: Num rows: 1267180808338276 Data size: 234986744609712256 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col3 (type: string)
+ Reducer 21
Reduce Operator Tree:
Group By Operator
aggregations: sum(VALUE._col0), count(VALUE._col1)
@@ -371,7 +355,7 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 71999454 Data size: 6351812727 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: decimal(37,22))
- Reducer 24
+ Reducer 23
Reduce Operator Tree:
Group By Operator
aggregations: sum(VALUE._col0), count(VALUE._col1)
@@ -387,44 +371,8 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 143998908 Data size: 12703625455 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: int), _col1 (type: decimal(37,22))
- Reducer 3
- Reduce Operator Tree:
- Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0 _col3 (type: int)
- 1 _col3 (type: int)
- outputColumnNames: _col1, _col3, _col5
- Statistics: Num rows: 1393898919384048 Data size: 258485424673204064 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col3 (type: int), _col1 (type: string), _col5 (type: string)
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 1393898919384048 Data size: 258485424673204064 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: int)
- sort order: +
- Statistics: Num rows: 1393898919384048 Data size: 258485424673204064 Basic stats: COMPLETE Column stats: NONE
- TopN Hash Memory Usage: 0.1
- value expressions: _col1 (type: string), _col2 (type: string)
Reducer 4
Reduce Operator Tree:
- Select Operator
- expressions: KEY.reducesinkkey0 (type: int), VALUE._col0 (type: string), VALUE._col1 (type: string)
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 1393898919384048 Data size: 258485424673204064 Basic stats: COMPLETE Column stats: NONE
- Limit
- Number of rows: 100
- Statistics: Num rows: 100 Data size: 18500 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- Statistics: Num rows: 100 Data size: 18500 Basic stats: COMPLETE Column stats: NONE
- table:
- input format: org.apache.hadoop.mapred.SequenceFileInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- Reducer 8
- Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
@@ -440,12 +388,12 @@ STAGE PLANS:
Statistics: Num rows: 3455947584198744 Data size: 640872925954123264 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: 0 (type: int), _col3 (type: decimal(37,22))
- sort order: ++
+ sort order: +-
Map-reduce partition columns: 0 (type: int)
Statistics: Num rows: 3455947584198744 Data size: 640872925954123264 Basic stats: COMPLETE Column stats: NONE
TopN Hash Memory Usage: 0.1
value expressions: _col2 (type: int)
- Reducer 9
+ Reducer 5
Reduce Operator Tree:
Select Operator
expressions: VALUE._col2 (type: int), KEY.reducesinkkey1 (type: decimal(37,22))
@@ -460,7 +408,7 @@ STAGE PLANS:
Windowing table definition
input alias: ptf_1
name: windowingtablefunction
- order by: _col3 ASC NULLS FIRST
+ order by: _col3 DESC NULLS LAST
partition by: 0
raw input shape:
window functions:
@@ -485,6 +433,58 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 1151982528066248 Data size: 213624308651374400 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: int)
+ Reducer 6
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col0 (type: int)
+ 1 _col0 (type: int)
+ outputColumnNames: _col1, _col3
+ Statistics: Num rows: 1267180808338276 Data size: 234986744609712256 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col1 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col1 (type: int)
+ Statistics: Num rows: 1267180808338276 Data size: 234986744609712256 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col3 (type: string)
+ Reducer 7
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col1 (type: int)
+ 1 _col1 (type: int)
+ outputColumnNames: _col3, _col5, _col7
+ Statistics: Num rows: 1393898919384048 Data size: 258485424673204064 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col5 (type: int), _col7 (type: string), _col3 (type: string)
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1393898919384048 Data size: 258485424673204064 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: int)
+ sort order: +
+ Statistics: Num rows: 1393898919384048 Data size: 258485424673204064 Basic stats: COMPLETE Column stats: NONE
+ TopN Hash Memory Usage: 0.1
+ value expressions: _col1 (type: string), _col2 (type: string)
+ Reducer 8
+ Reduce Operator Tree:
+ Select Operator
+ expressions: KEY.reducesinkkey0 (type: int), VALUE._col0 (type: string), VALUE._col1 (type: string)
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1393898919384048 Data size: 258485424673204064 Basic stats: COMPLETE Column stats: NONE
+ Limit
+ Number of rows: 100
+ Statistics: Num rows: 100 Data size: 18500 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ Statistics: Num rows: 100 Data size: 18500 Basic stats: COMPLETE Column stats: NONE
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
http://git-wip-us.apache.org/repos/asf/hive/blob/328d3f93/ql/src/test/results/clientpositive/perf/spark/query45.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/perf/spark/query45.q.out b/ql/src/test/results/clientpositive/perf/spark/query45.q.out
index b674400..7e1cc88 100644
--- a/ql/src/test/results/clientpositive/perf/spark/query45.q.out
+++ b/ql/src/test/results/clientpositive/perf/spark/query45.q.out
@@ -46,10 +46,10 @@ STAGE PLANS:
Stage: Stage-2
Spark
Edges:
- Reducer 16 <- Map 15 (GROUP, 1)
+ Reducer 6 <- Map 5 (GROUP, 1)
#### A masked pattern was here ####
Vertices:
- Map 15
+ Map 5
Map Operator Tree:
TableScan
alias: item
@@ -70,7 +70,7 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: bigint), _col1 (type: bigint)
- Reducer 16
+ Reducer 6
Local Work:
Map Reduce Local Work
Reduce Operator Tree:
@@ -88,33 +88,45 @@ STAGE PLANS:
Spark
Edges:
Reducer 11 <- Map 10 (GROUP, 3)
- Reducer 13 <- Map 12 (PARTITION-LEVEL SORT, 154), Map 14 (PARTITION-LEVEL SORT, 154)
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 855), Map 6 (PARTITION-LEVEL SORT, 855)
- Reducer 3 <- Reducer 2 (PARTITION-LEVEL SORT, 777), Reducer 9 (PARTITION-LEVEL SORT, 777)
- Reducer 4 <- Reducer 3 (GROUP, 230)
- Reducer 5 <- Reducer 4 (SORT, 1)
+ Reducer 13 <- Map 12 (PARTITION-LEVEL SORT, 154), Map 15 (PARTITION-LEVEL SORT, 154)
+ Reducer 14 <- Map 16 (PARTITION-LEVEL SORT, 706), Reducer 13 (PARTITION-LEVEL SORT, 706)
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 527), Reducer 9 (PARTITION-LEVEL SORT, 527)
+ Reducer 3 <- Reducer 2 (GROUP, 224)
+ Reducer 4 <- Reducer 3 (SORT, 1)
Reducer 8 <- Map 7 (PARTITION-LEVEL SORT, 7), Reducer 11 (PARTITION-LEVEL SORT, 7)
- Reducer 9 <- Reducer 13 (PARTITION-LEVEL SORT, 174), Reducer 8 (PARTITION-LEVEL SORT, 174)
+ Reducer 9 <- Reducer 14 (PARTITION-LEVEL SORT, 191), Reducer 8 (PARTITION-LEVEL SORT, 191)
#### A masked pattern was here ####
Vertices:
Map 1
Map Operator Tree:
TableScan
- alias: customer
- Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
+ alias: customer_address
+ Statistics: Num rows: 40000000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: (c_current_addr_sk is not null and c_customer_sk is not null) (type: boolean)
- Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
+ predicate: ca_address_sk is not null (type: boolean)
+ Statistics: Num rows: 40000000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: c_customer_sk (type: int), c_current_addr_sk (type: int)
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col1 (type: int)
- sort order: +
- Map-reduce partition columns: _col1 (type: int)
- Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col0 (type: int)
+ expressions: ca_address_sk (type: int), ca_county (type: string), ca_zip (type: string)
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 40000000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0
+ 1
+ outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ input vertices:
+ 1 Reducer 6
+ Statistics: Num rows: 40000000 Data size: 41275195284 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: int)
+ Statistics: Num rows: 40000000 Data size: 41275195284 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col1 (type: string), _col2 (type: string), _col3 (type: bigint), _col4 (type: bigint)
+ Local Work:
+ Map Reduce Local Work
Map 10
Map Operator Tree:
TableScan
@@ -155,7 +167,7 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 144002668 Data size: 19580198212 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: int), _col2 (type: int), _col3 (type: decimal(7,2))
- Map 14
+ Map 15
Map Operator Tree:
TableScan
alias: date_dim
@@ -172,24 +184,24 @@ STAGE PLANS:
sort order: +
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
- Map 6
+ Map 16
Map Operator Tree:
TableScan
- alias: customer_address
- Statistics: Num rows: 40000000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE
+ alias: customer
+ Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: ca_address_sk is not null (type: boolean)
- Statistics: Num rows: 40000000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE
+ predicate: (c_current_addr_sk is not null and c_customer_sk is not null) (type: boolean)
+ Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: ca_address_sk (type: int), ca_county (type: string), ca_zip (type: string)
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 40000000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE
+ expressions: c_customer_sk (type: int), c_current_addr_sk (type: int)
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 40000000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: string), _col2 (type: string)
+ Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col1 (type: int)
Map 7
Map Operator Tree:
TableScan
@@ -236,99 +248,91 @@ STAGE PLANS:
outputColumnNames: _col1, _col2, _col3
Statistics: Num rows: 158402938 Data size: 21538218500 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
- key expressions: _col1 (type: int)
+ key expressions: _col2 (type: int)
sort order: +
- Map-reduce partition columns: _col1 (type: int)
+ Map-reduce partition columns: _col2 (type: int)
Statistics: Num rows: 158402938 Data size: 21538218500 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col2 (type: int), _col3 (type: decimal(7,2))
- Reducer 2
+ value expressions: _col1 (type: int), _col3 (type: decimal(7,2))
+ Reducer 14
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
keys:
- 0 _col1 (type: int)
+ 0 _col2 (type: int)
1 _col0 (type: int)
- outputColumnNames: _col0, _col3, _col4
- Statistics: Num rows: 88000001 Data size: 75681779077 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: int)
- sort order: +
- Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 88000001 Data size: 75681779077 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col3 (type: string), _col4 (type: string)
- Reducer 3
- Local Work:
- Map Reduce Local Work
+ outputColumnNames: _col1, _col3, _col8
+ Statistics: Num rows: 174243235 Data size: 23692040863 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col8 (type: int), _col1 (type: int), _col3 (type: decimal(7,2))
+ outputColumnNames: _col1, _col3, _col5
+ Statistics: Num rows: 174243235 Data size: 23692040863 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col3 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col3 (type: int)
+ Statistics: Num rows: 174243235 Data size: 23692040863 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col1 (type: int), _col5 (type: decimal(7,2))
+ Reducer 2
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
keys:
0 _col0 (type: int)
- 1 _col6 (type: int)
- outputColumnNames: _col3, _col4, _col6, _col8, _col12
- Statistics: Num rows: 191667562 Data size: 26061245514 Basic stats: COMPLETE Column stats: NONE
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0
- 1
- outputColumnNames: _col3, _col4, _col6, _col8, _col12, _col16, _col17
- input vertices:
- 1 Reducer 16
- Statistics: Num rows: 191667562 Data size: 29319594068 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col12 (type: decimal(7,2)), _col3 (type: string), _col4 (type: string), _col6 (type: string), _col16 (type: bigint), _col17 (type: bigint), _col8 (type: boolean)
- outputColumnNames: _col3, _col7, _col8, _col13, _col14, _col15, _col17
- Statistics: Num rows: 191667562 Data size: 29319594068 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: ((substr(_col8, 1, 5)) IN ('85669', '86197', '88274', '83405', '86475', '85392', '85460', '80348', '81792') or CASE WHEN ((_col14 = 0L)) THEN (false) WHEN (_col17 is not null) THEN (true) WHEN (_col13 is null) THEN (null) WHEN ((_col15 < _col14)) THEN (null) ELSE (false) END) (type: boolean)
- Statistics: Num rows: 191667562 Data size: 29319594068 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col3 (type: decimal(7,2)), _col7 (type: string), _col8 (type: string)
- outputColumnNames: _col3, _col7, _col8
- Statistics: Num rows: 191667562 Data size: 29319594068 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- aggregations: sum(_col3)
- keys: _col8 (type: string), _col7 (type: string)
- mode: hash
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 191667562 Data size: 29319594068 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: string), _col1 (type: string)
- sort order: ++
- Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
- Statistics: Num rows: 191667562 Data size: 29319594068 Basic stats: COMPLETE Column stats: NONE
- TopN Hash Memory Usage: 0.1
- value expressions: _col2 (type: decimal(17,2))
- Reducer 4
+ 1 _col5 (type: int)
+ outputColumnNames: _col1, _col2, _col3, _col4, _col6, _col8, _col14
+ Statistics: Num rows: 210834322 Data size: 28667370686 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col14 (type: decimal(7,2)), _col1 (type: string), _col2 (type: string), _col6 (type: string), _col3 (type: bigint), _col4 (type: bigint), _col8 (type: boolean)
+ outputColumnNames: _col3, _col7, _col8, _col13, _col14, _col15, _col17
+ Statistics: Num rows: 210834322 Data size: 28667370686 Basic stats: COMPLETE Column stats: NONE
+ Filter Operator
+ predicate: ((substr(_col8, 1, 5)) IN ('85669', '86197', '88274', '83405', '86475', '85392', '85460', '80348', '81792') or CASE WHEN ((_col14 = 0L)) THEN (false) WHEN (_col17 is not null) THEN (true) WHEN (_col13 is null) THEN (null) WHEN ((_col15 < _col14)) THEN (null) ELSE (false) END) (type: boolean)
+ Statistics: Num rows: 210834322 Data size: 28667370686 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col3 (type: decimal(7,2)), _col7 (type: string), _col8 (type: string)
+ outputColumnNames: _col3, _col7, _col8
+ Statistics: Num rows: 210834322 Data size: 28667370686 Basic stats: COMPLETE Column stats: NONE
+ Group By Operator
+ aggregations: sum(_col3)
+ keys: _col8 (type: string), _col7 (type: string)
+ mode: hash
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 210834322 Data size: 28667370686 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: string), _col1 (type: string)
+ sort order: ++
+ Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
+ Statistics: Num rows: 210834322 Data size: 28667370686 Basic stats: COMPLETE Column stats: NONE
+ TopN Hash Memory Usage: 0.1
+ value expressions: _col2 (type: decimal(17,2))
+ Reducer 3
Reduce Operator Tree:
Group By Operator
aggregations: sum(VALUE._col0)
keys: KEY._col0 (type: string), KEY._col1 (type: string)
mode: mergepartial
outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 95833781 Data size: 14659797034 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 105417161 Data size: 14333685343 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string), _col1 (type: string)
sort order: ++
- Statistics: Num rows: 95833781 Data size: 14659797034 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 105417161 Data size: 14333685343 Basic stats: COMPLETE Column stats: NONE
TopN Hash Memory Usage: 0.1
value expressions: _col2 (type: decimal(17,2))
- Reducer 5
+ Reducer 4
Reduce Operator Tree:
Select Operator
expressions: KEY.reducesinkkey0 (type: string), KEY.reducesinkkey1 (type: string), VALUE._col0 (type: decimal(17,2))
outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 95833781 Data size: 14659797034 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 105417161 Data size: 14333685343 Basic stats: COMPLETE Column stats: NONE
Limit
Number of rows: 100
- Statistics: Num rows: 100 Data size: 15200 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 100 Data size: 13500 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
- Statistics: Num rows: 100 Data size: 15200 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 100 Data size: 13500 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -356,15 +360,15 @@ STAGE PLANS:
Inner Join 0 to 1
keys:
0 _col0 (type: int)
- 1 _col1 (type: int)
- outputColumnNames: _col1, _col3, _col6, _col7
- Statistics: Num rows: 174243235 Data size: 23692040863 Basic stats: COMPLETE Column stats: NONE
+ 1 _col3 (type: int)
+ outputColumnNames: _col1, _col3, _col5, _col9
+ Statistics: Num rows: 191667562 Data size: 26061245514 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
- key expressions: _col6 (type: int)
+ key expressions: _col5 (type: int)
sort order: +
- Map-reduce partition columns: _col6 (type: int)
- Statistics: Num rows: 174243235 Data size: 23692040863 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: string), _col3 (type: boolean), _col7 (type: decimal(7,2))
+ Map-reduce partition columns: _col5 (type: int)
+ Statistics: Num rows: 191667562 Data size: 26061245514 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col1 (type: string), _col3 (type: boolean), _col9 (type: decimal(7,2))
Stage: Stage-0
Fetch Operator
http://git-wip-us.apache.org/repos/asf/hive/blob/328d3f93/ql/src/test/results/clientpositive/perf/spark/query46.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/perf/spark/query46.q.out b/ql/src/test/results/clientpositive/perf/spark/query46.q.out
index 8b0525d..6705f50 100644
--- a/ql/src/test/results/clientpositive/perf/spark/query46.q.out
+++ b/ql/src/test/results/clientpositive/perf/spark/query46.q.out
@@ -76,7 +76,7 @@ STAGE PLANS:
Spark
#### A masked pattern was here ####
Vertices:
- Map 11
+ Map 10
Map Operator Tree:
TableScan
alias: store
@@ -94,7 +94,7 @@ STAGE PLANS:
1 _col0 (type: int)
Local Work:
Map Reduce Local Work
- Map 12
+ Map 11
Map Operator Tree:
TableScan
alias: household_demographics
@@ -116,12 +116,12 @@ STAGE PLANS:
Stage: Stage-1
Spark
Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 855), Map 5 (PARTITION-LEVEL SORT, 855)
- Reducer 3 <- Reducer 2 (PARTITION-LEVEL SORT, 882), Reducer 9 (PARTITION-LEVEL SORT, 882)
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 829), Reducer 8 (PARTITION-LEVEL SORT, 829)
+ Reducer 3 <- Map 13 (PARTITION-LEVEL SORT, 637), Reducer 2 (PARTITION-LEVEL SORT, 637)
Reducer 4 <- Reducer 3 (SORT, 1)
- Reducer 7 <- Map 10 (PARTITION-LEVEL SORT, 398), Map 6 (PARTITION-LEVEL SORT, 398)
- Reducer 8 <- Map 13 (PARTITION-LEVEL SORT, 846), Reducer 7 (PARTITION-LEVEL SORT, 846)
- Reducer 9 <- Reducer 8 (GROUP, 582)
+ Reducer 6 <- Map 5 (PARTITION-LEVEL SORT, 398), Map 9 (PARTITION-LEVEL SORT, 398)
+ Reducer 7 <- Map 12 (PARTITION-LEVEL SORT, 846), Reducer 6 (PARTITION-LEVEL SORT, 846)
+ Reducer 8 <- Reducer 7 (GROUP, 582)
#### A masked pattern was here ####
Vertices:
Map 1
@@ -137,29 +137,12 @@ STAGE PLANS:
outputColumnNames: _col0, _col1, _col2, _col3
Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
- key expressions: _col1 (type: int)
- sort order: +
- Map-reduce partition columns: _col1 (type: int)
- Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col0 (type: int), _col2 (type: string), _col3 (type: string)
- Map 10
- Map Operator Tree:
- TableScan
- alias: date_dim
- Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: ((d_dow) IN (6, 0) and (d_year) IN (1998, 1999, 2000) and d_date_sk is not null) (type: boolean)
- Statistics: Num rows: 18263 Data size: 20436297 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: d_date_sk (type: int)
- outputColumnNames: _col0
- Statistics: Num rows: 18263 Data size: 20436297 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 18263 Data size: 20436297 Basic stats: COMPLETE Column stats: NONE
- Map 13
+ Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col1 (type: int), _col2 (type: string), _col3 (type: string)
+ Map 12
Map Operator Tree:
TableScan
alias: customer_address
@@ -177,7 +160,7 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 40000000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: string)
- Map 5
+ Map 13
Map Operator Tree:
TableScan
alias: current_addr
@@ -195,7 +178,7 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 40000000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: string)
- Map 6
+ Map 5
Map Operator Tree:
TableScan
alias: store_sales
@@ -213,43 +196,60 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 575995635 Data size: 50814502088 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: int), _col5 (type: int), _col6 (type: decimal(7,2)), _col7 (type: decimal(7,2))
+ Map 9
+ Map Operator Tree:
+ TableScan
+ alias: date_dim
+ Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: NONE
+ Filter Operator
+ predicate: ((d_dow) IN (6, 0) and (d_year) IN (1998, 1999, 2000) and d_date_sk is not null) (type: boolean)
+ Statistics: Num rows: 18263 Data size: 20436297 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: d_date_sk (type: int)
+ outputColumnNames: _col0
+ Statistics: Num rows: 18263 Data size: 20436297 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: int)
+ Statistics: Num rows: 18263 Data size: 20436297 Basic stats: COMPLETE Column stats: NONE
Reducer 2
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
keys:
- 0 _col1 (type: int)
- 1 _col0 (type: int)
- outputColumnNames: _col0, _col2, _col3, _col5
- Statistics: Num rows: 88000001 Data size: 75681779077 Basic stats: COMPLETE Column stats: NONE
+ 0 _col0 (type: int)
+ 1 _col1 (type: int)
+ outputColumnNames: _col1, _col2, _col3, _col4, _col6, _col7, _col8
+ Statistics: Num rows: 463823414 Data size: 40918636263 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
- key expressions: _col0 (type: int)
+ key expressions: _col1 (type: int)
sort order: +
- Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 88000001 Data size: 75681779077 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col2 (type: string), _col3 (type: string), _col5 (type: string)
+ Map-reduce partition columns: _col1 (type: int)
+ Statistics: Num rows: 463823414 Data size: 40918636263 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col2 (type: string), _col3 (type: string), _col4 (type: int), _col6 (type: string), _col7 (type: decimal(17,2)), _col8 (type: decimal(17,2))
Reducer 3
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
keys:
- 0 _col0 (type: int)
- 1 _col1 (type: int)
- outputColumnNames: _col2, _col3, _col5, _col6, _col8, _col9, _col10
- Statistics: Num rows: 463823414 Data size: 40918636263 Basic stats: COMPLETE Column stats: NONE
+ 0 _col1 (type: int)
+ 1 _col0 (type: int)
+ outputColumnNames: _col2, _col3, _col4, _col6, _col7, _col8, _col10
+ Statistics: Num rows: 510205766 Data size: 45010500864 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: (_col5 <> _col8) (type: boolean)
- Statistics: Num rows: 463823414 Data size: 40918636263 Basic stats: COMPLETE Column stats: NONE
+ predicate: (_col10 <> _col6) (type: boolean)
+ Statistics: Num rows: 510205766 Data size: 45010500864 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: _col3 (type: string), _col2 (type: string), _col5 (type: string), _col8 (type: string), _col6 (type: int), _col9 (type: decimal(17,2)), _col10 (type: decimal(17,2))
+ expressions: _col3 (type: string), _col2 (type: string), _col10 (type: string), _col6 (type: string), _col4 (type: int), _col7 (type: decimal(17,2)), _col8 (type: decimal(17,2))
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6
- Statistics: Num rows: 463823414 Data size: 40918636263 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 510205766 Data size: 45010500864 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: int)
sort order: +++++
- Statistics: Num rows: 463823414 Data size: 40918636263 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 510205766 Data size: 45010500864 Basic stats: COMPLETE Column stats: NONE
TopN Hash Memory Usage: 0.1
value expressions: _col5 (type: decimal(17,2)), _col6 (type: decimal(17,2))
Reducer 4
@@ -257,7 +257,7 @@ STAGE PLANS:
Select Operator
expressions: KEY.reducesinkkey0 (type: string), KEY.reducesinkkey1 (type: string), KEY.reducesinkkey2 (type: string), KEY.reducesinkkey3 (type: string), KEY.reducesinkkey4 (type: int), VALUE._col0 (type: decimal(17,2)), VALUE._col1 (type: decimal(17,2))
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6
- Statistics: Num rows: 463823414 Data size: 40918636263 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 510205766 Data size: 45010500864 Basic stats: COMPLETE Column stats: NONE
Limit
Number of rows: 100
Statistics: Num rows: 100 Data size: 8800 Basic stats: COMPLETE Column stats: NONE
@@ -268,7 +268,7 @@ STAGE PLANS:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- Reducer 7
+ Reducer 6
Local Work:
Map Reduce Local Work
Reduce Operator Tree:
@@ -288,7 +288,7 @@ STAGE PLANS:
1 _col0 (type: int)
outputColumnNames: _col1, _col2, _col3, _col5, _col6, _col7
input vertices:
- 1 Map 11
+ 1 Map 10
Statistics: Num rows: 696954748 Data size: 61485550191 Basic stats: COMPLETE Column stats: NONE
Map Join Operator
condition map:
@@ -298,7 +298,7 @@ STAGE PLANS:
1 _col0 (type: int)
outputColumnNames: _col1, _col3, _col5, _col6, _col7
input vertices:
- 1 Map 12
+ 1 Map 11
Statistics: Num rows: 766650239 Data size: 67634106676 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col3 (type: int)
@@ -306,7 +306,7 @@ STAGE PLANS:
Map-reduce partition columns: _col3 (type: int)
Statistics: Num rows: 766650239 Data size: 67634106676 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: int), _col5 (type: int), _col6 (type: decimal(7,2)), _col7 (type: decimal(7,2))
- Reducer 8
+ Reducer 7
Reduce Operator Tree:
Join Operator
condition map:
@@ -328,7 +328,7 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: int), _col1 (type: string), _col2 (type: int), _col3 (type: int)
Statistics: Num rows: 843315281 Data size: 74397518956 Basic stats: COMPLETE Column stats: NONE
value expressions: _col4 (type: decimal(17,2)), _col5 (type: decimal(17,2))
- Reducer 9
+ Reducer 8
Reduce Operator Tree:
Group By Operator
aggregations: sum(VALUE._col0), sum(VALUE._col1)