Posted to commits@hive.apache.org by ha...@apache.org on 2016/12/16 18:28:21 UTC
[08/21] hive git commit: HIVE-15192 : Use Calcite to de-correlate and plan subqueries (Vineet Garg via Ashutosh Chauhan)
http://git-wip-us.apache.org/repos/asf/hive/blob/382dc208/ql/src/test/results/clientpositive/perf/query70.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/perf/query70.q.out b/ql/src/test/results/clientpositive/perf/query70.q.out
index 581fd9b..c3ef2a8 100644
--- a/ql/src/test/results/clientpositive/perf/query70.q.out
+++ b/ql/src/test/results/clientpositive/perf/query70.q.out
@@ -75,92 +75,61 @@ POSTHOOK: type: QUERY
Plan optimized by CBO.
Vertex dependency in root stage
-Reducer 11 <- Map 10 (SIMPLE_EDGE), Map 15 (SIMPLE_EDGE)
-Reducer 12 <- Map 16 (SIMPLE_EDGE), Reducer 11 (SIMPLE_EDGE)
+Reducer 11 <- Map 10 (SIMPLE_EDGE), Map 16 (SIMPLE_EDGE)
+Reducer 12 <- Map 17 (SIMPLE_EDGE), Reducer 11 (SIMPLE_EDGE)
Reducer 13 <- Reducer 12 (SIMPLE_EDGE)
Reducer 14 <- Reducer 13 (SIMPLE_EDGE)
-Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 7 (SIMPLE_EDGE)
-Reducer 3 <- Reducer 2 (SIMPLE_EDGE), Reducer 9 (SIMPLE_EDGE)
-Reducer 4 <- Reducer 3 (SIMPLE_EDGE)
+Reducer 15 <- Reducer 14 (SIMPLE_EDGE)
+Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 8 (SIMPLE_EDGE)
+Reducer 3 <- Map 9 (SIMPLE_EDGE), Reducer 2 (SIMPLE_EDGE)
+Reducer 4 <- Reducer 15 (SIMPLE_EDGE), Reducer 3 (SIMPLE_EDGE)
Reducer 5 <- Reducer 4 (SIMPLE_EDGE)
Reducer 6 <- Reducer 5 (SIMPLE_EDGE)
-Reducer 9 <- Map 8 (SIMPLE_EDGE), Reducer 14 (SIMPLE_EDGE)
+Reducer 7 <- Reducer 6 (SIMPLE_EDGE)
Stage-0
Fetch Operator
limit:100
Stage-1
- Reducer 6
- File Output Operator [FS_63]
- Limit [LIM_62] (rows=100 width=88)
+ Reducer 7
+ File Output Operator [FS_64]
+ Limit [LIM_63] (rows=100 width=88)
Number of rows:100
- Select Operator [SEL_61] (rows=1045432122 width=88)
+ Select Operator [SEL_62] (rows=1149975358 width=88)
Output:["_col0","_col1","_col2","_col3","_col4"]
- <-Reducer 5 [SIMPLE_EDGE]
- SHUFFLE [RS_60]
- Select Operator [SEL_58] (rows=1045432122 width=88)
+ <-Reducer 6 [SIMPLE_EDGE]
+ SHUFFLE [RS_61]
+ Select Operator [SEL_59] (rows=1149975358 width=88)
Output:["_col0","_col1","_col2","_col3","_col4"]
- PTF Operator [PTF_57] (rows=1045432122 width=88)
+ PTF Operator [PTF_58] (rows=1149975358 width=88)
Function definitions:[{},{"name:":"windowingtablefunction","order by:":"_col4 DESC NULLS LAST","partition by:":"(grouping(_col5, 0) + grouping(_col5, 1)), CASE WHEN ((UDFToInteger(grouping(_col5, 1)) = 0)) THEN (_col0) ELSE (null) END"}]
- Select Operator [SEL_56] (rows=1045432122 width=88)
+ Select Operator [SEL_57] (rows=1149975358 width=88)
Output:["_col0","_col1","_col4","_col5"]
- <-Reducer 4 [SIMPLE_EDGE]
- SHUFFLE [RS_55]
+ <-Reducer 5 [SIMPLE_EDGE]
+ SHUFFLE [RS_56]
PartitionCols:(grouping(_col5, 0) + grouping(_col5, 1)), CASE WHEN ((UDFToInteger(grouping(_col5, 1)) = 0)) THEN (_col0) ELSE (null) END
- Select Operator [SEL_54] (rows=1045432122 width=88)
+ Select Operator [SEL_55] (rows=1149975358 width=88)
Output:["_col0","_col1","_col4","_col5"]
- Group By Operator [GBY_53] (rows=1045432122 width=88)
+ Group By Operator [GBY_54] (rows=1149975358 width=88)
Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0, KEY._col1, KEY._col2
- <-Reducer 3 [SIMPLE_EDGE]
- SHUFFLE [RS_52]
+ <-Reducer 4 [SIMPLE_EDGE]
+ SHUFFLE [RS_53]
PartitionCols:_col0, _col1, _col2
- Group By Operator [GBY_51] (rows=2090864244 width=88)
+ Group By Operator [GBY_52] (rows=2299950717 width=88)
Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(_col2)"],keys:_col0, _col1, 0
- Select Operator [SEL_49] (rows=696954748 width=88)
+ Select Operator [SEL_50] (rows=766650239 width=88)
Output:["_col0","_col1","_col2"]
- Merge Join Operator [MERGEJOIN_92] (rows=696954748 width=88)
- Conds:RS_46._col1=RS_47._col0(Inner),Output:["_col2","_col6","_col7"]
- <-Reducer 2 [SIMPLE_EDGE]
- SHUFFLE [RS_46]
- PartitionCols:_col1
- Merge Join Operator [MERGEJOIN_88] (rows=633595212 width=88)
- Conds:RS_43._col0=RS_44._col0(Inner),Output:["_col1","_col2"]
- <-Map 1 [SIMPLE_EDGE]
- SHUFFLE [RS_43]
- PartitionCols:_col0
- Select Operator [SEL_2] (rows=575995635 width=88)
- Output:["_col0","_col1","_col2"]
- Filter Operator [FIL_81] (rows=575995635 width=88)
- predicate:(ss_sold_date_sk is not null and ss_store_sk is not null)
- TableScan [TS_0] (rows=575995635 width=88)
- default@store_sales,store_sales,Tbl:COMPLETE,Col:NONE,Output:["ss_sold_date_sk","ss_store_sk","ss_net_profit"]
- <-Map 7 [SIMPLE_EDGE]
- SHUFFLE [RS_44]
- PartitionCols:_col0
- Select Operator [SEL_5] (rows=8116 width=1119)
- Output:["_col0"]
- Filter Operator [FIL_82] (rows=8116 width=1119)
- predicate:(d_month_seq BETWEEN 1212 AND 1223 and d_date_sk is not null)
- TableScan [TS_3] (rows=73049 width=1119)
- default@date_dim,d1,Tbl:COMPLETE,Col:NONE,Output:["d_date_sk","d_month_seq"]
- <-Reducer 9 [SIMPLE_EDGE]
- SHUFFLE [RS_47]
+ Merge Join Operator [MERGEJOIN_92] (rows=766650239 width=88)
+ Conds:RS_47._col7=RS_48._col0(Inner),Output:["_col2","_col6","_col7"]
+ <-Reducer 15 [SIMPLE_EDGE]
+ SHUFFLE [RS_48]
PartitionCols:_col0
- Merge Join Operator [MERGEJOIN_91] (rows=127775039 width=88)
- Conds:RS_39._col2=RS_40._col0(Left Semi),Output:["_col0","_col1","_col2"]
- <-Map 8 [SIMPLE_EDGE]
- SHUFFLE [RS_39]
- PartitionCols:_col2
- Select Operator [SEL_8] (rows=1704 width=1910)
- Output:["_col0","_col1","_col2"]
- Filter Operator [FIL_83] (rows=1704 width=1910)
- predicate:(s_state is not null and s_store_sk is not null)
- TableScan [TS_6] (rows=1704 width=1910)
- default@store,s,Tbl:COMPLETE,Col:NONE,Output:["s_store_sk","s_county","s_state"]
+ Group By Operator [GBY_39] (rows=58079562 width=88)
+ Output:["_col0"],keys:KEY._col0
<-Reducer 14 [SIMPLE_EDGE]
- SHUFFLE [RS_40]
+ SHUFFLE [RS_38]
PartitionCols:_col0
- Group By Operator [GBY_38] (rows=116159124 width=88)
+ Group By Operator [GBY_37] (rows=116159124 width=88)
Output:["_col0"],keys:_col0
Select Operator [SEL_32] (rows=116159124 width=88)
Output:["_col0"]
@@ -182,21 +151,21 @@ Stage-0
Output:["_col0","_col1"],aggregations:["sum(_col2)"],keys:_col6
Select Operator [SEL_24] (rows=696954748 width=88)
Output:["_col6","_col2"]
- Merge Join Operator [MERGEJOIN_90] (rows=696954748 width=88)
+ Merge Join Operator [MERGEJOIN_91] (rows=696954748 width=88)
Conds:RS_21._col1=RS_22._col0(Inner),Output:["_col2","_col6"]
- <-Map 16 [SIMPLE_EDGE]
+ <-Map 17 [SIMPLE_EDGE]
SHUFFLE [RS_22]
PartitionCols:_col0
Select Operator [SEL_17] (rows=1704 width=1910)
Output:["_col0","_col1"]
Filter Operator [FIL_87] (rows=1704 width=1910)
- predicate:(s_store_sk is not null and s_state is not null)
+ predicate:s_store_sk is not null
TableScan [TS_15] (rows=1704 width=1910)
default@store,store,Tbl:COMPLETE,Col:NONE,Output:["s_store_sk","s_state"]
<-Reducer 11 [SIMPLE_EDGE]
SHUFFLE [RS_21]
PartitionCols:_col1
- Merge Join Operator [MERGEJOIN_89] (rows=633595212 width=88)
+ Merge Join Operator [MERGEJOIN_90] (rows=633595212 width=88)
Conds:RS_18._col0=RS_19._col0(Inner),Output:["_col1","_col2"]
<-Map 10 [SIMPLE_EDGE]
SHUFFLE [RS_18]
@@ -207,7 +176,7 @@ Stage-0
predicate:(ss_store_sk is not null and ss_sold_date_sk is not null)
TableScan [TS_9] (rows=575995635 width=88)
default@store_sales,store_sales,Tbl:COMPLETE,Col:NONE,Output:["ss_sold_date_sk","ss_store_sk","ss_net_profit"]
- <-Map 15 [SIMPLE_EDGE]
+ <-Map 16 [SIMPLE_EDGE]
SHUFFLE [RS_19]
PartitionCols:_col0
Select Operator [SEL_14] (rows=8116 width=1119)
@@ -216,4 +185,41 @@ Stage-0
predicate:(d_month_seq BETWEEN 1212 AND 1223 and d_date_sk is not null)
TableScan [TS_12] (rows=73049 width=1119)
default@date_dim,date_dim,Tbl:COMPLETE,Col:NONE,Output:["d_date_sk","d_month_seq"]
+ <-Reducer 3 [SIMPLE_EDGE]
+ SHUFFLE [RS_47]
+ PartitionCols:_col7
+ Merge Join Operator [MERGEJOIN_89] (rows=696954748 width=88)
+ Conds:RS_44._col1=RS_45._col0(Inner),Output:["_col2","_col6","_col7"]
+ <-Map 9 [SIMPLE_EDGE]
+ SHUFFLE [RS_45]
+ PartitionCols:_col0
+ Select Operator [SEL_8] (rows=1704 width=1910)
+ Output:["_col0","_col1","_col2"]
+ Filter Operator [FIL_83] (rows=1704 width=1910)
+ predicate:s_store_sk is not null
+ TableScan [TS_6] (rows=1704 width=1910)
+ default@store,s,Tbl:COMPLETE,Col:NONE,Output:["s_store_sk","s_county","s_state"]
+ <-Reducer 2 [SIMPLE_EDGE]
+ SHUFFLE [RS_44]
+ PartitionCols:_col1
+ Merge Join Operator [MERGEJOIN_88] (rows=633595212 width=88)
+ Conds:RS_41._col0=RS_42._col0(Inner),Output:["_col1","_col2"]
+ <-Map 1 [SIMPLE_EDGE]
+ SHUFFLE [RS_41]
+ PartitionCols:_col0
+ Select Operator [SEL_2] (rows=575995635 width=88)
+ Output:["_col0","_col1","_col2"]
+ Filter Operator [FIL_81] (rows=575995635 width=88)
+ predicate:(ss_sold_date_sk is not null and ss_store_sk is not null)
+ TableScan [TS_0] (rows=575995635 width=88)
+ default@store_sales,store_sales,Tbl:COMPLETE,Col:NONE,Output:["ss_sold_date_sk","ss_store_sk","ss_net_profit"]
+ <-Map 8 [SIMPLE_EDGE]
+ SHUFFLE [RS_42]
+ PartitionCols:_col0
+ Select Operator [SEL_5] (rows=8116 width=1119)
+ Output:["_col0"]
+ Filter Operator [FIL_82] (rows=8116 width=1119)
+ predicate:(d_month_seq BETWEEN 1212 AND 1223 and d_date_sk is not null)
+ TableScan [TS_3] (rows=73049 width=1119)
+ default@date_dim,d1,Tbl:COMPLETE,Col:NONE,Output:["d_date_sk","d_month_seq"]
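[Editorial note, not part of the patch: the query70 plan above replaces a Left Semi Join (the old MERGEJOIN_91) with a plain Inner Join fed by an extra Group By stage (GBY_39) that deduplicates the subquery side. The two shapes are equivalent because joining against distinct keys cannot multiply rows on the probe side. A minimal sketch of that equivalence, with illustrative names that are not Hive APIs:]

```python
# Semi-join vs. inner-join-over-distinct-keys: the rewrite HIVE-15192
# applies when Calcite plans the de-correlated subquery as a join.

def left_semi_join(probe, build, pkey, bkey):
    """Keep each probe row at most once if its key appears in build."""
    build_keys = {bkey(b) for b in build}
    return [p for p in probe if pkey(p) in build_keys]

def inner_join_on_distinct(probe, build, pkey, bkey):
    """Inner-join probe against the deduplicated build keys."""
    # The added Group By stage: deduplicate the build side first...
    distinct = list(dict.fromkeys(bkey(b) for b in build))
    out = []
    for p in probe:
        for k in distinct:
            # ...so a plain inner join can no longer duplicate probe rows.
            if pkey(p) == k:
                out.append(p)
    return out

probe = [("a", 1), ("b", 2), ("c", 3)]
build = [(1,), (1,), (3,)]          # note the duplicate build key 1
pkey = lambda row: row[1]
bkey = lambda row: row[0]

assert left_semi_join(probe, build, pkey, bkey) == \
       inner_join_on_distinct(probe, build, pkey, bkey)
```

Without the deduplication step, the inner join would emit probe row `("a", 1)` twice, which is why the plan inserts GBY_39 before the join.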
http://git-wip-us.apache.org/repos/asf/hive/blob/382dc208/ql/src/test/results/clientpositive/semijoin4.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/semijoin4.q.out b/ql/src/test/results/clientpositive/semijoin4.q.out
index 3c065a9..89e4023 100644
--- a/ql/src/test/results/clientpositive/semijoin4.q.out
+++ b/ql/src/test/results/clientpositive/semijoin4.q.out
@@ -56,9 +56,10 @@ WHERE (t2.tinyint_col_21) IN (
POSTHOOK: type: QUERY
STAGE DEPENDENCIES:
Stage-1 is a root stage
- Stage-2 depends on stages: Stage-1, Stage-5
+ Stage-2 depends on stages: Stage-1, Stage-6
Stage-3 depends on stages: Stage-2
Stage-5 is a root stage
+ Stage-6 depends on stages: Stage-5
Stage-0 depends on stages: Stage-3
STAGE PLANS:
@@ -69,7 +70,7 @@ STAGE PLANS:
alias: t1
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Filter Operator
- predicate: ((UDFToInteger(tinyint_col_46) = -92) and decimal1309_col_65 is not null and bigint_col_13 is not null) (type: boolean)
+ predicate: (decimal1309_col_65 is not null and bigint_col_13 is not null and tinyint_col_46 is not null) (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
expressions: bigint_col_13 (type: bigint), smallint_col_24 (type: smallint), tinyint_col_46 (type: tinyint), double_col_60 (type: double), decimal1309_col_65 (type: decimal(13,9))
@@ -85,7 +86,7 @@ STAGE PLANS:
alias: t2
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Filter Operator
- predicate: ((UDFToInteger(tinyint_col_21) = -92) and tinyint_col_18 is not null and decimal2709_col_9 is not null) (type: boolean)
+ predicate: (tinyint_col_18 is not null and tinyint_col_21 is not null and decimal2709_col_9 is not null) (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
expressions: decimal2709_col_9 (type: decimal(27,9)), tinyint_col_18 (type: tinyint), tinyint_col_21 (type: tinyint)
@@ -105,27 +106,23 @@ STAGE PLANS:
1 _col2 (type: tinyint), _col0 (type: decimal(27,9)), UDFToLong(_col1) (type: bigint)
outputColumnNames: _col1, _col3, _col7
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
- Select Operator
- expressions: _col1 (type: smallint), _col3 (type: double), _col7 (type: tinyint), UDFToInteger(_col7) (type: int)
- outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
- File Output Operator
- compressed: false
- table:
- input format: org.apache.hadoop.mapred.SequenceFileInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+ File Output Operator
+ compressed: false
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Stage: Stage-2
Map Reduce
Map Operator Tree:
TableScan
Reduce Output Operator
- key expressions: _col3 (type: int)
+ key expressions: UDFToInteger(_col7) (type: int)
sort order: +
- Map-reduce partition columns: _col3 (type: int)
+ Map-reduce partition columns: UDFToInteger(_col7) (type: int)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
- value expressions: _col0 (type: smallint), _col1 (type: double), _col2 (type: tinyint)
+ value expressions: _col1 (type: smallint), _col3 (type: double), _col7 (type: tinyint)
TableScan
Reduce Output Operator
key expressions: _col0 (type: int)
@@ -135,11 +132,11 @@ STAGE PLANS:
Reduce Operator Tree:
Join Operator
condition map:
- Left Semi Join 0 to 1
+ Inner Join 0 to 1
keys:
- 0 _col3 (type: int)
+ 0 UDFToInteger(_col7) (type: int)
1 _col0 (type: int)
- outputColumnNames: _col0, _col1, _col2
+ outputColumnNames: _col1, _col3, _col7
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
File Output Operator
compressed: false
@@ -153,27 +150,27 @@ STAGE PLANS:
Map Operator Tree:
TableScan
Reduce Output Operator
- key expressions: (UDFToShort(_col2) + _col0) (type: smallint), floor(_col1) (type: bigint)
+ key expressions: (UDFToShort(_col7) + _col1) (type: smallint), floor(_col3) (type: bigint)
sort order: +-
- Map-reduce partition columns: (UDFToShort(_col2) + _col0) (type: smallint)
+ Map-reduce partition columns: (UDFToShort(_col7) + _col1) (type: smallint)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
- value expressions: _col0 (type: smallint), _col1 (type: double), _col2 (type: tinyint)
+ value expressions: _col1 (type: smallint), _col3 (type: double), _col7 (type: tinyint)
Reduce Operator Tree:
Select Operator
- expressions: VALUE._col0 (type: smallint), VALUE._col1 (type: double), VALUE._col2 (type: tinyint)
- outputColumnNames: _col0, _col1, _col2
+ expressions: VALUE._col1 (type: smallint), VALUE._col3 (type: double), VALUE._col7 (type: tinyint)
+ outputColumnNames: _col1, _col3, _col7
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
PTF Operator
Function definitions:
Input definition
input alias: ptf_0
- output shape: _col0: smallint, _col1: double, _col2: tinyint
+ output shape: _col1: smallint, _col3: double, _col7: tinyint
type: WINDOWING
Windowing table definition
input alias: ptf_1
name: windowingtablefunction
- order by: (UDFToShort(_col2) + _col0) ASC NULLS FIRST, floor(_col1) DESC NULLS LAST
- partition by: (UDFToShort(_col2) + _col0)
+ order by: (UDFToShort(_col7) + _col1) ASC NULLS FIRST, floor(_col3) DESC NULLS LAST
+ partition by: (UDFToShort(_col7) + _col1)
raw input shape:
window functions:
window function definition
@@ -237,21 +234,39 @@ STAGE PLANS:
0 _col0 (type: decimal(19,11)), _col1 (type: timestamp)
1 _col0 (type: decimal(19,11)), _col1 (type: timestamp)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
- Select Operator
- expressions: -92 (type: int)
+ Group By Operator
+ keys: -92 (type: int)
+ mode: hash
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
- Group By Operator
- keys: _col0 (type: int)
- mode: hash
- outputColumnNames: _col0
+ File Output Operator
+ compressed: false
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+
+ Stage: Stage-6
+ Map Reduce
+ Map Operator Tree:
+ TableScan
+ Reduce Output Operator
+ key expressions: _col0 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
- File Output Operator
- compressed: false
- table:
- input format: org.apache.hadoop.mapred.SequenceFileInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+ Reduce Operator Tree:
+ Group By Operator
+ keys: KEY._col0 (type: int)
+ mode: mergepartial
+ outputColumnNames: _col0
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ File Output Operator
+ compressed: false
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Stage: Stage-0
Fetch Operator
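[Editorial note, not part of the patch: in the semijoin4 plan above, the constant predicate `(UDFToInteger(tinyint_col_21) = -92)` is no longer folded into the table-scan filters; instead the `IN (SELECT ...)` subquery is planned as an Inner Join against the subquery output, deduplicated by a new Group By (`keys: -92`) in the added Stage-6. Deduplication is what preserves IN semantics: IN must not multiply matching rows. A hedged sketch, with illustrative names that are not Hive APIs:]

```python
# Reference IN semantics vs. the join rewrite Calcite produces for a
# de-correlated IN subquery whose output may contain duplicates.

def in_subquery(rows, key, sub_values):
    """Reference semantics of `key(row) IN (subquery)`."""
    return [r for r in rows if key(r) in set(sub_values)]

def join_rewrite(rows, key, sub_values):
    """Join against the DISTINCT subquery values (the added Group By)."""
    distinct = list(dict.fromkeys(sub_values))
    return [r for r in rows for v in distinct if key(r) == v]

rows = [(-92, "x"), (7, "y"), (-92, "z")]
sub = [-92, -92, -92]           # the subquery can emit duplicates
key = lambda r: r[0]

assert in_subquery(rows, key, sub) == join_rewrite(rows, key, sub)
```

Without the Group By, the join would emit each matching row once per duplicate subquery value, which is why the plan adds the mergepartial Group By stage before the Inner Join.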
http://git-wip-us.apache.org/repos/asf/hive/blob/382dc208/ql/src/test/results/clientpositive/semijoin5.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/semijoin5.q.out b/ql/src/test/results/clientpositive/semijoin5.q.out
index 63b477c..20d372a 100644
--- a/ql/src/test/results/clientpositive/semijoin5.q.out
+++ b/ql/src/test/results/clientpositive/semijoin5.q.out
@@ -48,10 +48,14 @@ WHERE (t2.smallint_col_19) IN (SELECT
POSTHOOK: type: QUERY
STAGE DEPENDENCIES:
Stage-1 is a root stage
- Stage-2 depends on stages: Stage-1, Stage-6
+ Stage-2 depends on stages: Stage-1, Stage-8
Stage-3 depends on stages: Stage-2
Stage-4 depends on stages: Stage-3
- Stage-6 is a root stage
+ Stage-9 is a root stage
+ Stage-10 depends on stages: Stage-9
+ Stage-6 depends on stages: Stage-10
+ Stage-7 depends on stages: Stage-6
+ Stage-8 depends on stages: Stage-7
Stage-0 depends on stages: Stage-4
STAGE PLANS:
@@ -62,7 +66,7 @@ STAGE PLANS:
alias: t1
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Filter Operator
- predicate: (tinyint_col_3 is not null and bigint_col_7 is not null and decimal2016_col_26 is not null and timestamp_col_9 is not null) (type: boolean)
+ predicate: (tinyint_col_3 is not null and bigint_col_7 is not null and decimal2016_col_26 is not null) (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
expressions: tinyint_col_3 (type: tinyint), bigint_col_7 (type: bigint), timestamp_col_9 (type: timestamp), double_col_16 (type: double), decimal2016_col_26 (type: decimal(20,16)), smallint_col_50 (type: smallint)
@@ -78,7 +82,7 @@ STAGE PLANS:
alias: t2
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Filter Operator
- predicate: ((UDFToInteger(smallint_col_19) = -92) and tinyint_col_20 is not null and decimal2709_col_9 is not null and tinyint_col_15 is not null) (type: boolean)
+ predicate: (tinyint_col_20 is not null and decimal2709_col_9 is not null and tinyint_col_15 is not null) (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
expressions: decimal2709_col_9 (type: decimal(27,9)), int_col_10 (type: int), tinyint_col_15 (type: tinyint), smallint_col_19 (type: smallint), tinyint_col_20 (type: tinyint)
@@ -99,41 +103,37 @@ STAGE PLANS:
1 _col4 (type: tinyint), _col0 (type: decimal(34,16)), UDFToLong(_col2) (type: bigint)
outputColumnNames: _col2, _col3, _col5, _col7, _col9
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
- Select Operator
- expressions: _col2 (type: timestamp), _col3 (type: double), _col5 (type: smallint), _col7 (type: int), UDFToInteger(_col9) (type: int)
- outputColumnNames: _col0, _col1, _col2, _col3, _col4
- Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
- File Output Operator
- compressed: false
- table:
- input format: org.apache.hadoop.mapred.SequenceFileInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+ File Output Operator
+ compressed: false
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Stage: Stage-2
Map Reduce
Map Operator Tree:
TableScan
Reduce Output Operator
- key expressions: _col0 (type: timestamp), _col4 (type: int)
+ key expressions: _col2 (type: timestamp), UDFToInteger(_col9) (type: int)
sort order: ++
- Map-reduce partition columns: _col0 (type: timestamp), _col4 (type: int)
+ Map-reduce partition columns: _col2 (type: timestamp), UDFToInteger(_col9) (type: int)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
- value expressions: _col1 (type: double), _col2 (type: smallint), _col3 (type: int)
+ value expressions: _col3 (type: double), _col5 (type: smallint), _col7 (type: int)
TableScan
Reduce Output Operator
- key expressions: _col0 (type: timestamp), _col1 (type: int)
+ key expressions: _col1 (type: timestamp), _col0 (type: int)
sort order: ++
- Map-reduce partition columns: _col0 (type: timestamp), _col1 (type: int)
+ Map-reduce partition columns: _col1 (type: timestamp), _col0 (type: int)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Reduce Operator Tree:
Join Operator
condition map:
- Left Semi Join 0 to 1
+ Inner Join 0 to 1
keys:
- 0 _col0 (type: timestamp), _col4 (type: int)
- 1 _col0 (type: timestamp), _col1 (type: int)
- outputColumnNames: _col1, _col2, _col3
+ 0 _col2 (type: timestamp), UDFToInteger(_col9) (type: int)
+ 1 _col1 (type: timestamp), _col0 (type: int)
+ outputColumnNames: _col3, _col5, _col7
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
File Output Operator
compressed: false
@@ -147,27 +147,27 @@ STAGE PLANS:
Map Operator Tree:
TableScan
Reduce Output Operator
- key expressions: (_col3 + UDFToInteger(_col2)) (type: int), floor(_col1) (type: bigint)
+ key expressions: (_col7 + UDFToInteger(_col5)) (type: int), floor(_col3) (type: bigint)
sort order: +-
- Map-reduce partition columns: (_col3 + UDFToInteger(_col2)) (type: int)
+ Map-reduce partition columns: (_col7 + UDFToInteger(_col5)) (type: int)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
- value expressions: _col1 (type: double), _col2 (type: smallint), _col3 (type: int)
+ value expressions: _col3 (type: double), _col5 (type: smallint), _col7 (type: int)
Reduce Operator Tree:
Select Operator
- expressions: VALUE._col1 (type: double), VALUE._col2 (type: smallint), VALUE._col3 (type: int)
- outputColumnNames: _col1, _col2, _col3
+ expressions: VALUE._col3 (type: double), VALUE._col5 (type: smallint), VALUE._col7 (type: int)
+ outputColumnNames: _col3, _col5, _col7
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
PTF Operator
Function definitions:
Input definition
input alias: ptf_0
- output shape: _col1: double, _col2: smallint, _col3: int
+ output shape: _col3: double, _col5: smallint, _col7: int
type: WINDOWING
Windowing table definition
input alias: ptf_1
name: windowingtablefunction
- order by: (_col3 + UDFToInteger(_col2)) ASC NULLS FIRST, floor(_col1) DESC NULLS LAST
- partition by: (_col3 + UDFToInteger(_col2))
+ order by: (_col7 + UDFToInteger(_col5)) ASC NULLS FIRST, floor(_col3) DESC NULLS LAST
+ partition by: (_col7 + UDFToInteger(_col5))
raw input shape:
window functions:
window function definition
@@ -179,8 +179,8 @@ STAGE PLANS:
isPivotResult: true
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
- expressions: LEAD_window_0 (type: int), _col1 (type: double), _col2 (type: smallint), _col3 (type: int)
- outputColumnNames: LEAD_window_0, _col1, _col2, _col3
+ expressions: LEAD_window_0 (type: int), _col3 (type: double), _col5 (type: smallint), _col7 (type: int)
+ outputColumnNames: LEAD_window_0, _col3, _col5, _col7
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
File Output Operator
compressed: false
@@ -194,27 +194,27 @@ STAGE PLANS:
Map Operator Tree:
TableScan
Reduce Output Operator
- key expressions: (_col3 + UDFToInteger(_col2)) (type: int), floor(_col1) (type: bigint)
+ key expressions: (_col7 + UDFToInteger(_col5)) (type: int), floor(_col3) (type: bigint)
sort order: --
- Map-reduce partition columns: (_col3 + UDFToInteger(_col2)) (type: int)
+ Map-reduce partition columns: (_col7 + UDFToInteger(_col5)) (type: int)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
- value expressions: LEAD_window_0 (type: int), _col1 (type: double), _col2 (type: smallint), _col3 (type: int)
+ value expressions: LEAD_window_0 (type: int), _col3 (type: double), _col5 (type: smallint), _col7 (type: int)
Reduce Operator Tree:
Select Operator
- expressions: VALUE._col0 (type: int), VALUE._col2 (type: double), VALUE._col3 (type: smallint), VALUE._col4 (type: int)
- outputColumnNames: _col0, _col2, _col3, _col4
+ expressions: VALUE._col0 (type: int), VALUE._col4 (type: double), VALUE._col6 (type: smallint), VALUE._col8 (type: int)
+ outputColumnNames: _col0, _col4, _col6, _col8
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
PTF Operator
Function definitions:
Input definition
input alias: ptf_0
- output shape: _col0: int, _col2: double, _col3: smallint, _col4: int
+ output shape: _col0: int, _col4: double, _col6: smallint, _col8: int
type: WINDOWING
Windowing table definition
input alias: ptf_1
name: windowingtablefunction
- order by: (_col4 + UDFToInteger(_col3)) DESC NULLS LAST, floor(_col2) DESC NULLS LAST
- partition by: (_col4 + UDFToInteger(_col3))
+ order by: (_col8 + UDFToInteger(_col6)) DESC NULLS LAST, floor(_col4) DESC NULLS LAST
+ partition by: (_col8 + UDFToInteger(_col6))
raw input shape:
window functions:
window function definition
@@ -225,7 +225,7 @@ STAGE PLANS:
window frame: PRECEDING(MAX)~FOLLOWING(48)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
- expressions: COALESCE(498,_col0,524) (type: int), (_col4 + UDFToInteger(_col3)) (type: int), floor(_col2) (type: bigint), COALESCE(sum_window_1,704) (type: bigint)
+ expressions: COALESCE(498,_col0,524) (type: int), (_col8 + UDFToInteger(_col6)) (type: int), floor(_col4) (type: bigint), COALESCE(sum_window_1,704) (type: bigint)
outputColumnNames: _col0, _col1, _col2, _col3
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
File Output Operator
@@ -236,40 +236,149 @@ STAGE PLANS:
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- Stage: Stage-6
+ Stage: Stage-9
Map Reduce
Map Operator Tree:
TableScan
- alias: tt1
+ alias: t1
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Filter Operator
- predicate: decimal2612_col_77 is not null (type: boolean)
+ predicate: (tinyint_col_3 is not null and bigint_col_7 is not null and decimal2016_col_26 is not null) (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
- expressions: decimal2612_col_77 (type: decimal(26,12))
- outputColumnNames: _col0
+ expressions: tinyint_col_3 (type: tinyint), bigint_col_7 (type: bigint), timestamp_col_9 (type: timestamp), decimal2016_col_26 (type: decimal(20,16))
+ outputColumnNames: _col0, _col1, _col2, _col3
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Reduce Output Operator
- key expressions: _col0 (type: decimal(26,12))
- sort order: +
- Map-reduce partition columns: _col0 (type: decimal(26,12))
+ key expressions: _col0 (type: tinyint), _col3 (type: decimal(34,16)), _col1 (type: bigint)
+ sort order: +++
+ Map-reduce partition columns: _col0 (type: tinyint), _col3 (type: decimal(34,16)), _col1 (type: bigint)
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ value expressions: _col2 (type: timestamp)
+ TableScan
+ alias: t2
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Filter Operator
+ predicate: (tinyint_col_20 is not null and decimal2709_col_9 is not null and tinyint_col_15 is not null) (type: boolean)
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Select Operator
+ expressions: decimal2709_col_9 (type: decimal(27,9)), tinyint_col_15 (type: tinyint), tinyint_col_20 (type: tinyint)
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col2 (type: tinyint), _col0 (type: decimal(34,16)), UDFToLong(_col1) (type: bigint)
+ sort order: +++
+ Map-reduce partition columns: _col2 (type: tinyint), _col0 (type: decimal(34,16)), UDFToLong(_col1) (type: bigint)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col0 (type: tinyint), _col3 (type: decimal(34,16)), _col1 (type: bigint)
+ 1 _col2 (type: tinyint), _col0 (type: decimal(34,16)), UDFToLong(_col1) (type: bigint)
+ outputColumnNames: _col2
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Group By Operator
+ keys: _col2 (type: timestamp)
+ mode: hash
+ outputColumnNames: _col0
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ File Output Operator
+ compressed: false
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+
+ Stage: Stage-10
+ Map Reduce
+ Map Operator Tree:
+ TableScan
+ Reduce Output Operator
+ key expressions: _col0 (type: timestamp)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: timestamp)
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Reduce Operator Tree:
+ Group By Operator
+ keys: KEY._col0 (type: timestamp)
+ mode: mergepartial
+ outputColumnNames: _col0
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ File Output Operator
+ compressed: false
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+
+ Stage: Stage-6
+ Map Reduce
+ Map Operator Tree:
TableScan
alias: tt2
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Filter Operator
- predicate: (decimal1911_col_16 is not null and timestamp_col_18 is not null) (type: boolean)
+ predicate: decimal1911_col_16 is not null (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
expressions: decimal1911_col_16 (type: decimal(19,11)), timestamp_col_18 (type: timestamp)
outputColumnNames: _col0, _col1
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Reduce Output Operator
+ key expressions: _col1 (type: timestamp)
+ sort order: +
+ Map-reduce partition columns: _col1 (type: timestamp)
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ value expressions: _col0 (type: decimal(19,11))
+ TableScan
+ Reduce Output Operator
+ key expressions: _col0 (type: timestamp)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: timestamp)
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col1 (type: timestamp)
+ 1 _col0 (type: timestamp)
+ outputColumnNames: _col0, _col2
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ File Output Operator
+ compressed: false
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+
+ Stage: Stage-7
+ Map Reduce
+ Map Operator Tree:
+ TableScan
+ Reduce Output Operator
+ key expressions: _col0 (type: decimal(26,12))
+ sort order: +
+ Map-reduce partition columns: _col0 (type: decimal(26,12))
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ value expressions: _col2 (type: timestamp)
+ TableScan
+ alias: tt1
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Filter Operator
+ predicate: decimal2612_col_77 is not null (type: boolean)
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Select Operator
+ expressions: decimal2612_col_77 (type: decimal(26,12))
+ outputColumnNames: _col0
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Reduce Output Operator
key expressions: _col0 (type: decimal(26,12))
sort order: +
Map-reduce partition columns: _col0 (type: decimal(26,12))
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
- value expressions: _col1 (type: timestamp)
Reduce Operator Tree:
Join Operator
condition map:
@@ -280,11 +389,11 @@ STAGE PLANS:
outputColumnNames: _col2
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
- expressions: _col2 (type: timestamp), -92 (type: int)
- outputColumnNames: _col0, _col1
+ expressions: _col2 (type: timestamp)
+ outputColumnNames: _col1
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Group By Operator
- keys: _col0 (type: timestamp), _col1 (type: int)
+ keys: _col1 (type: timestamp), -92 (type: int)
mode: hash
outputColumnNames: _col0, _col1
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
@@ -295,6 +404,32 @@ STAGE PLANS:
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+ Stage: Stage-8
+ Map Reduce
+ Map Operator Tree:
+ TableScan
+ Reduce Output Operator
+ key expressions: _col0 (type: timestamp), _col1 (type: int)
+ sort order: ++
+ Map-reduce partition columns: _col0 (type: timestamp), _col1 (type: int)
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Reduce Operator Tree:
+ Group By Operator
+ keys: KEY._col0 (type: timestamp), KEY._col1 (type: int)
+ mode: mergepartial
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Select Operator
+ expressions: _col1 (type: int), _col0 (type: timestamp)
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ File Output Operator
+ compressed: false
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+
Stage: Stage-0
Fetch Operator
limit: -1
http://git-wip-us.apache.org/repos/asf/hive/blob/382dc208/ql/src/test/results/clientpositive/spark/constprog_partitioner.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/spark/constprog_partitioner.q.out b/ql/src/test/results/clientpositive/spark/constprog_partitioner.q.out
index 567c6d3..a40115c 100644
--- a/ql/src/test/results/clientpositive/spark/constprog_partitioner.q.out
+++ b/ql/src/test/results/clientpositive/spark/constprog_partitioner.q.out
@@ -95,7 +95,10 @@ STAGE PLANS:
Stage: Stage-1
Spark
Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 4), Map 3 (PARTITION-LEVEL SORT, 4)
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 4), Reducer 5 (PARTITION-LEVEL SORT, 4)
+ Reducer 4 <- Map 3 (PARTITION-LEVEL SORT, 4), Reducer 7 (PARTITION-LEVEL SORT, 4)
+ Reducer 5 <- Reducer 4 (GROUP, 4)
+ Reducer 7 <- Map 6 (GROUP, 4)
#### A masked pattern was here ####
Vertices:
Map 1
@@ -104,7 +107,7 @@ STAGE PLANS:
alias: li
Statistics: Num rows: 100 Data size: 11999 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: ((l_linenumber = 1) and l_orderkey is not null) (type: boolean)
+ predicate: (l_linenumber = 1) (type: boolean)
Statistics: Num rows: 50 Data size: 5999 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: l_orderkey (type: int), l_partkey (type: int), l_suppkey (type: int)
@@ -122,27 +125,42 @@ STAGE PLANS:
alias: lineitem
Statistics: Num rows: 100 Data size: 11999 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: ((l_shipmode = 'AIR') and (l_linenumber = 1) and l_orderkey is not null) (type: boolean)
- Statistics: Num rows: 25 Data size: 2999 Basic stats: COMPLETE Column stats: NONE
+ predicate: (l_shipmode = 'AIR') (type: boolean)
+ Statistics: Num rows: 50 Data size: 5999 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: l_orderkey (type: int), 1 (type: int)
+ expressions: l_orderkey (type: int), l_linenumber (type: int)
outputColumnNames: _col0, _col1
- Statistics: Num rows: 25 Data size: 2999 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- keys: _col0 (type: int), _col1 (type: int)
- mode: hash
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 25 Data size: 2999 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: int), _col1 (type: int)
- sort order: ++
- Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
- Statistics: Num rows: 25 Data size: 2999 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 50 Data size: 5999 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col1 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col1 (type: int)
+ Statistics: Num rows: 50 Data size: 5999 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col0 (type: int)
+ Map 6
+ Map Operator Tree:
+ TableScan
+ alias: li
+ Statistics: Num rows: 100 Data size: 11999 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: l_linenumber (type: int)
+ outputColumnNames: l_linenumber
+ Statistics: Num rows: 100 Data size: 11999 Basic stats: COMPLETE Column stats: NONE
+ Group By Operator
+ keys: l_linenumber (type: int)
+ mode: hash
+ outputColumnNames: _col0
+ Statistics: Num rows: 100 Data size: 11999 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: int)
+ Statistics: Num rows: 100 Data size: 11999 Basic stats: COMPLETE Column stats: NONE
Reducer 2
Reduce Operator Tree:
Join Operator
condition map:
- Left Semi Join 0 to 1
+ Inner Join 0 to 1
keys:
0 _col0 (type: int), 1 (type: int)
1 _col0 (type: int), _col1 (type: int)
@@ -159,6 +177,50 @@ STAGE PLANS:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ Reducer 4
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col1 (type: int)
+ 1 _col0 (type: int)
+ outputColumnNames: _col0, _col3
+ Statistics: Num rows: 55 Data size: 6598 Basic stats: COMPLETE Column stats: NONE
+ Group By Operator
+ keys: _col0 (type: int), _col3 (type: int)
+ mode: hash
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 55 Data size: 6598 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: int), _col1 (type: int)
+ sort order: ++
+ Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
+ Statistics: Num rows: 55 Data size: 6598 Basic stats: COMPLETE Column stats: NONE
+ Reducer 5
+ Reduce Operator Tree:
+ Group By Operator
+ keys: KEY._col0 (type: int), KEY._col1 (type: int)
+ mode: mergepartial
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 27 Data size: 3239 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: int), _col1 (type: int)
+ sort order: ++
+ Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
+ Statistics: Num rows: 27 Data size: 3239 Basic stats: COMPLETE Column stats: NONE
+ Reducer 7
+ Reduce Operator Tree:
+ Group By Operator
+ keys: KEY._col0 (type: int)
+ mode: mergepartial
+ outputColumnNames: _col0
+ Statistics: Num rows: 50 Data size: 5999 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: int)
+ Statistics: Num rows: 50 Data size: 5999 Basic stats: COMPLETE Column stats: NONE
Stage: Stage-0
Fetch Operator
http://git-wip-us.apache.org/repos/asf/hive/blob/382dc208/ql/src/test/results/clientpositive/spark/subquery_exists.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/spark/subquery_exists.q.out b/ql/src/test/results/clientpositive/spark/subquery_exists.q.out
index b58fcbe..c28a218 100644
--- a/ql/src/test/results/clientpositive/spark/subquery_exists.q.out
+++ b/ql/src/test/results/clientpositive/spark/subquery_exists.q.out
@@ -32,7 +32,10 @@ STAGE PLANS:
Stage: Stage-1
Spark
Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 2), Map 3 (PARTITION-LEVEL SORT, 2)
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 2), Reducer 5 (PARTITION-LEVEL SORT, 2)
+ Reducer 4 <- Map 3 (PARTITION-LEVEL SORT, 2), Reducer 7 (PARTITION-LEVEL SORT, 2)
+ Reducer 5 <- Reducer 4 (GROUP, 2)
+ Reducer 7 <- Map 6 (GROUP, 2)
#### A masked pattern was here ####
Vertices:
Map 1
@@ -40,8 +43,22 @@ STAGE PLANS:
TableScan
alias: b
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: key (type: string), value (type: string)
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: string), _col1 (type: string)
+ sort order: ++
+ Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
+ Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+ Map 3
+ Map Operator Tree:
+ TableScan
+ alias: a
+ Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: ((value > 'val_9') and key is not null) (type: boolean)
+ predicate: (value > 'val_9') (type: boolean)
Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: key (type: string), value (type: string)
@@ -52,45 +69,86 @@ STAGE PLANS:
sort order: ++
Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
- Map 3
+ Map 6
Map Operator Tree:
TableScan
- alias: a
+ alias: b
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: ((value > 'val_9') and key is not null) (type: boolean)
- Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: key (type: string), value (type: string)
+ Select Operator
+ expressions: key (type: string), value (type: string)
+ outputColumnNames: key, value
+ Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+ Group By Operator
+ keys: key (type: string), value (type: string)
+ mode: hash
outputColumnNames: _col0, _col1
- Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- keys: _col0 (type: string), _col1 (type: string)
- mode: hash
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: string), _col1 (type: string)
- sort order: ++
- Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
- Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: string), _col1 (type: string)
+ sort order: ++
+ Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
+ Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
Reducer 2
Reduce Operator Tree:
Join Operator
condition map:
- Left Semi Join 0 to 1
+ Inner Join 0 to 1
keys:
0 _col0 (type: string), _col1 (type: string)
1 _col0 (type: string), _col1 (type: string)
outputColumnNames: _col0, _col1
- Statistics: Num rows: 182 Data size: 1939 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
- Statistics: Num rows: 182 Data size: 1939 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ Reducer 4
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col0 (type: string), _col1 (type: string)
+ 1 _col0 (type: string), _col1 (type: string)
+ outputColumnNames: _col2, _col3
+ Statistics: Num rows: 275 Data size: 2921 Basic stats: COMPLETE Column stats: NONE
+ Group By Operator
+ keys: _col2 (type: string), _col3 (type: string)
+ mode: hash
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 275 Data size: 2921 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: string), _col1 (type: string)
+ sort order: ++
+ Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
+ Statistics: Num rows: 275 Data size: 2921 Basic stats: COMPLETE Column stats: NONE
+ Reducer 5
+ Reduce Operator Tree:
+ Group By Operator
+ keys: KEY._col0 (type: string), KEY._col1 (type: string)
+ mode: mergepartial
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 137 Data size: 1455 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: string), _col1 (type: string)
+ sort order: ++
+ Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
+ Statistics: Num rows: 137 Data size: 1455 Basic stats: COMPLETE Column stats: NONE
+ Reducer 7
+ Reduce Operator Tree:
+ Group By Operator
+ keys: KEY._col0 (type: string), KEY._col1 (type: string)
+ mode: mergepartial
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: string), _col1 (type: string)
+ sort order: ++
+ Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
+ Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE
Stage: Stage-0
Fetch Operator
@@ -237,7 +295,10 @@ STAGE PLANS:
Stage: Stage-1
Spark
Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 2), Map 3 (PARTITION-LEVEL SORT, 2)
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 2), Reducer 5 (PARTITION-LEVEL SORT, 2)
+ Reducer 4 <- Map 3 (PARTITION-LEVEL SORT, 2), Reducer 7 (PARTITION-LEVEL SORT, 2)
+ Reducer 5 <- Reducer 4 (GROUP, 2)
+ Reducer 7 <- Map 6 (GROUP, 2)
#### A masked pattern was here ####
Vertices:
Map 1
@@ -245,46 +306,54 @@ STAGE PLANS:
TableScan
alias: b
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: value is not null (type: boolean)
+ Select Operator
+ expressions: key (type: string), value (type: string)
+ outputColumnNames: _col0, _col1
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: key (type: string), value (type: string)
- outputColumnNames: _col0, _col1
+ Reduce Output Operator
+ key expressions: _col1 (type: string)
+ sort order: +
+ Map-reduce partition columns: _col1 (type: string)
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col1 (type: string)
- sort order: +
- Map-reduce partition columns: _col1 (type: string)
- Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col0 (type: string)
+ value expressions: _col0 (type: string)
Map 3
Map Operator Tree:
TableScan
alias: a
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: value is not null (type: boolean)
+ Select Operator
+ expressions: value (type: string)
+ outputColumnNames: _col0
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: value (type: string)
+ Reduce Output Operator
+ key expressions: _col0 (type: string)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: string)
+ Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+ Map 6
+ Map Operator Tree:
+ TableScan
+ alias: b
+ Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: value (type: string)
+ outputColumnNames: value
+ Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+ Group By Operator
+ keys: value (type: string)
+ mode: hash
outputColumnNames: _col0
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- keys: _col0 (type: string)
- mode: hash
- outputColumnNames: _col0
+ Reduce Output Operator
+ key expressions: _col0 (type: string)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: string)
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: string)
- sort order: +
- Map-reduce partition columns: _col0 (type: string)
- Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
Reducer 2
Reduce Operator Tree:
Join Operator
condition map:
- Left Semi Join 0 to 1
+ Inner Join 0 to 1
keys:
0 _col1 (type: string)
1 _col0 (type: string)
@@ -297,6 +366,50 @@ STAGE PLANS:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ Reducer 4
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col0 (type: string)
+ 1 _col0 (type: string)
+ outputColumnNames: _col1
+ Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
+ Group By Operator
+ keys: _col1 (type: string)
+ mode: hash
+ outputColumnNames: _col0
+ Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: string)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: string)
+ Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
+ Reducer 5
+ Reduce Operator Tree:
+ Group By Operator
+ keys: KEY._col0 (type: string)
+ mode: mergepartial
+ outputColumnNames: _col0
+ Statistics: Num rows: 275 Data size: 2921 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: string)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: string)
+ Statistics: Num rows: 275 Data size: 2921 Basic stats: COMPLETE Column stats: NONE
+ Reducer 7
+ Reduce Operator Tree:
+ Group By Operator
+ keys: KEY._col0 (type: string)
+ mode: mergepartial
+ outputColumnNames: _col0
+ Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: string)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: string)
+ Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE
Stage: Stage-0
Fetch Operator