You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hive.apache.org by vg...@apache.org on 2017/11/07 06:27:22 UTC
[03/17] hive git commit: HIVE-17767 Rewrite correlated EXISTS/IN
subqueries into LEFT SEMI JOIN (Vineet Garg, reviewed by Ashutosh Chauhan)
http://git-wip-us.apache.org/repos/asf/hive/blob/aee0eaa0/ql/src/test/results/clientpositive/subquery_exists.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/subquery_exists.q.out b/ql/src/test/results/clientpositive/subquery_exists.q.out
index c9f2a79..b6b31aa 100644
--- a/ql/src/test/results/clientpositive/subquery_exists.q.out
+++ b/ql/src/test/results/clientpositive/subquery_exists.q.out
@@ -27,35 +27,38 @@ STAGE PLANS:
TableScan
alias: b
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: key (type: string), value (type: string)
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: string), _col1 (type: string)
- sort order: ++
- Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
- Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+ Filter Operator
+ predicate: ((value > 'val_9') and key is not null) (type: boolean)
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: key (type: string), value (type: string)
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: string), _col1 (type: string)
+ sort order: ++
+ Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
TableScan
alias: a
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: ((key = key) and (value = value) and (value > 'val_9')) (type: boolean)
- Statistics: Num rows: 41 Data size: 435 Basic stats: COMPLETE Column stats: NONE
+ predicate: ((value > 'val_9') and key is not null) (type: boolean)
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: key (type: string), value (type: string)
outputColumnNames: _col0, _col1
- Statistics: Num rows: 41 Data size: 435 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
Group By Operator
keys: _col0 (type: string), _col1 (type: string)
mode: hash
outputColumnNames: _col0, _col1
- Statistics: Num rows: 41 Data size: 435 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string), _col1 (type: string)
sort order: ++
Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
- Statistics: Num rows: 41 Data size: 435 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
Reduce Operator Tree:
Join Operator
condition map:
@@ -64,10 +67,10 @@ STAGE PLANS:
0 _col0 (type: string), _col1 (type: string)
1 _col0 (type: string), _col1 (type: string)
outputColumnNames: _col0, _col1
- Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 182 Data size: 1939 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
- Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 182 Data size: 1939 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -217,16 +220,19 @@ STAGE PLANS:
TableScan
alias: b
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: key (type: string), value (type: string)
- outputColumnNames: _col0, _col1
+ Filter Operator
+ predicate: value is not null (type: boolean)
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col1 (type: string)
- sort order: +
- Map-reduce partition columns: _col1 (type: string)
+ Select Operator
+ expressions: key (type: string), value (type: string)
+ outputColumnNames: _col0, _col1
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col0 (type: string)
+ Reduce Output Operator
+ key expressions: _col1 (type: string)
+ sort order: +
+ Map-reduce partition columns: _col1 (type: string)
+ Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col0 (type: string)
TableScan
alias: a
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
@@ -1036,13 +1042,13 @@ POSTHOOK: type: CREATETABLE
POSTHOOK: Output: database:default
POSTHOOK: Output: default@tx1
PREHOOK: query: insert into tx1 values (1, 1),
- (1, 2),
- (1, 3)
+ (1, 2),
+ (1, 3)
PREHOOK: type: QUERY
PREHOOK: Output: default@tx1
POSTHOOK: query: insert into tx1 values (1, 1),
- (1, 2),
- (1, 3)
+ (1, 2),
+ (1, 3)
POSTHOOK: type: QUERY
POSTHOOK: Output: default@tx1
POSTHOOK: Lineage: tx1.a EXPRESSION [(values__tmp__table__3)values__tmp__table__3.FieldSchema(name:tmp_values_col1, type:string, comment:), ]
@@ -1065,137 +1071,74 @@ POSTHOOK: query: explain select count(*) as result,3 as expected from tx1 u
where exists (select * from tx1 v where u.a=v.a and u.b <> v.b)
POSTHOOK: type: QUERY
STAGE DEPENDENCIES:
- Stage-4 is a root stage
- Stage-3 depends on stages: Stage-4
- Stage-1 depends on stages: Stage-3
+ Stage-1 is a root stage
Stage-2 depends on stages: Stage-1
Stage-0 depends on stages: Stage-2
STAGE PLANS:
- Stage: Stage-4
+ Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: u
Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: a (type: int), b (type: int)
- outputColumnNames: a, b
+ Filter Operator
+ predicate: a is not null (type: boolean)
Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- keys: a (type: int), b (type: int)
- mode: hash
+ Select Operator
+ expressions: a (type: int), b (type: int)
outputColumnNames: _col0, _col1
Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
- key expressions: _col0 (type: int), _col1 (type: int)
- sort order: ++
- Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
+ key expressions: _col0 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- Reduce Operator Tree:
- Group By Operator
- keys: KEY._col0 (type: int), KEY._col1 (type: int)
- mode: mergepartial
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- table:
- input format: org.apache.hadoop.mapred.SequenceFileInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
-
- Stage: Stage-3
- Map Reduce
- Map Operator Tree:
+ value expressions: _col1 (type: int)
TableScan
alias: v
Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: a (type: int), b (type: int)
- outputColumnNames: _col0, _col1
+ Filter Operator
+ predicate: (a is not null and b is not null) (type: boolean)
Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: int)
- sort order: +
- Map-reduce partition columns: _col0 (type: int)
+ Select Operator
+ expressions: a (type: int), b (type: int)
+ outputColumnNames: _col0, _col1
Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: int)
- TableScan
- Reduce Output Operator
- key expressions: _col0 (type: int)
- sort order: +
- Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: int)
+ Group By Operator
+ keys: _col0 (type: int), _col1 (type: int)
+ mode: hash
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: int)
+ Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col1 (type: int)
Reduce Operator Tree:
Join Operator
condition map:
- Inner Join 0 to 1
+ Left Semi Join 0 to 1
keys:
0 _col0 (type: int)
1 _col0 (type: int)
- outputColumnNames: _col1, _col2, _col3
+ outputColumnNames: _col1, _col3
+ residual filter predicates: {(_col1 <> _col3)}
Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: (_col3 <> _col1) (type: boolean)
- Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col2 (type: int), _col3 (type: int)
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- keys: _col0 (type: int), _col1 (type: int)
- mode: hash
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- table:
- input format: org.apache.hadoop.mapred.SequenceFileInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
-
- Stage: Stage-1
- Map Reduce
- Map Operator Tree:
- TableScan
- alias: u
+ Select Operator
Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: a (type: int), b (type: int)
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: int), _col1 (type: int)
- sort order: ++
- Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
- Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- TableScan
- Reduce Output Operator
- key expressions: _col0 (type: int), _col1 (type: int)
- sort order: ++
- Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
- Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- Reduce Operator Tree:
- Join Operator
- condition map:
- Left Semi Join 0 to 1
- keys:
- 0 _col0 (type: int), _col1 (type: int)
- 1 _col0 (type: int), _col1 (type: int)
- Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- aggregations: count()
- mode: hash
- outputColumnNames: _col0
- Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- table:
- input format: org.apache.hadoop.mapred.SequenceFileInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+ Group By Operator
+ aggregations: count()
+ mode: hash
+ outputColumnNames: _col0
+ Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Stage: Stage-2
Map Reduce
@@ -1269,135 +1212,76 @@ POSTHOOK: type: QUERY
POSTHOOK: Output: default@t2
POSTHOOK: Lineage: t2.i EXPRESSION [(values__tmp__table__5)values__tmp__table__5.FieldSchema(name:tmp_values_col1, type:string, comment:), ]
POSTHOOK: Lineage: t2.j EXPRESSION [(values__tmp__table__5)values__tmp__table__5.FieldSchema(name:tmp_values_col2, type:string, comment:), ]
-Warning: Shuffle Join JOIN[12][tables = [$hdt$_1, $hdt$_2]] in Stage 'Stage-2:MAPRED' is a cross product
PREHOOK: query: explain select * from t1 where t1.i in (select t2.i from t2 where t2.j <> t1.j)
PREHOOK: type: QUERY
POSTHOOK: query: explain select * from t1 where t1.i in (select t2.i from t2 where t2.j <> t1.j)
POSTHOOK: type: QUERY
STAGE DEPENDENCIES:
- Stage-3 is a root stage
- Stage-2 depends on stages: Stage-3
- Stage-1 depends on stages: Stage-2
+ Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-3
+ Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: t1
Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: j (type: int)
- outputColumnNames: j
+ Filter Operator
+ predicate: i is not null (type: boolean)
Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- keys: j (type: int)
- mode: hash
- outputColumnNames: _col0
+ Select Operator
+ expressions: i (type: int), j (type: int)
+ outputColumnNames: _col0, _col1
Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
- Reduce Operator Tree:
- Group By Operator
- keys: KEY._col0 (type: int)
- mode: mergepartial
- outputColumnNames: _col0
- Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- table:
- input format: org.apache.hadoop.mapred.SequenceFileInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
-
- Stage: Stage-2
- Map Reduce
- Map Operator Tree:
+ value expressions: _col1 (type: int)
TableScan
alias: t2
Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: i (type: int), j (type: int)
- outputColumnNames: _col0, _col1
+ Filter Operator
+ predicate: (i is not null and j is not null) (type: boolean)
Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- sort order:
- Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col0 (type: int), _col1 (type: int)
- TableScan
- Reduce Output Operator
- sort order:
- Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col0 (type: int)
- Reduce Operator Tree:
- Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0
- 1
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 3 Data size: 21 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: (_col1 <> _col2) (type: boolean)
- Statistics: Num rows: 3 Data size: 21 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: int), _col2 (type: int)
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 3 Data size: 21 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- keys: _col0 (type: int), _col1 (type: int)
- mode: hash
+ Select Operator
+ expressions: i (type: int), j (type: int)
outputColumnNames: _col0, _col1
- Statistics: Num rows: 3 Data size: 21 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- table:
- input format: org.apache.hadoop.mapred.SequenceFileInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
-
- Stage: Stage-1
- Map Reduce
- Map Operator Tree:
- TableScan
- alias: t1
- Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: i (type: int), j (type: int)
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: int), _col1 (type: int)
- sort order: ++
- Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
- Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
- TableScan
- Reduce Output Operator
- key expressions: _col0 (type: int), _col1 (type: int)
- sort order: ++
- Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
- Statistics: Num rows: 3 Data size: 21 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
+ Group By Operator
+ keys: _col0 (type: int), _col1 (type: int)
+ mode: hash
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: int)
+ Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col1 (type: int)
Reduce Operator Tree:
Join Operator
condition map:
Left Semi Join 0 to 1
keys:
- 0 _col0 (type: int), _col1 (type: int)
- 1 _col0 (type: int), _col1 (type: int)
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 3 Data size: 23 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- Statistics: Num rows: 3 Data size: 23 Basic stats: COMPLETE Column stats: NONE
- table:
- input format: org.apache.hadoop.mapred.SequenceFileInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ 0 _col0 (type: int)
+ 1 _col0 (type: int)
+ outputColumnNames: _col0, _col1, _col3
+ residual filter predicates: {(_col1 <> _col3)}
+ Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: int), _col1 (type: int)
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ Statistics: Num rows: 3 Data size: 9 Basic stats: COMPLETE Column stats: NONE
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
@@ -1405,7 +1289,6 @@ STAGE PLANS:
Processor Tree:
ListSink
-Warning: Shuffle Join JOIN[12][tables = [$hdt$_1, $hdt$_2]] in Stage 'Stage-2:MAPRED' is a cross product
PREHOOK: query: select * from t1 where t1.i in (select t2.i from t2 where t2.j <> t1.j)
PREHOOK: type: QUERY
PREHOOK: Input: default@t1
http://git-wip-us.apache.org/repos/asf/hive/blob/aee0eaa0/ql/src/test/results/clientpositive/subquery_exists_having.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/subquery_exists_having.q.out b/ql/src/test/results/clientpositive/subquery_exists_having.q.out
index 2c41ff6..ef06dfe 100644
--- a/ql/src/test/results/clientpositive/subquery_exists_having.q.out
+++ b/ql/src/test/results/clientpositive/subquery_exists_having.q.out
@@ -30,9 +30,8 @@ STAGE PLANS:
TableScan
alias: b
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: key (type: string)
- outputColumnNames: key
+ Filter Operator
+ predicate: key is not null (type: boolean)
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
Group By Operator
aggregations: count()
@@ -74,22 +73,22 @@ STAGE PLANS:
alias: a
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: ((key = key) and (value > 'val_9')) (type: boolean)
- Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE
+ predicate: ((value > 'val_9') and key is not null) (type: boolean)
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: key (type: string)
outputColumnNames: _col0
- Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
Group By Operator
keys: _col0 (type: string)
mode: hash
outputColumnNames: _col0
- Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string)
sort order: +
Map-reduce partition columns: _col0 (type: string)
- Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
Reduce Operator Tree:
Join Operator
condition map:
@@ -172,9 +171,8 @@ STAGE PLANS:
TableScan
alias: b
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: key (type: string)
- outputColumnNames: key
+ Filter Operator
+ predicate: key is not null (type: boolean)
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
Group By Operator
aggregations: count()
@@ -192,33 +190,33 @@ STAGE PLANS:
alias: a
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: ((key = key) and (value > 'val_9')) (type: boolean)
- Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE
+ predicate: ((value > 'val_9') and key is not null) (type: boolean)
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: key (type: string)
outputColumnNames: _col0
- Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
Group By Operator
keys: _col0 (type: string)
mode: hash
outputColumnNames: _col0
- Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string)
sort order: +
Map-reduce partition columns: _col0 (type: string)
- Statistics: Num rows: 83 Data size: 881 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 166 Data size: 1763 Basic stats: COMPLETE Column stats: NONE
Reduce Operator Tree:
Demux Operator
- Statistics: Num rows: 583 Data size: 6193 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 666 Data size: 7075 Basic stats: COMPLETE Column stats: NONE
Group By Operator
aggregations: count(VALUE._col0)
keys: KEY._col0 (type: string)
mode: mergepartial
outputColumnNames: _col0, _col1
- Statistics: Num rows: 291 Data size: 3091 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 333 Data size: 3537 Basic stats: COMPLETE Column stats: NONE
Mux Operator
- Statistics: Num rows: 874 Data size: 9284 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 999 Data size: 10612 Basic stats: COMPLETE Column stats: NONE
Join Operator
condition map:
Left Semi Join 0 to 1
@@ -235,7 +233,7 @@ STAGE PLANS:
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Mux Operator
- Statistics: Num rows: 874 Data size: 9284 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 999 Data size: 10612 Basic stats: COMPLETE Column stats: NONE
Join Operator
condition map:
Left Semi Join 0 to 1