Posted to commits@hive.apache.org by kg...@apache.org on 2019/03/14 06:56:08 UTC
[hive] branch master updated (c1c3461 -> a2892cd)
This is an automated email from the ASF dual-hosted git repository.
kgyrtkirk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.
from c1c3461 HIVE-21424: Disable AggregateStatsCache by default (Zoltan Haindrich reviewed by Ashutosh Chauhan)
new b2aa694 HIVE-21398: Columns which has estimated statistics should not be considered as unique keys (Zoltan Haindrich reviewed by Ashutosh Chauhan)
new a2892cd HIVE-21440: Fix test_teradatabinaryfile to not run into stackoverflows (Zoltan Haindrich reviewed by Sankar Hariappan)
The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
Summary of changes:
pom.xml | 2 +-
.../calcite/stats/EstimateUniqueKeys.java | 26 +-
.../runtime_skewjoin_mapjoin_spark.q.out | 284 ++++++++++-----------
.../spark/runtime_skewjoin_mapjoin_spark.q.out | 44 ++--
.../spark/spark_dynamic_partition_pruning_6.q.out | 124 ++++-----
5 files changed, 240 insertions(+), 240 deletions(-)
[hive] 02/02: HIVE-21440: Fix test_teradatabinaryfile to not run
into stackoverflows (Zoltan Haindrich reviewed by Sankar Hariappan)
Posted by kg...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
kgyrtkirk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git
commit a2892cda723580520450edc58c2ce8635e5666de
Author: Zoltan Haindrich <ki...@rxd.hu>
AuthorDate: Thu Mar 14 07:54:41 2019 +0100
HIVE-21440: Fix test_teradatabinaryfile to not run into stackoverflows (Zoltan Haindrich reviewed by Sankar Hariappan)
Signed-off-by: Zoltan Haindrich <ki...@rxd.hu>
---
pom.xml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/pom.xml b/pom.xml
index d84b402..e58f925 100644
--- a/pom.xml
+++ b/pom.xml
@@ -183,7 +183,7 @@
<jodd.version>3.5.2</jodd.version>
<json.version>1.8</json.version>
<junit.version>4.11</junit.version>
- <kryo.version>3.0.3</kryo.version>
+ <kryo.version>4.0.2</kryo.version>
<libfb303.version>0.9.3</libfb303.version>
<libthrift.version>0.9.3</libthrift.version>
<log4j2.version>2.10.0</log4j2.version>
[hive] 01/02: HIVE-21398: Columns which has estimated statistics
should not be considered as unique keys (Zoltan Haindrich reviewed by
Ashutosh Chauhan)
Posted by kg...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
kgyrtkirk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git
commit b2aa6943b561a3b7691bed9fee7a1c337e99b253
Author: Zoltan Haindrich <ki...@rxd.hu>
AuthorDate: Thu Mar 14 07:53:49 2019 +0100
HIVE-21398: Columns which has estimated statistics should not be considered as unique keys (Zoltan Haindrich reviewed by Ashutosh Chauhan)
Signed-off-by: Zoltan Haindrich <ki...@rxd.hu>
---
.../calcite/stats/EstimateUniqueKeys.java | 26 +-
.../runtime_skewjoin_mapjoin_spark.q.out | 284 ++++++++++-----------
.../spark/runtime_skewjoin_mapjoin_spark.q.out | 44 ++--
.../spark/spark_dynamic_partition_pruning_6.q.out | 124 ++++-----
4 files changed, 239 insertions(+), 239 deletions(-)
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
index 5ef945c..4aba098 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/EstimateUniqueKeys.java
@@ -102,19 +102,19 @@ public final class EstimateUniqueKeys {
colStatsPos = 0;
for (ColStatistics cStat : colStats) {
boolean isKey = false;
- if (cStat.getCountDistint() >= numRows) {
- isKey = true;
- }
- if (!isKey && cStat.getRange() != null &&
- cStat.getRange().maxValue != null &&
- cStat.getRange().minValue != null) {
- double r = cStat.getRange().maxValue.doubleValue() -
- cStat.getRange().minValue.doubleValue() + 1;
- isKey = (Math.abs(numRows - r) < RelOptUtil.EPSILON);
- }
- if (isKey) {
- ImmutableBitSet key = ImmutableBitSet.of(posMap.get(colStatsPos));
- keys.add(key);
+ if (!cStat.isEstimated()) {
+ if (cStat.getCountDistint() >= numRows) {
+ isKey = true;
+ }
+ if (!isKey && cStat.getRange() != null && cStat.getRange().maxValue != null
+ && cStat.getRange().minValue != null) {
+ double r = cStat.getRange().maxValue.doubleValue() - cStat.getRange().minValue.doubleValue() + 1;
+ isKey = (Math.abs(numRows - r) < RelOptUtil.EPSILON);
+ }
+ if (isKey) {
+ ImmutableBitSet key = ImmutableBitSet.of(posMap.get(colStatsPos));
+ keys.add(key);
+ }
}
colStatsPos++;
}
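The hunk above wraps the existing unique-key heuristic in an `if (!cStat.isEstimated())` guard: a column is only promoted to a unique key when its statistics were actually computed, not estimated. A minimal self-contained sketch of that heuristic (simplified stand-ins for Hive's `ColStatistics` and `RelOptUtil.EPSILON`, not the real classes) looks like this:

```java
// Sketch of the unique-key heuristic guarded by HIVE-21398.
// ColStats is a hypothetical stand-in for Hive's ColStatistics.
public class UniqueKeyHeuristic {
    static final double EPSILON = 1.0e-5; // stand-in for RelOptUtil.EPSILON

    static class ColStats {
        final long countDistinct;
        final Double min, max;      // range bounds; may be absent (null)
        final boolean estimated;    // true when the stats were guessed
        ColStats(long cd, Double min, Double max, boolean est) {
            this.countDistinct = cd;
            this.min = min;
            this.max = max;
            this.estimated = est;
        }
    }

    /** Returns true when the column can be treated as a unique key. */
    static boolean isUniqueKey(ColStats s, long numRows) {
        if (s.estimated) {
            return false; // HIVE-21398: estimated stats never qualify
        }
        if (s.countDistinct >= numRows) {
            return true;  // at least one distinct value per row
        }
        if (s.min != null && s.max != null) {
            // A dense integer range exactly as wide as the row count
            // also implies uniqueness.
            double r = s.max - s.min + 1;
            return Math.abs(numRows - r) < EPSILON;
        }
        return false;
    }

    public static void main(String[] args) {
        // Computed stats, 100 distinct values over 100 rows: unique key.
        System.out.println(isUniqueKey(new ColStats(100L, null, null, false), 100L));
        // Same numbers but estimated: rejected after this commit.
        System.out.println(isUniqueKey(new ColStats(100L, null, null, true), 100L));
    }
}
```

The plan-file churn in the rest of this commit follows from that guard: once estimated columns stop being reported as keys, join cardinality estimates change, which reshuffles stage numbering and row-count annotations in the golden `.q.out` outputs.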
diff --git a/ql/src/test/results/clientpositive/runtime_skewjoin_mapjoin_spark.q.out b/ql/src/test/results/clientpositive/runtime_skewjoin_mapjoin_spark.q.out
index 29dec2d..1883790 100644
--- a/ql/src/test/results/clientpositive/runtime_skewjoin_mapjoin_spark.q.out
+++ b/ql/src/test/results/clientpositive/runtime_skewjoin_mapjoin_spark.q.out
@@ -35,10 +35,12 @@ POSTHOOK: Input: default@src
POSTHOOK: Input: default@t1_n94
#### A masked pattern was here ####
STAGE DEPENDENCIES:
- Stage-12 is a root stage , consists of Stage-16, Stage-17, Stage-1
+ Stage-18 is a root stage
+ Stage-13 depends on stages: Stage-18
+ Stage-12 depends on stages: Stage-13 , consists of Stage-16, Stage-17, Stage-1
Stage-16 has a backup stage: Stage-1
Stage-10 depends on stages: Stage-16
- Stage-9 depends on stages: Stage-1, Stage-10, Stage-11, Stage-13 , consists of Stage-14, Stage-15, Stage-2
+ Stage-9 depends on stages: Stage-1, Stage-10, Stage-11 , consists of Stage-14, Stage-15, Stage-2
Stage-14 has a backup stage: Stage-2
Stage-7 depends on stages: Stage-14
Stage-3 depends on stages: Stage-2, Stage-7, Stage-8
@@ -48,24 +50,38 @@ STAGE DEPENDENCIES:
Stage-17 has a backup stage: Stage-1
Stage-11 depends on stages: Stage-17
Stage-1
- Stage-18 is a root stage
- Stage-13 depends on stages: Stage-18
Stage-0 depends on stages: Stage-3
STAGE PLANS:
- Stage: Stage-12
- Conditional Operator
-
- Stage: Stage-16
+ Stage: Stage-18
Map Reduce Local Work
Alias -> Map Local Tables:
- $hdt$_1:src2
+ $hdt$_2:$hdt$_3:t1_n94
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
- $hdt$_1:src2
+ $hdt$_2:$hdt$_3:t1_n94
TableScan
- alias: src2
+ alias: t1_n94
+ filterExpr: key is not null (type: boolean)
+ Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE Column stats: NONE
+ Filter Operator
+ predicate: key is not null (type: boolean)
+ Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: key (type: string)
+ outputColumnNames: _col0
+ Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE Column stats: NONE
+ HashTable Sink Operator
+ keys:
+ 0 _col0 (type: string)
+ 1 _col0 (type: string)
+
+ Stage: Stage-13
+ Map Reduce
+ Map Operator Tree:
+ TableScan
+ alias: src
filterExpr: key is not null (type: boolean)
Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
Filter Operator
@@ -75,10 +91,40 @@ STAGE PLANS:
expressions: key (type: string)
outputColumnNames: _col0
Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
- HashTable Sink Operator
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
keys:
0 _col0 (type: string)
1 _col0 (type: string)
+ outputColumnNames: _col0
+ Statistics: Num rows: 550 Data size: 47850 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+ Execution mode: vectorized
+ Local Work:
+ Map Reduce Local Work
+
+ Stage: Stage-12
+ Conditional Operator
+
+ Stage: Stage-16
+ Map Reduce Local Work
+ Alias -> Map Local Tables:
+ $INTNAME
+ Fetch Operator
+ limit: -1
+ Alias -> Map Local Operator Tree:
+ $INTNAME
+ TableScan
+ HashTable Sink Operator
+ keys:
+ 0 _col0 (type: string)
+ 1 _col0 (type: string)
Stage: Stage-10
Map Reduce
@@ -101,7 +147,7 @@ STAGE PLANS:
0 _col0 (type: string)
1 _col0 (type: string)
outputColumnNames: _col0
- Statistics: Num rows: 791 Data size: 68817 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 605 Data size: 52635 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
table:
@@ -118,16 +164,26 @@ STAGE PLANS:
Stage: Stage-14
Map Reduce Local Work
Alias -> Map Local Tables:
- $INTNAME1
+ $hdt$_0:src2
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
- $INTNAME1
+ $hdt$_0:src2
TableScan
- HashTable Sink Operator
- keys:
- 0 _col0 (type: string)
- 1 _col0 (type: string)
+ alias: src2
+ filterExpr: key is not null (type: boolean)
+ Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+ Filter Operator
+ predicate: key is not null (type: boolean)
+ Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: key (type: string)
+ outputColumnNames: _col0
+ Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+ HashTable Sink Operator
+ keys:
+ 0 _col0 (type: string)
+ 1 _col0 (type: string)
Stage: Stage-7
Map Reduce
@@ -139,7 +195,7 @@ STAGE PLANS:
keys:
0 _col0 (type: string)
1 _col0 (type: string)
- Statistics: Num rows: 870 Data size: 75698 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 665 Data size: 57898 Basic stats: COMPLETE Column stats: NONE
Group By Operator
aggregations: count()
mode: hash
@@ -196,24 +252,34 @@ STAGE PLANS:
Map Reduce
Map Operator Tree:
TableScan
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0 _col0 (type: string)
- 1 _col0 (type: string)
- Statistics: Num rows: 870 Data size: 75698 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- aggregations: count()
- mode: hash
+ alias: src2
+ filterExpr: key is not null (type: boolean)
+ Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+ Filter Operator
+ predicate: key is not null (type: boolean)
+ Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: key (type: string)
outputColumnNames: _col0
- Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- table:
- input format: org.apache.hadoop.mapred.SequenceFileInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+ Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col0 (type: string)
+ 1 _col0 (type: string)
+ Statistics: Num rows: 665 Data size: 57898 Basic stats: COMPLETE Column stats: NONE
+ Group By Operator
+ aggregations: count()
+ mode: hash
+ outputColumnNames: _col0
+ Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Execution mode: vectorized
Local Work:
Map Reduce Local Work
@@ -226,13 +292,23 @@ STAGE PLANS:
key expressions: _col0 (type: string)
sort order: +
Map-reduce partition columns: _col0 (type: string)
- Statistics: Num rows: 791 Data size: 68817 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 605 Data size: 52635 Basic stats: COMPLETE Column stats: NONE
TableScan
- Reduce Output Operator
- key expressions: _col0 (type: string)
- sort order: +
- Map-reduce partition columns: _col0 (type: string)
- Statistics: Num rows: 550 Data size: 47850 Basic stats: COMPLETE Column stats: NONE
+ alias: src2
+ filterExpr: key is not null (type: boolean)
+ Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+ Filter Operator
+ predicate: key is not null (type: boolean)
+ Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: key (type: string)
+ outputColumnNames: _col0
+ Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ key expressions: _col0 (type: string)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: string)
+ Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
Reduce Operator Tree:
Join Operator
condition map:
@@ -240,7 +316,7 @@ STAGE PLANS:
keys:
0 _col0 (type: string)
1 _col0 (type: string)
- Statistics: Num rows: 870 Data size: 75698 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 665 Data size: 57898 Basic stats: COMPLETE Column stats: NONE
Group By Operator
aggregations: count()
mode: hash
@@ -256,11 +332,11 @@ STAGE PLANS:
Stage: Stage-17
Map Reduce Local Work
Alias -> Map Local Tables:
- $hdt$_0:src1
+ $hdt$_1:src1
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
- $hdt$_0:src1
+ $hdt$_1:src1
TableScan
alias: src1
filterExpr: key is not null (type: boolean)
@@ -281,30 +357,20 @@ STAGE PLANS:
Map Reduce
Map Operator Tree:
TableScan
- alias: src2
- filterExpr: key is not null (type: boolean)
- Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
- Filter Operator
- predicate: key is not null (type: boolean)
- Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
- Select Operator
- expressions: key (type: string)
- outputColumnNames: _col0
- Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0 _col0 (type: string)
- 1 _col0 (type: string)
- outputColumnNames: _col0
- Statistics: Num rows: 791 Data size: 68817 Basic stats: COMPLETE Column stats: COMPLETE
- File Output Operator
- compressed: false
- table:
- input format: org.apache.hadoop.mapred.SequenceFileInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col0 (type: string)
+ 1 _col0 (type: string)
+ outputColumnNames: _col0
+ Statistics: Num rows: 605 Data size: 52635 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Execution mode: vectorized
Local Work:
Map Reduce Local Work
@@ -329,21 +395,11 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: string)
Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
TableScan
- alias: src2
- filterExpr: key is not null (type: boolean)
- Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
- Filter Operator
- predicate: key is not null (type: boolean)
- Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
- Select Operator
- expressions: key (type: string)
- outputColumnNames: _col0
- Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
- Reduce Output Operator
- key expressions: _col0 (type: string)
- sort order: +
- Map-reduce partition columns: _col0 (type: string)
- Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ key expressions: _col0 (type: string)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: string)
+ Statistics: Num rows: 550 Data size: 47850 Basic stats: COMPLETE Column stats: NONE
Reduce Operator Tree:
Join Operator
condition map:
@@ -352,7 +408,7 @@ STAGE PLANS:
0 _col0 (type: string)
1 _col0 (type: string)
outputColumnNames: _col0
- Statistics: Num rows: 791 Data size: 68817 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 605 Data size: 52635 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
table:
@@ -360,62 +416,6 @@ STAGE PLANS:
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
- Stage: Stage-18
- Map Reduce Local Work
- Alias -> Map Local Tables:
- $hdt$_2:$hdt$_3:t1_n94
- Fetch Operator
- limit: -1
- Alias -> Map Local Operator Tree:
- $hdt$_2:$hdt$_3:t1_n94
- TableScan
- alias: t1_n94
- filterExpr: key is not null (type: boolean)
- Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: key is not null (type: boolean)
- Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: key (type: string)
- outputColumnNames: _col0
- Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE Column stats: NONE
- HashTable Sink Operator
- keys:
- 0 _col0 (type: string)
- 1 _col0 (type: string)
-
- Stage: Stage-13
- Map Reduce
- Map Operator Tree:
- TableScan
- alias: src
- filterExpr: key is not null (type: boolean)
- Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
- Filter Operator
- predicate: key is not null (type: boolean)
- Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
- Select Operator
- expressions: key (type: string)
- outputColumnNames: _col0
- Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0 _col0 (type: string)
- 1 _col0 (type: string)
- outputColumnNames: _col0
- Statistics: Num rows: 550 Data size: 47850 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- table:
- input format: org.apache.hadoop.mapred.SequenceFileInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
- Execution mode: vectorized
- Local Work:
- Map Reduce Local Work
-
Stage: Stage-0
Fetch Operator
limit: -1
diff --git a/ql/src/test/results/clientpositive/spark/runtime_skewjoin_mapjoin_spark.q.out b/ql/src/test/results/clientpositive/spark/runtime_skewjoin_mapjoin_spark.q.out
index b612e91..963b1d1 100644
--- a/ql/src/test/results/clientpositive/spark/runtime_skewjoin_mapjoin_spark.q.out
+++ b/ql/src/test/results/clientpositive/spark/runtime_skewjoin_mapjoin_spark.q.out
@@ -50,7 +50,7 @@ STAGE PLANS:
Spark
#### A masked pattern was here ####
Vertices:
- Map 6
+ Map 5
Map Operator Tree:
TableScan
alias: t1_n94
@@ -74,7 +74,7 @@ STAGE PLANS:
Stage: Stage-1
Spark
Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 2), Map 4 (PARTITION-LEVEL SORT, 2), Map 5 (PARTITION-LEVEL SORT, 2)
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 2), Map 4 (PARTITION-LEVEL SORT, 2), Map 6 (PARTITION-LEVEL SORT, 2)
#### A masked pattern was here ####
Vertices:
Map 1
@@ -99,25 +99,6 @@ STAGE PLANS:
Map 4
Map Operator Tree:
TableScan
- alias: src2
- filterExpr: key is not null (type: boolean)
- Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: key is not null (type: boolean)
- Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: key (type: string)
- outputColumnNames: _col0
- Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: string)
- sort order: +
- Map-reduce partition columns: _col0 (type: string)
- Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
- Execution mode: vectorized
- Map 5
- Map Operator Tree:
- TableScan
alias: src
filterExpr: key is not null (type: boolean)
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
@@ -136,7 +117,7 @@ STAGE PLANS:
1 _col0 (type: string)
outputColumnNames: _col0
input vertices:
- 1 Map 6
+ 1 Map 5
Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string)
@@ -146,6 +127,25 @@ STAGE PLANS:
Execution mode: vectorized
Local Work:
Map Reduce Local Work
+ Map 6
+ Map Operator Tree:
+ TableScan
+ alias: src2
+ filterExpr: key is not null (type: boolean)
+ Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+ Filter Operator
+ predicate: key is not null (type: boolean)
+ Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: key (type: string)
+ outputColumnNames: _col0
+ Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: string)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: string)
+ Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+ Execution mode: vectorized
Reducer 2
Reduce Operator Tree:
Join Operator
diff --git a/ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_6.q.out b/ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_6.q.out
index 8cc8353..a0d1000 100644
--- a/ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_6.q.out
+++ b/ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_6.q.out
@@ -174,27 +174,6 @@ STAGE PLANS:
Map 5
Map Operator Tree:
TableScan
- alias: part_table_2
- Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: col (type: int), part_col (type: int)
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col1 (type: int)
- outputColumnNames: _col0
- Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- keys: _col0 (type: int)
- mode: hash
- outputColumnNames: _col0
- Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
- Spark Partition Pruning Sink Operator
- Target Columns: [Map 4 -> [part_col:int (part_col)]]
- Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
- Map 6
- Map Operator Tree:
- TableScan
alias: regular_table
filterExpr: col is not null (type: boolean)
Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
@@ -217,6 +196,27 @@ STAGE PLANS:
Spark Partition Pruning Sink Operator
Target Columns: [Map 1 -> [part_col:int (part_col)], Map 4 -> [part_col:int (part_col)]]
Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
+ Map 6
+ Map Operator Tree:
+ TableScan
+ alias: part_table_2
+ Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: col (type: int), part_col (type: int)
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col1 (type: int)
+ outputColumnNames: _col0
+ Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+ Group By Operator
+ keys: _col0 (type: int)
+ mode: hash
+ outputColumnNames: _col0
+ Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+ Spark Partition Pruning Sink Operator
+ Target Columns: [Map 1 -> [part_col:int (part_col)]]
+ Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
Stage: Stage-1
Spark
@@ -227,17 +227,17 @@ STAGE PLANS:
Map 1
Map Operator Tree:
TableScan
- alias: part_table_2
- Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+ alias: part_table_1
+ Statistics: Num rows: 12 Data size: 12 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: col (type: int), part_col (type: int)
outputColumnNames: _col0, _col1
- Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 12 Data size: 12 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col1 (type: int)
sort order: +
Map-reduce partition columns: _col1 (type: int)
- Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 12 Data size: 12 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: int)
Map 3
Map Operator Tree:
@@ -260,17 +260,17 @@ STAGE PLANS:
Map 4
Map Operator Tree:
TableScan
- alias: part_table_1
- Statistics: Num rows: 12 Data size: 12 Basic stats: COMPLETE Column stats: NONE
+ alias: part_table_2
+ Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: col (type: int), part_col (type: int)
outputColumnNames: _col0, _col1
- Statistics: Num rows: 12 Data size: 12 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col1 (type: int)
sort order: +
Map-reduce partition columns: _col1 (type: int)
- Statistics: Num rows: 12 Data size: 12 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: int)
Reducer 2
Reduce Operator Tree:
@@ -285,7 +285,7 @@ STAGE PLANS:
outputColumnNames: _col0, _col1, _col2, _col3, _col4
Statistics: Num rows: 26 Data size: 26 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: _col2 (type: int), _col3 (type: int), _col4 (type: int), _col0 (type: int), _col1 (type: int)
+ expressions: _col2 (type: int), _col0 (type: int), _col1 (type: int), _col3 (type: int), _col4 (type: int)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
Statistics: Num rows: 26 Data size: 26 Basic stats: COMPLETE Column stats: NONE
File Output Operator
@@ -378,34 +378,6 @@ STAGE PLANS:
Spark
#### A masked pattern was here ####
Vertices:
- Map 1
- Map Operator Tree:
- TableScan
- alias: part_table_2
- Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: col (type: int), part_col (type: int)
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
- Spark HashTable Sink Operator
- keys:
- 0 _col1 (type: int)
- 1 _col0 (type: int)
- 2 _col1 (type: int)
- Select Operator
- expressions: _col1 (type: int)
- outputColumnNames: _col0
- Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- keys: _col0 (type: int)
- mode: hash
- outputColumnNames: _col0
- Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
- Spark Partition Pruning Sink Operator
- Target Columns: [Map 3 -> [part_col:int (part_col)]]
- Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
- Local Work:
- Map Reduce Local Work
Map 2
Map Operator Tree:
TableScan
@@ -434,16 +406,44 @@ STAGE PLANS:
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
Spark Partition Pruning Sink Operator
- Target Columns: [Map 3 -> [part_col:int (part_col)]]
+ Target Columns: [Map 1 -> [part_col:int (part_col)]]
Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
Local Work:
Map Reduce Local Work
+ Map 3
+ Map Operator Tree:
+ TableScan
+ alias: part_table_2
+ Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: col (type: int), part_col (type: int)
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+ Spark HashTable Sink Operator
+ keys:
+ 0 _col1 (type: int)
+ 1 _col0 (type: int)
+ 2 _col1 (type: int)
+ Select Operator
+ expressions: _col1 (type: int)
+ outputColumnNames: _col0
+ Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+ Group By Operator
+ keys: _col0 (type: int)
+ mode: hash
+ outputColumnNames: _col0
+ Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+ Spark Partition Pruning Sink Operator
+ Target Columns: [Map 1 -> [part_col:int (part_col)]]
+ Statistics: Num rows: 8 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+ Local Work:
+ Map Reduce Local Work
Stage: Stage-1
Spark
#### A masked pattern was here ####
Vertices:
- Map 3
+ Map 1
Map Operator Tree:
TableScan
alias: part_table_1
@@ -462,11 +462,11 @@ STAGE PLANS:
2 _col1 (type: int)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
input vertices:
- 0 Map 1
1 Map 2
+ 2 Map 3
Statistics: Num rows: 26 Data size: 26 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: _col2 (type: int), _col3 (type: int), _col4 (type: int), _col0 (type: int), _col1 (type: int)
+ expressions: _col2 (type: int), _col0 (type: int), _col1 (type: int), _col3 (type: int), _col4 (type: int)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
Statistics: Num rows: 26 Data size: 26 Basic stats: COMPLETE Column stats: NONE
File Output Operator