Posted to commits@hive.apache.org by xu...@apache.org on 2014/12/13 18:44:42 UTC
svn commit: r1645338 [6/9] - in /hive/branches/spark: data/conf/spark/
itests/src/test/resources/ ql/src/java/org/apache/hadoop/hive/ql/optimizer/
ql/src/test/results/clientpositive/spark/
Modified: hive/branches/spark/ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out
URL: http://svn.apache.org/viewvc/hive/branches/spark/ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out?rev=1645338&r1=1645337&r2=1645338&view=diff
==============================================================================
--- hive/branches/spark/ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out (original)
+++ hive/branches/spark/ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out Sat Dec 13 17:44:41 2014
@@ -195,33 +195,35 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-1 is a root stage
+ Stage-2 is a root stage
+ Stage-1 depends on stages: Stage-2
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-1
+ Stage: Stage-2
Spark
- Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 3 (PARTITION-LEVEL SORT, 1)
#### A masked pattern was here ####
Vertices:
- Map 1
+ Map 2
Map Operator Tree:
TableScan
- alias: l
+ alias: r
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 0
- auto parallelism: false
+ Spark HashTable Sink Operator
+ condition expressions:
+ 0 {key} {value}
+ 1
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ Position of Big Table: 0
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -270,24 +272,63 @@ STAGE PLANS:
name: default.test1
name: default.test1
Truncated Path -> Alias:
- /test1 [l]
- Map 3
+ /test1 [r]
+
+ Stage: Stage-1
+ Spark
+#### A masked pattern was here ####
+ Vertices:
+ Map 1
Map Operator Tree:
TableScan
- alias: r
+ alias: l
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 1
- auto parallelism: false
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ condition expressions:
+ 0 {key} {value}
+ 1 {key} {value}
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ outputColumnNames: _col0, _col1, _col5, _col6
+ input vertices:
+ 1 Map 2
+ Position of Big Table: 0
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
+ outputColumnNames: _col0, _col1, _col2, _col3
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ GlobalTableId: 0
+#### A masked pattern was here ####
+ NumFilesPerFileSink: 1
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+#### A masked pattern was here ####
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ properties:
+ columns _col0,_col1,_col2,_col3
+ columns.types string:string:string:string
+ escape.delim \
+ hive.serialization.extend.nesting.levels true
+ serialization.format 1
+ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ TotalFiles: 1
+ GatherStats: false
+ MultiFileSpray: false
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -336,43 +377,7 @@ STAGE PLANS:
name: default.test1
name: default.test1
Truncated Path -> Alias:
- /test1 [r]
- Reducer 2
- Needs Tagging: true
- Reduce Operator Tree:
- Join Operator
- condition map:
- Inner Join 0 to 1
- condition expressions:
- 0 {KEY.reducesinkkey0} {KEY.reducesinkkey1}
- 1 {KEY.reducesinkkey0} {KEY.reducesinkkey1}
- outputColumnNames: _col0, _col1, _col5, _col6
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
- outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- GlobalTableId: 0
-#### A masked pattern was here ####
- NumFilesPerFileSink: 1
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
-#### A masked pattern was here ####
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- properties:
- columns _col0,_col1,_col2,_col3
- columns.types string:string:string:string
- escape.delim \
- hive.serialization.extend.nesting.levels true
- serialization.format 1
- serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- TotalFiles: 1
- GatherStats: false
- MultiFileSpray: false
+ /test1 [l]
Stage: Stage-0
Fetch Operator
@@ -431,33 +436,35 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-1 is a root stage
+ Stage-2 is a root stage
+ Stage-1 depends on stages: Stage-2
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-1
+ Stage: Stage-2
Spark
- Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 3 (PARTITION-LEVEL SORT, 1)
#### A masked pattern was here ####
Vertices:
- Map 1
+ Map 2
Map Operator Tree:
TableScan
- alias: l
+ alias: r
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 0
- auto parallelism: false
+ Spark HashTable Sink Operator
+ condition expressions:
+ 0 {key} {value}
+ 1
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ Position of Big Table: 0
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -506,24 +513,63 @@ STAGE PLANS:
name: default.test2
name: default.test2
Truncated Path -> Alias:
- /test2 [l]
- Map 3
+ /test2 [r]
+
+ Stage: Stage-1
+ Spark
+#### A masked pattern was here ####
+ Vertices:
+ Map 1
Map Operator Tree:
TableScan
- alias: r
+ alias: l
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 1
- auto parallelism: false
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ condition expressions:
+ 0 {key} {value}
+ 1 {key} {value}
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ outputColumnNames: _col0, _col1, _col5, _col6
+ input vertices:
+ 1 Map 2
+ Position of Big Table: 0
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
+ outputColumnNames: _col0, _col1, _col2, _col3
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ GlobalTableId: 0
+#### A masked pattern was here ####
+ NumFilesPerFileSink: 1
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+#### A masked pattern was here ####
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ properties:
+ columns _col0,_col1,_col2,_col3
+ columns.types string:string:string:string
+ escape.delim \
+ hive.serialization.extend.nesting.levels true
+ serialization.format 1
+ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ TotalFiles: 1
+ GatherStats: false
+ MultiFileSpray: false
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -572,43 +618,7 @@ STAGE PLANS:
name: default.test2
name: default.test2
Truncated Path -> Alias:
- /test2 [r]
- Reducer 2
- Needs Tagging: true
- Reduce Operator Tree:
- Join Operator
- condition map:
- Inner Join 0 to 1
- condition expressions:
- 0 {KEY.reducesinkkey0} {KEY.reducesinkkey1}
- 1 {KEY.reducesinkkey0} {KEY.reducesinkkey1}
- outputColumnNames: _col0, _col1, _col5, _col6
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
- outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- GlobalTableId: 0
-#### A masked pattern was here ####
- NumFilesPerFileSink: 1
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
-#### A masked pattern was here ####
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- properties:
- columns _col0,_col1,_col2,_col3
- columns.types string:string:string:string
- escape.delim \
- hive.serialization.extend.nesting.levels true
- serialization.format 1
- serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- TotalFiles: 1
- GatherStats: false
- MultiFileSpray: false
+ /test2 [l]
Stage: Stage-0
Fetch Operator
@@ -664,34 +674,35 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-1 is a root stage
+ Stage-2 is a root stage
+ Stage-1 depends on stages: Stage-2
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-1
+ Stage: Stage-2
Spark
- Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 3 (PARTITION-LEVEL SORT, 1)
#### A masked pattern was here ####
Vertices:
- Map 1
+ Map 2
Map Operator Tree:
TableScan
- alias: l
+ alias: r
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
- predicate: (key + key) is not null (type: boolean)
+ predicate: UDFToDouble(key) is not null (type: boolean)
Statistics: Num rows: 11 Data size: 2200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: (key + key) (type: double)
- sort order: +
- Map-reduce partition columns: (key + key) (type: double)
- Statistics: Num rows: 11 Data size: 2200 Basic stats: COMPLETE Column stats: NONE
- tag: 0
- value expressions: key (type: string), value (type: string)
- auto parallelism: false
+ Spark HashTable Sink Operator
+ condition expressions:
+ 0 {key} {value}
+ 1 {key} {value}
+ keys:
+ 0 (key + key) (type: double)
+ 1 UDFToDouble(key) (type: double)
+ Position of Big Table: 0
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -740,25 +751,63 @@ STAGE PLANS:
name: default.test1
name: default.test1
Truncated Path -> Alias:
- /test1 [l]
- Map 3
+ /test1 [r]
+
+ Stage: Stage-1
+ Spark
+#### A masked pattern was here ####
+ Vertices:
+ Map 1
Map Operator Tree:
TableScan
- alias: r
+ alias: l
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
- predicate: UDFToDouble(key) is not null (type: boolean)
+ predicate: (key + key) is not null (type: boolean)
Statistics: Num rows: 11 Data size: 2200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: UDFToDouble(key) (type: double)
- sort order: +
- Map-reduce partition columns: UDFToDouble(key) (type: double)
- Statistics: Num rows: 11 Data size: 2200 Basic stats: COMPLETE Column stats: NONE
- tag: 1
- value expressions: key (type: string), value (type: string)
- auto parallelism: false
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ condition expressions:
+ 0 {key} {value}
+ 1 {key} {value}
+ keys:
+ 0 (key + key) (type: double)
+ 1 UDFToDouble(key) (type: double)
+ outputColumnNames: _col0, _col1, _col5, _col6
+ input vertices:
+ 1 Map 2
+ Position of Big Table: 0
+ Statistics: Num rows: 12 Data size: 2420 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
+ outputColumnNames: _col0, _col1, _col2, _col3
+ Statistics: Num rows: 12 Data size: 2420 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ GlobalTableId: 0
+#### A masked pattern was here ####
+ NumFilesPerFileSink: 1
+ Statistics: Num rows: 12 Data size: 2420 Basic stats: COMPLETE Column stats: NONE
+#### A masked pattern was here ####
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ properties:
+ columns _col0,_col1,_col2,_col3
+ columns.types string:string:string:string
+ escape.delim \
+ hive.serialization.extend.nesting.levels true
+ serialization.format 1
+ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ TotalFiles: 1
+ GatherStats: false
+ MultiFileSpray: false
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -807,43 +856,7 @@ STAGE PLANS:
name: default.test1
name: default.test1
Truncated Path -> Alias:
- /test1 [r]
- Reducer 2
- Needs Tagging: true
- Reduce Operator Tree:
- Join Operator
- condition map:
- Inner Join 0 to 1
- condition expressions:
- 0 {VALUE._col0} {VALUE._col1}
- 1 {VALUE._col0} {VALUE._col1}
- outputColumnNames: _col0, _col1, _col5, _col6
- Statistics: Num rows: 12 Data size: 2420 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
- outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 12 Data size: 2420 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- GlobalTableId: 0
-#### A masked pattern was here ####
- NumFilesPerFileSink: 1
- Statistics: Num rows: 12 Data size: 2420 Basic stats: COMPLETE Column stats: NONE
-#### A masked pattern was here ####
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- properties:
- columns _col0,_col1,_col2,_col3
- columns.types string:string:string:string
- escape.delim \
- hive.serialization.extend.nesting.levels true
- serialization.format 1
- serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- TotalFiles: 1
- GatherStats: false
- MultiFileSpray: false
+ /test1 [l]
Stage: Stage-0
Fetch Operator
@@ -902,53 +915,55 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-1 is a root stage
+ Stage-2 is a root stage
+ Stage-1 depends on stages: Stage-2
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-1
+ Stage: Stage-2
Spark
- Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 3 (PARTITION-LEVEL SORT, 1)
#### A masked pattern was here ####
Vertices:
- Map 1
+ Map 2
Map Operator Tree:
TableScan
- alias: l
+ alias: r
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 0
- auto parallelism: false
+ Spark HashTable Sink Operator
+ condition expressions:
+ 0 {key} {value}
+ 1
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ Position of Big Table: 0
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: test1
+ base file name: test2
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
COLUMN_STATS_ACCURATE true
SORTBUCKETCOLSPREFIX TRUE
bucket_count 3
- bucket_field_name key
+ bucket_field_name value
columns key,value
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test1
+ name default.test2
numFiles 3
- serialization.ddl struct test1 { string key, string value}
+ serialization.ddl struct test2 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
@@ -961,60 +976,99 @@ STAGE PLANS:
COLUMN_STATS_ACCURATE true
SORTBUCKETCOLSPREFIX TRUE
bucket_count 3
- bucket_field_name key
+ bucket_field_name value
columns key,value
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test1
+ name default.test2
numFiles 3
- serialization.ddl struct test1 { string key, string value}
+ serialization.ddl struct test2 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.test1
- name: default.test1
+ name: default.test2
+ name: default.test2
Truncated Path -> Alias:
- /test1 [l]
- Map 3
+ /test2 [r]
+
+ Stage: Stage-1
+ Spark
+#### A masked pattern was here ####
+ Vertices:
+ Map 1
Map Operator Tree:
TableScan
- alias: r
+ alias: l
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 1
- auto parallelism: false
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ condition expressions:
+ 0 {key} {value}
+ 1 {key} {value}
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ outputColumnNames: _col0, _col1, _col5, _col6
+ input vertices:
+ 1 Map 2
+ Position of Big Table: 0
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
+ outputColumnNames: _col0, _col1, _col2, _col3
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ GlobalTableId: 0
+#### A masked pattern was here ####
+ NumFilesPerFileSink: 1
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+#### A masked pattern was here ####
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ properties:
+ columns _col0,_col1,_col2,_col3
+ columns.types string:string:string:string
+ escape.delim \
+ hive.serialization.extend.nesting.levels true
+ serialization.format 1
+ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ TotalFiles: 1
+ GatherStats: false
+ MultiFileSpray: false
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: test2
+ base file name: test1
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
COLUMN_STATS_ACCURATE true
SORTBUCKETCOLSPREFIX TRUE
bucket_count 3
- bucket_field_name value
+ bucket_field_name key
columns key,value
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test2
+ name default.test1
numFiles 3
- serialization.ddl struct test2 { string key, string value}
+ serialization.ddl struct test1 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
@@ -1027,59 +1081,23 @@ STAGE PLANS:
COLUMN_STATS_ACCURATE true
SORTBUCKETCOLSPREFIX TRUE
bucket_count 3
- bucket_field_name value
+ bucket_field_name key
columns key,value
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test2
+ name default.test1
numFiles 3
- serialization.ddl struct test2 { string key, string value}
+ serialization.ddl struct test1 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.test2
- name: default.test2
+ name: default.test1
+ name: default.test1
Truncated Path -> Alias:
- /test2 [r]
- Reducer 2
- Needs Tagging: true
- Reduce Operator Tree:
- Join Operator
- condition map:
- Inner Join 0 to 1
- condition expressions:
- 0 {KEY.reducesinkkey0} {KEY.reducesinkkey1}
- 1 {KEY.reducesinkkey0} {KEY.reducesinkkey1}
- outputColumnNames: _col0, _col1, _col5, _col6
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
- outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- GlobalTableId: 0
-#### A masked pattern was here ####
- NumFilesPerFileSink: 1
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
-#### A masked pattern was here ####
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- properties:
- columns _col0,_col1,_col2,_col3
- columns.types string:string:string:string
- escape.delim \
- hive.serialization.extend.nesting.levels true
- serialization.format 1
- serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- TotalFiles: 1
- GatherStats: false
- MultiFileSpray: false
+ /test1 [l]
Stage: Stage-0
Fetch Operator
@@ -1138,39 +1156,41 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-1 is a root stage
+ Stage-2 is a root stage
+ Stage-1 depends on stages: Stage-2
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-1
+ Stage: Stage-2
Spark
- Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 3 (PARTITION-LEVEL SORT, 1)
#### A masked pattern was here ####
Vertices:
- Map 1
+ Map 2
Map Operator Tree:
TableScan
- alias: l
+ alias: r
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 0
- auto parallelism: false
+ Spark HashTable Sink Operator
+ condition expressions:
+ 0 {key} {value}
+ 1
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ Position of Big Table: 0
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: test1
+ base file name: test3
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
@@ -1182,9 +1202,9 @@ STAGE PLANS:
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test1
+ name default.test3
numFiles 3
- serialization.ddl struct test1 { string key, string value}
+ serialization.ddl struct test3 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
@@ -1202,41 +1222,80 @@ STAGE PLANS:
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test1
+ name default.test3
numFiles 3
- serialization.ddl struct test1 { string key, string value}
+ serialization.ddl struct test3 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.test1
- name: default.test1
+ name: default.test3
+ name: default.test3
Truncated Path -> Alias:
- /test1 [l]
- Map 3
+ /test3 [r]
+
+ Stage: Stage-1
+ Spark
+#### A masked pattern was here ####
+ Vertices:
+ Map 1
Map Operator Tree:
TableScan
- alias: r
+ alias: l
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 1
- auto parallelism: false
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ condition expressions:
+ 0 {key} {value}
+ 1 {key} {value}
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ outputColumnNames: _col0, _col1, _col5, _col6
+ input vertices:
+ 1 Map 2
+ Position of Big Table: 0
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
+ outputColumnNames: _col0, _col1, _col2, _col3
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ GlobalTableId: 0
+#### A masked pattern was here ####
+ NumFilesPerFileSink: 1
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+#### A masked pattern was here ####
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ properties:
+ columns _col0,_col1,_col2,_col3
+ columns.types string:string:string:string
+ escape.delim \
+ hive.serialization.extend.nesting.levels true
+ serialization.format 1
+ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ TotalFiles: 1
+ GatherStats: false
+ MultiFileSpray: false
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: test3
+ base file name: test1
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
@@ -1248,9 +1307,9 @@ STAGE PLANS:
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test3
+ name default.test1
numFiles 3
- serialization.ddl struct test3 { string key, string value}
+ serialization.ddl struct test1 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
@@ -1268,54 +1327,18 @@ STAGE PLANS:
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test3
+ name default.test1
numFiles 3
- serialization.ddl struct test3 { string key, string value}
+ serialization.ddl struct test1 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.test3
- name: default.test3
+ name: default.test1
+ name: default.test1
Truncated Path -> Alias:
- /test3 [r]
- Reducer 2
- Needs Tagging: true
- Reduce Operator Tree:
- Join Operator
- condition map:
- Inner Join 0 to 1
- condition expressions:
- 0 {KEY.reducesinkkey0} {KEY.reducesinkkey1}
- 1 {KEY.reducesinkkey0} {KEY.reducesinkkey1}
- outputColumnNames: _col0, _col1, _col5, _col6
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
- outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- GlobalTableId: 0
-#### A masked pattern was here ####
- NumFilesPerFileSink: 1
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
-#### A masked pattern was here ####
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- properties:
- columns _col0,_col1,_col2,_col3
- columns.types string:string:string:string
- escape.delim \
- hive.serialization.extend.nesting.levels true
- serialization.format 1
- serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- TotalFiles: 1
- GatherStats: false
- MultiFileSpray: false
+ /test1 [l]
Stage: Stage-0
Fetch Operator
@@ -1374,53 +1397,55 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-1 is a root stage
+ Stage-2 is a root stage
+ Stage-1 depends on stages: Stage-2
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-1
+ Stage: Stage-2
Spark
- Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 3 (PARTITION-LEVEL SORT, 1)
#### A masked pattern was here ####
Vertices:
- Map 1
+ Map 2
Map Operator Tree:
TableScan
- alias: l
+ alias: r
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 0
- auto parallelism: false
+ Spark HashTable Sink Operator
+ condition expressions:
+ 0 {key} {value}
+ 1
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ Position of Big Table: 0
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: test1
+ base file name: test4
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
COLUMN_STATS_ACCURATE true
SORTBUCKETCOLSPREFIX TRUE
bucket_count 3
- bucket_field_name key
+ bucket_field_name value
columns key,value
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test1
+ name default.test4
numFiles 3
- serialization.ddl struct test1 { string key, string value}
+ serialization.ddl struct test4 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
@@ -1433,60 +1458,99 @@ STAGE PLANS:
COLUMN_STATS_ACCURATE true
SORTBUCKETCOLSPREFIX TRUE
bucket_count 3
- bucket_field_name key
+ bucket_field_name value
columns key,value
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test1
+ name default.test4
numFiles 3
- serialization.ddl struct test1 { string key, string value}
+ serialization.ddl struct test4 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.test1
- name: default.test1
+ name: default.test4
+ name: default.test4
Truncated Path -> Alias:
- /test1 [l]
- Map 3
+ /test4 [r]
+
+ Stage: Stage-1
+ Spark
+#### A masked pattern was here ####
+ Vertices:
+ Map 1
Map Operator Tree:
TableScan
- alias: r
+ alias: l
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 1
- auto parallelism: false
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ condition expressions:
+ 0 {key} {value}
+ 1 {key} {value}
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ outputColumnNames: _col0, _col1, _col5, _col6
+ input vertices:
+ 1 Map 2
+ Position of Big Table: 0
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
+ outputColumnNames: _col0, _col1, _col2, _col3
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ GlobalTableId: 0
+#### A masked pattern was here ####
+ NumFilesPerFileSink: 1
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+#### A masked pattern was here ####
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ properties:
+ columns _col0,_col1,_col2,_col3
+ columns.types string:string:string:string
+ escape.delim \
+ hive.serialization.extend.nesting.levels true
+ serialization.format 1
+ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ TotalFiles: 1
+ GatherStats: false
+ MultiFileSpray: false
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: test4
+ base file name: test1
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
COLUMN_STATS_ACCURATE true
SORTBUCKETCOLSPREFIX TRUE
bucket_count 3
- bucket_field_name value
+ bucket_field_name key
columns key,value
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test4
+ name default.test1
numFiles 3
- serialization.ddl struct test4 { string key, string value}
+ serialization.ddl struct test1 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
@@ -1499,59 +1563,23 @@ STAGE PLANS:
COLUMN_STATS_ACCURATE true
SORTBUCKETCOLSPREFIX TRUE
bucket_count 3
- bucket_field_name value
+ bucket_field_name key
columns key,value
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test4
+ name default.test1
numFiles 3
- serialization.ddl struct test4 { string key, string value}
+ serialization.ddl struct test1 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.test4
- name: default.test4
+ name: default.test1
+ name: default.test1
Truncated Path -> Alias:
- /test4 [r]
- Reducer 2
- Needs Tagging: true
- Reduce Operator Tree:
- Join Operator
- condition map:
- Inner Join 0 to 1
- condition expressions:
- 0 {KEY.reducesinkkey0} {KEY.reducesinkkey1}
- 1 {KEY.reducesinkkey0} {KEY.reducesinkkey1}
- outputColumnNames: _col0, _col1, _col5, _col6
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
- outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- GlobalTableId: 0
-#### A masked pattern was here ####
- NumFilesPerFileSink: 1
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
-#### A masked pattern was here ####
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- properties:
- columns _col0,_col1,_col2,_col3
- columns.types string:string:string:string
- escape.delim \
- hive.serialization.extend.nesting.levels true
- serialization.format 1
- serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- TotalFiles: 1
- GatherStats: false
- MultiFileSpray: false
+ /test1 [l]
Stage: Stage-0
Fetch Operator
@@ -1610,53 +1638,55 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-1 is a root stage
+ Stage-2 is a root stage
+ Stage-1 depends on stages: Stage-2
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-1
+ Stage: Stage-2
Spark
- Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 3 (PARTITION-LEVEL SORT, 1)
#### A masked pattern was here ####
Vertices:
- Map 1
+ Map 2
Map Operator Tree:
TableScan
- alias: l
+ alias: r
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 0
- auto parallelism: false
+ Spark HashTable Sink Operator
+ condition expressions:
+ 0 {key} {value}
+ 1
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ Position of Big Table: 0
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: test2
+ base file name: test3
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
COLUMN_STATS_ACCURATE true
SORTBUCKETCOLSPREFIX TRUE
bucket_count 3
- bucket_field_name value
+ bucket_field_name key
columns key,value
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test2
+ name default.test3
numFiles 3
- serialization.ddl struct test2 { string key, string value}
+ serialization.ddl struct test3 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
@@ -1669,60 +1699,99 @@ STAGE PLANS:
COLUMN_STATS_ACCURATE true
SORTBUCKETCOLSPREFIX TRUE
bucket_count 3
- bucket_field_name value
+ bucket_field_name key
columns key,value
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test2
+ name default.test3
numFiles 3
- serialization.ddl struct test2 { string key, string value}
+ serialization.ddl struct test3 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.test2
- name: default.test2
+ name: default.test3
+ name: default.test3
Truncated Path -> Alias:
- /test2 [l]
- Map 3
+ /test3 [r]
+
+ Stage: Stage-1
+ Spark
+#### A masked pattern was here ####
+ Vertices:
+ Map 1
Map Operator Tree:
TableScan
- alias: r
+ alias: l
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 1
- auto parallelism: false
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ condition expressions:
+ 0 {key} {value}
+ 1 {key} {value}
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ outputColumnNames: _col0, _col1, _col5, _col6
+ input vertices:
+ 1 Map 2
+ Position of Big Table: 0
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
+ outputColumnNames: _col0, _col1, _col2, _col3
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ GlobalTableId: 0
+#### A masked pattern was here ####
+ NumFilesPerFileSink: 1
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+#### A masked pattern was here ####
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ properties:
+ columns _col0,_col1,_col2,_col3
+ columns.types string:string:string:string
+ escape.delim \
+ hive.serialization.extend.nesting.levels true
+ serialization.format 1
+ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ TotalFiles: 1
+ GatherStats: false
+ MultiFileSpray: false
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: test3
+ base file name: test2
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
COLUMN_STATS_ACCURATE true
SORTBUCKETCOLSPREFIX TRUE
bucket_count 3
- bucket_field_name key
+ bucket_field_name value
columns key,value
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test3
+ name default.test2
numFiles 3
- serialization.ddl struct test3 { string key, string value}
+ serialization.ddl struct test2 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
@@ -1735,59 +1804,23 @@ STAGE PLANS:
COLUMN_STATS_ACCURATE true
SORTBUCKETCOLSPREFIX TRUE
bucket_count 3
- bucket_field_name key
+ bucket_field_name value
columns key,value
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test3
+ name default.test2
numFiles 3
- serialization.ddl struct test3 { string key, string value}
+ serialization.ddl struct test2 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.test3
- name: default.test3
+ name: default.test2
+ name: default.test2
Truncated Path -> Alias:
- /test3 [r]
- Reducer 2
- Needs Tagging: true
- Reduce Operator Tree:
- Join Operator
- condition map:
- Inner Join 0 to 1
- condition expressions:
- 0 {KEY.reducesinkkey0} {KEY.reducesinkkey1}
- 1 {KEY.reducesinkkey0} {KEY.reducesinkkey1}
- outputColumnNames: _col0, _col1, _col5, _col6
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
- outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- GlobalTableId: 0
-#### A masked pattern was here ####
- NumFilesPerFileSink: 1
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
-#### A masked pattern was here ####
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- properties:
- columns _col0,_col1,_col2,_col3
- columns.types string:string:string:string
- escape.delim \
- hive.serialization.extend.nesting.levels true
- serialization.format 1
- serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- TotalFiles: 1
- GatherStats: false
- MultiFileSpray: false
+ /test2 [l]
Stage: Stage-0
Fetch Operator
@@ -1846,39 +1879,41 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-1 is a root stage
+ Stage-2 is a root stage
+ Stage-1 depends on stages: Stage-2
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-1
+ Stage: Stage-2
Spark
- Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 3 (PARTITION-LEVEL SORT, 1)
#### A masked pattern was here ####
Vertices:
- Map 1
+ Map 2
Map Operator Tree:
TableScan
- alias: l
+ alias: r
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 0
- auto parallelism: false
+ Spark HashTable Sink Operator
+ condition expressions:
+ 0 {key} {value}
+ 1
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ Position of Big Table: 0
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: test2
+ base file name: test4
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
@@ -1890,9 +1925,9 @@ STAGE PLANS:
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test2
+ name default.test4
numFiles 3
- serialization.ddl struct test2 { string key, string value}
+ serialization.ddl struct test4 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
@@ -1910,41 +1945,80 @@ STAGE PLANS:
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test2
+ name default.test4
numFiles 3
- serialization.ddl struct test2 { string key, string value}
+ serialization.ddl struct test4 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.test2
- name: default.test2
+ name: default.test4
+ name: default.test4
Truncated Path -> Alias:
- /test2 [l]
- Map 3
+ /test4 [r]
+
+ Stage: Stage-1
+ Spark
+#### A masked pattern was here ####
+ Vertices:
+ Map 1
Map Operator Tree:
TableScan
- alias: r
+ alias: l
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 1
- auto parallelism: false
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ condition expressions:
+ 0 {key} {value}
+ 1 {key} {value}
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ outputColumnNames: _col0, _col1, _col5, _col6
+ input vertices:
+ 1 Map 2
+ Position of Big Table: 0
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
+ outputColumnNames: _col0, _col1, _col2, _col3
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ GlobalTableId: 0
+#### A masked pattern was here ####
+ NumFilesPerFileSink: 1
+ Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
+#### A masked pattern was here ####
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ properties:
+ columns _col0,_col1,_col2,_col3
+ columns.types string:string:string:string
+ escape.delim \
+ hive.serialization.extend.nesting.levels true
+ serialization.format 1
+ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ TotalFiles: 1
+ GatherStats: false
+ MultiFileSpray: false
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: test4
+ base file name: test2
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
@@ -1956,9 +2030,9 @@ STAGE PLANS:
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test4
+ name default.test2
numFiles 3
- serialization.ddl struct test4 { string key, string value}
+ serialization.ddl struct test2 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
@@ -1976,54 +2050,18 @@ STAGE PLANS:
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test4
+ name default.test2
numFiles 3
- serialization.ddl struct test4 { string key, string value}
+ serialization.ddl struct test2 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.test4
- name: default.test4
+ name: default.test2
+ name: default.test2
Truncated Path -> Alias:
- /test4 [r]
- Reducer 2
- Needs Tagging: true
- Reduce Operator Tree:
- Join Operator
- condition map:
- Inner Join 0 to 1
- condition expressions:
- 0 {KEY.reducesinkkey0} {KEY.reducesinkkey1}
- 1 {KEY.reducesinkkey0} {KEY.reducesinkkey1}
- outputColumnNames: _col0, _col1, _col5, _col6
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string)
- outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- GlobalTableId: 0
-#### A masked pattern was here ####
- NumFilesPerFileSink: 1
- Statistics: Num rows: 6 Data size: 1320 Basic stats: COMPLETE Column stats: NONE
-#### A masked pattern was here ####
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- properties:
- columns _col0,_col1,_col2,_col3
- columns.types string:string:string:string
- escape.delim \
- hive.serialization.extend.nesting.levels true
- serialization.format 1
- serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- TotalFiles: 1
- GatherStats: false
- MultiFileSpray: false
+ /test2 [l]
Stage: Stage-0
Fetch Operator
@@ -2082,53 +2120,55 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-1 is a root stage
+ Stage-2 is a root stage
+ Stage-1 depends on stages: Stage-2
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-1
+ Stage: Stage-2
Spark
- Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 3 (PARTITION-LEVEL SORT, 1)
#### A masked pattern was here ####
Vertices:
- Map 1
+ Map 2
Map Operator Tree:
TableScan
- alias: l
+ alias: r
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
- Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- tag: 0
- auto parallelism: false
+ Spark HashTable Sink Operator
+ condition expressions:
+ 0 {key} {value}
+ 1
+ keys:
+ 0 key (type: string), value (type: string)
+ 1 key (type: string), value (type: string)
+ Position of Big Table: 0
+ Local Work:
+ Map Reduce Local Work
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: test3
+ base file name: test4
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
COLUMN_STATS_ACCURATE true
SORTBUCKETCOLSPREFIX TRUE
bucket_count 3
- bucket_field_name key
+ bucket_field_name value
columns key,value
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test3
+ name default.test4
numFiles 3
- serialization.ddl struct test3 { string key, string value}
+ serialization.ddl struct test4 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
@@ -2141,60 +2181,99 @@ STAGE PLANS:
COLUMN_STATS_ACCURATE true
SORTBUCKETCOLSPREFIX TRUE
bucket_count 3
- bucket_field_name key
+ bucket_field_name value
columns key,value
columns.comments
columns.types string:string
#### A masked pattern was here ####
- name default.test3
+ name default.test4
numFiles 3
- serialization.ddl struct test3 { string key, string value}
+ serialization.ddl struct test4 { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 4200
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.test3
- name: default.test3
+ name: default.test4
+ name: default.test4
Truncated Path -> Alias:
- /test3 [l]
- Map 3
+ /test4 [r]
+
+ Stage: Stage-1
+ Spark
+#### A masked pattern was here ####
+ Vertices:
+ Map 1
Map Operator Tree:
TableScan
- alias: r
+ alias: l
Statistics: Num rows: 21 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: (key is not null and value is not null) (type: boolean)
Statistics: Num rows: 6 Data size: 1200 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: key (type: string), value (type: string)
- sort order: ++
- Map-reduce partition columns: key (type: string), value (type: string)
[... 138 lines stripped ...]