Posted to commits@hive.apache.org by xu...@apache.org on 2014/12/15 18:11:42 UTC
svn commit: r1642997 [18/42] - in /hive/branches/spark:
itests/src/test/resources/
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/
ql/src/test/results/clientpositive/spark/
Modified: hive/branches/spark/ql/src/test/results/clientpositive/spark/join_filters_overlap.q.out
URL: http://svn.apache.org/viewvc/hive/branches/spark/ql/src/test/results/clientpositive/spark/join_filters_overlap.q.out?rev=1642997&r1=1642996&r2=1642997&view=diff
==============================================================================
--- hive/branches/spark/ql/src/test/results/clientpositive/spark/join_filters_overlap.q.out (original)
+++ hive/branches/spark/ql/src/test/results/clientpositive/spark/join_filters_overlap.q.out Tue Dec 2 19:57:10 2014
@@ -93,43 +93,30 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-2 is a root stage
- Stage-1 depends on stages: Stage-2
+ Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-2
+ Stage: Stage-1
Spark
+ Edges:
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 3), Map 3 (PARTITION-LEVEL SORT, 3), Map 4 (PARTITION-LEVEL SORT, 3)
#### A masked pattern was here ####
Vertices:
- Map 2
+ Map 1
Map Operator Tree:
TableScan
- alias: b
+ alias: a
Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
- Filter Operator
- isSamplingPred: false
- predicate: (value = 50) (type: boolean)
- Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0 {key} {value}
- 1 {value}
- 2 {key} {value}
- filter mappings:
- 0 [1, 1, 2, 1]
- filter predicates:
- 0 {(value = 50)} {(value = 60)}
- 1
- 2
- keys:
- 0 key (type: int)
- 1 key (type: int)
- 2 key (type: int)
- Position of Big Table: 0
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: key (type: int)
+ sort order: +
+ Map-reduce partition columns: key (type: int)
+ Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
+ tag: 0
+ value expressions: value (type: int)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -178,35 +165,25 @@ STAGE PLANS:
name: default.a
name: default.a
Truncated Path -> Alias:
- /a [b]
+ /a [a]
Map 3
Map Operator Tree:
TableScan
- alias: c
+ alias: b
Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
- predicate: (value = 60) (type: boolean)
+ predicate: (value = 50) (type: boolean)
Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0 {key} {value}
- 1 {key} {value}
- 2 {value}
- filter mappings:
- 0 [1, 1, 2, 1]
- filter predicates:
- 0 {(value = 50)} {(value = 60)}
- 1
- 2
- keys:
- 0 key (type: int)
- 1 key (type: int)
- 2 key (type: int)
- Position of Big Table: 0
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: key (type: int)
+ sort order: +
+ Map-reduce partition columns: key (type: int)
+ Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
+ tag: 1
+ value expressions: value (type: int)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -255,69 +232,25 @@ STAGE PLANS:
name: default.a
name: default.a
Truncated Path -> Alias:
- /a [c]
-
- Stage: Stage-1
- Spark
-#### A masked pattern was here ####
- Vertices:
- Map 1
+ /a [b]
+ Map 4
Map Operator Tree:
TableScan
- alias: a
+ alias: c
Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
- Map Join Operator
- condition map:
- Left Outer Join0 to 1
- Left Outer Join0 to 2
- condition expressions:
- 0 {key} {value}
- 1 {key} {value}
- 2 {key} {value}
- filter mappings:
- 0 [1, 1, 2, 1]
- filter predicates:
- 0 {(value = 50)} {(value = 60)}
- 1
- 2
- keys:
- 0 key (type: int)
- 1 key (type: int)
- 2 key (type: int)
- outputColumnNames: _col0, _col1, _col5, _col6, _col10, _col11
- input vertices:
- 1 Map 2
- 2 Map 3
- Position of Big Table: 0
- Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: int), _col1 (type: int), _col5 (type: int), _col6 (type: int), _col10 (type: int), _col11 (type: int)
- outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
- Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- GlobalTableId: 0
-#### A masked pattern was here ####
- NumFilesPerFileSink: 1
- Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
-#### A masked pattern was here ####
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- properties:
- columns _col0,_col1,_col2,_col3,_col4,_col5
- columns.types int:int:int:int:int:int
- escape.delim \
- hive.serialization.extend.nesting.levels true
- serialization.format 1
- serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- TotalFiles: 1
- GatherStats: false
- MultiFileSpray: false
- Local Work:
- Map Reduce Local Work
+ Filter Operator
+ isSamplingPred: false
+ predicate: (value = 60) (type: boolean)
+ Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: key (type: int)
+ sort order: +
+ Map-reduce partition columns: key (type: int)
+ Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
+ tag: 2
+ value expressions: value (type: int)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -366,7 +299,51 @@ STAGE PLANS:
name: default.a
name: default.a
Truncated Path -> Alias:
- /a [a]
+ /a [c]
+ Reducer 2
+ Needs Tagging: true
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Left Outer Join0 to 1
+ Left Outer Join0 to 2
+ condition expressions:
+ 0 {KEY.reducesinkkey0} {VALUE._col0}
+ 1 {KEY.reducesinkkey0} {VALUE._col0}
+ 2 {KEY.reducesinkkey0} {VALUE._col0}
+ filter mappings:
+ 0 [1, 1, 2, 1]
+ filter predicates:
+ 0 {(VALUE._col0 = 50)} {(VALUE._col0 = 60)}
+ 1
+ 2
+ outputColumnNames: _col0, _col1, _col5, _col6, _col10, _col11
+ Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: int), _col1 (type: int), _col5 (type: int), _col6 (type: int), _col10 (type: int), _col11 (type: int)
+ outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
+ Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ GlobalTableId: 0
+#### A masked pattern was here ####
+ NumFilesPerFileSink: 1
+ Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
+#### A masked pattern was here ####
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ properties:
+ columns _col0,_col1,_col2,_col3,_col4,_col5
+ columns.types int:int:int:int:int:int
+ escape.delim \
+ hive.serialization.extend.nesting.levels true
+ serialization.format 1
+ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ TotalFiles: 1
+ GatherStats: false
+ MultiFileSpray: false
Stage: Stage-0
Fetch Operator
@@ -475,13 +452,14 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-2 is a root stage
- Stage-1 depends on stages: Stage-2
+ Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-2
+ Stage: Stage-1
Spark
+ Edges:
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 3), Map 3 (PARTITION-LEVEL SORT, 3), Map 4 (PARTITION-LEVEL SORT, 3)
#### A masked pattern was here ####
Vertices:
Map 1
@@ -494,24 +472,14 @@ STAGE PLANS:
isSamplingPred: false
predicate: (value = 50) (type: boolean)
Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0 {value}
- 1 {key} {value}
- 2 {key} {value}
- filter mappings:
- 1 [0, 1, 2, 1]
- filter predicates:
- 0
- 1 {(value = 50)} {(value = 60)}
- 2
- keys:
- 0 key (type: int)
- 1 key (type: int)
- 2 key (type: int)
- Position of Big Table: 1
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: key (type: int)
+ sort order: +
+ Map-reduce partition columns: key (type: int)
+ Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
+ tag: 0
+ value expressions: value (type: int)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -564,31 +532,17 @@ STAGE PLANS:
Map 3
Map Operator Tree:
TableScan
- alias: c
+ alias: b
Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
- Filter Operator
- isSamplingPred: false
- predicate: (value = 60) (type: boolean)
- Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0 {key} {value}
- 1 {key} {value}
- 2 {value}
- filter mappings:
- 1 [0, 1, 2, 1]
- filter predicates:
- 0
- 1 {(value = 50)} {(value = 60)}
- 2
- keys:
- 0 key (type: int)
- 1 key (type: int)
- 2 key (type: int)
- Position of Big Table: 1
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: key (type: int)
+ sort order: +
+ Map-reduce partition columns: key (type: int)
+ Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
+ tag: 1
+ value expressions: value (type: int)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -637,69 +591,25 @@ STAGE PLANS:
name: default.a
name: default.a
Truncated Path -> Alias:
- /a [c]
-
- Stage: Stage-1
- Spark
-#### A masked pattern was here ####
- Vertices:
- Map 2
+ /a [b]
+ Map 4
Map Operator Tree:
TableScan
- alias: b
+ alias: c
Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
- Map Join Operator
- condition map:
- Right Outer Join0 to 1
- Left Outer Join1 to 2
- condition expressions:
- 0 {key} {value}
- 1 {key} {value}
- 2 {key} {value}
- filter mappings:
- 1 [0, 1, 2, 1]
- filter predicates:
- 0
- 1 {(value = 50)} {(value = 60)}
- 2
- keys:
- 0 key (type: int)
- 1 key (type: int)
- 2 key (type: int)
- outputColumnNames: _col0, _col1, _col5, _col6, _col10, _col11
- input vertices:
- 0 Map 1
- 2 Map 3
- Position of Big Table: 1
- Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: int), _col1 (type: int), _col5 (type: int), _col6 (type: int), _col10 (type: int), _col11 (type: int)
- outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
- Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- GlobalTableId: 0
-#### A masked pattern was here ####
- NumFilesPerFileSink: 1
- Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
-#### A masked pattern was here ####
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- properties:
- columns _col0,_col1,_col2,_col3,_col4,_col5
- columns.types int:int:int:int:int:int
- escape.delim \
- hive.serialization.extend.nesting.levels true
- serialization.format 1
- serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- TotalFiles: 1
- GatherStats: false
- MultiFileSpray: false
- Local Work:
- Map Reduce Local Work
+ Filter Operator
+ isSamplingPred: false
+ predicate: (value = 60) (type: boolean)
+ Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: key (type: int)
+ sort order: +
+ Map-reduce partition columns: key (type: int)
+ Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
+ tag: 2
+ value expressions: value (type: int)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -748,7 +658,51 @@ STAGE PLANS:
name: default.a
name: default.a
Truncated Path -> Alias:
- /a [b]
+ /a [c]
+ Reducer 2
+ Needs Tagging: true
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Right Outer Join0 to 1
+ Left Outer Join1 to 2
+ condition expressions:
+ 0 {KEY.reducesinkkey0} {VALUE._col0}
+ 1 {KEY.reducesinkkey0} {VALUE._col0}
+ 2 {KEY.reducesinkkey0} {VALUE._col0}
+ filter mappings:
+ 1 [0, 1, 2, 1]
+ filter predicates:
+ 0
+ 1 {(VALUE._col0 = 50)} {(VALUE._col0 = 60)}
+ 2
+ outputColumnNames: _col0, _col1, _col5, _col6, _col10, _col11
+ Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: int), _col1 (type: int), _col5 (type: int), _col6 (type: int), _col10 (type: int), _col11 (type: int)
+ outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
+ Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ GlobalTableId: 0
+#### A masked pattern was here ####
+ NumFilesPerFileSink: 1
+ Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
+#### A masked pattern was here ####
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ properties:
+ columns _col0,_col1,_col2,_col3,_col4,_col5
+ columns.types int:int:int:int:int:int
+ escape.delim \
+ hive.serialization.extend.nesting.levels true
+ serialization.format 1
+ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ TotalFiles: 1
+ GatherStats: false
+ MultiFileSpray: false
Stage: Stage-0
Fetch Operator
@@ -871,13 +825,14 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-2 is a root stage
- Stage-1 depends on stages: Stage-2
+ Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-2
+ Stage: Stage-1
Spark
+ Edges:
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 3), Map 3 (PARTITION-LEVEL SORT, 3), Map 4 (PARTITION-LEVEL SORT, 3)
#### A masked pattern was here ####
Vertices:
Map 1
@@ -890,24 +845,14 @@ STAGE PLANS:
isSamplingPred: false
predicate: (value = 50) (type: boolean)
Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0 {value}
- 1 {key} {value}
- 2 {key} {value}
- filter mappings:
- 1 [0, 2, 2, 2]
- filter predicates:
- 0
- 1 {(value = 50)} {(value > 10)} {(value = 60)} {(value > 20)}
- 2
- keys:
- 0 key (type: int)
- 1 key (type: int)
- 2 key (type: int)
- Position of Big Table: 1
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: key (type: int)
+ sort order: +
+ Map-reduce partition columns: key (type: int)
+ Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
+ tag: 0
+ value expressions: value (type: int)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -960,31 +905,17 @@ STAGE PLANS:
Map 3
Map Operator Tree:
TableScan
- alias: c
+ alias: b
Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
- Filter Operator
- isSamplingPred: false
- predicate: (value = 60) (type: boolean)
- Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0 {key} {value}
- 1 {key} {value}
- 2 {value}
- filter mappings:
- 1 [0, 2, 2, 2]
- filter predicates:
- 0
- 1 {(value = 50)} {(value > 10)} {(value = 60)} {(value > 20)}
- 2
- keys:
- 0 key (type: int)
- 1 key (type: int)
- 2 key (type: int)
- Position of Big Table: 1
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: key (type: int)
+ sort order: +
+ Map-reduce partition columns: key (type: int)
+ Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
+ tag: 1
+ value expressions: value (type: int)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -1033,69 +964,25 @@ STAGE PLANS:
name: default.a
name: default.a
Truncated Path -> Alias:
- /a [c]
-
- Stage: Stage-1
- Spark
-#### A masked pattern was here ####
- Vertices:
- Map 2
+ /a [b]
+ Map 4
Map Operator Tree:
TableScan
- alias: b
+ alias: c
Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
- Map Join Operator
- condition map:
- Right Outer Join0 to 1
- Left Outer Join1 to 2
- condition expressions:
- 0 {key} {value}
- 1 {key} {value}
- 2 {key} {value}
- filter mappings:
- 1 [0, 2, 2, 2]
- filter predicates:
- 0
- 1 {(value = 50)} {(value > 10)} {(value = 60)} {(value > 20)}
- 2
- keys:
- 0 key (type: int)
- 1 key (type: int)
- 2 key (type: int)
- outputColumnNames: _col0, _col1, _col5, _col6, _col10, _col11
- input vertices:
- 0 Map 1
- 2 Map 3
- Position of Big Table: 1
- Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: int), _col1 (type: int), _col5 (type: int), _col6 (type: int), _col10 (type: int), _col11 (type: int)
- outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
- Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- GlobalTableId: 0
-#### A masked pattern was here ####
- NumFilesPerFileSink: 1
- Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
-#### A masked pattern was here ####
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- properties:
- columns _col0,_col1,_col2,_col3,_col4,_col5
- columns.types int:int:int:int:int:int
- escape.delim \
- hive.serialization.extend.nesting.levels true
- serialization.format 1
- serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- TotalFiles: 1
- GatherStats: false
- MultiFileSpray: false
- Local Work:
- Map Reduce Local Work
+ Filter Operator
+ isSamplingPred: false
+ predicate: (value = 60) (type: boolean)
+ Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: key (type: int)
+ sort order: +
+ Map-reduce partition columns: key (type: int)
+ Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
+ tag: 2
+ value expressions: value (type: int)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -1144,7 +1031,51 @@ STAGE PLANS:
name: default.a
name: default.a
Truncated Path -> Alias:
- /a [b]
+ /a [c]
+ Reducer 2
+ Needs Tagging: true
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Right Outer Join0 to 1
+ Left Outer Join1 to 2
+ condition expressions:
+ 0 {KEY.reducesinkkey0} {VALUE._col0}
+ 1 {KEY.reducesinkkey0} {VALUE._col0}
+ 2 {KEY.reducesinkkey0} {VALUE._col0}
+ filter mappings:
+ 1 [0, 2, 2, 2]
+ filter predicates:
+ 0
+ 1 {(VALUE._col0 = 50)} {(VALUE._col0 > 10)} {(VALUE._col0 = 60)} {(VALUE._col0 > 20)}
+ 2
+ outputColumnNames: _col0, _col1, _col5, _col6, _col10, _col11
+ Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: int), _col1 (type: int), _col5 (type: int), _col6 (type: int), _col10 (type: int), _col11 (type: int)
+ outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
+ Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ GlobalTableId: 0
+#### A masked pattern was here ####
+ NumFilesPerFileSink: 1
+ Statistics: Num rows: 6 Data size: 39 Basic stats: COMPLETE Column stats: NONE
+#### A masked pattern was here ####
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ properties:
+ columns _col0,_col1,_col2,_col3,_col4,_col5
+ columns.types int:int:int:int:int:int
+ escape.delim \
+ hive.serialization.extend.nesting.levels true
+ serialization.format 1
+ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ TotalFiles: 1
+ GatherStats: false
+ MultiFileSpray: false
Stage: Stage-0
Fetch Operator
@@ -1726,46 +1657,30 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-2 is a root stage
- Stage-1 depends on stages: Stage-2
+ Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-2
+ Stage: Stage-1
Spark
+ Edges:
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 3), Map 3 (PARTITION-LEVEL SORT, 3), Map 4 (PARTITION-LEVEL SORT, 3), Map 5 (PARTITION-LEVEL SORT, 3)
#### A masked pattern was here ####
Vertices:
- Map 2
+ Map 1
Map Operator Tree:
TableScan
- alias: b
+ alias: a
Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
- Filter Operator
- isSamplingPred: false
- predicate: (value = 50) (type: boolean)
- Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0 {key} {value}
- 1 {value}
- 2 {key} {value}
- 3 {key} {value}
- filter mappings:
- 0 [1, 1, 2, 1, 3, 1]
- filter predicates:
- 0 {(value = 50)} {(value = 60)} {(value = 40)}
- 1
- 2
- 3
- keys:
- 0 key (type: int)
- 1 key (type: int)
- 2 key (type: int)
- 3 key (type: int)
- Position of Big Table: 0
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: key (type: int)
+ sort order: +
+ Map-reduce partition columns: key (type: int)
+ Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
+ tag: 0
+ value expressions: value (type: int)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -1814,38 +1729,25 @@ STAGE PLANS:
name: default.a
name: default.a
Truncated Path -> Alias:
- /a [b]
+ /a [a]
Map 3
Map Operator Tree:
TableScan
- alias: c
+ alias: b
Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
- predicate: (value = 60) (type: boolean)
+ predicate: (value = 50) (type: boolean)
Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0 {key} {value}
- 1 {key} {value}
- 2 {value}
- 3 {key} {value}
- filter mappings:
- 0 [1, 1, 2, 1, 3, 1]
- filter predicates:
- 0 {(value = 50)} {(value = 60)} {(value = 40)}
- 1
- 2
- 3
- keys:
- 0 key (type: int)
- 1 key (type: int)
- 2 key (type: int)
- 3 key (type: int)
- Position of Big Table: 0
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: key (type: int)
+ sort order: +
+ Map-reduce partition columns: key (type: int)
+ Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
+ tag: 1
+ value expressions: value (type: int)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -1894,38 +1796,25 @@ STAGE PLANS:
name: default.a
name: default.a
Truncated Path -> Alias:
- /a [c]
+ /a [b]
Map 4
Map Operator Tree:
TableScan
- alias: d
+ alias: c
Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
- predicate: (value = 40) (type: boolean)
+ predicate: (value = 60) (type: boolean)
Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0 {key} {value}
- 1 {key} {value}
- 2 {key} {value}
- 3 {value}
- filter mappings:
- 0 [1, 1, 2, 1, 3, 1]
- filter predicates:
- 0 {(value = 50)} {(value = 60)} {(value = 40)}
- 1
- 2
- 3
- keys:
- 0 key (type: int)
- 1 key (type: int)
- 2 key (type: int)
- 3 key (type: int)
- Position of Big Table: 0
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: key (type: int)
+ sort order: +
+ Map-reduce partition columns: key (type: int)
+ Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
+ tag: 2
+ value expressions: value (type: int)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -1974,74 +1863,25 @@ STAGE PLANS:
name: default.a
name: default.a
Truncated Path -> Alias:
- /a [d]
-
- Stage: Stage-1
- Spark
-#### A masked pattern was here ####
- Vertices:
- Map 1
+ /a [c]
+ Map 5
Map Operator Tree:
TableScan
- alias: a
+ alias: d
Statistics: Num rows: 3 Data size: 18 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
- Map Join Operator
- condition map:
- Left Outer Join0 to 1
- Left Outer Join0 to 2
- Left Outer Join0 to 3
- condition expressions:
- 0 {key} {value}
- 1 {key} {value}
- 2 {key} {value}
- 3 {key} {value}
- filter mappings:
- 0 [1, 1, 2, 1, 3, 1]
- filter predicates:
- 0 {(value = 50)} {(value = 60)} {(value = 40)}
- 1
- 2
- 3
- keys:
- 0 key (type: int)
- 1 key (type: int)
- 2 key (type: int)
- 3 key (type: int)
- outputColumnNames: _col0, _col1, _col5, _col6, _col10, _col11, _col15, _col16
- input vertices:
- 1 Map 2
- 2 Map 3
- 3 Map 4
- Position of Big Table: 0
- Statistics: Num rows: 9 Data size: 59 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: int), _col1 (type: int), _col5 (type: int), _col6 (type: int), _col10 (type: int), _col11 (type: int), _col15 (type: int), _col16 (type: int)
- outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7
- Statistics: Num rows: 9 Data size: 59 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- GlobalTableId: 0
-#### A masked pattern was here ####
- NumFilesPerFileSink: 1
- Statistics: Num rows: 9 Data size: 59 Basic stats: COMPLETE Column stats: NONE
-#### A masked pattern was here ####
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- properties:
- columns _col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7
- columns.types int:int:int:int:int:int:int:int
- escape.delim \
- hive.serialization.extend.nesting.levels true
- serialization.format 1
- serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- TotalFiles: 1
- GatherStats: false
- MultiFileSpray: false
- Local Work:
- Map Reduce Local Work
+ Filter Operator
+ isSamplingPred: false
+ predicate: (value = 40) (type: boolean)
+ Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: key (type: int)
+ sort order: +
+ Map-reduce partition columns: key (type: int)
+ Statistics: Num rows: 1 Data size: 6 Basic stats: COMPLETE Column stats: NONE
+ tag: 3
+ value expressions: value (type: int)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -2090,7 +1930,54 @@ STAGE PLANS:
name: default.a
name: default.a
Truncated Path -> Alias:
- /a [a]
+ /a [d]
+ Reducer 2
+ Needs Tagging: true
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Left Outer Join0 to 1
+ Left Outer Join0 to 2
+ Left Outer Join0 to 3
+ condition expressions:
+ 0 {KEY.reducesinkkey0} {VALUE._col0}
+ 1 {KEY.reducesinkkey0} {VALUE._col0}
+ 2 {KEY.reducesinkkey0} {VALUE._col0}
+ 3 {KEY.reducesinkkey0} {VALUE._col0}
+ filter mappings:
+ 0 [1, 1, 2, 1, 3, 1]
+ filter predicates:
+ 0 {(VALUE._col0 = 50)} {(VALUE._col0 = 60)} {(VALUE._col0 = 40)}
+ 1
+ 2
+ 3
+ outputColumnNames: _col0, _col1, _col5, _col6, _col10, _col11, _col15, _col16
+ Statistics: Num rows: 9 Data size: 59 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: int), _col1 (type: int), _col5 (type: int), _col6 (type: int), _col10 (type: int), _col11 (type: int), _col15 (type: int), _col16 (type: int)
+ outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7
+ Statistics: Num rows: 9 Data size: 59 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ GlobalTableId: 0
+#### A masked pattern was here ####
+ NumFilesPerFileSink: 1
+ Statistics: Num rows: 9 Data size: 59 Basic stats: COMPLETE Column stats: NONE
+#### A masked pattern was here ####
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ properties:
+ columns _col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7
+ columns.types int:int:int:int:int:int:int:int
+ escape.delim \
+ hive.serialization.extend.nesting.levels true
+ serialization.format 1
+ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ TotalFiles: 1
+ GatherStats: false
+ MultiFileSpray: false
Stage: Stage-0
Fetch Operator
Modified: hive/branches/spark/ql/src/test/results/clientpositive/spark/join_hive_626.q.out
URL: http://svn.apache.org/viewvc/hive/branches/spark/ql/src/test/results/clientpositive/spark/join_hive_626.q.out?rev=1642997&r1=1642996&r2=1642997&view=diff
==============================================================================
--- hive/branches/spark/ql/src/test/results/clientpositive/spark/join_hive_626.q.out (original)
+++ hive/branches/spark/ql/src/test/results/clientpositive/spark/join_hive_626.q.out Tue Dec 2 19:57:10 2014
@@ -65,16 +65,32 @@ select hive_foo.foo_name, hive_bar.bar_n
hive_bar.foo_id join hive_count on hive_count.bar_id = hive_bar.bar_id
POSTHOOK: type: QUERY
STAGE DEPENDENCIES:
- Stage-2 is a root stage
- Stage-1 depends on stages: Stage-2
+ Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
STAGE PLANS:
- Stage: Stage-2
+ Stage: Stage-1
Spark
+ Edges:
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 4 (PARTITION-LEVEL SORT, 1)
+ Reducer 3 <- Map 5 (PARTITION-LEVEL SORT, 1), Reducer 2 (PARTITION-LEVEL SORT, 1)
#### A masked pattern was here ####
Vertices:
- Map 2
+ Map 1
+ Map Operator Tree:
+ TableScan
+ alias: hive_foo
+ Statistics: Num rows: 0 Data size: 15 Basic stats: PARTIAL Column stats: NONE
+ Filter Operator
+ predicate: foo_id is not null (type: boolean)
+ Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
+ Reduce Output Operator
+ key expressions: foo_id (type: int)
+ sort order: +
+ Map-reduce partition columns: foo_id (type: int)
+ Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
+ value expressions: foo_name (type: string)
+ Map 4
Map Operator Tree:
TableScan
alias: hive_bar
@@ -82,16 +98,13 @@ STAGE PLANS:
Filter Operator
predicate: (foo_id is not null and bar_id is not null) (type: boolean)
Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0 {foo_name}
- 1 {bar_id} {bar_name}
- keys:
- 0 foo_id (type: int)
- 1 foo_id (type: int)
- Local Work:
- Map Reduce Local Work
- Map 3
+ Reduce Output Operator
+ key expressions: foo_id (type: int)
+ sort order: +
+ Map-reduce partition columns: foo_id (type: int)
+ Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
+ value expressions: bar_id (type: int), bar_name (type: string)
+ Map 5
Map Operator Tree:
TableScan
alias: hive_count
@@ -99,67 +112,49 @@ STAGE PLANS:
Filter Operator
predicate: bar_id is not null (type: boolean)
Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0 {_col1} {_col13}
- 1 {n}
- keys:
- 0 _col9 (type: int)
- 1 bar_id (type: int)
- Local Work:
- Map Reduce Local Work
-
- Stage: Stage-1
- Spark
-#### A masked pattern was here ####
- Vertices:
- Map 1
- Map Operator Tree:
- TableScan
- alias: hive_foo
- Statistics: Num rows: 0 Data size: 15 Basic stats: PARTIAL Column stats: NONE
- Filter Operator
- predicate: foo_id is not null (type: boolean)
- Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- condition expressions:
- 0 {foo_name}
- 1 {bar_id} {bar_name}
- keys:
- 0 foo_id (type: int)
- 1 foo_id (type: int)
- outputColumnNames: _col1, _col9, _col13
- input vertices:
- 1 Map 2
+ Reduce Output Operator
+ key expressions: bar_id (type: int)
+ sort order: +
+ Map-reduce partition columns: bar_id (type: int)
Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- condition expressions:
- 0 {_col1} {_col13}
- 1 {n}
- keys:
- 0 _col9 (type: int)
- 1 bar_id (type: int)
- outputColumnNames: _col1, _col13, _col22
- input vertices:
- 1 Map 3
- Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
- Select Operator
- expressions: _col1 (type: string), _col13 (type: string), _col22 (type: int)
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
- File Output Operator
- compressed: false
- Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- Local Work:
- Map Reduce Local Work
+ value expressions: n (type: int)
+ Reducer 2
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Inner Join 0 to 1
+ condition expressions:
+ 0 {VALUE._col0}
+ 1 {VALUE._col0} {VALUE._col3}
+ outputColumnNames: _col1, _col9, _col13
+ Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col9 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col9 (type: int)
+ Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
+ value expressions: _col1 (type: string), _col13 (type: string)
+ Reducer 3
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Inner Join 0 to 1
+ condition expressions:
+ 0 {VALUE._col1} {VALUE._col12}
+ 1 {VALUE._col0}
+ outputColumnNames: _col1, _col13, _col22
+ Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
+ Select Operator
+ expressions: _col1 (type: string), _col13 (type: string), _col22 (type: int)
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
+ File Output Operator
+ compressed: false
+ Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
Modified: hive/branches/spark/ql/src/test/results/clientpositive/spark/join_map_ppr.q.out
URL: http://svn.apache.org/viewvc/hive/branches/spark/ql/src/test/results/clientpositive/spark/join_map_ppr.q.out?rev=1642997&r1=1642996&r2=1642997&view=diff
==============================================================================
--- hive/branches/spark/ql/src/test/results/clientpositive/spark/join_map_ppr.q.out (original)
+++ hive/branches/spark/ql/src/test/results/clientpositive/spark/join_map_ppr.q.out Tue Dec 2 19:57:10 2014
@@ -104,14 +104,15 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-3 is a root stage
- Stage-1 depends on stages: Stage-3
+ Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
Stage-2 depends on stages: Stage-0
STAGE PLANS:
- Stage: Stage-3
+ Stage: Stage-1
Spark
+ Edges:
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 3 (PARTITION-LEVEL SORT, 1), Map 4 (PARTITION-LEVEL SORT, 1)
#### A masked pattern was here ####
Vertices:
Map 1
@@ -124,18 +125,13 @@ STAGE PLANS:
isSamplingPred: false
predicate: key is not null (type: boolean)
Statistics: Num rows: 13 Data size: 99 Basic stats: COMPLETE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0
- 1 {value}
- 2 {value}
- keys:
- 0 key (type: string)
- 1 key (type: string)
- 2 key (type: string)
- Position of Big Table: 1
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: key (type: string)
+ sort order: +
+ Map-reduce partition columns: key (type: string)
+ Statistics: Num rows: 13 Data size: 99 Basic stats: COMPLETE Column stats: NONE
+ tag: 0
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -188,36 +184,29 @@ STAGE PLANS:
Map 3
Map Operator Tree:
TableScan
- alias: z
+ alias: y
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: key is not null (type: boolean)
Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0 {key}
- 1 {value}
- 2 {value}
- keys:
- 0 key (type: string)
- 1 key (type: string)
- 2 key (type: string)
- Position of Big Table: 1
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: key (type: string)
+ sort order: +
+ Map-reduce partition columns: key (type: string)
+ Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE
+ tag: 1
+ value expressions: value (type: string)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: hr=11
+ base file name: src
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- partition values:
- ds 2008-04-08
- hr 11
properties:
COLUMN_STATS_ACCURATE true
bucket_count -1
@@ -225,13 +214,11 @@ STAGE PLANS:
columns.comments default default
columns.types string:string
#### A masked pattern was here ####
- name default.srcpart
+ name default.src
numFiles 1
numRows 500
- partition_columns ds/hr
- partition_columns.types string:string
rawDataSize 5312
- serialization.ddl struct srcpart { string key, string value}
+ serialization.ddl struct src { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 5812
@@ -241,96 +228,55 @@ STAGE PLANS:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
+ COLUMN_STATS_ACCURATE true
bucket_count -1
columns key,value
columns.comments default default
columns.types string:string
#### A masked pattern was here ####
- name default.srcpart
- partition_columns ds/hr
- partition_columns.types string:string
- serialization.ddl struct srcpart { string key, string value}
+ name default.src
+ numFiles 1
+ numRows 500
+ rawDataSize 5312
+ serialization.ddl struct src { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ totalSize 5812
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.srcpart
- name: default.srcpart
+ name: default.src
+ name: default.src
Truncated Path -> Alias:
- /srcpart/ds=2008-04-08/hr=11 [z]
-
- Stage: Stage-1
- Spark
-#### A masked pattern was here ####
- Vertices:
- Map 2
+ /src [y]
+ Map 4
Map Operator Tree:
TableScan
- alias: y
+ alias: z
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: key is not null (type: boolean)
Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- Inner Join 0 to 2
- condition expressions:
- 0 {key}
- 1 {value}
- 2 {value}
- keys:
- 0 key (type: string)
- 1 key (type: string)
- 2 key (type: string)
- outputColumnNames: _col0, _col6, _col11
- input vertices:
- 0 Map 1
- 2 Map 3
- Position of Big Table: 1
- Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: string), _col11 (type: string), _col6 (type: string)
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- GlobalTableId: 1
-#### A masked pattern was here ####
- NumFilesPerFileSink: 1
- Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
-#### A masked pattern was here ####
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- properties:
- bucket_count -1
- columns key,value,val2
- columns.comments
- columns.types string:string:string
-#### A masked pattern was here ####
- name default.dest_j1
- serialization.ddl struct dest_j1 { string key, string value, string val2}
- serialization.format 1
- serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
-#### A masked pattern was here ####
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.dest_j1
- TotalFiles: 1
- GatherStats: true
- MultiFileSpray: false
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: key (type: string)
+ sort order: +
+ Map-reduce partition columns: key (type: string)
+ Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE
+ tag: 2
+ value expressions: value (type: string)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: src
+ base file name: hr=11
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ partition values:
+ ds 2008-04-08
+ hr 11
properties:
COLUMN_STATS_ACCURATE true
bucket_count -1
@@ -338,11 +284,13 @@ STAGE PLANS:
columns.comments default default
columns.types string:string
#### A masked pattern was here ####
- name default.src
+ name default.srcpart
numFiles 1
numRows 500
+ partition_columns ds/hr
+ partition_columns.types string:string
rawDataSize 5312
- serialization.ddl struct src { string key, string value}
+ serialization.ddl struct srcpart { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 5812
@@ -352,26 +300,66 @@ STAGE PLANS:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
- COLUMN_STATS_ACCURATE true
bucket_count -1
columns key,value
columns.comments default default
columns.types string:string
#### A masked pattern was here ####
- name default.src
- numFiles 1
- numRows 500
- rawDataSize 5312
- serialization.ddl struct src { string key, string value}
+ name default.srcpart
+ partition_columns ds/hr
+ partition_columns.types string:string
+ serialization.ddl struct srcpart { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- totalSize 5812
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.src
- name: default.src
+ name: default.srcpart
+ name: default.srcpart
Truncated Path -> Alias:
- /src [y]
+ /srcpart/ds=2008-04-08/hr=11 [z]
+ Reducer 2
+ Needs Tagging: true
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Inner Join 0 to 1
+ Inner Join 0 to 2
+ condition expressions:
+ 0 {KEY.reducesinkkey0}
+ 1 {VALUE._col0}
+ 2 {VALUE._col0}
+ outputColumnNames: _col0, _col6, _col11
+ Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: string), _col11 (type: string), _col6 (type: string)
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ GlobalTableId: 1
+#### A masked pattern was here ####
+ NumFilesPerFileSink: 1
+ Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
+#### A masked pattern was here ####
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ properties:
+ bucket_count -1
+ columns key,value,val2
+ columns.comments
+ columns.types string:string:string
+#### A masked pattern was here ####
+ name default.dest_j1
+ serialization.ddl struct dest_j1 { string key, string value, string val2}
+ serialization.format 1
+ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+#### A masked pattern was here ####
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ name: default.dest_j1
+ TotalFiles: 1
+ GatherStats: true
+ MultiFileSpray: false
Stage: Stage-0
Move Operator
@@ -669,14 +657,15 @@ TOK_QUERY
STAGE DEPENDENCIES:
- Stage-3 is a root stage
- Stage-1 depends on stages: Stage-3
+ Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
Stage-2 depends on stages: Stage-0
STAGE PLANS:
- Stage: Stage-3
+ Stage: Stage-1
Spark
+ Edges:
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 3), Map 3 (PARTITION-LEVEL SORT, 3), Map 4 (PARTITION-LEVEL SORT, 3)
#### A masked pattern was here ####
Vertices:
Map 1
@@ -689,18 +678,14 @@ STAGE PLANS:
isSamplingPred: false
predicate: UDFToDouble(key) is not null (type: boolean)
Statistics: Num rows: 13 Data size: 99 Basic stats: COMPLETE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0 {key}
- 1 {value}
- 2 {value}
- keys:
- 0 UDFToDouble(key) (type: double)
- 1 UDFToDouble(key) (type: double)
- 2 UDFToDouble(key) (type: double)
- Position of Big Table: 1
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: UDFToDouble(key) (type: double)
+ sort order: +
+ Map-reduce partition columns: UDFToDouble(key) (type: double)
+ Statistics: Num rows: 13 Data size: 99 Basic stats: COMPLETE Column stats: NONE
+ tag: 0
+ value expressions: key (type: string)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
@@ -753,50 +738,41 @@ STAGE PLANS:
Map 3
Map Operator Tree:
TableScan
- alias: z
+ alias: y
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: UDFToDouble(key) is not null (type: boolean)
Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE
- Spark HashTable Sink Operator
- condition expressions:
- 0 {key}
- 1 {value}
- 2 {value}
- keys:
- 0 UDFToDouble(key) (type: double)
- 1 UDFToDouble(key) (type: double)
- 2 UDFToDouble(key) (type: double)
- Position of Big Table: 1
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: UDFToDouble(key) (type: double)
+ sort order: +
+ Map-reduce partition columns: UDFToDouble(key) (type: double)
+ Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE
+ tag: 1
+ value expressions: value (type: string)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: hr=11
+ base file name: src_copy
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- partition values:
- ds 2008-04-08
- hr 11
properties:
COLUMN_STATS_ACCURATE true
bucket_count -1
columns key,value
- columns.comments default default
- columns.types string:string
+ columns.comments
+ columns.types int:string
#### A masked pattern was here ####
- name default.srcpart
+ name default.src_copy
numFiles 1
numRows 500
- partition_columns ds/hr
- partition_columns.types string:string
rawDataSize 5312
- serialization.ddl struct srcpart { string key, string value}
+ serialization.ddl struct src_copy { i32 key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 5812
@@ -806,113 +782,69 @@ STAGE PLANS:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
+ COLUMN_STATS_ACCURATE true
bucket_count -1
columns key,value
- columns.comments default default
- columns.types string:string
+ columns.comments
+ columns.types int:string
#### A masked pattern was here ####
- name default.srcpart
- partition_columns ds/hr
- partition_columns.types string:string
- serialization.ddl struct srcpart { string key, string value}
+ name default.src_copy
+ numFiles 1
+ numRows 500
+ rawDataSize 5312
+ serialization.ddl struct src_copy { i32 key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ totalSize 5812
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.srcpart
- name: default.srcpart
+ name: default.src_copy
+ name: default.src_copy
Truncated Path -> Alias:
- /srcpart/ds=2008-04-08/hr=11 [z]
-
- Stage: Stage-1
- Spark
-#### A masked pattern was here ####
- Vertices:
- Map 2
+ /src_copy [y]
+ Map 4
Map Operator Tree:
TableScan
- alias: y
+ alias: z
Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
GatherStats: false
Filter Operator
isSamplingPred: false
predicate: UDFToDouble(key) is not null (type: boolean)
Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- Inner Join 0 to 2
- condition expressions:
- 0 {key}
- 1 {value}
- 2 {value}
- keys:
- 0 UDFToDouble(key) (type: double)
- 1 UDFToDouble(key) (type: double)
- 2 UDFToDouble(key) (type: double)
- outputColumnNames: _col0, _col6, _col11
- input vertices:
- 0 Map 1
- 2 Map 3
- Position of Big Table: 1
- Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: string), _col11 (type: string), _col6 (type: string)
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
- File Output Operator
- compressed: false
- GlobalTableId: 1
-#### A masked pattern was here ####
- NumFilesPerFileSink: 1
- Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
-#### A masked pattern was here ####
- table:
- input format: org.apache.hadoop.mapred.TextInputFormat
- output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
- properties:
- COLUMN_STATS_ACCURATE true
- bucket_count -1
- columns key,value,val2
- columns.comments
- columns.types string:string:string
-#### A masked pattern was here ####
- name default.dest_j1
- numFiles 1
- numRows 107
- rawDataSize 2018
- serialization.ddl struct dest_j1 { string key, string value, string val2}
- serialization.format 1
- serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- totalSize 2125
-#### A masked pattern was here ####
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.dest_j1
- TotalFiles: 1
- GatherStats: true
- MultiFileSpray: false
- Local Work:
- Map Reduce Local Work
+ Reduce Output Operator
+ key expressions: UDFToDouble(key) (type: double)
+ sort order: +
+ Map-reduce partition columns: UDFToDouble(key) (type: double)
+ Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE
+ tag: 2
+ value expressions: value (type: string)
+ auto parallelism: false
Path -> Alias:
#### A masked pattern was here ####
Path -> Partition:
#### A masked pattern was here ####
Partition
- base file name: src_copy
+ base file name: hr=11
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ partition values:
+ ds 2008-04-08
+ hr 11
properties:
COLUMN_STATS_ACCURATE true
bucket_count -1
columns key,value
- columns.comments
- columns.types int:string
+ columns.comments default default
+ columns.types string:string
#### A masked pattern was here ####
- name default.src_copy
+ name default.srcpart
numFiles 1
numRows 500
+ partition_columns ds/hr
+ partition_columns.types string:string
rawDataSize 5312
- serialization.ddl struct src_copy { i32 key, string value}
+ serialization.ddl struct srcpart { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
totalSize 5812
@@ -922,26 +854,71 @@ STAGE PLANS:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
- COLUMN_STATS_ACCURATE true
bucket_count -1
columns key,value
- columns.comments
- columns.types int:string
+ columns.comments default default
+ columns.types string:string
#### A masked pattern was here ####
- name default.src_copy
- numFiles 1
- numRows 500
- rawDataSize 5312
- serialization.ddl struct src_copy { i32 key, string value}
+ name default.srcpart
+ partition_columns ds/hr
+ partition_columns.types string:string
+ serialization.ddl struct srcpart { string key, string value}
serialization.format 1
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- totalSize 5812
#### A masked pattern was here ####
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- name: default.src_copy
- name: default.src_copy
+ name: default.srcpart
+ name: default.srcpart
Truncated Path -> Alias:
- /src_copy [y]
+ /srcpart/ds=2008-04-08/hr=11 [z]
+ Reducer 2
+ Needs Tagging: true
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Inner Join 0 to 1
+ Inner Join 0 to 2
+ condition expressions:
+ 0 {VALUE._col0}
+ 1 {VALUE._col1}
+ 2 {VALUE._col1}
+ outputColumnNames: _col0, _col6, _col11
+ Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: _col0 (type: string), _col11 (type: string), _col6 (type: string)
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
+ File Output Operator
+ compressed: false
+ GlobalTableId: 1
+#### A masked pattern was here ####
+ NumFilesPerFileSink: 1
+ Statistics: Num rows: 550 Data size: 5843 Basic stats: COMPLETE Column stats: NONE
+#### A masked pattern was here ####
+ table:
+ input format: org.apache.hadoop.mapred.TextInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+ properties:
+ COLUMN_STATS_ACCURATE true
+ bucket_count -1
+ columns key,value,val2
+ columns.comments
+ columns.types string:string:string
+#### A masked pattern was here ####
+ name default.dest_j1
+ numFiles 1
+ numRows 107
+ rawDataSize 2018
+ serialization.ddl struct dest_j1 { string key, string value, string val2}
+ serialization.format 1
+ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ totalSize 2125
+#### A masked pattern was here ####
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ name: default.dest_j1
+ TotalFiles: 1
+ GatherStats: true
+ MultiFileSpray: false
Stage: Stage-0
Move Operator