Posted to dev@hive.apache.org by "Szehon Ho (JIRA)" <ji...@apache.org> on 2014/11/21 02:04:34 UTC
[jira] [Commented] (HIVE-8908) Investigate test failure on join34.q [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220341#comment-14220341 ]
Szehon Ho commented on HIVE-8908:
---------------------------------
+1 pending tests. As we discussed, let's file a follow-up JIRA to track the fix, since it involves union operator processing and is more complex.
> Investigate test failure on join34.q [Spark Branch]
> ---------------------------------------------------
>
> Key: HIVE-8908
> URL: https://issues.apache.org/jira/browse/HIVE-8908
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Affects Versions: spark-branch
> Reporter: Chao
> Assignee: Chao
> Attachments: HIVE-8908.1-spark.patch
>
>
> For this query, the plan doesn't look correct:
> {noformat}
> OK
> STAGE DEPENDENCIES:
> Stage-4 is a root stage
> Stage-1 depends on stages: Stage-5, Stage-4
> Stage-2 depends on stages: Stage-1
> Stage-0 depends on stages: Stage-2
> Stage-3 depends on stages: Stage-0
> Stage-5 is a root stage
> STAGE PLANS:
> Stage: Stage-4
> Spark
> DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:6
> Vertices:
> Map 4
> Map Operator Tree:
> TableScan
> alias: x
> Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
> Filter Operator
> predicate: key is not null (type: boolean)
> Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
> Spark HashTable Sink Operator
> condition expressions:
> 0 {_col1}
> 1 {value}
> keys:
> 0 _col0 (type: string)
> 1 key (type: string)
> Reduce Output Operator
> key expressions: key (type: string)
> sort order: +
> Map-reduce partition columns: key (type: string)
> Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
> value expressions: value (type: string)
> Local Work:
> Map Reduce Local Work
> Stage: Stage-1
> Spark
> Edges:
> Union 2 <- Map 1 (NONE, 0), Map 3 (NONE, 0)
> DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:4
> Vertices:
> Map 1
> Map Operator Tree:
> TableScan
> alias: x
> Filter Operator
> predicate: (key < 20) (type: boolean)
> Select Operator
> expressions: key (type: string), value (type: string)
> outputColumnNames: _col0, _col1
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col1}
> 1 {key} {value}
> keys:
> 0 _col0 (type: string)
> 1 key (type: string)
> outputColumnNames: _col1, _col2, _col3
> input vertices:
> 1 Map 4
> Select Operator
> expressions: _col2 (type: string), _col3 (type: string), _col1 (type: string)
> outputColumnNames: _col0, _col1, _col2
> File Output Operator
> compressed: false
> table:
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> name: default.dest_j1
> Local Work:
> Map Reduce Local Work
> Map 3
> Map Operator Tree:
> TableScan
> alias: x1
> Filter Operator
> predicate: (key > 100) (type: boolean)
> Select Operator
> expressions: key (type: string), value (type: string)
> outputColumnNames: _col0, _col1
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col1}
> 1 {key} {value}
> keys:
> 0 _col0 (type: string)
> 1 key (type: string)
> outputColumnNames: _col1, _col2, _col3
> input vertices:
> 1 Map 4
> Select Operator
> expressions: _col2 (type: string), _col3 (type: string), _col1 (type: string)
> outputColumnNames: _col0, _col1, _col2
> File Output Operator
> compressed: false
> table:
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> name: default.dest_j1
> Local Work:
> Map Reduce Local Work
> Union 2
> Vertex: Union 2
> Stage: Stage-2
> Dependency Collection
> Stage: Stage-0
> Move Operator
> tables:
> replace: true
> table:
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> name: default.dest_j1
> Stage: Stage-3
> Stats-Aggr Operator
> Stage: Stage-5
> Spark
> DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:5
> Vertices:
> Map 4
> Map Operator Tree:
> TableScan
> alias: x
> Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
> Filter Operator
> predicate: key is not null (type: boolean)
> Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
> Spark HashTable Sink Operator
> condition expressions:
> 0 {_col1}
> 1 {value}
> keys:
> 0 _col0 (type: string)
> 1 key (type: string)
> Reduce Output Operator
> key expressions: key (type: string)
> sort order: +
> Map-reduce partition columns: key (type: string)
> Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
> value expressions: value (type: string)
> Local Work:
> Map Reduce Local Work
> Time taken: 0.127 seconds, Fetched: 156 row(s)
> {noformat}
> Note that Stage-4 and Stage-5 are identical apart from the DagName suffix. Also, in Stage-4 a Reduce Output (RS) operator sits in parallel with the Spark HashTable Sink (HTS) operator, which is strange.
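The duplicate-stage observation can also be checked mechanically. The following is a minimal Python sketch, not part of the patch or of Hive itself: it assumes the EXPLAIN output is available as plain text, and the helper names (split_stage_plans, find_duplicate_stages) are hypothetical. It groups stages whose plan bodies match once the DagName line, which differs per stage, is ignored.

```python
import re
from collections import defaultdict

# Matches a stage header line in the STAGE PLANS section, e.g. "Stage: Stage-4".
STAGE_HEADER = re.compile(r"^Stage: (Stage-\d+)$")


def split_stage_plans(explain_text):
    """Map each stage name to the list of plan lines under its header."""
    plans = {}
    current = None
    for raw in explain_text.splitlines():
        line = raw.strip()
        # Tolerate mail-style quoting ("> ") in front of each line.
        if line.startswith(">"):
            line = line[1:].strip()
        # The CLI summary line marks the end of the plan.
        if line.startswith("Time taken:"):
            break
        match = STAGE_HEADER.match(line)
        if match:
            current = match.group(1)
            plans[current] = []
        elif current is not None and line:
            plans[current].append(line)
    return plans


def find_duplicate_stages(explain_text, ignore_prefixes=("DagName:",)):
    """Return groups of stage names whose plan bodies are identical.

    Lines starting with any of ignore_prefixes are skipped before comparing,
    because the DagName carries a per-stage suffix.
    """
    groups = defaultdict(list)
    for name, lines in split_stage_plans(explain_text).items():
        key = tuple(l for l in lines if not l.startswith(ignore_prefixes))
        groups[key].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

Calling find_duplicate_stages on the plan text above should report Stage-4 and Stage-5 as one group. The comparison is purely textual, so it only flags stages that print identically; it says nothing about why the planner emitted the work twice.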
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)