Posted to dev@hive.apache.org by "Szehon Ho (JIRA)" <ji...@apache.org> on 2014/11/21 02:04:34 UTC

[jira] [Commented] (HIVE-8908) Investigate test failure on join34.q [Spark Branch]

    [ https://issues.apache.org/jira/browse/HIVE-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220341#comment-14220341 ] 

Szehon Ho commented on HIVE-8908:
---------------------------------

+1 pending tests.  As we discussed, let's file a follow-up JIRA to track the fix, since it involves union operator processing and is more complex.

> Investigate test failure on join34.q [Spark Branch]
> ---------------------------------------------------
>
>                 Key: HIVE-8908
>                 URL: https://issues.apache.org/jira/browse/HIVE-8908
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Chao
>            Assignee: Chao
>         Attachments: HIVE-8908.1-spark.patch
>
>
> For this query, the plan doesn't look correct:
> {noformat}
> OK
> STAGE DEPENDENCIES:
>   Stage-4 is a root stage
>   Stage-1 depends on stages: Stage-5, Stage-4
>   Stage-2 depends on stages: Stage-1
>   Stage-0 depends on stages: Stage-2
>   Stage-3 depends on stages: Stage-0
>   Stage-5 is a root stage
> STAGE PLANS:
>   Stage: Stage-4
>     Spark
>       DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:6
>       Vertices:
>         Map 4 
>             Map Operator Tree:
>                 TableScan
>                   alias: x
>                   Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
>                   Filter Operator
>                     predicate: key is not null (type: boolean)
>                     Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
>                     Spark HashTable Sink Operator
>                       condition expressions:
>                         0 {_col1}
>                         1 {value}
>                       keys:
>                         0 _col0 (type: string)
>                         1 key (type: string)
>                     Reduce Output Operator
>                       key expressions: key (type: string)
>                       sort order: +
>                       Map-reduce partition columns: key (type: string)
>                       Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
>                       value expressions: value (type: string)
>             Local Work:
>               Map Reduce Local Work
>   Stage: Stage-1
>     Spark
>       Edges:
>         Union 2 <- Map 1 (NONE, 0), Map 3 (NONE, 0)
>       DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:4
>       Vertices:
>         Map 1 
>             Map Operator Tree:
>                 TableScan
>                   alias: x
>                   Filter Operator
>                     predicate: (key < 20) (type: boolean)
>                     Select Operator
>                       expressions: key (type: string), value (type: string)
>                       outputColumnNames: _col0, _col1
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         condition expressions:
>                           0 {_col1}
>                           1 {key} {value}
>                         keys:
>                           0 _col0 (type: string)
>                           1 key (type: string)
>                         outputColumnNames: _col1, _col2, _col3
>                         input vertices:
>                           1 Map 4
>                         Select Operator
>                           expressions: _col2 (type: string), _col3 (type: string), _col1 (type: string)
>                           outputColumnNames: _col0, _col1, _col2
>                           File Output Operator
>                             compressed: false
>                             table:
>                                 input format: org.apache.hadoop.mapred.TextInputFormat
>                                 output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                                 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                                 name: default.dest_j1
>             Local Work:
>               Map Reduce Local Work
>         Map 3 
>             Map Operator Tree:
>                 TableScan
>                   alias: x1
>                   Filter Operator
>                     predicate: (key > 100) (type: boolean)
>                     Select Operator
>                       expressions: key (type: string), value (type: string)
>                       outputColumnNames: _col0, _col1
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         condition expressions:
>                           0 {_col1}
>                           1 {key} {value}
>                         keys:
>                           0 _col0 (type: string)
>                           1 key (type: string)
>                         outputColumnNames: _col1, _col2, _col3
>                         input vertices:
>                           1 Map 4
>                         Select Operator
>                           expressions: _col2 (type: string), _col3 (type: string), _col1 (type: string)
>                           outputColumnNames: _col0, _col1, _col2
>                           File Output Operator
>                             compressed: false
>                             table:
>                                 input format: org.apache.hadoop.mapred.TextInputFormat
>                                 output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                                 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                                 name: default.dest_j1
>             Local Work:
>               Map Reduce Local Work
>         Union 2 
>             Vertex: Union 2
>   Stage: Stage-2
>     Dependency Collection
>   Stage: Stage-0
>     Move Operator
>       tables:
>           replace: true
>           table:
>               input format: org.apache.hadoop.mapred.TextInputFormat
>               output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>               serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>               name: default.dest_j1
>   Stage: Stage-3
>     Stats-Aggr Operator
>   Stage: Stage-5
>     Spark
>       DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:5
>       Vertices:
>         Map 4 
>             Map Operator Tree:
>                 TableScan
>                   alias: x
>                   Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
>                   Filter Operator
>                     predicate: key is not null (type: boolean)
>                     Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
>                     Spark HashTable Sink Operator
>                       condition expressions:
>                         0 {_col1}
>                         1 {value}
>                       keys:
>                         0 _col0 (type: string)
>                         1 key (type: string)
>                     Reduce Output Operator
>                       key expressions: key (type: string)
>                       sort order: +
>                       Map-reduce partition columns: key (type: string)
>                       Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
>                       value expressions: value (type: string)
>             Local Work:
>               Map Reduce Local Work
> Time taken: 0.127 seconds, Fetched: 156 row(s)
> {noformat}
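> Note that Stage-4 and Stage-5 are identical apart from the DagName suffix. Also, in Stage-4 a Reduce Output (RS) operator appears in parallel with the Spark HashTable Sink (HTS) operator, under the same Filter Operator, which is strange.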
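
One way to check the "identical stages" observation mechanically is sketched below. It is only an illustration, not part of Hive or of the attached patch: the helper names split_stages and duplicate_stages are hypothetical. It assumes the EXPLAIN output has been saved as plain text with the "> " quote prefixes removed, and that DagName lines should be ignored because their trailing counter differs per stage.

```python
import re


def split_stages(explain_text):
    """Map each stage name to the lines of its plan body.

    Assumes the layout shown in the EXPLAIN output: a "STAGE PLANS:" header,
    then indented "Stage: Stage-N" lines, each followed by an indented body.
    """
    stages = {}
    current = None
    in_plans = False
    for raw in explain_text.splitlines():
        line = raw.rstrip()  # vertex names such as "Map 4 " carry a trailing space
        if line.strip() == "STAGE PLANS:":
            in_plans = True
            continue
        if not in_plans or not line:
            continue
        if not line[0].isspace():
            # An unindented line (e.g. "Time taken: ...") ends the plan section.
            current = None
            continue
        match = re.match(r"\s*Stage: (Stage-\d+)$", line)
        if match:
            current = match.group(1)
            stages[current] = []
        elif current is not None:
            # Assumption: DagName only differs by a per-stage counter, so skip it.
            if line.strip().startswith("DagName:"):
                continue
            stages[current].append(line)
    return stages


def duplicate_stages(explain_text):
    """Return pairs of stage names whose plan bodies are line-for-line equal."""
    stages = split_stages(explain_text)
    names = sorted(stages)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if stages[a] == stages[b]
    ]


# A shortened, made-up plan in the same layout as the output above.
sample = """\
STAGE PLANS:
  Stage: Stage-4
    Spark
      DagName: example:6
      Vertices:
        Map 4
  Stage: Stage-2
    Dependency Collection
  Stage: Stage-5
    Spark
      DagName: example:5
      Vertices:
        Map 4
Time taken: 0.1 seconds
"""

print(duplicate_stages(sample))  # [('Stage-4', 'Stage-5')]
```

The comparison is purely textual, so it only flags stages whose printed plans match exactly. It says nothing about why the planner emitted the same work twice.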


