Posted to dev@hive.apache.org by "Szehon Ho (JIRA)" <ji...@apache.org> on 2014/11/12 19:55:35 UTC

[jira] [Created] (HIVE-8842) auto_join2.q produces incorrect tree [Spark Branch]

Szehon Ho created HIVE-8842:
-------------------------------

             Summary: auto_join2.q produces incorrect tree [Spark Branch]
                 Key: HIVE-8842
                 URL: https://issues.apache.org/jira/browse/HIVE-8842
             Project: Hive
          Issue Type: Sub-task
            Reporter: Szehon Ho


With SparkMapJoinResolver and SparkReduceSinkMapJoinProc enabled, I see the following:

{noformat}
explain select * from src src1 JOIN src src2 ON (src1.key = src2.key) JOIN src src3 ON (src1.key + src2.key = src3.key);
{noformat}

produces too many stages (six) and too many HashTableSink operators:
{noformat}
STAGE DEPENDENCIES:
  Stage-5 is a root stage
  Stage-4 depends on stages: Stage-5
  Stage-0 depends on stages: Stage-4

STAGE PLANS:
  Stage: Stage-5
    Map Reduce Local Work
      Alias -> Map Local Tables:
        src1 
          Fetch Operator
            limit: -1
        src2 
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        src1 
          TableScan
            alias: src1
            Statistics: Num rows: 29 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: key is not null (type: boolean)
              Statistics: Num rows: 15 Data size: 3006 Basic stats: COMPLETE Column stats: NONE
              HashTable Sink Operator
                condition expressions:
                  0 {value}
                  1 {value}
                  2 {key} {value}
                keys:
                  0 key (type: string)
                  1 key (type: string)
                  2 key (type: string)
        src2 
          TableScan
            alias: src2
            Statistics: Num rows: 29 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: key is not null (type: boolean)
              Statistics: Num rows: 15 Data size: 3006 Basic stats: COMPLETE Column stats: NONE
              HashTable Sink Operator
                condition expressions:
                  0 {value}
                  1 {value}
                  2 {key} {value}
                keys:
                  0 key (type: string)
                  1 key (type: string)
                  2 key (type: string)

  Stage: Stage-4
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: src3
            Statistics: Num rows: 29 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: key is not null (type: boolean)
              Statistics: Num rows: 15 Data size: 3006 Basic stats: COMPLETE Column stats: NONE
              Map Join Operator
                condition map:
                     Inner Join 0 to 1
                     Inner Join 1 to 2
                condition expressions:
                  0 {key} {value}
                  1 {key} {value}
                  2 {key} {value}
                keys:
                  0 key (type: string)
                  1 key (type: string)
                  2 key (type: string)
                outputColumnNames: _col0, _col1, _col5, _col6, _col10, _col11
                Statistics: Num rows: 33 Data size: 6613 Basic stats: COMPLETE Column stats: NONE
                Select Operator
                  expressions: _col0 (type: string), _col1 (type: string), _col5 (type: string), _col6 (type: string), _col10 (type: string), _col11 (type: string)
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
                  Statistics: Num rows: 33 Data size: 6613 Basic stats: COMPLETE Column stats: NONE
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 33 Data size: 6613 Basic stats: COMPLETE Column stats: NONE
                    table:
                        input format: org.apache.hadoop.mapred.TextInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      Local Work:
        Map Reduce Local Work

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink


{noformat}
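One rough way to check the stage and HashTableSink counts in a plan like the one above is to scan the EXPLAIN text. The following is a minimal sketch, not part of Hive: the function name summarize_explain is hypothetical, and it assumes stage headers have the form "Stage: Stage-N" and that operator lines end in the word "Operator" (so TableScan and ListSink are not counted).

```python
import re
from collections import Counter


def summarize_explain(plan_text):
    """Return (number of distinct stages, Counter of operator names).

    Hypothetical helper: it relies only on the textual layout of
    EXPLAIN output, not on any Hive API.
    """
    stages = set()
    operators = Counter()
    for line in plan_text.splitlines():
        stripped = line.strip()
        # Stage plan headers look like "Stage: Stage-5".
        match = re.match(r"Stage: (Stage-\d+)$", stripped)
        if match:
            stages.add(match.group(1))
        # Operator lines end in "Operator", e.g. "HashTable Sink Operator".
        # Headers such as "Map Operator Tree:" end in a colon and are skipped.
        elif stripped.endswith(" Operator"):
            operators[stripped] += 1
    return len(stages), operators
```

Saving the EXPLAIN output to a file and calling summarize_explain(open(path).read()) returns the stage count plus a per-operator tally, which makes it easier to compare plans before and after a change to the resolver.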



