Posted to user@hive.apache.org by Ido Hadanny <id...@gmail.com> on 2011/07/20 15:09:31 UTC
Questions about hive explain-plan

Hey,
I'm trying to understand a Hive explain plan I generated and work out
what's what. I couldn't find any detailed documentation, as this page
<https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain>
is too high-level.

Basically, the query is an equi-join of 8 tables on the same 2 keys:

select * from c
 left outer join i1 on c.PURSWAY_ID1 = i1.PURSWAY_ID1 and c.PURSWAY_ID2 = i1.PURSWAY_ID2
 left outer join i2 on c.PURSWAY_ID1 = i2.PURSWAY_ID1 and c.PURSWAY_ID2 = i2.PURSWAY_ID2
...
 left outer join i7 on c.PURSWAY_ID1 = i7.PURSWAY_ID1 and c.PURSWAY_ID2 = i7.PURSWAY_ID2

The explain plan is at the end of this mail, with the parts I need
clarification on in red.
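
For reference, a plan of this kind is produced by prefixing a query with
EXPLAIN. The following is only a minimal sketch: it reuses the table and
column names from the query above but shows just the first join, since the
remaining joins are elided there. EXPLAIN EXTENDED is a variant that prints
additional detail about the operators.

```sql
-- Sketch only: first join from the query above, other joins left out.
EXPLAIN
SELECT *
FROM c
LEFT OUTER JOIN i1
  ON c.PURSWAY_ID1 = i1.PURSWAY_ID1
 AND c.PURSWAY_ID2 = i1.PURSWAY_ID2;

-- Same query, asking for the more verbose form of the plan.
EXPLAIN EXTENDED
SELECT *
FROM c
LEFT OUTER JOIN i1
  ON c.PURSWAY_ID1 = i1.PURSWAY_ID1
 AND c.PURSWAY_ID2 = i1.PURSWAY_ID2;
```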

Questions:
1. What is the meaning of the "Map Reduce" line at the beginning?
2. Is Select Operator a map-side-only operation? If so, why is the Reduce
Output Operator indented inside it?
3. What is the meaning of the "sort order: ++" line?
4. Join Operator: what is the meaning of "condition map" and "condition
expressions"?
5. What is the meaning of "handleSkewJoin: false"?
6. Where can I find the temporary output files of each map operation?

I'll be grateful for every bit of help :)

  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        c:ido_standard_connections
          TableScan
            alias: ido_standard_connections
            Select Operator
              expressions:
                    expr: pursway_id1
                    expr: pursway_id2
              outputColumnNames: _col0, _col1
              Reduce Output Operator
                key expressions:
                      expr: _col0
                      expr: _col1
                sort order: ++
                Map-reduce partition columns:
                      expr: _col0
                      expr: _col1
                tag: 0
                value expressions:
                      expr: _col0
                      expr: _col1
        i1
          TableScan
            alias: i1
            Reduce Output Operator
              key expressions:
                    expr: pursway_id1
                    expr: pursway_id2
              sort order: ++
              Map-reduce partition columns:
                    expr: pursway_id1
                    expr: pursway_id2
              tag: 3
              value expressions:
                    expr: ind_products_cnt
                    expr: ind_products_min_diff_days
                    expr: ind_products_expectancy
        ...
      Reduce Operator Tree:
        Join Operator
          condition map:
               Left Outer Join0 to 1
               Left Outer Join0 to 2
               ...
               Left Outer Join0 to 9
          condition expressions:
            0 {VALUE._col0} {VALUE._col1}
            1 {VALUE._col2}
            2 {VALUE._col2}
            3 {VALUE._col2} {VALUE._col3} {VALUE._col4}
            4 {VALUE._col2} {VALUE._col3} {VALUE._col4}
            5 {VALUE._col2} {VALUE._col3} {VALUE._col4}
            6 {VALUE._col2} {VALUE._col3} {VALUE._col4}
            7 {VALUE._col2} {VALUE._col3} {VALUE._col4}
            8 {VALUE._col2} {VALUE._col3} {VALUE._col4}
            9 {VALUE._col2} {VALUE._col3} {VALUE._col4}
          handleSkewJoin: false
          outputColumnNames: _col0, _col1, _col4, _col9 ... _col58
          Select Operator
            expressions:
                  expr: _col0
                  expr: _col1
                  ...
                  expr: _col58
            outputColumnNames: _col0, _col1, _col2 ... _col24
            File Output Operator
              compressed: false
              GlobalTableId: 0
              table:
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat