You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Namit Jain (JIRA)" <ji...@apache.org> on 2013/01/21 06:06:12 UTC

[jira] [Created] (HIVE-3922) repeated columns in join keys and join values

Namit Jain created HIVE-3922:
--------------------------------

             Summary: repeated columns in join keys and join values
                 Key: HIVE-3922
                 URL: https://issues.apache.org/jira/browse/HIVE-3922
             Project: Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Namit Jain


A simple query like:


    > explain select a.*, b.* from src a join src b on a.key = b.key;         
OK
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src) a) (TOK_TABREF (TOK_TABNAME src) b) (= (. (TOK_TABLE_OR_COL a) key) (. (TOK_TABLE_OR_COL b) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_ALLCOLREF (TOK_TABNAME a))) (TOK_SELEXPR (TOK_ALLCOLREF (TOK_TABNAME b))))))

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        a 
          TableScan
            alias: a
            Reduce Output Operator
              key expressions:
                    expr: key
                    type: string
              sort order: +
              Map-reduce partition columns:
                    expr: key
                    type: string
              tag: 0
              value expressions:
                    expr: key
                    type: string
                    expr: value
                    type: string
        b 
          TableScan
            alias: b
            Reduce Output Operator
              key expressions:
                    expr: key
                    type: string
              sort order: +
              Map-reduce partition columns:
                    expr: key
                    type: string
              tag: 1
              value expressions:
                    expr: key
                    type: string
                    expr: value
                    type: string


shows that a.key and b.key are transferred twice across the map-reduce
boundaries. This should be done away with.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira