Posted to dev@hive.apache.org by "Namit Jain (JIRA)" <ji...@apache.org> on 2013/01/21 06:06:12 UTC
[jira] [Created] (HIVE-3922) repeated columns in join keys and join values
Namit Jain created HIVE-3922:
--------------------------------
Summary: repeated columns in join keys and join values
Key: HIVE-3922
URL: https://issues.apache.org/jira/browse/HIVE-3922
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Namit Jain
A simple query like:
> explain select a.*, b.* from src a join src b on a.key = b.key;
OK
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME src) a) (TOK_TABREF (TOK_TABNAME src) b) (= (. (TOK_TABLE_OR_COL a) key) (. (TOK_TABLE_OR_COL b) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_ALLCOLREF (TOK_TABNAME a))) (TOK_SELEXPR (TOK_ALLCOLREF (TOK_TABNAME b))))))
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage
STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        a
          TableScan
            alias: a
            Reduce Output Operator
              key expressions:
                expr: key
                type: string
              sort order: +
              Map-reduce partition columns:
                expr: key
                type: string
              tag: 0
              value expressions:
                expr: key
                type: string
                expr: value
                type: string
        b
          TableScan
            alias: b
            Reduce Output Operator
              key expressions:
                expr: key
                type: string
              sort order: +
              Map-reduce partition columns:
                expr: key
                type: string
              tag: 1
              value expressions:
                expr: key
                type: string
                expr: value
                type: string
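shows that a.key and b.key are each transferred twice across the map-reduce
boundary: once in the key expressions and again in the value expressions of
the Reduce Output Operator. This duplication should be eliminated.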
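To make the proposal concrete, here is a minimal Python sketch of the idea, not Hive's actual implementation. It assumes a Reduce Output Operator can be modelled as two lists of column names; the function names prune_duplicate_values and rebuild_row and the sample row values are hypothetical and exist only for illustration.

```python
# Hypothetical model of a Reduce Output Operator's column lists.
# These helpers are illustrative only and are not part of Hive's API.

def prune_duplicate_values(key_exprs, value_exprs):
    """Return the value expressions that are not already shipped as join keys."""
    key_set = set(key_exprs)
    return [expr for expr in value_exprs if expr not in key_set]


def rebuild_row(key_exprs, key_vals, pruned_value_exprs, pruned_vals, output_cols):
    """Reassemble the full row on the reduce side from the key and value parts."""
    lookup = dict(zip(key_exprs, key_vals))
    lookup.update(zip(pruned_value_exprs, pruned_vals))
    return [lookup[col] for col in output_cols]


# Column lists taken from the plan above for alias a (alias b is identical).
key_exprs = ["key"]
value_exprs = ["key", "value"]

# Map side: only "value" still needs to travel in the value part.
pruned = prune_duplicate_values(key_exprs, value_exprs)

# Reduce side: the original column order is recovered from both parts.
# The row contents here are made-up sample data.
row = rebuild_row(key_exprs, ["238"], pruned, ["val_238"], value_exprs)
```

In this sketch the reduce side reads the join key back from the key part of the record, so the value part no longer needs to carry a second copy of it.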