Posted to dev@hive.apache.org by "Johannes Alkjær (JIRA)" <ji...@apache.org> on 2013/10/01 00:00:27 UTC
[jira] [Commented] (HIVE-4598) Incorrect results when using subquery in multi table insert
[ https://issues.apache.org/jira/browse/HIVE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782314#comment-13782314 ]
Johannes Alkjær commented on HIVE-4598:
---------------------------------------
Adding an extra select block fixes the execution plan, though:
{code}
EXPLAIN
FROM (
  SELECT * FROM (
    FROM ( SELECT * FROM sample ) mapout
    REDUCE * USING 'cat' AS x,y
  ) reduced
) zz
insert overwrite local directory '/tmp/a' select * where x='a' or x='b'
insert overwrite local directory '/tmp/b' select * where x='c' or x='d';
{code}
{code}
ABSTRACT SYNTAX TREE:
(TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME sample))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)))) mapout)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TRANSFORM (TOK_EXPLIST TOK_ALLCOLREF) TOK_SERDE TOK_RECORDWRITER 'cat' TOK_SERDE TOK_RECORDREADER (TOK_ALIASLIST x y)))))) reduced)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)))) zz)) (TOK_INSERT (TOK_DESTINATION (TOK_LOCAL_DIR '/tmp/a')) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (or (= (TOK_TABLE_OR_COL x) 'a') (= (TOK_TABLE_OR_COL x) 'b')))) (TOK_INSERT (TOK_DESTINATION (TOK_LOCAL_DIR '/tmp/b')) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (or (= (TOK_TABLE_OR_COL x) 'c') (= (TOK_TABLE_OR_COL x) 'd')))))
STAGE DEPENDENCIES:
  Stage-2 is a root stage
  Stage-0 depends on stages: Stage-2
  Stage-1 depends on stages: Stage-2
STAGE PLANS:
  Stage: Stage-2
    Map Reduce
      Alias -> Map Operator Tree:
        zz:reduced:mapout:sample
          TableScan
            alias: sample
            Select Operator
              expressions:
                    expr: key
                    type: string
                    expr: val
                    type: string
              outputColumnNames: _col0, _col1
              Transform Operator
                command: cat
                output info:
                    input format: org.apache.hadoop.mapred.TextInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                Select Operator
                  expressions:
                        expr: _col0
                        type: string
                        expr: _col1
                        type: string
                  outputColumnNames: _col0, _col1
                  Filter Operator
                    predicate:
                        expr: ((_col0 = 'a') or (_col0 = 'b'))
                        type: boolean
                    Select Operator
                      expressions:
                            expr: _col0
                            type: string
                            expr: _col1
                            type: string
                      outputColumnNames: _col0, _col1
                      File Output Operator
                        compressed: false
                        GlobalTableId: 1
                        table:
                            input format: org.apache.hadoop.mapred.TextInputFormat
                            output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                  Filter Operator
                    predicate:
                        expr: ((_col0 = 'c') or (_col0 = 'd'))
                        type: boolean
                    Select Operator
                      expressions:
                            expr: _col0
                            type: string
                            expr: _col1
                            type: string
                      outputColumnNames: _col0, _col1
                      File Output Operator
                        compressed: false
                        GlobalTableId: 2
                        table:
                            input format: org.apache.hadoop.mapred.TextInputFormat
                            output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
  Stage: Stage-0
    Move Operator
      files:
          hdfs directory: false
          destination: /tmp/a
  Stage: Stage-1
    Move Operator
      files:
          hdfs directory: false
          destination: /tmp/b
{code}
{code}
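Applied to the shape of the query in the issue description, the same workaround might look roughly like the sketch below. This is untested and only illustrates the idea of wrapping the subquery in one more select block. The source table src, the columns key and val, and the aliases inner_q and wrapped are placeholders I made up; only the target table t and the partition column type come from the description.
{code}
-- Sketch only: src, key, val, inner_q and wrapped are hypothetical names.
FROM (
  SELECT * FROM (
    SELECT key, val, type FROM src   -- stands in for the original subquery <x>
  ) inner_q
) wrapped                            -- the extra select block
INSERT INTO TABLE t PARTITION (type='x')
  SELECT key, val WHERE type='x'
INSERT INTO TABLE t PARTITION (type='y')
  SELECT key, val WHERE type='y';
{code}
Running EXPLAIN on such a query and checking that each insert branch has its own Filter Operator, as in the plan above, would be the way to confirm whether the workaround applies there.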
> Incorrect results when using subquery in multi table insert
> -----------------------------------------------------------
>
> Key: HIVE-4598
> URL: https://issues.apache.org/jira/browse/HIVE-4598
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0, 0.11.0
> Reporter: Sebastian
>
> I'm using a multi table insert like this:
> FROM <x>
> INSERT INTO TABLE t PARTITION (type='x')
> SELECT * WHERE type='x'
> INSERT INTO TABLE t PARTITION (type='y')
> SELECT * WHERE type='y';
> Now when <x> is the name of a table, everything works as expected.
> However if I use a subquery as <x>, the query runs but it inserts all results from the subquery into each partition, as if there were no "WHERE" clauses in the selects.
--
This message was sent by Atlassian JIRA
(v6.1#6144)