Posted to dev@hive.apache.org by "Gordon Wang (JIRA)" <ji...@apache.org> on 2014/10/27 09:04:33 UTC

[jira] [Commented] (HIVE-8439) multiple insert into the same table

    [ https://issues.apache.org/jira/browse/HIVE-8439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184938#comment-14184938 ] 

Gordon Wang commented on HIVE-8439:
-----------------------------------

Currently, the Hive semantic analyzer cannot handle multiple insert clauses correctly. When "INSERT INTO" and "INSERT OVERWRITE" are mixed on the same table, the semantic analyzer cannot tell which clause is the "OVERWRITE" one.

More information about the "overwrite" clause should be recorded in QueryBlock, so that the insert mode can be determined per clause rather than per table.
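
To illustrate the point, here is a minimal, self-contained sketch. It is not Hive code: the class and method names (InsertModeSketch, TableKeyedInfo, ClauseKeyedInfo, recordInsertInto, isReplace) and the partition-spec string key are assumptions made up for this example. It only models the behavior described in the issue (insert-into targets remembered by table name) next to one possible alternative that remembers them per destination clause.

```java
import java.util.HashSet;
import java.util.Set;

public class InsertModeSketch {

    // Models the behavior described in the issue: "insert into" targets
    // are remembered by table name only (hypothetical stand-in, not Hive code).
    static final class TableKeyedInfo {
        private final Set<String> insertIntoTables = new HashSet<>();

        void recordInsertInto(String db, String table) {
            insertIntoTables.add(db + "." + table);
        }

        // Every clause writing to a recorded table is treated as "insert into".
        boolean isReplace(String db, String table) {
            return !insertIntoTables.contains(db + "." + table);
        }
    }

    // Hypothetical alternative: remember "insert into" per destination,
    // here identified by table name plus a partition-spec string.
    static final class ClauseKeyedInfo {
        private final Set<String> insertIntoDests = new HashSet<>();

        private static String key(String db, String table, String partSpec) {
            return db + "." + table + "/" + partSpec;
        }

        void recordInsertInto(String db, String table, String partSpec) {
            insertIntoDests.add(key(db, table, partSpec));
        }

        boolean isReplace(String db, String table, String partSpec) {
            return !insertIntoDests.contains(key(db, table, partSpec));
        }
    }

    public static void main(String[] args) {
        // insert into table T2 partition (n = 1) ...
        // insert overwrite table T2 partition (n = 2) ...
        TableKeyedInfo byTable = new TableKeyedInfo();
        byTable.recordInsertInto("default", "t2");

        ClauseKeyedInfo byClause = new ClauseKeyedInfo();
        byClause.recordInsertInto("default", "t2", "n=1");

        // Table-keyed lookup cannot tell the two clauses apart.
        System.out.println("table-keyed, n=2 replace: "
            + byTable.isReplace("default", "t2"));
        // Clause-keyed lookup keeps the overwrite on partition n=2.
        System.out.println("clause-keyed, n=1 replace: "
            + byClause.isReplace("default", "t2", "n=1"));
        System.out.println("clause-keyed, n=2 replace: "
            + byClause.isReplace("default", "t2", "n=2"));
    }
}
```

With the table-keyed set, the overwrite clause on partition n=2 loses its replace flag as soon as any clause inserts into T2. A per-destination key is only one way to keep the distinction; the key would need to be chosen so that it is unique per insert clause.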

> multiple insert into the same table
> -----------------------------------
>
>                 Key: HIVE-8439
>                 URL: https://issues.apache.org/jira/browse/HIVE-8439
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.12.0, 0.13.0
>            Reporter: Gordon Wang
>
> When multiple inserts into the same table are put in one SQL statement, the Hive query plan analyzer fails to synthesize the right plan.
> Here are the steps to reproduce.
> {noformat}
> create table T1(i int, j int);
> create table T2(m int) partitioned by (n int);
> explain from T1
> insert into table T2 partition (n = 1)
>   select T1.i where T1.j = 1
> insert overwrite table T2 partition (n = 2)
>   select T1.i where T1.j = 2
>   ;
> {noformat}
> When there is an "insert into" clause in a multi-insert statement, an "insert overwrite" clause on the same table is treated as "insert into".
> I dug into the source code, and it looks like Hive does not support mixing "insert into" and "insert overwrite" for the same table in a multi-insert statement.
> Here are my findings.
> 1. In the semantic analyzer, when processing TOK_INSERT_INTO, the analyzer puts the table name into a set that contains the names of all "insert into" target tables.
> 2. When generating the file sink plan, the analyzer checks whether the table name is in that set; if it is, the replace flag is set to false. Here is the code snippet.
> {noformat}
>       // Create the work for moving the table
>       // NOTE: specify Dynamic partitions in dest_tab for WriteEntity
>       if (!isNonNativeTable) {
>         ltd = new LoadTableDesc(queryTmpdir, ctx.getExternalTmpFileURI(dest_path.toUri()),
>             table_desc, dpCtx);
>         ltd.setReplace(!qb.getParseInfo().isInsertIntoTable(dest_tab.getDbName(),
>             dest_tab.getTableName()));
>         ltd.setLbCtx(lbCtx);
>         if (holdDDLTime) {
>           LOG.info("this query will not update transient_lastDdlTime!");
>           ltd.setHoldDDLTime(true);
>         }
>         loadTableWork.add(ltd);
>       }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)