Posted to dev@hive.apache.org by "Gordon Wang (JIRA)" <ji...@apache.org> on 2014/10/13 04:10:33 UTC

[jira] [Created] (HIVE-8439) multiple insert into the same table

Gordon Wang created HIVE-8439:
---------------------------------

             Summary: multiple insert into the same table
                 Key: HIVE-8439
                 URL: https://issues.apache.org/jira/browse/HIVE-8439
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 0.13.0, 0.12.0
            Reporter: Gordon Wang


When one SQL statement contains multiple inserts into the same table, the Hive query plan analyzer fails to synthesize the right plan.

Here are the steps to reproduce.
{noformat}
create table T1(i int, j int);
create table T2(m int) partitioned by (n int);
explain from T1
insert into table T2 partition (n = 1)
  select T1.i where T1.j = 1
insert overwrite table T2 partition (n = 2)
  select T1.i where T1.j = 2
  ;
{noformat}
When the multi-insert statement contains an "insert into" clause, the "insert overwrite" clause for the same table is also treated as "insert into".

I dug into the source code, and it looks like Hive does not support mixing "insert into" and "insert overwrite" for the same table in multiple insert clauses.

Here are my findings.
1. In the semantic analyzer, when processing TOK_INSERT_INTO, the analyzer puts the table name into a set that holds the names of all "insert into" tables.
2. When generating the file sink plan, the analyzer checks whether the table name is in that set; if it is, the replace flag is set to false. Because the lookup is by table name rather than by individual insert clause, the "insert overwrite" clause on the same table also gets replace = false. Here is the code snippet.
{noformat}
      // Create the work for moving the table
      // NOTE: specify Dynamic partitions in dest_tab for WriteEntity
      if (!isNonNativeTable) {
        ltd = new LoadTableDesc(queryTmpdir, ctx.getExternalTmpFileURI(dest_path.toUri()),
            table_desc, dpCtx);
        ltd.setReplace(!qb.getParseInfo().isInsertIntoTable(dest_tab.getDbName(),
            dest_tab.getTableName()));
        ltd.setLbCtx(lbCtx);

        if (holdDDLTime) {
          LOG.info("this query will not update transient_lastDdlTime!");
          ltd.setHoldDDLTime(true);
        }
        loadTableWork.add(ltd);
      }
{noformat}
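
The two findings above can be illustrated with a minimal standalone sketch. It is not Hive code: the class InsertIntoFlagSketch, its methods, and the "db.table" key format are hypothetical stand-ins, and the only thing taken from the snippet is that the replace flag is computed as the negation of a lookup by database and table name.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the parse info that tracks "insert into" targets.
// The key format (lower-cased "db.table") is an assumption for illustration.
public class InsertIntoFlagSketch {
    private final Set<String> insertIntoTables = new HashSet<>();

    // Called once per "insert into" clause (TOK_INSERT_INTO in the report).
    public void addInsertIntoTable(String db, String table) {
        insertIntoTables.add((db + "." + table).toLowerCase());
    }

    // Lookup is by table only; the partition and the clause are not part of the key.
    public boolean isInsertIntoTable(String db, String table) {
        return insertIntoTables.contains((db + "." + table).toLowerCase());
    }

    public static void main(String[] args) {
        InsertIntoFlagSketch info = new InsertIntoFlagSketch();

        // Clause 1: insert into table T2 partition (n = 1)
        info.addInsertIntoTable("default", "T2");
        // Clause 2: insert overwrite table T2 partition (n = 2) registers nothing.

        // Both clauses compute the flag from the same table-level lookup.
        boolean replaceForClause1 = !info.isInsertIntoTable("default", "T2");
        boolean replaceForClause2 = !info.isInsertIntoTable("default", "T2");

        System.out.println("insert into      -> replace = " + replaceForClause1);
        System.out.println("insert overwrite -> replace = " + replaceForClause2);
    }
}
```

In this sketch both lines print replace = false, which mirrors the reported behavior: the overwrite clause cannot be told apart from the insert-into clause once both target the same table.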



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)