Posted to issues@hive.apache.org by "Seong Yeom (JIRA)" <ji...@apache.org> on 2017/10/04 17:51:00 UTC

[jira] [Commented] (HIVE-15899) Make CTAS with acid target table and insert into acid_tbl select ... union all ... work

    [ https://issues.apache.org/jira/browse/HIVE-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191700#comment-16191700 ] 

Seong Yeom commented on HIVE-15899:
-----------------------------------

[~steveyeom2017] please look at this jira

> Make CTAS with acid target table and insert into acid_tbl select ... union all ... work
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-15899
>                 URL: https://issues.apache.org/jira/browse/HIVE-15899
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>             Fix For: 3.0.0
>
>         Attachments: HIVE-15899.01.patch, HIVE-15899.02.patch, HIVE-15899.03.patch, HIVE-15899.04.patch, HIVE-15899.05.patch, HIVE-15899.07.patch, HIVE-15899.08.patch, HIVE-15899.09.patch, HIVE-15899.10.patch, HIVE-15899.11.patch, HIVE-15899.12.patch, HIVE-15899.13.patch
>
>
> Consider:
> {noformat}
> create table T stored as ORC TBLPROPERTIES('transactional'='true') as
>       select a, b from A where a <= 5 union all select a, b from B where a >= 5
> {noformat}
> and 
> {noformat}
> create table T (a int, b int) stored as ORC TBLPROPERTIES ('transactional'='false');
> insert into T(a,b) select a, b from T where a between 1 and 3 group by a, b union all select a, b from A where a between 5 and 7 union all select a, b from B where a >= 9;
> {noformat}
> On Tez, there is an optimization that removes the Union All operator and writes the data into
> subdirectories of T (in this case T is unpartitioned).
> This also happens on MR but requires 
> {noformat}
> hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_OPTIMIZE_UNION_REMOVE, true);
> hiveConf.setVar(HiveConf.ConfVars.HIVEFETCHTASKCONVERSION, "none");
> {noformat} 
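> For an interactive session, the same two settings can presumably be applied with "set" commands. This is only a sketch; it assumes the usual property names behind these ConfVars (hive.optimize.union.remove and hive.fetch.task.conversion):
> {noformat}
> -- assumed property-name equivalents of the two HiveConf calls above
> set hive.optimize.union.remove=true;
> set hive.fetch.task.conversion=none;
> {noformat}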
> Need to ensure that when the target table is Acid, we generate unique ROW__IDs.
> When the target is not Acid, we need to ensure that it can be converted to Acid via Alter Table even when the data layout includes subdirectories.
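> A rough sketch of how these two requirements might be checked by hand, assuming the table T from the second example (neither statement is taken from the attached patches, and the duplicate check relies on the ROW__ID virtual column):
> {noformat}
> -- convert the non-Acid table T in place (assumes T otherwise meets the requirements for Acid tables)
> alter table T set TBLPROPERTIES ('transactional'='true');
> -- any row returned here would indicate a duplicated ROW__ID
> select ROW__ID, count(*) from T group by ROW__ID having count(*) > 1;
> {noformat}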


