Posted to dev@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2017/11/28 01:17:00 UTC

[jira] [Created] (HIVE-18154) Acid Load Data/Insert with Overwrite in multi statement transactions

Eugene Koifman created HIVE-18154:
-------------------------------------

             Summary: Acid Load Data/Insert with Overwrite in multi statement transactions
                 Key: HIVE-18154
                 URL: https://issues.apache.org/jira/browse/HIVE-18154
             Project: Hive
          Issue Type: Bug
          Components: Transactions
    Affects Versions: 3.0.0
            Reporter: Eugene Koifman
            Assignee: Eugene Koifman


Consider:
{noformat}
START TRANSACTION
insert into T values(1,2),(3,4)
load data local inpath '" + getWarehouseDir() + "/1/data' overwrite into table T
update T set a = 0 where a = 6
COMMIT
{noformat}

So what we should have on disk is:
{noformat}
├── base_0000028
│   ├── 000000_0
│   └── _metadata_acid
├── delete_delta_0000028_0000028_0002
│   └── bucket_00000
├── delta_0000028_0000028_0000
│   └── bucket_00000
└── delta_0000028_0000028_0002
    └── bucket_00000
{noformat}
where base_0000028 is from the overwrite, delta_0000028_0000028_0000 is from the 1st insert, and delta_0000028_0000028_0002/delete_delta_0000028_0000028_0002 are from the update.

AcidUtils.getAcidState() returns only base_0000028, on the assumption that all other deltas are already included in it. That is not what we want here, because the update ran after the overwrite in the same transaction.
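
To make the problem concrete, here is a minimal, self-contained sketch. It is not the actual AcidUtils code: the class and method names are hypothetical and it works on directory names only. It assumes the last field of a delta directory name is the statement ID, and that the statement ID of the overwrite is known to the caller, since base_0000028 does not encode one. Passing Integer.MAX_VALUE mimics the behavior described above (the base hides every delta with the same write ID), while passing the overwrite's statement ID keeps the deltas written by later statements of the same transaction.

```java
import java.util.ArrayList;
import java.util.List;

public class AcidStateSketch {

    // Splits a directory name into '_'-separated fields, ignoring a leading "delete_".
    static String[] fields(String dir) {
        String name = dir.startsWith("delete_") ? dir.substring("delete_".length()) : dir;
        return name.split("_");
    }

    static boolean isBase(String dir) {
        return dir.startsWith("base_");
    }

    // base_W -> W; delta_MIN_MAX_STMT -> MAX
    static long writeId(String dir) {
        String[] f = fields(dir);
        return Long.parseLong(isBase(dir) ? f[1] : f[2]);
    }

    // Assumes the delta name carries a statement ID as its last field.
    static int statementId(String dir) {
        return Integer.parseInt(fields(dir)[3]);
    }

    /**
     * Returns the directories a reader should use.
     * baseStatementId is the (assumed known) statement ID of the statement that
     * wrote the base; use Integer.MAX_VALUE to treat the base as covering
     * every delta with the same write ID.
     */
    static List<String> select(List<String> dirs, int baseStatementId) {
        long baseWriteId = -1;
        for (String d : dirs) {
            if (isBase(d)) {
                baseWriteId = Math.max(baseWriteId, writeId(d));
            }
        }
        List<String> out = new ArrayList<>();
        for (String d : dirs) {
            if (isBase(d)) {
                if (writeId(d) == baseWriteId) {
                    out.add(d);
                }
                continue;
            }
            boolean newer = writeId(d) > baseWriteId;
            boolean sameTxnLater =
                writeId(d) == baseWriteId && statementId(d) > baseStatementId;
            if (newer || sameTxnLater) {
                out.add(d);
            }
        }
        return out;
    }
}
```

With the four directories listed above, select(dirs, Integer.MAX_VALUE) yields only base_0000028, whereas select(dirs, 1) also keeps delete_delta_0000028_0000028_0002 and delta_0000028_0000028_0002 and still drops the pre-overwrite delta_0000028_0000028_0000. The value 1 for the overwrite is inferred from the 0000 and 0002 suffixes in the listing.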

The simple way to get correct behavior is to disallow commands with an Overwrite clause in multi-statement transactions.
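
A rough sketch of such a guard follows. It is purely illustrative: the class, the method names and the text-level regular expressions are hypothetical, and a real check would presumably work on the parsed statement rather than on SQL text.

```java
import java.util.regex.Pattern;

public class OverwriteGuardSketch {

    // Crude text-level detection of the two overwrite forms discussed here.
    private static final Pattern INSERT_OVERWRITE = Pattern.compile(
        "^\\s*insert\\s+overwrite\\b",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern LOAD_OVERWRITE = Pattern.compile(
        "^\\s*load\\s+data\\b.*\\boverwrite\\s+into\\s+table\\b",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static boolean isOverwrite(String sql) {
        return INSERT_OVERWRITE.matcher(sql).find()
            || LOAD_OVERWRITE.matcher(sql).find();
    }

    // Rejects overwrite commands only while a multi-statement transaction is open.
    static void check(String sql, boolean inMultiStatementTxn) {
        if (inMultiStatementTxn && isOverwrite(sql)) {
            throw new IllegalStateException(
                "Overwrite is not allowed in a multi-statement transaction: " + sql);
        }
    }
}
```

Under this sketch, the load data ... overwrite statement from the example would be rejected between START TRANSACTION and COMMIT, while the plain insert and the update would pass.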


