Posted to issues@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2018/10/11 16:26:00 UTC
[jira] [Commented] (HIVE-20725) Simultaneous dynamic inserts can result in partition files lost
[ https://issues.apache.org/jira/browse/HIVE-20725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16646721#comment-16646721 ]
Gopal V commented on HIVE-20725:
--------------------------------
This belongs to a class of bugs fixed via HIVE-14535.
> Simultaneous dynamic inserts can result in partition files lost
> ----------------------------------------------------------------
>
> Key: HIVE-20725
> URL: https://issues.apache.org/jira/browse/HIVE-20725
> Project: Hive
> Issue Type: Bug
> Reporter: zhuwei
> Assignee: zhuwei
> Priority: Major
>
> If two users attempt a dynamic insert into the same new partition at the same time, a race condition can leave the partition in an error state: the partition info has been inserted into the metastore, but the data files have been removed.
> The current logic of the function "add_partition_core" in class HiveMetaStore.HMSHandler is as follows:
> # check whether the partition already exists
> # create the partition files directory if it does not exist
> # try to add the partition
> # if adding the partition failed and this call created the directory in step 2, delete that directory
> Assume that two users insert into the same partition at the same time and that their requests are handled by two threads, A and B. If steps 1 through 4 of thread B all run between step 2 and step 3 of thread A, the sequence is: A1 A2 B1 B2 B3 B4 A3 A4. Thread A created the directory in step 2, so when its add fails in step 3 it deletes the directory in step 4, and the partition files written by B are removed.
>
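The interleaving described in the quoted report can be replayed deterministically with a small in-memory model. This is only a sketch under stated assumptions, not Hive's code: FakeStore, Request, the partition name "p=1", and the file name "b_data_file" are all hypothetical, and the point at which B's data file lands in the directory is simplified to "some time before A3".

```python
PARTITION = "p=1"  # hypothetical partition name, for illustration only


class FakeStore:
    """In-memory stand-in for the metastore and the file system (hypothetical)."""

    def __init__(self):
        self.partitions = set()  # partitions registered in the "metastore"
        self.dirs = {}           # directory name -> set of file names


class Request:
    """One add-partition request, split into the four steps from the report."""

    def __init__(self, store, partition):
        self.store = store
        self.partition = partition
        self.made_dir = False  # did this request create the directory?
        self.added = False     # did this request register the partition?

    def step1_check_exists(self):
        # Step 1: check whether the partition already exists.
        return self.partition in self.store.partitions

    def step2_create_dir(self):
        # Step 2: create the directory only if it is missing.
        if self.partition not in self.store.dirs:
            self.store.dirs[self.partition] = set()
            self.made_dir = True

    def step3_add_partition(self):
        # Step 3: try to add the partition; fails if it is already registered.
        if self.partition not in self.store.partitions:
            self.store.partitions.add(self.partition)
            self.added = True

    def step4_cleanup(self):
        # Step 4: on failure, delete the directory if this request created it.
        if not self.added and self.made_dir:
            del self.store.dirs[self.partition]


def run_interleaving():
    """Replay the order A1 A2 B1 B2 B3 B4 A3 A4 and return the final state."""
    store = FakeStore()
    a = Request(store, PARTITION)
    b = Request(store, PARTITION)

    assert not a.step1_check_exists()  # A1: partition not there yet
    a.step2_create_dir()               # A2: A creates the directory
    assert not b.step1_check_exists()  # B1: still not registered
    b.step2_create_dir()               # B2: directory exists, B does not own it
    b.step3_add_partition()            # B3: B registers the partition
    b.step4_cleanup()                  # B4: nothing to clean up
    store.dirs[PARTITION].add("b_data_file")  # assumption: B's file is in place
    a.step3_add_partition()            # A3: fails, partition already registered
    a.step4_cleanup()                  # A4: A deletes the directory it created
    return store


if __name__ == "__main__":
    final = run_interleaving()
    print("partition in metastore:", PARTITION in final.partitions)
    print("directory still present:", PARTITION in final.dirs)
```

In this model the run ends with the partition registered but its directory gone, which matches the error state in the report. The model uses no real threads; it only fixes the step order so the outcome is repeatable.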
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)