Posted to dev@hive.apache.org by "zhuwei (JIRA)" <ji...@apache.org> on 2018/10/11 09:45:00 UTC
[jira] [Created] (HIVE-20725) Simultaneous dynamic inserts can result in partition files lost
zhuwei created HIVE-20725:
-----------------------------
Summary: Simultaneous dynamic inserts can result in partition files lost
Key: HIVE-20725
URL: https://issues.apache.org/jira/browse/HIVE-20725
Project: Hive
Issue Type: Bug
Reporter: zhuwei
Assignee: zhuwei
If two users attempt a dynamic insert into the same new partition at the same time, a race condition can leave the partition in an error state: the partition info has been inserted into the metastore, but the data files have been removed.
The current logic of the function "add_partition_core" in class HiveMetaStore.HMSHandler is as follows:
1. Check whether the partition already exists.
2. Create the partition files directory if it does not exist.
3. Try to add the partition.
4. If adding the partition failed and this call created the directory in step 2, delete that directory.
Assume that two users are inserting into the same partition at the same time, and that two threads, say thread A and thread B, are handling their requests. If steps 1 to 4 of thread B all complete between step 2 and step 3 of thread A, the sequence is: A1 A2 B1 B2 B3 B4 A3 A4. Step 3 of thread A then fails because B has already added the partition, and since A created the directory in step 2, step 4 of thread A deletes it. The partition files written by B are therefore removed by A.
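
The interleaving can be replayed deterministically with a small simulation. This is only a sketch, not the HMSHandler code: the class AddPartitionRace, the Request helper, the in-memory sets standing in for the metastore and the file system, and the partition name "ds=2018-10-11" are all hypothetical names introduced for illustration.

```java
import java.util.HashSet;
import java.util.Set;

public class AddPartitionRace {
    // Hypothetical in-memory stand-ins for the metastore and the file system.
    static final Set<String> metastore = new HashSet<>();
    static final Set<String> directories = new HashSet<>();

    /** One add-partition request, split into its four steps so they can be interleaved by hand. */
    static class Request {
        final String partition;
        boolean madeDir = false;
        boolean added = false;

        Request(String partition) { this.partition = partition; }

        // Step 1: check whether the partition already exists.
        boolean step1PartitionExists() { return metastore.contains(partition); }

        // Step 2: create the directory if missing and remember whether this request created it.
        void step2EnsureDirectory() { madeDir = directories.add(partition); }

        // Step 3: try to add the partition; this fails if another request added it first.
        void step3AddPartition() { added = metastore.add(partition); }

        // Step 4: if the add failed and this request created the directory, delete it.
        void step4Cleanup() {
            if (!added && madeDir) {
                directories.remove(partition);
            }
        }
    }

    /** Replays A1 A2 B1 B2 B3 B4 A3 A4 and returns {partition registered, directory exists}. */
    static boolean[] replay() {
        metastore.clear();
        directories.clear();
        String partition = "ds=2018-10-11"; // hypothetical partition name
        Request a = new Request(partition);
        Request b = new Request(partition);

        a.step1PartitionExists(); // A1: not found
        a.step2EnsureDirectory(); // A2: A creates the directory
        b.step1PartitionExists(); // B1: not found
        b.step2EnsureDirectory(); // B2: directory already exists, B did not create it
        b.step3AddPartition();    // B3: B registers the partition
        b.step4Cleanup();         // B4: nothing to clean up
        a.step3AddPartition();    // A3: fails, the partition is already registered
        a.step4Cleanup();         // A4: A deletes the directory that B relies on

        return new boolean[] { metastore.contains(partition), directories.contains(partition) };
    }

    public static void main(String[] args) {
        boolean[] result = replay();
        System.out.println("partition in metastore: " + result[0]);
        System.out.println("directory on disk: " + result[1]);
    }
}
```

In this simulation the replay ends with the partition registered in the metastore but its directory gone, which matches the error state described above.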
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)