Posted to dev@hive.apache.org by "Jinyang Li (Jira)" <ji...@apache.org> on 2019/09/27 17:49:00 UTC
[jira] [Created] (HIVE-22251) Existing files on new partition path not cleared during insert overwrite
Jinyang Li created HIVE-22251:
---------------------------------
Summary: Existing files on new partition path not cleared during insert overwrite
Key: HIVE-22251
URL: https://issues.apache.org/jira/browse/HIVE-22251
Project: Hive
Issue Type: Bug
Affects Versions: 2.3.4, 0.13.0
Reporter: Jinyang Li
*Description*
When running insert overwrite into a new partition, if files already exist on the partition path, Hive does not clear them, which leaves extra files in the final partition location.
Reading the partition may then return extra, incorrect results.
*Reproduce*
# Make a file exist on the partition path
hdfs dfs -mkdir hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/
hdfs dfs -put 000000_0 hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/000000_1
# insert overwrite table jl_test1 partition (ds='1') select `(ds)?+.+` from src_table limit 100;
# Two files are found in the partition location (the pre-existing 000000_1 is still there)
hdfs dfs -ls hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/
Found 2 items
-rw-r--r-- 3 jinyang_li supergroup 2770 2019-09-27 06:53 hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/000000_0
-rw-r--r-- 3 jinyang_li supergroup 8483 2019-09-27 06:50 hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/000000_1
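The difference between the reported and the expected outcome can be sketched on a local filesystem, with no cluster needed. This is only a model under stated assumptions: plain directories stand in for HDFS, cp stands in for Hive moving query output into place, and the function names overwrite_observed and overwrite_expected are hypothetical. It is not Hive's actual code path.

```shell
#!/usr/bin/env bash
# Local model only: plain directories stand in for HDFS, and cp stands in
# for Hive moving query output. Function names are hypothetical.
set -euo pipefail

# Reported behaviour: output lands in the partition directory, but files
# that were already there are left untouched.
overwrite_observed() {
  local part_dir="$1" new_file="$2"
  mkdir -p "$part_dir"
  cp "$new_file" "$part_dir/000000_0"
}

# Behaviour the reporter expects: the directory is emptied first.
overwrite_expected() {
  local part_dir="$1" new_file="$2"
  rm -rf -- "$part_dir"
  mkdir -p "$part_dir"
  cp "$new_file" "$part_dir/000000_0"
}

# Demo: seed each partition directory with a stale file, then "overwrite".
work="$(mktemp -d)"
printf 'new row\n' > "$work/query_output"
for mode in observed expected; do
  part="$work/$mode/ds=1"
  mkdir -p "$part"
  printf 'stale row\n' > "$part/000000_1"
  "overwrite_$mode" "$part" "$work/query_output"
  echo "$mode:" $(ls "$part")
done
```

In this model the "observed" directory ends up with both 000000_0 and the stale 000000_1, matching the listing above, while the "expected" directory holds only 000000_0.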
--
This message was sent by Atlassian Jira
(v8.3.4#803005)