Posted to dev@hive.apache.org by "Jinyang Li (Jira)" <ji...@apache.org> on 2019/09/27 17:49:00 UTC

[jira] [Created] (HIVE-22251) Existing files on new partition path not cleared during insert overwrite

Jinyang Li created HIVE-22251:
---------------------------------

             Summary: Existing files on new partition path not cleared during insert overwrite
                 Key: HIVE-22251
                 URL: https://issues.apache.org/jira/browse/HIVE-22251
             Project: Hive
          Issue Type: Bug
    Affects Versions: 2.3.4, 0.13.0
            Reporter: Jinyang Li


*Description*

When INSERT OVERWRITE targets a new partition and files already exist on the partition path, Hive does not clear them, which leaves extra files in the final partition location.

Reading the partition may then return extra, incorrect rows.
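
The expected behavior can be sketched on a local filesystem. This is only a model of what the report expects "overwrite" to mean, not Hive code; the function name overwrite_partition and its two arguments are hypothetical.

```shell
# Local-filesystem model of the expected semantics; this is NOT Hive code.
# Usage: overwrite_partition PART_DIR NEW_FILE   (hypothetical names)
overwrite_partition() {
  part_dir="$1"
  new_file="$2"
  # Expected: drop whatever already sits under the partition path,
  # whether or not the partition was previously registered.
  rm -rf "$part_dir"
  mkdir -p "$part_dir"
  # Then write the query output as the only content of the partition.
  cp "$new_file" "$part_dir/000000_0"
}
```

Under this model a file placed on the path beforehand does not survive the overwrite, which is the opposite of what the reproduction shows.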

 

*Reproduce*
 # Place a file on the partition path
hdfs dfs -mkdir hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/
hdfs dfs -put 000000_0 hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/000000_1
 # insert overwrite table jl_test1 partition (ds='1') select `(ds)?+.+` from src_table limit 100;
 # Two files are found in the partition location: the pre-existing 000000_1 and the newly written 000000_0
hdfs dfs -ls hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/
Found 2 items
-rw-r--r-- 3 jinyang_li supergroup 2770 2019-09-27 06:53 hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/000000_0
-rw-r--r-- 3 jinyang_li supergroup 8483 2019-09-27 06:50 hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/000000_1
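
Leftover files like 000000_1 can be spotted by comparing modification times in the listing against the time the insert started. The helper below is a minimal sketch; stale_files is a hypothetical name, it assumes the standard eight-column `hdfs dfs -ls` layout shown above, paths without whitespace, and a cutoff given in the same time zone as the listing.

```shell
# Hypothetical helper, not part of Hive or the HDFS CLI.
# Reads `hdfs dfs -ls` output on stdin and prints the path of every regular
# file whose modification time is earlier than the cutoff "YYYY-MM-DD HH:MM".
# Assumes paths contain no whitespace.
stale_files() {
  awk -v cutoff="$1" '
    /^-/ {                      # regular-file entries start with "-"
      ts = $6 " " $7            # date and time columns of the listing
      if (ts < cutoff) print $NF
    }
  '
}
```

For example, piping the listing above into stale_files "2019-09-27 06:52" prints only the path ending in 000000_1, since that file predates the insert.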


