Posted to issues@hive.apache.org by "Hui An (JIRA)" <ji...@apache.org> on 2019/08/02 07:52:00 UTC

[jira] [Assigned] (HIVE-22077) Inserting overwrite partitions clause does not clean directories while partitions' info is not stored in metadata

     [ https://issues.apache.org/jira/browse/HIVE-22077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hui An reassigned HIVE-22077:
-----------------------------


> Inserting overwrite partitions clause does not clean directories while partitions' info is not stored in metadata
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-22077
>                 URL: https://issues.apache.org/jira/browse/HIVE-22077
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 2.3.4, 1.1.1, 4.0.0
>            Reporter: Hui An
>            Assignee: Hui An
>            Priority: Major
>
> INSERT OVERWRITE into a static partition may not clean the partition's HDFS location if the partition's info is not stored in the metadata.
> Steps to reproduce this issue:
> ------------------------------------------------
> 1. Create a managed table:
> ------------------------------------------------
> {code:sql}
>  CREATE TABLE `test`(                               
>    `id` string)                                     
>  PARTITIONED BY (                                   
>    `dayno` string)                                  
>  ROW FORMAT SERDE                                   
>    'org.apache.hadoop.hive.ql.io.orc.OrcSerde'      
>  STORED AS INPUTFORMAT                              
>    'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'  
>  OUTPUTFORMAT                                       
>    'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' 
>  LOCATION
>    'hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test' 
>  TBLPROPERTIES (                                    
>    'transient_lastDdlTime'='1564731656')   
> {code}
> ------------------------------------------------
> 2. Create the partition's directory directly on HDFS and put some data under it
> ------------------------------------------------
> {code:bash}
> hdfs dfs -mkdir hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
> hdfs dfs -put test.data hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
> {code}
> ------------------------------------------------
> 3. Insert overwrite partition dayno=20190802
> ------------------------------------------------
> {code:sql}
> INSERT OVERWRITE TABLE test PARTITION(dayno='20190802')
> SELECT 1;
> {code}
> ------------------------------------------------
> 4. The test.data file under the partition directory has not been deleted.
> ------------------------------------------------
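The observation in step 4 could be checked with something like the sketch below. It is only an illustration, not part of the original report: the table location, partition value and file name are taken from the steps above, while the helper names partition_path and leftover_exists are hypothetical. It assumes the hdfs client is on the PATH and skips the HDFS calls otherwise.

```shell
#!/bin/sh
# Sketch only: checks whether the file from step 2 survived the
# INSERT OVERWRITE in step 3. partition_path and leftover_exists are
# hypothetical helper names, not part of Hive or Hadoop.

# Table location from the CREATE TABLE statement in step 1.
TABLE_DIR='hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test'

# Build the partition directory path: <table dir>/<key>=<value>.
partition_path() {
    printf '%s/%s=%s\n' "$1" "$2" "$3"
}

# Exit status 0 if the given path still exists on HDFS.
leftover_exists() {
    hdfs dfs -test -e "$1"
}

PART_DIR=$(partition_path "$TABLE_DIR" dayno 20190802)

# Only talk to HDFS when the client is installed.
if command -v hdfs >/dev/null 2>&1; then
    # List whatever is currently in the partition directory.
    hdfs dfs -ls "$PART_DIR"
    if leftover_exists "$PART_DIR/test.data"; then
        echo "test.data is still present: the overwrite did not clean the directory"
    else
        echo "test.data was removed: the overwrite cleaned the directory"
    fi
fi
```

Run after step 3, the first message would correspond to the behaviour described in this issue.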



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)