Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/09/02 18:46:00 UTC

[jira] [Work logged] (HIVE-26127) INSERT OVERWRITE throws FileNotFound when destination partition is deleted

     [ https://issues.apache.org/jira/browse/HIVE-26127?focusedWorklogId=805853&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-805853 ]

ASF GitHub Bot logged work on HIVE-26127:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Sep/22 18:45
            Start Date: 02/Sep/22 18:45
    Worklog Time Spent: 10m 
      Work Description: ayushtkn commented on code in PR #3561:
URL: https://github.com/apache/hive/pull/3561#discussion_r961928935


##########
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:
##########
@@ -4293,7 +4293,7 @@ private void deleteOldPathForReplace(Path destPath, Path oldPath, HiveConf conf,
       // But not sure why we changed not to delete the oldPath in HIVE-8750 if it is
       // not the destf or its subdir?
       isOldPathUnderDestf = isSubDir(oldPath, destPath, oldFs, destFs, false);
-      if (isOldPathUnderDestf) {
+      if (isOldPathUnderDestf && oldFs.exists(oldPath)) {
         cleanUpOneDirectoryForReplace(oldPath, oldFs, pathFilter, conf, purge, isNeedRecycle);

Review Comment:
   Well, I see it is a backport, but about the fix: having an existence check adds an RPC for every genuine case as well. Instead, shouldn't we have ignored the FNF in `cleanUpOneDirectoryForReplace`?
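
   The cost argument above can be made concrete with a toy sketch. The class and method names below (`CountingFs`, `cleanupWithExistsCheck`, `cleanupIgnoringFnf`) are hypothetical stand-ins, not Hive or Hadoop APIs; the real code goes through Hadoop's `FileSystem`, where both `exists` and `listStatus` are remote round trips to the NameNode.

```java
import java.util.*;

public class RpcCostSketch {
    // Toy stand-in for a remote FileSystem client that counts round trips.
    // Hypothetical: the real code uses org.apache.hadoop.fs.FileSystem.
    static class CountingFs {
        int rpcCalls = 0;
        Set<String> existingDirs = new HashSet<>();

        boolean exists(String path) {
            rpcCalls++;
            return existingDirs.contains(path);
        }

        List<String> listStatus(String path) {
            rpcCalls++;
            if (!existingDirs.contains(path)) {
                // Analogue of FileNotFoundException from a remote listing.
                throw new NoSuchElementException("FNF: " + path);
            }
            return Collections.emptyList();
        }
    }

    // Patched approach: probe existence before listing; the probe is an
    // extra round trip even when the directory is present (the common case).
    static int cleanupWithExistsCheck(CountingFs fs, String dir) {
        fs.rpcCalls = 0;
        if (fs.exists(dir)) {
            fs.listStatus(dir);
        }
        return fs.rpcCalls;
    }

    // Reviewer's suggestion: list directly and swallow "not found".
    static int cleanupIgnoringFnf(CountingFs fs, String dir) {
        fs.rpcCalls = 0;
        try {
            fs.listStatus(dir);
        } catch (NoSuchElementException ignored) {
            // Directory already gone: nothing to clean up.
        }
        return fs.rpcCalls;
    }

    public static void main(String[] args) {
        CountingFs fs = new CountingFs();
        fs.existingDirs.add("/warehouse/dest/year=2022");
        // Genuine case (directory present): 2 round trips vs 1.
        System.out.println(cleanupWithExistsCheck(fs, "/warehouse/dest/year=2022")); // 2
        System.out.println(cleanupIgnoringFnf(fs, "/warehouse/dest/year=2022"));     // 1
    }
}
```

   Both shapes handle the deleted-partition case; they differ only in what the common (directory present) path costs.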





Issue Time Tracking
-------------------

    Worklog Id:     (was: 805853)
    Time Spent: 1.5h  (was: 1h 20m)

> INSERT OVERWRITE throws FileNotFound when destination partition is deleted 
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-26127
>                 URL: https://issues.apache.org/jira/browse/HIVE-26127
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Yu-Wen Lai
>            Assignee: Yu-Wen Lai
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-2
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Steps to reproduce:
>  # create external table src (col int) partitioned by (year int);
>  # create external table dest (col int) partitioned by (year int);
>  # insert into src partition (year=2022) values (1);
>  # insert into dest partition (year=2022) values (2);
>  # hdfs dfs -rm -r ${hive.metastore.warehouse.external.dir}/dest/year=2022
>  # insert overwrite table dest select * from src;
> We will get FileNotFoundException as below.
> {code:java}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Directory file:/home/yuwen/workdir/upstream/hive/itests/qtest/target/localfs/warehouse/ext_part/par=1 could not be cleaned up.
>     at org.apache.hadoop.hive.ql.metadata.Hive.deleteOldPathForReplace(Hive.java:5387)
>     at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:5282)
>     at org.apache.hadoop.hive.ql.metadata.Hive.loadPartitionInternal(Hive.java:2657)
>     at org.apache.hadoop.hive.ql.metadata.Hive.lambda$loadDynamicPartitions$6(Hive.java:3143)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748) {code}
> It is because it calls listStatus on a path that doesn't exist. We should not fail insert overwrite when there is nothing to be cleaned up.
> {code:java}
> fs.listStatus(path, pathFilter){code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)