Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/09/02 18:46:00 UTC
[jira] [Work logged] (HIVE-26127) INSERT OVERWRITE throws FileNotFound when destination partition is deleted
[ https://issues.apache.org/jira/browse/HIVE-26127?focusedWorklogId=805853&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-805853 ]
ASF GitHub Bot logged work on HIVE-26127:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Sep/22 18:45
Start Date: 02/Sep/22 18:45
Worklog Time Spent: 10m
Work Description: ayushtkn commented on code in PR #3561:
URL: https://github.com/apache/hive/pull/3561#discussion_r961928935
##########
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:
##########
@@ -4293,7 +4293,7 @@ private void deleteOldPathForReplace(Path destPath, Path oldPath, HiveConf conf,
// But not sure why we changed not to delete the oldPath in HIVE-8750 if it is
// not the destf or its subdir?
isOldPathUnderDestf = isSubDir(oldPath, destPath, oldFs, destFs, false);
- if (isOldPathUnderDestf) {
+ if (isOldPathUnderDestf && oldFs.exists(oldPath)) {
cleanUpOneDirectoryForReplace(oldPath, oldFs, pathFilter, conf, purge, isNeedRecycle);
Review Comment:
Well, I see it is a backport, but about the fix: having an exists check adds one RPC for every genuine case as well. Instead, shouldn't we have ignored the FNF in `cleanUpOneDirectoryForReplace`?
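The alternative the comment suggests can be sketched with java.nio as a stand-in for Hadoop's FileSystem (which is not assumed here): catch the not-found exception inside the cleanup helper instead of paying an extra exists() round trip up front for every genuine overwrite. The helper name `cleanUpIfPresent` and the flat-directory layout are hypothetical, not Hive code.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CleanupSketch {
    // Hypothetical helper: remove a flat directory and its entries, treating
    // an already-missing path as "nothing to clean up" rather than an error.
    // No pre-flight exists() call is made, so the common case costs no extra
    // round trip on a remote filesystem.
    static boolean cleanUpIfPresent(Path dir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                Files.delete(entry);   // sketch assumes no nested directories
            }
        } catch (NoSuchFileException fnf) {
            return false;              // directory already gone: not a failure
        }
        Files.delete(dir);
        return true;                   // directory existed and was removed
    }

    public static void main(String[] args) throws IOException {
        // A path that does not exist stands in for the deleted partition dir.
        Path missing = Paths.get("definitely-missing-dir-12345");
        System.out.println(cleanUpIfPresent(missing));
    }
}
```

The trade-off mirrors the review comment: the merged patch pays one `exists` call per cleanup, while this shape pays nothing extra when the directory is present and turns the rare missing-directory case into a caught exception.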
Issue Time Tracking
-------------------
Worklog Id: (was: 805853)
Time Spent: 1.5h (was: 1h 20m)
> INSERT OVERWRITE throws FileNotFound when destination partition is deleted
> ---------------------------------------------------------------------------
>
> Key: HIVE-26127
> URL: https://issues.apache.org/jira/browse/HIVE-26127
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Yu-Wen Lai
> Assignee: Yu-Wen Lai
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> Steps to reproduce:
> # create external table src (col int) partitioned by (year int);
> # create external table dest (col int) partitioned by (year int);
> # insert into src partition (year=2022) values (1);
> # insert into dest partition (year=2022) values (2);
> # hdfs dfs -rm -r ${hive.metastore.warehouse.external.dir}/dest/year=2022
> # insert overwrite table dest select * from src;
> We will get FileNotFoundException as below.
> {code:java}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Directory file:/home/yuwen/workdir/upstream/hive/itests/qtest/target/localfs/warehouse/ext_part/par=1 could not be cleaned up.
> at org.apache.hadoop.hive.ql.metadata.Hive.deleteOldPathForReplace(Hive.java:5387)
> at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:5282)
> at org.apache.hadoop.hive.ql.metadata.Hive.loadPartitionInternal(Hive.java:2657)
> at org.apache.hadoop.hive.ql.metadata.Hive.lambda$loadDynamicPartitions$6(Hive.java:3143)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748) {code}
> It is because listStatus is called on a path that doesn't exist. We should not fail INSERT OVERWRITE when there is nothing to clean up.
> {code:java}
> fs.listStatus(path, pathFilter){code}
>
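The failure mode described in the report can be reproduced outside Hive with plain java.nio, used here as a rough analogue of Hadoop's FileSystem, whose `listStatus` likewise throws a FileNotFoundException on a missing path: listing a directory that was deleted out from under the process throws instead of returning an empty listing. The directory name below is hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ListMissingDir {
    // Returns the simple class name of the exception thrown while opening a
    // directory listing, or "none" if the listing unexpectedly succeeds.
    static String listOutcome(Path dir) {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return "none";
        } catch (NoSuchFileException e) {
            return e.getClass().getSimpleName();
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Hypothetical path standing in for the partition directory removed
        // with `hdfs dfs -rm -r` in the reproduction steps above.
        System.out.println(listOutcome(Paths.get("dest-year-2022-deleted")));
    }
}
```

This is why either an existence guard before the listing (the merged fix) or catching the not-found exception around it makes the overwrite succeed when the destination partition directory is already gone.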
--
This message was sent by Atlassian Jira
(v8.20.10#820010)