Posted to common-dev@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2022/01/13 10:24:00 UTC
[jira] [Created] (HADOOP-18081) FileNotFoundException in abfs mkdirs() call
Steve Loughran created HADOOP-18081:
---------------------------------------
Summary: FileNotFoundException in abfs mkdirs() call
Key: HADOOP-18081
URL: https://issues.apache.org/jira/browse/HADOOP-18081
Project: Hadoop Common
Issue Type: Bug
Components: fs/azure
Affects Versions: 3.3.1
Reporter: Steve Loughran
Seen in production: calling mkdirs() in FileOutputCommitter.setupJob() triggers a FileNotFoundException (FNFE).
{code}
java.io.FileNotFoundException: Operation failed: "The specified path does not exist.", 404, PUT, https://bcket.dfs.core.windows.net/table1/_temporary/0?resource=directory&timeout=90, PathNotFound, "The specified path does not exist."
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.checkException(AzureBlobFileSystem.java:1131)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.mkdirs(AzureBlobFileSystem.java:445)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2347)
{code}
I suspect that while this job is setting up, a previous job is doing cleanup/abort on the same path.
If abfs mkdirs() is non-atomic, like the POSIX one, then as it goes up/down the chain of parent dirs, something else can get in the way.
If so, this can be handled in the client: when we get an FNFE we could warn and retry.
In the manifest committer each job will have a unique ID under _temporary, and there will be an option to skip deleting the temp dir entirely, for better coexistence of active jobs.
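The warn-and-retry idea could look roughly like the sketch below. It is only an illustration of the approach, not the actual fix: the class MkdirsRetry, the MkdirsCall interface, and the attempt/sleep parameters are all hypothetical names, and the mkdirs call itself is passed in as a lambda so the sketch has no Hadoop dependency.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

/** Hypothetical helper: retry a mkdirs-style call when it raises an FNFE. */
final class MkdirsRetry {

    /** Stand-in for the real call, e.g. a lambda wrapping fs.mkdirs(path). */
    @FunctionalInterface
    public interface MkdirsCall {
        boolean run() throws IOException;
    }

    /**
     * Invoke the call up to maxAttempts times. A FileNotFoundException is
     * logged as a warning and retried after sleepMillis; any other
     * IOException propagates immediately. If every attempt fails with an
     * FNFE, the last one is rethrown.
     */
    static boolean mkdirsWithRetry(MkdirsCall call, int maxAttempts, long sleepMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        FileNotFoundException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.run();
            } catch (FileNotFoundException e) {
                // Assumed cause: a concurrent cleanup removed a parent dir mid-call.
                last = e;
                System.err.printf("WARN: mkdirs attempt %d of %d failed: %s%n",
                        attempt, maxAttempts, e.getMessage());
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        throw last;
    }
}
```

With a Hadoop FileSystem instance fs and a Path dir in scope, a call might read MkdirsRetry.mkdirsWithRetry(() -> fs.mkdirs(dir), 3, 500). The attempt count and delay here are arbitrary; a retry only helps if the competing cleanup has finished by the time of the next attempt.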