Posted to dev@gobblin.apache.org by "Matthew Ho (Jira)" <ji...@apache.org> on 2022/03/11 00:27:00 UTC

[jira] [Created] (GOBBLIN-1619) WriterUtils.mkdirsWithRecursivePermission contains race condition and puts unnecessary load on filesystem

Matthew Ho created GOBBLIN-1619:
-----------------------------------

             Summary: WriterUtils.mkdirsWithRecursivePermission contains race condition and puts unnecessary load on filesystem
                 Key: GOBBLIN-1619
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1619
             Project: Apache Gobblin
          Issue Type: Bug
            Reporter: Matthew Ho


The current implementation, which recursively calls fs.mkdirs, has the following issues:
 * *A race condition when creating parent directories causes a FileNotFound exception even when the file exists on the file system.*

 * *HDFS fs.mkdirs atomically creates missing parent directories. Thus, the recursive approach is making unnecessary calls.*

HDFS, on which the current FileSystem interface is built, guarantees that missing parents will be created, so all FileSystem implementations should follow the same behavior.
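
As a rough illustration of the point above, here is a minimal sketch of the "one idempotent call" approach. It is an assumption-laden stand-in, not Gobblin code: it uses java.nio.file.Files.createDirectories on the local file system in place of Hadoop's FileSystem.mkdirs, and the names MkdirsSketch, ensureDirectory and createConcurrently are hypothetical. In Gobblin the corresponding call would presumably be a single fs.mkdirs(path, permission) rather than a per-level recursion.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MkdirsSketch {

    // Hypothetical helper: a single call creates the target directory and
    // every missing parent, and does not fail if the directory already exists.
    // Local stand-in (assumption) for one Hadoop FileSystem.mkdirs call.
    static Path ensureDirectory(Path dir) throws IOException {
        return Files.createDirectories(dir);
    }

    // Starts `writers` concurrent callers that all request the same nested
    // directory, and returns how many of them completed without an exception.
    static int createConcurrently(Path dir, int writers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(writers);
        try {
            List<Callable<Path>> tasks = new ArrayList<>();
            for (int i = 0; i < writers; i++) {
                tasks.add(() -> ensureDirectory(dir));
            }
            int succeeded = 0;
            for (Future<Path> result : pool.invokeAll(tasks)) {
                result.get(); // rethrows any failure raised in the worker
                succeeded++;
            }
            return succeeded;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path root = Files.createTempDirectory("mkdirs-sketch");
        Path target = root.resolve("a").resolve("b").resolve("c");
        int ok = createConcurrently(target, 8);
        System.out.println(ok + " callers finished; directory exists: "
                + Files.isDirectory(target));
    }
}
```

The design choice is that no caller checks for or creates parents itself, so there is no window between "parent missing" and "create parent" in which another writer can interfere, and each caller issues one request instead of one per path component.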

 

*Note that the [FileSystem|https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileSystem.html] abstract class documentation says the following:*

The behaviour of the filesystem is [specified in the Hadoop documentation|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/filesystem/filesystem.html]. However, the normative specification of the behavior of this class is actually HDFS: {color:#de350b}if HDFS does not behave the way these Javadocs or the specification in the Hadoop documentations define, assume that the documentation is incorrect{color}.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)