You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/06/29 05:43:00 UTC

[GitHub] [hudi] yihua commented on a diff in pull request #5048: [HUDI-3634] Could read empty or partial HoodieCommitMetaData in downstream if using HDFS

yihua commented on code in PR #5048:
URL: https://github.com/apache/hudi/pull/5048#discussion_r909216322


##########
hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieWrapperFileSystem.java:
##########
@@ -986,7 +991,63 @@ public long getBytesWritten(Path file) {
         file.toString() + " does not have a open stream. Cannot get the bytes written on the stream");
   }
 
+  protected boolean needCreateTempFile() {
+    return HDFS.getScheme().equals(fileSystem.getScheme());
+  }
+
+  /**
+   * Creates a new file with overwrite set to false. This ensures files are created
+   * only once and never rewritten, also, here we take care if the content is not
+   * empty, will first write the content to a temp file if {needCreateTempFile} is
+   * true, and then rename it back after the content is written.
+   *
+   * @param fullPath File Path
+   * @param content Content to be stored
+   */
+  public void createImmutableFileInPath(Path fullPath, Option<byte[]> content)
+      throws HoodieIOException {
+    FSDataOutputStream fsout = null;
+    Path tmpPath = null;
+
+    boolean needTempFile = needCreateTempFile();
+
+    try {
+      if (!content.isPresent()) {
+        fsout = fileSystem.create(fullPath, false);
+      }
+
+      if (content.isPresent() && needTempFile) {
+        Path parent = fullPath.getParent();
+        tmpPath = new Path(parent, fullPath.getName() + TMP_PATH_POSTFIX);
+        fsout = fileSystem.create(tmpPath, false);
+        fsout.write(content.get());
+      }
+
+      if (content.isPresent() && !needTempFile) {
+        fsout = fileSystem.create(fullPath, false);
+        fsout.write(content.get());
+      }
+    } catch (IOException e) {
+      throw new HoodieIOException("Failed to create file " + fullPath, e);
+    } finally {
+      try {
+        if (null != fsout) {
+          fsout.close();
+        }
+        if (null != tmpPath) {
+          fileSystem.rename(tmpPath, fullPath);
+        }
+      } catch (IOException e) {
+        throw new HoodieIOException("Failed to close file " + fullPath, e);

Review Comment:
   nit: the error message should be different if erroring out during renaming.  `fsout` can also be `tmpPath` instead of `fullPath`.



##########
hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieWrapperFileSystem.java:
##########
@@ -986,7 +991,63 @@ public long getBytesWritten(Path file) {
         file.toString() + " does not have a open stream. Cannot get the bytes written on the stream");
   }
 
+  protected boolean needCreateTempFile() {
+    return HDFS.getScheme().equals(fileSystem.getScheme());
+  }
+
+  /**
+   * Creates a new file with overwrite set to false. This ensures files are created
+   * only once and never rewritten, also, here we take care if the content is not
+   * empty, will first write the content to a temp file if {needCreateTempFile} is
+   * true, and then rename it back after the content is written.
+   *
+   * @param fullPath File Path
+   * @param content Content to be stored
+   */
+  public void createImmutableFileInPath(Path fullPath, Option<byte[]> content)
+      throws HoodieIOException {
+    FSDataOutputStream fsout = null;
+    Path tmpPath = null;
+
+    boolean needTempFile = needCreateTempFile();
+
+    try {
+      if (!content.isPresent()) {
+        fsout = fileSystem.create(fullPath, false);
+      }
+
+      if (content.isPresent() && needTempFile) {
+        Path parent = fullPath.getParent();
+        tmpPath = new Path(parent, fullPath.getName() + TMP_PATH_POSTFIX);
+        fsout = fileSystem.create(tmpPath, false);
+        fsout.write(content.get());
+      }
+
+      if (content.isPresent() && !needTempFile) {
+        fsout = fileSystem.create(fullPath, false);
+        fsout.write(content.get());
+      }
+    } catch (IOException e) {
+      throw new HoodieIOException("Failed to create file " + fullPath, e);
+    } finally {
+      try {
+        if (null != fsout) {
+          fsout.close();
+        }
+        if (null != tmpPath) {
+          fileSystem.rename(tmpPath, fullPath);
+        }
+      } catch (IOException e) {
+        throw new HoodieIOException("Failed to close file " + fullPath, e);
+      }
+    }
+  }
+
   public FileSystem getFileSystem() {
     return fileSystem;
   }
+
+  public ConsistencyGuard getConsistencyGuard() {
+    return consistencyGuard;
+  }

Review Comment:
   If this is only for tests, could you remove it and instantiate the consistency guard separately in the test?



##########
hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieWrapperFileSystem.java:
##########
@@ -986,7 +991,63 @@ public long getBytesWritten(Path file) {
         file.toString() + " does not have a open stream. Cannot get the bytes written on the stream");
   }
 
+  protected boolean needCreateTempFile() {
+    return HDFS.getScheme().equals(fileSystem.getScheme());
+  }
+
+  /**
+   * Creates a new file with overwrite set to false. This ensures files are created
+   * only once and never rewritten, also, here we take care if the content is not
+   * empty, will first write the content to a temp file if {needCreateTempFile} is
+   * true, and then rename it back after the content is written.
+   *
+   * @param fullPath File Path
+   * @param content Content to be stored
+   */
+  public void createImmutableFileInPath(Path fullPath, Option<byte[]> content)
+      throws HoodieIOException {
+    FSDataOutputStream fsout = null;
+    Path tmpPath = null;
+
+    boolean needTempFile = needCreateTempFile();
+
+    try {
+      if (!content.isPresent()) {
+        fsout = fileSystem.create(fullPath, false);
+      }
+
+      if (content.isPresent() && needTempFile) {
+        Path parent = fullPath.getParent();
+        tmpPath = new Path(parent, fullPath.getName() + TMP_PATH_POSTFIX);
+        fsout = fileSystem.create(tmpPath, false);
+        fsout.write(content.get());
+      }
+
+      if (content.isPresent() && !needTempFile) {
+        fsout = fileSystem.create(fullPath, false);
+        fsout.write(content.get());
+      }

Review Comment:
   nit: this can be folded into the else branch, sth like:
   ```
   if (!content.isPresent()) {
     ...
   } else if (needTempFile) {
     ...
   } else {
     ...
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org