Posted to commits@hudi.apache.org by "hbgstc123 (via GitHub)" <gi...@apache.org> on 2023/04/29 09:35:47 UTC

[GitHub] [hudi] hbgstc123 opened a new pull request, #8610: [HUDI-6156] prevent leaving tmp file in timeline when multi process t…

hbgstc123 opened a new pull request, #8610:
URL: https://github.com/apache/hudi/pull/8610

   …ry to complete the same instant
   
   ### Change Logs
   
   Currently, if two tasks try to complete the same instant, an "xxx.tmp" file is left behind in the .hoodie dir.
   
   For example, in a Flink ingestion job with offline compaction, both the ingestion job and the offline compaction job can trigger a clean task, so two clean tasks may end up working on the same clean instant. The slower one then fails to rename its tmp file (e.g. 20230429171948763.clean.tmp) to the final file name (e.g. 20230429171948763.clean), leaving the tmp file in the timeline.
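
The race described above can be reproduced in miniature. The following is only a sketch: it uses `java.nio.file` on a local temp directory instead of the Hadoop `FileSystem` API that Hudi wraps, and the class and method names (`TmpFileRaceSketch`, `complete`, `demo`) are hypothetical. Note that `Files.move` signals an existing target by throwing, whereas the Hadoop `rename` call in this PR is checked through its boolean return value.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

final class TmpFileRaceSketch {

  /**
   * Writes the content to "<target>.tmp" and moves it to the target.
   * Returns true if this caller completed the instant, false if another
   * caller got there first. The tmp file is cleaned up in the losing case.
   */
  static boolean complete(Path target, byte[] content) {
    Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
    try {
      Files.write(tmp, content);
      try {
        // Without REPLACE_EXISTING, move fails if the target already exists.
        Files.move(tmp, target);
        return true;
      } catch (FileAlreadyExistsException e) {
        // The losing task removes its own tmp file instead of leaving it behind.
        Files.deleteIfExists(tmp);
        return false;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Runs two completions of the same instant back to back and summarizes the result. */
  static String demo() {
    try {
      Path dir = Files.createTempDirectory("hoodie-sketch");
      Path target = dir.resolve("20230429171948763.clean");
      byte[] metadata = "clean metadata".getBytes(StandardCharsets.UTF_8);
      boolean fast = complete(target, metadata);
      boolean slow = complete(target, metadata);
      long tmpLeft;
      try (Stream<Path> files = Files.list(dir)) {
        tmpLeft = files.filter(p -> p.getFileName().toString().endsWith(".tmp")).count();
      }
      return "fast=" + fast + " slow=" + slow + " tmpLeft=" + tmpLeft;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

If the `deleteIfExists` call is dropped, the second completion leaves `20230429171948763.clean.tmp` in the directory, which is the symptom reported here.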
   
   ### Impact
   
   _Describe any public API or user-facing feature change or any performance impact._
   
   ### Risk level (write none, low medium or high below)
   
   _If medium or high, explain what verification was done to mitigate the risks._
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
     ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make
     changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1619353878

   @hbgstc123 Can you update the PR and resolve the test failures?




[GitHub] [hudi] hudi-bot commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1631825131

   ## CI report:
   
   * f34ffd6ccf4fd366ade5dad8487ff9a0a248bec8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16758) 
   * 0bd8531ac13126771f84db19fa02fbe1828b762d UNKNOWN
   * bfc9450676a0742bad6e25ee1e56aa240e31fe74 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18512) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #8610: [HUDI-6156] prevent leaving tmp file in timeline when multi process t…

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1528802806

   ## CI report:
   
   * f34ffd6ccf4fd366ade5dad8487ff9a0a248bec8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16758) 
   




[GitHub] [hudi] danny0405 commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1698577888

   We need a lock anyway.




[GitHub] [hudi] hbgstc123 commented on pull request #8610: [HUDI-6156] prevent leaving tmp file in timeline when multi process t…

Posted by "hbgstc123 (via GitHub)" <gi...@apache.org>.
hbgstc123 commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1551595956

   Yes, because the Flink offline table service runs clean, which may conflict with the clean process of the main writing job.
   It's hard to make sure only one clean runs at a time, so I think we'd better delete the tmp file.




[GitHub] [hudi] hudi-bot commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1632503786

   ## CI report:
   
   * 0bd8531ac13126771f84db19fa02fbe1828b762d UNKNOWN
   * d28d8dc0e7a243f6ce8806c7b345ed17ecb4b619 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18523) 
   




[GitHub] [hudi] danny0405 merged pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 merged PR #8610:
URL: https://github.com/apache/hudi/pull/8610




[GitHub] [hudi] KnightChess commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "KnightChess (via GitHub)" <gi...@apache.org>.
KnightChess commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1697126572

   @danny0405 @hbgstc123 Hello, why not throw an exception? In the #9212 case, if we do not throw an error, multiple jobs running the same action can also create the same request file, because rename will not throw an exception.




[GitHub] [hudi] danny0405 commented on a diff in pull request #8610: [HUDI-6156] prevent leaving tmp file in timeline when multi process t…

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8610:
URL: https://github.com/apache/hudi/pull/8610#discussion_r1194799259


##########
hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieWrapperFileSystem.java:
##########
@@ -1039,21 +1039,27 @@ public void createImmutableFileInPath(Path fullPath, Option<byte[]> content)
         fsout.write(content.get());
       }
     } catch (IOException e) {
-      String errorMsg = "Failed to create file" + (tmpPath != null ? tmpPath : fullPath);
+      String errorMsg = "Failed to create file " + (tmpPath != null ? tmpPath : fullPath);
       throw new HoodieIOException(errorMsg, e);
     } finally {
       try {
         if (null != fsout) {
           fsout.close();
         }
       } catch (IOException e) {
-        String errorMsg = "Failed to close file" + (needTempFile ? tmpPath : fullPath);
+        String errorMsg = "Failed to close file " + (needTempFile ? tmpPath : fullPath);
         throw new HoodieIOException(errorMsg, e);
       }
 
       try {
         if (null != tmpPath) {
-          fileSystem.rename(tmpPath, fullPath);
+          boolean renameSuccess = fileSystem.rename(tmpPath, fullPath);
+          if (!renameSuccess) {
+            fileSystem.delete(tmpPath, false);
+            LOG.error("Fail to rename " + tmpPath + " to " + fullPath);
+            throw new HoodieIOException("Failed to rename " + tmpPath + " to the target " + fullPath
+                + ", target file exists: " + fileSystem.exists(fullPath));
+          }

Review Comment:
   The `HoodieIOException` is caught by the outer code, can we avoid that?





[GitHub] [hudi] hudi-bot commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1631819960

   ## CI report:
   
   * f34ffd6ccf4fd366ade5dad8487ff9a0a248bec8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16758) 
   * 0bd8531ac13126771f84db19fa02fbe1828b762d UNKNOWN
   * bfc9450676a0742bad6e25ee1e56aa240e31fe74 UNKNOWN
   




[GitHub] [hudi] KnightChess commented on a diff in pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "KnightChess (via GitHub)" <gi...@apache.org>.
KnightChess commented on code in PR #8610:
URL: https://github.com/apache/hudi/pull/8610#discussion_r1309663986


##########
hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieWrapperFileSystem.java:
##########
@@ -1039,21 +1039,27 @@ public void createImmutableFileInPath(Path fullPath, Option<byte[]> content)
         fsout.write(content.get());
       }
     } catch (IOException e) {
-      String errorMsg = "Failed to create file" + (tmpPath != null ? tmpPath : fullPath);
+      String errorMsg = "Failed to create file " + (tmpPath != null ? tmpPath : fullPath);
       throw new HoodieIOException(errorMsg, e);
     } finally {
       try {
         if (null != fsout) {
           fsout.close();
         }
       } catch (IOException e) {
-        String errorMsg = "Failed to close file" + (needTempFile ? tmpPath : fullPath);
+        String errorMsg = "Failed to close file " + (needTempFile ? tmpPath : fullPath);
         throw new HoodieIOException(errorMsg, e);
       }
 
       try {
         if (null != tmpPath) {
-          fileSystem.rename(tmpPath, fullPath);
+          boolean renameSuccess = fileSystem.rename(tmpPath, fullPath);
+          if (!renameSuccess) {
+            fileSystem.delete(tmpPath, false);
+            LOG.error("Fail to rename " + tmpPath + " to " + fullPath);
+            throw new HoodieIOException("Failed to rename " + tmpPath + " to the target " + fullPath
+                + ", target file exists: " + fileSystem.exists(fullPath));
+          }

Review Comment:
   @danny0405 @hbgstc123 Hello, why not throw an exception? In the https://github.com/apache/hudi/pull/9212 case, if we do not throw an error, multiple jobs running the same action can also create the same request file, because rename will not throw an exception.



Review Comment:
   And when markers are used to resolve conflicting files, one job will delete the other job's data files.





[GitHub] [hudi] hbgstc123 commented on a diff in pull request #8610: [HUDI-6156] prevent leaving tmp file in timeline when multi process t…

Posted by "hbgstc123 (via GitHub)" <gi...@apache.org>.
hbgstc123 commented on code in PR #8610:
URL: https://github.com/apache/hudi/pull/8610#discussion_r1195352323


##########
hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieWrapperFileSystem.java:
##########
@@ -1039,21 +1039,27 @@ public void createImmutableFileInPath(Path fullPath, Option<byte[]> content)
         fsout.write(content.get());
       }
     } catch (IOException e) {
-      String errorMsg = "Failed to create file" + (tmpPath != null ? tmpPath : fullPath);
+      String errorMsg = "Failed to create file " + (tmpPath != null ? tmpPath : fullPath);
       throw new HoodieIOException(errorMsg, e);
     } finally {
       try {
         if (null != fsout) {
           fsout.close();
         }
       } catch (IOException e) {
-        String errorMsg = "Failed to close file" + (needTempFile ? tmpPath : fullPath);
+        String errorMsg = "Failed to close file " + (needTempFile ? tmpPath : fullPath);
         throw new HoodieIOException(errorMsg, e);
       }
 
       try {
         if (null != tmpPath) {
-          fileSystem.rename(tmpPath, fullPath);
+          boolean renameSuccess = fileSystem.rename(tmpPath, fullPath);
+          if (!renameSuccess) {
+            fileSystem.delete(tmpPath, false);
+            LOG.error("Fail to rename " + tmpPath + " to " + fullPath);
+            throw new HoodieIOException("Failed to rename " + tmpPath + " to the target " + fullPath
+                + ", target file exists: " + fileSystem.exists(fullPath));
+          }

Review Comment:
   How about moving `renameSuccess` and the `throw` out of the try block?
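
One way to read this suggestion is sketched below. It is not the code that was merged: `RenamingFs` and `InMemoryFs` are hypothetical stand-ins for a file system whose `rename` reports failure by returning false, and plain JDK exceptions are used in place of `HoodieIOException`. The point is only that the rename result is captured inside the try block and acted on after it, so the "rename failed" error does not pass through the handler meant for I/O errors.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Set;

final class RenameCheckSketch {

  /** Hypothetical file system abstraction; rename returns false instead of throwing. */
  interface RenamingFs {
    boolean rename(String src, String dst) throws IOException;

    boolean delete(String path) throws IOException;
  }

  /** Tiny in-memory implementation used only to exercise the sketch. */
  static final class InMemoryFs implements RenamingFs {
    final Set<String> files = new HashSet<>();

    @Override
    public boolean rename(String src, String dst) {
      if (!files.contains(src) || files.contains(dst)) {
        return false;
      }
      files.remove(src);
      files.add(dst);
      return true;
    }

    @Override
    public boolean delete(String path) {
      return files.remove(path);
    }
  }

  static void renameOrFail(RenamingFs fs, String tmpPath, String fullPath) {
    boolean renameSuccess;
    try {
      renameSuccess = fs.rename(tmpPath, fullPath);
      if (!renameSuccess) {
        // Clean up the tmp file before reporting the failure.
        fs.delete(tmpPath);
      }
    } catch (IOException e) {
      throw new UncheckedIOException("I/O error while renaming " + tmpPath + " to " + fullPath, e);
    }
    // Checked after the try block, so this failure bypasses the catch above.
    if (!renameSuccess) {
      throw new IllegalStateException("Failed to rename " + tmpPath + " to the target " + fullPath);
    }
  }

  public static void main(String[] args) {
    InMemoryFs fs = new InMemoryFs();
    fs.files.add("20230429171948763.clean.tmp");
    fs.files.add("20230429171948763.clean");
    try {
      renameOrFail(fs, "20230429171948763.clean.tmp", "20230429171948763.clean");
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage() + ", remaining files: " + fs.files);
    }
  }
}
```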





[GitHub] [hudi] danny0405 commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1698497261

   > why not throw exception?
   
   Which step do you mean should throw the exception? We generally need some lock to make instant time generation monotonically increasing. Currently we have no good way to ensure that, but for HDFS, creating a file with the same name throws if overwrite is disabled.
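
To illustrate what "a lock to make instant time generation monotonically increasing" could look like, here is a minimal single-process sketch. Everything in it is an assumption made for illustration: the class name `MonotonicInstantSketch`, the use of epoch milliseconds as the instant, and the in-JVM `ReentrantLock`. Writers in separate processes, which is the case discussed in this thread, would need a lock shared across processes instead.

```java
import java.util.concurrent.locks.ReentrantLock;

final class MonotonicInstantSketch {

  private final ReentrantLock lock = new ReentrantLock();
  private long lastInstant = Long.MIN_VALUE;

  /**
   * Returns an instant that is strictly greater than every instant handed out
   * before, even if the supplied clock reading repeats or goes backwards.
   */
  long nextInstant(long clockMillis) {
    lock.lock();
    try {
      lastInstant = Math.max(lastInstant + 1, clockMillis);
      return lastInstant;
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) {
    MonotonicInstantSketch generator = new MonotonicInstantSketch();
    long first = generator.nextInstant(System.currentTimeMillis());
    long second = generator.nextInstant(System.currentTimeMillis());
    System.out.println(first + " < " + second + " : " + (first < second));
  }
}
```

Because two callers can never receive the same value, they can never try to complete the same instant, which removes the rename conflict at its source rather than cleaning up after it.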




[GitHub] [hudi] hudi-bot commented on pull request #8610: [HUDI-6156] prevent leaving tmp file in timeline when multi process t…

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1528739547

   ## CI report:
   
   * f34ffd6ccf4fd366ade5dad8487ff9a0a248bec8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16758) 
   




[GitHub] [hudi] hudi-bot commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1632304488

   ## CI report:
   
   * 0bd8531ac13126771f84db19fa02fbe1828b762d UNKNOWN
   * bfc9450676a0742bad6e25ee1e56aa240e31fe74 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18512) 
   * d28d8dc0e7a243f6ce8806c7b345ed17ecb4b619 UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #8610: [HUDI-6156] prevent leaving tmp file in timeline when multi process t…

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1528737263

   ## CI report:
   
   * f34ffd6ccf4fd366ade5dad8487ff9a0a248bec8 UNKNOWN
   




[GitHub] [hudi] KnightChess commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "KnightChess (via GitHub)" <gi...@apache.org>.
KnightChess commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1698503648

   > > why not throw exception?
   > 
   > Which step do you mean should throw the exception? We generally need some lock to make instant time generation monotonically increasing. Currently we have no good way to ensure that, but for HDFS, creating a file with the same name throws if overwrite is disabled.
   
   Yes, I think we need to throw the exception: if multiple jobs use the same instant, one of them should fail. On the other hand, the atomicity of the method (createImmutableFileInPath) should be guaranteed.
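
The "one of them should fail" behaviour can be sketched with an exclusive create. This is an analogy on the local file system using `java.nio.file`, not the Hadoop call used by `createImmutableFileInPath`; the class name, method name, and the file name `example.requested` are hypothetical. `StandardOpenOption.CREATE_NEW` makes the existence check and the creation a single atomic step, so exactly one writer wins.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ExclusiveCreateSketch {

  /** Creates the file only if it does not exist yet; the losing writer gets an exception. */
  static void createExclusively(Path path, byte[] content) {
    try {
      Files.write(path, content, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
    } catch (FileAlreadyExistsException e) {
      throw new IllegalStateException("Another writer already created " + path, e);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Simulates two jobs creating the same file and reports which content survived. */
  static String demo() {
    try {
      Path dir = Files.createTempDirectory("exclusive-sketch");
      Path file = dir.resolve("example.requested");
      createExclusively(file, "job-1".getBytes(StandardCharsets.UTF_8));
      try {
        createExclusively(file, "job-2".getBytes(StandardCharsets.UTF_8));
        return "both succeeded";
      } catch (IllegalStateException e) {
        String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        return "second job failed, content=" + content;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```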




[GitHub] [hudi] KnightChess commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "KnightChess (via GitHub)" <gi...@apache.org>.
KnightChess commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1698499775

   @danny0405 Sorry, I commented under the review discussion.




[GitHub] [hudi] KnightChess commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "KnightChess (via GitHub)" <gi...@apache.org>.
KnightChess commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1697151244

   And when markers are used to resolve conflicting files, one job will delete the other job's data files.




[GitHub] [hudi] nsivabalan commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "nsivabalan (via GitHub)" <gi...@apache.org>.
nsivabalan commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1618459143

   Hey @hbgstc123 @danny0405: is this very critical? We only have a few days left to land this if we want to get it in. Can you gauge the risk and ROI of landing this and make a call?




[GitHub] [hudi] hudi-bot commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1631964372

   ## CI report:
   
   * 0bd8531ac13126771f84db19fa02fbe1828b762d UNKNOWN
   * bfc9450676a0742bad6e25ee1e56aa240e31fe74 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18512) 
   




[GitHub] [hudi] hudi-bot commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1632335297

   ## CI report:
   
   * 0bd8531ac13126771f84db19fa02fbe1828b762d UNKNOWN
   * bfc9450676a0742bad6e25ee1e56aa240e31fe74 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18512) 
   * d28d8dc0e7a243f6ce8806c7b345ed17ecb4b619 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=18523) 
   




[GitHub] [hudi] danny0405 commented on pull request #8610: [HUDI-6156] prevent leaving tmp file in timeline when multi process t…

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1550583117

   Do we still need this change?




[GitHub] [hudi] hudi-bot commented on pull request #8610: [HUDI-6156] prevent leaving tmp file in timeline when multi process t…

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1551654872

   ## CI report:
   
   * f34ffd6ccf4fd366ade5dad8487ff9a0a248bec8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16758) 
   * 0bd8531ac13126771f84db19fa02fbe1828b762d UNKNOWN
   

