Posted to commits@hudi.apache.org by "voonhous (via GitHub)" <gi...@apache.org> on 2023/03/21 05:03:42 UTC

[GitHub] [hudi] voonhous opened a new pull request, #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

voonhous opened a new pull request, #7997:
URL: https://github.com/apache/hudi/pull/7997

   ...ndle
   
   Applying the fix from #5185 will fix write issues for MOR tables, but will cause write issues for COW tables.
   
   More information on how to reproduce the COW error in this jira issue:
   https://issues.apache.org/jira/browse/HUDI-5822
   
   This addresses the issue raised here: https://github.com/apache/hudi/issues/5782
   
   ### Change Logs
   1. Complement the change of `getLatestFileSlices` -> `getAllFileGroups` in #5185
   
   ### Impact
   
   1. Allow jobs to recover properly
   2. Ensure that the correct fileSlices are being read
   
   ### Risk level (write none, low medium or high below)
   
   None
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
     ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make
     changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1119607785


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   Please take a look at the fixes again. 
   
   I don't think it's possible to move the test to `ITTestDataStreamWrite` as the tests assume that the end-state of the table will only have parquet files. 
   
   The same assumption cannot be made for the tests that I added in `ITTestBucketStreamWrite`.





[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118334031


##########
hudi-common/src/test/java/org/apache/hudi/common/testutils/FileCreateUtils.java:
##########
@@ -101,7 +101,11 @@ public static String markerFileName(String instantTime, String fileId, IOType io
   }
 
   public static String markerFileName(String instantTime, String fileId, IOType ioType, String fileExtension) {
-    return String.format("%s_%s_%s%s%s.%s", fileId, WRITE_TOKEN, instantTime, fileExtension, HoodieTableMetaClient.MARKER_EXTN, ioType);
+    return markerFileName(instantTime, fileId, ioType, fileExtension, WRITE_TOKEN);
+  }
+
+  public static String markerFileName(String instantTime, String fileId, IOType ioType, String fileExtension, String writeToken) {
+    return String.format("%s_%s_%s%s%s.%s", fileId, writeToken, instantTime, fileExtension, HoodieTableMetaClient.MARKER_EXTN, ioType);

Review Comment:
   @danny0405 ignore my last message. 
   
   May I ask why it is not recommended to add overloaded static functions to test classes that are mainly used for testing? 
   
   The file I am modifying is in a project namespace that belongs to the test scope.
   
   I feel it is entirely reasonable to add helper functions to a helper class for testing purposes. WDYT? 
   
   ```markdown
   hudi-common/src/**test**/java/org/apache/hudi/common/testutils/FileCreateUtils.java
   ```
   
   CMIIW, thanks.
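
For readers following the diff above, the overload pattern under discussion can be sketched in isolation. This is only an illustration: `MarkerNames`, the local `IOType` enum, and the values of `WRITE_TOKEN` and `MARKER_EXTN` are stand-ins assumed for this example, not the actual Hudi definitions. Only the format string is taken from the diff.

```java
// Hypothetical stand-in for the IO type enum referenced in the diff.
enum IOType { MERGE, CREATE, APPEND }

final class MarkerNames {
  // Assumed placeholder values, used only for this illustration.
  static final String WRITE_TOKEN = "1-0-1";
  static final String MARKER_EXTN = ".marker";

  private MarkerNames() { }

  // Four-argument form: delegates to the overload with the default write token.
  static String markerFileName(String instantTime, String fileId, IOType ioType, String fileExtension) {
    return markerFileName(instantTime, fileId, ioType, fileExtension, WRITE_TOKEN);
  }

  // Five-argument overload: lets a test supply its own write token.
  static String markerFileName(String instantTime, String fileId, IOType ioType,
                               String fileExtension, String writeToken) {
    return String.format("%s_%s_%s%s%s.%s",
        fileId, writeToken, instantTime, fileExtension, MARKER_EXTN, ioType);
  }
}
```

Existing callers of the four-argument form keep their behavior, since it simply forwards the default token.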





[GitHub] [hudi] voonhous commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1477286075

   @danny0405 The tests included in this PR fail when run without the fix, i.e. the fix in this PR is still required.




[GitHub] [hudi] danny0405 commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118389369


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   I don't think this is a Flink-specific bug. I agree with @bvaradar that somehow the MOR log rollback or the fs view (maybe related to metadata) should be fixed; let's create another JIRA issue to track the improvement.
   
   Eradicating the bug at its root cause will take some time, so I also agree with @voonhous that we can have a quick fix on the Flink side.
   
   @voonhous The PR overall looks good. Can we move the tests into `ITTestDataStreamWrite` to reuse some tool methods, or re-organize these two test cases for some reuse? There are already bucket index tests in `ITTestDataStreamWrite`.





[GitHub] [hudi] hudi-bot commented on pull request #7997: Fix FileId not found exception when FileId is passed to HoodieMergeHa…

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1436864541

   ## CI report:
   
   * 907d6ed5d9fe948a0eb9da995b29ee41fe3a6b87 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix write and read correctness issue when a rollback is performed

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1113914745


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -148,7 +172,10 @@ public Option<FileSlice> getLatestFileSlicesIncludingInflight() {
    */
   public Stream<FileSlice> getAllFileSlices() {
     if (!timeline.empty()) {
-      return fileSlices.values().stream().filter(this::isFileSliceCommitted);
+      List<String> instantsToRollback = getInstantsToRollbackFromTimeline();
+      return fileSlices.values().stream()
+          .filter(this::isFileSliceCommitted)
+          .filter(s -> !instantsToRollback.contains(s.getBaseInstantTime()));

Review Comment:
   @bvaradar
   
   Understood, am working on a hacky fix for this now... I'll try to avoid touching core APIs for now. 





[GitHub] [hudi] voonhous commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1439621064

   > @voonhous, could you please add the test cases for the changes?
   
   Hmmm, not easy to write a test for this. This bug can only be triggered if JM's rollback is running behind TM.




[GitHub] [hudi] bvaradar commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "bvaradar (via GitHub)" <gi...@apache.org>.
bvaradar commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118142197


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   After reading your comment in [BucketStreamWriteFunction](https://github.com/apache/hudi/pull/7997/files#diff-bc2aaf0958acca5adb72c26a76348ab70c994929ac76a7befe70a263ffb66353), I understand the difference in COW and MOR behavior. The cleaner way would be to have the MOR rollback logic delete the log files if the commit being rolled back created that file group. This avoids special-casing in other places. 
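
The rule proposed in the comment above could be sketched roughly as follows. This is a minimal illustration, not the Hudi rollback code: `AppendRollbackSketch`, `Action`, and `decide` are hypothetical names, and the sketch assumes that a log file whose base instant equals the instant being rolled back belongs to a file group created by that commit.

```java
final class AppendRollbackSketch {
  // Hypothetical outcomes for a log file touched by the commit being rolled back.
  enum Action { DELETE_LOG_FILE, APPEND_ROLLBACK_BLOCK }

  private AppendRollbackSketch() { }

  // Assumption: equal instants mean the failed commit created the file group,
  // so nothing committed lives in it and the log file can be removed outright.
  static Action decide(String instantToRollback, String logFileBaseInstant) {
    return instantToRollback.equals(logFileBaseInstant)
        ? Action.DELETE_LOG_FILE
        : Action.APPEND_ROLLBACK_BLOCK;
  }
}
```

With a rule of this shape, writers would never observe a file group that exists only because of a failed commit, which is the special-casing the comment wants to avoid.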





[GitHub] [hudi] bvaradar commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "bvaradar (via GitHub)" <gi...@apache.org>.
bvaradar commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1119512693


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   +1





[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1117898536


##########
hudi-common/src/test/java/org/apache/hudi/common/testutils/FileCreateUtils.java:
##########
@@ -101,7 +101,11 @@ public static String markerFileName(String instantTime, String fileId, IOType io
   }
 
   public static String markerFileName(String instantTime, String fileId, IOType ioType, String fileExtension) {
-    return String.format("%s_%s_%s%s%s.%s", fileId, WRITE_TOKEN, instantTime, fileExtension, HoodieTableMetaClient.MARKER_EXTN, ioType);
+    return markerFileName(instantTime, fileId, ioType, fileExtension, WRITE_TOKEN);
+  }
+
+  public static String markerFileName(String instantTime, String fileId, IOType ioType, String fileExtension, String writeToken) {
+    return String.format("%s_%s_%s%s%s.%s", fileId, writeToken, instantTime, fileExtension, HoodieTableMetaClient.MARKER_EXTN, ioType);

Review Comment:
   Will implement a custom marker creation logic within the test then





[GitHub] [hudi] danny0405 closed pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 closed pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...
URL: https://github.com/apache/hudi/pull/7997




[GitHub] [hudi] hudi-bot commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1443823058

   ## CI report:
   
   * 4ea65336bf55d988e388f7301a0cad9f42bd7b9b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15311) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15388) 
   




[GitHub] [hudi] SteNicholas commented on a diff in pull request #7997: Fix FileId not found exception when FileId is passed to HoodieMergeHa…

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1111878645


##########
hudi-common/src/main/java/org/apache/hudi/common/table/view/TableFileSystemView.java:
##########
@@ -157,6 +157,8 @@ interface SliceView extends SliceViewWithLatestSlice {
 
   /**
    * Stream all the file groups for a given partition.
+   * <p>
+   * Note: This method will return all file groups in a partition, i.e. uncommitted filegroups will be returned

Review Comment:
   Why does this add the note? IMO, the description of this interface doesn't mention that the return value excludes uncommitted filegroups.





[GitHub] [hudi] bvaradar commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "bvaradar (via GitHub)" <gi...@apache.org>.
bvaradar commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118140213


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   @voonhous: thanks for the great explanation. From your comment, it looks like there is a difference in the way the Flink integration resolves valid files. 
   
   Regarding your comment: 
   ```Once the rollback completes, a partition might have a bucketId that maps to two fileGroups, breaking the 1 bucketId <> 1 fileGroup mapping contract.```
   
   As part of rollback, shouldn't the underlying file (which was newly created as part of the failed commit getting rolled back) get deleted? Also, why is the fileId getting read if the commit did not finish?
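
To make the "1 bucketId <> 1 fileGroup" contract quoted above concrete, here is a small sketch that flags buckets backed by more than one file group. It assumes the bucket id is the zero-padded 8-digit prefix of the file id; `BucketMappingCheck` and its methods are hypothetical helpers, not part of Hudi.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class BucketMappingCheck {
  private BucketMappingCheck() { }

  // Assumption: the first 8 characters of a file id encode the bucket id.
  static int bucketIdOf(String fileId) {
    return Integer.parseInt(fileId.substring(0, 8));
  }

  // Returns only the buckets that map to two or more file ids,
  // i.e. the violations of the one-to-one contract.
  static Map<Integer, List<String>> violations(List<String> fileIds) {
    Map<Integer, List<String>> byBucket = new TreeMap<>();
    for (String fileId : fileIds) {
      byBucket.computeIfAbsent(bucketIdOf(fileId), k -> new ArrayList<>()).add(fileId);
    }
    byBucket.values().removeIf(ids -> ids.size() < 2);
    return byBucket;
  }
}
```

If a rolled-back commit leaves its newly created file group visible, a check of this kind would report the affected bucket with both file ids.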





[GitHub] [hudi] danny0405 commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1475768034

   > TM might run ahead of JM
   
   How could this happen? The write tasks on JM would wait for the JM coordinator to finish initialization and then start to handle the writing process.




[GitHub] [hudi] danny0405 commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1119831489


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java:
##########
@@ -115,16 +115,24 @@ protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePat
     // TODO(HUDI-1517) use provided marker-file's path instead
     Option<HoodieLogFile> latestLogFileOption = FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
         HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime);
+
+    // Log file can be deleted if the commit to rollback is also the commit that created the fileGroup

Review Comment:
   Okay, it seems this is the code you want to fork from the listing-based rollback strategy:
   
   ```java
                 // In case all data was inserts and the commit failed, delete the file belonging to that commit
                 // We do not know fileIds for inserts (first inserts are either log files or base files),
                 // delete all files for the corresponding failed commit, if present (same as COW)
                 hoodieRollbackRequests.add(getHoodieRollbackRequest(partitionPath, filesToDelete));
   
   ```
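
The behavior described in the quoted code comment, deleting every file that belongs to the failed commit, might look like the sketch below when reduced to its core. The map from file name to commit time is a hypothetical input chosen for this illustration; how the listing-based strategy actually derives the commit time of a file is not shown in this thread.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ListingRollbackSketch {
  private ListingRollbackSketch() { }

  // Selects every file whose commit time matches the failed instant.
  // The result is sorted only to make the output deterministic.
  static List<String> filesToDelete(Map<String, String> commitTimeByFile, String failedInstant) {
    return commitTimeByFile.entrySet().stream()
        .filter(e -> e.getValue().equals(failedInstant))
        .map(Map.Entry::getKey)
        .sorted()
        .collect(Collectors.toList());
  }
}
```

Because the selection is driven purely by commit time, it covers both base files and log files written by the failed commit, matching the "same as COW" remark in the quoted comment.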





[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1119832662


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java:
##########
@@ -115,16 +115,24 @@ protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePat
     // TODO(HUDI-1517) use provided marker-file's path instead
     Option<HoodieLogFile> latestLogFileOption = FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
         HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime);
+
+    // Log file can be deleted if the commit to rollback is also the commit that created the fileGroup

Review Comment:
   Yes. 





[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118217290


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   IIUC, this issue is specific to Flink writers, as the write and rollback logic might execute concurrently. 
   
   While this will fix the issue, I agree that the cleaner way is for MOR to delete the log files so that the responsibilities of writers and rollback executors are clearly defined, i.e. writers should not know or handle the quirks of rollback executors. The current way of fixing things does introduce code smell.
   
   To be honest, I am getting a lot of resistance and pushback with this fix, so I am going to take small steps. I have already spent 8 hours investigating and writing the "required" test case and fixes for this PR, and there doesn't seem to be agreement on which direction everybody wants to take. 
   
   @danny0405 @bvaradar LMK which direction the two of you would like to take to fix this issue. 
   
   Thank you.





[GitHub] [hudi] bvaradar commented on a diff in pull request #7997: [HUDI-5822] Fix write and read correctness issue when a rollback is performed

Posted by "bvaradar (via GitHub)" <gi...@apache.org>.
bvaradar commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1113911602


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   @voonhous: Not sure I am following this. When a rollback happens, a command block gets added to skip the rolled-back log blocks. 
   
   



##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -148,7 +172,10 @@ public Option<FileSlice> getLatestFileSlicesIncludingInflight() {
    */
   public Stream<FileSlice> getAllFileSlices() {
     if (!timeline.empty()) {
-      return fileSlices.values().stream().filter(this::isFileSliceCommitted);
+      List<String> instantsToRollback = getInstantsToRollbackFromTimeline();
+      return fileSlices.values().stream()
+          .filter(this::isFileSliceCommitted)
+          .filter(s -> !instantsToRollback.contains(s.getBaseInstantTime()));

Review Comment:
   HoodieFileGroup is a core class and getAllFileSlices() is extensively used. This is not the right place to be reading rollback metadata; it will slow down almost all timeline operations. 
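
To illustrate the cost concern raised above, the following sketch keeps the metadata lookup out of the frequently called method: the caller resolves the instants pending rollback once and passes them in as a `Set`. `SliceFilterSketch` and `visibleBaseInstants` are hypothetical names, and nothing in this thread establishes that this shape fits Hudi's design. It only shows the filter from the diff with the expensive step hoisted out.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class SliceFilterSketch {
  private SliceFilterSketch() { }

  // Keeps slice base instants that are not pending rollback.
  // The Set is computed by the caller, so this method performs no I/O
  // and each membership test is a constant-time hash lookup.
  static List<String> visibleBaseInstants(List<String> sliceBaseInstants,
                                          Set<String> instantsToRollback) {
    return sliceBaseInstants.stream()
        .filter(t -> !instantsToRollback.contains(t))
        .collect(Collectors.toList());
  }
}
```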





[GitHub] [hudi] hudi-bot commented on pull request #7997: Fix FileId not found exception when FileId is passed to HoodieMergeHa…

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1436885969

   ## CI report:
   
   * 907d6ed5d9fe948a0eb9da995b29ee41fe3a6b87 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15297) 
   




[GitHub] [hudi] hudi-bot commented on pull request #7997: [HUDI-5822] Fix write and read correctness issue when a rollback is performed

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1438286983

   ## CI report:
   
   * 907d6ed5d9fe948a0eb9da995b29ee41fe3a6b87 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15297) 
   * 4ea65336bf55d988e388f7301a0cad9f42bd7b9b UNKNOWN
   




[GitHub] [hudi] danny0405 commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1475695770

   Closing because this should be fixed in https://github.com/apache/hudi/pull/8077




[GitHub] [hudi] voonhous closed pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous closed pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...
URL: https://github.com/apache/hudi/pull/7997




[GitHub] [hudi] voonhous commented on pull request #7997: Fix FileId not found exception when FileId is passed to HoodieMergeHa…

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1436895922

   Do not merge this yet; my fix rolls back the fix made here:
   
   https://github.com/apache/hudi/pull/5185




[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1113948876


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bucket/BucketStreamWriteFunction.java:
##########
@@ -157,8 +157,8 @@ private void bootstrapIndexIfNeed(String partition) {
 
     // Load existing fileID belongs to this task

Review Comment:
   Not really. I added comments on when we cannot load the latest fileId (latest fileId = committed fileId).





[GitHub] [hudi] hudi-bot commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1444088547

   ## CI report:
   
   * 4ea65336bf55d988e388f7301a0cad9f42bd7b9b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15311) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15388) 
   




[GitHub] [hudi] voonhous commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1443787040

   @hudi-bot run azure




[GitHub] [hudi] hudi-bot commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa…

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1437312588

   ## CI report:
   
   * 907d6ed5d9fe948a0eb9da995b29ee41fe3a6b87 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15297) 
   




[GitHub] [hudi] SteNicholas commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1439619395

   @voonhous, could you please add the test cases for the changes?




[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1113959105


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   @bvaradar Reverted the change, please take a look.
   
   Understood. The crux of the issue that I am trying to fix is somewhat similar to a multi-writer scenario.
   
   In Flink, the job manager (JM) is responsible for performing rollbacks, while the task manager (TM) is responsible for performing writes.
   
   It is entirely possible, and very common, for a TM to perform writes before the JM performs a rollback (see https://issues.apache.org/jira/browse/HUDI-5822) when a job is recovering and restarting.
   
   Under the bucket index use case, a bucketId can only have 1 fileGroup. Using `getLatestFileSlices` when the JM has yet to complete a rollback will cause the fileGroup that is pending rollback completion to not be visible to the TM.
   
   The TM will hence generate a new fileGroup for the same bucketId. Once the rollback completes, a partition might have a bucketId that maps to two fileGroups, breaking the 1 bucketId <> 1 fileGroup mapping contract.
   
   This is what #5185 was trying to fix: it allows fileGroups pending rollback to be reused when performing a bucket index bootstrap.
   
   I found that fix a tad hacky, which is why I tried modifying the lower-level APIs to address such scenarios. As can be seen, I soon gave up due to the performance penalties that you highlighted.
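   
   The race described above can be illustrated with a toy model. This is only a sketch, not Hudi code: `FileGroup`, `bootstrap`, `write`, and `groupsForBucket` are hypothetical names, and the model assumes the file group pending rollback stays on storage after the rollback completes. With a committed-only view, the TM misses the pending file group and creates a second one for the same bucketId; with a view of all file groups, it reuses the existing one.
   
   ```java
   import java.util.ArrayList;
   import java.util.HashMap;
   import java.util.List;
   import java.util.Map;
   
   /** Toy model of a bucket index bootstrap; hypothetical names, not Hudi APIs. */
   final class BucketBootstrapSketch {
   
     /** A file group on storage; pendingRollback marks one whose only commit has not been rolled back yet. */
     static final class FileGroup {
       final int bucketId;
       final String fileId;
       final boolean pendingRollback;
   
       FileGroup(int bucketId, String fileId, boolean pendingRollback) {
         this.bucketId = bucketId;
         this.fileId = fileId;
         this.pendingRollback = pendingRollback;
       }
     }
   
     /** Builds bucketId -> fileId. With committedOnly, file groups pending rollback are invisible. */
     static Map<Integer, String> bootstrap(List<FileGroup> storage, boolean committedOnly) {
       Map<Integer, String> index = new HashMap<>();
       for (FileGroup fg : storage) {
         if (committedOnly && fg.pendingRollback) {
           continue;
         }
         index.put(fg.bucketId, fg.fileId);
       }
       return index;
     }
   
     /** Writes to a bucket; creates a new file group on storage when the index has no entry for it. */
     static String write(int bucketId, Map<Integer, String> index, List<FileGroup> storage) {
       String fileId = index.get(bucketId);
       if (fileId == null) {
         fileId = "new-fg-for-bucket-" + bucketId;
         storage.add(new FileGroup(bucketId, fileId, false));
         index.put(bucketId, fileId);
       }
       return fileId;
     }
   
     /** Counts the file groups on storage that belong to the given bucket. */
     static long groupsForBucket(List<FileGroup> storage, int bucketId) {
       return storage.stream().filter(fg -> fg.bucketId == bucketId).count();
     }
   
     public static void main(String[] args) {
       for (boolean committedOnly : new boolean[] {true, false}) {
         List<FileGroup> storage = new ArrayList<>();
         storage.add(new FileGroup(0, "fg-pending", true));
         write(0, bootstrap(storage, committedOnly), storage);
         System.out.println("committedOnly=" + committedOnly
             + " -> file groups for bucket 0: " + groupsForBucket(storage, 0));
       }
     }
   }
   ```
   
   In this model the committed-only bootstrap is the case that breaks the 1 bucketId <> 1 fileGroup contract.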
   





[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1119820861


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java:
##########
@@ -115,16 +115,24 @@ protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePat
     // TODO(HUDI-1517) use provided marker-file's path instead
     Option<HoodieLogFile> latestLogFileOption = FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
         HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime);
+
+    // Log file can be deleted if the commit to rollback is also the commit that created the fileGroup

Review Comment:
   @danny0405 here





[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118334031


##########
hudi-common/src/test/java/org/apache/hudi/common/testutils/FileCreateUtils.java:
##########
@@ -101,7 +101,11 @@ public static String markerFileName(String instantTime, String fileId, IOType io
   }
 
   public static String markerFileName(String instantTime, String fileId, IOType ioType, String fileExtension) {
-    return String.format("%s_%s_%s%s%s.%s", fileId, WRITE_TOKEN, instantTime, fileExtension, HoodieTableMetaClient.MARKER_EXTN, ioType);
+    return markerFileName(instantTime, fileId, ioType, fileExtension, WRITE_TOKEN);
+  }
+
+  public static String markerFileName(String instantTime, String fileId, IOType ioType, String fileExtension, String writeToken) {
+    return String.format("%s_%s_%s%s%s.%s", fileId, writeToken, instantTime, fileExtension, HoodieTableMetaClient.MARKER_EXTN, ioType);

Review Comment:
   @danny0405 ignore my last message. 
   
   May I ask why it is not recommended to add overloaded static functions to test classes that are mainly used for testing?
   
   The file I am modifying sits under the test source tree, so it belongs to the test scope.
   
   I feel it is entirely reasonable to add helper functions to a helper class for testing purposes. WDYT?
   
   hudi-common/src/**test**/java/org/apache/hudi/common/testutils/FileCreateUtils.java
   
   CMIIW, thanks.
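   
   For reference, the overload in the diff above only threads the write token through the format string. A standalone sketch follows; the class name is hypothetical, `IOType` is replaced by a plain string, and the marker extension is passed in as a parameter instead of reading `HoodieTableMetaClient.MARKER_EXTN`, so the values used in `main` are illustrative assumptions only.
   
   ```java
   /** Standalone rendering of the marker file name format from the diff; not the Hudi test utility itself. */
   final class MarkerFileNameSketch {
   
     /** Same layout as the diff: fileId_writeToken_instantTime + fileExtension + markerExtn + "." + ioType. */
     static String markerFileName(String instantTime, String fileId, String ioType,
                                  String fileExtension, String writeToken, String markerExtn) {
       return String.format("%s_%s_%s%s%s.%s", fileId, writeToken, instantTime, fileExtension, markerExtn, ioType);
     }
   
     public static void main(String[] args) {
       // All argument values below are made up for illustration.
       System.out.println(markerFileName("001", "file-1", "CREATE", ".parquet", "1-0-1", ".marker"));
     }
   }
   ```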





[GitHub] [hudi] danny0405 commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1477288540

   > @danny0405 Running the tests included with this PR without the fix fails... i.e. The fix in this PR is still required.
   
   Got it. Can you rebase onto the latest master and force-push again?




[GitHub] [hudi] danny0405 commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118389369


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   I don't think this is a Flink-specific bug. I agree with @bvaradar that the MOR log rollback or the fs view (maybe related to metadata) should somehow be fixed; let's create another JIRA issue to track the improvement.
   
   Eradicating the bug at its root cause will take some time, so I also agree with @voonhous that we can have a quick fix on the Flink side.
   
   @voonhous The PR overall looks good. Can we move the tests into `ITTestDataStreamWrite` to reuse some utility methods, or reorganize these two test cases for better reuse? There are already bucket index tests in `ITTestDataStreamWrite`.





[GitHub] [hudi] danny0405 commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118389369


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   I don't think this is a Flink-specific bug. I agree with @bvaradar that the MOR log rollback or the fs view (maybe related to metadata) should somehow be fixed; let's create another JIRA issue to track the improvement.
   
   Eradicating the bug at its root cause will take some time, so I also agree with @voonhous that we can have a quick fix on the Flink side.
   
   @voonhous The PR overall looks good. Can we move the tests into `ITTestDataStreamWrite` to reuse some utility methods, or reorganize these two test cases for better reuse? There are already bucket index tests in `ITTestDataStreamWrite`.





[GitHub] [hudi] danny0405 commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1117860492


##########
hudi-common/src/test/java/org/apache/hudi/common/testutils/FileCreateUtils.java:
##########
@@ -101,7 +101,11 @@ public static String markerFileName(String instantTime, String fileId, IOType io
   }
 
   public static String markerFileName(String instantTime, String fileId, IOType ioType, String fileExtension) {
-    return String.format("%s_%s_%s%s%s.%s", fileId, WRITE_TOKEN, instantTime, fileExtension, HoodieTableMetaClient.MARKER_EXTN, ioType);
+    return markerFileName(instantTime, fileId, ioType, fileExtension, WRITE_TOKEN);
+  }
+
+  public static String markerFileName(String instantTime, String fileId, IOType ioType, String fileExtension, String writeToken) {
+    return String.format("%s_%s_%s%s%s.%s", fileId, writeToken, instantTime, fileExtension, HoodieTableMetaClient.MARKER_EXTN, ioType);

Review Comment:
   We usually do not add logic solely for testing purposes.





[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1121031667


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java:
##########
@@ -115,16 +115,24 @@ protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePat
     // TODO(HUDI-1517) use provided marker-file's path instead
     Option<HoodieLogFile> latestLogFileOption = FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
         HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime);
+
+    // Log file can be deleted if the commit to rollback is also the commit that created the fileGroup

Review Comment:
   separate PR: https://github.com/apache/hudi/pull/8077





[GitHub] [hudi] voonhous commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1478849285

   > @voonhous The CI cannot be triggered; can you open another PR instead?
   
   https://github.com/apache/hudi/pull/8263




[GitHub] [hudi] voonhous commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1475740832

   @danny0405 This is not fixed... That PR only standardises the rollback logic between the marker-based and listing-based strategies for MOR tables.
   
   The bug is still present in COW tables... A TM might run ahead of the JM. If we continue to use `getAllFileGroups`, the TM might still fetch a fileGroup that is destined to be deleted, causing the fileId/fileGroup not found issue.
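   
   The COW failure mode can be sketched in a toy model as well. This is an assumption-laden illustration, not Hudi code: `tmPickFileId`, `jmRollback`, and `openForMerge` are hypothetical names, and a plain set of fileIds stands in for the base files on storage. The TM picks a fileId from the view of all file groups, the JM rollback then deletes that base file, and the later merge lookup comes back empty.
   
   ```java
   import java.util.HashSet;
   import java.util.Optional;
   import java.util.Set;
   
   /** Toy model of the TM running ahead of the JM rollback on a COW table; not Hudi code. */
   final class CowRollbackRaceSketch {
   
     /** TM side: picks a fileId visible in the "all file groups" view, including uncommitted ones. */
     static Optional<String> tmPickFileId(Set<String> baseFilesOnStorage) {
       return baseFilesOnStorage.stream().sorted().findFirst();
     }
   
     /** JM side: the rollback deletes the base file written by the failed commit. */
     static void jmRollback(Set<String> baseFilesOnStorage, String fileIdOfFailedCommit) {
       baseFilesOnStorage.remove(fileIdOfFailedCommit);
     }
   
     /** Merge side: an empty result stands in for the "FileId not found" failure. */
     static Optional<String> openForMerge(Set<String> baseFilesOnStorage, String fileId) {
       return baseFilesOnStorage.contains(fileId) ? Optional.of(fileId) : Optional.empty();
     }
   
     public static void main(String[] args) {
       Set<String> storage = new HashSet<>();
       storage.add("fg-from-failed-commit");
   
       String picked = tmPickFileId(storage).orElseThrow(IllegalStateException::new);
       jmRollback(storage, picked); // the rollback completes after the TM has bootstrapped
       System.out.println("merge target present: " + openForMerge(storage, picked).isPresent());
     }
   }
   ```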




[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118454395


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   @bvaradar @danny0405 there seems to be a discrepancy between the 2 rollback strategies below:
   
   1. rollback with marker - will NOT delete log files that were created by the commit to rollback
   2. rollback via listing - will delete log files that were created by the commit to rollback
   
   ``` java
       // Log file can be deleted if the commit to rollback is also the commit that created the fileGroup
       if (latestLogFileOption.isPresent() && Objects.equals(baseCommitTime, instantToRollback.getTimestamp())) {
         Path fullDeletePath = new Path(partitionPath, latestLogFileOption.get().getFileName());
         return new HoodieRollbackRequest(relativePartitionPath, EMPTY_STRING, EMPTY_STRING,
             Collections.singletonList(fullDeletePath.toString()),
             Collections.emptyMap());
       }
   ```
   
   Perhaps a fix like this would be sufficient to resolve the discrepancy in how these 2 strategies roll back log files?
   
   
   Parallel fix:
   https://github.com/apache/hudi/compare/master...voonhous:hudi:HUDI-5822_rollback_fix?expand=1
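   
   The condition in the snippet above can be isolated into a small decision function. This is a simplified sketch with hypothetical names (`Action`, `decide`); the timestamps are plain strings here rather than Hudi instants, and only the check itself mirrors the proposed fix.
   
   ```java
   import java.util.Objects;
   
   /** Isolates the proposed check; hypothetical names, not the actual Hudi rollback classes. */
   final class LogRollbackDecisionSketch {
   
     enum Action { DELETE_LOG_FILE, KEEP_LOG_FILE }
   
     /**
      * Delete the log file only when one exists and the commit being rolled back
      * is the same commit that created the file group (its base commit time).
      */
     static Action decide(boolean logFilePresent, String baseCommitTime, String instantToRollback) {
       if (logFilePresent && Objects.equals(baseCommitTime, instantToRollback)) {
         return Action.DELETE_LOG_FILE;
       }
       return Action.KEEP_LOG_FILE;
     }
   
     public static void main(String[] args) {
       System.out.println(decide(true, "001", "001"));
       System.out.println(decide(true, "001", "002"));
       System.out.println(decide(false, "001", "001"));
     }
   }
   ```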





[GitHub] [hudi] danny0405 commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1119831489


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java:
##########
@@ -115,16 +115,24 @@ protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePat
     // TODO(HUDI-1517) use provided marker-file's path instead
     Option<HoodieLogFile> latestLogFileOption = FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
         HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime);
+
+    // Log file can be deleted if the commit to rollback is also the commit that created the fileGroup

Review Comment:
   Okay, it seems this is the code you want to fork from the listing-based rollback strategy:
   
   ```java
                 // In case all data was inserts and the commit failed, delete the file belonging to that commit
                 // We do not know fileIds for inserts (first inserts are either log files or base files),
                 // delete all files for the corresponding failed commit, if present (same as COW)
                 hoodieRollbackRequests.add(getHoodieRollbackRequest(partitionPath, filesToDelete));
   
   ```





[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1119857282


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java:
##########
@@ -115,16 +115,24 @@ protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePat
     // TODO(HUDI-1517) use provided marker-file's path instead
     Option<HoodieLogFile> latestLogFileOption = FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
         HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime);
+
+    // Log file can be deleted if the commit to rollback is also the commit that created the fileGroup

Review Comment:
   Sure, good idea!





[GitHub] [hudi] danny0405 commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1119499745


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   Yeah, let's move in that direction.





[GitHub] [hudi] SteNicholas commented on a diff in pull request #7997: Fix FileId not found exception when FileId is passed to HoodieMergeHa…

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1111886113


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bucket/BucketStreamWriteFunction.java:
##########
@@ -157,8 +157,8 @@ private void bootstrapIndexIfNeed(String partition) {
 
     // Load existing fileID belongs to this task

Review Comment:
   Should this comment be changed to `Load the latest fileID that belongs to this task`?





[GitHub] [hudi] danny0405 commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1477481363

   @hudi-bot run azure




[GitHub] [hudi] voonhous commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1477321119

   > > @danny0405 Running the tests included with this PR without the fix fails... i.e. The fix in this PR is still required.
   > 
   > Got ya, can you rebase onto the latest master and force-push again?
   
   Rebased and force-pushed.




[GitHub] [hudi] hudi-bot commented on pull request #7997: [HUDI-5822] Fix write and read correctness issue when a rollback is performed

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1438298315

   ## CI report:
   
   * 907d6ed5d9fe948a0eb9da995b29ee41fe3a6b87 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15297) 
   * 4ea65336bf55d988e388f7301a0cad9f42bd7b9b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15311) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1113960034


##########
hudi-common/src/main/java/org/apache/hudi/common/table/view/TableFileSystemView.java:
##########
@@ -157,6 +157,8 @@ interface SliceView extends SliceViewWithLatestSlice {
 
   /**
    * Stream all the file groups for a given partition.
+   * <p>
+   * Note: This method will return all file groups in a partition, i.e. uncommitted filegroups will be returned

Review Comment:
   Sorry, it was a WIP thing while I was debugging this error.





[GitHub] [hudi] voonhous commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1443459830

   @SteNicholas @danny0405 Can you please help to review this commit? Thank you




[GitHub] [hudi] voonhous commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1475802636

   # JM
   ```log
   2023-02-20 11:36:34,826 INFO  org.apache.hudi.client.BaseHoodieWriteClient                 [] - Begin rollback of instant 20230220112929727
   2023-02-20 11:36:34,833 INFO  org.apache.hudi.common.table.HoodieTableMetaClient           [] - Loading HoodieTableMetaClient from hdfs://hudi_path
   2023-02-20 11:36:34,947 INFO  org.apache.hudi.common.table.HoodieTableConfig               [] - Loading table properties from hdfs://hudi_path/.hoodie/hoodie.properties
   2023-02-20 11:36:34,952 INFO  org.apache.hudi.common.table.HoodieTableMetaClient           [] - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from hdfs://hudi_path
   2023-02-20 11:36:34,952 INFO  org.apache.hudi.common.table.HoodieTableMetaClient           [] - Loading Active commit timeline for hdfs://hudi_path
   2023-02-20 11:36:35,320 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Loaded instants upto : Option{val=[==>20230220112929727__commit__INFLIGHT]}
   2023-02-20 11:36:35,321 INFO  org.apache.hudi.common.table.view.FileSystemViewManager      [] - Creating View Manager with storage type :REMOTE_FIRST
   2023-02-20 11:36:35,321 INFO  org.apache.hudi.common.table.view.FileSystemViewManager      [] - Creating remote first table view
   2023-02-20 11:36:35,323 INFO  org.apache.hudi.client.BaseHoodieWriteClient                 [] - Scheduling Rollback at instant time : 20230220113634829 (exists in active timeline: true), with rollback plan: false
   2023-02-20 11:36:35,612 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Loaded instants upto : Option{val=[==>20230220113634829__rollback__REQUESTED]}
   2023-02-20 11:36:35,612 INFO  org.apache.hudi.table.action.rollback.BaseRollbackPlanActionExecutor [] - Requesting Rollback with instant time [==>20230220113634829__rollback__REQUESTED]
   2023-02-20 11:36:35,620 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Loaded instants upto : Option{val=[==>20230220113634829__rollback__REQUESTED]}
   2023-02-20 11:36:35,694 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Checking for file exists ?hdfs://hudi_path/.hoodie/20230220113634829.rollback.requested
   2023-02-20 11:36:35,706 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Create new file for toInstant ?hdfs://hudi_path/.hoodie/20230220113634829.rollback.inflight
   2023-02-20 11:36:35,709 INFO  org.apache.hudi.table.action.rollback.CopyOnWriteRollbackActionExecutor [] - Clean out all base files generated for commit: [==>20230220112929727__commit__INFLIGHT]
   2023-02-20 11:36:35,720 INFO  org.apache.hudi.table.action.rollback.CopyOnWriteRollbackActionExecutor [] - Time(in ms) taken to finish rollback 11
   2023-02-20 11:36:35,720 INFO  org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor [] - Rolled back inflight instant 20230220112929727
   2023-02-20 11:36:35,721 INFO  org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor [] - Index rolled back for commits [==>20230220112929727__commit__INFLIGHT]
   2023-02-20 11:36:35,725 INFO  org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor [] - Deleting instant=[==>20230220112929727__commit__INFLIGHT]
   2023-02-20 11:36:35,725 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Deleting instant [==>20230220112929727__commit__INFLIGHT]
   2023-02-20 11:36:35,728 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Removed instant [==>20230220112929727__commit__INFLIGHT]
   2023-02-20 11:36:35,728 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Deleting instant [==>20230220112929727__commit__REQUESTED]
   2023-02-20 11:36:35,731 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Removed instant [==>20230220112929727__commit__REQUESTED]
   2023-02-20 11:36:35,732 INFO  org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor [] - Deleted pending commit [==>20230220112929727__commit__REQUESTED]
   2023-02-20 11:36:35,733 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Checking for file exists ?hdfs://hudi_path/.hoodie/20230220113634829.rollback.inflight
   2023-02-20 11:36:35,786 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Create new file for toInstant ?hdfs://hudi_path/.hoodie/20230220113634829.rollback
   2023-02-20 11:36:35,786 INFO  org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor [] - Rollback of Commits [20230220112929727] is complete
   2023-02-20 11:36:35,805 INFO  org.apache.hudi.common.fs.FSUtils                            [] - Removed directory at hdfs://hudi_path/.hoodie/.temp/20230220112929727
   2023-02-20 11:36:35,806 INFO  org.apache.hudi.metrics.HoodieMetrics                        [] - Sending rollback metrics (duration=973, numFilesDeleted=2)
   2023-02-20 11:36:35,812 INFO  org.apache.hudi.client.BaseHoodieWriteClient                 [] - Generate a new instant time: 20230220113635812 action: commit
   2023-02-20 11:36:35,815 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Loaded instants upto : Option{val=[20230220113634829__rollback__COMPLETED]}
   ```
   
   # TM
   ```log
    11:36:33,837 INFO  org.apache.hudi.common.table.HoodieTableMetaClient           [] - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from hdfs://hdfs_path
   2023-02-20 11:36:33,837 INFO  org.apache.hudi.common.table.HoodieTableMetaClient           [] - Loading Active commit timeline for hdfs://hdfs_path
   2023-02-20 11:36:33,840 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Loaded instants upto : Option{val=[==>20230220112929727__commit__INFLIGHT]}
   2023-02-20 11:36:33,841 INFO  org.apache.hudi.client.BaseHoodieClient                      [] - Embedded Timeline Server is disabled. Not starting timeline service
   2023-02-20 11:36:33,843 INFO  org.apache.hudi.common.table.view.FileSystemViewManager      [] - Creating View Manager with storage type :REMOTE_FIRST
   2023-02-20 11:36:33,843 INFO  org.apache.hudi.common.table.view.FileSystemViewManager      [] - Creating remote first table view
   2023-02-20 11:36:33,849 INFO  org.apache.hudi.common.table.view.AbstractTableFileSystemView [] - Took 2 ms to read  0 instants, 0 replaced file groups
   2023-02-20 11:36:33,850 INFO  org.apache.hudi.sink.common.AbstractStreamWriteFunction      [] - Send bootstrap write metadata event to coordinator, task[0].
   2023-02-20 11:36:33,850 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Loaded instants upto : Option{val=[==>20230220112929727__commit__INFLIGHT]}
   2023-02-20 11:36:33,853 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - bucket_write: dim_buyer_info_test (1/2)#0 (e2e7a38d69393b8d814ad48544573435_829468138015e9cb689e833f1800885e_0_0) switched from INITIALIZING to RUNNING.
   2023-02-20 11:36:33,857 INFO  org.apache.hudi.sink.CleanFunction                           [] - Executor executes action [wait for cleaning finish] success!
   2023-02-20 11:36:33,860 INFO  org.apache.hudi.sink.bucket.BucketStreamWriteFunction        [] - Loading Hoodie Table dim_buyer_info_test, with path hdfs://hdfs_path/age=0
   2023-02-20 11:36:33,860 INFO  org.apache.hudi.common.table.HoodieTableMetaClient           [] - Loading HoodieTableMetaClient from hdfs://hdfs_path
   2023-02-20 11:36:33,867 INFO  org.apache.hudi.common.table.HoodieTableConfig               [] - Loading table properties from hdfs://hdfs_path/.hoodie/hoodie.properties
   2023-02-20 11:36:33,868 INFO  org.apache.hudi.common.util.ClusteringUtils                  [] - Found 0 files in pending clustering operations
   2023-02-20 11:36:33,868 INFO  org.apache.hudi.common.table.view.AbstractTableFileSystemView [] - Building file system view for partition (age=1)
   2023-02-20 11:36:33,872 INFO  org.apache.hudi.common.table.HoodieTableMetaClient           [] - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from hdfs://hdfs_path
   2023-02-20 11:36:33,872 INFO  org.apache.hudi.common.table.HoodieTableMetaClient           [] - Loading Active commit timeline for hdfs://hdfs_path
   2023-02-20 11:36:33,878 INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline   [] - Loaded instants upto : Option{val=[==>20230220112929727__commit__INFLIGHT]}
   2023-02-20 11:36:33,879 INFO  org.apache.hudi.common.table.view.AbstractTableFileSystemView [] - addFilesToView: NumFiles=2, NumFileGroups=1, FileGroupsCreationTime=6, StoreTimeTaken=1
   2023-02-20 11:36:33,880 INFO  org.apache.hudi.common.table.view.FileSystemViewManager      [] - Creating View Manager with storage type :REMOTE_FIRST
   2023-02-20 11:36:33,880 INFO  org.apache.hudi.common.table.view.FileSystemViewManager      [] - Creating remote first table view
   2023-02-20 11:36:33,880 INFO  org.apache.hudi.sink.bucket.BucketStreamWriteFunction        [] - bootstrapIndexIfNeed with timeline: [[==>20230220112929727__commit__INFLIGHT]]
   2023-02-20 11:36:33,880 INFO  org.apache.hudi.common.table.HoodieTableMetaClient           [] - Loading HoodieTableMetaClient from hdfs://hdfs_path
   2023-02-20 11:36:33,880 INFO  org.apache.hudi.sink.bucket.BucketStreamWriteFunction        [] - Should load this partition bucket 0 with fileID 00000000-ee86-4b41-a704-9e075dd253d8
   2023-02-20 11:36:33,880 INFO  org.apache.hudi.sink.bucket.BucketStreamWriteFunction        [] - Adding fileID 00000000-ee86-4b41-a704-9e075dd253d8 to the bucket 0 of partition age=1. 
   ```




[GitHub] [hudi] SteNicholas commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1440000282

   @voonhous, could you add the unit tests for `bootstrapIndexIfNeed`?




[GitHub] [hudi] hudi-bot commented on pull request #7997: [HUDI-5822] Fix write and read correctness issue when a rollback is performed

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1438537025

   ## CI report:
   
   * 4ea65336bf55d988e388f7301a0cad9f42bd7b9b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15311) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118217290


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   IIUC, this issue is specific to Flink writers, as the write and rollback logic might execute concurrently.
   
   While this will fix the issue, I agree that the cleaner way is for MOR to delete the log files so that the responsibilities of writers and rollback executors are clearly defined, i.e. writers should not know about or handle the quirks of rollback executors. The current fix does introduce a code smell.
   
   To be honest, I am getting a lot of resistance and pushback on this fix, so I am going to take small steps. I have already spent 8 hours writing a test case, and there doesn't seem to be agreement on which direction everybody wants to take.
   
   @danny0405 @bvaradar LMK which direction the two of you would like to take to fix this issue.
   
   Thank you.
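
For readers following the diff above, here is a minimal sketch of the first guard in `isFileSliceCommitted`. It assumes instant times are compared as plain lexicographic strings; the predicates and the `compareTimestamps` helper below are simplified stand-ins written for this illustration, not Hudi's actual classes. Under that assumption, the old form `!(base <= last)` and the new form `base > last` reject exactly the same slices, so the behavioural change in the diff comes from the additional base-file check rather than from this rewrite.

```java
import java.util.function.BiPredicate;

public class InstantCompareSketch {
  // Stand-ins for the timeline predicates (assumption: lexicographic string comparison).
  static final BiPredicate<String, String> LESSER_THAN_OR_EQUALS = (a, b) -> a.compareTo(b) <= 0;
  static final BiPredicate<String, String> GREATER_THAN = (a, b) -> a.compareTo(b) > 0;

  // Simplified stand-in for the compareTimestamps helper used in the diff.
  static boolean compareTimestamps(String t1, BiPredicate<String, String> op, String t2) {
    return op.test(t1, t2);
  }

  // Old guard: reject the slice when NOT (baseInstant <= lastInstant).
  static boolean rejectedOld(String baseInstant, String lastInstant) {
    return !compareTimestamps(baseInstant, LESSER_THAN_OR_EQUALS, lastInstant);
  }

  // New guard: reject the slice when baseInstant > lastInstant.
  static boolean rejectedNew(String baseInstant, String lastInstant) {
    return compareTimestamps(baseInstant, GREATER_THAN, lastInstant);
  }

  public static void main(String[] args) {
    // Instant times taken from the logs in this thread.
    String lastInstant = "20230220113634829";
    String[] baseInstants = {"20230220112929727", "20230220113634829", "20230220113635812"};
    for (String base : baseInstants) {
      System.out.println(base + " old=" + rejectedOld(base, lastInstant)
          + " new=" + rejectedNew(base, lastInstant));
    }
  }
}
```

Only a base instant later than the last instant on the timeline is rejected by either form.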





[GitHub] [hudi] voonhous commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118454395


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   @bvaradar @danny0405 there seems to be a discrepancy between the 2 rollback strategies below:
   
   1. rollback with marker - will NOT delete log files that were created by the commit to rollback
   2. rollback via listing - will delete log files that were created by the commit to rollback
   
   ``` java
       // Log file can be deleted if the commit to rollback is also the commit that created the fileGroup
       if (latestLogFileOption.isPresent() && Objects.equals(baseCommitTime, instantToRollback.getTimestamp())) {
         Path fullDeletePath = new Path(partitionPath, latestLogFileOption.get().getFileName());
         return new HoodieRollbackRequest(relativePartitionPath, EMPTY_STRING, EMPTY_STRING,
             Collections.singletonList(fullDeletePath.toString()),
             Collections.emptyMap());
       }
   ```
   
   Perhaps a fix like this would be sufficient to resolve the discrepancy in how these 2 strategies roll back log files?
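
The condition in the snippet above can be modelled in isolation. This is only a sketch: `logFilesToDelete` is a hypothetical helper, the log file path is made up, and `Optional<String>` stands in for Hudi's `Option<HoodieLogFile>`. It shows the intended rule, namely that the latest log file is scheduled for deletion only when the file group's base commit time equals the instant being rolled back.

```java
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.Optional;

public class MarkerRollbackSketch {
  // Hypothetical helper: returns the log files a marker-based rollback would delete
  // for a single file group, following the condition proposed in the comment above.
  static List<String> logFilesToDelete(Optional<String> latestLogFile,
                                       String baseCommitTime,
                                       String instantToRollback) {
    // Delete only if a log file exists AND the commit being rolled back
    // is also the commit that created the file group.
    if (latestLogFile.isPresent() && Objects.equals(baseCommitTime, instantToRollback)) {
      return Collections.singletonList(latestLogFile.get());
    }
    return Collections.emptyList();
  }

  public static void main(String[] args) {
    // Made-up log file path, for illustration only.
    Optional<String> logFile = Optional.of("age=1/.example-file-id.log.1");
    String rollbackInstant = "20230220112929727";

    // File group created by the instant being rolled back: the log file is deleted.
    System.out.println(logFilesToDelete(logFile, "20230220112929727", rollbackInstant));
    // File group created by an earlier commit: nothing is deleted.
    System.out.println(logFilesToDelete(logFile, "20230220100000000", rollbackInstant));
    // No log file present: nothing is deleted.
    System.out.println(logFilesToDelete(Optional.empty(), "20230220112929727", rollbackInstant));
  }
}
```

In the real strategy, the non-deleting branch would still need to produce a rollback request for the appended log blocks; that part is left out here.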
   
   





[GitHub] [hudi] danny0405 commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1119819270


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   I mean we should fix the log file cleaning when marker-based rollback is enabled (the default). BTW, I didn't see the code snippet you pasted; can you share a permanent link instead?





[GitHub] [hudi] danny0405 commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118371624


##########
hudi-common/src/test/java/org/apache/hudi/common/testutils/FileCreateUtils.java:
##########
@@ -101,7 +101,11 @@ public static String markerFileName(String instantTime, String fileId, IOType io
   }
 
   public static String markerFileName(String instantTime, String fileId, IOType ioType, String fileExtension) {
-    return String.format("%s_%s_%s%s%s.%s", fileId, WRITE_TOKEN, instantTime, fileExtension, HoodieTableMetaClient.MARKER_EXTN, ioType);
+    return markerFileName(instantTime, fileId, ioType, fileExtension, WRITE_TOKEN);
+  }
+
+  public static String markerFileName(String instantTime, String fileId, IOType ioType, String fileExtension, String writeToken) {
+    return String.format("%s_%s_%s%s%s.%s", fileId, writeToken, instantTime, fileExtension, HoodieTableMetaClient.MARKER_EXTN, ioType);

Review Comment:
   Yeah, you are right, my mistake
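
To make the overload in the diff above concrete, here is a small standalone sketch of the marker file name format. The values of `MARKER_EXTN` and `DEFAULT_WRITE_TOKEN` are assumptions chosen for illustration, and `ioType` is modelled as a plain string rather than Hudi's `IOType` enum; only the format string is taken from the diff.

```java
public class MarkerFileNameSketch {
  // Assumed values, for illustration only.
  static final String MARKER_EXTN = ".marker";
  static final String DEFAULT_WRITE_TOKEN = "1-0-1";

  // Existing overload: delegates using the default write token.
  static String markerFileName(String instantTime, String fileId, String ioType, String fileExtension) {
    return markerFileName(instantTime, fileId, ioType, fileExtension, DEFAULT_WRITE_TOKEN);
  }

  // New overload from the diff: lets a test supply its own write token.
  static String markerFileName(String instantTime, String fileId, String ioType,
                               String fileExtension, String writeToken) {
    return String.format("%s_%s_%s%s%s.%s",
        fileId, writeToken, instantTime, fileExtension, MARKER_EXTN, ioType);
  }

  public static void main(String[] args) {
    // File ID and instant time taken from the logs in this thread.
    String fileId = "00000000-ee86-4b41-a704-9e075dd253d8";
    String instant = "20230220112929727";
    System.out.println(markerFileName(instant, fileId, "MERGE", ".parquet"));
    System.out.println(markerFileName(instant, fileId, "APPEND", ".log", "2-0-3"));
  }
}
```

The second call shows why the extra parameter is useful in tests: files written by different write attempts for the same file ID and instant can be told apart by their write token.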





[GitHub] [hudi] danny0405 commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1119855888


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java:
##########
@@ -115,16 +115,24 @@ protected HoodieRollbackRequest getRollbackRequestForAppend(String markerFilePat
     // TODO(HUDI-1517) use provided marker-file's path instead
     Option<HoodieLogFile> latestLogFileOption = FSUtils.getLatestLogFile(table.getMetaClient().getFs(), partitionPath, fileId,
         HoodieFileFormat.HOODIE_LOG.getFileExtension(), baseCommitTime);
+
+    // Log file can be deleted if the commit to rollback is also the commit that created the fileGroup

Review Comment:
   Can we split this PR into two: one for the marker-based rollback fix, and another for testing the Flink stream writer with the hashing index enabled? I mean, we would create two JIRA issues here.







[GitHub] [hudi] voonhous commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "voonhous (via GitHub)" <gi...@apache.org>.
voonhous commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1439782694

   @hbgstc123 @TengHuo for visibility




[GitHub] [hudi] danny0405 commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1477770081

   @voonhous The CI cannot be triggered; can you open another PR instead?




[GitHub] [hudi] danny0405 commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #7997:
URL: https://github.com/apache/hudi/pull/7997#issuecomment-1477359967

   @hudi-bot run azure

