Posted to commits@hudi.apache.org by "bvaradar (via GitHub)" <gi...@apache.org> on 2023/02/26 19:36:42 UTC

[GitHub] [hudi] bvaradar commented on a diff in pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

bvaradar commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118140213


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   @voonhous: thanks for the great explanation. From your comment, it looks like the Flink integration resolves valid files differently.
   
   Regarding your comment: 
   ```Once the rollback completes, a partition might have a bucketId that maps to two fileGroups, breaking the 1 bucketId <> 1 fileGroup mapping contract.```
   
   As part of the rollback, shouldn't the underlying file (newly created by the failed commit that is being rolled back) get deleted? Also, why is the fileId being read if the commit did not finish?
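
For illustration only, the "1 bucketId <> 1 fileGroup" contract quoted above could be checked along these lines. This is a minimal sketch, not code from the PR: the class `BucketContractCheck`, its methods, and the sample file ids are hypothetical, and it assumes the bucket id is encoded as the first 8 characters of the file id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Hudi: flags buckets that map to more than one file group.
class BucketContractCheck {

  // Assumption: the bucket id is the zero-padded numeric prefix (first 8 characters) of the file id.
  static int bucketIdOf(String fileId) {
    return Integer.parseInt(fileId.substring(0, 8));
  }

  // Groups the file ids of one partition by bucket id and keeps only the
  // buckets that break the one-bucket-to-one-file-group contract.
  static Map<Integer, List<String>> findViolations(List<String> fileIdsInPartition) {
    Map<Integer, List<String>> byBucket = new TreeMap<>();
    for (String fileId : fileIdsInPartition) {
      byBucket.computeIfAbsent(bucketIdOf(fileId), k -> new ArrayList<>()).add(fileId);
    }
    // Drop every bucket that has exactly one file group; what remains is a violation.
    byBucket.values().removeIf(ids -> ids.size() < 2);
    return byBucket;
  }

  public static void main(String[] args) {
    // Made-up file ids: bucket 1 appears twice, as it might if a file written
    // by a failed commit were left behind after the rollback.
    List<String> fileIds = Arrays.asList("00000001-aaaa", "00000001-bbbb", "00000002-cccc");
    System.out.println(findViolations(fileIds));
  }
}
```

If the rollback deletes the file created by the failed commit, as the question above expects, a check like this would find no violations for the partition.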


