Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/03/17 20:17:28 UTC

[GitHub] [hudi] nsivabalan commented on a change in pull request #5052: [HUDI-3644] hoodie log scan bug cause data duplication bugfix

nsivabalan commented on a change in pull request #5052:
URL: https://github.com/apache/hudi/pull/5052#discussion_r829464460



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordReader.java
##########
@@ -217,16 +215,21 @@ public synchronized void scan(Option<List<String>> keys) {
             && !HoodieTimeline.compareTimestamps(logBlock.getLogBlockHeader().get(INSTANT_TIME), HoodieTimeline.LESSER_THAN_OR_EQUALS, this.latestInstantTime
         )) {
           // hit a block with instant time greater than should be processed, stop processing further
+          LOG.info("hit a block with instant time greater than should be processed, stop processing further. logfile: + " + logFile
+                  + " , blockType: " + logBlock.getBlockType() + " , instantTime: " + instantTime + " , latestInstantTime : " + latestInstantTime  );
           break;
         }
         if (logBlock.getBlockType() != CORRUPT_BLOCK && logBlock.getBlockType() != COMMAND_BLOCK) {
-          if (!completedInstantsTimeline.containsOrBeforeTimelineStarts(instantTime)
-              || inflightInstantsTimeline.containsInstant(instantTime)) {
+            if (!checkIfValidCommit(instantTime) || inflightInstantsTimeline.containsInstant(instantTime)) {

Review comment:
       What exactly are we trying to fix here? Can you help explain the scenario?
   



