You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/06/22 21:51:49 UTC

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5917: [HUDI-4279] Strength the remote fs view lagging check when latest com…

nsivabalan commented on code in PR #5917:
URL: https://github.com/apache/hudi/pull/5917#discussion_r904268325


##########
hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/RequestHandler.java:
##########
@@ -138,14 +137,31 @@ private boolean isLocalViewBehind(Context ctx) {
     String localTimelineHash = localTimeline.getTimelineHash();
     // refresh if timeline hash mismatches and if local's last known instant < client's last known instant (if config is enabled)
     if (!localTimelineHash.equals(timelineHashFromClient)
-        && (!timelineServiceConfig.refreshTimelineBasedOnLatestCommit || HoodieTimeline.compareTimestamps(localLastKnownInstant, HoodieTimeline.LESSER_THAN, lastKnownInstantFromClient))) {
+        && (!timelineServiceConfig.refreshTimelineBasedOnLatestCommit
+            || localTimelineBehind(localTimeline, lastKnownInstantFromClient, numInstantsFromClient))) {
       return true;
     }
 
     // As a safety check, even if hash is same, ensure instant is present
     return !localTimeline.containsOrBeforeTimelineStarts(lastKnownInstantFromClient);
   }
 
+  private static boolean localTimelineBehind(HoodieTimeline localTimeline, String lastKnownInstantFromClient, String numInstantsFromClient) {
+    String localLastKnownInstant = localTimeline.lastInstant().isPresent() ? localTimeline.lastInstant().get().getTimestamp()
+        : HoodieTimeline.INVALID_INSTANT_TS;
+    // Why comparing the num commits ?
+    // Assumes there are 4 commits on the timeline:
+    // timestamp(action): ts_0(commit), ts_1(commit), ts_2(clean), ts_3(commit)
+    // when ts_1 is in INFLIGHT state, ts_2 clean action is already finished,
+    // after ts_1 triggers #sync, the local timeline is refreshed as [ts_0, ts_2],
+    // when ts_1 switches state from INFLIGHT to COMPLETED, no #sync triggers.
+    // at ts_3, when the fs view snapshot is requested, the ts_3 client timeline should be [ts_0, ts_1, ts_2],
+    // if we only compare the latest commit, the local timeline is NOT behind, but the fs view is not complete
+    // because ts_1 is lost.
+    return HoodieTimeline.compareTimestamps(localLastKnownInstant, HoodieTimeline.LESSER_THAN, lastKnownInstantFromClient)
+        || localTimeline.countInstants() < Integer.parseInt(numInstantsFromClient);

Review Comment:
   should we check just "<" or "!=" to catch any mismatch ? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org