You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/03/28 13:31:51 UTC

[GitHub] [hudi] nsivabalan commented on a change in pull request #5149: [WIP][HUDI-3721] Allow rollback of commits before metadata table initialization

nsivabalan commented on a change in pull request #5149:
URL: https://github.com/apache/hudi/pull/5149#discussion_r836430932



##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
##########
@@ -477,6 +479,18 @@ private static void processRestoreMetadata(HoodieActiveTimeline metadataTableTim
     return partitionToRecordsMap;
   }
 
+  public static boolean isMetadataTableInitCommit(String instantTimestamp) {
+    if (SOLO_COMMIT_TIMESTAMP.equals(instantTimestamp)) {
+      return true;
+    }
+    String newInstantTimestamp = HoodieActiveTimeline.createNewInstantTime();
+    if (instantTimestamp.length() - newInstantTimestamp.length() == 3

Review comment:
       we already have variables in HoodieInstantTimeGenerator which tracks the length of commit timestamps.
   MILLIS_INSTANT_TIMESTAMP_FORMAT_LENGTH
   

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
##########
@@ -504,9 +504,11 @@ private boolean initializeFromFilesystem(HoodieTableMetaClient dataMetaClient,
     }
 
     // If there is no commit on the dataset yet, use the SOLO_COMMIT_TIMESTAMP as the instant time for initial commit
-    // Otherwise, we use the timestamp of the latest completed action.
+    // Otherwise, we use the timestamp of the latest completed action with the suffix "000".
     String createInstantTime = dataMetaClient.getActiveTimeline().filterCompletedInstants()
-        .getReverseOrderedInstants().findFirst().map(HoodieInstant::getTimestamp).orElse(SOLO_COMMIT_TIMESTAMP);
+        .getReverseOrderedInstants().findFirst().map(HoodieInstant::getTimestamp)
+        .map(timestamp -> timestamp + HoodieTableMetadata.METADATA_TABLE_INIT_TIMESTAMP_SUFFIX)

Review comment:
       please do remember that this is not applicable for a table that comes from 0.10.0. 
   i.e. metadata initialization happened with 0.10.0, and then after upgrading to 0.11, someone tries to do a restore. We need to call it out in our release guide. 
   

##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
##########
@@ -556,7 +570,9 @@ private static void processRollbackMetadata(HoodieActiveTimeline metadataTableTi
       //
       // when at time t4, we commit the compaction rollback,the above check returns true.
       HoodieInstant syncedInstant = new HoodieInstant(false, HoodieTimeline.DELTA_COMMIT_ACTION, instantToRollback);
-      if (metadataTableTimeline.getCommitsTimeline().isBeforeTimelineStarts(syncedInstant.getTimestamp())) {
+      Option<HoodieInstant> firstInstant = metadataTableTimeline.getCommitsTimeline().firstInstant();

Review comment:
       there could be more cases to consider here :( 
   if archival of MDT kicks in, then the firstInstant will refer to some other future commit. so, the fix may not work in all cases. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org