You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/11/18 08:37:01 UTC

[GitHub] [hudi] xushiyan commented on a change in pull request #4024: [HUDI-2559] Converting commit timestamp format to millisecs

xushiyan commented on a change in pull request #4024:
URL: https://github.com/apache/hudi/pull/4024#discussion_r752008715



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java
##########
@@ -33,14 +35,27 @@
  */
 public class HoodieInstantTimeGenerator {
   // Format of the timestamp used for an Instant
-  private static final String INSTANT_TIMESTAMP_FORMAT = "yyyyMMddHHmmss";
-  private static final int INSTANT_TIMESTAMP_FORMAT_LENGTH = INSTANT_TIMESTAMP_FORMAT.length();
+  public static final String SECS_INSTANT_TIMESTAMP_FORMAT = "yyyyMMddHHmmss";
+  public static final int SECS_INSTANT_ID_LENGTH = SECS_INSTANT_TIMESTAMP_FORMAT.length();
+  private static final String MILLIS_INSTANT_TIMESTAMP_FORMAT = "yyyyMMddHHmmssSSS";

Review comment:
       Can we actually introduce `.` in the time format? it improves readability and people can easily distinguish it from the old format. Plus it will bypass the jdk bug with millisecond parsing?
   
   ```suggestion
     private static final String MILLIS_INSTANT_TIMESTAMP_FORMAT = "yyyyMMddHHmmss.SSS";
   ```

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java
##########
@@ -72,24 +72,24 @@
   protected HoodieTableMetaClient metaClient;
 
   /**
-   * Parse the timestamp of an Instant and return a {@code SimpleDateFormat}.
+   * Parse the timestamp of an Instant and return a {@code Date}.
    */
-  public static Date parseInstantTime(String timestamp) throws ParseException {
-    return HoodieInstantTimeGenerator.parseInstantTime(timestamp);
+  public static Date parseDateFromInstantTime(String timestamp) throws ParseException {
+    return HoodieInstantTimeGenerator.parseDateFromInstantTime(timestamp);
   }
 
   /**
    * Format the java.time.Instant to a String representing the timestamp of a Hoodie Instant.
    */
-  public static String formatInstantTime(Instant timestamp) {
-    return HoodieInstantTimeGenerator.formatInstantTime(timestamp);
+  public static String getInstantForDate(Instant timestamp) {
+    return HoodieInstantTimeGenerator.getInstantForDate(timestamp);

Review comment:
       `getInstantForDate()` sounds like taking a `Date` and returning a java `Instant`. I think the original name sounds just fine.
   

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java
##########
@@ -52,36 +67,65 @@ public static String createNewInstantTime(long milliseconds) {
       String newCommitTime;
       do {
         Date d = new Date(System.currentTimeMillis() + milliseconds);
-        newCommitTime = INSTANT_TIME_FORMATTER.format(convertDateToTemporalAccessor(d));
+        newCommitTime = MILLIS_INSTANT_TIME_FORMATTER.format(convertDateToTemporalAccessor(d));
       } while (HoodieTimeline.compareTimestamps(newCommitTime, HoodieActiveTimeline.LESSER_THAN_OR_EQUALS, oldVal));
       return newCommitTime;
     });
   }
 
-  public static Date parseInstantTime(String timestamp) throws ParseException {
+  public static Date parseDateFromInstantTime(String timestamp) throws ParseException {
     try {
-      LocalDateTime dt = LocalDateTime.parse(timestamp, INSTANT_TIME_FORMATTER);
+      // Enables backwards compatibility with non-millisecond granularity instants
+      String timestampInMillis = timestamp;
+      if (isSecondGranularity(timestamp)) {
+        // Add milliseconds to the instant in order to parse successfully
+        timestampInMillis = timestamp + DEFAULT_MILLIS_EXT;
+      } else if (timestamp.length() > MILLIS_INSTANT_TIMESTAMP_FORMAT_LENGTH) {
+        // compaction and cleaning in metadata has special format. handling it by trimming extra chars and treating it with secs granularity
+        timestampInMillis = timestamp.substring(0, MILLIS_INSTANT_TIMESTAMP_FORMAT_LENGTH);
+      }
+
+      LocalDateTime dt = LocalDateTime.parse(timestampInMillis, MILLIS_INSTANT_TIME_FORMATTER);
       return Date.from(dt.atZone(ZoneId.systemDefault()).toInstant());
     } catch (DateTimeParseException e) {
       // Special handling for all zero timestamp which is not parsable by DateTimeFormatter
       if (timestamp.equals(ALL_ZERO_TIMESTAMP)) {
         return new Date(0);
       }
-      // compaction and cleaning in metadata has special format. handling it by trimming extra chars and treating it with secs granularity
-      if (timestamp.length() > INSTANT_TIMESTAMP_FORMAT_LENGTH) {
-        LocalDateTime dt = LocalDateTime.parse(timestamp.substring(0, INSTANT_TIMESTAMP_FORMAT_LENGTH), INSTANT_TIME_FORMATTER);
-        return Date.from(dt.atZone(ZoneId.systemDefault()).toInstant());
-      }
       throw e;
     }
   }
 
-  public static String formatInstantTime(Instant timestamp) {
-    return INSTANT_TIME_FORMATTER.format(timestamp);
+  private static boolean isSecondGranularity(String instant) {
+    return instant.length() == SECS_INSTANT_ID_LENGTH;
+  }
+
+  public static String getInstantForDate(Instant timestamp) {

Review comment:
       i just felt `getInstantForXXX()` confuses people as it sounds like returning java Instant. I think `formatInstantTime()`, `formatDate()` suit better, at least people expect it to return String.

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java
##########
@@ -72,24 +72,24 @@
   protected HoodieTableMetaClient metaClient;
 
   /**
-   * Parse the timestamp of an Instant and return a {@code SimpleDateFormat}.
+   * Parse the timestamp of an Instant and return a {@code Date}.
    */
-  public static Date parseInstantTime(String timestamp) throws ParseException {
-    return HoodieInstantTimeGenerator.parseInstantTime(timestamp);
+  public static Date parseDateFromInstantTime(String timestamp) throws ParseException {
+    return HoodieInstantTimeGenerator.parseDateFromInstantTime(timestamp);
   }
 
   /**
    * Format the java.time.Instant to a String representing the timestamp of a Hoodie Instant.
    */
-  public static String formatInstantTime(Instant timestamp) {
-    return HoodieInstantTimeGenerator.formatInstantTime(timestamp);
+  public static String getInstantForDate(Instant timestamp) {
+    return HoodieInstantTimeGenerator.getInstantForDate(timestamp);
   }
 
   /**
    * Format the Date to a String representing the timestamp of a Hoodie Instant.
    */
-  public static String formatInstantTime(Date timestamp) {
-    return HoodieInstantTimeGenerator.formatInstantTime(timestamp);
+  public static String getInstantForDate(Date timestamp) {
+    return HoodieInstantTimeGenerator.getInstantForDate(timestamp);

Review comment:
       similarly how about `formatDate()`.

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java
##########
@@ -33,14 +35,27 @@
  */
 public class HoodieInstantTimeGenerator {
   // Format of the timestamp used for an Instant
-  private static final String INSTANT_TIMESTAMP_FORMAT = "yyyyMMddHHmmss";
-  private static final int INSTANT_TIMESTAMP_FORMAT_LENGTH = INSTANT_TIMESTAMP_FORMAT.length();
+  public static final String SECS_INSTANT_TIMESTAMP_FORMAT = "yyyyMMddHHmmss";
+  public static final int SECS_INSTANT_ID_LENGTH = SECS_INSTANT_TIMESTAMP_FORMAT.length();
+  private static final String MILLIS_INSTANT_TIMESTAMP_FORMAT = "yyyyMMddHHmmssSSS";

Review comment:
       why not public? also feel the constants below can be all public final. don't see harm exposing them but convenience in testing.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org