Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/07/27 02:01:19 UTC

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #5364: Core, API: Support scanning from a branch along with time travel

amogh-jahagirdar commented on code in PR #5364:
URL: https://github.com/apache/iceberg/pull/5364#discussion_r930551249


##########
core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java:
##########
@@ -322,6 +325,32 @@ public static long snapshotIdAsOfTime(Table table, long timestampMillis) {
     return snapshotId;
   }
 
+  /**
+   * Returns the ID of the most recent snapshot in the branch
+   *
+   * @param table a {@link Table}
+   * @param branch a {@link String}
+   * @param timestampMillis the timestamp in millis since the Unix epoch
+   * @return the snapshot ID
+   * @throws IllegalArgumentException when no snapshot is found in the table, on the given branch
+   * older than the timestamp
+   */
+  public static long snapshotIdAsOfTime(Table table, String branch, long timestampMillis) {
+    SnapshotRef ref = table.refs().get(branch);
+    Preconditions.checkArgument(ref != null, "Branch %s does not exist", branch);
+    Preconditions.checkArgument(ref.isBranch(), "Ref %s is a tag, not a branch", branch);
+    Long snapshotId = null;
+    for (Snapshot snapshot : ancestorsOf(ref.snapshotId(), table::snapshot)) {
+      if (snapshot.timestampMillis() <= timestampMillis) {
+        snapshotId = snapshot.snapshotId();
+      }
+    }

Review Comment:
   I don't think this is right. Here we iterate over the branch's ancestors rather than the history log entries, which is what time travel on the main branch uses.
   
   To stay consistent with existing time travel behavior, I think we need to use the history log. That would mean maintaining a history log per branch in the table metadata, which requires a spec change. @rdblue @jackye1995
   
   Also, WAP commits mean timestamp ordering is not guaranteed, so I think a linear search is required rather than a binary search. https://github.com/apache/iceberg/issues/3891
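
   To illustrate the last point, here is a minimal, self-contained sketch of a linear scan over a log whose timestamps are out of order. `LogEntry`, `BranchTimeTravelSketch`, and `snapshotIdAsOf` are hypothetical names used only for illustration; they are not Iceberg classes, and the per-branch log they stand in for does not exist in the spec today.

```java
import java.util.Arrays;
import java.util.List;

final class BranchTimeTravelSketch {

  /** Hypothetical stand-in for one history-log entry; not an Iceberg class. */
  static final class LogEntry {
    final long snapshotId;
    final long timestampMillis;

    LogEntry(long snapshotId, long timestampMillis) {
      this.snapshotId = snapshotId;
      this.timestampMillis = timestampMillis;
    }
  }

  /**
   * Linear scan over a log given in commit order (oldest entry first). The last
   * entry whose timestamp is at or before the target wins, so the scan does not
   * assume that timestamps are sorted.
   */
  static long snapshotIdAsOf(List<LogEntry> logInCommitOrder, long timestampMillis) {
    Long snapshotId = null;
    for (LogEntry entry : logInCommitOrder) {
      if (entry.timestampMillis <= timestampMillis) {
        snapshotId = entry.snapshotId;
      }
    }
    if (snapshotId == null) {
      throw new IllegalArgumentException(
          "Cannot find a snapshot at or before " + timestampMillis);
    }
    return snapshotId;
  }

  public static void main(String[] args) {
    // Entry 3 was logged after entry 2 but carries an earlier timestamp,
    // which is the kind of ordering a staged-then-published commit could produce.
    List<LogEntry> log = Arrays.asList(
        new LogEntry(1L, 100L),
        new LogEntry(2L, 300L),
        new LogEntry(3L, 200L),
        new LogEntry(4L, 400L));
    System.out.println(snapshotIdAsOf(log, 250L));
  }
}
```

   A binary search would need the timestamps to be sorted in log order, which this log is not, so the scan visits every entry instead.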



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

