Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/09/20 04:59:46 UTC

[GitHub] [iceberg] CodingCat commented on a change in pull request #1508: Use schema at the time of the snapshot when reading a snapshot.

CodingCat commented on a change in pull request #1508:
URL: https://github.com/apache/iceberg/pull/1508#discussion_r711877815



##########
File path: core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java
##########
@@ -144,4 +146,72 @@ public static Snapshot snapshotAfter(Table table, long snapshotId) {
     throw new IllegalStateException(
         String.format("Cannot find snapshot after %s: not an ancestor of table's current snapshot", snapshotId));
   }
+
+  /**
+   * Returns the ID of the most recent snapshot for the table as of the timestamp.
+   *
+   * @param table a {@link Table}
+   * @param timestampMillis the timestamp in millis since the epoch
+   * @return the snapshot ID
+   * @throws IllegalArgumentException when no snapshot is found in the table
+   * older than the timestamp
+   */
+  public static long snapshotIdAsOfTime(Table table, long timestampMillis) {
+    Long snapshotId = null;
+    for (HistoryEntry logEntry : table.history()) {
+      if (logEntry.timestampMillis() <= timestampMillis) {
+        snapshotId = logEntry.snapshotId();
+      }
+    }
+    Preconditions.checkArgument(snapshotId != null,
+        "Cannot find a snapshot older than %s", timestampMillis);
+    return snapshotId;
+  }
+
+  /**
+   * Returns the schema of the table for the specified snapshot.
+   *
+   * @param table a {@link Table}
+   * @param snapshotId the ID of the snapshot
+   * @return the schema
+   */
+  public static Schema schemaFor(Table table, long snapshotId) {
+    Snapshot snapshot = table.snapshot(snapshotId);
+    Preconditions.checkArgument(snapshot != null, "Cannot find snapshot %s", snapshotId);
+    Integer schemaId = snapshot.schemaId();
+    // schemaId could be null, if snapshot was created before Iceberg added schema id to snapshot
+    if (schemaId != null) {
+      return table.schemas().get(schemaId);

Review comment:
   Should we allow the current `table.schema()` as the default? IMO, the current behavior is confusing, especially when the schemas are not compatible. For example:
   
   First snapshot:
   ```
   +-----+
   |value|
   +-----+
   |    1|
   |    2|
   |    3|
   |    4|
   +-----+
   ```
   Overwrite with:
   
   ```
   +---------+
   |new_value|
   +---------+
   |        2|
   |        3|
   |        4|
   |        5|
   +---------+
   ```
   
   Time travel to the first snapshot:
   
   ```
   +---------+
   |new_value|
   +---------+
   |        0|
   |        0|
   |        0|
   |        0|
   +---------+
   
   ```
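
To make the lookup under discussion concrete, here is a minimal, self-contained sketch. It does not use the Iceberg classes: `TableStub`, `SnapshotStub`, and `demoTable` are hypothetical stand-ins, and schemas are reduced to a single column name. The hunk above is cut off before the branch that handles a null schema ID, so the fallback to the table's current schema is an assumption made only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT the Iceberg classes.
public class SchemaFallbackSketch {

  /** Snapshot stand-in. schemaId is null for snapshots written before schema IDs were recorded. */
  static final class SnapshotStub {
    final long snapshotId;
    final Integer schemaId;

    SnapshotStub(long snapshotId, Integer schemaId) {
      this.snapshotId = snapshotId;
      this.schemaId = schemaId;
    }
  }

  /** Table stand-in. A schema is modeled as just its column name. */
  static final class TableStub {
    final Map<Integer, String> schemas = new HashMap<>();
    final Map<Long, SnapshotStub> snapshots = new HashMap<>();
    String currentSchema;
  }

  /**
   * Resolves the schema for a snapshot. ASSUMPTION: when the snapshot carries
   * no schema ID, fall back to the table's current schema.
   */
  static String schemaFor(TableStub table, long snapshotId) {
    SnapshotStub snapshot = table.snapshots.get(snapshotId);
    if (snapshot == null) {
      throw new IllegalArgumentException("Cannot find snapshot " + snapshotId);
    }
    if (snapshot.schemaId != null) {
      return table.schemas.get(snapshot.schemaId);
    }
    // Assumed fallback: the current table schema, which may not match the old data.
    return table.currentSchema;
  }

  /** Builds a table shaped like the example in the comment: "value" replaced by "new_value". */
  static TableStub demoTable() {
    TableStub table = new TableStub();
    table.schemas.put(0, "value");
    table.schemas.put(1, "new_value");
    table.currentSchema = "new_value";
    table.snapshots.put(1L, new SnapshotStub(1L, null)); // legacy snapshot without a schema ID
    table.snapshots.put(2L, new SnapshotStub(2L, 1));    // overwrite, written with the new schema
    table.snapshots.put(3L, new SnapshotStub(3L, 0));    // snapshot that recorded the old schema
    return table;
  }

  public static void main(String[] args) {
    TableStub table = demoTable();
    System.out.println("snapshot 1 -> " + schemaFor(table, 1L));
    System.out.println("snapshot 2 -> " + schemaFor(table, 2L));
    System.out.println("snapshot 3 -> " + schemaFor(table, 3L));
  }
}
```

In this sketch, snapshot 1 resolves to `new_value` even though its data was written under `value`, which is the kind of mismatch shown in the time-travel output above. Snapshot 3, which recorded its schema ID, resolves to `value`.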
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org