You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/13 17:43:45 UTC

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #5984: Core, API: Support incremental scanning with branch

stevenzwu commented on code in PR #5984:
URL: https://github.com/apache/iceberg/pull/5984#discussion_r1020940001


##########
api/src/main/java/org/apache/iceberg/IncrementalScan.java:
##########
@@ -21,6 +21,23 @@
 /** API for configuring an incremental scan. */
 public interface IncrementalScan<ThisT, T extends ScanTask, G extends ScanTaskGroup<T>>
     extends Scan<ThisT, T, G> {
+
+  /**
+   * Instructs this scan to look for changes starting from a particular snapshot (inclusive).
+   *
+   * <p>If the start snapshot is not configured, it is defaulted to the oldest ancestor of the end
+   * snapshot (inclusive).
+   *
+   * @param fromSnapshotId the start snapshot ID (inclusive)
+   * @param referenceName the ref used
+   * @return this for method chaining
+   * @throws IllegalArgumentException if the start snapshot is not an ancestor of the end snapshot
+   */
+  default ThisT fromSnapshotInclusive(long fromSnapshotId, String referenceName) {

Review Comment:
   >  It doesn't make sense to read incremental changes from a tag because it is a fixed point in time.
   
   This is an interesting point. A tag is just a name pointing to some snapshot in the past. There are two possibilities.
   
   * the tag is outside the snapshot retention window (e.g. 7 days). Then this is an invalid case for incremental scan
   * the tag is still within the snapshot retention window (e.g. 7 days). Then I think it is fine to have increment scan with `fromSnapshot(String tag)`. E.g., maybe the Iceberg table has daily tags for bookkeeping of good rewind points for easier backfill.
   
   We can allow `fromSnapshot(String tag)` but add a validation?
   
   @rdblue what do you think?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org