Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/03 18:01:31 UTC

[GitHub] [iceberg] rdblue commented on a change in pull request #3796: Docs: update spark doc about incremental scan

rdblue commented on a change in pull request #3796:
URL: https://github.com/apache/iceberg/pull/3796#discussion_r777624777



##########
File path: site/docs/spark-queries.md
##########
@@ -104,6 +104,28 @@ spark.read
 
 Time travel is not yet supported by Spark's SQL syntax.
 
+### Incremental read
+
+To read incremental data between two snapshots, configure the following Spark read options:
+
+* `start-snapshot-id` Start snapshot ID used in incremental scans (exclusive)
+* `end-snapshot-id` End snapshot ID used in incremental scans (inclusive)
+
+```scala
+// Read the data appended after start-snapshot-id (10963874102873), up to and including end-snapshot-id (63874143573109)
+spark.read
+  .format("iceberg")
+  .option("start-snapshot-id", "10963874102873")
+  .option("end-snapshot-id", "63874143573109")
+  .load("path/to/table")
+```
+
+!!! Note
+    Incremental reads currently return only data written by `append` operations. `replace`, `overwrite`, and `delete` operations are not supported yet.
+    Incremental reads work with both V1 and V2 format versions.
+
+Incremental read is not yet supported by Spark's SQL syntax.

Review comment:
       Let's remove "yet" because it is unclear whether it will be supported.
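
Side note on the hunk above: the exclusive/inclusive bounds of `start-snapshot-id` and `end-snapshot-id` are easy to misread, so here is a minimal sketch of the range they describe. It is not Iceberg code. `IncrementalRange` and `select` are hypothetical names, and the sketch assumes the table's snapshot lineage is already available as a list ordered oldest to newest (snapshot IDs themselves are not assumed to be sorted numerically).

```scala
// Hypothetical helper, not part of Iceberg: it only models the option semantics.
object IncrementalRange {
  /** Returns the snapshot IDs in (start, end], given a lineage ordered oldest to newest. */
  def select(lineage: Seq[Long], start: Long, end: Long): Seq[Long] = {
    val from = lineage.indexOf(start)
    val to = lineage.indexOf(end)
    require(from >= 0 && to >= 0, "both snapshots must be part of the lineage")
    require(from <= to, "start snapshot must not come after end snapshot")
    // Skip the start snapshot (exclusive) and keep the end snapshot (inclusive).
    lineage.slice(from + 1, to + 1)
  }
}
```

With a lineage of `Seq(10963874102873L, 42L, 63874143573109L, 77L)`, where `42L` and `77L` are made-up IDs, `IncrementalRange.select(lineage, 10963874102873L, 63874143573109L)` returns `Seq(42L, 63874143573109L)`: the start snapshot is skipped and the end snapshot is kept.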




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


