Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2023/01/12 21:03:37 UTC

[GitHub] [iceberg] amogh-jahagirdar opened a new pull request, #6573: Docs: Add information on how to read from branches and tags in Spark docs

amogh-jahagirdar opened a new pull request, #6573:
URL: https://github.com/apache/iceberg/pull/6573

   https://github.com/apache/iceberg/pull/5150/files introduced the ability to read from branches and tags, but the docs haven't been updated. This change updates the docs and examples for reading from branches and tags.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6573: Docs: Add information on how to read from branches and tags in Spark docs

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on code in PR #6573:
URL: https://github.com/apache/iceberg/pull/6573#discussion_r1069032742


##########
docs/spark-queries.md:
##########
@@ -126,6 +126,8 @@ To select a specific table snapshot or the snapshot at some time in the DataFram
 
 * `snapshot-id` selects a specific table snapshot
 * `as-of-timestamp` selects the current snapshot at a timestamp, in milliseconds
+* `branch` selects the head snapshot of the specified branch. Note that currently branch cannot be combined with as-of-timestamp.
+* `tag` selects the snapshot associated with the specified tag

Review Comment:
   Or we can wait until https://github.com/apache/iceberg/pull/6575 gets merged, so that we don't have to mention it for both branch and tag. But we also need to add an example in the SQL section.





[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6573: Docs: Add information on how to read from branches and tags in Spark docs

Posted by GitBox <gi...@apache.org>.
amogh-jahagirdar commented on code in PR #6573:
URL: https://github.com/apache/iceberg/pull/6573#discussion_r1070158484


##########
docs/spark-queries.md:
##########
@@ -126,6 +126,8 @@ To select a specific table snapshot or the snapshot at some time in the DataFram
 
 * `snapshot-id` selects a specific table snapshot
 * `as-of-timestamp` selects the current snapshot at a timestamp, in milliseconds
+* `branch` selects the head snapshot of the specified branch. Note that currently branch cannot be combined with as-of-timestamp.
+* `tag` selects the snapshot associated with the specified tag

Review Comment:
   Definitely agree on having a SQL example once #6575 gets merged. As for combining as-of-timestamp with tag, I felt that was apparent: a tag can only map to a single snapshot, which conflicts with passing in a timestamp, whereas branch + time travel is a different case.
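
The two conflicts discussed in this thread (`tag` with `as-of-timestamp`, and `branch` with `as-of-timestamp`) can be sketched as a small check over the read options. This is only an illustration under stated assumptions: `ReadOptionCheck` and `validate` are hypothetical names, not Iceberg's actual validation code, and only the option keys come from the docs diff.

```scala
// Illustrative only: a hypothetical helper, not Iceberg's actual validation code.
object ReadOptionCheck {
  // Option keys as they appear in the docs diff.
  val AsOfTimestamp = "as-of-timestamp"
  val Branch = "branch"
  val Tag = "tag"

  /** Returns Left(reason) for a combination the thread describes as invalid, else Right(options). */
  def validate(options: Map[String, String]): Either[String, Map[String, String]] = {
    val hasTimestamp = options.contains(AsOfTimestamp)
    if (options.contains(Tag) && hasTimestamp)
      Left("tag maps to a single snapshot, so it conflicts with as-of-timestamp")
    else if (options.contains(Branch) && hasTimestamp)
      Left("branch cannot currently be combined with as-of-timestamp")
    else
      Right(options)
  }
}
```

For example, `ReadOptionCheck.validate(Map("tag" -> "historical-snapshot"))` yields a `Right`, while adding an `as-of-timestamp` entry to the same map yields a `Left` with the reason.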





[GitHub] [iceberg] jackye1995 merged pull request #6573: Docs: Add information on how to read from branches and tags in Spark docs

Posted by GitBox <gi...@apache.org>.
jackye1995 merged PR #6573:
URL: https://github.com/apache/iceberg/pull/6573




[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6573: Docs: Add information on how to read from branches and tags in Spark docs

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on code in PR #6573:
URL: https://github.com/apache/iceberg/pull/6573#discussion_r1069020854


##########
docs/spark-queries.md:
##########
@@ -126,6 +126,8 @@ To select a specific table snapshot or the snapshot at some time in the DataFram
 

Review Comment:
   We need to change "two" to "four".



##########
docs/spark-queries.md:
##########
@@ -126,6 +126,8 @@ To select a specific table snapshot or the snapshot at some time in the DataFram
 
 * `snapshot-id` selects a specific table snapshot
 * `as-of-timestamp` selects the current snapshot at a timestamp, in milliseconds
+* `branch` selects the head snapshot of the specified branch. Note that currently branch cannot be combined with as-of-timestamp.
+* `tag` selects the snapshot associated with the specified tag

Review Comment:
   Do we need to mention that tag also cannot be combined with as-of-timestamp?





[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6573: Docs: Add information on how to read from branches and tags in Spark docs

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on code in PR #6573:
URL: https://github.com/apache/iceberg/pull/6573#discussion_r1070189868


##########
docs/spark-queries.md:
##########
@@ -126,6 +126,8 @@ To select a specific table snapshot or the snapshot at some time in the DataFram
 
 * `snapshot-id` selects a specific table snapshot
 * `as-of-timestamp` selects the current snapshot at a timestamp, in milliseconds
+* `branch` selects the head snapshot of the specified branch. Note that currently branch cannot be combined with as-of-timestamp.
+* `tag` selects the snapshot associated with the specified tag

Review Comment:
   Given that it is a syntax change, I am allowing more time for others to take a look. I think we can merge this one first and add that later.





[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6573: Docs: Add information on how to read from branches and tags in Spark docs

Posted by GitBox <gi...@apache.org>.
amogh-jahagirdar commented on code in PR #6573:
URL: https://github.com/apache/iceberg/pull/6573#discussion_r1068707350


##########
docs/spark-queries.md:
##########
@@ -143,6 +145,22 @@ spark.read
     .load("path/to/table")
 ```
 
+```scala
+// time travel to tag historical-snapshot
+spark.read
+    .option("tag", "historical-snapshot")
+    .format("iceberg")
+    .load("path/to/table")
+```
+
+```scala
+// time travel to the head snapshot of audit-branch
+spark.read
+    .option("branch", "audit-branch")

Review Comment:
   Sure, updated! 





[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6573: Docs: Add information on how to read from branches and tags in Spark docs

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on code in PR #6573:
URL: https://github.com/apache/iceberg/pull/6573#discussion_r1068695683


##########
docs/spark-queries.md:
##########
@@ -143,6 +145,22 @@ spark.read
     .load("path/to/table")
 ```
 
+```scala
+// time travel to tag historical-snapshot
+spark.read
+    .option("tag", "historical-snapshot")
+    .format("iceberg")
+    .load("path/to/table")
+```
+
+```scala
+// time travel to the head snapshot of audit-branch
+spark.read
+    .option("branch", "audit-branch")

Review Comment:
   Can we specify the static variable to use? `SparkReadOptions.BRANCH`
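
As a rough sketch of the idea behind this suggestion, the option key can be referenced through a named constant rather than a string literal. The `ReadOptionKeys` object below is a hypothetical stand-in for `SparkReadOptions`, so that the snippet runs without Spark or Iceberg on the classpath; the key strings are the ones used in the diff above.

```scala
// Hypothetical stand-in for SparkReadOptions; the key strings are taken from the diff.
object ReadOptionKeys {
  val Branch = "branch"
  val Tag = "tag"
}

object BranchReadExample {
  // Options map that could be handed to spark.read.options(...) before
  // .format("iceberg").load(...), as in the examples in the diff.
  val auditBranchOptions: Map[String, String] =
    Map(ReadOptionKeys.Branch -> "audit-branch")
}
```

Using the constant keeps the key in one place, so a typo in the option name becomes a compile error instead of a silently ignored option.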





[GitHub] [iceberg] jackye1995 commented on pull request #6573: Docs: Add information on how to read from branches and tags in Spark docs

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on PR #6573:
URL: https://github.com/apache/iceberg/pull/6573#issuecomment-1382625570

   Thanks everyone for the review. As I said in the thread about the SQL-related changes, I will wait some more time in case there are disagreements. I will merge this one first, and we can add follow-up PRs on that front.




[GitHub] [iceberg] singhpk234 commented on a diff in pull request #6573: Docs: Add information on how to read from branches and tags in Spark docs

Posted by GitBox <gi...@apache.org>.
singhpk234 commented on code in PR #6573:
URL: https://github.com/apache/iceberg/pull/6573#discussion_r1068696488


##########
docs/spark-queries.md:
##########
@@ -143,6 +145,22 @@ spark.read
     .load("path/to/table")
 ```
 
+```scala
+// time travel to tag historical-snapshot

Review Comment:
   [question] Should we just say `read the table from the snapshot that the historical-snapshot tag points to`? What we are doing here is not exactly time travel. Thoughts?





[GitHub] [iceberg] singhpk234 commented on a diff in pull request #6573: Docs: Add information on how to read from branches and tags in Spark docs

Posted by GitBox <gi...@apache.org>.
singhpk234 commented on code in PR #6573:
URL: https://github.com/apache/iceberg/pull/6573#discussion_r1068696488


##########
docs/spark-queries.md:
##########
@@ -143,6 +145,22 @@ spark.read
     .load("path/to/table")
 ```
 
+```scala
+// time travel to tag historical-snapshot

Review Comment:
   [question] Should we just say `read the table from the snapshot that historical-snapshot points to`? What we are doing here is not exactly time travel. Thoughts?


