Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/26 15:57:31 UTC

[GitHub] [iceberg-docs] samredai opened a new pull request #19: Adds graphic for time-travel section of splash page

samredai opened a new pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19


   This adds a timeline graphic to the Time Travel feature description on the splash page.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg-docs] rdblue commented on a change in pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19#discussion_r792883478



##########
File path: landing-page/content/services/time-travel.html
##########
@@ -0,0 +1,70 @@
+---
+Title: Time Travel
+Description: Time-travel enables reproducible queries that use exactly the same table snapshot, or lets users easily examine changes. Version rollback allows users to quickly correct problems by resetting tables to a good state.
+LearnMore: /docs/latest/spark-queries/#time-travel
+Category: Post
+Draft: false
+weight: 400
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements.  See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License.  You may obtain a copy of the License at
+ -
+ -   http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+ <ul class="timeline">
+
+	<!-- Item 1 -->
+	<li>

Review comment:
       Did you intend to use tabs instead of spaces? It makes this harder to read.






[GitHub] [iceberg-docs] samredai commented on a change in pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
samredai commented on a change in pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19#discussion_r796011172



##########
File path: landing-page/content/services/time-travel.html
##########
@@ -0,0 +1,38 @@
+---
+Title: Time Travel and Rollback
+Description: Time-travel enables reproducible queries that use exactly the same table snapshot, or lets users easily examine changes. Version rollback allows users to quickly correct problems by resetting tables to a good state.
+LearnMore: /docs/latest/spark-queries/#time-travel
+Category: Post
+Draft: false
+weight: 400
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements.  See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License.  You may obtain a copy of the License at
+ -
+ -   http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+<div class="termynal-container">
+    <div id="termynal-time-travel" data-termynal data-ty-startDelay="600" data-ty-typeDelay="20" data-ty-lineDelay="500">
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="scala>">spark.read.table("taxis").count()</span>
+        <span data-ty>2,853,020</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="scala>">val ONE_DAY_MS=86400000;</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="scala>">val NOW=System.currentTimeMillis()</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="scala>">(spark</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="">.read</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="">.option("as-of-timestamp", NOW_MS - ONE_DAY_MS)</span>

Review comment:
       How about this, which gets rid of all line wraps?
   ![time_travel_example](https://user-images.githubusercontent.com/43911210/151862244-e0492557-637d-41c2-a50b-aef6dad00a91.gif)
   
   






[GitHub] [iceberg-docs] rdblue commented on a change in pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19#discussion_r796003282



##########
File path: landing-page/content/services/time-travel.html
##########
@@ -0,0 +1,38 @@
+---
+Title: Time Travel and Rollback
+Description: Time-travel enables reproducible queries that use exactly the same table snapshot, or lets users easily examine changes. Version rollback allows users to quickly correct problems by resetting tables to a good state.
+LearnMore: /docs/latest/spark-queries/#time-travel
+Category: Post
+Draft: false
+weight: 400
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements.  See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License.  You may obtain a copy of the License at
+ -
+ -   http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+<div class="termynal-container">
+    <div id="termynal-time-travel" data-termynal data-ty-startDelay="600" data-ty-typeDelay="20" data-ty-lineDelay="500">
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="scala>">spark.read.table("taxis").count()</span>
+        <span data-ty>2,853,020</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="scala>">val ONE_DAY_MS=86400000;</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="scala>">val NOW=System.currentTimeMillis()</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="scala>">(spark</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="">.read</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="">.option("as-of-timestamp", NOW_MS - ONE_DAY_MS)</span>

Review comment:
       Since this line still wraps, is it possible to make a constant above? It would be better to wrap the `val NOW` line:
   
   ```scala
   scala> val TUESDAY = System.currentTimeMillis() - ONE_DAY_MS;
   scala> ...
   >.option("as-of-timestamp", TUESDAY)
   >...
   ```
   
   We could call it something specific but short (Tuesday works for me) or we could call it YESTERDAY?






[GitHub] [iceberg-docs] rdblue edited a comment on pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
rdblue edited a comment on pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19#issuecomment-1022430192


   I have the same reaction as @RussellSpitzer. I like the visualization of snapshots, but I don't consider rollback to be time travel. Rollback alters the state of the table, while time travel actually reads older versions.
   
   The trouble here is that there isn't a good SQL demonstration of time travel yet. We've added table names for time travel in 3.2, but we're waiting for Spark 3.3 to get the `AS OF TIMESTAMP` and `AS OF VERSION` syntax. Maybe we should use those anyway? Or maybe we should use Spark's dataframe syntax to demo time travel right now:
   
   ```scala
   spark.read.option("as-of-timestamp", System.currentTimeMillis() - ONE_DAY_MS).load("db.table")
   ```
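   
   To make the read-versus-alter distinction concrete, here is a minimal in-memory sketch. It is only an illustration under stated assumptions: `Snapshot`, `Table`, `asOf`, and `rollbackTo` are hypothetical names invented for this example and are not Iceberg's API.
   
   ```scala
   // Hypothetical in-memory model for illustration; not Iceberg's API.
   final case class Snapshot(id: Long, timestampMs: Long, rowCount: Long)
   
   final case class Table(snapshots: Vector[Snapshot], currentId: Long) {
     // The snapshot that ordinary reads see.
     def current: Snapshot = snapshots.find(_.id == currentId).get
   
     // Time travel: pick the latest snapshot committed at or before the
     // given timestamp. The table itself is left unchanged.
     def asOf(timestampMs: Long): Option[Snapshot] =
       snapshots.filter(_.timestampMs <= timestampMs).sortBy(_.timestampMs).lastOption
   
     // Rollback: produce a table whose current snapshot is the given one,
     // which changes what every later reader sees.
     def rollbackTo(id: Long): Table = {
       require(snapshots.exists(_.id == id), s"unknown snapshot $id")
       copy(currentId = id)
     }
   }
   
   object TimeTravelSketch {
     def main(args: Array[String]): Unit = {
       val table = Table(
         Vector(Snapshot(1L, 1000L, 10L), Snapshot(2L, 2000L, 25L)),
         currentId = 2L
       )
       // Reading as of t=1500 sees the older snapshot ...
       println(table.asOf(1500L).map(_.rowCount)) // prints Some(10)
       // ... while the table's current state is untouched.
       println(table.current.rowCount) // prints 25
       // Rolling back changes the current state itself.
       println(table.rollbackTo(1L).current.rowCount) // prints 10
     }
   }
   ```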




[GitHub] [iceberg-docs] samredai edited a comment on pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
samredai edited a comment on pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19#issuecomment-1023462940


   Changed this to be a termynal example. Does the sentence "Version rollback allows users to quickly correct problems by resetting tables to a good state." still fit in here or is it better to just remove it completely?
   
   https://user-images.githubusercontent.com/43911210/151410500-dfb6066a-af6d-4e04-a5b7-3b1dc06cd3c1.mp4
   
   
   




[GitHub] [iceberg-docs] rdblue commented on a change in pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19#discussion_r796003597



##########
File path: landing-page/content/services/time-travel.html
##########
@@ -0,0 +1,38 @@
+---
+Title: Time Travel and Rollback
+Description: Time-travel enables reproducible queries that use exactly the same table snapshot, or lets users easily examine changes. Version rollback allows users to quickly correct problems by resetting tables to a good state.
+LearnMore: /docs/latest/spark-queries/#time-travel
+Category: Post
+Draft: false
+weight: 400
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements.  See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License.  You may obtain a copy of the License at
+ -
+ -   http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+<div class="termynal-container">
+    <div id="termynal-time-travel" data-termynal data-ty-startDelay="600" data-ty-typeDelay="20" data-ty-lineDelay="500">
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="scala>">spark.read.table("taxis").count()</span>
+        <span data-ty>2,853,020</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="scala>">val ONE_DAY_MS=86400000;</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="scala>">val NOW=System.currentTimeMillis()</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="scala>">(spark</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="">.read</span>
+        <span data-ty="input" data-ty-cursor="▋" data-ty-prompt="">.option("as-of-timestamp", NOW_MS - ONE_DAY_MS)</span>

Review comment:
       Oh, this also fixes the slight problem that you defined `NOW` but then used `NOW_MS`.






[GitHub] [iceberg-docs] RussellSpitzer commented on pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19#issuecomment-1022407790


   Although I like the intent here, I'm not sure we want to call out "rollback_to_timestamp" as the key use of time travel. I think we should center querying as of a certain time as "time travel", while rollback is more of a maintenance procedure. I feel like the graphic kind of implies that the key way to time travel is to roll back the whole table.
   
   Also, are we sure we want to add a Spark-specific command here?




[GitHub] [iceberg-docs] rdblue commented on pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19#issuecomment-1023631703


   On the termynal example, I think there are a couple of things we can do to improve it. For example, we could first do `spark.read.load("nyc.taxis").count()` and show something like 2,000,000. Then we could do `spark.read.option("as-of-timestamp", 1526266800000L).load("nyc.taxis").count()` and show a lower number. I think that's good to show that the data is changing, rather than relying on variable names. And I also think my earlier suggestion to use `System.currentTimeMillis() - ONE_DAY_MS` is a bad idea because it makes the code look way too long and complicated.
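   
   As a side note on where a literal like that comes from: `as-of-timestamp` takes milliseconds since the Unix epoch, so a readable instant can be converted with `java.time`. This is just a sketch; the `AsOfTimestamp` object and `toEpochMillis` helper are names made up for the illustration.
   
   ```scala
   import java.time.Instant
   
   object AsOfTimestamp {
     // Convert an ISO-8601 UTC instant into epoch milliseconds, the unit
     // used for the "as-of-timestamp" read option.
     def toEpochMillis(isoInstant: String): Long =
       Instant.parse(isoInstant).toEpochMilli
   
     def main(args: Array[String]): Unit = {
       // The literal from the example corresponds to 2018-05-14 03:00 UTC.
       println(toEpochMillis("2018-05-14T03:00:00Z")) // prints 1526266800000
       println(Instant.ofEpochMilli(1526266800000L)) // prints 2018-05-14T03:00:00Z
     }
   }
   ```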




[GitHub] [iceberg-docs] samredai commented on pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
samredai commented on pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19#issuecomment-1026133495


   > On the termynal example, I think there are a couple of things we can do to improve it. For example, we could first do `spark.read.load("nyc.taxis").count()` and show something like 2,000,000. Then we could do `spark.read.option("as-of-timestamp", 1526266800000L).load("nyc.taxis").count()` and show a lower number. I think that's good to show that the data is changing, rather than relying on variable names. And I also think my earlier suggestion to use `System.currentTimeMillis() - ONE_DAY_MS` is a bad idea because it makes the code look way too long and complicated.
   
   Updated this example!
   ![time_travel_example](https://user-images.githubusercontent.com/43911210/151860000-fdc7f74d-388a-44fe-b47f-744ae4de454e.gif)
   
   




[GitHub] [iceberg-docs] rdblue commented on pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19#issuecomment-1022430192


   I have the same reaction as @RussellSpitzer. I like the visualization of snapshots, but I don't consider rollback to be time travel. Rollback alters the state of the table, while time travel actually reads older versions.
   
   The trouble here is that there isn't a good SQL demonstration of time travel yet. We've added table names for time travel in 3.2, but we're waiting for Spark 3.3 to get the `AS OF TIMESTAMP` and `AS OF VERSION` syntax. Maybe we should use those anyway? Or maybe we should use Spark's dataframe syntax to demo time travel right now.




[GitHub] [iceberg-docs] rdblue merged pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19


   




[GitHub] [iceberg-docs] samredai commented on pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
samredai commented on pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19#issuecomment-1023462940


   Changed this to be a termynal example:
   
   https://user-images.githubusercontent.com/43911210/151410500-dfb6066a-af6d-4e04-a5b7-3b1dc06cd3c1.mp4
   
   
   




[GitHub] [iceberg-docs] rdblue commented on pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19#issuecomment-1023628644


   I like the sentence about rollback. I'd probably update the heading to "Time travel and rollback"




[GitHub] [iceberg-docs] samredai commented on pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
samredai commented on pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19#issuecomment-1023367178


   I'll update it to use the Spark line @rdblue provided. To @RussellSpitzer's point about not using a Spark-specific command, that has me thinking that maybe we should eventually just have SQL everywhere and not specify any engine at all, so as not to give the impression that these features are fundamentally specific to an engine. They _can_ be implemented in any engine, even if they haven't been yet. The question of which engines have yet to implement specific features feels like a distraction from describing what the features actually are.




[GitHub] [iceberg-docs] samredai commented on pull request #19: Adds graphic for time-travel section of splash page

Posted by GitBox <gi...@apache.org>.
samredai commented on pull request #19:
URL: https://github.com/apache/iceberg-docs/pull/19#issuecomment-1023631754


   > I like the sentence about rollback. I'd probably update the heading to "Time travel and rollback"
   
   Done!

