
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/07/15 00:31:22 UTC

[GitHub] [iceberg] rdblue opened a new pull request #1205: Update the site for the 0.9.0 release

rdblue opened a new pull request #1205:
URL: https://github.com/apache/iceberg/pull/1205


   This updates the site for the 0.9.0 release:
   * Add 0.9.0 to the releases page
   * Point Javadoc redirect to 0.9.0
   * Add warnings to Spark documentation
   * Update the Getting Started page to use Spark 3 SQL instead of the Java API and DataFrames
   * Add Spark catalogs to Configuration
   * Remove duplicate Spark API Quickstart with the same content as Java API Quickstart
   * Minor updates to CSS to reduce density


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] rdblue commented on pull request #1205: Update the site for the 0.9.0 release

Posted by GitBox <gi...@apache.org>.

rdblue commented on pull request #1205:
URL: https://github.com/apache/iceberg/pull/1205#issuecomment-658944517


   Thanks for reviewing, @dongjoon-hyun and @rdsr!



[GitHub] [iceberg] rdblue commented on pull request #1205: Update the site for the 0.9.0 release

Posted by GitBox <gi...@apache.org>.

rdblue commented on pull request #1205:
URL: https://github.com/apache/iceberg/pull/1205#issuecomment-658944683


   Merging since this doesn't affect the build and CI was flaky.



[GitHub] [iceberg] rdblue commented on a change in pull request #1205: Update the site for the 0.9.0 release

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #1205:
URL: https://github.com/apache/iceberg/pull/1205#discussion_r454721349



##########
File path: site/docs/spark.md
##########
@@ -286,6 +286,11 @@ val df = spark.read
     .table("prod.db.table")
 ```
 
+!!! Warning
+    When reading with DataFrames in Spark 3, use `table` to load a table by name from a catalog.
+    Using `format("iceberg")` loads an isolated table reference that is not refreshed when other queries update the table.

Review comment:
       @dongjoon-hyun, this is the warning I've added to make people aware of the issues with the `DataFrameReader`. Please take a look if you have time. There is also one below for the v1 `DataFrameWriter` API.
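
To illustrate the distinction the warning draws, here is a minimal Scala sketch of the two read paths in Spark 3. It assumes a Spark session that already has an Iceberg catalog configured under the name `prod` (the name used in the diff context above) and an existing table `db.table` in it. The `db.table` identifier passed to `load` is a hypothetical placeholder, not something taken from the pull request.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: the session is already configured with an Iceberg catalog named `prod`.
val spark = SparkSession.builder().getOrCreate()

// Recommended in Spark 3: load the table by name through the catalog.
val byName = spark.read.table("prod.db.table")

// Discouraged in Spark 3: per the warning, this loads an isolated table
// reference that is not refreshed when other queries update the table.
// "db.table" is a hypothetical identifier used only for illustration.
val isolated = spark.read.format("iceberg").load("db.table")
```

Both calls return a `DataFrame`. The difference described in the warning is in how the underlying table reference is obtained, not in the result type.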





[GitHub] [iceberg] dongjoon-hyun commented on a change in pull request #1205: Update the site for the 0.9.0 release

Posted by GitBox <gi...@apache.org>.

dongjoon-hyun commented on a change in pull request #1205:
URL: https://github.com/apache/iceberg/pull/1205#discussion_r454735068



##########
File path: site/docs/spark.md
##########
@@ -286,6 +286,11 @@ val df = spark.read
     .table("prod.db.table")
 ```
 
+!!! Warning
+    When reading with DataFrames in Spark 3, use `table` to load a table by name from a catalog.
+    Using `format("iceberg")` loads an isolated table reference that is not refreshed when other queries update the table.

Review comment:
       Thank you, @rdblue . It looks good to me.





[GitHub] [iceberg] dongjoon-hyun commented on a change in pull request #1205: Update the site for the 0.9.0 release

Posted by GitBox <gi...@apache.org>.

dongjoon-hyun commented on a change in pull request #1205:
URL: https://github.com/apache/iceberg/pull/1205#discussion_r454735340



##########
File path: site/docs/spark.md
##########
@@ -353,6 +358,13 @@ To replace data in the table with the result of a query, use `INSERT OVERWRITE`.
 
 The partitions that will be replaced by `INSERT OVERWRITE` depends on Spark's partition overwrite mode and the partitioning of a table.
 
+!!! Warning
+    Spark 3.0.0 has a correctness bug that affects dynamic `INSERT OVERWRITE` with hidden partitioning, [SPARK-32168][spark-32168].
+    For tables with [hidden partitions](../partitioning), wait for Spark 3.0.1.

Review comment:
       Ya. We should release Apache Spark 3.0.1 soon for the users.





[GitHub] [iceberg] rdblue merged pull request #1205: Update the site for the 0.9.0 release

Posted by GitBox <gi...@apache.org>.

rdblue merged pull request #1205:
URL: https://github.com/apache/iceberg/pull/1205


   



[GitHub] [iceberg] dongjoon-hyun commented on a change in pull request #1205: Update the site for the 0.9.0 release

Posted by GitBox <gi...@apache.org>.

dongjoon-hyun commented on a change in pull request #1205:
URL: https://github.com/apache/iceberg/pull/1205#discussion_r454735418



##########
File path: site/docs/spark.md
##########
@@ -432,6 +444,13 @@ Spark 3 introduced the new `DataFrameWriterV2` API for writing to tables using d
     - `df.writeTo(t).append()` is equivalent to `INSERT INTO`
     - `df.writeTo(t).overwritePartitions()` is equivalent to dynamic `INSERT OVERWRITE`
 
+The v1 DataFrame `write` API is still supported, but is not recommended.
+
+!!! Warning
+    When writing with the v1 DataFrame API in Spark 3, use `saveAsTable` or `insertInto` to load tables with a catalog.
+    Using `format("iceberg")` loads an isolated table reference that will not automatically refresh tables used by queries.

Review comment:
       Thanks, @rdblue !
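
The write paths named in this hunk can be sketched side by side as follows. This is only an illustration under stated assumptions: a catalog named `prod` is configured for Iceberg, the target table `prod.db.table` already exists, and `prod.db.source` is a hypothetical source table with a compatible schema. The statements are alternatives, not steps to run in sequence.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: the session is already configured with an Iceberg catalog named `prod`.
val spark = SparkSession.builder().getOrCreate()

// Hypothetical source data; any DataFrame matching the target schema would do.
val df = spark.table("prod.db.source")

// Spark 3 DataFrameWriterV2 API (recommended):
df.writeTo("prod.db.table").append()              // equivalent to INSERT INTO
df.writeTo("prod.db.table").overwritePartitions() // equivalent to dynamic INSERT OVERWRITE

// v1 DataFrame API, loading the table through the catalog:
df.write.mode("append").insertInto("prod.db.table")

// v1 DataFrame API with format("iceberg"): per the warning, this uses an
// isolated table reference. "db.table" is a hypothetical identifier.
df.write.format("iceberg").mode("append").save("db.table")
```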



