Posted to commits@iceberg.apache.org by bl...@apache.org on 2021/09/14 22:48:38 UTC

[iceberg] branch master updated: Docs: Add UPDATE description for Spark (#2897)

This is an automated email from the ASF dual-hosted git repository.

blue pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/iceberg.git


The following commit(s) were added to refs/heads/master by this push:
     new f6ce6cd  Docs: Add UPDATE description for Spark (#2897)
f6ce6cd is described below

commit f6ce6cd77821afa5489c93346af9cc93009b1b98
Author: Peidian li <38...@users.noreply.github.com>
AuthorDate: Wed Sep 15 06:48:25 2021 +0800

    Docs: Add UPDATE description for Spark (#2897)
---
 site/docs/spark-writes.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/site/docs/spark-writes.md b/site/docs/spark-writes.md
index 1fd2121..6f042ce 100644
--- a/site/docs/spark-writes.md
+++ b/site/docs/spark-writes.md
@@ -29,6 +29,7 @@ Iceberg uses Apache Spark's DataSourceV2 API for data source and catalog impleme
 | [SQL merge into](#merge-into)                    | ✔️        |            | ⚠ Requires Iceberg Spark extensions            |
 | [SQL insert overwrite](#insert-overwrite)        | ✔️        |            |                                                |
 | [SQL delete from](#delete-from)                  | ✔️        |            | ⚠ Row-level delete requires Spark extensions   |
+| [SQL update](#update)                            | ✔️        |            | ⚠ Requires Iceberg Spark extensions            |
 | [DataFrame append](#appending-data)              | ✔️        | ✔️          |                                                |
 | [DataFrame overwrite](#overwriting-data)         | ✔️        | ✔️          | ⚠ Behavior changed in Spark 3.0                |
 | [DataFrame CTAS and RTAS](#creating-tables)      | ✔️        |            |                                                |
@@ -171,6 +172,20 @@ WHERE ts >= '2020-05-01 00:00:00' and ts < '2020-06-01 00:00:00'
 
 If the delete filter matches entire partitions of the table, Iceberg will perform a metadata-only delete. If the filter matches individual rows of a table, then Iceberg will rewrite only the affected data files.
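
A minimal sketch of that distinction, not taken from the patch below, assuming `prod.db.table` is partitioned by `days(ts)`; the `user_id` column is hypothetical:

```sql
-- Predicate aligned with whole day partitions: Iceberg can drop the matching
-- partitions from table metadata without rewriting any data files.
DELETE FROM prod.db.table
WHERE ts >= '2020-05-01 00:00:00' and ts < '2020-06-01 00:00:00'

-- Predicate that matches individual rows: only the data files containing
-- those rows are rewritten.
DELETE FROM prod.db.table
WHERE user_id = 42
```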
 
+### `UPDATE`
+
+Spark 3.1 added support for `UPDATE` queries that update matching rows in tables.
+
+Update queries accept a filter to match rows to update.
+
+```sql
+UPDATE prod.db.table
+SET c1 = 'update_c1', c2 = 'update_c2'
+WHERE ts >= '2020-05-01 00:00:00' and ts < '2020-06-01 00:00:00'
+```
+
+For more complex row-level updates based on incoming data, see the section on `MERGE INTO`.
+
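
A minimal sketch of the `MERGE INTO` alternative referenced above, not part of the patch; the `updates` source and the `id`, `c1`, `c2` columns are assumed for illustration:

```sql
-- Upsert rows from a staged source into the target table; like UPDATE,
-- this requires the Iceberg Spark extensions.
MERGE INTO prod.db.table t
USING (SELECT * FROM updates) u
ON t.id = u.id
WHEN MATCHED THEN UPDATE SET t.c1 = u.c1, t.c2 = u.c2
WHEN NOT MATCHED THEN INSERT *
```
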
 ## Writing with DataFrames
 
 Spark 3 introduced the new `DataFrameWriterV2` API for writing to tables using data frames. The v2 API is recommended for several reasons: