Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/01/11 21:50:29 UTC

[GitHub] [iceberg] rdblue commented on a change in pull request #2067: Adds initial Documentation for Iceberg Stored Procedures

rdblue commented on a change in pull request #2067:
URL: https://github.com/apache/iceberg/pull/2067#discussion_r555362432



##########
File path: site/docs/spark.md
##########
@@ -814,3 +815,309 @@ This type conversion table describes how Iceberg types are converted to the Spar
 | struct                     | struct                  |               |
 | list                       | array                   |               |
 | map                        | map                     |               |
+
+## Procedures
+
+In Spark 3, Iceberg provides a SQL API for running the [maintenance actions](maintenance.md). Stored procedures
+are tied to the DataSourceV2 catalog and require that the Iceberg extensions are enabled for the
+Spark session.
+
+### General Usage
+
+To call an Iceberg stored procedure, execute a `CALL` command against the Iceberg catalog. All procedures are added to
+the `system` namespace. Procedures can take positional or named arguments.
+
+#### Generic Call with Positional Arguments
+```sql
+    CALL catalog_name.system.procedure_name(arg_1, arg_2, ... arg_n)
+```
+
+#### Generic Call with Named Arguments
+```sql
+    CALL catalog_name.system.procedure_name(arg_name_2 => arg_2, arg_name_1 => arg_1)
+```
+
+### Cherrypick Snapshot Procedure
+
+A procedure that applies changes in a given snapshot and creates a new snapshot which will
+be set as the current snapshot in a table.
+
+**Note:** This procedure invalidates all cached Spark plans that reference the affected table.
+
+#### Usage
+
+| Argument Name | Required? | Type | Description |
+|---------------|-----------|------|-------------|
+| table         | ✔️  | String | Name of table to perform cherrypick on |
+| snapshot_id   | ✔️   | Long | The snapshot ID to cherrypick |
+
+#### Output
+
+| Output Name | Type | Description |
+| ------------|------|-------------|
+| source_snapshot_id | Long | The snapshot before applying the cherrypick |
+| current_snapshot_id | Long | The current snapshot now that the cherrypick has been applied|
+
+#### Examples
+
+Cherrypick Snapshot 1
+```sql
+    CALL hive_prod.system.cherrypick_snapshot('my_table', 1)
+```
+
+Cherrypick Snapshot 1 with named args
+```sql
+    CALL hive_prod.system.cherrypick_snapshot(snapshot_id => 1, table => 'my_table')
+```
+
+### Expire Snapshot Procedure
+
+Each write/update/delete/upsert/compaction in Iceberg produces a new snapshot while keeping the old data and metadata
+around for snapshot isolation and time travel. The `expire_snapshots` procedure can be used to remove older snapshots
+and their files which are no longer needed.
+
+This procedure will remove old snapshots and data files which are uniquely required by those old snapshots. This means
+the `expire_snapshots` procedure will never remove files which are still required by a non-expired snapshot.
+
+#### Usage
+
+| Argument Name | Required? | Type | Description |
+|---------------|-----------|------|-------------|
+| table         | ✔️  | String | Name of table to expire snapshots from |
+| older_than    | ️   | Timestamp   | Remove snapshots older than this date (Defaults to now()) |

Review comment:
       This defaults to now? I thought it defaulted to 3 days ago?
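
Whatever the default turns out to be, the cutoff can be pinned by passing `older_than` explicitly. A minimal sketch, assuming the `hive_prod` catalog and `my_table` table used in the examples above; the timestamp is an arbitrary placeholder, not a recommended value:

```sql
-- Assumption: catalog `hive_prod` and table `my_table` are illustrative names.
-- Passing older_than explicitly means the result does not depend on the default cutoff.
CALL hive_prod.system.expire_snapshots(
    table => 'my_table',
    older_than => TIMESTAMP '2021-01-01 00:00:00.000'
)
```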



