Posted to issues@iceberg.apache.org by "tomtongue (via GitHub)" <gi...@apache.org> on 2023/02/10 18:27:13 UTC

[GitHub] [iceberg] tomtongue opened a new pull request, #6806: Spark 3.3: Change the default dest catalog name for TableSnapshot

tomtongue opened a new pull request, #6806:
URL: https://github.com/apache/iceberg/pull/6806

   ## Change
   By default, in Spark, the destination catalog for a table snapshot is the Spark default catalog (e.g. `spark_catalog`), as set in [SparkActions.java#L53](https://github.com/apache/iceberg/blob/370c135144ce2d1b00e9938e267548e78ee6edaf/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/SparkActions.java#L53) and [SnapshotTableSparkAction.java#L81](https://github.com/apache/iceberg/blob/370c135144ce2d1b00e9938e267548e78ee6edaf/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/SnapshotTableSparkAction.java#L81). The issue caused by this behavior is described in the ***Issue*** section below.
   
   This commit changes the destination catalog to the user-specified catalog, i.e. the `<catalog>` in `CALL <catalog>.system.snapshot(...)` used to invoke the table snapshot procedure.
   
   
   ## Issue
   Currently, in Spark, if we don't specify a catalog for the destination table (e.g. `CALL catalog.system.snapshot('db.src', 'db.dest')`), the call fails in [checkDestinationCatalog](https://github.com/apache/iceberg/blob/370c135144ce2d1b00e9938e267548e78ee6edaf/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/SnapshotTableSparkAction.java#L84) with the exception shown below. The cause is that, when the destination table has no catalog name, `spark_catalog` (that is, `V2SessionCatalog`) is passed to `SnapshotTableSparkAction`. Therefore, this commit passes the first catalog name in the procedure query (the `catalog` in `CALL catalog.system.snapshot(...)`) as the destination catalog name.
   
   ```
   java.lang.IllegalArgumentException: Cannot create Iceberg table in non-Iceberg Catalog. Catalog 'spark_catalog' was of class 'org.apache.spark.sql.execution.datasources.v2.V2SessionCatalog' but 'org.apache.iceberg.spark.SparkSessionCatalog' or 'org.apache.iceberg.spark.SparkCatalog' are required
   	at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:472) ~[iceberg-spark-runtime-3.3_2.12-1.2.0-SNAPSHOT.jar:?]
   	at org.apache.iceberg.spark.actions.BaseTableCreationSparkAction.checkDestinationCatalog(BaseTableCreationSparkAction.java:142) ~[iceberg-spark-runtime-3.3_2.12-1.2.0-SNAPSHOT.jar:?]
   	at org.apache.iceberg.spark.actions.SnapshotTableSparkAction.as(SnapshotTableSparkAction.java:84) ~[iceberg-spark-runtime-3.3_2.12-1.2.0-SNAPSHOT.jar:?]
   	at org.apache.iceberg.spark.procedures.SnapshotTableProcedure.call(SnapshotTableProcedure.java:99) ~[iceberg-spark-runtime-3.3_2.12-1.2.0-SNAPSHOT.jar:?]
   	at org.apache.spark.sql.execution.datasources.v2.CallExec.run(CallExec.scala:34) ~[iceberg-spark-runtime-3.3_2.12-1.2.0-SNAPSHOT.jar:?]
   	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:43) ~[spark-sql_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
   	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:43) ~[spark-sql_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
   	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:49) ~[spark-sql_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
   ... // omitted ...
   ```
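
As a rough illustration of the intended fallback, here is a minimal, self-contained sketch. It is not the code in this PR and does not use Iceberg or Spark APIs: `DestCatalogResolver`, `Resolved`, and `resolve` are hypothetical names, and splitting the identifier on `.` is a simplification that ignores backtick-quoted names. The idea is that a destination identifier without a known catalog prefix falls back to the catalog named in the `CALL` statement rather than to `spark_catalog`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of Iceberg or Spark. */
final class DestCatalogResolver {

  /** Catalog name plus the remaining namespace/table parts. */
  static final class Resolved {
    final String catalog;
    final List<String> identifier;

    Resolved(String catalog, List<String> identifier) {
      this.catalog = catalog;
      this.identifier = identifier;
    }
  }

  /**
   * Resolves the destination catalog for a snapshot call.
   *
   * @param dest             destination identifier as written by the user, e.g. "db.dest"
   * @param procedureCatalog catalog named in CALL <catalog>.system.snapshot(...)
   * @param knownCatalogs    names of catalogs registered in the session (assumed input)
   */
  static Resolved resolve(String dest, String procedureCatalog, Set<String> knownCatalogs) {
    // Simplification: a real parser must also handle backtick-quoted parts.
    List<String> parts = Arrays.asList(dest.split("\\."));

    // An explicit catalog prefix on the destination wins.
    if (parts.size() > 1 && knownCatalogs.contains(parts.get(0))) {
      return new Resolved(parts.get(0), parts.subList(1, parts.size()));
    }

    // No catalog prefix: fall back to the procedure's catalog, not spark_catalog.
    return new Resolved(procedureCatalog, parts);
  }
}
```

Under these assumptions, for `CALL my_catalog.system.snapshot('db.src', 'db.dest')` (where `my_catalog` is a placeholder name), `resolve("db.dest", "my_catalog", knownCatalogs)` yields `my_catalog` as the destination catalog, while an explicitly prefixed destination such as `spark_catalog.db.dest` keeps its own catalog.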
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


Re: [PR] Spark 3.3: Change the default dest catalog name for TableSnapshot [iceberg]

Posted by "tomtongue (via GitHub)" <gi...@apache.org>.
tomtongue closed pull request #6806: Spark 3.3: Change the default dest catalog name for TableSnapshot
URL: https://github.com/apache/iceberg/pull/6806
