You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/08/23 16:53:05 UTC

[GitHub] [iceberg] sririshindra opened a new pull request, #5622: Spark: Remove backup table after a successful migrate action.

sririshindra opened a new pull request, #5622:
URL: https://github.com/apache/iceberg/pull/5622

   Once the MigrateTable action is called, a backup table is created. The backup table is used to restore the original table in case the MigrateTable action fails. This restoration is done by renaming the backup table to the original table. Therefore if the migrate operation fails there is no backup table lying around in the catalog. However, if the migration goes through successfully, the backup table is not cleaned up leaving the user to find a new unknown table in their catalog.
   
   This PR cleans up the backup table after the migrate table action succeeds.
   
   I tested this manually and I also added additional assertions to the existing unit tests to verify that the backup table is cleaned up properly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r995218747


##########
api/src/main/java/org/apache/iceberg/actions/MigrateTable.java:
##########
@@ -41,6 +41,14 @@ public interface MigrateTable extends Action<MigrateTable, MigrateTable.Result>
    */
   MigrateTable tableProperty(String name, String value);
 
+  /**
+   * Drops the backup of the original table after a successful migration
+   *
+   * @param dropBackup if true, removes the backup table
+   * @return this for method chaining
+   */
+  MigrateTable dropBackup(boolean dropBackup);

Review Comment:
   @danielcweeks @rdblue wanted to check as its api module, we can add a new method if all subclass implement it, right?  No need to add default?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r978900121


##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -112,6 +112,8 @@ private MigrateTable.Result doExecute() {
     StagedSparkTable stagedTable = null;
     Table icebergTable;
     boolean threw = true;
+    boolean dropBackup =
+        Boolean.parseBoolean(additionalProperties().getOrDefault("dropBackup", "false"));

Review Comment:
   I added the extra dropBackup method on MigrateTableSparkAction as you suggested in the latest commit. This way we avoid modifying the table properties. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r994957246


##########
api/src/main/java/org/apache/iceberg/actions/MigrateTable.java:
##########
@@ -41,6 +41,14 @@ public interface MigrateTable extends Action<MigrateTable, MigrateTable.Result>
    */
   MigrateTable tableProperty(String name, String value);
 
+  /**
+   * Sets the dropBackup property to drop backup table after a successful table migration.

Review Comment:
   Done, Fixed in latest commit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#issuecomment-1227996346

   I did notice it the other day, and also thought it would be a good usability feature (especially as there is snapshot table that keeps the original table).  I wonder if need to have a  drop_backup flag or something, in case someone depends on original behavior somehow?  Let's see if everyone is ok with it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r955612321


##########
spark/v3.1/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -221,4 +223,8 @@ private void restoreSourceTable() {
           e);
     }
   }
+
+  private void removeBackupTable() {
+    destCatalog().dropTable(backupIdent);

Review Comment:
   Should we do try/catch?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] wypoon commented on pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
wypoon commented on PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#issuecomment-1228840960

   @RussellSpitzer were you the original author of the migrate action (#1525)? Was the purpose of the backup table simply for restoring it in case the migrate failed? If so, it would make sense to remove the backup if the migrate succeeded, right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho merged pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho merged PR #5622:
URL: https://github.com/apache/iceberg/pull/5622


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r966512855


##########
api/src/main/java/org/apache/iceberg/actions/ActionsProvider.java:
##########
@@ -31,6 +31,11 @@ default SnapshotTable snapshotTable(String sourceTableIdent) {
 
   /** Instantiates an action to migrate an existing table to Iceberg. */
   default MigrateTable migrateTable(String tableIdent) {
+    return migrateTable(tableIdent, false);
+  }
+
+  /** Instantiates an action to migrate an existing table to Iceberg. */
+  default MigrateTable migrateTable(String tableIdent, Boolean dropBackup) {

Review Comment:
   We should just keep this the same, and add options to MigrateTable (see DeleteOrphanFiles which has many options, for example)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r977913967


##########
api/src/main/java/org/apache/iceberg/actions/ActionsProvider.java:
##########
@@ -31,6 +31,11 @@ default SnapshotTable snapshotTable(String sourceTableIdent) {
 
   /** Instantiates an action to migrate an existing table to Iceberg. */
   default MigrateTable migrateTable(String tableIdent) {
+    return migrateTable(tableIdent, false);
+  }
+
+  /** Instantiates an action to migrate an existing table to Iceberg. */
+  default MigrateTable migrateTable(String tableIdent, Boolean dropBackup) {

Review Comment:
   Better but not exactly what I meant, I think we still need an extra method on MigrateTableAction object (MigrateTableAction::dropBackup(boolean)).  Otherwise piggybacking with tableProperties there would make actually persisted to tableProperties?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r996182604


##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -95,6 +97,12 @@ public MigrateTable tableProperty(String property, String value) {
     return this;
   }
 
+  @Override
+  public MigrateTable dropBackup(boolean dropBackup) {
+    this.removeBackup = dropBackup;

Review Comment:
   Can we rename the variables to be the same (either remove or drop).  It makes the code easier if there's fewer concepts, and we keep the same variable name.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r995218747


##########
api/src/main/java/org/apache/iceberg/actions/MigrateTable.java:
##########
@@ -41,6 +41,14 @@ public interface MigrateTable extends Action<MigrateTable, MigrateTable.Result>
    */
   MigrateTable tableProperty(String name, String value);
 
+  /**
+   * Drops the backup of the original table after a successful migration
+   *
+   * @param dropBackup if true, removes the backup table
+   * @return this for method chaining
+   */
+  MigrateTable dropBackup(boolean dropBackup);

Review Comment:
   @danielcweeks @rdblue wanted to check as its api module, we can add a new method if all subclass implement it, right?  Debating whether or not to make a default.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#issuecomment-1282837097

   Merged, thanks @sririshindra for the changes


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#issuecomment-1227895113

   @aokolnychyi , @rdblue , @danielcweeks 
   Could you please take a look at this PR if you get a chance? Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r981912269


##########
.palantir/revapi.yml:
##########
@@ -3,12 +3,19 @@ versionOverrides:
 acceptedBreaks:
   apache-iceberg-0.14.0:
     org.apache.iceberg:iceberg-api:
+    - code: "java.class.defaultSerializationChanged"

Review Comment:
   I don't understand the reason why the partition key is complaining either. 
   When I try to execute ` ./gradlew :iceberg-api:revapi --rerun-tasks`
   
   I get the following error.
   
   ```
   Execution failed for task ':iceberg-api:revapi'.
   > There were Java public API/ABI breaks reported by revapi:
     
     java.class.defaultSerializationChanged: The default serialization ID for the class has changed. This means that the new version of the class is not deserializable from the byte stream of a serialized old class.
     
     old: class org.apache.iceberg.PartitionKey
     new: class org.apache.iceberg.PartitionKey
     
     SOURCE: EQUIVALENT, BINARY: EQUIVALENT, SEMANTIC: BREAKING
     
     From old archive: iceberg-api-0.14.0.jar
     From new archive: iceberg-api-0.15.0-SNAPSHOT.jar
   ```
   Still looking into why it is needed.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r981912269


##########
.palantir/revapi.yml:
##########
@@ -3,12 +3,19 @@ versionOverrides:
 acceptedBreaks:
   apache-iceberg-0.14.0:
     org.apache.iceberg:iceberg-api:
+    - code: "java.class.defaultSerializationChanged"

Review Comment:
   I don't understand the reason why the partition key is complaining either. 
   When I try to execute ` ./gradlew :iceberg-api:revapi --rerun-tasks`
   
   I get the following error.
   
   ```
   Execution failed for task ':iceberg-api:revapi'.
   > There were Java public API/ABI breaks reported by revapi:
     
     java.class.defaultSerializationChanged: The default serialization ID for the class has changed. This means that the new version of the class is not deserializable from the byte stream of a serialized old class.
     
     old: class org.apache.iceberg.PartitionKey
     new: class org.apache.iceberg.PartitionKey
     
     SOURCE: EQUIVALENT, BINARY: EQUIVALENT, SEMANTIC: BREAKING
     
     From old archive: iceberg-api-0.14.0.jar
     From new archive: iceberg-api-0.15.0-SNAPSHOT.jar
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r981913778


##########
api/src/main/java/org/apache/iceberg/actions/MigrateTable.java:
##########
@@ -41,6 +41,14 @@ public interface MigrateTable extends Action<MigrateTable, MigrateTable.Result>
    */
   MigrateTable tableProperty(String name, String value);
 
+  /**
+   * Sets the dropBackup property to drop backup table after a successful table Migration

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r996231152


##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -95,6 +97,12 @@ public MigrateTable tableProperty(String property, String value) {
     return this;
   }
 
+  @Override
+  public MigrateTable dropBackup(boolean dropBackup) {
+    this.removeBackup = dropBackup;

Review Comment:
   OK sorry, i guess it breaks the checkstyle :(.   
   
   After taking a look through other Iceberg interfaces (ie, RewriteDataFiles, OverwriteFiles), I think nobody passes a boolean but just have a no-arg method.  Can we do that instead?  Thanks.
   
   ie. 
   ```
   dropBackup() {
      this.dropBackup = true;
      return this;
   }
   ```
   
   Sorry for back and forth



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#issuecomment-1230731228

   > Docs are fine, but to me it would make more sense to add a non-default flag to drop_backup (as mentioned above), if we want to do add a feature here. I dont see letting the user decide name for backup is that helpful?
   
   Sounds good. @szehon-ho I will update the PR to add the drop_table flag and make it optional. Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r971478721


##########
api/src/main/java/org/apache/iceberg/actions/ActionsProvider.java:
##########
@@ -31,6 +31,11 @@ default SnapshotTable snapshotTable(String sourceTableIdent) {
 
   /** Instantiates an action to migrate an existing table to Iceberg. */
   default MigrateTable migrateTable(String tableIdent) {
+    return migrateTable(tableIdent, false);
+  }
+
+  /** Instantiates an action to migrate an existing table to Iceberg. */
+  default MigrateTable migrateTable(String tableIdent, Boolean dropBackup) {

Review Comment:
   Done, Fixed in the latest commit.



##########
spark/v3.0/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMigrateTableProcedure.java:
##########
@@ -84,7 +84,7 @@ public void testMigrateWithOptions() throws IOException {
     sql("INSERT INTO TABLE %s VALUES (1, 'a')", tableName);
 
     Object result =
-        scalarSql("CALL %s.system.migrate('%s', map('foo', 'bar'))", catalogName, tableName);
+        scalarSql("CALL %s.system.migrate('%s', false, map('foo', 'bar'))", catalogName, tableName);

Review Comment:
   Fixed in the latest commit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r995218747


##########
api/src/main/java/org/apache/iceberg/actions/MigrateTable.java:
##########
@@ -41,6 +41,14 @@ public interface MigrateTable extends Action<MigrateTable, MigrateTable.Result>
    */
   MigrateTable tableProperty(String name, String value);
 
+  /**
+   * Drops the backup of the original table after a successful migration
+   *
+   * @param dropBackup if true, removes the backup table
+   * @return this for method chaining
+   */
+  MigrateTable dropBackup(boolean dropBackup);

Review Comment:
   @danielcweeks @rdblue wanted to check (as its api module and will be post 1.0), we can add a new method if all subclass implement it, right?  No need to add default?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r996178997


##########
api/src/main/java/org/apache/iceberg/actions/MigrateTable.java:
##########
@@ -41,6 +41,14 @@ public interface MigrateTable extends Action<MigrateTable, MigrateTable.Result>
    */
   MigrateTable tableProperty(String name, String value);
 
+  /**
+   * Drops the backup of the original table after a successful migration
+   *
+   * @param dropBackup if true, removes the backup table
+   * @return this for method chaining
+   */
+  MigrateTable dropBackup(boolean dropBackup);

Review Comment:
   I checked with @danielcweeks offline, we could put a default that throws UnsupportedOperation or just leave it as is.  Probably the risk is small anyone implements it, so i think we are good.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r996222929


##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -95,6 +97,12 @@ public MigrateTable tableProperty(String property, String value) {
     return this;
   }
 
+  @Override
+  public MigrateTable dropBackup(boolean dropBackup) {
+    this.removeBackup = dropBackup;

Review Comment:
   Done. Fixed in the latest commit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r966512855


##########
api/src/main/java/org/apache/iceberg/actions/ActionsProvider.java:
##########
@@ -31,6 +31,11 @@ default SnapshotTable snapshotTable(String sourceTableIdent) {
 
   /** Instantiates an action to migrate an existing table to Iceberg. */
   default MigrateTable migrateTable(String tableIdent) {
+    return migrateTable(tableIdent, false);
+  }
+
+  /** Instantiates an action to migrate an existing table to Iceberg. */
+  default MigrateTable migrateTable(String tableIdent, Boolean dropBackup) {

Review Comment:
   We should just keep this the same, and use the options already existing on MigrateTables.



##########
spark/v3.0/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMigrateTableProcedure.java:
##########
@@ -84,7 +84,7 @@ public void testMigrateWithOptions() throws IOException {
     sql("INSERT INTO TABLE %s VALUES (1, 'a')", tableName);
 
     Object result =
-        scalarSql("CALL %s.system.migrate('%s', map('foo', 'bar'))", catalogName, tableName);
+        scalarSql("CALL %s.system.migrate('%s', false, map('foo', 'bar'))", catalogName, tableName);

Review Comment:
   I thought it's backward compatible, why do we need to change existing tests?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#issuecomment-1281664659

   Thanks for the changes, will merge tomorrow if no further comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r994957699


##########
api/src/main/java/org/apache/iceberg/actions/MigrateTable.java:
##########
@@ -41,6 +41,14 @@ public interface MigrateTable extends Action<MigrateTable, MigrateTable.Result>
    */
   MigrateTable tableProperty(String name, String value);
 
+  /**
+   * Sets the dropBackup property to drop backup table after a successful table migration.
+   *
+   * @param dropBackup If set to true, removes the backup table

Review Comment:
   Done, Fixed in the latest commit.



##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -58,6 +58,7 @@
   private final StagingTableCatalog destCatalog;
   private final Identifier destTableIdent;
   private final Identifier backupIdent;
+  private boolean removeBackup = false;

Review Comment:
   Done, Fixed in the latest commit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r981913671


##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -221,4 +230,12 @@ private void restoreSourceTable() {
           e);
     }
   }
+
+  private void dropBackupTable() {
+    try {
+      destCatalog().dropTable(backupIdent);
+    } catch (Exception e) {
+      LOG.error("Cannot drop the backup table {} after migrate is completed.", backupIdent, e);

Review Comment:
   I am not sure I understand the grammar error here, but I think it matches the style of the rest of the error messages in the file. But I changed `after migrate ` to `after migration`.



##########
api/src/main/java/org/apache/iceberg/actions/MigrateTable.java:
##########
@@ -41,6 +41,14 @@ public interface MigrateTable extends Action<MigrateTable, MigrateTable.Result>
    */
   MigrateTable tableProperty(String name, String value);
 
+  /**
+   * Sets the dropBackup property to drop backup table after a successful table Migration
+   *
+   * @param dropBackup flag to remove backup table

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r996585494


##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -95,6 +97,12 @@ public MigrateTable tableProperty(String property, String value) {
     return this;
   }
 
+  @Override
+  public MigrateTable dropBackup(boolean dropBackup) {
+    this.removeBackup = dropBackup;

Review Comment:
   I fixed the checkstyle issue in the latest commit. Thanks for your patience in reviewing this PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r977913967


##########
api/src/main/java/org/apache/iceberg/actions/ActionsProvider.java:
##########
@@ -31,6 +31,11 @@ default SnapshotTable snapshotTable(String sourceTableIdent) {
 
   /** Instantiates an action to migrate an existing table to Iceberg. */
   default MigrateTable migrateTable(String tableIdent) {
+    return migrateTable(tableIdent, false);
+  }
+
+  /** Instantiates an action to migrate an existing table to Iceberg. */
+  default MigrateTable migrateTable(String tableIdent, Boolean dropBackup) {

Review Comment:
   Better but not exactly what I meant, I think we still need an extra method on MigrateTableAction object (MigrateTableAction::dropBackup(boolean)).  Otherwise piggybacking with properties there would make it go to tableProperties?



##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -112,6 +112,8 @@ private MigrateTable.Result doExecute() {
     StagedSparkTable stagedTable = null;
     Table icebergTable;
     boolean threw = true;
+    boolean dropBackup =
+        Boolean.parseBoolean(additionalProperties().getOrDefault("dropBackup", "false"));

Review Comment:
   So additionalProperties() looks like properties that will be set on the target table, right?  That's not great because then there will be this extra property 'dropBackup' that's useless for the Iceberg table.
   
   So I think , let's add an extra method on MigrateTableSparkAction.dropBackup(boolean), and propagate it via its own dedicated argument in the procedure?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r994921586


##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -58,6 +58,7 @@
   private final StagingTableCatalog destCatalog;
   private final Identifier destTableIdent;
   private final Identifier backupIdent;
+  private boolean removeBackup = false;

Review Comment:
   Nit: we try to put a space between finals and non-finals as well



##########
api/src/main/java/org/apache/iceberg/actions/MigrateTable.java:
##########
@@ -41,6 +41,14 @@ public interface MigrateTable extends Action<MigrateTable, MigrateTable.Result>
    */
   MigrateTable tableProperty(String name, String value);
 
+  /**
+   * Sets the dropBackup property to drop backup table after a successful table migration.
+   *
+   * @param dropBackup If set to true, removes the backup table

Review Comment:
   "if true, removes the backup table"



##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -221,4 +230,12 @@ private void restoreSourceTable() {
           e);
     }
   }
+
+  private void dropBackupTable() {
+    try {
+      destCatalog().dropTable(backupIdent);
+    } catch (Exception e) {
+      LOG.error("Cannot drop the backup table {} after migrate is completed.", backupIdent, e);

Review Comment:
   Can you change the rest of the Spark files?   "after" should be used before a noun, and "migrate" is a verb.



##########
api/src/main/java/org/apache/iceberg/actions/MigrateTable.java:
##########
@@ -41,6 +41,14 @@ public interface MigrateTable extends Action<MigrateTable, MigrateTable.Result>
    */
   MigrateTable tableProperty(String name, String value);
 
+  /**
+   * Sets the dropBackup property to drop backup table after a successful table migration.

Review Comment:
   Sentence is a bit redudant, you can see DeleteOrphanFiles or ExpireSnapshots for more direct examples.  
   
   The previous methods says 'sets a table property' because that's what it does, but here there's no point to say it sets a property.  How about:  'Drops the backup of the original table after a successful migration'.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r994960298


##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -221,4 +230,12 @@ private void restoreSourceTable() {
           e);
     }
   }
+
+  private void dropBackupTable() {
+    try {
+      destCatalog().dropTable(backupIdent);
+    } catch (Exception e) {
+      LOG.error("Cannot drop the backup table {} after migrate is completed.", backupIdent, e);

Review Comment:
   I think the latest commit addresses the grammar issue correctly.  I used an online grammar checker to make sure the latest version is grammatically correct. Please let me know if a change is needed there. Thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#issuecomment-1230661917

   > 
   
   @rdblue In that case how would you feel about letting the user decide on the name of the backup table instead of the generic backup suffix appended to the table name? I think we should at least document this behavior so that users are not surprised to see a new table in their catalog that they did not create. I can also do a documentation update if that is what is more appropriate. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#issuecomment-1229569693

   I'm -1 on this. Not removing the backup table is intentional so that the user could choose which table to move forward with. Our docs included a step to "remove the backup table" once everything was running smoothly.
   
   I'm fine adding this as an option, but I don't think it should be the default.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] wypoon commented on pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
wypoon commented on PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#issuecomment-1230906952

   I understand the thinking behind keeping the backup table after a successful migrate action and having the user decide what to do with it. If somehow this isn't already documented, then @sririshindra you could open a doc PR to document it.
   I think giving the user an option to remove the backup table automatically upon a successful migrate action is still useful. We can have a boolean parameter, defaulting to false, for removing the backup table, and pass it in the migrate procedure.
   I'm neutral on having an option for naming the backup table.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#issuecomment-1230727511

   Docs are fine, but to me it would make more sense to add a non-default flag to drop_backup (as mentioned above), if we want to do add a feature here.   I dont see letting the user decide name for backup is that helpful?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] wypoon commented on pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
wypoon commented on PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#issuecomment-1229718527

   @rdblue can you please point me to the doc with a step to remove the backup table? I do not see it in https://iceberg.apache.org/docs/latest/spark-procedures/#migrate.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r966512855


##########
api/src/main/java/org/apache/iceberg/actions/ActionsProvider.java:
##########
@@ -31,6 +31,11 @@ default SnapshotTable snapshotTable(String sourceTableIdent) {
 
   /** Instantiates an action to migrate an existing table to Iceberg. */
   default MigrateTable migrateTable(String tableIdent) {
+    return migrateTable(tableIdent, false);
+  }
+
+  /** Instantiates an action to migrate an existing table to Iceberg. */
+  default MigrateTable migrateTable(String tableIdent, Boolean dropBackup) {

Review Comment:
   We should just keep this the same, and use the options already existing on MigrateTable.



##########
spark/v3.0/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMigrateTableProcedure.java:
##########
@@ -84,7 +84,7 @@ public void testMigrateWithOptions() throws IOException {
     sql("INSERT INTO TABLE %s VALUES (1, 'a')", tableName);
 
     Object result =
-        scalarSql("CALL %s.system.migrate('%s', map('foo', 'bar'))", catalogName, tableName);
+        scalarSql("CALL %s.system.migrate('%s', false, map('foo', 'bar'))", catalogName, tableName);

Review Comment:
   I thought it's backward compatible (new option default to false), why do we need to change existing tests?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r996182604


##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -95,6 +97,12 @@ public MigrateTable tableProperty(String property, String value) {
     return this;
   }
 
+  @Override
+  public MigrateTable dropBackup(boolean dropBackup) {
+    this.removeBackup = dropBackup;

Review Comment:
   Can we rename the variables to be the same (either remove or drop).  Probably dropBackup as its smaller change.  It makes the code easier to understand quickly if we keep the same variable name, otherwise it indicates there's a difference.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r996231152


##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -95,6 +97,12 @@ public MigrateTable tableProperty(String property, String value) {
     return this;
   }
 
+  @Override
+  public MigrateTable dropBackup(boolean dropBackup) {
+    this.removeBackup = dropBackup;

Review Comment:
   OK sorry, i guess it breaks the checkstyle :(.   
   
   After taking a look through other Iceberg interfaces, I think nobody passes a boolean but just have a no-arg method.  Can we do that instead?  Thanks.
   
   ie. 
   dropBackup() {
      this.dropBackup = true;
      return this;
   }



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] sririshindra commented on pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
sririshindra commented on PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#issuecomment-1276535821

   @szehon-ho Gentle ping to review this PR. Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r995220276


##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -221,4 +230,12 @@ private void restoreSourceTable() {
           e);
     }
   }
+
+  private void dropBackupTable() {
+    try {
+      destCatalog().dropTable(backupIdent);
+    } catch (Exception e) {
+      LOG.error("Cannot drop the backup table {} after migrate is completed.", backupIdent, e);

Review Comment:
   Sounds good, it's a bit long but its fine.  Ref for the grammar comment (missed putting it earlier):  https://dictionary.cambridge.org/us/grammar/british-grammar/after-afterwards



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5622: Spark: Remove backup table after a successful migrate action.

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5622:
URL: https://github.com/apache/iceberg/pull/5622#discussion_r980598504


##########
spark/v3.0/spark/src/main/java/org/apache/iceberg/spark/actions/BaseMigrateTableSparkAction.java:
##########
@@ -221,4 +230,12 @@ private void restoreSourceTable() {
           e);
     }
   }
+
+  private void dropBackupTable() {
+    try {
+      destCatalog().dropTable(backupIdent);
+    } catch (Exception e) {
+      LOG.error("Cannot drop the backup table {} after migrate is completed.", backupIdent, e);

Review Comment:
   Nit: grammar error here (should be- after + noun).  How about,  'Failed to drop the backup table {} after the migration'



##########
.palantir/revapi.yml:
##########
@@ -3,12 +3,19 @@ versionOverrides:
 acceptedBreaks:
   apache-iceberg-0.14.0:
     org.apache.iceberg:iceberg-api:
+    - code: "java.class.defaultSerializationChanged"

Review Comment:
   Why is partition key complaining here?  Can we not add this one?



##########
api/src/main/java/org/apache/iceberg/actions/MigrateTable.java:
##########
@@ -41,6 +41,14 @@ public interface MigrateTable extends Action<MigrateTable, MigrateTable.Result>
    */
   MigrateTable tableProperty(String name, String value);
 
+  /**
+   * Sets the dropBackup property to drop backup table after a successful table Migration
+   *
+   * @param dropBackup flag to remove backup table

Review Comment:
   Can be more descriptive, maybe "if true, removes the backup table"



##########
api/src/main/java/org/apache/iceberg/actions/MigrateTable.java:
##########
@@ -41,6 +41,14 @@ public interface MigrateTable extends Action<MigrateTable, MigrateTable.Result>
    */
   MigrateTable tableProperty(String name, String value);
 
+  /**
+   * Sets the dropBackup property to drop backup table after a successful table Migration

Review Comment:
   Capitlization looks a bit weird, let's revert Migration => migration?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org