You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/07/23 12:37:11 UTC

[GitHub] [iceberg] sshkvar opened a new pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

sshkvar opened a new pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854


   Ability to change purge flag which is used for deleting data and metadata files on table drop action via catalog properties
   
   Added new spark catalog property `purge-data-and-metadata` which is true by default (current behavior)
   if `purge-data-and-metadata` set to `false` data and metadata files will not be deleted 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] kbendick commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

kbendick commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r680596282



##########
File path: spark3/src/test/java/org/apache/iceberg/spark/sql/TestDropTable.java
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.spark.sql;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.Map;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
+import org.apache.iceberg.spark.SparkCatalog;
+import org.apache.iceberg.spark.SparkCatalogTestBase;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runners.Parameterized;
+
+public class TestDropTable extends SparkCatalogTestBase {
+  private final boolean shouldPurgeDataAndMetadata;
+
+  public TestDropTable(String catalogName, String implementation, Map<String, String> config) {
+    super(catalogName, implementation, config);
+    if (config.containsKey(SparkCatalog.PURGE_DATA_AND_METADATA)) {
+      this.shouldPurgeDataAndMetadata = Boolean.parseBoolean(config.get(SparkCatalog.PURGE_DATA_AND_METADATA));
+    } else {
+      this.shouldPurgeDataAndMetadata = SparkCatalog.PURGE_DATA_AND_METADATA_DEFAULT;
+    }
+  }
+
+  @Parameterized.Parameters(name = "catalogName = {0}, implementation = {1}, config = {2}")
+  public static Object[][] parameters() {
+    return new Object[][]{
+            {"testhive", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default"
+                    )},
+            {"testhadoop", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hadoop"
+                    )},
+            {"testhive_purge", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default",
+                            SparkCatalog.PURGE_DATA_AND_METADATA, "false"
+                    )}
+    };
+  }
+
+  @Test
+  public void testCreateTable() throws IOException {
+    Assert.assertFalse("Table should not already exist", validationCatalog.tableExists(tableIdent));
+
+    sql("CREATE TABLE %s USING iceberg as select 1 as value", tableName);
+
+    Table table = validationCatalog.loadTable(tableIdent);

Review comment:
       If we can, I think that would be preferable to some. This `validationCatalog` is somewhat more of a mock than what you'll get from `spark.catalog`.
   
   I will look into adding a method to handle this for users to the base catalog unit test so you can leave this for now and then grab it later. I believe you're right that here, in particular, there's not a large difference and the test is presently correct. =)
   
   But for your reference, the `validationCatalog` is a hadoop catalog. Hence why I think it's better to grab the catalog the same way spark would when calls to `spark.sql` that use that catalog are made =)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] sshkvar commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

sshkvar commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r678053438



##########
File path: spark3/src/test/java/org/apache/iceberg/spark/sql/TestDropTable.java
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.spark.sql;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.Map;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
+import org.apache.iceberg.spark.SparkCatalog;
+import org.apache.iceberg.spark.SparkCatalogTestBase;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runners.Parameterized;
+
+public class TestDropTable extends SparkCatalogTestBase {
+  private final boolean shouldPurgeDataAndMetadata;
+
+  public TestDropTable(String catalogName, String implementation, Map<String, String> config) {
+    super(catalogName, implementation, config);
+    if (config.containsKey(SparkCatalog.PURGE_DATA_AND_METADATA)) {
+      this.shouldPurgeDataAndMetadata = Boolean.parseBoolean(config.get(SparkCatalog.PURGE_DATA_AND_METADATA));
+    } else {
+      this.shouldPurgeDataAndMetadata = SparkCatalog.PURGE_DATA_AND_METADATA_DEFAULT;
+    }
+  }
+
+  @Parameterized.Parameters(name = "catalogName = {0}, implementation = {1}, config = {2}")
+  public static Object[][] parameters() {

Review comment:
       now parameters placed above the constructor 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] sshkvar commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

sshkvar commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r676428162



##########
File path: spark3/src/test/java/org/apache/iceberg/spark/sql/TestDropTable.java
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.spark.sql;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.Map;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
+import org.apache.iceberg.spark.SparkCatalog;
+import org.apache.iceberg.spark.SparkCatalogTestBase;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runners.Parameterized;
+
+public class TestDropTable extends SparkCatalogTestBase {
+  private final boolean shouldPurgeDataAndMetadata;
+
+  public TestDropTable(String catalogName, String implementation, Map<String, String> config) {
+    super(catalogName, implementation, config);
+    if (config.containsKey(SparkCatalog.PURGE_DATA_AND_METADATA)) {
+      this.shouldPurgeDataAndMetadata = Boolean.parseBoolean(config.get(SparkCatalog.PURGE_DATA_AND_METADATA));
+    } else {
+      this.shouldPurgeDataAndMetadata = SparkCatalog.PURGE_DATA_AND_METADATA_DEFAULT;
+    }
+  }
+
+  @Parameterized.Parameters(name = "catalogName = {0}, implementation = {1}, config = {2}")
+  public static Object[][] parameters() {
+    return new Object[][]{
+            {"testhive", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default"
+                    )},
+            {"testhadoop", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hadoop"
+                    )},
+            {"testhive_purge", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default",
+                            SparkCatalog.PURGE_DATA_AND_METADATA, "false"
+                    )}
+    };
+  }

Review comment:
       Thanks, I will update formatting 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] sshkvar commented on pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

sshkvar commented on pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#issuecomment-886523709


   > For Spark, my understanding was that we were gong to respect the `PURGE` information coming from the SQL command. Am I fully mistaken or is this for a different issue?
   
   In current implementation we ignoring purge information and always send `true` from [Spark Catalog](https://github.com/apache/iceberg/blob/master/spark3/src/main/java/org/apache/iceberg/spark/SparkCatalog.java#L237) to [Hive catalog](https://github.com/apache/iceberg/blob/b623b074c6242ac3b48fce8712dedd7f97e6ee5b/api/src/main/java/org/apache/iceberg/catalog/Catalog.java#L284)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] sshkvar commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

sshkvar commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r676423984



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/SparkCatalog.java
##########
@@ -81,13 +81,17 @@
 public class SparkCatalog extends BaseCatalog {
   private static final Set<String> DEFAULT_NS_KEYS = ImmutableSet.of(TableCatalog.PROP_OWNER);
 
+  public static final String PURGE_DATA_AND_METADATA = "purge-data-and-metadata";

Review comment:
       Actually we will use value of this property in Hive and Hadoop catalog, please take a look [on this method in Hive catalo, which will be called from spark catalog](https://github.com/apache/iceberg/blob/7779953bd2dad7990725391f9d5d8be077a7048d/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java#L148)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] kbendick commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

kbendick commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r676364846



##########
File path: spark3/src/test/java/org/apache/iceberg/spark/sql/TestDropTable.java
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.spark.sql;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.Map;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
+import org.apache.iceberg.spark.SparkCatalog;
+import org.apache.iceberg.spark.SparkCatalogTestBase;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runners.Parameterized;
+
+public class TestDropTable extends SparkCatalogTestBase {
+  private final boolean shouldPurgeDataAndMetadata;
+
+  public TestDropTable(String catalogName, String implementation, Map<String, String> config) {
+    super(catalogName, implementation, config);
+    if (config.containsKey(SparkCatalog.PURGE_DATA_AND_METADATA)) {
+      this.shouldPurgeDataAndMetadata = Boolean.parseBoolean(config.get(SparkCatalog.PURGE_DATA_AND_METADATA));
+    } else {
+      this.shouldPurgeDataAndMetadata = SparkCatalog.PURGE_DATA_AND_METADATA_DEFAULT;
+    }
+  }
+
+  @Parameterized.Parameters(name = "catalogName = {0}, implementation = {1}, config = {2}")
+  public static Object[][] parameters() {
+    return new Object[][]{
+            {"testhive", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default"
+                    )},
+            {"testhadoop", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hadoop"
+                    )},
+            {"testhive_purge", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default",
+                            SparkCatalog.PURGE_DATA_AND_METADATA, "false"
+                    )}
+    };
+  }
+
+  @Test
+  public void testCreateTable() throws IOException {
+    Assert.assertFalse("Table should not already exist", validationCatalog.tableExists(tableIdent));
+
+    sql("CREATE TABLE %s USING iceberg as select 1 as value", tableName);
+
+    Table table = validationCatalog.loadTable(tableIdent);

Review comment:
       Nit: As this is a spark test, can you pull the table off of the `spark.catalog` or from spark's `CatalogManager` in general?
   
   See `TestSparkCatalogHadoopOverrides` for a way to do that if needed :) 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] sshkvar commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

sshkvar commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r676435030



##########
File path: spark3/src/test/java/org/apache/iceberg/spark/sql/TestDropTable.java
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.spark.sql;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.Map;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
+import org.apache.iceberg.spark.SparkCatalog;
+import org.apache.iceberg.spark.SparkCatalogTestBase;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runners.Parameterized;
+
+public class TestDropTable extends SparkCatalogTestBase {
+  private final boolean shouldPurgeDataAndMetadata;
+
+  public TestDropTable(String catalogName, String implementation, Map<String, String> config) {
+    super(catalogName, implementation, config);
+    if (config.containsKey(SparkCatalog.PURGE_DATA_AND_METADATA)) {
+      this.shouldPurgeDataAndMetadata = Boolean.parseBoolean(config.get(SparkCatalog.PURGE_DATA_AND_METADATA));
+    } else {
+      this.shouldPurgeDataAndMetadata = SparkCatalog.PURGE_DATA_AND_METADATA_DEFAULT;
+    }
+  }
+
+  @Parameterized.Parameters(name = "catalogName = {0}, implementation = {1}, config = {2}")
+  public static Object[][] parameters() {
+    return new Object[][]{
+            {"testhive", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default"
+                    )},
+            {"testhadoop", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hadoop"
+                    )},
+            {"testhive_purge", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default",
+                            SparkCatalog.PURGE_DATA_AND_METADATA, "false"
+                    )}
+    };
+  }
+
+  @Test
+  public void testCreateTable() throws IOException {
+    Assert.assertFalse("Table should not already exist", validationCatalog.tableExists(tableIdent));
+
+    sql("CREATE TABLE %s USING iceberg as select 1 as value", tableName);
+
+    Table table = validationCatalog.loadTable(tableIdent);

Review comment:
       I think that for this case we can pull table information from Hive catalog, because we just need to know table path.
   
   But I can change it and pull table information via spark (like in `TestSparkCatalogHadoopOverrides `)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] edgarRd commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

edgarRd commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r678508567



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/SparkCatalog.java
##########
@@ -81,13 +81,17 @@
 public class SparkCatalog extends BaseCatalog {
   private static final Set<String> DEFAULT_NS_KEYS = ImmutableSet.of(TableCatalog.PROP_OWNER);
 
+  public static final String PURGE_DATA_AND_METADATA = "purge-data-and-metadata";

Review comment:
       I don't think `SparkCatalog` is the right place to set an Iceberg catalog property - i.e. what about Trino and Hive compute engines? Would they have a different behavior or not have the property at all?
   
   I think if we wanted to add such a property, it should follow the same pattern as other catalog properties specifying it at the catalog level (i.e. a property in `CatalogProperties` and read via the properties passed to the catalog via `initialize` method.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] sshkvar commented on pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

sshkvar commented on pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#issuecomment-895194533


   @kbendick I have changed test and now use `spark` catalog for getting Table info. 
   Also I rebased this branch to have only 1 commit. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] kbendick commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

kbendick commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r676363177



##########
File path: spark3/src/test/java/org/apache/iceberg/spark/sql/TestDropTable.java
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.spark.sql;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.Map;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
+import org.apache.iceberg.spark.SparkCatalog;
+import org.apache.iceberg.spark.SparkCatalogTestBase;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runners.Parameterized;
+
+public class TestDropTable extends SparkCatalogTestBase {
+  private final boolean shouldPurgeDataAndMetadata;
+
+  public TestDropTable(String catalogName, String implementation, Map<String, String> config) {
+    super(catalogName, implementation, config);
+    if (config.containsKey(SparkCatalog.PURGE_DATA_AND_METADATA)) {
+      this.shouldPurgeDataAndMetadata = Boolean.parseBoolean(config.get(SparkCatalog.PURGE_DATA_AND_METADATA));
+    } else {
+      this.shouldPurgeDataAndMetadata = SparkCatalog.PURGE_DATA_AND_METADATA_DEFAULT;
+    }
+  }
+
+  @Parameterized.Parameters(name = "catalogName = {0}, implementation = {1}, config = {2}")
+  public static Object[][] parameters() {

Review comment:
       Nit: Usually I feel like I see parameters going above the constructor, but I'm not sure if there's a hard rule for that or not. Partially bringing it up to hear from others if that's considered a style best practice here or not :) 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] sshkvar commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

sshkvar commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r676424747



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/SparkCatalog.java
##########
@@ -400,6 +404,7 @@ public final void initialize(String name, CaseInsensitiveStringMap options) {
             .toArray(new String[0]);
       }
     }
+    this.purgeDataAndMetadata = options.getBoolean(PURGE_DATA_AND_METADATA, PURGE_DATA_AND_METADATA_DEFAULT);

Review comment:
       We can use this property in Spark Catalog, but from my point of view it is better to have additional property. 
   But I can change it 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] sshkvar commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

sshkvar commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r676423984



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/SparkCatalog.java
##########
@@ -81,13 +81,17 @@
 public class SparkCatalog extends BaseCatalog {
   private static final Set<String> DEFAULT_NS_KEYS = ImmutableSet.of(TableCatalog.PROP_OWNER);
 
+  public static final String PURGE_DATA_AND_METADATA = "purge-data-and-metadata";

Review comment:
       Actually will use value of this property in Hive and Hadoop catalog, please take a look [on this method in Hive catalo, which will be called from spark catalog](https://github.com/apache/iceberg/blob/7779953bd2dad7990725391f9d5d8be077a7048d/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java#L148)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] sshkvar commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

sshkvar commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r680916848



##########
File path: spark3/src/test/java/org/apache/iceberg/spark/sql/TestDropTable.java
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.spark.sql;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.Map;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
+import org.apache.iceberg.spark.SparkCatalog;
+import org.apache.iceberg.spark.SparkCatalogTestBase;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runners.Parameterized;
+
+public class TestDropTable extends SparkCatalogTestBase {
+  private final boolean shouldPurgeDataAndMetadata;
+
+  public TestDropTable(String catalogName, String implementation, Map<String, String> config) {
+    super(catalogName, implementation, config);
+    if (config.containsKey(SparkCatalog.PURGE_DATA_AND_METADATA)) {
+      this.shouldPurgeDataAndMetadata = Boolean.parseBoolean(config.get(SparkCatalog.PURGE_DATA_AND_METADATA));
+    } else {
+      this.shouldPurgeDataAndMetadata = SparkCatalog.PURGE_DATA_AND_METADATA_DEFAULT;
+    }
+  }
+
+  @Parameterized.Parameters(name = "catalogName = {0}, implementation = {1}, config = {2}")
+  public static Object[][] parameters() {
+    return new Object[][]{
+            {"testhive", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default"
+                    )},
+            {"testhadoop", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hadoop"
+                    )},
+            {"testhive_purge", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default",
+                            SparkCatalog.PURGE_DATA_AND_METADATA, "false"
+                    )}
+    };
+  }
+
+  @Test
+  public void testCreateTable() throws IOException {
+    Assert.assertFalse("Table should not already exist", validationCatalog.tableExists(tableIdent));
+
+    sql("CREATE TABLE %s USING iceberg as select 1 as value", tableName);
+
+    Table table = validationCatalog.loadTable(tableIdent);

Review comment:
       I think you a right, so I added the same method `getIcebergTableFromSparkCatalog` as in `TestSparkCatalogHadoopOverrides ` and used it in test 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] kbendick commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

kbendick commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r680596282



##########
File path: spark3/src/test/java/org/apache/iceberg/spark/sql/TestDropTable.java
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.spark.sql;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.Map;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
+import org.apache.iceberg.spark.SparkCatalog;
+import org.apache.iceberg.spark.SparkCatalogTestBase;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runners.Parameterized;
+
+public class TestDropTable extends SparkCatalogTestBase {
+  private final boolean shouldPurgeDataAndMetadata;
+
+  public TestDropTable(String catalogName, String implementation, Map<String, String> config) {
+    super(catalogName, implementation, config);
+    if (config.containsKey(SparkCatalog.PURGE_DATA_AND_METADATA)) {
+      this.shouldPurgeDataAndMetadata = Boolean.parseBoolean(config.get(SparkCatalog.PURGE_DATA_AND_METADATA));
+    } else {
+      this.shouldPurgeDataAndMetadata = SparkCatalog.PURGE_DATA_AND_METADATA_DEFAULT;
+    }
+  }
+
+  @Parameterized.Parameters(name = "catalogName = {0}, implementation = {1}, config = {2}")
+  public static Object[][] parameters() {
+    return new Object[][]{
+            {"testhive", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default"
+                    )},
+            {"testhadoop", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hadoop"
+                    )},
+            {"testhive_purge", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default",
+                            SparkCatalog.PURGE_DATA_AND_METADATA, "false"
+                    )}
+    };
+  }
+
+  @Test
+  public void testCreateTable() throws IOException {
+    Assert.assertFalse("Table should not already exist", validationCatalog.tableExists(tableIdent));
+
+    sql("CREATE TABLE %s USING iceberg as select 1 as value", tableName);
+
+    Table table = validationCatalog.loadTable(tableIdent);

Review comment:
       If you could, I think that would be preferable to some. This `validationCatalog` is somewhat more of a mock than what you'll get from `spark.catalog`.
   
   I will look into adding a method to handle this for users to the base catalog unit test =)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] kbendick commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

kbendick commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r676361759



##########
File path: spark3/src/test/java/org/apache/iceberg/spark/sql/TestDropTable.java
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.spark.sql;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.Map;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
+import org.apache.iceberg.spark.SparkCatalog;
+import org.apache.iceberg.spark.SparkCatalogTestBase;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runners.Parameterized;
+
+public class TestDropTable extends SparkCatalogTestBase {
+  private final boolean shouldPurgeDataAndMetadata;
+
+  public TestDropTable(String catalogName, String implementation, Map<String, String> config) {
+    super(catalogName, implementation, config);
+    if (config.containsKey(SparkCatalog.PURGE_DATA_AND_METADATA)) {
+      this.shouldPurgeDataAndMetadata = Boolean.parseBoolean(config.get(SparkCatalog.PURGE_DATA_AND_METADATA));
+    } else {
+      this.shouldPurgeDataAndMetadata = SparkCatalog.PURGE_DATA_AND_METADATA_DEFAULT;
+    }
+  }
+
+  @Parameterized.Parameters(name = "catalogName = {0}, implementation = {1}, config = {2}")
+  public static Object[][] parameters() {
+    return new Object[][]{
+            {"testhive", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default"
+                    )},
+            {"testhadoop", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hadoop"
+                    )},
+            {"testhive_purge", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default",
+                            SparkCatalog.PURGE_DATA_AND_METADATA, "false"
+                    )}
+    };
+  }

Review comment:
       Nit - These appear over indented.
   
   There should be other examples that have the correct formatting. Maybe take a look at `TestSparkCatalogHadoopOverrides`, which has a very similar set of parameters (with proper spacing).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] sshkvar commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

sshkvar commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r678052658



##########
File path: spark3/src/test/java/org/apache/iceberg/spark/sql/TestDropTable.java
##########
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.spark.sql;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.Map;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
+import org.apache.iceberg.spark.SparkCatalog;
+import org.apache.iceberg.spark.SparkCatalogTestBase;
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runners.Parameterized;
+
+public class TestDropTable extends SparkCatalogTestBase {
+  private final boolean shouldPurgeDataAndMetadata;
+
+  public TestDropTable(String catalogName, String implementation, Map<String, String> config) {
+    super(catalogName, implementation, config);
+    if (config.containsKey(SparkCatalog.PURGE_DATA_AND_METADATA)) {
+      this.shouldPurgeDataAndMetadata = Boolean.parseBoolean(config.get(SparkCatalog.PURGE_DATA_AND_METADATA));
+    } else {
+      this.shouldPurgeDataAndMetadata = SparkCatalog.PURGE_DATA_AND_METADATA_DEFAULT;
+    }
+  }
+
+  @Parameterized.Parameters(name = "catalogName = {0}, implementation = {1}, config = {2}")
+  public static Object[][] parameters() {
+    return new Object[][]{
+            {"testhive", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default"
+                    )},
+            {"testhadoop", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hadoop"
+                    )},
+            {"testhive_purge", SparkCatalog.class.getName(),
+                    ImmutableMap.of(
+                            "type", "hive",
+                            "default-namespace", "default",
+                            SparkCatalog.PURGE_DATA_AND_METADATA, "false"
+                    )}
+    };
+  }

Review comment:
       done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] openinx commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

openinx commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r676260116



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/SparkCatalog.java
##########
@@ -81,13 +81,17 @@
 public class SparkCatalog extends BaseCatalog {
   private static final Set<String> DEFAULT_NS_KEYS = ImmutableSet.of(TableCatalog.PROP_OWNER);
 
+  public static final String PURGE_DATA_AND_METADATA = "purge-data-and-metadata";

Review comment:
       If we plan to introduce such a configure key,  it should be applied to all the catalogs (such as hive,hadoop and other custom catalogs)  , instead of a specific property for spark catalog ? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] openinx commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

openinx commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r676261517



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/SparkCatalog.java
##########
@@ -400,6 +404,7 @@ public final void initialize(String name, CaseInsensitiveStringMap options) {
             .toArray(new String[0]);
       }
     }
+    this.purgeDataAndMetadata = options.getBoolean(PURGE_DATA_AND_METADATA, PURGE_DATA_AND_METADATA_DEFAULT);

Review comment:
       We now have a `gc.enabled` property for the similar behavior ( https://github.com/apache/iceberg/pull/1796), could we introduce the existing property to this spark catalog ?  Sounds like the `gc.enabled=false` will keep all the data files but still remove the metadata files while what you expected is keep both the data files and metafiles..  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] sshkvar commented on a change in pull request #2854: Spark catalog: fixed hardcoded purge option in drop table action

Posted by GitBox <gi...@apache.org>.

sshkvar commented on a change in pull request #2854:
URL: https://github.com/apache/iceberg/pull/2854#discussion_r678873441



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/SparkCatalog.java
##########
@@ -81,13 +81,17 @@
 public class SparkCatalog extends BaseCatalog {
   private static final Set<String> DEFAULT_NS_KEYS = ImmutableSet.of(TableCatalog.PROP_OWNER);
 
+  public static final String PURGE_DATA_AND_METADATA = "purge-data-and-metadata";

Review comment:
       > I don't think `SparkCatalog` is the right place to set an Iceberg catalog property - i.e. what about Trino and Hive compute engines? Would they have a different behavior or not have the property at all?
   
   Not sure that I understand why `SparkCatalog` is not a right place, please take a look on current implementation of [dropTable in SparkCatalog](https://github.com/apache/iceberg/blob/master/spark3/src/main/java/org/apache/iceberg/spark/SparkCatalog.java#L237).  As you can see we are calling [dropTable method from Catalog.java ](https://github.com/apache/iceberg/blob/master/api/src/main/java/org/apache/iceberg/catalog/Catalog.java#L284) where we always pass `true` for purge flag. So I just added ability to make it configurable via property passed to a catalog.
   Currently Trino not dropping any data and metadata, just calls drop table from hive metastore
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org