You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by "dpaani (via GitHub)" <gi...@apache.org> on 2023/03/28 15:56:31 UTC

[GitHub] [iceberg] dpaani opened a new pull request, #7228: Fix for Drop the SQL issue when attempting to drop an Iceberg table

dpaani opened a new pull request, #7228:
URL: https://github.com/apache/iceberg/pull/7228

   Fix for Drop the SQL issue when attempting to drop an Iceberg table whose location does not exist - https://github.com/apache/iceberg/issues/7227


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1169526420


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      return loadTable(getSessionCatalog(), ident, version, timestamp);
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadCatalogTableWhenMetadataNotFoundEnabled()) {

Review Comment:
   Okay, I see `DropTableExec` implementation now. I agree about differentiating between table exists vs being able to load table metadata. We can change `tableExists` in Iceberg catalogs to verify if the metastore pointer exists rather that trying to load the table and then overload `tableExists` inside Spark catalogs.
   
   I feel we can still implement what I suggested [above](https://github.com/apache/iceberg/pull/7228/files#r1157829683) for the actual cleanup.
   
   @dpaani, that's unfortunate. One idea to overcome that issue is to throw a new exception type like `CorruptedMetadataException` in our Iceberg catalogs whenever the metadata file is missing. We can then catch this inside our Spark catalogs and return a special Spark table.
   
   Let me think more tomorrow. Any better ideas, @RussellSpitzer?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1157834087


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      return loadTable(getSessionCatalog(), ident, version, timestamp);
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadCatalogTableWhenMetadataNotFoundEnabled()) {

Review Comment:
   I don't think we need any Spark session configs for that. It is up to Iceberg catalogs to verify it is a valid Iceberg table. We can't make the assumption on how the catalog handles Iceberg metadata.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1157756506


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      return loadTable(getSessionCatalog(), ident, version, timestamp);
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadCatalogTableWhenMetadataNotFoundEnabled()) {

Review Comment:
   It feels a bit hacky to me. I wonder whether that unregister table procedure may be a better option.
   Let me think.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] dpaani commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1162008555


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      return loadTable(getSessionCatalog(), ident, version, timestamp);
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadCatalogTableWhenMetadataNotFoundEnabled()) {

Review Comment:
   hi @aokolnychyi @RussellSpitzer I tried to move this logic to tableExists method, but it looks like that also wont work.  Spark calls loadTable when it tries to convert Unresolvedrelation to resolved table in 
   
   https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L1151
   
   ```
         case u @ UnresolvedTableOrView(identifier, cmd, allowTempView) =>
           lookupTableOrView(identifier).map {
             case v: ResolvedView if v.isTemp && !allowTempView =>
               throw QueryCompilationErrors.expectTableOrPermanentViewNotTempViewError(
                 identifier.quoted, cmd, u)
             case other => other
           }.getOrElse(u)
   
       private def lookupTableOrView(identifier: Seq[String]): Option[LogicalPlan] = {
         lookupTempView(identifier).map { _ =>
           ResolvedView(identifier.asIdentifier, isTemp = true)
         }.orElse {
           expandIdentifier(identifier) match {
             case CatalogAndIdentifier(catalog, ident) =>
               **CatalogV2Util.loadTable(catalog, ident)**.map {
                 case v1Table: V1Table if CatalogV2Util.isSessionCatalog(catalog) &&
                   v1Table.v1Table.tableType == CatalogTableType.VIEW =>
                   ResolvedView(ident, isTemp = false)
                 case table =>
                   ResolvedTable.create(catalog.asTableCatalog, ident, table)
               }
             case _ => None
           }
         }
       }
   ```
   
   Could you please advise what else I can try here. TIA
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1152399839


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      throw e;
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadFromSessionCatalogOnLocationNotFoundEnabled()) {

Review Comment:
   I think this goes the the broader point that we don't want the responsibility here on the "loadTableInternal". This should be the responsibility of the delete method correct?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] dpaani commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1152525505


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      throw e;
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadFromSessionCatalogOnLocationNotFoundEnabled()) {

Review Comment:
   hi @RussellSpitzer drop SQL first calls loadTable method internally . Here is the trace:
   ```
   at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:208)
   	at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:185)
   	at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:180)
   	at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:176)
   	at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:97)
   	at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:80)
   	at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:47)
   	at org.apache.iceberg.spark.SparkCatalog.load(SparkCatalog.java:625)
   	at org.apache.iceberg.spark.SparkCatalog.loadTable(SparkCatalog.java:151)
   	at org.apache.iceberg.spark.SparkSessionCatalog.loadTable(SparkSessionCatalog.java:415)
   	at org.apache.iceberg.spark.SparkSessionCatalog.loadTableInternal(SparkSessionCatalog.java:154)
   	at org.apache.iceberg.spark.SparkSessionCatalog.loadTable(SparkSessionCatalog.java:138)
   	at org.apache.spark.sql.connector.catalog.CatalogV2Util$.loadTable(CatalogV2Util.scala:311)
   	at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.$anonfun$lookupTableOrView$2(Analyzer.scala:1134)
   	at scala.Option.orElse(Option.scala:447)
   ```
   
   Ref: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropTableExec.scala
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1157795034


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      return loadTable(getSessionCatalog(), ident, version, timestamp);
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadCatalogTableWhenMetadataNotFoundEnabled()) {

Review Comment:
   We discussed possibly just overriding tableExists so that if we have a "unable to read file exception" from loadTable we can still say the table exists as an Iceberg table. Further catalog attempts to use the table would fail, but delete would succeed. If we want we could also gate that behind session param.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] dpaani commented on pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#issuecomment-1487219851

   Hi Team,
     Could you please help to review this PR. TIA
   
   cc : @flyrain @anuragmantri @RussellSpitzer 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1157834087


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      return loadTable(getSessionCatalog(), ident, version, timestamp);
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadCatalogTableWhenMetadataNotFoundEnabled()) {

Review Comment:
   I don't think we need any Spark session configs for that. It is up to Iceberg catalogs to verify it is a valid Iceberg table. We can't make assumptions on how the catalog handles Iceberg metadata.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1157795034


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      return loadTable(getSessionCatalog(), ident, version, timestamp);
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadCatalogTableWhenMetadataNotFoundEnabled()) {

Review Comment:
   We discussed possibly just overriding tableExists in BaseCatalog so that if we have a "unable to read file exception" from loadTable we can still say the table exists as an Iceberg table. Further catalog attempts to use the table would fail, but delete would succeed. If we want we could also gate that behind session param.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] dpaani commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1152381961


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      throw e;
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadFromSessionCatalogOnLocationNotFoundEnabled()) {

Review Comment:
   Sure @RussellSpitzer . 
   1. Spark Catalog - Do we need to fix this there? --> Only SparkSessionCatalog has delegate catalog. SparkCatalog only works with iceberg table. In this case, it is not a valid iceberg table since metadata folder does not exist.
   2.  Drop table even if metadata cannot be loaded. Maybe a Spark Conf? --> Having spark config is good idea. By default, we can keep this config as false.
   3. We can maybe do this by checking for MetadataLocation? --> This is also safe thing to do. I thought of checking this only if spark conf is turned on. 
   
   With my proposed change.. it will not break existing functionality and will not be enabled by default. If spark conf set to true, and table contains metadata_location property ( valid iceberg table earlier) and then only load from metastore and delete it. Once metadata does not exist.. this table cant behave as Iceberg table anymore and safer to delete if user requested
   
   ```
     private Table loadTableInternal(Identifier ident, String version, Long timestamp)
         throws NoSuchTableException {
       try {
         return loadTable(icebergCatalog, ident, version, timestamp);
       } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
         throw e;
       } catch (org.apache.iceberg.exceptions.NotFoundException e) {
         **if (loadFromSessionCatalogOnLocationNotFoundEnabled()) {** // check config is set
           Table table = loadTable(getSessionCatalog(), ident, version, timestamp);
           if (table.properties() != null
               && **table.properties().containsKey(TableProperties.METADATA_LOCATION)**) { // check iceberg table
             return table;
           }
         }
         throw e;
       }
     }
   
     private boolean loadFromSessionCatalogOnLocationNotFoundEnabled() {
       return Boolean.parseBoolean(
           SparkSession.active()
               .conf()
               .get(
                   SparkSQLProperties.LOAD_FROM_SESSION_CATALOG_ON_LOCATION_NOT_FOUND_ENABLED,
                   SparkSQLProperties
                       .LOAD_FROM_SESSION_CATALOG_ON_LOCATION_NOT_FOUND_ENABLED_DEFAULT));
     }
   
   
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] dpaani commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1152634297


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      throw e;
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadFromSessionCatalogOnLocationNotFoundEnabled()) {

Review Comment:
   Another option is, overriding DropTableExec Physical plan from IcebergSparkSessionExtensions and handle FileNotFound exception



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1169526420


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      return loadTable(getSessionCatalog(), ident, version, timestamp);
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadCatalogTableWhenMetadataNotFoundEnabled()) {

Review Comment:
   Okay, I see `DropTableExec` implementation now. I agree about differentiating between table exists vs being able to load table metadata. We can change `tableExists` in Iceberg catalogs to verify if the metastore pointer exists rather that trying to load the table and then overload `tableExists` inside Spark catalogs.
   
   I feel we can still implement what I suggested [above](https://github.com/apache/iceberg/pull/7228/files#r1157829683) for the actual cleanup.
   
   @dpaani, that's unfortunate. One idea to overcome that issue is to throw a new exception type like `CorruptedMetadataException` in our Iceberg catalogs whenever the metadata file is missing but the table pointer exists. We can then catch this inside our Spark catalogs and return a special Spark table.
   
   Let me think more tomorrow. Any better ideas, @RussellSpitzer?



##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      return loadTable(getSessionCatalog(), ident, version, timestamp);
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadCatalogTableWhenMetadataNotFoundEnabled()) {

Review Comment:
   Okay, I see `DropTableExec` implementation now. I agree about differentiating between table exists vs being able to load table metadata. We can change `tableExists` in Iceberg catalogs to verify if the metastore pointer exists rather that trying to load the table and then overload `tableExists` inside Spark catalogs.
   
   I feel we can still implement what I suggested [above](https://github.com/apache/iceberg/pull/7228/files#r1157829683) for the actual cleanup.
   
   @dpaani, that's unfortunate. One idea to overcome that issue is to throw a new exception type like `CorruptedMetadataException` in our Iceberg catalogs whenever the metadata file is missing but the table pointer exists. We can then catch this inside our Spark catalogs and return a special dummy Spark table.
   
   Let me think more tomorrow. Any better ideas, @RussellSpitzer?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1151145734


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -136,7 +136,7 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
   public Table loadTable(Identifier ident) throws NoSuchTableException {
     try {
       return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
+    } catch (NoSuchTableException | org.apache.iceberg.exceptions.NotFoundException e) {

Review Comment:
   Why do we only do this in SparkSessionCatalog? Shouldn't the same issue exist with SparkCatalog? I also wonder if we should have a session parameter or something to enable this behavior. Basically in this case we are saying drop table should drop any table that has a catalog entry even if we can't determine it is an Iceberg table.
   
   I think we can probably make this a little safer by checking whether the table has the metadata_location property set in the catalog? I'm open to other suggestions as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] dpaani commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1152353848


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      throw e;
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadFromSessionCatalogOnLocationNotFoundEnabled()) {

Review Comment:
   Hi @RussellSpitzer , could you please confirm this is how you were expecting? If not, please advise.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1152366610


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      throw e;
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadFromSessionCatalogOnLocationNotFoundEnabled()) {

Review Comment:
   not quite. I had few independent thoughts
   
   1.) Spark Catalog - Do we need to fix this there?
   2.) Should we gate this behind a parameter? IE - Drop table even if metadata cannot be loaded. Maybe a Spark Conf?
   3.) Should we do something to check whether the table we are about to drop is an iceberg table? We can maybe do this by checking for MetadataLocation? This would be separate from 2
   
   I would want to discuss all of these points. Just because a reviewer suggests something doesn't mean we need to necessarily make that change, but we should address them.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1152398584


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      throw e;
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadFromSessionCatalogOnLocationNotFoundEnabled()) {

Review Comment:
   If we have a "notFoundException" doesn't that mean we had to have loaded the MetadataLocation? Doesn't that come from the IO?
   
   The property should probably be named directly to it's purpose. The purpose here is something like "ENABLE_DELETE_WHEN_METADATA_NOT_FOUND"



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1157829683


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      return loadTable(getSessionCatalog(), ident, version, timestamp);
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadCatalogTableWhenMetadataNotFoundEnabled()) {

Review Comment:
   I think we are trying to fix this problem at a wrong level. I feel Spark logic should be as follows:
   
   - Load Iceberg table if possible. If not possible, it is OK.
   - Call icebergCatalog.drop(ident, false) to drop the pointer no matter whether the first step was successful.
   - If Iceberg table is not null and metadata file still exists, call the cleanup action.
   
   All we would need to do is to adapt our Iceberg catalog implementations to drop pointers even if the metadata file does not exist. That's where we can check if the catalog entry seems like an Iceberg table. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] dpaani commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1160336120


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      return loadTable(getSessionCatalog(), ident, version, timestamp);
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadCatalogTableWhenMetadataNotFoundEnabled()) {

Review Comment:
   Thanks @RussellSpitzer . 
   
   hi @aokolnychyi. I agree with Russell. Can I make the change suggested by Russell? Please advise. Thanks



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7228: Fix for Drop SQL issue when attempting to drop an Iceberg table

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1160284760


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
 
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }
 
   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      return loadTable(getSessionCatalog(), ident, version, timestamp);
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadCatalogTableWhenMetadataNotFoundEnabled()) {

Review Comment:
   The Spark execution plan requires calling "tableExists" before an invocation of "dropTable" so we can't fix it purely within our implementation unless we also adjust "tableExists" That's why I had suggested fixing "tableExists" so that it doesn't fail just because "loadTable" also failed, but would return true as along as the exception had to do with something Iceberg related like attempting to load metadata.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org