You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/12/08 17:14:20 UTC

[GitHub] [iceberg] RussellSpitzer opened a new pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

RussellSpitzer opened a new pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890


   Previous procedures would take namespace and table seperately. In order to
   facilitate future developments in table identifiers like path based identifiers,
   we have switched to using single string parameters for identifying the table.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538787421



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -605,31 +605,41 @@ private static String sqlString(org.apache.iceberg.expressions.Literal<?> lit) {
   }
 
   public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name) throws ParseException {
+    return catalogAndIdentifier(spark, name, spark.sessionState().catalogManager().currentCatalog());
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name,
+                                                          CatalogPlugin defaultCatalog) throws ParseException {
     ParserInterface parser = spark.sessionState().sqlParser();
     Seq<String> multiPartIdentifier = parser.parseMultipartIdentifier(name);
     List<String> javaMultiPartIdentifier = JavaConverters.seqAsJavaList(multiPartIdentifier);
-    return catalogAndIdentifier(spark, javaMultiPartIdentifier);
+    return catalogAndIdentifier(spark, javaMultiPartIdentifier, defaultCatalog);
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+    return catalogAndIdentifier(spark, nameParts, spark.sessionState().catalogManager().currentCatalog());
   }
 
   /**
    * A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply
    * Attempts to find the catalog and identifier a multipart identifier represents
    * @param spark Spark session to use for resolution
    * @param nameParts Multipart identifier representing a table
+   * @param fallBackCatalog Catalog to use if none is specified

Review comment:
       nit: I think we call it `fallBackCatalog` here and `defaultCatalog` above




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538626562



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier, tableCatalog);

Review comment:
       I think I missed an update here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#issuecomment-741853376


   I've merged this. Thanks, everyone!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538635991



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -605,31 +605,41 @@ private static String sqlString(org.apache.iceberg.expressions.Literal<?> lit) {
   }
 
   public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name) throws ParseException {
+    return catalogAndIdentifier(spark, name, spark.sessionState().catalogManager().currentCatalog());
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name,
+                                                            CatalogPlugin fallBackCatalog) throws ParseException {

Review comment:
       I wish I could teach IDEA this, every time I run reformat it swaps it back




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538789306



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -48,20 +47,18 @@ protected BaseProcedure(TableCatalog tableCatalog) {
     this.tableCatalog = tableCatalog;
   }
 
-  protected <T> T modifyIcebergTable(String namespace, String tableName, Function<org.apache.iceberg.Table, T> func) {
-    return execute(namespace, tableName, true, func);
+  protected <T> T modifyIcebergTable(String identifierAsString, Function<org.apache.iceberg.Table, T> func) {
+    return execute(identifierAsString, true, func);
   }
 
-  protected <T> T withIcebergTable(String namespace, String tableName, Function<org.apache.iceberg.Table, T> func) {
-    return execute(namespace, tableName, false, func);
+  protected <T> T withIcebergTable(String identifierAsString, Function<org.apache.iceberg.Table, T> func) {
+    return execute(identifierAsString, false, func);
   }
 
-  private <T> T execute(String namespace, String tableName, boolean refreshSparkCache,
+  private <T> T execute(String identifierAsString, boolean refreshSparkCache,
                         Function<org.apache.iceberg.Table, T> func) {
-    Preconditions.checkArgument(namespace != null && !namespace.isEmpty(), "Namespace cannot be empty");

Review comment:
       Should we keep one precondition for `identifierAsString`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538767352



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -605,31 +605,41 @@ private static String sqlString(org.apache.iceberg.expressions.Literal<?> lit) {
   }
 
   public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name) throws ParseException {
+    return catalogAndIdentifier(spark, name, spark.sessionState().catalogManager().currentCatalog());
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name,
+                                                          CatalogPlugin defaultCatalog) throws ParseException {
     ParserInterface parser = spark.sessionState().sqlParser();
     Seq<String> multiPartIdentifier = parser.parseMultipartIdentifier(name);
     List<String> javaMultiPartIdentifier = JavaConverters.seqAsJavaList(multiPartIdentifier);
-    return catalogAndIdentifier(spark, javaMultiPartIdentifier);
+    return catalogAndIdentifier(spark, javaMultiPartIdentifier, defaultCatalog);
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+    return catalogAndIdentifier(spark, nameParts, spark.sessionState().catalogManager().currentCatalog());
   }
 
   /**
    * A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply
    * Attempts to find the catalog and identifier a multipart identifier represents
    * @param spark Spark session to use for resolution
    * @param nameParts Multipart identifier representing a table
+   * @param fallBackCatalog Catalog to use if none is specified
    * @return The CatalogPlugin and Identifier for the table
    */
-  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts,
+                                                          CatalogPlugin fallBackCatalog) {
     Preconditions.checkArgument(!nameParts.isEmpty(),
         "Cannot determine catalog and Identifier from empty name parts");
     CatalogManager catalogManager = spark.sessionState().catalogManager();
-    CatalogPlugin currentCatalog = catalogManager.currentCatalog();
     String[] currentNamespace = catalogManager.currentNamespace();

Review comment:
       Yeah I was wondering a bit about this, I think that's a reasonable solve




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538792561



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -605,31 +605,41 @@ private static String sqlString(org.apache.iceberg.expressions.Literal<?> lit) {
   }
 
   public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name) throws ParseException {
+    return catalogAndIdentifier(spark, name, spark.sessionState().catalogManager().currentCatalog());
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name,
+                                                          CatalogPlugin defaultCatalog) throws ParseException {
     ParserInterface parser = spark.sessionState().sqlParser();
     Seq<String> multiPartIdentifier = parser.parseMultipartIdentifier(name);
     List<String> javaMultiPartIdentifier = JavaConverters.seqAsJavaList(multiPartIdentifier);
-    return catalogAndIdentifier(spark, javaMultiPartIdentifier);
+    return catalogAndIdentifier(spark, javaMultiPartIdentifier, defaultCatalog);
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+    return catalogAndIdentifier(spark, nameParts, spark.sessionState().catalogManager().currentCatalog());
   }
 
   /**
    * A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply
    * Attempts to find the catalog and identifier a multipart identifier represents
    * @param spark Spark session to use for resolution
    * @param nameParts Multipart identifier representing a table
+   * @param fallBackCatalog Catalog to use if none is specified
    * @return The CatalogPlugin and Identifier for the table
    */
-  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts,
+                                                          CatalogPlugin fallBackCatalog) {
     Preconditions.checkArgument(!nameParts.isEmpty(),
         "Cannot determine catalog and Identifier from empty name parts");
     CatalogManager catalogManager = spark.sessionState().catalogManager();
-    CatalogPlugin currentCatalog = catalogManager.currentCatalog();
     String[] currentNamespace = catalogManager.currentNamespace();

Review comment:
       The current namespace is whatever you've set with `USE`. The default namespace is the default when you change catalogs without specifying a namespace (`USE CATALOG cat`).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538634891



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;

Review comment:
       Can do




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538769042



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier, tableCatalog);
     } catch (ParseException e) {
-      throw new RuntimeException("Couldn't parse identifier: " + identifierAsString, e);
+      throw new IllegalArgumentException("Cannot parse identifier", e);
     }
+
+    return catalogAndIdentifier.identifier();

Review comment:
       Looks like this validation is still here?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538622083



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier, tableCatalog);
     } catch (ParseException e) {
-      throw new RuntimeException("Couldn't parse identifier: " + identifierAsString, e);
+      throw new IllegalArgumentException("Cannot parse identifier", e);
     }
+
+    return catalogAndIdentifier.identifier();

Review comment:
       We should validate the resolved catalog is `tableCatalog` to make sure we don't modify tables in other catalogs.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r539078645



##########
File path: spark3-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteManifestsProcedure.java
##########
@@ -162,22 +157,18 @@ public void testInvalidRewriteManifestsCases() {
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.rewrite_manifests('n')", catalogName));
+        () -> sql("CALL %s.system.rewrite_manifests()", catalogName));
 
     AssertHelpers.assertThrows("Should reject calls with invalid arg types",
-        RuntimeException.class, "Couldn't parse identifier",
+        AnalysisException.class, "Wrong arg type",
         () -> sql("CALL %s.system.rewrite_manifests('n', 2.2)", catalogName));
 
-    AssertHelpers.assertThrows("Should reject empty namespace",
-        IllegalArgumentException.class, "Namespace cannot be empty",
-        () -> sql("CALL %s.system.rewrite_manifests('', 't')", catalogName));
-
-    AssertHelpers.assertThrows("Should reject empty table name",
-        IllegalArgumentException.class, "Table name cannot be empty",
-        () -> sql("CALL %s.system.rewrite_manifests('n', '')", catalogName));
-
     AssertHelpers.assertThrows("Should reject duplicate arg names name",
         AnalysisException.class, "Duplicate procedure argument: table",
         () -> sql("CALL %s.system.rewrite_manifests(table => 't', tAbLe => 't')", catalogName));
+
+    AssertHelpers.assertThrows("Should reject calls without empty table identifier",

Review comment:
       nit: with?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538768244



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -74,26 +71,20 @@ protected BaseProcedure(TableCatalog tableCatalog) {
     return result;
   }
 
-  // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  private Identifier toIdentifier(String identifierAsString) {
+    CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifierAsString, tableCatalog);
     } catch (ParseException e) {
-      throw new RuntimeException("Couldn't parse identifier: " + identifierAsString, e);
+      throw new IllegalArgumentException("Cannot parse identifier", e);
+    }
+    if (!catalogAndIdentifier.catalog().equals(tableCatalog)) {
+      throw new IllegalArgumentException(

Review comment:
       Use a precondition?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538790228



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -74,26 +71,20 @@ protected BaseProcedure(TableCatalog tableCatalog) {
     return result;
   }
 
-  // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  private Identifier toIdentifier(String identifierAsString) {
+    CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifierAsString, tableCatalog);
     } catch (ParseException e) {
-      throw new RuntimeException("Couldn't parse identifier: " + identifierAsString, e);
+      throw new IllegalArgumentException("Cannot parse identifier", e);

Review comment:
       This will require a bit more refactoring since our callers are usually via modifyIcebergTable/withIcebergTable
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r539157329



##########
File path: spark3-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestSetCurrentSnapshotProcedure.java
##########
@@ -196,30 +196,26 @@ public void testInvalidRollbackToSnapshotCases() {
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.set_current_snapshot('n', 't')", catalogName));
+        () -> sql("CALL %s.system.set_current_snapshot('t')", catalogName));
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.set_current_snapshot('n', 1L)", catalogName));
+        () -> sql("CALL %s.system.set_current_snapshot(1L)", catalogName));
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.set_current_snapshot(namespace => 'n', snapshot_id => 1L)", catalogName));
+        () -> sql("CALL %s.system.set_current_snapshot(snapshot_id => 1L)", catalogName));
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.set_current_snapshot(table => 't', snapshot_id => 1L)", catalogName));
+        () -> sql("CALL %s.system.set_current_snapshot(table => 't')", catalogName));
 
     AssertHelpers.assertThrows("Should reject calls with invalid arg types",
         AnalysisException.class, "Wrong arg type for snapshot_id: cannot cast",
-        () -> sql("CALL %s.system.set_current_snapshot('n', 't', 2.2)", catalogName));
+        () -> sql("CALL %s.system.set_current_snapshot('t', 2.2)", catalogName));
 
-    AssertHelpers.assertThrows("Should reject empty namespace",
-        IllegalArgumentException.class, "Namespace cannot be empty",
-        () -> sql("CALL %s.system.set_current_snapshot('', 't', 1L)", catalogName));
-
-    AssertHelpers.assertThrows("Should reject empty table name",
-        IllegalArgumentException.class, "Table name cannot be empty",
-        () -> sql("CALL %s.system.set_current_snapshot('n', '', 1L)", catalogName));
+    AssertHelpers.assertThrows("Should reject calls without empty table identifier",

Review comment:
       nit: with?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#issuecomment-741665068


   Since you guys are asleep right now, I went ahead and fixed the remanning minor points and submitted a [PR](https://github.com/RussellSpitzer/iceberg/pull/1) against Russell's repo. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538629181



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;

Review comment:
       Shall we do an import so that we can refer to `CatalogAndIdentifier` directly?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538624413



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier, tableCatalog);

Review comment:
       I did this based on some of the discussion me and Ryan were having
   
   ```
   We know we have a catalog if there is a registered catalog with the name
   
   
   
   
   
   11:45
   In Spark, the rules are:
   1. If the identifier is a single part, it is a catalog name. Fill in current catalog and current namespace
   2. If the first part of the identifier is a known catalog, it is a fully qualified name
   3. If the first part is not a known catalog, fill in the current catalog (not namespace because it is already there)
   11:46
   The only difference here would be that we consider the current catalog to be the procedure catalog because this is in the procedure's context
   ```
   
   I thought this was reasonable but we could force the catalog as well




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538816782



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -605,31 +605,41 @@ private static String sqlString(org.apache.iceberg.expressions.Literal<?> lit) {
   }
 
   public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name) throws ParseException {
+    return catalogAndIdentifier(spark, name, spark.sessionState().catalogManager().currentCatalog());
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name,
+                                                          CatalogPlugin defaultCatalog) throws ParseException {
     ParserInterface parser = spark.sessionState().sqlParser();
     Seq<String> multiPartIdentifier = parser.parseMultipartIdentifier(name);
     List<String> javaMultiPartIdentifier = JavaConverters.seqAsJavaList(multiPartIdentifier);
-    return catalogAndIdentifier(spark, javaMultiPartIdentifier);
+    return catalogAndIdentifier(spark, javaMultiPartIdentifier, defaultCatalog);
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+    return catalogAndIdentifier(spark, nameParts, spark.sessionState().catalogManager().currentCatalog());
   }
 
   /**
    * A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply
    * Attempts to find the catalog and identifier a multipart identifier represents
    * @param spark Spark session to use for resolution
    * @param nameParts Multipart identifier representing a table
+   * @param fallBackCatalog Catalog to use if none is specified
    * @return The CatalogPlugin and Identifier for the table
    */
-  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts,
+                                                          CatalogPlugin fallBackCatalog) {
     Preconditions.checkArgument(!nameParts.isEmpty(),
         "Cannot determine catalog and Identifier from empty name parts");
     CatalogManager catalogManager = spark.sessionState().catalogManager();
-    CatalogPlugin currentCatalog = catalogManager.currentCatalog();
     String[] currentNamespace = catalogManager.currentNamespace();

Review comment:
       Ugh :( 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538820376



##########
File path: spark3-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCherrypickSnapshotProcedure.java
##########
@@ -171,18 +167,10 @@ public void testInvalidCherrypickSnapshotCases() {
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.cherrypick_snapshot('n', 't')", catalogName));
+        () -> sql("CALL %s.system.cherrypick_snapshot('t')", catalogName));
 
     AssertHelpers.assertThrows("Should reject calls with invalid arg types",
         AnalysisException.class, "Wrong arg type for snapshot_id: cannot cast",
-        () -> sql("CALL %s.system.cherrypick_snapshot('n', 't', 2.2)", catalogName));
-
-    AssertHelpers.assertThrows("Should reject empty namespace",
-        IllegalArgumentException.class, "Namespace cannot be empty",
-        () -> sql("CALL %s.system.cherrypick_snapshot('', 't', 1L)", catalogName));
-
-    AssertHelpers.assertThrows("Should reject empty table name",

Review comment:
       added




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r539147452



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -605,31 +605,41 @@ private static String sqlString(org.apache.iceberg.expressions.Literal<?> lit) {
   }
 
   public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name) throws ParseException {
+    return catalogAndIdentifier(spark, name, spark.sessionState().catalogManager().currentCatalog());
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name,
+                                                          CatalogPlugin defaultCatalog) throws ParseException {
     ParserInterface parser = spark.sessionState().sqlParser();
     Seq<String> multiPartIdentifier = parser.parseMultipartIdentifier(name);
     List<String> javaMultiPartIdentifier = JavaConverters.seqAsJavaList(multiPartIdentifier);
-    return catalogAndIdentifier(spark, javaMultiPartIdentifier);
+    return catalogAndIdentifier(spark, javaMultiPartIdentifier, defaultCatalog);
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+    return catalogAndIdentifier(spark, nameParts, spark.sessionState().catalogManager().currentCatalog());
   }
 
   /**
    * A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply
    * Attempts to find the catalog and identifier a multipart identifier represents
    * @param spark Spark session to use for resolution
    * @param nameParts Multipart identifier representing a table
+   * @param fallBackCatalog Catalog to use if none is specified
    * @return The CatalogPlugin and Identifier for the table
    */
-  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts,
+                                                          CatalogPlugin fallBackCatalog) {
     Preconditions.checkArgument(!nameParts.isEmpty(),
         "Cannot determine catalog and Identifier from empty name parts");
     CatalogManager catalogManager = spark.sessionState().catalogManager();
-    CatalogPlugin currentCatalog = catalogManager.currentCatalog();
     String[] currentNamespace = catalogManager.currentNamespace();

Review comment:
       The current logic seems to be as discussed here so I am resolving this thread.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#issuecomment-740995874


   This looks good to me except the remaining open points.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538842536



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier, tableCatalog);
     } catch (ParseException e) {
-      throw new RuntimeException("Couldn't parse identifier: " + identifierAsString, e);
+      throw new IllegalArgumentException("Cannot parse identifier", e);
     }
+
+    return catalogAndIdentifier.identifier();

Review comment:
       The check definitely needs to happen somewhere, and the error should be specific: `"Cannot run stored procedure %s.%s: %s is not in catalog %s", catalogName, procName, tableName, catalogName`.
   
   I don't see a way for `loadSparkTable` to catch that the original identifier was actually for a different catalog because the catalog for this identifier is discarded.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r539318156



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,19 +73,18 @@ protected Identifier toIdentifier(String identifierAsString, String argName) {
     Preconditions.checkArgument(identifierAsString != null && !identifierAsString.isEmpty(),
         "Cannot handle an empty identifier for argument %s", argName);
 
-    CatalogAndIdentifier catalogAndIdentifier;
-    try {
-      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifierAsString, tableCatalog);
-    } catch (ParseException e) {
-      throw new IllegalArgumentException(String.format("Cannot parse identifier '%s' for argument %s",
-          identifierAsString, argName), e);
-    }
+    CatalogAndIdentifier catalogAndIdentifier = Spark3Util.catalogAndIdentifier(
+        "identifier for arg " + argName, spark, identifierAsString, tableCatalog);
+
+    CatalogPlugin catalog = catalogAndIdentifier.catalog();
+    Identifier identifier = catalogAndIdentifier.identifier();
 
-    Preconditions.checkArgument(catalogAndIdentifier.catalog().equals(tableCatalog), "Cannot run procedure" +
-        " in catalog '%s': Argument %s was set to '%s' which resolves to a table in a different catalog '%s'",
-        tableCatalog.name(), argName, identifierAsString, catalogAndIdentifier.catalog().name());
+    Preconditions.checkArgument(
+        catalog.equals(tableCatalog),
+        "Cannot run procedure in catalog %s: %s is a table in catalog %s",
+        tableCatalog.name(), identifierAsString, catalog);

Review comment:
       I missed to call `catalog.name()` on the third arg. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538782191



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -605,31 +605,41 @@ private static String sqlString(org.apache.iceberg.expressions.Literal<?> lit) {
   }
 
   public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name) throws ParseException {
+    return catalogAndIdentifier(spark, name, spark.sessionState().catalogManager().currentCatalog());
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name,
+                                                          CatalogPlugin defaultCatalog) throws ParseException {
     ParserInterface parser = spark.sessionState().sqlParser();
     Seq<String> multiPartIdentifier = parser.parseMultipartIdentifier(name);
     List<String> javaMultiPartIdentifier = JavaConverters.seqAsJavaList(multiPartIdentifier);
-    return catalogAndIdentifier(spark, javaMultiPartIdentifier);
+    return catalogAndIdentifier(spark, javaMultiPartIdentifier, defaultCatalog);
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+    return catalogAndIdentifier(spark, nameParts, spark.sessionState().catalogManager().currentCatalog());
   }
 
   /**
    * A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply
    * Attempts to find the catalog and identifier a multipart identifier represents
    * @param spark Spark session to use for resolution
    * @param nameParts Multipart identifier representing a table
+   * @param fallBackCatalog Catalog to use if none is specified
    * @return The CatalogPlugin and Identifier for the table
    */
-  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts,
+                                                          CatalogPlugin fallBackCatalog) {
     Preconditions.checkArgument(!nameParts.isEmpty(),
         "Cannot determine catalog and Identifier from empty name parts");
     CatalogManager catalogManager = spark.sessionState().catalogManager();
-    CatalogPlugin currentCatalog = catalogManager.currentCatalog();
     String[] currentNamespace = catalogManager.currentNamespace();

Review comment:
       ```scala
     def currentNamespace: Array[String] = synchronized {
       _currentNamespace.getOrElse {
         if (currentCatalog.name() == SESSION_CATALOG_NAME) {
           Array(v1SessionCatalog.getCurrentDatabase)
         } else {
           currentCatalog.defaultNamespace()
         }
       }
     }
   ```
   V1 always making things more difficult
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538698278



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/ExpireSnapshotsProcedure.java
##########
@@ -77,12 +76,11 @@ public StructType outputType() {
 
   @Override
   public InternalRow[] call(InternalRow args) {
-    String namespace = args.getString(0);
-    String tableName = args.getString(1);
-    Long olderThanMillis = args.isNullAt(2) ? null : DateTimeUtils.toMillis(args.getLong(2));
-    Integer retainLastNum = args.isNullAt(3) ? null : args.getInt(3);
+    String tableName = args.getString(0);

Review comment:
       sounds good, renames all around




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538768132



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -74,26 +71,20 @@ protected BaseProcedure(TableCatalog tableCatalog) {
     return result;
   }
 
-  // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  private Identifier toIdentifier(String identifierAsString) {
+    CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifierAsString, tableCatalog);
     } catch (ParseException e) {
-      throw new RuntimeException("Couldn't parse identifier: " + identifierAsString, e);
+      throw new IllegalArgumentException("Cannot parse identifier", e);
+    }
+    if (!catalogAndIdentifier.catalog().equals(tableCatalog)) {

Review comment:
       Nit: missing newline between control flow.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r539077511



##########
File path: spark3-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestExpireSnapshotsProcedure.java
##########
@@ -163,18 +163,28 @@ public void testInvalidExpireSnapshotsCases() {
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.expire_snapshots('n')", catalogName));
+        () -> sql("CALL %s.system.expire_snapshots()", catalogName));
 
     AssertHelpers.assertThrows("Should reject calls with invalid arg types",
-        RuntimeException.class, "Couldn't parse identifier",
+        AnalysisException.class, "Wrong arg type",
         () -> sql("CALL %s.system.expire_snapshots('n', 2.2)", catalogName));
 
-    AssertHelpers.assertThrows("Should reject empty namespace",
-        IllegalArgumentException.class, "Namespace cannot be empty",
-        () -> sql("CALL %s.system.expire_snapshots('', 't')", catalogName));
+    AssertHelpers.assertThrows("Should reject calls without empty table identifier",

Review comment:
       nit: with?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538764059



##########
File path: spark3-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRollbackToTimestampProcedure.java
##########
@@ -184,8 +184,8 @@ public void testRollbackToTimestampWithQuotedIdentifiers() {
     String quotedNamespace = quotedNamespaceBuilder.toString();
 
     List<Object[]> output = sql(
-        "CALL %s.system.rollback_to_timestamp('%s', '`%s`', TIMESTAMP '%s')",
-        catalogName, quotedNamespace, tableIdent.name(), firstSnapshotTimestamp);
+        "CALL %s.system.rollback_to_timestamp('%s', TIMESTAMP '%s')",
+            catalogName, quotedNamespace + ".`" + tableIdent.name() + "`", firstSnapshotTimestamp);

Review comment:
       Nit: whitespace change?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538778741



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -605,31 +605,41 @@ private static String sqlString(org.apache.iceberg.expressions.Literal<?> lit) {
   }
 
   public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name) throws ParseException {
+    return catalogAndIdentifier(spark, name, spark.sessionState().catalogManager().currentCatalog());
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name,
+                                                          CatalogPlugin defaultCatalog) throws ParseException {
     ParserInterface parser = spark.sessionState().sqlParser();
     Seq<String> multiPartIdentifier = parser.parseMultipartIdentifier(name);
     List<String> javaMultiPartIdentifier = JavaConverters.seqAsJavaList(multiPartIdentifier);
-    return catalogAndIdentifier(spark, javaMultiPartIdentifier);
+    return catalogAndIdentifier(spark, javaMultiPartIdentifier, defaultCatalog);
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+    return catalogAndIdentifier(spark, nameParts, spark.sessionState().catalogManager().currentCatalog());
   }
 
   /**
    * A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply
    * Attempts to find the catalog and identifier a multipart identifier represents
    * @param spark Spark session to use for resolution
    * @param nameParts Multipart identifier representing a table
+   * @param fallBackCatalog Catalog to use if none is specified
    * @return The CatalogPlugin and Identifier for the table
    */
-  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts,
+                                                          CatalogPlugin fallBackCatalog) {
     Preconditions.checkArgument(!nameParts.isEmpty(),
         "Cannot determine catalog and Identifier from empty name parts");
     CatalogManager catalogManager = spark.sessionState().catalogManager();
-    CatalogPlugin currentCatalog = catalogManager.currentCatalog();
     String[] currentNamespace = catalogManager.currentNamespace();

Review comment:
       Does this mean we can always call "defaultCatalog.defaultNamespace"? What is the difference between catalogManager.currentNamespace and the default catalog's implementation?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538829879



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -48,20 +47,18 @@ protected BaseProcedure(TableCatalog tableCatalog) {
     this.tableCatalog = tableCatalog;
   }
 
-  protected <T> T modifyIcebergTable(String namespace, String tableName, Function<org.apache.iceberg.Table, T> func) {
-    return execute(namespace, tableName, true, func);
+  protected <T> T modifyIcebergTable(String identifierAsString, Function<org.apache.iceberg.Table, T> func) {
+    return execute(identifierAsString, true, func);
   }
 
-  protected <T> T withIcebergTable(String namespace, String tableName, Function<org.apache.iceberg.Table, T> func) {
-    return execute(namespace, tableName, false, func);
+  protected <T> T withIcebergTable(String identifierAsString, Function<org.apache.iceberg.Table, T> func) {
+    return execute(identifierAsString, false, func);
   }
 
-  private <T> T execute(String namespace, String tableName, boolean refreshSparkCache,
+  private <T> T execute(String identifierAsString, boolean refreshSparkCache,
                         Function<org.apache.iceberg.Table, T> func) {
-    Preconditions.checkArgument(namespace != null && !namespace.isEmpty(), "Namespace cannot be empty");

Review comment:
       There is a precondition on the null/empty issue in "toIdentifier"




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538762869



##########
File path: spark3-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCherrypickSnapshotProcedure.java
##########
@@ -171,18 +167,10 @@ public void testInvalidCherrypickSnapshotCases() {
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.cherrypick_snapshot('n', 't')", catalogName));
+        () -> sql("CALL %s.system.cherrypick_snapshot('t')", catalogName));
 
     AssertHelpers.assertThrows("Should reject calls with invalid arg types",
         AnalysisException.class, "Wrong arg type for snapshot_id: cannot cast",
-        () -> sql("CALL %s.system.cherrypick_snapshot('n', 't', 2.2)", catalogName));
-
-    AssertHelpers.assertThrows("Should reject empty namespace",
-        IllegalArgumentException.class, "Namespace cannot be empty",
-        () -> sql("CALL %s.system.cherrypick_snapshot('', 't', 1L)", catalogName));
-
-    AssertHelpers.assertThrows("Should reject empty table name",

Review comment:
       Should this have a test for empty table name?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r539122717



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -48,20 +48,17 @@ protected BaseProcedure(TableCatalog tableCatalog) {
     this.tableCatalog = tableCatalog;
   }
 
-  protected <T> T modifyIcebergTable(String namespace, String tableName, Function<org.apache.iceberg.Table, T> func) {
-    return execute(namespace, tableName, true, func);
+  protected <T> T modifyIcebergTable(Identifier ident, Function<org.apache.iceberg.Table, T> func) {
+    return execute(ident, true, func);
   }
 
-  protected <T> T withIcebergTable(String namespace, String tableName, Function<org.apache.iceberg.Table, T> func) {
-    return execute(namespace, tableName, false, func);
+  protected <T> T withIcebergTable(Identifier ident, Function<org.apache.iceberg.Table, T> func) {
+    return execute(ident, false, func);
   }
 
-  private <T> T execute(String namespace, String tableName, boolean refreshSparkCache,
+  private <T> T execute(Identifier ident, boolean refreshSparkCache,

Review comment:
       nit: can fit on one line now




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi merged pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi merged pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538637233



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier, tableCatalog);
     } catch (ParseException e) {
-      throw new RuntimeException("Couldn't parse identifier: " + identifierAsString, e);
+      throw new IllegalArgumentException("Cannot parse identifier", e);
     }
+
+    return catalogAndIdentifier.identifier();

Review comment:
       Well, maybe the validation should not happen here. We could call it `toCatalogAndIdentifier` and validate above.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538782191



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -605,31 +605,41 @@ private static String sqlString(org.apache.iceberg.expressions.Literal<?> lit) {
   }
 
   public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name) throws ParseException {
+    return catalogAndIdentifier(spark, name, spark.sessionState().catalogManager().currentCatalog());
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name,
+                                                          CatalogPlugin defaultCatalog) throws ParseException {
     ParserInterface parser = spark.sessionState().sqlParser();
     Seq<String> multiPartIdentifier = parser.parseMultipartIdentifier(name);
     List<String> javaMultiPartIdentifier = JavaConverters.seqAsJavaList(multiPartIdentifier);
-    return catalogAndIdentifier(spark, javaMultiPartIdentifier);
+    return catalogAndIdentifier(spark, javaMultiPartIdentifier, defaultCatalog);
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+    return catalogAndIdentifier(spark, nameParts, spark.sessionState().catalogManager().currentCatalog());
   }
 
   /**
    * A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply
    * Attempts to find the catalog and identifier a multipart identifier represents
    * @param spark Spark session to use for resolution
    * @param nameParts Multipart identifier representing a table
+   * @param fallBackCatalog Catalog to use if none is specified
    * @return The CatalogPlugin and Identifier for the table
    */
-  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts,
+                                                          CatalogPlugin fallBackCatalog) {
     Preconditions.checkArgument(!nameParts.isEmpty(),
         "Cannot determine catalog and Identifier from empty name parts");
     CatalogManager catalogManager = spark.sessionState().catalogManager();
-    CatalogPlugin currentCatalog = catalogManager.currentCatalog();
     String[] currentNamespace = catalogManager.currentNamespace();

Review comment:
       ```scala
     def currentNamespace: Array[String] = synchronized {
       _currentNamespace.getOrElse {
         if (currentCatalog.name() == SESSION_CATALOG_NAME) {
           Array(v1SessionCatalog.getCurrentDatabase)
         } else {
           currentCatalog.defaultNamespace()
         }
       }
     }
   ```
   V1 always making things more difficult
   ---
   Oh also _currentNamespace. ok np




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538797259



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -74,26 +71,20 @@ protected BaseProcedure(TableCatalog tableCatalog) {
     return result;
   }
 
-  // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  private Identifier toIdentifier(String identifierAsString) {
+    CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifierAsString, tableCatalog);
     } catch (ParseException e) {
-      throw new RuntimeException("Couldn't parse identifier: " + identifierAsString, e);
+      throw new IllegalArgumentException("Cannot parse identifier", e);

Review comment:
       Moved the "toIdentifier" calls into the Procedures and now the modify/with functions take identifiers directly




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#issuecomment-740877489


   @aokolnychyi All done


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538629555



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {

Review comment:
       nit: `identifier` -> `identifierAsString`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538766764



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -605,31 +605,41 @@ private static String sqlString(org.apache.iceberg.expressions.Literal<?> lit) {
   }
 
   public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name) throws ParseException {
+    return catalogAndIdentifier(spark, name, spark.sessionState().catalogManager().currentCatalog());
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name,
+                                                          CatalogPlugin defaultCatalog) throws ParseException {
     ParserInterface parser = spark.sessionState().sqlParser();
     Seq<String> multiPartIdentifier = parser.parseMultipartIdentifier(name);
     List<String> javaMultiPartIdentifier = JavaConverters.seqAsJavaList(multiPartIdentifier);
-    return catalogAndIdentifier(spark, javaMultiPartIdentifier);
+    return catalogAndIdentifier(spark, javaMultiPartIdentifier, defaultCatalog);
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+    return catalogAndIdentifier(spark, nameParts, spark.sessionState().catalogManager().currentCatalog());
   }
 
   /**
    * A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply
    * Attempts to find the catalog and identifier a multipart identifier represents
    * @param spark Spark session to use for resolution
    * @param nameParts Multipart identifier representing a table
+   * @param fallBackCatalog Catalog to use if none is specified
    * @return The CatalogPlugin and Identifier for the table
    */
-  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts,
+                                                          CatalogPlugin fallBackCatalog) {
     Preconditions.checkArgument(!nameParts.isEmpty(),
         "Cannot determine catalog and Identifier from empty name parts");
     CatalogManager catalogManager = spark.sessionState().catalogManager();
-    CatalogPlugin currentCatalog = catalogManager.currentCatalog();
     String[] currentNamespace = catalogManager.currentNamespace();

Review comment:
       This namespace is associated with a catalog. I don't think this should use the catalog passed in with this namespace unless that catalog is the current catalog. I would do the following:
   
   1. If the fallback catalog is the current catalog, use the current namespace
   2. If the fallback catalog is not the current catalog, use its default namespace (`catalog.defaultNamespace()`)
   
   The catalog's default namespace is used when you switch to that catalog. The default namespace becomes the current namespace. Using the default fits with the idea that the fallback catalog is the current catalog for the context of the stored procedure.
   
   It this is difficult to implement, then we can always go back to using the current catalog rather than the procedure catalog.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538621794



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier, tableCatalog);

Review comment:
       @RussellSpitzer @rdblue, thoughts?
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538802579



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier, tableCatalog);
     } catch (ParseException e) {
-      throw new RuntimeException("Couldn't parse identifier: " + identifierAsString, e);
+      throw new IllegalArgumentException("Cannot parse identifier", e);
     }
+
+    return catalogAndIdentifier.identifier();

Review comment:
       Ah yes I got very confused. I think we should just let this be for now and remove the check. I think instead "loadSparkTable" can have the check since it assumes that we are loading out of this catalog.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538625798



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/ExpireSnapshotsProcedure.java
##########
@@ -77,12 +76,11 @@ public StructType outputType() {
 
   @Override
   public InternalRow[] call(InternalRow args) {
-    String namespace = args.getString(0);
-    String tableName = args.getString(1);
-    Long olderThanMillis = args.isNullAt(2) ? null : DateTimeUtils.toMillis(args.getLong(2));
-    Integer retainLastNum = args.isNullAt(3) ? null : args.getInt(3);
+    String tableName = args.getString(0);

Review comment:
       I think we better call our variables `tableIdent` rather than `tableName` now. I think we can keep the parameter name as `table`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538621508



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier, tableCatalog);

Review comment:
       Last time we talked, we thought about defaulting the catalog to the procedure catalog, not the default catalog.
   
   I thought we would simply take the first parsed name part and check if `spark.sql.catalog._name_part_` is set. If yes, it means our identifier contains a catalog and we don't have to prepend the procedure catalog to name parts and should resolve the catalog and validate the resolved catalog matches tableCatalog. If there is no such catalog, we can just call `tableCatalog.load` with the constructed identifier.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r539078152



##########
File path: spark3-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestExpireSnapshotsProcedure.java
##########
@@ -163,18 +163,28 @@ public void testInvalidExpireSnapshotsCases() {
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.expire_snapshots('n')", catalogName));
+        () -> sql("CALL %s.system.expire_snapshots()", catalogName));
 
     AssertHelpers.assertThrows("Should reject calls with invalid arg types",
-        RuntimeException.class, "Couldn't parse identifier",
+        AnalysisException.class, "Wrong arg type",
         () -> sql("CALL %s.system.expire_snapshots('n', 2.2)", catalogName));
 
-    AssertHelpers.assertThrows("Should reject empty namespace",
-        IllegalArgumentException.class, "Namespace cannot be empty",
-        () -> sql("CALL %s.system.expire_snapshots('', 't')", catalogName));
+    AssertHelpers.assertThrows("Should reject calls without empty table identifier",
+        IllegalArgumentException.class, "Cannot handle an empty identifier",
+        () -> sql("CALL %s.system.expire_snapshots('')", catalogName));
+  }
 
-    AssertHelpers.assertThrows("Should reject empty table name",
-        IllegalArgumentException.class, "Table name cannot be empty",
-        () -> sql("CALL %s.system.expire_snapshots('n', '')", catalogName));
+  @Test
+  public void testResolvingTableInAnotherCatalog() throws IOException {
+    String anotherCatalog = "another_" + catalogName;
+    spark.conf().set("spark.sql.catalog." + anotherCatalog, SparkCatalog.class.getName());
+    spark.conf().set("spark.sql.catalog." + anotherCatalog + ".type", "hadoop");
+    spark.conf().set("spark.sql.catalog." + anotherCatalog + ".warehouse", "file:" + temp.newFolder().toString());
+    spark.sql(String.format("CREATE TABLE %s.%s (id bigint NOT NULL, data string) USING iceberg", anotherCatalog,

Review comment:
       nit: we can probably use our `sql()` helper method to get parameterization for free.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538783656



##########
File path: spark3-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCherrypickSnapshotProcedure.java
##########
@@ -171,18 +167,10 @@ public void testInvalidCherrypickSnapshotCases() {
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.cherrypick_snapshot('n', 't')", catalogName));
+        () -> sql("CALL %s.system.cherrypick_snapshot('t')", catalogName));
 
     AssertHelpers.assertThrows("Should reject calls with invalid arg types",
         AnalysisException.class, "Wrong arg type for snapshot_id: cannot cast",
-        () -> sql("CALL %s.system.cherrypick_snapshot('n', 't', 2.2)", catalogName));
-
-    AssertHelpers.assertThrows("Should reject empty namespace",
-        IllegalArgumentException.class, "Namespace cannot be empty",
-        () -> sql("CALL %s.system.cherrypick_snapshot('', 't', 1L)", catalogName));
-
-    AssertHelpers.assertThrows("Should reject empty table name",

Review comment:
       +1




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538804513



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier, tableCatalog);
     } catch (ParseException e) {
-      throw new RuntimeException("Couldn't parse identifier: " + identifierAsString, e);
+      throw new IllegalArgumentException("Cannot parse identifier", e);
     }
+
+    return catalogAndIdentifier.identifier();

Review comment:
       Sorry I flipped back again, i'm just going to not put the condition here. We'll get an error in "loadSparkTable" which won't have the type argument but should be clear enough?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r539078785



##########
File path: spark3-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRollbackToSnapshotProcedure.java
##########
@@ -243,30 +239,22 @@ public void testInvalidRollbackToSnapshotCases() {
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.rollback_to_snapshot('n', 't')", catalogName));
+        () -> sql("CALL %s.system.rollback_to_snapshot('t')", catalogName));
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.rollback_to_snapshot('n', 1L)", catalogName));
+        () -> sql("CALL %s.system.rollback_to_snapshot(1L)", catalogName));
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.rollback_to_snapshot(namespace => 'n', snapshot_id => 1L)", catalogName));
-
-    AssertHelpers.assertThrows("Should reject calls without all required args",
-        AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.rollback_to_snapshot(table => 't', snapshot_id => 1L)", catalogName));
+        () -> sql("CALL %s.system.rollback_to_snapshot(table => 't')", catalogName));
 
     AssertHelpers.assertThrows("Should reject calls with invalid arg types",
         AnalysisException.class, "Wrong arg type for snapshot_id: cannot cast",
-        () -> sql("CALL %s.system.rollback_to_snapshot('n', 't', 2.2)", catalogName));
-
-    AssertHelpers.assertThrows("Should reject empty namespace",
-        IllegalArgumentException.class, "Namespace cannot be empty",
-        () -> sql("CALL %s.system.rollback_to_snapshot('', 't', 1L)", catalogName));
+        () -> sql("CALL %s.system.rollback_to_snapshot('t', 2.2)", catalogName));
 
-    AssertHelpers.assertThrows("Should reject empty table name",
-        IllegalArgumentException.class, "Table name cannot be empty",
-        () -> sql("CALL %s.system.rollback_to_snapshot('n', '', 1L)", catalogName));
+    AssertHelpers.assertThrows("Should reject calls without empty table identifier",

Review comment:
       nit: with?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538626215



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier, tableCatalog);

Review comment:
       Isn't that what `Spark3Util.catalogAndIdentifier` does? I think the only difference is that it uses Spark's catalogManager instead of simply checking table properties.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r539077183



##########
File path: spark3-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCherrypickSnapshotProcedure.java
##########
@@ -171,18 +167,14 @@ public void testInvalidCherrypickSnapshotCases() {
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.cherrypick_snapshot('n', 't')", catalogName));
+        () -> sql("CALL %s.system.cherrypick_snapshot('t')", catalogName));
+
+    AssertHelpers.assertThrows("Should reject calls without empty table identifier",

Review comment:
       nit: with?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538687988



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier, tableCatalog);
     } catch (ParseException e) {
-      throw new RuntimeException("Couldn't parse identifier: " + identifierAsString, e);
+      throw new IllegalArgumentException("Cannot parse identifier", e);
     }
+
+    return catalogAndIdentifier.identifier();

Review comment:
       That's a good idea




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538806709



##########
File path: spark3-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRollbackToTimestampProcedure.java
##########
@@ -184,8 +184,8 @@ public void testRollbackToTimestampWithQuotedIdentifiers() {
     String quotedNamespace = quotedNamespaceBuilder.toString();
 
     List<Object[]> output = sql(
-        "CALL %s.system.rollback_to_timestamp('%s', '`%s`', TIMESTAMP '%s')",
-        catalogName, quotedNamespace, tableIdent.name(), firstSnapshotTimestamp);
+        "CALL %s.system.rollback_to_timestamp('%s', TIMESTAMP '%s')",
+            catalogName, quotedNamespace + ".`" + tableIdent.name() + "`", firstSnapshotTimestamp);

Review comment:
       got it, bad copy from the other rollback method




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538891015



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -79,10 +79,14 @@ protected Identifier toIdentifier(String identifierAsString, String argName) {
     try {
       catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifierAsString, tableCatalog);
     } catch (ParseException e) {
-      throw new IllegalArgumentException(String.format("Cannot parse identifier [%s] for argument %s",
+      throw new IllegalArgumentException(String.format("Cannot parse identifier '%s' for argument %s",
           identifierAsString, argName), e);
     }
 
+    Preconditions.checkArgument(catalogAndIdentifier.catalog().equals(tableCatalog), "Cannot run procedure" +
+        " in catalog '%s': Argument %s was set to '%s' which resolves to a table in a different catalog '%s'",
+        tableCatalog.name(), argName, identifierAsString, catalogAndIdentifier.catalog().name());

Review comment:
       This is really wordy. I'd probably simplify it to "Cannot run procedure %s in catalog %s: %s is a table in catalog %s".




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538631234



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -605,31 +605,41 @@ private static String sqlString(org.apache.iceberg.expressions.Literal<?> lit) {
   }
 
   public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name) throws ParseException {
+    return catalogAndIdentifier(spark, name, spark.sessionState().catalogManager().currentCatalog());
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name,
+                                                            CatalogPlugin fallBackCatalog) throws ParseException {

Review comment:
       Shall we call it `defaultCatalog`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538635475



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -605,31 +605,41 @@ private static String sqlString(org.apache.iceberg.expressions.Literal<?> lit) {
   }
 
   public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name) throws ParseException {
+    return catalogAndIdentifier(spark, name, spark.sessionState().catalogManager().currentCatalog());
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name,
+                                                            CatalogPlugin fallBackCatalog) throws ParseException {

Review comment:
       I think that's a fine name as well




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r539156777



##########
File path: spark3-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRollbackToTimestampProcedure.java
##########
@@ -184,8 +184,8 @@ public void testRollbackToTimestampWithQuotedIdentifiers() {
     String quotedNamespace = quotedNamespaceBuilder.toString();
 
     List<Object[]> output = sql(
-        "CALL %s.system.rollback_to_timestamp('%s', '`%s`', TIMESTAMP '%s')",
-        catalogName, quotedNamespace, tableIdent.name(), firstSnapshotTimestamp);
+        "CALL %s.system.rollback_to_timestamp('%s', TIMESTAMP '%s')",
+         catalogName, quotedNamespace + ".`" + tableIdent.name() + "`", firstSnapshotTimestamp);

Review comment:
       extra space




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r539078529



##########
File path: spark3-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRemoveOrphanFilesProcedure.java
##########
@@ -230,18 +222,14 @@ public void testInvalidRemoveOrphanFilesCases() {
 
     AssertHelpers.assertThrows("Should reject calls without all required args",
         AnalysisException.class, "Missing required parameters",
-        () -> sql("CALL %s.system.remove_orphan_files('n')", catalogName));
+        () -> sql("CALL %s.system.remove_orphan_files()", catalogName));
 
     AssertHelpers.assertThrows("Should reject calls with invalid arg types",
-        RuntimeException.class, "Couldn't parse identifier",
+        AnalysisException.class, "Wrong arg type",
         () -> sql("CALL %s.system.remove_orphan_files('n', 2.2)", catalogName));
 
-    AssertHelpers.assertThrows("Should reject empty namespace",
-        IllegalArgumentException.class, "Namespace cannot be empty",
-        () -> sql("CALL %s.system.remove_orphan_files('', 't')", catalogName));
-
-    AssertHelpers.assertThrows("Should reject empty table name",
-        IllegalArgumentException.class, "Table name cannot be empty",
-        () -> sql("CALL %s.system.remove_orphan_files('n', '')", catalogName));
+    AssertHelpers.assertThrows("Should reject calls without empty table identifier",

Review comment:
       nit: with?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538633189



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier, tableCatalog);

Review comment:
       I added an argument to catalogAndIdentifier, the last parameter is the "current catalog" that's used




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538630325



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -605,31 +605,41 @@ private static String sqlString(org.apache.iceberg.expressions.Literal<?> lit) {
   }
 
   public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name) throws ParseException {
+    return catalogAndIdentifier(spark, name, spark.sessionState().catalogManager().currentCatalog());
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name,
+                                                            CatalogPlugin fallBackCatalog) throws ParseException {

Review comment:
       nit: formatting is off




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538782191



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -605,31 +605,41 @@ private static String sqlString(org.apache.iceberg.expressions.Literal<?> lit) {
   }
 
   public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name) throws ParseException {
+    return catalogAndIdentifier(spark, name, spark.sessionState().catalogManager().currentCatalog());
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, String name,
+                                                          CatalogPlugin defaultCatalog) throws ParseException {
     ParserInterface parser = spark.sessionState().sqlParser();
     Seq<String> multiPartIdentifier = parser.parseMultipartIdentifier(name);
     List<String> javaMultiPartIdentifier = JavaConverters.seqAsJavaList(multiPartIdentifier);
-    return catalogAndIdentifier(spark, javaMultiPartIdentifier);
+    return catalogAndIdentifier(spark, javaMultiPartIdentifier, defaultCatalog);
+  }
+
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+    return catalogAndIdentifier(spark, nameParts, spark.sessionState().catalogManager().currentCatalog());
   }
 
   /**
    * A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply
    * Attempts to find the catalog and identifier a multipart identifier represents
    * @param spark Spark session to use for resolution
    * @param nameParts Multipart identifier representing a table
+   * @param fallBackCatalog Catalog to use if none is specified
    * @return The CatalogPlugin and Identifier for the table
    */
-  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts) {
+  public static CatalogAndIdentifier catalogAndIdentifier(SparkSession spark, List<String> nameParts,
+                                                          CatalogPlugin fallBackCatalog) {
     Preconditions.checkArgument(!nameParts.isEmpty(),
         "Cannot determine catalog and Identifier from empty name parts");
     CatalogManager catalogManager = spark.sessionState().catalogManager();
-    CatalogPlugin currentCatalog = catalogManager.currentCatalog();
     String[] currentNamespace = catalogManager.currentNamespace();

Review comment:
       ```scala
     def currentNamespace: Array[String] = synchronized {
       _currentNamespace.getOrElse {
         if (currentCatalog.name() == SESSION_CATALOG_NAME) {
           Array(v1SessionCatalog.getCurrentDatabase)
         } else {
           currentCatalog.defaultNamespace()
         }
       }
     }
   ```
   V1 always making things more difficult
   
   Oh also _currentNamespace. ok np




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538698464



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -48,20 +46,18 @@ protected BaseProcedure(TableCatalog tableCatalog) {
     this.tableCatalog = tableCatalog;
   }
 
-  protected <T> T modifyIcebergTable(String namespace, String tableName, Function<org.apache.iceberg.Table, T> func) {
-    return execute(namespace, tableName, true, func);
+  protected <T> T modifyIcebergTable(String identifier, Function<org.apache.iceberg.Table, T> func) {

Review comment:
       changed




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r539145887



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -74,26 +71,23 @@ protected BaseProcedure(TableCatalog tableCatalog) {
     return result;
   }
 
-  // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
+  protected Identifier toIdentifier(String identifierAsString, String argName) {
+    Preconditions.checkArgument(identifierAsString != null && !identifierAsString.isEmpty(),
+        "Cannot handle an empty identifier for argument %s", argName);
 
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+    CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifierAsString, tableCatalog);
     } catch (ParseException e) {
-      throw new RuntimeException("Couldn't parse identifier: " + identifierAsString, e);
+      throw new IllegalArgumentException(String.format("Cannot parse identifier '%s' for argument %s",

Review comment:
       I like Ryan's suggestion in another PR to add a variant of `catalogAndIdentifier` that does not throw a checked parse exception. Explicit handling of parse exceptions makes other places more complicated.
   
   If we follow that idea and add this method to `Spark3Util`:
   
   ```
   
     public static CatalogAndIdentifier catalogAndIdentifier(String description, SparkSession spark,
                                                             String name, CatalogPlugin defaultCatalog) {
       try {
         return catalogAndIdentifier(spark, name, defaultCatalog);
       } catch (ParseException e) {
         throw new IllegalArgumentException("Cannot parse " + description + ": " + name, e);
       }
     }
   ```
   
   This place can look like this:
   
   ```
     protected Identifier toIdentifier(String identifierAsString, String argName) {
       Preconditions.checkArgument(identifierAsString != null && !identifierAsString.isEmpty(),
           "Cannot handle an empty identifier for argument %s", argName);
   
       CatalogAndIdentifier catalogAndIdentifier = Spark3Util.catalogAndIdentifier(
           "identifier for arg " + argName, spark, identifierAsString, tableCatalog);
   
       CatalogPlugin catalog = catalogAndIdentifier.catalog();
       Identifier identifier = catalogAndIdentifier.identifier();
   
       Preconditions.checkArgument(
           catalog.equals(tableCatalog),
           "Cannot run procedure in catalog %s: %s is a table in catalog %s",
           tableCatalog.name(), identifierAsString, catalog);
   
       return identifier;
     }
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#issuecomment-741244167


   We can always change the message when we get the procedure name code in
   
   On Tue, Dec 8, 2020, 5:40 PM Ryan Blue <no...@github.com> wrote:
   
   > *@rdblue* approved this pull request.
   >
   > I would prefer a more concise error message, but that's not a blocker.
   > I'll merge when tests are passing.
   >
   > —
   > You are receiving this because you were mentioned.
   > Reply to this email directly, view it on GitHub
   > <https://github.com/apache/iceberg/pull/1890#pullrequestreview-547725186>,
   > or unsubscribe
   > <https://github.com/notifications/unsubscribe-auth/AADE2YKVS2TNJ7MQXYF6ZULST22ONANCNFSM4USIPK2Q>
   > .
   >
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538628314



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -48,20 +46,18 @@ protected BaseProcedure(TableCatalog tableCatalog) {
     this.tableCatalog = tableCatalog;
   }
 
-  protected <T> T modifyIcebergTable(String namespace, String tableName, Function<org.apache.iceberg.Table, T> func) {
-    return execute(namespace, tableName, true, func);
+  protected <T> T modifyIcebergTable(String identifier, Function<org.apache.iceberg.Table, T> func) {

Review comment:
       nit: shall we call it `identifierAsString` to be consistent?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538767978



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -74,26 +71,20 @@ protected BaseProcedure(TableCatalog tableCatalog) {
     return result;
   }
 
-  // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  private Identifier toIdentifier(String identifierAsString) {
+    CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifierAsString, tableCatalog);
     } catch (ParseException e) {
-      throw new RuntimeException("Couldn't parse identifier: " + identifierAsString, e);
+      throw new IllegalArgumentException("Cannot parse identifier", e);

Review comment:
       It would be nice to follow this recommendation from the other review: https://github.com/apache/iceberg/pull/1525/files#r538758721
   
   The description could be the argument name.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1890: Change Procedures to use Identifiers instead of Namespace/Table Params

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #1890:
URL: https://github.com/apache/iceberg/pull/1890#discussion_r538625804



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/procedures/BaseProcedure.java
##########
@@ -75,25 +71,15 @@ protected BaseProcedure(TableCatalog tableCatalog) {
   }
 
   // we have to parse both namespace and name as they may be quoted
-  protected Identifier toIdentifier(String namespaceAsString, String name) {
-    String[] namespaceParts = parseMultipartIdentifier(namespaceAsString);
-
-    String[] nameParts = parseMultipartIdentifier(name);
-    Preconditions.checkArgument(nameParts.length == 1, "Name must consist of one part: %s", name);
-
-    return Identifier.of(namespaceParts, nameParts[0]);
-  }
-
-  private String[] parseMultipartIdentifier(String identifierAsString) {
+  protected Identifier toIdentifier(String identifier) {
+    Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
     try {
-      ParserInterface parser = spark.sessionState().sqlParser();
-      Seq<String> namePartsSeq = parser.parseMultipartIdentifier(identifierAsString);
-      String[] nameParts = new String[namePartsSeq.size()];
-      namePartsSeq.copyToArray(nameParts);
-      return nameParts;
+      catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier, tableCatalog);
     } catch (ParseException e) {
-      throw new RuntimeException("Couldn't parse identifier: " + identifierAsString, e);
+      throw new IllegalArgumentException("Cannot parse identifier", e);
     }
+
+    return catalogAndIdentifier.identifier();

Review comment:
       I didn't want that to be a general requirement, But I could add it here. Ie if we want other procedures which operate on tables in other catalogs




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org