You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/12/16 22:53:53 UTC

[GitHub] [iceberg] rdblue opened a new pull request #1948: Add SQL commands evolve partition specs

rdblue opened a new pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948


   This adds two new `ALTER TABLE` commands to the extensions module, `ADD PARTITION FIELD` and `DROP PARTITION FIELD`.
   
   Tests for the new commands are in `TestAlterTablePartitionFields`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

RussellSpitzer commented on a change in pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#discussion_r545886729



##########
File path: spark3-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala
##########
@@ -94,13 +94,20 @@ class IcebergSparkSqlExtensionsParser(delegate: ParserInterface) extends ParserI
    */
   override def parsePlan(sqlText: String): LogicalPlan = {
     val sqlTextAfterSubstitution = substitutor.substitute(sqlText)
-    if (sqlTextAfterSubstitution.toLowerCase(Locale.ROOT).trim().startsWith("call")) {
+    if (isIcebergCommand(sqlTextAfterSubstitution)) {
       parse(sqlTextAfterSubstitution) { parser => astBuilder.visit(parser.singleStatement()) }.asInstanceOf[LogicalPlan]
     } else {
       delegate.parsePlan(sqlText)
     }
   }
 
+  private def isIcebergCommand(sqlText: String): Boolean = {
+    val normalized = sqlText.toLowerCase(Locale.ROOT).trim()
+    normalized.startsWith("call") ||
+        (normalized.startsWith("alter table") && (
+            normalized.contains("add partition field") || normalized.contains("drop partition field")))

Review comment:
       I was thinking about this before and I didn't want it to be that specific. I think maybe a cool solution could be trying both, and checking which parser parsed the further. Each of our analysis exceptions should give a position, so we can just pick whoever parsed the most. How does that sound?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] rdblue commented on a change in pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#discussion_r545991200



##########
File path: spark3-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala
##########
@@ -94,13 +94,20 @@ class IcebergSparkSqlExtensionsParser(delegate: ParserInterface) extends ParserI
    */
   override def parsePlan(sqlText: String): LogicalPlan = {
     val sqlTextAfterSubstitution = substitutor.substitute(sqlText)
-    if (sqlTextAfterSubstitution.toLowerCase(Locale.ROOT).trim().startsWith("call")) {
+    if (isIcebergCommand(sqlTextAfterSubstitution)) {
       parse(sqlTextAfterSubstitution) { parser => astBuilder.visit(parser.singleStatement()) }.asInstanceOf[LogicalPlan]
     } else {
       delegate.parsePlan(sqlText)
     }
   }
 
+  private def isIcebergCommand(sqlText: String): Boolean = {
+    val normalized = sqlText.toLowerCase(Locale.ROOT).trim()
+    normalized.startsWith("call") ||
+        (normalized.startsWith("alter table") && (
+            normalized.contains("add partition field") || normalized.contains("drop partition field")))

Review comment:
       That sounds reasonable to me, but I think we would want to make sure we have tests for it so we know if the heuristic starts to fail. Also, that should be a follow-up and not in this PR, right?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

RussellSpitzer commented on a change in pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#discussion_r545999714



##########
File path: spark3-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala
##########
@@ -94,13 +94,20 @@ class IcebergSparkSqlExtensionsParser(delegate: ParserInterface) extends ParserI
    */
   override def parsePlan(sqlText: String): LogicalPlan = {
     val sqlTextAfterSubstitution = substitutor.substitute(sqlText)
-    if (sqlTextAfterSubstitution.toLowerCase(Locale.ROOT).trim().startsWith("call")) {
+    if (isIcebergCommand(sqlTextAfterSubstitution)) {
       parse(sqlTextAfterSubstitution) { parser => astBuilder.visit(parser.singleStatement()) }.asInstanceOf[LogicalPlan]
     } else {
       delegate.parsePlan(sqlText)
     }
   }
 
+  private def isIcebergCommand(sqlText: String): Boolean = {
+    val normalized = sqlText.toLowerCase(Locale.ROOT).trim()
+    normalized.startsWith("call") ||
+        (normalized.startsWith("alter table") && (
+            normalized.contains("add partition field") || normalized.contains("drop partition field")))

Review comment:
       Not a deal breaker for me.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] holdenk commented on pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

holdenk commented on pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#issuecomment-748269231


   LGTM


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

RussellSpitzer commented on a change in pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#discussion_r546016204



##########
File path: spark3-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ExtendedDataSourceV2Strategy.scala
##########
@@ -53,4 +76,18 @@ object ExtendedDataSourceV2Strategy extends Strategy {
     }
     new GenericInternalRow(values)
   }
+
+  private object IcebergCatalogAndIdentifier {
+    def unapply(identifier: Seq[String]): Option[(TableCatalog, Identifier)] = {
+      val catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier.asJava)
+      catalogAndIdentifier.catalog match {
+        case icebergCatalog: SparkCatalog =>
+          Some((icebergCatalog, catalogAndIdentifier.identifier))
+        case icebergCatalog: SparkSessionCatalog[_] =>
+          Some((icebergCatalog, catalogAndIdentifier.identifier))

Review comment:
       I don't think you need the inner paren's here




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] aokolnychyi commented on pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

aokolnychyi commented on pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#issuecomment-748263335


   LGTM, nothing to add to existing comments.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

RussellSpitzer commented on a change in pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#discussion_r546019435



##########
File path: spark3-extensions/src/main/scala/org/apache/iceberg/spark/extensions/IcebergSparkSessionExtensions.scala
##########
@@ -37,6 +37,6 @@ class IcebergSparkSessionExtensions extends (SparkSessionExtensions => Unit) {
     // TODO: PullupCorrelatedPredicates should handle row-level operations
     extensions.injectOptimizerRule { _ => PullupCorrelatedPredicatesInRowLevelOperations }
     extensions.injectOptimizerRule { _ => RewriteDelete }
-    extensions.injectPlannerStrategy { _ => ExtendedDataSourceV2Strategy }
+    extensions.injectPlannerStrategy { spark => ExtendedDataSourceV2Strategy(spark) }

Review comment:
       Ah I was just following the pattern with injectOptimizer lines directly above




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] holdenk commented on a change in pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

holdenk commented on a change in pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#discussion_r545474190



##########
File path: spark3-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala
##########
@@ -94,13 +94,20 @@ class IcebergSparkSqlExtensionsParser(delegate: ParserInterface) extends ParserI
    */
   override def parsePlan(sqlText: String): LogicalPlan = {
     val sqlTextAfterSubstitution = substitutor.substitute(sqlText)
-    if (sqlTextAfterSubstitution.toLowerCase(Locale.ROOT).trim().startsWith("call")) {
+    if (isIcebergCommand(sqlTextAfterSubstitution)) {
       parse(sqlTextAfterSubstitution) { parser => astBuilder.visit(parser.singleStatement()) }.asInstanceOf[LogicalPlan]
     } else {
       delegate.parsePlan(sqlText)
     }
   }
 
+  private def isIcebergCommand(sqlText: String): Boolean = {
+    val normalized = sqlText.toLowerCase(Locale.ROOT).trim()
+    normalized.startsWith("call") ||
+        (normalized.startsWith("alter table") && (
+            normalized.contains("add partition field") || normalized.contains("drop partition field")))

Review comment:
       So, this seems really unlikely and I can't figure out a good way around it, but if someone had a "add partition field" as a column name doing a Spark alter table this would probably give a false positive, but that probably doesn't matter since even then most of the time the iceberg parser is a superset of the delegate parser. Does that sound right or am I off base with the assumption this block?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

RussellSpitzer commented on a change in pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#discussion_r545886729



##########
File path: spark3-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala
##########
@@ -94,13 +94,20 @@ class IcebergSparkSqlExtensionsParser(delegate: ParserInterface) extends ParserI
    */
   override def parsePlan(sqlText: String): LogicalPlan = {
     val sqlTextAfterSubstitution = substitutor.substitute(sqlText)
-    if (sqlTextAfterSubstitution.toLowerCase(Locale.ROOT).trim().startsWith("call")) {
+    if (isIcebergCommand(sqlTextAfterSubstitution)) {
       parse(sqlTextAfterSubstitution) { parser => astBuilder.visit(parser.singleStatement()) }.asInstanceOf[LogicalPlan]
     } else {
       delegate.parsePlan(sqlText)
     }
   }
 
+  private def isIcebergCommand(sqlText: String): Boolean = {
+    val normalized = sqlText.toLowerCase(Locale.ROOT).trim()
+    normalized.startsWith("call") ||
+        (normalized.startsWith("alter table") && (
+            normalized.contains("add partition field") || normalized.contains("drop partition field")))

Review comment:
       I was thinking about this before and I didn't want it to be that specific. I think maybe a cool solution could be trying both, and checking which parser parsed the furthest. Each of our analysis exceptions should give a position, so we can just pick whoever parsed the most. How does that sound?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] rdblue commented on a change in pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#discussion_r546004567



##########
File path: spark3-extensions/src/main/scala/org/apache/iceberg/spark/extensions/IcebergSparkSessionExtensions.scala
##########
@@ -37,6 +37,6 @@ class IcebergSparkSessionExtensions extends (SparkSessionExtensions => Unit) {
     // TODO: PullupCorrelatedPredicates should handle row-level operations
     extensions.injectOptimizerRule { _ => PullupCorrelatedPredicatesInRowLevelOperations }
     extensions.injectOptimizerRule { _ => RewriteDelete }
-    extensions.injectPlannerStrategy { _ => ExtendedDataSourceV2Strategy }
+    extensions.injectPlannerStrategy { spark => ExtendedDataSourceV2Strategy(spark) }

Review comment:
       This follows what we do above, and it seemed reasonably clean to me. I guess we could just pass `ExtendedDataSourceV2Strategy` and have apply automatically called.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

RussellSpitzer commented on a change in pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#discussion_r546002712



##########
File path: spark3-extensions/src/main/scala/org/apache/iceberg/spark/extensions/IcebergSparkSessionExtensions.scala
##########
@@ -37,6 +37,6 @@ class IcebergSparkSessionExtensions extends (SparkSessionExtensions => Unit) {
     // TODO: PullupCorrelatedPredicates should handle row-level operations
     extensions.injectOptimizerRule { _ => PullupCorrelatedPredicatesInRowLevelOperations }
     extensions.injectOptimizerRule { _ => RewriteDelete }
-    extensions.injectPlannerStrategy { _ => ExtendedDataSourceV2Strategy }
+    extensions.injectPlannerStrategy { spark => ExtendedDataSourceV2Strategy(spark) }

Review comment:
       Did the apply method not resolve here? Wondering why we needed to modify the anonymous arg




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

RussellSpitzer commented on a change in pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#discussion_r546011050



##########
File path: spark3-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropPartitionFieldExec.scala
##########
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import org.apache.iceberg.spark.Spark3Util
+import org.apache.iceberg.spark.source.SparkTable
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.Attribute
+import org.apache.spark.sql.connector.catalog.Identifier
+import org.apache.spark.sql.connector.catalog.TableCatalog
+import org.apache.spark.sql.connector.expressions.FieldReference
+import org.apache.spark.sql.connector.expressions.IdentityTransform
+import org.apache.spark.sql.connector.expressions.Transform
+
+case class DropPartitionFieldExec(
+    catalog: TableCatalog,
+    ident: Identifier,
+    transform: Transform) extends V2CommandExec {
+  import org.apache.spark.sql.connector.catalog.CatalogV2Implicits._
+
+  override lazy val output: Seq[Attribute] = Nil
+
+  override protected def run(): Seq[InternalRow] = {
+    catalog.loadTable(ident) match {
+      case iceberg: SparkTable =>
+        val schema = iceberg.table.schema
+        transform match {
+          case IdentityTransform(FieldReference(parts)) if parts.size == 1 && schema.findField(parts.head) == null =>
+            // the name is not present in the Iceberg schema, so it must be a partition field name, not a column name
+            iceberg.table.updateSpec()
+                .removeField(parts.head)
+                .commit()
+
+          case _ =>
+            iceberg.table.updateSpec()
+                .removeField(Spark3Util.toIcebergTerm(transform))
+                .commit()
+        }
+
+

Review comment:
       extra newline?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] rdblue commented on a change in pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#discussion_r546019382



##########
File path: spark3-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ExtendedDataSourceV2Strategy.scala
##########
@@ -53,4 +76,18 @@ object ExtendedDataSourceV2Strategy extends Strategy {
     }
     new GenericInternalRow(values)
   }
+
+  private object IcebergCatalogAndIdentifier {
+    def unapply(identifier: Seq[String]): Option[(TableCatalog, Identifier)] = {
+      val catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, identifier.asJava)
+      catalogAndIdentifier.catalog match {
+        case icebergCatalog: SparkCatalog =>
+          Some((icebergCatalog, catalogAndIdentifier.identifier))
+        case icebergCatalog: SparkSessionCatalog[_] =>
+          Some((icebergCatalog, catalogAndIdentifier.identifier))

Review comment:
       I'm pretty sure they are needed. I've hit issues in the past with this, at least in Scala 2.11.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] rdblue merged pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

rdblue merged pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] rdblue commented on pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

rdblue commented on pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#issuecomment-748366564


   Thanks for reviewing, @holdenk, @RussellSpitzer, and @aokolnychyi! I merged this.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] holdenk commented on a change in pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

holdenk commented on a change in pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#discussion_r546004938



##########
File path: spark3-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala
##########
@@ -94,13 +94,20 @@ class IcebergSparkSqlExtensionsParser(delegate: ParserInterface) extends ParserI
    */
   override def parsePlan(sqlText: String): LogicalPlan = {
     val sqlTextAfterSubstitution = substitutor.substitute(sqlText)
-    if (sqlTextAfterSubstitution.toLowerCase(Locale.ROOT).trim().startsWith("call")) {
+    if (isIcebergCommand(sqlTextAfterSubstitution)) {
       parse(sqlTextAfterSubstitution) { parser => astBuilder.visit(parser.singleStatement()) }.asInstanceOf[LogicalPlan]
     } else {
       delegate.parsePlan(sqlText)
     }
   }
 
+  private def isIcebergCommand(sqlText: String): Boolean = {
+    val normalized = sqlText.toLowerCase(Locale.ROOT).trim()
+    normalized.startsWith("call") ||
+        (normalized.startsWith("alter table") && (
+            normalized.contains("add partition field") || normalized.contains("drop partition field")))

Review comment:
       Same, I don't think this needs to be addressed here, I was just bringing it up to make sure my understanding was correct.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] rdblue commented on a change in pull request #1948: Spark: Add SQL commands evolve partition specs

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #1948:
URL: https://github.com/apache/iceberg/pull/1948#discussion_r545514306



##########
File path: spark3-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala
##########
@@ -94,13 +94,20 @@ class IcebergSparkSqlExtensionsParser(delegate: ParserInterface) extends ParserI
    */
   override def parsePlan(sqlText: String): LogicalPlan = {
     val sqlTextAfterSubstitution = substitutor.substitute(sqlText)
-    if (sqlTextAfterSubstitution.toLowerCase(Locale.ROOT).trim().startsWith("call")) {
+    if (isIcebergCommand(sqlTextAfterSubstitution)) {
       parse(sqlTextAfterSubstitution) { parser => astBuilder.visit(parser.singleStatement()) }.asInstanceOf[LogicalPlan]
     } else {
       delegate.parsePlan(sqlText)
     }
   }
 
+  private def isIcebergCommand(sqlText: String): Boolean = {
+    val normalized = sqlText.toLowerCase(Locale.ROOT).trim()
+    normalized.startsWith("call") ||
+        (normalized.startsWith("alter table") && (
+            normalized.contains("add partition field") || normalized.contains("drop partition field")))

Review comment:
       I think that the "add partition" syntax requires a partition "spec" that is something like `(a=1, b=2)` so the parentheses should prevent this from catching "add partition" commands.
   
   That said, we used to fall back to the Spark parser whenever something couldn't be parsed by this parser. I'm not sure whether we want to move back to that or do something more complicated. One option is try the Iceberg parser, then the Spark parser, and then check the Spark parser's exception. If it complains about `CALL` or `FIELD` then the exception from the Iceberg parser should be used, otherwise it should re-throw Spark's exception.
   
   @RussellSpitzer, any ideas here?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org