You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/10/14 10:08:02 UTC

[GitHub] [spark] ulysses-you opened a new pull request, #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

ulysses-you opened a new pull request, #38257:
URL: https://github.com/apache/spark/pull/38257

   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
     8. If you want to add or modify an error type or message, please read the guideline first in
        'core/src/main/resources/error/README.md'.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   extract the check insertion field cast methold so that we can do validate patition value at PartitioningUtils.normalizePartitionSpec
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   Insertion follow the behavior of config `spark.sql.storeAssignmentPolicy`, which will fail if the value can not cast to target data type by default. Alter partition should also follow it. For example:
   ```SQL
   CREATE TABLE t (c int) USING PARQUET PARTITIONED BY(p int);
   
   -- This DDL should fail but worked:
   ALTER TABLE t ADD PARTITION(p='aaa'); 
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   yes,  the added partition value will follow the behavior of  `storeAssignmentPolicy`.
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   If benchmark tests were added, please run the benchmarks in GitHub Actions for the consistent environment, and the instructions could accord to: https://spark.apache.org/developer-tools.html#github-workflow-benchmarks.
   -->
   add test


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on PR #38257:
URL: https://github.com/apache/spark/pull/38257#issuecomment-1281834266

   @ulysses-you Thanks for the ping. Could you add a legacy flag and migration guide?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #38257:
URL: https://github.com/apache/spark/pull/38257#discussion_r1023453997


##########
docs/sql-migration-guide.md:
##########
@@ -34,6 +34,7 @@ license: |
     - Valid hexadecimal strings should include only allowed symbols (0-9A-Fa-f).
     - Valid values for `fmt` are case-insensitive `hex`, `base64`, `utf-8`, `utf8`.
   - Since Spark 3.4, Spark throws only `PartitionsAlreadyExistException` when it creates partitions but some of them exist already. In Spark 3.3 or earlier, Spark can throw either `PartitionsAlreadyExistException` or `PartitionAlreadyExistsException`.
+  - Since Spark 3.4, Spark will do validation for partition spec in ALTER PARTITION to follow the behavior of `spark.sql.storeAssignmentPolicy` which may cause an exception if type conversion fails, e.g. `ALTER TABLE .. ADD PARTITION(p='a')` if column `p` is int type. To restore the legacy behavior, set `spark.sql.legacy.skipPartitionSpecTypeValidation` to `true`.

Review Comment:
   nice catch



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on a diff in pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on code in PR #38257:
URL: https://github.com/apache/spark/pull/38257#discussion_r998785985


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -2987,6 +2987,16 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
+  val SKIP_PARTITION_SPEC_TYPE_VALIDATION =

Review Comment:
   ```suggestion
     val SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION =
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] ulysses-you commented on pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on PR #38257:
URL: https://github.com/apache/spark/pull/38257#issuecomment-1281688522

   also cc @MaxGekk @cloud-fan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on a diff in pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on code in PR #38257:
URL: https://github.com/apache/spark/pull/38257#discussion_r998785579


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -2987,6 +2987,16 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
+  val SKIP_PARTITION_SPEC_TYPE_VALIDATION =
+    buildConf("spark.sql.legacy.skipPartitionSpecTypeValidation")

Review Comment:
   ```suggestion
       buildConf("spark.sql.legacy.skipValidationOnAlterPartitions")
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on a diff in pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on code in PR #38257:
URL: https://github.com/apache/spark/pull/38257#discussion_r998786234


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -2987,6 +2987,16 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
+  val SKIP_PARTITION_SPEC_TYPE_VALIDATION =
+    buildConf("spark.sql.legacy.skipPartitionSpecTypeValidation")

Review Comment:
   ```suggestion
       buildConf("spark.sql.legacy.skipTypeValidationOnAlterPartition")
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] ulysses-you commented on pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on PR #38257:
URL: https://github.com/apache/spark/pull/38257#issuecomment-1282042809

   @gengliangwang added legacy config and migration guide


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on PR #38257:
URL: https://github.com/apache/spark/pull/38257#issuecomment-1280219476

   cc @gengliangwang 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan closed pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy
URL: https://github.com/apache/spark/pull/38257


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #38257:
URL: https://github.com/apache/spark/pull/38257#discussion_r998325097


##########
docs/sql-migration-guide.md:
##########
@@ -34,6 +34,7 @@ license: |
     - Valid hexadecimal strings should include only allowed symbols (0-9A-Fa-f).
     - Valid values for `fmt` are case-insensitive `hex`, `base64`, `utf-8`, `utf8`.
   - Since Spark 3.4, Spark throws only `PartitionsAlreadyExistException` when it creates partitions but some of them exist already. In Spark 3.3 or earlier, Spark can throw either `PartitionsAlreadyExistException` or `PartitionAlreadyExistsException`.
+  - Since Spark 3.4, Spark will do validation for partition spec follows the behavior of `spark.sql.storeAssignmentPolicy` which may cause exception if type conversion error, e.g. `ALTER TABLE .. ADD PARTITION(p='a')` if column p is int type. To restore the legacy behavior, set `spark.sql.legacy.skipPartitionSpecTypeValidation` to `true`.

Review Comment:
   ```suggestion
     - Since Spark 3.4, Spark will do validation for partition spec in ADD PARTITION to follow the behavior of `spark.sql.storeAssignmentPolicy` which may cause an exception if type conversion fails, e.g. `ALTER TABLE .. ADD PARTITION(p='a')` if column `p` is int type. To restore the legacy behavior, set `spark.sql.legacy.skipPartitionSpecTypeValidation` to `true`.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #38257:
URL: https://github.com/apache/spark/pull/38257#discussion_r998330332


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/util/PartitioningUtils.scala:
##########
@@ -20,14 +20,47 @@ package org.apache.spark.sql.util
 import org.apache.spark.sql.catalyst.analysis.Resolver
 import org.apache.spark.sql.catalyst.catalog.CatalogTypes.TablePartitionSpec
 import org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils.DEFAULT_PARTITION_NAME
+import org.apache.spark.sql.catalyst.expressions.{Cast, Expression, Literal}
 import org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils
 import org.apache.spark.sql.catalyst.util.CharVarcharUtils
 import org.apache.spark.sql.errors.QueryCompilationErrors
 import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.types.{CharType, StructType, VarcharType}
+import org.apache.spark.sql.internal.SQLConf.StoreAssignmentPolicy
+import org.apache.spark.sql.types.{CharType, DataType, StringType, StructField, StructType, VarcharType}
 import org.apache.spark.unsafe.types.UTF8String
 
 private[sql] object PartitioningUtils {
+
+  def castPartitionSpec(value: String, dt: DataType, conf: SQLConf): Expression = {
+    conf.storeAssignmentPolicy match {
+      // SPARK-30844: try our best to follow StoreAssignmentPolicy for static partition
+      // values but not completely follow because we can't do static type checking due to
+      // the reason that the parser has erased the type info of static partition values
+      // and converted them to string.
+      case StoreAssignmentPolicy.ANSI | StoreAssignmentPolicy.STRICT =>
+        val cast = Cast(Literal(value), dt, Option(conf.sessionLocalTimeZone),
+          ansiEnabled = true)
+        cast.setTagValue(Cast.BY_TABLE_INSERTION, ())
+        cast
+      case _ =>
+        Cast(Literal(value), dt, Option(conf.sessionLocalTimeZone),
+          ansiEnabled = false)
+    }
+  }
+
+  private def castPartitionSpecToString(value: String, field: StructField): String = {

Review Comment:
   the name is weird as the input is already a `String`. How about `normalizePartitionStringValue`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #38257:
URL: https://github.com/apache/spark/pull/38257#discussion_r998331365


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/command/AlterTableAddPartitionSuiteBase.scala:
##########
@@ -209,4 +211,37 @@ trait AlterTableAddPartitionSuiteBase extends QueryTest with DDLCommandTestUtils
           Row(Period.ofYears(1), Duration.ofDays(-1), "bbb")))
     }
   }
+
+  test("SPARK-40798: Alter partition should verify partition value") {
+    def shouldThrowException(policy: SQLConf.StoreAssignmentPolicy.Value): Boolean = policy match {
+      case SQLConf.StoreAssignmentPolicy.ANSI | SQLConf.StoreAssignmentPolicy.STRICT =>
+        true
+      case SQLConf.StoreAssignmentPolicy.LEGACY =>
+        false
+    }
+
+    SQLConf.StoreAssignmentPolicy.values.foreach { policy =>
+      withNamespaceAndTable("ns", "tbl") { t =>
+        sql(s"CREATE TABLE $t (c int) $defaultUsing PARTITIONED BY (p int)")
+
+        withSQLConf(SQLConf.STORE_ASSIGNMENT_POLICY.key -> policy.toString) {
+          if (shouldThrowException(policy)) {
+            val errMsg = intercept[SparkNumberFormatException] {
+              sql(s"ALTER TABLE $t ADD PARTITION (p='aaa')")
+            }.getMessage
+            assert(errMsg.contains(

Review Comment:
   let's use `checkError`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] ulysses-you commented on pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on PR #38257:
URL: https://github.com/apache/spark/pull/38257#issuecomment-1283314858

   @cloud-fan @gengliangwang addressed comments


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on PR #38257:
URL: https://github.com/apache/spark/pull/38257#issuecomment-1288532592

   thanks, merging to master!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #38257:
URL: https://github.com/apache/spark/pull/38257#discussion_r1023455115


##########
docs/sql-migration-guide.md:
##########
@@ -34,6 +34,7 @@ license: |
     - Valid hexadecimal strings should include only allowed symbols (0-9A-Fa-f).
     - Valid values for `fmt` are case-insensitive `hex`, `base64`, `utf-8`, `utf8`.
   - Since Spark 3.4, Spark throws only `PartitionsAlreadyExistException` when it creates partitions but some of them exist already. In Spark 3.3 or earlier, Spark can throw either `PartitionsAlreadyExistException` or `PartitionAlreadyExistsException`.
+  - Since Spark 3.4, Spark will do validation for partition spec in ALTER PARTITION to follow the behavior of `spark.sql.storeAssignmentPolicy` which may cause an exception if type conversion fails, e.g. `ALTER TABLE .. ADD PARTITION(p='a')` if column `p` is int type. To restore the legacy behavior, set `spark.sql.legacy.skipPartitionSpecTypeValidation` to `true`.

Review Comment:
   just made a PR :-) https://github.com/apache/spark/pull/38667



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] ulysses-you commented on a diff in pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on code in PR #38257:
URL: https://github.com/apache/spark/pull/38257#discussion_r996900974


##########
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala:
##########
@@ -608,7 +608,7 @@ class HiveDDLSuite
   }
 
   test("SPARK-19129: drop partition with a empty string will drop the whole table") {
-    val df = spark.createDataFrame(Seq((0, "a"), (1, "b"))).toDF("partCol1", "name")
+    val df = spark.createDataFrame(Seq(("0", "a"), ("1", "b"))).toDF("partCol1", "name")

Review Comment:
   change partition column from int type to string to make the test work



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #38257:
URL: https://github.com/apache/spark/pull/38257#discussion_r998325549


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -2987,6 +2987,16 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
+  val SKIP_PARTITION_SPEC_TYPE_VALIDATION =
+    buildConf("spark.sql.legacy.skipPartitionSpecTypeValidation")
+      .internal()
+      .doc("When true, skip do validation for partition spec. E.g., " +

Review Comment:
   ```suggestion
         .doc("When true, skip validation for partition spec in ADD PARTITION. E.g., " +
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on a diff in pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on code in PR #38257:
URL: https://github.com/apache/spark/pull/38257#discussion_r998785008


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -2987,6 +2987,16 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
+  val SKIP_PARTITION_SPEC_TYPE_VALIDATION =

Review Comment:
   SKIP_ALTER_PARTITION_SPEC_TYPE_VALIDATION?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] anchovYu commented on a diff in pull request #38257: [SPARK-40798][SQL] Alter partition should verify value follow storeAssignmentPolicy

Posted by GitBox <gi...@apache.org>.
anchovYu commented on code in PR #38257:
URL: https://github.com/apache/spark/pull/38257#discussion_r1023195030


##########
docs/sql-migration-guide.md:
##########
@@ -34,6 +34,7 @@ license: |
     - Valid hexadecimal strings should include only allowed symbols (0-9A-Fa-f).
     - Valid values for `fmt` are case-insensitive `hex`, `base64`, `utf-8`, `utf8`.
   - Since Spark 3.4, Spark throws only `PartitionsAlreadyExistException` when it creates partitions but some of them exist already. In Spark 3.3 or earlier, Spark can throw either `PartitionsAlreadyExistException` or `PartitionAlreadyExistsException`.
+  - Since Spark 3.4, Spark will do validation for partition spec in ALTER PARTITION to follow the behavior of `spark.sql.storeAssignmentPolicy` which may cause an exception if type conversion fails, e.g. `ALTER TABLE .. ADD PARTITION(p='a')` if column `p` is int type. To restore the legacy behavior, set `spark.sql.legacy.skipPartitionSpecTypeValidation` to `true`.

Review Comment:
   The conf should be `spark.sql.legacy.skipTypeValidationOnAlterPartition`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org