You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/11/08 03:01:16 UTC

[GitHub] [spark] HyukjinKwon opened a new pull request, #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

HyukjinKwon opened a new pull request, #38547:
URL: https://github.com/apache/spark/pull/38547

   ### What changes were proposed in this pull request?
   
   This PR disables ANSI at the test case for DSv2 because such partition casting wasn't supported in the legacy behaviour as well (with ANSI mode on).
   
   ### Why are the changes needed?
   
   To recover the ANSI enabled tests. Currently it fails as below:
   
   https://github.com/apache/spark/actions/runs/3406894388/jobs/5665999638
   
   ```
   - ALTER TABLE .. ADD PARTITION using V2 catalog V2 command: SPARK-40798: Alter partition should verify partition value - legacy *** FAILED *** (18 milliseconds)
     org.apache.spark.SparkNumberFormatException: [CAST_INVALID_INPUT] The value 'aaa' of the type "STRING" cannot be cast to "INT" because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
   == SQL(line 1, position 1) ==
   ALTER TABLE test_catalog.ns.tbl ADD PARTITION (p='aaa')
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     at org.apache.spark.sql.errors.QueryExecutionErrors$.invalidInputInCastToNumberError(QueryExecutionErrors.scala:161)
     at org.apache.spark.sql.catalyst.util.UTF8StringUtils$.withException(UTF8StringUtils.scala:51)
     at org.apache.spark.sql.catalyst.util.UTF8StringUtils$.toIntExact(UTF8StringUtils.scala:34)
     at org.apache.spark.sql.catalyst.expressions.Cast.$anonfun$castToInt$2(Cast.scala:927)
     at org.apache.spark.sql.catalyst.expressions.Cast.$anonfun$castToInt$2$adapted(Cast.scala:927)
     at org.apache.spark.sql.catalyst.expressions.Cast.buildCast(Cast.scala:588)
     at org.apache.spark.sql.catalyst.expressions.Cast.$anonfun$castToInt$1(Cast.scala:927)
     at org.apache.spark.sql.catalyst.expressions.Cast.nullSafeEval(Cast.scala:1285)
     at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:526)
     at org.apache.spark.sql.catalyst.analysis.ResolvePartitionSpec$.$anonfun$convertToPartIdent$1(ResolvePartitionSpec.scala:83)
     at scala.collection.immutable.List.map(List.scala:293)
     at org.apache.spark.sql.catalyst.analysis.ResolvePartitionSpec$.convertToPartIdent(ResolvePartitionSpec.scala:79)
     at org.apache.spark.sql.catalyst.analysis.ResolvePartitionSpec$.org$apache$spark$sql$catalyst$analysis$ResolvePartitionSpec$$resolvePartitionSpec(ResolvePartitionSpec.scala:72)
     at org.apache.spark.sql.catalyst.analysis.ResolvePartitionSpec$$anonfun$apply$2$$anonfun$applyOrElse$1.applyOrElse(ResolvePartitionSpec.scala:49)
     at org.apache.spark.sql.catalyst.analysis.ResolvePartitionSpec$$anonfun$apply$2$$anonfun$applyOrElse$1.applyOrElse(ResolvePartitionSpec.scala:42)
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   
   No, test-only.
   
   ### How was this patch tested?
   
   Manually tested locally.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #38547:
URL: https://github.com/apache/spark/pull/38547#discussion_r1016106782


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala:
##########
@@ -129,7 +129,9 @@ class AlterTableAddPartitionSuite
     withNamespaceAndTable("ns", "tbl") { t =>
       sql(s"CREATE TABLE $t (c int) $defaultUsing PARTITIONED BY (p int)")
 
-      withSQLConf(SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true") {
+      withSQLConf(
+          SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true",
+          SQLConf.ANSI_ENABLED.key -> "false") {

Review Comment:
   It passes if the validation is skipped. DSv2 it never worked with/without ANSI (but the exception message was different).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] ulysses-you commented on a diff in pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on code in PR #38547:
URL: https://github.com/apache/spark/pull/38547#discussion_r1016108654


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala:
##########
@@ -129,7 +129,9 @@ class AlterTableAddPartitionSuite
     withNamespaceAndTable("ns", "tbl") { t =>
       sql(s"CREATE TABLE $t (c int) $defaultUsing PARTITIONED BY (p int)")
 
-      withSQLConf(SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true") {
+      withSQLConf(
+          SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true",
+          SQLConf.ANSI_ENABLED.key -> "false") {

Review Comment:
   I confirmed it, previous it would fail and the reason is same with this pr. We need explicitly specify the ansi mode with on/off to test..



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] ulysses-you commented on a diff in pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on code in PR #38547:
URL: https://github.com/apache/spark/pull/38547#discussion_r1016114358


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala:
##########
@@ -129,7 +129,9 @@ class AlterTableAddPartitionSuite
     withNamespaceAndTable("ns", "tbl") { t =>
       sql(s"CREATE TABLE $t (c int) $defaultUsing PARTITIONED BY (p int)")
 
-      withSQLConf(SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true") {
+      withSQLConf(
+          SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true",
+          SQLConf.ANSI_ENABLED.key -> "false") {

Review Comment:
   one nit is, before DSv1 would not do cast, just use original partition spec



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #38547:
URL: https://github.com/apache/spark/pull/38547#discussion_r1016111632


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala:
##########
@@ -129,7 +129,9 @@ class AlterTableAddPartitionSuite
     withNamespaceAndTable("ns", "tbl") { t =>
       sql(s"CREATE TABLE $t (c int) $defaultUsing PARTITIONED BY (p int)")
 
-      withSQLConf(SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true") {
+      withSQLConf(
+          SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true",
+          SQLConf.ANSI_ENABLED.key -> "false") {

Review Comment:
   **Before:**
   
   
   DSv1
   
   | `spark.sql.ansi.enabled` | |
   | --- | --- |
   | `false` | Casted |
   | `true` | Casted |
   
   DSv2
   
   | `spark.sql.ansi.enabled` | |
   | --- | --- |
   | `false` | Casted |
   | `true` | Error |
   
   **After:**
   
   DSv1
   
   | `spark.sql.ansi.enabled` | `spark.sql.legacy.skipTypeValidationOnAlterPartition` | |
   | --- | --- | --- |
   | `false` | `true` | Casted |
   | `true` | `true` | Casted |
   | `false` | `false` | follows `spark.sql.storeAssignmentPolicy` |
   | `true` | `false` | follows `spark.sql.storeAssignmentPolicy` |
   
   DSv2
   
   | `spark.sql.ansi.enabled` | `spark.sql.legacy.skipTypeValidationOnAlterPartition` | |
   | --- | --- | --- |
   | `false` | `true` | Casted |
   | `true` | `true` | Error |
   | `false` | `false` | follows `spark.sql.storeAssignmentPolicy` |
   | `true` | `false` | follows `spark.sql.storeAssignmentPolicy` |
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] ulysses-you commented on a diff in pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on code in PR #38547:
URL: https://github.com/apache/spark/pull/38547#discussion_r1016463908


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala:
##########
@@ -129,7 +129,9 @@ class AlterTableAddPartitionSuite
     withNamespaceAndTable("ns", "tbl") { t =>
       sql(s"CREATE TABLE $t (c int) $defaultUsing PARTITIONED BY (p int)")
 
-      withSQLConf(SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true") {
+      withSQLConf(
+          SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true",
+          SQLConf.ANSI_ENABLED.key -> "false") {

Review Comment:
   @MaxGekk this pr aims to recover ANSI CI eagerly. I will do a followup to improve the test converage. It's my bad..



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #38547:
URL: https://github.com/apache/spark/pull/38547#discussion_r1016411170


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala:
##########
@@ -129,7 +129,9 @@ class AlterTableAddPartitionSuite
     withNamespaceAndTable("ns", "tbl") { t =>
       sql(s"CREATE TABLE $t (c int) $defaultUsing PARTITIONED BY (p int)")
 
-      withSQLConf(SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true") {
+      withSQLConf(
+          SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true",
+          SQLConf.ANSI_ENABLED.key -> "false") {

Review Comment:
   I didn't get why we shouldn't test `SQLConf.ANSI_ENABLED.key -> "true"`? Even if some error happens, need to check it using `checkError`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on PR #38547:
URL: https://github.com/apache/spark/pull/38547#issuecomment-1306555836

   cc @cloud-fan @ulysses-you @gengliangwang FYI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] ulysses-you commented on a diff in pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on code in PR #38547:
URL: https://github.com/apache/spark/pull/38547#discussion_r1016113801


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala:
##########
@@ -129,7 +129,9 @@ class AlterTableAddPartitionSuite
     withNamespaceAndTable("ns", "tbl") { t =>
       sql(s"CREATE TABLE $t (c int) $defaultUsing PARTITIONED BY (p int)")
 
-      withSQLConf(SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true") {
+      withSQLConf(
+          SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true",
+          SQLConf.ANSI_ENABLED.key -> "false") {

Review Comment:
   @HyukjinKwon yeah, this is a clear behavior matrix. thank you ! the behavior is a bit complex though ..



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #38547:
URL: https://github.com/apache/spark/pull/38547#discussion_r1016111812


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala:
##########
@@ -129,7 +129,9 @@ class AlterTableAddPartitionSuite
     withNamespaceAndTable("ns", "tbl") { t =>
       sql(s"CREATE TABLE $t (c int) $defaultUsing PARTITIONED BY (p int)")
 
-      withSQLConf(SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true") {
+      withSQLConf(
+          SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true",
+          SQLConf.ANSI_ENABLED.key -> "false") {

Review Comment:
   cc @ulysses-you to confirm ^



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon closed pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2
URL: https://github.com/apache/spark/pull/38547


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on PR #38547:
URL: https://github.com/apache/spark/pull/38547#issuecomment-1307039107

   Merged to master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #38547:
URL: https://github.com/apache/spark/pull/38547#discussion_r1016100975


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala:
##########
@@ -129,7 +129,9 @@ class AlterTableAddPartitionSuite
     withNamespaceAndTable("ns", "tbl") { t =>
       sql(s"CREATE TABLE $t (c int) $defaultUsing PARTITIONED BY (p int)")
 
-      withSQLConf(SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true") {
+      withSQLConf(
+          SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true",
+          SQLConf.ANSI_ENABLED.key -> "false") {

Review Comment:
   what's the behavior of old Spark versions when ansi mode is on? @ulysses-you 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org