You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/31 17:39:44 UTC

[GitHub] [spark] MaxGekk opened a new pull request, #37744: [WIP][SQL] Migrate onto the `DATATYPE_MISMATCH` error classes

MaxGekk opened a new pull request, #37744:
URL: https://github.com/apache/spark/pull/37744

   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
     8. If you want to add or modify an error type or message, please read the guideline first in
        'core/src/main/resources/error/README.md'.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes.
   
   ### How was this patch tested?
   By running the affected test suites:
   ```
   $ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"
   $ build/sbt "core/testOnly *SparkThrowableSuite"
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r963270773


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {
+    "message" : [
+      "Cannot resolve <sqlExpr> due to data type mismatch:"
+    ],
+    "subClass" : {
+      "BINARY_OP_DIFF_TYPES" : {
+        "message" : [
+          "the left and right arguments of the binary operator have different types (<left> and <right>)."
+        ]
+      },
+      "BINARY_OP_WRONG_TYPE" : {
+        "message" : [
+          "the binary operator requires the input type <inputType>, not <leftDataType>."

Review Comment:
   Actually I wrote this line of code: https://github.com/apache/spark/pull/7420/files#r34704858
   
   My intention was not to ask users to fix the left operand only.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srielau commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
srielau commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r961701777


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {
+    "message" : [
+      "Cannot resolve <sqlExpr> due to data type mismatch:"
+    ],
+    "subClass" : {
+      "BINARY_OP_DIFF_TYPES" : {
+        "message" : [
+          "the left and right arguments of the binary operator have different types (<left> and <right>)."

Review Comment:
   "If the two inputs have different types, the analyzer will find the tightest common type and do the proper type casting."
   We always need to look at errors from a user's perspective...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on PR #37744:
URL: https://github.com/apache/spark/pull/37744#issuecomment-1233890597

   also cc @srielau @anchovYu Could you take a look at the PR which introduces new error classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on PR #37744:
URL: https://github.com/apache/spark/pull/37744#issuecomment-1237855327

   Merging to master. Thank you, @srielau @cloud-fan for review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on PR #37744:
URL: https://github.com/apache/spark/pull/37744#issuecomment-1234048716

   The test failure is not related to this PR, I believe:
   ```
   YarnClusterSuite.run Spark in yarn-client mode with different configurations, ensuring redaction
   ```
   @cloud-fan @HyukjinKwon @gengliangwang Could you review the PR, please. It migrates a bunch of AnalysisException in golden files to error classes. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r961376892


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/package.scala:
##########
@@ -53,6 +56,14 @@ package object analysis {
         messageParameters = messageParameters,
         origin = t.origin)
     }
+
+    def dataTypeMismatch(expr: Expression, mismatch: DataTypeMismatch): Nothing = {

Review Comment:
   Yes, I have count at least 155 places.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r961656263


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {
+    "message" : [
+      "Cannot resolve <sqlExpr> due to data type mismatch:"
+    ],
+    "subClass" : {
+      "BINARY_OP_DIFF_TYPES" : {
+        "message" : [
+          "the left and right arguments of the binary operator have different types (<left> and <right>)."
+        ]
+      },
+      "BINARY_OP_WRONG_TYPE" : {
+        "message" : [
+          "the binary operator requires the input type <inputType>, not <leftDataType>."

Review Comment:
   I think you just introduce unnecessary assumptions between checks, and confuse users. The error message says **precisely** that LEFT is operand has wrong type. Users need to cast it or do something only with the LEFT one. When you says ACTUAL, does the users have to check and/or fix both operands, or left, or right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srielau commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
srielau commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r960719687


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {
+    "message" : [
+      "Cannot resolve <sqlExpr> due to data type mismatch:"
+    ],
+    "subClass" : {
+      "BINARY_OP_DIFF_TYPES" : {
+        "message" : [
+          "the left and right arguments of the binary operator have different types (<left> and <right>)."
+        ]
+      },
+      "BINARY_OP_WRONG_TYPE" : {
+        "message" : [
+          "the binary operator requires the input type <inputType>, not <leftDataType>."

Review Comment:
   Are you sure this is about the LEFT type?
   Aren't those non-symetrical operators deriving the right type from the left type?
    



##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala:
##########
@@ -47,14 +47,23 @@ class ExpressionTypeCheckingSuite extends SparkFunSuite with SQLHelper {
     assert(e.getMessage.contains(errorMessage))
   }
 
+  def assertTypeMismatch(expr: Expression, errorMessage: String): Unit = {
+    val e = intercept[AnalysisException] {
+      assertSuccess(expr)
+    }
+    assert(e.getMessage.contains(
+      s"""Cannot resolve "${expr.sql}" due to data type mismatch:"""))
+    assert(e.getMessage.contains(errorMessage))

Review Comment:
   Use checkError(), don't test the text



##########
sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala:
##########
@@ -2666,10 +2666,7 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
         val m = intercept[AnalysisException] {
           sql("SELECT * FROM t, S WHERE c = C")
         }.message
-        assert(
-          m.contains(
-            "cannot resolve '(spark_catalog.default.t.c = " +
-            "spark_catalog.default.S.C)' due to data type mismatch"))
+        assert(m.contains("""Cannot resolve "(c = C)" due to data type mismatch"""))

Review Comment:
   checkError()



##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {
+    "message" : [
+      "Cannot resolve <sqlExpr> due to data type mismatch:"
+    ],
+    "subClass" : {
+      "BINARY_OP_DIFF_TYPES" : {
+        "message" : [
+          "the left and right arguments of the binary operator have different types (<left> and <right>)."

Review Comment:
   ```suggestion
             "the left and right operands of the binary operator have incompatible types (<left> and <right>)."
   ```



##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {
+    "message" : [
+      "Cannot resolve <sqlExpr> due to data type mismatch:"
+    ],
+    "subClass" : {
+      "BINARY_OP_DIFF_TYPES" : {

Review Comment:
   Isn't. the problem that they are incompatible, rather than different? We will happily divide an integer by a float. We even divide an interval by a an integer....
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r963304845


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {
+    "message" : [
+      "Cannot resolve <sqlExpr> due to data type mismatch:"
+    ],
+    "subClass" : {
+      "BINARY_OP_DIFF_TYPES" : {
+        "message" : [
+          "the left and right arguments of the binary operator have different types (<left> and <right>)."
+        ]
+      },
+      "BINARY_OP_WRONG_TYPE" : {
+        "message" : [
+          "the binary operator requires the input type <inputType>, not <leftDataType>."

Review Comment:
   @cloud-fan Serge's comment has been addressed already.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r963269665


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {
+    "message" : [
+      "Cannot resolve <sqlExpr> due to data type mismatch:"
+    ],
+    "subClass" : {
+      "BINARY_OP_DIFF_TYPES" : {
+        "message" : [
+          "the left and right arguments of the binary operator have different types (<left> and <right>)."
+        ]
+      },
+      "BINARY_OP_WRONG_TYPE" : {
+        "message" : [
+          "the binary operator requires the input type <inputType>, not <leftDataType>."

Review Comment:
   Where do we mention LEFT to the end users? The internal code calls `left.dataType` because the check above guarantees the left and right are the same type.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r960982999


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala:
##########
@@ -47,14 +47,23 @@ class ExpressionTypeCheckingSuite extends SparkFunSuite with SQLHelper {
     assert(e.getMessage.contains(errorMessage))
   }
 
+  def assertTypeMismatch(expr: Expression, errorMessage: String): Unit = {
+    val e = intercept[AnalysisException] {
+      assertSuccess(expr)
+    }
+    assert(e.getMessage.contains(
+      s"""Cannot resolve "${expr.sql}" due to data type mismatch:"""))
+    assert(e.getMessage.contains(errorMessage))

Review Comment:
   done



##########
sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala:
##########
@@ -2666,10 +2666,7 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
         val m = intercept[AnalysisException] {
           sql("SELECT * FROM t, S WHERE c = C")
         }.message
-        assert(
-          m.contains(
-            "cannot resolve '(spark_catalog.default.t.c = " +
-            "spark_catalog.default.S.C)' due to data type mismatch"))
+        assert(m.contains("""Cannot resolve "(c = C)" due to data type mismatch"""))

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r960901497


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {
+    "message" : [
+      "Cannot resolve <sqlExpr> due to data type mismatch:"
+    ],
+    "subClass" : {
+      "BINARY_OP_DIFF_TYPES" : {
+        "message" : [
+          "the left and right arguments of the binary operator have different types (<left> and <right>)."

Review Comment:
   One of properties of any binary operator is to have the same types of its inputs, see
   https://github.com/apache/spark/blob/fab65f4e393359b9523e34457c877083dba0e7e5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala#L738-L742
   
   This is what the error says precisely. I am not sure that we should introduce the term of incompatibility here. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r960893478


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {
+    "message" : [
+      "Cannot resolve <sqlExpr> due to data type mismatch:"
+    ],
+    "subClass" : {
+      "BINARY_OP_DIFF_TYPES" : {

Review Comment:
   > the problem that they are incompatible, rather than different?
   
   This PR aims to migrate on error classes but not to change the type checking logic. In fact, we check that types of binary operator are the same, see in the master
   https://github.com/apache/spark/blob/fab65f4e393359b9523e34457c877083dba0e7e5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala#L761-L764



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r961658994


##########
sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala:
##########
@@ -2666,10 +2666,7 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
         val m = intercept[AnalysisException] {
           sql("SELECT * FROM t, S WHERE c = C")
         }.message
-        assert(
-          m.contains(
-            "cannot resolve '(spark_catalog.default.t.c = " +
-            "spark_catalog.default.S.C)' due to data type mismatch"))
+        assert(m.contains("""Cannot resolve "(c = C)" due to data type mismatch"""))

Review Comment:
   yep



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r960894747


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {
+    "message" : [
+      "Cannot resolve <sqlExpr> due to data type mismatch:"
+    ],
+    "subClass" : {
+      "BINARY_OP_DIFF_TYPES" : {
+        "message" : [
+          "the left and right arguments of the binary operator have different types (<left> and <right>)."
+        ]
+      },
+      "BINARY_OP_WRONG_TYPE" : {
+        "message" : [
+          "the binary operator requires the input type <inputType>, not <leftDataType>."

Review Comment:
   The binary op checks that operators have the same type, and then check that the left is compatible with expected input type, see
   https://github.com/apache/spark/blob/fab65f4e393359b9523e34457c877083dba0e7e5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala#L765-L767



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r961371867


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {

Review Comment:
   This is for errors when the input data types do not meet the requirement of the function/operator



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r960893478


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {
+    "message" : [
+      "Cannot resolve <sqlExpr> due to data type mismatch:"
+    ],
+    "subClass" : {
+      "BINARY_OP_DIFF_TYPES" : {

Review Comment:
   > the problem that they are incompatible, rather than different?
   
   This PR aims to migrate on error classes but not to change the type checking logic. In fact, we check that types of binary operands are the same, see in the master
   https://github.com/apache/spark/blob/fab65f4e393359b9523e34457c877083dba0e7e5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala#L761-L764



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r961375505


##########
sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala:
##########
@@ -2666,10 +2666,7 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
         val m = intercept[AnalysisException] {
           sql("SELECT * FROM t, S WHERE c = C")
         }.message
-        assert(
-          m.contains(
-            "cannot resolve '(spark_catalog.default.t.c = " +
-            "spark_catalog.default.S.C)' due to data type mismatch"))
+        assert(m.contains("""Cannot resolve "(c = C)" due to data type mismatch"""))

Review Comment:
   is it really done?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r961371493


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {

Review Comment:
   ```suggestion
     "INPUT_DATATYPE_MISMATCH" : {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk closed pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk closed pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class
URL: https://github.com/apache/spark/pull/37744


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r960894747


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {
+    "message" : [
+      "Cannot resolve <sqlExpr> due to data type mismatch:"
+    ],
+    "subClass" : {
+      "BINARY_OP_DIFF_TYPES" : {
+        "message" : [
+          "the left and right arguments of the binary operator have different types (<left> and <right>)."
+        ]
+      },
+      "BINARY_OP_WRONG_TYPE" : {
+        "message" : [
+          "the binary operator requires the input type <inputType>, not <leftDataType>."

Review Comment:
   First, the binary op checks that operators have the same type, and then check that the left is compatible with expected input type, see
   https://github.com/apache/spark/blob/fab65f4e393359b9523e34457c877083dba0e7e5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala#L765-L767



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r961372897


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {
+    "message" : [
+      "Cannot resolve <sqlExpr> due to data type mismatch:"
+    ],
+    "subClass" : {
+      "BINARY_OP_DIFF_TYPES" : {
+        "message" : [
+          "the left and right arguments of the binary operator have different types (<left> and <right>)."
+        ]
+      },
+      "BINARY_OP_WRONG_TYPE" : {
+        "message" : [
+          "the binary operator requires the input type <inputType>, not <leftDataType>."

Review Comment:
   I think this assumes the left and right are already the same type, maybe `<actualDataType>` is better here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r961374501


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/package.scala:
##########
@@ -53,6 +56,14 @@ package object analysis {
         messageParameters = messageParameters,
         origin = t.origin)
     }
+
+    def dataTypeMismatch(expr: Expression, mismatch: DataTypeMismatch): Nothing = {

Review Comment:
   are we going to call it in many places?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #37744: [SPARK-40300][SQL] Migrate onto the `DATATYPE_MISMATCH` error class

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37744:
URL: https://github.com/apache/spark/pull/37744#discussion_r961379013


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -75,6 +75,23 @@
       "The value <str> (<fmt>) cannot be converted to <targetType> because it is malformed. Correct the value as per the syntax, or change its format. Use <suggestion> to tolerate malformed input and return NULL instead."
     ]
   },
+  "DATATYPE_MISMATCH" : {

Review Comment:
   The error class + subclass could be long. Since it will be used in docs as a header/title/ref, I don't think that adding the word INPUT can bring any benefits in error class meaning. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org