You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/11/07 06:28:01 UTC

[GitHub] [spark] panbingkun opened a new pull request, #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

panbingkun opened a new pull request, #38531:
URL: https://github.com/apache/spark/pull/38531

   ### What changes were proposed in this pull request?
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] panbingkun commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
panbingkun commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1020758537


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala:
##########
@@ -1022,76 +1022,123 @@ class StringExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
   }
 
   test("ToNumber and ToCharacter: negative tests (the format string is invalid)") {
-    val unexpectedCharacter = "the structure of the format string must match: " +
-      "[MI|S] [$] [0|9|G|,]* [.|D] [0|9]* [$] [PR|MI|S]"
-    val thousandsSeparatorDigitsBetween =
-      "Thousands separators (, or G) must have digits in between them"
-    val mustBeAtEnd = "must be at the end of the number format"
-    val atMostOne = "At most one"
     Seq(
       // The format string must not be empty.
-      ("454", "") -> "The format string cannot be empty",
+      ("454", "") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_EMPTY",
+        messageParameters = Map.empty),
       // Make sure the format string does not contain any unrecognized characters.
-      ("454", "999@") -> unexpectedCharacter,
-      ("454", "999M") -> unexpectedCharacter,
-      ("454", "999P") -> unexpectedCharacter,
+      ("454", "999@") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_UNEXPECTED_FORMAT_TOKEN",
+        messageParameters = Map("formatToken" -> "character '@''", "numberFormat" -> "999@")),
+      ("454", "999M") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_UNEXPECTED_FORMAT_TOKEN",
+        messageParameters = Map("formatToken" -> "character 'M''", "numberFormat" -> "999M")),
+      ("454", "999P") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_UNEXPECTED_FORMAT_TOKEN",
+        messageParameters = Map("formatToken" -> "character 'P''", "numberFormat" -> "999P")),
       // Make sure the format string contains at least one digit.
-      ("454", "$") -> "The format string requires at least one number digit",
+      ("454", "$") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_WRONG_NUM_DIGIT",
+        messageParameters = Map.empty),
       // Make sure the format string contains at most one decimal point.
-      ("454", "99.99.99") -> atMostOne,
+      ("454", "99.99.99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_WRONG_NUM_TOKEN",
+        messageParameters = Map("token" -> ". or D", "numberFormat" -> "99.99.99")),
       // Make sure the format string contains at most one dollar sign.
-      ("454", "$$99") -> atMostOne,
+      ("454", "$$99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_WRONG_NUM_TOKEN",
+        messageParameters = Map("token" -> "$", "numberFormat" -> "$$99")),
       // Make sure the format string contains at most one minus sign at the beginning or end.
-      ("$4-4", "$9MI9") -> unexpectedCharacter,
-      ("--4", "SMI9") -> unexpectedCharacter,
-      ("--$54", "SS$99") -> atMostOne,
-      ("-$54", "MI$99MI") -> atMostOne,
-      ("$4-4", "$9MI9MI") -> atMostOne,
+      ("$4-4", "$9MI9") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_UNEXPECTED_FORMAT_TOKEN",
+        messageParameters = Map("formatToken" -> "digit sequence", "numberFormat" -> "$9MI9")),
+      ("--4", "SMI9") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_UNEXPECTED_FORMAT_TOKEN",
+        messageParameters = Map("formatToken" -> "digit sequence", "numberFormat" -> "SMI9")),
+      ("--$54", "SS$99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_WRONG_NUM_TOKEN",
+        messageParameters = Map("token" -> "S", "numberFormat" -> "SS$99")),
+      ("-$54", "MI$99MI") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_WRONG_NUM_TOKEN",
+        messageParameters = Map("token" -> "MI", "numberFormat" -> "MI$99MI")),
+      ("$4-4", "$9MI9MI") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_WRONG_NUM_TOKEN",
+        messageParameters = Map("token" -> "MI", "numberFormat" -> "$9MI9MI")),
       // Make sure the format string contains at most one closing angle bracket at the end.
-      ("<$45>", "PR$99") -> unexpectedCharacter,
-      ("$4<4>", "$9PR9") -> unexpectedCharacter,
-      ("<<454>>", "999PRPR") -> atMostOne,
+      ("<$45>", "PR$99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_UNEXPECTED_FORMAT_TOKEN",
+        messageParameters = Map("formatToken" -> "$", "numberFormat" -> "PR$99")),
+      ("$4<4>", "$9PR9") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_UNEXPECTED_FORMAT_TOKEN",
+        messageParameters = Map("formatToken" -> "digit sequence", "numberFormat" -> "$9PR9")),
+      ("<<454>>", "999PRPR") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_WRONG_NUM_TOKEN",
+        messageParameters = Map("token" -> "PR", "numberFormat" -> "999PRPR")),
       // Make sure that any dollar sign in the format string occurs before any digits.
-      ("4$54", "9$99") -> "Currency characters must appear before digits",
+      ("4$54", "9$99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DIGIT",
+        messageParameters = Map("numberFormat" -> "9$99")),
       // Make sure that any dollar sign in the format string occurs before any decimal point.
-      (".$99", ".$99") -> "Currency characters must appear before any decimal point",
+      (".$99", ".$99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL",
+        messageParameters = Map("numberFormat" -> ".$99")),
       // Thousands separators must have digits in between them.
-      (",123", ",099") -> thousandsSeparatorDigitsBetween,
-      (",123,456", ",999,099") -> thousandsSeparatorDigitsBetween,
-      (",,345", "9,,09.99") -> thousandsSeparatorDigitsBetween,
-      (",,345", "9,99,.99") -> thousandsSeparatorDigitsBetween,
-      (",,345", "9,99,") -> thousandsSeparatorDigitsBetween,
-      (",,345", ",,999,099.99") -> thousandsSeparatorDigitsBetween,
+      (",123", ",099") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN",
+        messageParameters = Map("numberFormat" -> ",099")),
+      (",123,456", ",999,099") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN",
+        messageParameters = Map("numberFormat" -> ",999,099")),
+      (",,345", "9,,09.99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN",
+        messageParameters = Map("numberFormat" -> "9,,09.99")),
+      (",,345", "9,99,.99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN",
+        messageParameters = Map("numberFormat" -> "9,99,.99")),
+      (",,345", "9,99,") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN",
+        messageParameters = Map("numberFormat" -> "9,99,")),
+      (",,345", ",,999,099.99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN",
+        messageParameters = Map("numberFormat" -> ",,999,099.99")),
       // Thousands separators must not appear after the decimal point.
-      ("123.45,6", "099.99,9") ->
-        "Thousands separators (, or G) may not appear after the decimal point"
-    ).foreach { case ((str: String, format: String), expectedErrMsg: String) =>
+      ("123.45,6", "099.99,9") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_BEFORE_DECIMAL",
+        messageParameters = Map("numberFormat" -> "099.99,9"))
+    ).foreach { case ((str: String, format: String), dataTypeMismatch: DataTypeMismatch) =>
       val toNumberResult = ToNumber(Literal(str), Literal(format)).checkInputDataTypes()
       assert(toNumberResult != TypeCheckResult.TypeCheckSuccess,
         s"The format string should have been invalid: $format")
-      toNumberResult match {
-        case TypeCheckResult.TypeCheckFailure(message) =>
-          assert(message.contains(expectedErrMsg))
-      }
+      assert(toNumberResult == dataTypeMismatch)
 
       val tryToNumberResult = TryToNumber(Literal(str), Literal(format)).checkInputDataTypes()
       assert(tryToNumberResult != TypeCheckResult.TypeCheckSuccess,
         s"The format string should have been invalid: $format")
-      tryToNumberResult match {
-        case TypeCheckResult.TypeCheckFailure(message) =>
-          assert(message.contains(expectedErrMsg))
-      }
+      assert(tryToNumberResult == dataTypeMismatch)
 
       val toCharResult = ToCharacter(Decimal(456), Literal(format)).checkInputDataTypes()
       assert(toCharResult != TypeCheckResult.TypeCheckSuccess,
         s"The format string should have been invalid: $format")
-      toCharResult match {
-        case TypeCheckResult.TypeCheckFailure(message) =>
-          assert(message.contains(expectedErrMsg))
-      }
+      assert(toCharResult == dataTypeMismatch)
     }
   }
 
+  test("ToCharacter: fails analysis if numberFormat is not a constant") {

Review Comment:
   Done



##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala:
##########
@@ -1134,6 +1181,21 @@ class StringExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
     }
   }
 
+  test("ToNumber: fails analysis if numberFormat is not a constant") {

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on PR #38531:
URL: https://github.com/apache/spark/pull/38531#issuecomment-1311696603

   also cc @dtenedor @cloud-fan @srielau @itholic 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1020213977


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DIGIT" : {
+        "message" : [
+          "Currency characters must appear before digits in the number format: '<numberFormat>'"

Review Comment:
   ditto: quoting of `numberFormat`



##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DIGIT" : {
+        "message" : [
+          "Currency characters must appear before digits in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_EMPTY" : {
+        "message" : [
+          "The format string cannot be empty"
+        ]
+      },
+      "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Thousands separators (, or G) may not appear after the decimal point in the number format: '<numberFormat>'"

Review Comment:
   ditto: quoting + dot at the end.



##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DIGIT" : {
+        "message" : [
+          "Currency characters must appear before digits in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_EMPTY" : {
+        "message" : [
+          "The format string cannot be empty"

Review Comment:
   nit: but for consistency w/ other error messages. Could you add a dot at the end.
   ```suggestion
             "The format string cannot be empty."
   ```



##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: '<numberFormat>'"

Review Comment:
   Let's quote the format as we do in other places:
   ```
   toSQLValue(pattern, StringType)
   ```
   see, for instance
   https://github.com/apache/spark/blob/0205478b9d35d62450fd7c9ade520087fd2979a7/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala#L1381



##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DIGIT" : {
+        "message" : [
+          "Currency characters must appear before digits in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_EMPTY" : {
+        "message" : [
+          "The format string cannot be empty"
+        ]
+      },
+      "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Thousands separators (, or G) may not appear after the decimal point in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN" : {

Review Comment:
   The name is too long. Is it possible to make it slightly shorter?



##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala:
##########
@@ -1134,6 +1181,21 @@ class StringExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
     }
   }
 
+  test("ToNumber: fails analysis if numberFormat is not a constant") {

Review Comment:
   nit: is not foldable



##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala:
##########
@@ -1022,76 +1022,123 @@ class StringExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
   }
 
   test("ToNumber and ToCharacter: negative tests (the format string is invalid)") {
-    val unexpectedCharacter = "the structure of the format string must match: " +
-      "[MI|S] [$] [0|9|G|,]* [.|D] [0|9]* [$] [PR|MI|S]"
-    val thousandsSeparatorDigitsBetween =
-      "Thousands separators (, or G) must have digits in between them"
-    val mustBeAtEnd = "must be at the end of the number format"
-    val atMostOne = "At most one"
     Seq(
       // The format string must not be empty.
-      ("454", "") -> "The format string cannot be empty",
+      ("454", "") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_EMPTY",
+        messageParameters = Map.empty),
       // Make sure the format string does not contain any unrecognized characters.
-      ("454", "999@") -> unexpectedCharacter,
-      ("454", "999M") -> unexpectedCharacter,
-      ("454", "999P") -> unexpectedCharacter,
+      ("454", "999@") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_UNEXPECTED_FORMAT_TOKEN",
+        messageParameters = Map("formatToken" -> "character '@''", "numberFormat" -> "999@")),
+      ("454", "999M") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_UNEXPECTED_FORMAT_TOKEN",
+        messageParameters = Map("formatToken" -> "character 'M''", "numberFormat" -> "999M")),
+      ("454", "999P") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_UNEXPECTED_FORMAT_TOKEN",
+        messageParameters = Map("formatToken" -> "character 'P''", "numberFormat" -> "999P")),
       // Make sure the format string contains at least one digit.
-      ("454", "$") -> "The format string requires at least one number digit",
+      ("454", "$") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_WRONG_NUM_DIGIT",
+        messageParameters = Map.empty),
       // Make sure the format string contains at most one decimal point.
-      ("454", "99.99.99") -> atMostOne,
+      ("454", "99.99.99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_WRONG_NUM_TOKEN",
+        messageParameters = Map("token" -> ". or D", "numberFormat" -> "99.99.99")),
       // Make sure the format string contains at most one dollar sign.
-      ("454", "$$99") -> atMostOne,
+      ("454", "$$99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_WRONG_NUM_TOKEN",
+        messageParameters = Map("token" -> "$", "numberFormat" -> "$$99")),
       // Make sure the format string contains at most one minus sign at the beginning or end.
-      ("$4-4", "$9MI9") -> unexpectedCharacter,
-      ("--4", "SMI9") -> unexpectedCharacter,
-      ("--$54", "SS$99") -> atMostOne,
-      ("-$54", "MI$99MI") -> atMostOne,
-      ("$4-4", "$9MI9MI") -> atMostOne,
+      ("$4-4", "$9MI9") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_UNEXPECTED_FORMAT_TOKEN",
+        messageParameters = Map("formatToken" -> "digit sequence", "numberFormat" -> "$9MI9")),
+      ("--4", "SMI9") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_UNEXPECTED_FORMAT_TOKEN",
+        messageParameters = Map("formatToken" -> "digit sequence", "numberFormat" -> "SMI9")),
+      ("--$54", "SS$99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_WRONG_NUM_TOKEN",
+        messageParameters = Map("token" -> "S", "numberFormat" -> "SS$99")),
+      ("-$54", "MI$99MI") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_WRONG_NUM_TOKEN",
+        messageParameters = Map("token" -> "MI", "numberFormat" -> "MI$99MI")),
+      ("$4-4", "$9MI9MI") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_WRONG_NUM_TOKEN",
+        messageParameters = Map("token" -> "MI", "numberFormat" -> "$9MI9MI")),
       // Make sure the format string contains at most one closing angle bracket at the end.
-      ("<$45>", "PR$99") -> unexpectedCharacter,
-      ("$4<4>", "$9PR9") -> unexpectedCharacter,
-      ("<<454>>", "999PRPR") -> atMostOne,
+      ("<$45>", "PR$99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_UNEXPECTED_FORMAT_TOKEN",
+        messageParameters = Map("formatToken" -> "$", "numberFormat" -> "PR$99")),
+      ("$4<4>", "$9PR9") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_UNEXPECTED_FORMAT_TOKEN",
+        messageParameters = Map("formatToken" -> "digit sequence", "numberFormat" -> "$9PR9")),
+      ("<<454>>", "999PRPR") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_WRONG_NUM_TOKEN",
+        messageParameters = Map("token" -> "PR", "numberFormat" -> "999PRPR")),
       // Make sure that any dollar sign in the format string occurs before any digits.
-      ("4$54", "9$99") -> "Currency characters must appear before digits",
+      ("4$54", "9$99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DIGIT",
+        messageParameters = Map("numberFormat" -> "9$99")),
       // Make sure that any dollar sign in the format string occurs before any decimal point.
-      (".$99", ".$99") -> "Currency characters must appear before any decimal point",
+      (".$99", ".$99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL",
+        messageParameters = Map("numberFormat" -> ".$99")),
       // Thousands separators must have digits in between them.
-      (",123", ",099") -> thousandsSeparatorDigitsBetween,
-      (",123,456", ",999,099") -> thousandsSeparatorDigitsBetween,
-      (",,345", "9,,09.99") -> thousandsSeparatorDigitsBetween,
-      (",,345", "9,99,.99") -> thousandsSeparatorDigitsBetween,
-      (",,345", "9,99,") -> thousandsSeparatorDigitsBetween,
-      (",,345", ",,999,099.99") -> thousandsSeparatorDigitsBetween,
+      (",123", ",099") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN",
+        messageParameters = Map("numberFormat" -> ",099")),
+      (",123,456", ",999,099") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN",
+        messageParameters = Map("numberFormat" -> ",999,099")),
+      (",,345", "9,,09.99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN",
+        messageParameters = Map("numberFormat" -> "9,,09.99")),
+      (",,345", "9,99,.99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN",
+        messageParameters = Map("numberFormat" -> "9,99,.99")),
+      (",,345", "9,99,") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN",
+        messageParameters = Map("numberFormat" -> "9,99,")),
+      (",,345", ",,999,099.99") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN",
+        messageParameters = Map("numberFormat" -> ",,999,099.99")),
       // Thousands separators must not appear after the decimal point.
-      ("123.45,6", "099.99,9") ->
-        "Thousands separators (, or G) may not appear after the decimal point"
-    ).foreach { case ((str: String, format: String), expectedErrMsg: String) =>
+      ("123.45,6", "099.99,9") -> DataTypeMismatch(
+        errorSubClass = "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_BEFORE_DECIMAL",
+        messageParameters = Map("numberFormat" -> "099.99,9"))
+    ).foreach { case ((str: String, format: String), dataTypeMismatch: DataTypeMismatch) =>
       val toNumberResult = ToNumber(Literal(str), Literal(format)).checkInputDataTypes()
       assert(toNumberResult != TypeCheckResult.TypeCheckSuccess,
         s"The format string should have been invalid: $format")
-      toNumberResult match {
-        case TypeCheckResult.TypeCheckFailure(message) =>
-          assert(message.contains(expectedErrMsg))
-      }
+      assert(toNumberResult == dataTypeMismatch)
 
       val tryToNumberResult = TryToNumber(Literal(str), Literal(format)).checkInputDataTypes()
       assert(tryToNumberResult != TypeCheckResult.TypeCheckSuccess,
         s"The format string should have been invalid: $format")
-      tryToNumberResult match {
-        case TypeCheckResult.TypeCheckFailure(message) =>
-          assert(message.contains(expectedErrMsg))
-      }
+      assert(tryToNumberResult == dataTypeMismatch)
 
       val toCharResult = ToCharacter(Decimal(456), Literal(format)).checkInputDataTypes()
       assert(toCharResult != TypeCheckResult.TypeCheckSuccess,
         s"The format string should have been invalid: $format")
-      toCharResult match {
-        case TypeCheckResult.TypeCheckFailure(message) =>
-          assert(message.contains(expectedErrMsg))
-      }
+      assert(toCharResult == dataTypeMismatch)
     }
   }
 
+  test("ToCharacter: fails analysis if numberFormat is not a constant") {

Review Comment:
   correct the title, it is not foldable



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] panbingkun commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
panbingkun commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1020759087


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DIGIT" : {
+        "message" : [
+          "Currency characters must appear before digits in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_EMPTY" : {
+        "message" : [
+          "The format string cannot be empty"
+        ]
+      },
+      "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Thousands separators (, or G) may not appear after the decimal point in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN" : {

Review Comment:
   Change it to 'NUMBER_FORMAT_SEPARATOR_MUST_HAVE_DIGIT_BETWEEN'?
   It seems that the meaning of this error cannot be well displayed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] panbingkun commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
panbingkun commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1022273230


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUM_FORMAT_CONT_THOUSANDS_SEPS" : {
+        "message" : [
+          "Thousands separators (, or G) must have digits in between them in the number format: <format>."
+        ]
+      },
+      "NUM_FORMAT_CUR_MUST_BEFORE_DEC" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: <format>."
+        ]
+      },
+      "NUM_FORMAT_CUR_MUST_BEFORE_DIGIT" : {
+        "message" : [
+          "Currency characters must appear before digits in the number format: <format>."
+        ]
+      },
+      "NUM_FORMAT_EMPTY" : {
+        "message" : [
+          "The number format string cannot be empty."
+        ]
+      },
+      "NUM_FORMAT_THOUSANDS_SEPS_MUST_BEFORE_DEC" : {
+        "message" : [
+          "Thousands separators (, or G) may not appear after the decimal point in the number format: <format>."
+        ]
+      },
+      "NUM_FORMAT_UNEXPECTED_TOKEN" : {
+        "message" : [
+          "Unexpected <token> found in the format string <format>; the structure of the format string must match: [MI|S] [$] [0|9|G|,]* [.|D] [0|9]* [$] [PR|MI|S]."
+        ]
+      },
+      "NUM_FORMAT_WRONG_NUM_DIGIT" : {

Review Comment:
   Ok, let me do it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on PR #38531:
URL: https://github.com/apache/spark/pull/38531#issuecomment-1315834870

   +1, LGTM. Merging to master.
   Thank you, @panbingkun and @cloud-fan @srielau for review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] panbingkun commented on pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
panbingkun commented on PR #38531:
URL: https://github.com/apache/spark/pull/38531#issuecomment-1306478131

   cc @MaxGekk 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] panbingkun commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
panbingkun commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1020752639


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: '<numberFormat>'"

Review Comment:
   Here are all subclasses of DATATYPE_MISMATCH.
   Do we really need Level 3 error classes? @MaxGekk 
   
   If we keep Level 2 error classes, INVALID_NUMBER_FORMAT....
   The naming rule is inconsistent with other subclasses, eg:
   RANGE_FRAME_INVALID_TYPE, RANGE_FRAME_MULTI_ORDER, RANGE_FRAME_WITHOUT_ORDER...
   
   If we merge two flavors for SEPARATOR and CURRENCY, We have to put these characters into the message parameters, which seems inconvenient for subsequent internationalization.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srielau commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
srielau commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1020324789


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: '<numberFormat>'"

Review Comment:
   Seems there is 5 (or 3) of these, how about using suberror classes here:
   INVALID_NUMBER_FORMAT.CURRENCY/EMPTY/SEPARATOR
   
   Or:
   INVALID_NUMBER_FORMAT_CURRENT
   ...
   
   Also I think the two flavors for SEPARATOR and CURRENCY each can be merged.
   
   Currency characters must appear before the decimal point and any digits in the number format
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1022513821


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUM_FORMAT_CONT_THOUSANDS_SEPS" : {

Review Comment:
   I think we abuse type check in a few expressions... Some are query compile time static checks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srielau commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
srielau commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1022941891


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUM_FORMAT_CONT_THOUSANDS_SEPS" : {

Review Comment:
   This must be fixed before we cut 12.0.
   Also there is no joy in having an error class with dozens of subclasses because it makes for a deeper doc tree. Because I need to now group sublasses to keep the doc pages in check...
   
   By this logic a CONSTRAINT violation is a data type mismatch and so is a NOT NULL violation....
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1020757754


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: '<numberFormat>'"

Review Comment:
   > Do we really need Level 3 error classes? @MaxGekk
   
   I don't think it makes sense.
   
   > If we keep Level 2 error classes, INVALID_NUMBER_FORMAT....
   
   Let's make the prefix shorter, for instance `NUM_FORMAT` like
   `DATATYPE_MISMATCH.NUM_FORMAT_CONT_THOUSANDS_SEPS`
    instead of
   `DATATYPE_MISMATCH.NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] panbingkun commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
panbingkun commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1020759087


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DIGIT" : {
+        "message" : [
+          "Currency characters must appear before digits in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_EMPTY" : {
+        "message" : [
+          "The format string cannot be empty"
+        ]
+      },
+      "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Thousands separators (, or G) may not appear after the decimal point in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN" : {

Review Comment:
   Change it to 'NUMBER_FORMAT_SEPARATOR_MUST_HAVE_DIGIT_BETWEEN'?
   It seems that the meaning of this error cannot be well displayed.



##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DIGIT" : {
+        "message" : [
+          "Currency characters must appear before digits in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_EMPTY" : {
+        "message" : [
+          "The format string cannot be empty"
+        ]
+      },
+      "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Thousands separators (, or G) may not appear after the decimal point in the number format: '<numberFormat>'"

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1023233663


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUM_FORMAT_CONT_THOUSANDS_SEPS" : {

Review Comment:
   > This must be fixed before we cut 12.0.
   
   Definitely, it will be fixed before Spark 12. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srielau commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
srielau commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1022343441


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUM_FORMAT_CONT_THOUSANDS_SEPS" : {

Review Comment:
   How is this related to DATA_TYPE_MISMATCH?
   Shouldn't this be a separate error_class:
   INVALID_NUMBER_FORMAT.*



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] srielau commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
srielau commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1022941891


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUM_FORMAT_CONT_THOUSANDS_SEPS" : {

Review Comment:
   This must be fixed before we cut 12.0.
   Also there is no joy in having an error class with dozens of subclasses because it makes for a deeper doc tree.
   
   By this logic a CONSTRAINT violation is a data type mismatch and so is a NOT NULL violation....
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] panbingkun commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
panbingkun commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1020763977


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DIGIT" : {
+        "message" : [
+          "Currency characters must appear before digits in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_EMPTY" : {
+        "message" : [
+          "The format string cannot be empty"
+        ]
+      },
+      "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Thousands separators (, or G) may not appear after the decimal point in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_THOUSANDS_SEPARATOR_MUST_HAVE_DIGIT_IN_BETWEEN" : {

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1022706881


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUM_FORMAT_CONT_THOUSANDS_SEPS" : {

Review Comment:
   @cloud-fan Are you ok with such approach?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1021226420


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUM_FORMAT_CONT_THOUSANDS_SEPS" : {
+        "message" : [
+          "Thousands separators (, or G) must have digits in between them in the number format: <format>."
+        ]
+      },
+      "NUM_FORMAT_CUR_MUST_BEFORE_DEC" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: <format>."
+        ]
+      },
+      "NUM_FORMAT_CUR_MUST_BEFORE_DIGIT" : {
+        "message" : [
+          "Currency characters must appear before digits in the number format: <format>."
+        ]
+      },
+      "NUM_FORMAT_EMPTY" : {
+        "message" : [
+          "The number format string cannot be empty."
+        ]
+      },
+      "NUM_FORMAT_THOUSANDS_SEPS_MUST_BEFORE_DEC" : {
+        "message" : [
+          "Thousands separators (, or G) may not appear after the decimal point in the number format: <format>."
+        ]
+      },
+      "NUM_FORMAT_UNEXPECTED_TOKEN" : {
+        "message" : [
+          "Unexpected <token> found in the format string <format>; the structure of the format string must match: [MI|S] [$] [0|9|G|,]* [.|D] [0|9]* [$] [PR|MI|S]."
+        ]
+      },
+      "NUM_FORMAT_WRONG_NUM_DIGIT" : {

Review Comment:
   I wonder is the prefix `NUM_` useful. Maybe just `FORMAT`? @cloud-fan @srielau WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] panbingkun commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
panbingkun commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1020759115


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DIGIT" : {
+        "message" : [
+          "Currency characters must appear before digits in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_EMPTY" : {
+        "message" : [
+          "The format string cannot be empty"

Review Comment:
   Done



##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DECIMAL" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: '<numberFormat>'"
+        ]
+      },
+      "NUMBER_FORMAT_CURRENCY_MUST_BEFORE_DIGIT" : {
+        "message" : [
+          "Currency characters must appear before digits in the number format: '<numberFormat>'"

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on PR #38531:
URL: https://github.com/apache/spark/pull/38531#issuecomment-1305494803

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1022349552


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUM_FORMAT_CONT_THOUSANDS_SEPS" : {

Review Comment:
   Theoretically, a data type is a set of possible values and a set of allowed operations on it. The number format is not an arbitrary string, it must be in some particular format. So, the error indicates violation of datatype range of values.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] panbingkun commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
panbingkun commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1022446300


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUM_FORMAT_CONT_THOUSANDS_SEPS" : {
+        "message" : [
+          "Thousands separators (, or G) must have digits in between them in the number format: <format>."
+        ]
+      },
+      "NUM_FORMAT_CUR_MUST_BEFORE_DEC" : {
+        "message" : [
+          "Currency characters must appear before any decimal point in the number format: <format>."
+        ]
+      },
+      "NUM_FORMAT_CUR_MUST_BEFORE_DIGIT" : {
+        "message" : [
+          "Currency characters must appear before digits in the number format: <format>."
+        ]
+      },
+      "NUM_FORMAT_EMPTY" : {
+        "message" : [
+          "The number format string cannot be empty."
+        ]
+      },
+      "NUM_FORMAT_THOUSANDS_SEPS_MUST_BEFORE_DEC" : {
+        "message" : [
+          "Thousands separators (, or G) may not appear after the decimal point in the number format: <format>."
+        ]
+      },
+      "NUM_FORMAT_UNEXPECTED_TOKEN" : {
+        "message" : [
+          "Unexpected <token> found in the format string <format>; the structure of the format string must match: [MI|S] [$] [0|9|G|,]* [.|D] [0|9]* [$] [PR|MI|S]."
+        ]
+      },
+      "NUM_FORMAT_WRONG_NUM_DIGIT" : {

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk closed pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
MaxGekk closed pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes
URL: https://github.com/apache/spark/pull/38531


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a diff in pull request #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #38531:
URL: https://github.com/apache/spark/pull/38531#discussion_r1022613029


##########
core/src/main/resources/error/error-classes.json:
##########
@@ -290,6 +290,46 @@
           "Null typed values cannot be used as arguments of <functionName>."
         ]
       },
+      "NUM_FORMAT_CONT_THOUSANDS_SEPS" : {

Review Comment:
   > Some are query compile time static checks
   
   I would like to propose to do refactoring later. Let's assign unique error classes to the errors that come from `checkInputDataTypes()`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org