Posted to reviews@spark.apache.org by "panbingkun (via GitHub)" <gi...@apache.org> on 2023/05/25 12:14:27 UTC

[GitHub] [spark] panbingkun opened a new pull request, #41314: [SPARK-43794][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1335

panbingkun opened a new pull request, #41314:
URL: https://github.com/apache/spark/pull/41314

   ### What changes were proposed in this pull request?
   This PR aims to assign a name to the error class `_LEGACY_ERROR_TEMP_1335`.
   
   ### Why are the changes needed?
   The changes improve the error framework.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   Updated the existing unit tests.
   Passed GitHub Actions (GA).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] panbingkun commented on a diff in pull request #41314: [SPARK-43794][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1335

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on code in PR #41314:
URL: https://github.com/apache/spark/pull/41314#discussion_r1205618825


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TimeTravelSpec.scala:
##########
@@ -38,21 +38,25 @@ object TimeTravelSpec {
       val ts = timestamp.get
       assert(ts.resolved && ts.references.isEmpty && !SubqueryExpression.hasSubquery(ts))
       if (!Cast.canAnsiCast(ts.dataType, TimestampType)) {
-        throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(ts)
+        throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(
+          "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.DATA_TYPE", ts, ts.dataType)
       }
       val tsToEval = ts.transform {
         case r: RuntimeReplaceable => r.replacement
         case _: Unevaluable =>
-          throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(ts)
+          throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(
+            "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.UNEVALUABLE", ts)
         case e if !e.deterministic =>
-          throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(ts)
+          throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(
+            "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.NON_DETERMINISTIC", ts)
       }
       val tz = Some(conf.sessionLocalTimeZone)
       // Set `ansiEnabled` to false, so that it can return null for invalid input and we can provide
       // better error message.
       val value = Cast(tsToEval, TimestampType, tz, ansiEnabled = false).eval()
       if (value == null) {
-        throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(ts)
+        throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(

Review Comment:
   Although, logically speaking, one case is `can't ansi cast` and the other is `can't cast`, they should belong to the same kind of error.
   Perhaps one could be called `INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.ANSI_DATA_TYPE` and the other `INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.DATA_TYPE`?
   Would that confuse users?





[GitHub] [spark] MaxGekk commented on pull request #41314: [SPARK-43794][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1335

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on PR #41314:
URL: https://github.com/apache/spark/pull/41314#issuecomment-1564396160

   +1, LGTM. Merging to master.
   Thank you, @panbingkun.




[GitHub] [spark] MaxGekk commented on a diff in pull request #41314: [SPARK-43794][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1335

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on code in PR #41314:
URL: https://github.com/apache/spark/pull/41314#discussion_r1205644934


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TimeTravelSpec.scala:
##########
@@ -38,21 +38,25 @@ object TimeTravelSpec {
       val ts = timestamp.get
       assert(ts.resolved && ts.references.isEmpty && !SubqueryExpression.hasSubquery(ts))
       if (!Cast.canAnsiCast(ts.dataType, TimestampType)) {
-        throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(ts)
+        throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(
+          "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.DATA_TYPE", ts, ts.dataType)
       }
       val tsToEval = ts.transform {
         case r: RuntimeReplaceable => r.replacement
         case _: Unevaluable =>
-          throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(ts)
+          throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(
+            "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.UNEVALUABLE", ts)
         case e if !e.deterministic =>
-          throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(ts)
+          throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(
+            "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.NON_DETERMINISTIC", ts)
       }
       val tz = Some(conf.sessionLocalTimeZone)
       // Set `ansiEnabled` to false, so that it can return null for invalid input and we can provide
       // better error message.
       val value = Cast(tsToEval, TimestampType, tz, ansiEnabled = false).eval()
       if (value == null) {
-        throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(ts)
+        throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(

Review Comment:
   I think so: one is about an incorrect input type, while the other is about incorrect values. Could you name the sub-class `INPUT`, and say in its message something like `The input value <inputVal> cannot be cast to the "TIMESTAMP" type.`
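
The distinction drawn in this comment (wrong input type versus wrong input value) can be illustrated with a minimal, Spark-free sketch. Everything here is hypothetical: `SimpleType`, `TimeTravelCheck`, and the castability rule are stand-ins invented for illustration, and `java.time.LocalDateTime.parse` merely plays the role of the non-ANSI cast that yields null on bad input. Only the two sub-class names come from the thread.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeParseException

// Hypothetical stand-ins for Spark data types; not part of the PR.
sealed trait SimpleType
case object StringT extends SimpleType
case object TimestampT extends SimpleType
case object IntervalT extends SimpleType

object TimeTravelCheck {
  // Assumed rule for this sketch only: strings and timestamps may be cast to TIMESTAMP.
  def castableToTimestamp(t: SimpleType): Boolean = t match {
    case StringT | TimestampT => true
    case IntervalT => false
  }

  // Returns Left(error sub-class) on failure, Right(parsed timestamp) on success.
  def validate(value: String, tpe: SimpleType): Either[String, LocalDateTime] =
    if (!castableToTimestamp(tpe)) {
      // The type itself can never become a timestamp.
      Left("INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.DATA_TYPE")
    } else {
      // The type is acceptable, but this particular value may still be unparsable.
      try Right(LocalDateTime.parse(value))
      catch {
        case _: DateTimeParseException =>
          Left("INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.INPUT")
      }
    }
}
```

Under these assumptions, an interval-typed expression fails on its type, while the string `abc` passes the type check and fails only when its value is parsed, which is why a separate `INPUT` sub-class describes it more accurately than a NULL-related one.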





[GitHub] [spark] panbingkun commented on a diff in pull request #41314: [SPARK-43794][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1335

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on code in PR #41314:
URL: https://github.com/apache/spark/pull/41314#discussion_r1206277747


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TimeTravelSpec.scala:
##########
@@ -38,21 +38,25 @@ object TimeTravelSpec {
       val ts = timestamp.get
       assert(ts.resolved && ts.references.isEmpty && !SubqueryExpression.hasSubquery(ts))
       if (!Cast.canAnsiCast(ts.dataType, TimestampType)) {
-        throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(ts)
+        throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(
+          "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.DATA_TYPE", ts, ts.dataType)
       }
       val tsToEval = ts.transform {
         case r: RuntimeReplaceable => r.replacement
         case _: Unevaluable =>
-          throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(ts)
+          throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(
+            "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.UNEVALUABLE", ts)
         case e if !e.deterministic =>
-          throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(ts)
+          throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(
+            "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.NON_DETERMINISTIC", ts)
       }
       val tz = Some(conf.sessionLocalTimeZone)
       // Set `ansiEnabled` to false, so that it can return null for invalid input and we can provide
       // better error message.
       val value = Cast(tsToEval, TimestampType, tz, ansiEnabled = false).eval()
       if (value == null) {
-        throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(ts)
+        throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(

Review Comment:
   This is done.





[GitHub] [spark] MaxGekk commented on a diff in pull request #41314: [SPARK-43794][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1335

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on code in PR #41314:
URL: https://github.com/apache/spark/pull/41314#discussion_r1205522488


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala:
##########
@@ -3130,10 +3130,18 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase {
       messageParameters = Map.empty)
   }
 
-  def invalidTimestampExprForTimeTravel(expr: Expression): Throwable = {
-    new AnalysisException(
-      errorClass = "_LEGACY_ERROR_TEMP_1335",
-      messageParameters = Map("expr" -> expr.sql))
+  def invalidTimestampExprForTimeTravel(
+      errorClass: String,
+      expr: Expression,
+      dataType: Option[DataType] = None): Throwable = {
+    dataType match {
+      case Some(v) =>
+        new AnalysisException(errorClass = errorClass,
+          messageParameters = Map("expr" -> toSQLExpr(expr), "dataType" -> toSQLType(v)))
+      case _ =>
+        new AnalysisException(errorClass = errorClass,
+          messageParameters = Map("expr" -> toSQLExpr(expr)))
+    }
   }

Review Comment:
   Could you make this more readable by simply overloading the function:
   ```suggestion
     def invalidTimestampExprForTimeTravel(errorClass: String, expr: Expression): Throwable = {
       new AnalysisException(
         errorClass = errorClass,
         messageParameters = Map("expr" -> toSQLExpr(expr)))
     }
   
     def invalidTimestampExprForTimeTravel(
         errorClass: String,
         expr: Expression,
         dataType: DataType): Throwable = {
       new AnalysisException(
         errorClass = errorClass,
         messageParameters = Map(
           "expr" -> toSQLExpr(expr),
           "dataType" -> toSQLType(dataType)))
     }
   ```



##########
core/src/main/resources/error/error-classes.json:
##########
@@ -1062,6 +1062,33 @@
     ],
     "sqlState" : "42613"
   },
+  "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR" : {
+    "message" : [
+      "The time travel timestamp expression <expr> is invalid."
+    ],
+    "subClass" : {
+      "DATA_TYPE" : {
+        "message" : [
+          "Must be timestamp type, but got <dataType>."
+        ]
+      },
+      "IS_NULL" : {
+        "message" : [
+          "Must not be null."

Review Comment:
   nit:
   ```suggestion
             "Must not be NULL."
   ```
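
To show how the `<expr>` and `<dataType>` placeholders in the `error-classes.json` entry above relate to the `messageParameters` map, here is a minimal sketch of placeholder substitution. It is an assumption-laden illustration, not Spark's actual rendering code: `ErrorMessageSketch` and `render` are hypothetical names, and joining the main and sub-class messages with a space is assumed. The two templates and the parameter values are copied from the diff in this thread.

```scala
object ErrorMessageSketch {
  // Templates copied from the error-classes.json hunk in this thread.
  val mainTemplate = "The time travel timestamp expression <expr> is invalid."
  val dataTypeTemplate = "Must be timestamp type, but got <dataType>."

  // Replace each <name> placeholder with its value from the parameter map.
  def render(template: String, params: Map[String, String]): String =
    params.foldLeft(template) { case (msg, (name, value)) =>
      msg.replace(s"<$name>", value)
    }

  // Assumed for this sketch: the sub-class message follows the main message after a space.
  def renderDataTypeError(params: Map[String, String]): String =
    render(mainTemplate + " " + dataTypeTemplate, params)
}
```

With the parameters used in the updated test (`"expr" -> "\"INTERVAL '1' DAY\""` and `"dataType" -> "\"INTERVAL DAY\""`), both placeholders are filled and no `<...>` marker remains in the rendered text.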



##########
sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala:
##########
@@ -2931,29 +2931,31 @@ class DataSourceV2SQLSuiteV1Filter
         exception = intercept[AnalysisException] {
           sql("SELECT * FROM t TIMESTAMP AS OF INTERVAL 1 DAY").collect()
         },
-        errorClass = "_LEGACY_ERROR_TEMP_1335",
-        parameters = Map("expr" -> "INTERVAL '1' DAY"))
+        errorClass = "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.DATA_TYPE",
+        parameters = Map(
+          "expr" -> "\"INTERVAL '1' DAY\"",
+          "dataType" -> "\"INTERVAL DAY\""))
 
       checkError(
         exception = intercept[AnalysisException] {
           sql("SELECT * FROM t TIMESTAMP AS OF 'abc'").collect()
         },
-        errorClass = "_LEGACY_ERROR_TEMP_1335",
-        parameters = Map("expr" -> "'abc'"))
+        errorClass = "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.IS_NULL",
+        parameters = Map("expr" -> "\"abc\""))

Review Comment:
   The error class and its message are slightly confusing. `'abc'` is not NULL, right? The NULL appears only as a result of casting to timestamp. Could you improve the error message, please?





[GitHub] [spark] panbingkun commented on a diff in pull request #41314: [SPARK-43794][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1335

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on code in PR #41314:
URL: https://github.com/apache/spark/pull/41314#discussion_r1205621697


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TimeTravelSpec.scala:
##########
@@ -38,21 +38,25 @@ object TimeTravelSpec {
       val ts = timestamp.get
       assert(ts.resolved && ts.references.isEmpty && !SubqueryExpression.hasSubquery(ts))
       if (!Cast.canAnsiCast(ts.dataType, TimestampType)) {
-        throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(ts)
+        throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(

Review Comment:
   Merge this with the logic at the bottom to form a single `INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.DATA_TYPE` error message.









[GitHub] [spark] MaxGekk closed pull request #41314: [SPARK-43794][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1335

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk closed pull request #41314: [SPARK-43794][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1335
URL: https://github.com/apache/spark/pull/41314

