Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/15 08:20:58 UTC

[GitHub] [spark] MaxGekk opened a new pull request, #37520: [WIP][SQL] Format error messages in the Thrift Server

MaxGekk opened a new pull request, #37520:
URL: https://github.com/apache/spark/pull/37520

   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
     8. If you want to add or modify an error type or message, please read the guideline first in
        'core/src/main/resources/error/README.md'.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   If benchmark tests were added, please run the benchmarks in GitHub Actions for the consistent environment, and the instructions could accord to: https://spark.apache.org/developer-tools.html#github-workflow-benchmarks.
   -->
   




[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r949811915


##########
core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala:
##########
@@ -222,4 +222,70 @@ class SparkThrowableSuite extends SparkFunSuite {
         assert(false)
     }
   }
+
+  test("get message in the specified format") {
+    import ErrorMessageFormat._
+    class TestQueryContext extends QueryContext {
+      override val objectName = "v1"
+      override val objectType = "VIEW"
+      override val startIndex = 2
+      override val stopIndex = -1
+      override val fragment = "1 / 0"
+    }
+    val e = new SparkArithmeticException(

Review Comment:
   I will add a test with a sub-class but without a query context, as sketched below. Thanks.
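   A hedged sketch of what such a test could look like, reusing the `SparkArithmeticException` constructor quoted later in this thread and passing an empty query context (not the exact test that was added to the PR):

   ```scala
   // Hedged sketch: an exception with an error class but no query context.
   // Constructor arguments follow the SparkThrowableSuite snippet quoted below;
   // assumes `import ErrorMessageFormat._` as in that test.
   val noContext = new SparkArithmeticException(
     errorClass = "DIVIDE_BY_ZERO",
     errorSubClass = None,
     messageParameters = Array("CONFIG"),
     context = Array.empty,
     summary = "")
   // The JSON forms should then carry an empty (or omitted) "queryContext".
   assert(SparkThrowableHelper.getMessage(noContext, MINIMAL).contains("DIVIDE_BY_ZERO"))
   ```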





[GitHub] [spark] srielau commented on pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
srielau commented on PR #37520:
URL: https://github.com/apache/spark/pull/37520#issuecomment-1216838503

   I want to point out that I'm not married to the labels MINIMAL, STANDARD, and PRETTY. If someone has more appropriate names, I'm open to suggestions.




[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r954829692


##########
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServerErrors.scala:
##########
@@ -36,11 +36,10 @@ object HiveThriftServerErrors {
       " new task for execution, please retry the operation", rejected)
   }
 
-  def runningQueryError(e: Throwable): Throwable = e match {
-    case st: SparkThrowable =>
-      val errorClassPrefix = Option(st.getErrorClass).map(e => s"[$e] ").getOrElse("")

Review Comment:
   The PRETTY message includes the error class; see the test:
   https://github.com/apache/spark/pull/37520/files#diff-44a4c34050c5ff3cc3488e7e800ad41d791936c97c2f9535371931e43868741bR165





[GitHub] [spark] srielau commented on pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
srielau commented on PR #37520:
URL: https://github.com/apache/spark/pull/37520#issuecomment-1216841711

   "startIndex - the integer represents the starting index in the query text which throws the exception. The index starts from 0, but it can be -1 if the index is unknown.
   stopIndex - the integer represents the stopping index in the query which throws the exception. The index starts from 0, but it can be -1 if the index is unknown."
   Didn't we fix this recently to be 1/1 based? 




[GitHub] [spark] MaxGekk commented on pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on PR #37520:
URL: https://github.com/apache/spark/pull/37520#issuecomment-1217517685

   After an offline discussion with @srielau @cloud-fan, I am going to change `startIndex` and `stopIndex` to 1-based. And if the index is unknown, we just won't output it, as we already do in the textual form.
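   For reference, a hedged sketch of what the 1-based shift with omitted unknown indexes could look like around the Jackson writes shown elsewhere in this thread (the guard itself is only an illustration, not the exact code merged):

   ```scala
   // Hedged sketch: shift to 1-based and skip unknown (-1) positions,
   // mirroring what the textual form already does. `g` is the JsonGenerator
   // and `c` the QueryContext from the surrounding loop.
   val startIndex = c.startIndex() + 1
   if (startIndex > 0) g.writeNumberField("startIndex", startIndex)
   val stopIndex = c.stopIndex() + 1
   if (stopIndex > 0) g.writeNumberField("stopIndex", stopIndex)
   ```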
   




[GitHub] [spark] MaxGekk commented on pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on PR #37520:
URL: https://github.com/apache/spark/pull/37520#issuecomment-1217004037

   > Didn't we fix this recently to be 1/1 based?
   
   @srielau If you mean https://github.com/apache/spark/pull/36651, that PR increased the position in the textual form of the query context, not in `QueryContext` itself. @gengliangwang Am I right?




[GitHub] [spark] srielau commented on pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
srielau commented on PR #37520:
URL: https://github.com/apache/spark/pull/37520#issuecomment-1216843434

   Also, this field is actually a map, not an array:
   messageParameters - an array of message parameters as a strings.
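   A hedged illustration with the Jackson generator used later in this PR, showing the map form (the parameter value is just the one from the divide-by-zero example in the tests):

   ```scala
   // messageParameters is rendered as a JSON object keyed by parameter name,
   // not as a positional array.
   g.writeObjectFieldStart("messageParameters")
   g.writeStringField("config", "\"spark.sql.ansi.enabled\"")
   g.writeEndObject()
   ```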




[GitHub] [spark] cloud-fan commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r948640883


##########
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/ThriftServerWithSparkContextSuite.scala:
##########
@@ -149,6 +149,64 @@ trait ThriftServerWithSparkContextSuite extends SharedThriftServer {
       }
     }
   }
+
+  test("formats of error messages") {
+    val sql = "select 1 / 0"
+    withCLIServiceClient() { client =>
+      val sessionHandle = client.openSession(user, "")
+      val confOverlay = new java.util.HashMap[java.lang.String, java.lang.String]
+      val exec: String => OperationHandle = client.executeStatement(sessionHandle, _, confOverlay)
+
+      exec(s"set ${SQLConf.ANSI_ENABLED.key}=true")
+      exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.PRETTY}")
+      val e1 = intercept[HiveSQLException](exec(sql))
+      // scalastyle:off line.size.limit
+      assert(e1.getMessage ===
+        """Error running query: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
+          |== SQL(line 1, position 8) ==
+          |select 1 / 0
+          |       ^^^^^
+          |""".stripMargin)
+
+      exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.MINIMAL}")
+      val e2 = intercept[HiveSQLException](exec(sql))
+      assert(e2.getMessage ===
+        """Error running query: {
+          |  "errorClass" : "DIVIDE_BY_ZERO",
+          |  "sqlState" : "22012",
+          |  "messageParameters" : {
+          |    "config" : "\"spark.sql.ansi.enabled\""
+          |  },
+          |  "queryContext" : [ {
+          |    "objectType" : "",
+          |    "objectName" : "",
+          |    "startIndex" : 8,
+          |    "stopIndex" : 12,
+          |    "fragment" : "1 / "

Review Comment:
   Is it a bug? It should be `1 / 0`.





[GitHub] [spark] srielau commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
srielau commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r947171224


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -135,4 +143,41 @@ private[spark] object SparkThrowableHelper {
   def isInternalError(errorClass: String): Boolean = {
     errorClass == "INTERNAL_ERROR"
   }
+
+  def getMessage(e: SparkThrowable with Throwable, format: ErrorMessageFormat.Value): String = {
+    import ErrorMessageFormat._
+    format match {
+      case PRETTY => e.getMessage
+      case MINIMAL | STANDARD if e.getErrorClass == null =>
+        val jValue = ("errorClass" -> "legacy") ~
+          ("messageParameters" -> JObject(List("message" -> JString(e.getMessage)))) ~
+          ("queryContext" -> JArray(List.empty))
+        compact(render(jValue))
+      case MINIMAL | STANDARD =>
+        val errorClass = e.getErrorClass
+        val message = if (format == STANDARD) {
+          val errorInfo = errorClassToInfoMap.getOrElse(errorClass,
+            throw new IllegalArgumentException(s"Cannot find error class '$errorClass'"))
+          Some(errorInfo.messageFormat)
+        } else None
+        assert(e.getParameterNames.size == e.getMessageParameters.size,
+          "Number of message parameter names and values must be the same")
+        val jValue = ("errorClass" -> errorClass) ~
+          ("errorSubClass" -> Option(e.getErrorSubClass)) ~
+          ("message" -> message) ~
+          ("sqlState" -> Option(e.getSqlState)) ~
+          ("messageParameters" ->
+            JObject((e.getParameterNames zip e.getMessageParameters.map(JString)).toList)) ~
+          ("queryContext" -> JArray(
+            e.getQueryContext.map(c => JObject(
+              "objectType" -> JString(c.objectType()),
+              "objectName" -> JString(c.objectName()),
+              "startIndex" -> JInt(c.startIndex()),
+              "stopIndex" -> JInt(c.stopIndex()),
+              "fragment" -> JString(c.fragment()))).toList)
+          )
+        val rendered = render(jValue)
+        if (format == MINIMAL) compact(rendered) else pretty(rendered)

Review Comment:
   MINIMAL only has the payload. It does not contain the message text at all.
   So it's needed for the golden files (recall that the original idea was to do machine-readable output before golden files).
   It may be advised for logs, so we keep stability.
   And it may be useful for clients that want to get their messages from another (future) API, for example if they have a localized set of messages.
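   A hedged sketch of that client-side use case: parse the MINIMAL payload and render a message from the client's own (possibly localized) template. `minimalPayload` and `localizedTemplates` are hypothetical names, not part of this PR:

   ```scala
   import com.fasterxml.jackson.databind.ObjectMapper

   // Hedged sketch: a client re-renders the error from the machine-readable payload.
   val mapper = new ObjectMapper()
   val json = mapper.readTree(minimalPayload)                     // hypothetical input string
   val errorClass = json.get("errorClass").asText()
   val config = json.get("messageParameters").get("config").asText()
   val template = localizedTemplates(errorClass)                  // hypothetical lookup
   val message = template.replace("<config>", config)
   ```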
   





[GitHub] [spark] cloud-fan commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r949721704


##########
core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala:
##########
@@ -222,4 +222,70 @@ class SparkThrowableSuite extends SparkFunSuite {
         assert(false)
     }
   }
+
+  test("get message in the specified format") {
+    import ErrorMessageFormat._
+    class TestQueryContext extends QueryContext {
+      override val objectName = "v1"
+      override val objectType = "VIEW"
+      override val startIndex = 2
+      override val stopIndex = -1
+      override val fragment = "1 / 0"
+    }
+    val e = new SparkArithmeticException(

Review Comment:
   Let's also test an exception with an error class but without a query context.





[GitHub] [spark] srielau commented on pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
srielau commented on PR #37520:
URL: https://github.com/apache/spark/pull/37520#issuecomment-1219047663

   What is the reason for the limit? Will it break notebooks and query editors?
   I think the point is that we want tools to start using it.
   
   On Aug 17, 2022, at 9:34 PM, Wenchen Fan wrote:
   
   
   
   @cloud-fan commented on this pull request.
   
   ________________________________
   
   In sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala<https://github.com/apache/spark/pull/37520#discussion_r948640467>:
   
   > @@ -3876,6 +3876,15 @@ object SQLConf {
          .booleanConf
          .createWithDefault(false)
   
   +  val ERROR_MESSAGE_FORMAT = buildConf("spark.sql.error.messageFormat")
   +    .doc("When PRETTY, the error message consists of textual representation of error class, " +
   +      "message and query context. The MINIMAL and STANDARD formats are pretty JSON formats where " +
   +      "STANDARD includes an additional JSON field `message`.")
   
   
   let's also mention where this config applies: thriftserver and sql shell
   
   




[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r947694236


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -135,4 +140,59 @@ private[spark] object SparkThrowableHelper {
   def isInternalError(errorClass: String): Boolean = {
     errorClass == "INTERNAL_ERROR"
   }
+
+  def getMessage(e: SparkThrowable with Throwable, format: ErrorMessageFormat.Value): String = {
+    import ErrorMessageFormat._
+    format match {
+      case PRETTY => e.getMessage
+      case MINIMAL | STANDARD if e.getErrorClass == null =>
+        toJsonString { g =>
+          g.writeStartObject()
+          g.writeStringField("errorClass", "legacy")
+          g.writeObjectFieldStart("messageParameters")
+          g.writeStringField("message", e.getMessage)
+          g.writeEndObject()
+          g.writeArrayFieldStart("queryContext")
+          g.writeEndArray()
+          g.writeEndObject()
+        }
+      case MINIMAL | STANDARD =>
+        val errorClass = e.getErrorClass
+        assert(e.getParameterNames.size == e.getMessageParameters.size,
+          "Number of message parameter names and values must be the same")
+        toJsonString { generator =>
+          val g = if (format == STANDARD) generator.useDefaultPrettyPrinter() else generator
+          g.writeStartObject()
+          g.writeStringField("errorClass", errorClass)
+          val errorSubClass = e.getErrorSubClass
+          if (errorSubClass != null) g.writeStringField("errorSubClass", errorSubClass)
+          if (format == STANDARD) {
+            val errorInfo = errorClassToInfoMap.getOrElse(errorClass,
+              throw SparkException.internalError(s"Cannot find the error class '$errorClass'"))
+            g.writeStringField("message", errorInfo.messageFormat)
+          }
+          val sqlState = e.getSqlState
+          if (sqlState != null) g.writeStringField("sqlState", sqlState)
+          g.writeObjectFieldStart("messageParameters")
+          (e.getParameterNames zip e.getMessageParameters).foreach { case (name, value) =>
+            g.writeStringField(name, value)
+          }
+          g.writeEndObject()
+          g.writeArrayFieldStart("queryContext")
+          e.getQueryContext.map { c =>
+            g.writeStartObject()
+            g.writeStringField("objectType", c.objectType())
+            g.writeStringField("objectName", c.objectName())
+            val startIndex = c.startIndex() + 1

Review Comment:
   @cloud-fan Could you elaborate a little bit more, please? What's the original error message? After https://github.com/apache/spark/pull/36651, the position is 1-based too. And after my last commit, the indexes are 1-based. What do you mean by inconsistent?





[GitHub] [spark] srielau commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
srielau commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r948171033


##########
core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala:
##########
@@ -222,4 +222,66 @@ class SparkThrowableSuite extends SparkFunSuite {
         assert(false)
     }
   }
+
+  test("get message in the specified format") {
+    import ErrorMessageFormat._
+    class TestQueryContext extends QueryContext {
+      override val objectName = "v1"
+      override val objectType = "VIEW"
+      override val startIndex = 2
+      override val stopIndex = -1
+      override val fragment = "1 / 0"
+    }
+    val e = new SparkArithmeticException(
+      errorClass = "DIVIDE_BY_ZERO",
+      errorSubClass = None,
+      messageParameters = Array("CONFIG"),
+      context = Array(new TestQueryContext),
+      summary = "Query summary")
+
+    assert(SparkThrowableHelper.getMessage(e, PRETTY) ===
+      "[DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 " +
+      "and return NULL instead. If necessary set CONFIG to \"false\" to bypass this error." +
+      "\nQuery summary")
+    // scalastyle:off line.size.limit
+    assert(SparkThrowableHelper.getMessage(e, MINIMAL) ===
+      """{
+        |  "errorClass" : "DIVIDE_BY_ZERO",
+        |  "sqlState" : "22012",
+        |  "messageParameters" : {
+        |    "config" : "CONFIG"
+        |  },
+        |  "queryContext" : [ {
+        |    "objectType" : "VIEW",
+        |    "objectName" : "v1",
+        |    "startIndex" : 3,
+        |    "fragment" : "1 / 0"
+        |  } ]
+        |}""".stripMargin)
+    assert(SparkThrowableHelper.getMessage(e, STANDARD) ===
+      """{
+        |  "errorClass" : "DIVIDE_BY_ZERO",
+        |  "message" : "Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set <config> to \"false\" to bypass this error.",
+        |  "sqlState" : "22012",
+        |  "messageParameters" : {
+        |    "config" : "CONFIG"
+        |  },
+        |  "queryContext" : [ {
+        |    "objectType" : "VIEW",
+        |    "objectName" : "v1",
+        |    "startIndex" : 3,
+        |    "fragment" : "1 / 0"
+        |  } ]
+        |}""".stripMargin)
+      // scalastyle:on line.size.limit
+    // Legacy mode when an exception does not have any error class
+    class LegacyException extends Throwable with SparkThrowable {
+      override def getErrorClass: String = null
+      override def getMessage: String = "Test message"
+    }
+    val e2 = new LegacyException
+    assert(SparkThrowableHelper.getMessage(e2, MINIMAL) ===
+      """{"errorClass":"legacy","messageParameters":{"message":"Test message"},""" +

Review Comment:
   Nit: Should we capitalize LEGACY? 



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -3876,6 +3876,15 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
+  val ERROR_MESSAGE_FORMAT = buildConf("spark.sql.error.messageFormat")
+    .doc("When PRETTY, the error message consists of textual representation of error class, " +
+      " message and query context. The MINIMAL and STANDARD formats are JSON formats where " +
+      "STANDARD is pretty JSON with additional JSON field `message`.")

Review Comment:
   ```suggestion
       .doc("When PRETTY, the error message consists of textual representation of error class, " +
         " message and query context. The MINIMAL and STANDARD formats are pretty JSON formats where " +
         "STANDARD includes an additional JSON field `message`.")
   ```





[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r948755303


##########
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/ThriftServerWithSparkContextSuite.scala:
##########
@@ -149,6 +149,64 @@ trait ThriftServerWithSparkContextSuite extends SharedThriftServer {
       }
     }
   }
+
+  test("formats of error messages") {
+    val sql = "select 1 / 0"
+    withCLIServiceClient() { client =>
+      val sessionHandle = client.openSession(user, "")
+      val confOverlay = new java.util.HashMap[java.lang.String, java.lang.String]
+      val exec: String => OperationHandle = client.executeStatement(sessionHandle, _, confOverlay)
+
+      exec(s"set ${SQLConf.ANSI_ENABLED.key}=true")
+      exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.PRETTY}")
+      val e1 = intercept[HiveSQLException](exec(sql))
+      // scalastyle:off line.size.limit
+      assert(e1.getMessage ===
+        """Error running query: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
+          |== SQL(line 1, position 8) ==
+          |select 1 / 0
+          |       ^^^^^
+          |""".stripMargin)
+
+      exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.MINIMAL}")
+      val e2 = intercept[HiveSQLException](exec(sql))
+      assert(e2.getMessage ===
+        """Error running query: {
+          |  "errorClass" : "DIVIDE_BY_ZERO",
+          |  "sqlState" : "22012",
+          |  "messageParameters" : {
+          |    "config" : "\"spark.sql.ansi.enabled\""
+          |  },
+          |  "queryContext" : [ {
+          |    "objectType" : "",
+          |    "objectName" : "",
+          |    "startIndex" : 8,
+          |    "stopIndex" : 12,
+          |    "fragment" : "1 / "

Review Comment:
   Here is the JIRA for the issue: https://issues.apache.org/jira/browse/SPARK-40136





[GitHub] [spark] MaxGekk commented on pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on PR #37520:
URL: https://github.com/apache/spark/pull/37520#issuecomment-1220433022

   ImageFileFormatSuite failed again. I re-ran it locally. Seems like it is just a flaky test. Merging to master.




[GitHub] [spark] cloud-fan commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r948640297


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -135,4 +140,59 @@ private[spark] object SparkThrowableHelper {
   def isInternalError(errorClass: String): Boolean = {
     errorClass == "INTERNAL_ERROR"
   }
+
+  def getMessage(e: SparkThrowable with Throwable, format: ErrorMessageFormat.Value): String = {
+    import ErrorMessageFormat._
+    format match {
+      case PRETTY => e.getMessage
+      case MINIMAL | STANDARD if e.getErrorClass == null =>
+        toJsonString { g =>
+          g.writeStartObject()
+          g.writeStringField("errorClass", "LEGACY")
+          g.writeObjectFieldStart("messageParameters")
+          g.writeStringField("message", e.getMessage)
+          g.writeEndObject()
+          g.writeArrayFieldStart("queryContext")
+          g.writeEndArray()
+          g.writeEndObject()
+        }
+      case MINIMAL | STANDARD =>
+        val errorClass = e.getErrorClass
+        assert(e.getParameterNames.size == e.getMessageParameters.size,
+          "Number of message parameter names and values must be the same")
+        toJsonString { generator =>
+          val g = generator.useDefaultPrettyPrinter()
+          g.writeStartObject()
+          g.writeStringField("errorClass", errorClass)
+          val errorSubClass = e.getErrorSubClass
+          if (errorSubClass != null) g.writeStringField("errorSubClass", errorSubClass)
+          if (format == STANDARD) {
+            val errorInfo = errorClassToInfoMap.getOrElse(errorClass,
+              throw SparkException.internalError(s"Cannot find the error class '$errorClass'"))
+            g.writeStringField("message", errorInfo.messageFormat)
+          }
+          val sqlState = e.getSqlState
+          if (sqlState != null) g.writeStringField("sqlState", sqlState)
+          g.writeObjectFieldStart("messageParameters")
+          (e.getParameterNames zip e.getMessageParameters).foreach { case (name, value) =>
+            g.writeStringField(name, value)
+          }
+          g.writeEndObject()
+          g.writeArrayFieldStart("queryContext")

Review Comment:
   Do we still need the `queryContext` field if there is no query context in the error?
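   One possible shape (a hedged sketch only, not necessarily what the PR settled on) is to write the field conditionally:

   ```scala
   // Hedged sketch: emit "queryContext" only when there is at least one context entry.
   if (e.getQueryContext.nonEmpty) {
     g.writeArrayFieldStart("queryContext")
     e.getQueryContext.foreach { c =>
       g.writeStartObject()
       g.writeStringField("objectType", c.objectType())
       g.writeStringField("objectName", c.objectName())
       // startIndex/stopIndex handled as in the hunk above
       g.writeStringField("fragment", c.fragment())
       g.writeEndObject()
     }
     g.writeEndArray()
   }
   ```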





[GitHub] [spark] srielau commented on pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
srielau commented on PR #37520:
URL: https://github.com/apache/spark/pull/37520#issuecomment-1274348629

   Please stick with 1-based; we should not change values based on the formatting of the error.
   
   On Aug 16, 2022, at 1:38 PM, Maxim Gekk wrote:
   
   
   
   For JSON format output, I am OK with 0-based or 1-based; either way makes sense.
   
   If we assume that the messages in the JSON format will be parsed by machinery, it would be more natural to begin the indexes from 0, IMHO.
   
   




[GitHub] [spark] cloud-fan commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r954907474


##########
core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala:
##########
@@ -222,4 +222,70 @@ class SparkThrowableSuite extends SparkFunSuite {
         assert(false)
     }
   }
+
+  test("get message in the specified format") {
+    import ErrorMessageFormat._
+    class TestQueryContext extends QueryContext {
+      override val objectName = "v1"
+      override val objectType = "VIEW"
+      override val startIndex = 2
+      override val stopIndex = -1
+      override val fragment = "1 / 0"
+    }
+    val e = new SparkArithmeticException(

Review Comment:
   Ah, do you mean that before this PR the Thrift Server printed the error class twice?





[GitHub] [spark] MaxGekk commented on pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on PR #37520:
URL: https://github.com/apache/spark/pull/37520#issuecomment-1216662529

   @cloud-fan @gengliangwang @HyukjinKwon Could you review this PR, please?




[GitHub] [spark] cloud-fan commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r960169126


##########
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServerErrors.scala:
##########
@@ -36,11 +36,10 @@ object HiveThriftServerErrors {
       " new task for execution, please retry the operation", rejected)
   }
 
-  def runningQueryError(e: Throwable): Throwable = e match {
-    case st: SparkThrowable =>
-      val errorClassPrefix = Option(st.getErrorClass).map(e => s"[$e] ").getOrElse("")
-      new HiveSQLException(
-        s"Error running query: ${errorClassPrefix}${st.toString}", st.getSqlState, st)

Review Comment:
   Found a behavior change here: previously we called `st.toString`, but now `SparkThrowableHelper.getMessage` calls `st.getMessage` in the end.
   
   Can we restore the previous behavior?
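   A hedged sketch of one way to restore it, assuming the format is threaded into `runningQueryError` (how the PR actually wires the config through is not shown in this hunk):

   ```scala
   // Hedged sketch only: keep the old `st.toString` text for PRETTY and use the
   // JSON renderer for MINIMAL/STANDARD. The extra `format` parameter and the
   // non-SparkThrowable fallback are assumptions for illustration.
   def runningQueryError(e: Throwable, format: ErrorMessageFormat.Value): Throwable = e match {
     case st: SparkThrowable with Throwable =>
       val message = if (format == ErrorMessageFormat.PRETTY) {
         st.toString
       } else {
         SparkThrowableHelper.getMessage(st, format)
       }
       new HiveSQLException(s"Error running query: $message", st.getSqlState, st)
     case other =>
       new HiveSQLException(s"Error running query: ${other.getMessage}", other)
   }
   ```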





[GitHub] [spark] cloud-fan commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r946937105


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -25,6 +25,10 @@ import com.fasterxml.jackson.annotation.JsonIgnore
 import com.fasterxml.jackson.core.`type`.TypeReference
 import com.fasterxml.jackson.databind.json.JsonMapper
 import com.fasterxml.jackson.module.scala.DefaultScalaModule
+import org.json4s.{JInt, JString}
+import org.json4s.JsonAST.{JArray, JObject}
+import org.json4s.JsonDSL._
+import org.json4s.jackson.JsonMethods.{compact, pretty, render}

Review Comment:
   According to https://github.com/apache/spark/pull/36885, we want to remove json4s eventually. Shall we use Jackson here?





[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r947120002


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -25,6 +25,10 @@ import com.fasterxml.jackson.annotation.JsonIgnore
 import com.fasterxml.jackson.core.`type`.TypeReference
 import com.fasterxml.jackson.databind.json.JsonMapper
 import com.fasterxml.jackson.module.scala.DefaultScalaModule
+import org.json4s.{JInt, JString}
+import org.json4s.JsonAST.{JArray, JObject}
+import org.json4s.JsonDSL._
+import org.json4s.jackson.JsonMethods.{compact, pretty, render}

Review Comment:
   done





[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r947195431


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -135,4 +143,41 @@ private[spark] object SparkThrowableHelper {
   def isInternalError(errorClass: String): Boolean = {
     errorClass == "INTERNAL_ERROR"
   }
+
+  def getMessage(e: SparkThrowable with Throwable, format: ErrorMessageFormat.Value): String = {
+    import ErrorMessageFormat._
+    format match {
+      case PRETTY => e.getMessage
+      case MINIMAL | STANDARD if e.getErrorClass == null =>
+        val jValue = ("errorClass" -> "legacy") ~
+          ("messageParameters" -> JObject(List("message" -> JString(e.getMessage)))) ~
+          ("queryContext" -> JArray(List.empty))
+        compact(render(jValue))
+      case MINIMAL | STANDARD =>
+        val errorClass = e.getErrorClass
+        val message = if (format == STANDARD) {
+          val errorInfo = errorClassToInfoMap.getOrElse(errorClass,
+            throw new IllegalArgumentException(s"Cannot find error class '$errorClass'"))
+          Some(errorInfo.messageFormat)
+        } else None
+        assert(e.getParameterNames.size == e.getMessageParameters.size,
+          "Number of message parameter names and values must be the same")
+        val jValue = ("errorClass" -> errorClass) ~
+          ("errorSubClass" -> Option(e.getErrorSubClass)) ~
+          ("message" -> message) ~
+          ("sqlState" -> Option(e.getSqlState)) ~
+          ("messageParameters" ->
+            JObject((e.getParameterNames zip e.getMessageParameters.map(JString)).toList)) ~
+          ("queryContext" -> JArray(
+            e.getQueryContext.map(c => JObject(
+              "objectType" -> JString(c.objectType()),
+              "objectName" -> JString(c.objectName()),
+              "startIndex" -> JInt(c.startIndex()),
+              "stopIndex" -> JInt(c.stopIndex()),
+              "fragment" -> JString(c.fragment()))).toList)
+          )
+        val rendered = render(jValue)
+        if (format == MINIMAL) compact(rendered) else pretty(rendered)

Review Comment:
   @srielau Should we output MINIMAL in the compact or pretty form?





[GitHub] [spark] srielau commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
srielau commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r948165410


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -135,4 +140,59 @@ private[spark] object SparkThrowableHelper {
   def isInternalError(errorClass: String): Boolean = {
     errorClass == "INTERNAL_ERROR"
   }
+
+  def getMessage(e: SparkThrowable with Throwable, format: ErrorMessageFormat.Value): String = {
+    import ErrorMessageFormat._
+    format match {
+      case PRETTY => e.getMessage
+      case MINIMAL | STANDARD if e.getErrorClass == null =>
+        toJsonString { g =>
+          g.writeStartObject()
+          g.writeStringField("errorClass", "legacy")
+          g.writeObjectFieldStart("messageParameters")
+          g.writeStringField("message", e.getMessage)
+          g.writeEndObject()
+          g.writeArrayFieldStart("queryContext")
+          g.writeEndArray()
+          g.writeEndObject()
+        }
+      case MINIMAL | STANDARD =>
+        val errorClass = e.getErrorClass
+        assert(e.getParameterNames.size == e.getMessageParameters.size,
+          "Number of message parameter names and values must be the same")
+        toJsonString { generator =>
+          val g = if (format == STANDARD) generator.useDefaultPrettyPrinter() else generator
+          g.writeStartObject()
+          g.writeStringField("errorClass", errorClass)
+          val errorSubClass = e.getErrorSubClass
+          if (errorSubClass != null) g.writeStringField("errorSubClass", errorSubClass)
+          if (format == STANDARD) {
+            val errorInfo = errorClassToInfoMap.getOrElse(errorClass,
+              throw SparkException.internalError(s"Cannot find the error class '$errorClass'"))
+            g.writeStringField("message", errorInfo.messageFormat)
+          }
+          val sqlState = e.getSqlState
+          if (sqlState != null) g.writeStringField("sqlState", sqlState)
+          g.writeObjectFieldStart("messageParameters")
+          (e.getParameterNames zip e.getMessageParameters).foreach { case (name, value) =>
+            g.writeStringField(name, value)
+          }
+          g.writeEndObject()
+          g.writeArrayFieldStart("queryContext")
+          e.getQueryContext.map { c =>
+            g.writeStartObject()
+            g.writeStringField("objectType", c.objectType())
+            g.writeStringField("objectName", c.objectName())
+            val startIndex = c.startIndex() + 1

Review Comment:
   1-based makes it consistent.







[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r948680591


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -3876,6 +3876,15 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
+  val ERROR_MESSAGE_FORMAT = buildConf("spark.sql.error.messageFormat")
+    .doc("When PRETTY, the error message consists of textual representation of error class, " +
+      "message and query context. The MINIMAL and STANDARD formats are pretty JSON formats where " +
+      "STANDARD includes an additional JSON field `message`.")

Review Comment:
   Following that logic, we would have to mention where every SQL config applies. I could mention the Thrift Server as one application of the config, just as an example.
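   For illustration, a hedged usage sketch based on the Thrift Server test in this PR (the `exec` and `intercept` helpers come from ThriftServerWithSparkContextSuite), showing how a session switches the format before running a statement:

   ```scala
   // Hedged usage sketch, reusing the helpers from the test added in this PR.
   exec(s"set ${SQLConf.ANSI_ENABLED.key}=true")
   exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.STANDARD}")
   val err = intercept[HiveSQLException](exec("select 1 / 0"))
   // err.getMessage now carries the pretty-printed JSON, including the extra "message" field.
   ```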





[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r947195431


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -135,4 +143,41 @@ private[spark] object SparkThrowableHelper {
   def isInternalError(errorClass: String): Boolean = {
     errorClass == "INTERNAL_ERROR"
   }
+
+  def getMessage(e: SparkThrowable with Throwable, format: ErrorMessageFormat.Value): String = {
+    import ErrorMessageFormat._
+    format match {
+      case PRETTY => e.getMessage
+      case MINIMAL | STANDARD if e.getErrorClass == null =>
+        val jValue = ("errorClass" -> "legacy") ~
+          ("messageParameters" -> JObject(List("message" -> JString(e.getMessage)))) ~
+          ("queryContext" -> JArray(List.empty))
+        compact(render(jValue))
+      case MINIMAL | STANDARD =>
+        val errorClass = e.getErrorClass
+        val message = if (format == STANDARD) {
+          val errorInfo = errorClassToInfoMap.getOrElse(errorClass,
+            throw new IllegalArgumentException(s"Cannot find error class '$errorClass'"))
+          Some(errorInfo.messageFormat)
+        } else None
+        assert(e.getParameterNames.size == e.getMessageParameters.size,
+          "Number of message parameter names and values must be the same")
+        val jValue = ("errorClass" -> errorClass) ~
+          ("errorSubClass" -> Option(e.getErrorSubClass)) ~
+          ("message" -> message) ~
+          ("sqlState" -> Option(e.getSqlState)) ~
+          ("messageParameters" ->
+            JObject((e.getParameterNames zip e.getMessageParameters.map(JString)).toList)) ~
+          ("queryContext" -> JArray(
+            e.getQueryContext.map(c => JObject(
+              "objectType" -> JString(c.objectType()),
+              "objectName" -> JString(c.objectName()),
+              "startIndex" -> JInt(c.startIndex()),
+              "stopIndex" -> JInt(c.stopIndex()),
+              "fragment" -> JString(c.fragment()))).toList)
+          )
+        val rendered = render(jValue)
+        if (format == MINIMAL) compact(rendered) else pretty(rendered)

Review Comment:
   @srielau The question is whether we should output the error message in the MINIMAL format in the compact or the pretty JSON form.





[GitHub] [spark] cloud-fan commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r946933016


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -135,4 +143,41 @@ private[spark] object SparkThrowableHelper {
   def isInternalError(errorClass: String): Boolean = {
     errorClass == "INTERNAL_ERROR"
   }
+
+  def getMessage(e: SparkThrowable with Throwable, format: ErrorMessageFormat.Value): String = {
+    import ErrorMessageFormat._
+    format match {
+      case PRETTY => e.getMessage
+      case MINIMAL | STANDARD if e.getErrorClass == null =>
+        val jValue = ("errorClass" -> "legacy") ~
+          ("messageParameters" -> JObject(List("message" -> JString(e.getMessage)))) ~
+          ("queryContext" -> JArray(List.empty))
+        compact(render(jValue))
+      case MINIMAL | STANDARD =>
+        val errorClass = e.getErrorClass
+        val message = if (format == STANDARD) {
+          val errorInfo = errorClassToInfoMap.getOrElse(errorClass,
+            throw new IllegalArgumentException(s"Cannot find error class '$errorClass'"))

Review Comment:
   let's use the new `SparkException.internalError`
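
   For context, a rough before/after sketch of what is being asked for (the exact message wording is up to the author):
   ```scala
   // Before: an ad-hoc exception for a missing error class
   throw new IllegalArgumentException(s"Cannot find error class '$errorClass'")

   // After: route it through the dedicated helper for internal errors
   throw SparkException.internalError(s"Cannot find the error class '$errorClass'")
   ```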





[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r948920596


##########
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/ThriftServerWithSparkContextSuite.scala:
##########
@@ -149,6 +149,64 @@ trait ThriftServerWithSparkContextSuite extends SharedThriftServer {
       }
     }
   }
+
+  test("formats of error messages") {
+    val sql = "select 1 / 0"
+    withCLIServiceClient() { client =>
+      val sessionHandle = client.openSession(user, "")
+      val confOverlay = new java.util.HashMap[java.lang.String, java.lang.String]
+      val exec: String => OperationHandle = client.executeStatement(sessionHandle, _, confOverlay)
+
+      exec(s"set ${SQLConf.ANSI_ENABLED.key}=true")
+      exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.PRETTY}")
+      val e1 = intercept[HiveSQLException](exec(sql))
+      // scalastyle:off line.size.limit
+      assert(e1.getMessage ===
+        """Error running query: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
+          |== SQL(line 1, position 8) ==
+          |select 1 / 0
+          |       ^^^^^
+          |""".stripMargin)
+
+      exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.MINIMAL}")
+      val e2 = intercept[HiveSQLException](exec(sql))
+      assert(e2.getMessage ===
+        """Error running query: {
+          |  "errorClass" : "DIVIDE_BY_ZERO",
+          |  "sqlState" : "22012",
+          |  "messageParameters" : {
+          |    "config" : "\"spark.sql.ansi.enabled\""
+          |  },
+          |  "queryContext" : [ {
+          |    "objectType" : "",
+          |    "objectName" : "",
+          |    "startIndex" : 8,
+          |    "stopIndex" : 12,
+          |    "fragment" : "1 / "

Review Comment:
   I have opened the PR w/ a bug fix: https://github.com/apache/spark/pull/37566





[GitHub] [spark] MaxGekk closed pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk closed pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server
URL: https://github.com/apache/spark/pull/37520




[GitHub] [spark] cloud-fan commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r948640467


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -3876,6 +3876,15 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
+  val ERROR_MESSAGE_FORMAT = buildConf("spark.sql.error.messageFormat")
+    .doc("When PRETTY, the error message consists of textual representation of error class, " +
+      "message and query context. The MINIMAL and STANDARD formats are pretty JSON formats where " +
+      "STANDARD includes an additional JSON field `message`.")

Review Comment:
   let's also mention where this config applies: thriftserver and sql shell
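
   For illustration, the addition could be a single extra sentence in the `.doc(...)` string, e.g. (the wording below is only a suggestion):
   ```scala
   .doc("When PRETTY, the error message consists of textual representation of error class, " +
     "message and query context. The MINIMAL and STANDARD formats are pretty JSON formats where " +
     "STANDARD includes an additional JSON field `message`. This configuration is used by " +
     "user-facing applications such as the Thrift Server and the SQL CLI when they format errors.")
   ```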





[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r948732140


##########
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/ThriftServerWithSparkContextSuite.scala:
##########
@@ -149,6 +149,64 @@ trait ThriftServerWithSparkContextSuite extends SharedThriftServer {
       }
     }
   }
+
+  test("formats of error messages") {
+    val sql = "select 1 / 0"
+    withCLIServiceClient() { client =>
+      val sessionHandle = client.openSession(user, "")
+      val confOverlay = new java.util.HashMap[java.lang.String, java.lang.String]
+      val exec: String => OperationHandle = client.executeStatement(sessionHandle, _, confOverlay)
+
+      exec(s"set ${SQLConf.ANSI_ENABLED.key}=true")
+      exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.PRETTY}")
+      val e1 = intercept[HiveSQLException](exec(sql))
+      // scalastyle:off line.size.limit
+      assert(e1.getMessage ===
+        """Error running query: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
+          |== SQL(line 1, position 8) ==
+          |select 1 / 0
+          |       ^^^^^
+          |""".stripMargin)
+
+      exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.MINIMAL}")
+      val e2 = intercept[HiveSQLException](exec(sql))
+      assert(e2.getMessage ===
+        """Error running query: {
+          |  "errorClass" : "DIVIDE_BY_ZERO",
+          |  "sqlState" : "22012",
+          |  "messageParameters" : {
+          |    "config" : "\"spark.sql.ansi.enabled\""
+          |  },
+          |  "queryContext" : [ {
+          |    "objectType" : "",
+          |    "objectName" : "",
+          |    "startIndex" : 8,
+          |    "stopIndex" : 12,
+          |    "fragment" : "1 / "

Review Comment:
   Seems like the issue is in:
   ```scala
     override lazy val fragment: String = {
       if (!isValid) {
         ""
       } else {
         sqlText.get.substring(originStartIndex.get, originStopIndex.get)
       }
     }
   ```
   `originStopIndex.get` points to the last char of the fragment but `substring()` takes a sub-string up to `endIndex - 1`.
   
   Let me open a PR and fix this separately. 
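
   For reference, the fix sketched below assumes `originStopIndex` stays inclusive and just compensates for the exclusive end index of `substring()`; the actual change is done in the separate PR:
   ```scala
   override lazy val fragment: String = {
     if (!isValid) {
       ""
     } else {
       // originStopIndex points at the last character of the fragment, while
       // String.substring() excludes the end index, so add 1 to keep that character.
       sqlText.get.substring(originStartIndex.get, originStopIndex.get + 1)
     }
   }
   ```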





[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r947703504


##########
core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala:
##########
@@ -222,4 +222,56 @@ class SparkThrowableSuite extends SparkFunSuite {
         assert(false)
     }
   }
+
+  test("get message in the specified format") {
+    import ErrorMessageFormat._
+    class TestQueryContext extends QueryContext {
+      override val objectName = "v1"
+      override val objectType = "VIEW"
+      override val startIndex = 2
+      override val stopIndex = -1
+      override val fragment = "1 / 0"
+    }
+    val e = new SparkArithmeticException(
+      errorClass = "DIVIDE_BY_ZERO",
+      errorSubClass = None,
+      messageParameters = Array("CONFIG"),
+      context = Array(new TestQueryContext),
+      summary = "Query summary")
+
+    assert(SparkThrowableHelper.getMessage(e, PRETTY) ===
+      "[DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 " +
+      "and return NULL instead. If necessary set CONFIG to \"false\" to bypass this error." +
+      "\nQuery summary")
+    assert(SparkThrowableHelper.getMessage(e, MINIMAL) ===
+      """{"errorClass":"DIVIDE_BY_ZERO","sqlState":"22012",""" +
+      """"messageParameters":{"config":"CONFIG"},"queryContext":[{"objectType":"VIEW",""" +
+      """"objectName":"v1","startIndex":3,"fragment":"1 / 0"}]}""")

Review Comment:
   OK, I will print MINIMAL in the pretty form.





[GitHub] [spark] cloud-fan commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r954599653


##########
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServerErrors.scala:
##########
@@ -36,11 +36,10 @@ object HiveThriftServerErrors {
       " new task for execution, please retry the operation", rejected)
   }
 
-  def runningQueryError(e: Throwable): Throwable = e match {
-    case st: SparkThrowable =>
-      val errorClassPrefix = Option(st.getErrorClass).map(e => s"[$e] ").getOrElse("")

Review Comment:
   I noticed that the thriftserver already prints the error class. Can we keep it when the format is PRETTY?








[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r954973980


##########
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServerErrors.scala:
##########
@@ -36,11 +36,10 @@ object HiveThriftServerErrors {
       " new task for execution, please retry the operation", rejected)
   }
 
-  def runningQueryError(e: Throwable): Throwable = e match {
-    case st: SparkThrowable =>
-      val errorClassPrefix = Option(st.getErrorClass).map(e => s"[$e] ").getOrElse("")

Review Comment:
   > ah, do you mean before this PR thriftserver print error class twice?
   
   Yep, it did.





[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r947694236


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -135,4 +140,59 @@ private[spark] object SparkThrowableHelper {
   def isInternalError(errorClass: String): Boolean = {
     errorClass == "INTERNAL_ERROR"
   }
+
+  def getMessage(e: SparkThrowable with Throwable, format: ErrorMessageFormat.Value): String = {
+    import ErrorMessageFormat._
+    format match {
+      case PRETTY => e.getMessage
+      case MINIMAL | STANDARD if e.getErrorClass == null =>
+        toJsonString { g =>
+          g.writeStartObject()
+          g.writeStringField("errorClass", "legacy")
+          g.writeObjectFieldStart("messageParameters")
+          g.writeStringField("message", e.getMessage)
+          g.writeEndObject()
+          g.writeArrayFieldStart("queryContext")
+          g.writeEndArray()
+          g.writeEndObject()
+        }
+      case MINIMAL | STANDARD =>
+        val errorClass = e.getErrorClass
+        assert(e.getParameterNames.size == e.getMessageParameters.size,
+          "Number of message parameter names and values must be the same")
+        toJsonString { generator =>
+          val g = if (format == STANDARD) generator.useDefaultPrettyPrinter() else generator
+          g.writeStartObject()
+          g.writeStringField("errorClass", errorClass)
+          val errorSubClass = e.getErrorSubClass
+          if (errorSubClass != null) g.writeStringField("errorSubClass", errorSubClass)
+          if (format == STANDARD) {
+            val errorInfo = errorClassToInfoMap.getOrElse(errorClass,
+              throw SparkException.internalError(s"Cannot find the error class '$errorClass'"))
+            g.writeStringField("message", errorInfo.messageFormat)
+          }
+          val sqlState = e.getSqlState
+          if (sqlState != null) g.writeStringField("sqlState", sqlState)
+          g.writeObjectFieldStart("messageParameters")
+          (e.getParameterNames zip e.getMessageParameters).foreach { case (name, value) =>
+            g.writeStringField(name, value)
+          }
+          g.writeEndObject()
+          g.writeArrayFieldStart("queryContext")
+          e.getQueryContext.map { c =>
+            g.writeStartObject()
+            g.writeStringField("objectType", c.objectType())
+            g.writeStringField("objectName", c.objectName())
+            val startIndex = c.startIndex() + 1

Review Comment:
   @cloud-fan Could you elaborate on it a little bit more, please? What is the original error message? After https://github.com/apache/spark/pull/36651, the position is 1-based, and after my last commit the indexes are 1-based as well. What do you mean by inconsistent?





[GitHub] [spark] cloud-fan commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r947681963


##########
core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala:
##########
@@ -222,4 +222,56 @@ class SparkThrowableSuite extends SparkFunSuite {
         assert(false)
     }
   }
+
+  test("get message in the specified format") {
+    import ErrorMessageFormat._
+    class TestQueryContext extends QueryContext {
+      override val objectName = "v1"
+      override val objectType = "VIEW"
+      override val startIndex = 2
+      override val stopIndex = -1
+      override val fragment = "1 / 0"
+    }
+    val e = new SparkArithmeticException(
+      errorClass = "DIVIDE_BY_ZERO",
+      errorSubClass = None,
+      messageParameters = Array("CONFIG"),
+      context = Array(new TestQueryContext),
+      summary = "Query summary")
+
+    assert(SparkThrowableHelper.getMessage(e, PRETTY) ===
+      "[DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 " +
+      "and return NULL instead. If necessary set CONFIG to \"false\" to bypass this error." +
+      "\nQuery summary")
+    assert(SparkThrowableHelper.getMessage(e, MINIMAL) ===
+      """{"errorClass":"DIVIDE_BY_ZERO","sqlState":"22012",""" +
+      """"messageParameters":{"config":"CONFIG"},"queryContext":[{"objectType":"VIEW",""" +
+      """"objectName":"v1","startIndex":3,"fragment":"1 / 0"}]}""")

Review Comment:
   The only benefit of using the compact format is to save space, but I don't think that matters for error messages. Shall we always make it human-readable?





[GitHub] [spark] cloud-fan commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r947677976


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -135,4 +140,59 @@ private[spark] object SparkThrowableHelper {
   def isInternalError(errorClass: String): Boolean = {
     errorClass == "INTERNAL_ERROR"
   }
+
+  def getMessage(e: SparkThrowable with Throwable, format: ErrorMessageFormat.Value): String = {
+    import ErrorMessageFormat._
+    format match {
+      case PRETTY => e.getMessage
+      case MINIMAL | STANDARD if e.getErrorClass == null =>
+        toJsonString { g =>
+          g.writeStartObject()
+          g.writeStringField("errorClass", "legacy")
+          g.writeObjectFieldStart("messageParameters")
+          g.writeStringField("message", e.getMessage)
+          g.writeEndObject()
+          g.writeArrayFieldStart("queryContext")
+          g.writeEndArray()
+          g.writeEndObject()
+        }
+      case MINIMAL | STANDARD =>
+        val errorClass = e.getErrorClass
+        assert(e.getParameterNames.size == e.getMessageParameters.size,
+          "Number of message parameter names and values must be the same")
+        toJsonString { generator =>
+          val g = if (format == STANDARD) generator.useDefaultPrettyPrinter() else generator
+          g.writeStartObject()
+          g.writeStringField("errorClass", errorClass)
+          val errorSubClass = e.getErrorSubClass
+          if (errorSubClass != null) g.writeStringField("errorSubClass", errorSubClass)
+          if (format == STANDARD) {
+            val errorInfo = errorClassToInfoMap.getOrElse(errorClass,
+              throw SparkException.internalError(s"Cannot find the error class '$errorClass'"))
+            g.writeStringField("message", errorInfo.messageFormat)
+          }
+          val sqlState = e.getSqlState
+          if (sqlState != null) g.writeStringField("sqlState", sqlState)
+          g.writeObjectFieldStart("messageParameters")
+          (e.getParameterNames zip e.getMessageParameters).foreach { case (name, value) =>
+            g.writeStringField(name, value)
+          }
+          g.writeEndObject()
+          g.writeArrayFieldStart("queryContext")
+          e.getQueryContext.map { c =>
+            g.writeStartObject()
+            g.writeStringField("objectType", c.objectType())
+            g.writeStringField("objectName", c.objectName())
+            val startIndex = c.startIndex() + 1

Review Comment:
   Does this mean it's inconsistent with the original text error message? cc @gengliangwang





[GitHub] [spark] srielau commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
srielau commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r947171653


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -135,4 +143,41 @@ private[spark] object SparkThrowableHelper {
   def isInternalError(errorClass: String): Boolean = {
     errorClass == "INTERNAL_ERROR"
   }
+
+  def getMessage(e: SparkThrowable with Throwable, format: ErrorMessageFormat.Value): String = {
+    import ErrorMessageFormat._
+    format match {
+      case PRETTY => e.getMessage
+      case MINIMAL | STANDARD if e.getErrorClass == null =>
+        val jValue = ("errorClass" -> "legacy") ~
+          ("messageParameters" -> JObject(List("message" -> JString(e.getMessage)))) ~
+          ("queryContext" -> JArray(List.empty))
+        compact(render(jValue))
+      case MINIMAL | STANDARD =>
+        val errorClass = e.getErrorClass
+        val message = if (format == STANDARD) {
+          val errorInfo = errorClassToInfoMap.getOrElse(errorClass,
+            throw new IllegalArgumentException(s"Cannot find error class '$errorClass'"))
+          Some(errorInfo.messageFormat)
+        } else None
+        assert(e.getParameterNames.size == e.getMessageParameters.size,
+          "Number of message parameter names and values must be the same")
+        val jValue = ("errorClass" -> errorClass) ~
+          ("errorSubClass" -> Option(e.getErrorSubClass)) ~
+          ("message" -> message) ~
+          ("sqlState" -> Option(e.getSqlState)) ~
+          ("messageParameters" ->
+            JObject((e.getParameterNames zip e.getMessageParameters.map(JString)).toList)) ~
+          ("queryContext" -> JArray(
+            e.getQueryContext.map(c => JObject(
+              "objectType" -> JString(c.objectType()),
+              "objectName" -> JString(c.objectName()),
+              "startIndex" -> JInt(c.startIndex()),
+              "stopIndex" -> JInt(c.stopIndex()),
+              "fragment" -> JString(c.fragment()))).toList)
+          )
+        val rendered = render(jValue)
+        if (format == MINIMAL) compact(rendered) else pretty(rendered)

Review Comment:
   PS: We need an API (a table function?) to return error messages.





[GitHub] [spark] gengliangwang commented on pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on PR #37520:
URL: https://github.com/apache/spark/pull/37520#issuecomment-1217106692

   > @srielau If you mean https://github.com/apache/spark/pull/36651 but it increased the position in textual form of query context not in QueryContext itself. @gengliangwang Am I right?
   
   Yes I changed the position in the text summary.
   For Json format output, I am ok with 0-based or 1-based. Either way makes sense...




[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r948683436


##########
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/ThriftServerWithSparkContextSuite.scala:
##########
@@ -149,6 +149,64 @@ trait ThriftServerWithSparkContextSuite extends SharedThriftServer {
       }
     }
   }
+
+  test("formats of error messages") {
+    val sql = "select 1 / 0"
+    withCLIServiceClient() { client =>
+      val sessionHandle = client.openSession(user, "")
+      val confOverlay = new java.util.HashMap[java.lang.String, java.lang.String]
+      val exec: String => OperationHandle = client.executeStatement(sessionHandle, _, confOverlay)
+
+      exec(s"set ${SQLConf.ANSI_ENABLED.key}=true")
+      exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.PRETTY}")
+      val e1 = intercept[HiveSQLException](exec(sql))
+      // scalastyle:off line.size.limit
+      assert(e1.getMessage ===
+        """Error running query: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
+          |== SQL(line 1, position 8) ==
+          |select 1 / 0
+          |       ^^^^^
+          |""".stripMargin)
+
+      exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.MINIMAL}")
+      val e2 = intercept[HiveSQLException](exec(sql))
+      assert(e2.getMessage ===
+        """Error running query: {
+          |  "errorClass" : "DIVIDE_BY_ZERO",
+          |  "sqlState" : "22012",
+          |  "messageParameters" : {
+          |    "config" : "\"spark.sql.ansi.enabled\""
+          |  },
+          |  "queryContext" : [ {
+          |    "objectType" : "",
+          |    "objectName" : "",
+          |    "startIndex" : 8,
+          |    "stopIndex" : 12,
+          |    "fragment" : "1 / "

Review Comment:
   Probably





[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r948693663


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -135,4 +140,59 @@ private[spark] object SparkThrowableHelper {
   def isInternalError(errorClass: String): Boolean = {
     errorClass == "INTERNAL_ERROR"
   }
+
+  def getMessage(e: SparkThrowable with Throwable, format: ErrorMessageFormat.Value): String = {
+    import ErrorMessageFormat._
+    format match {
+      case PRETTY => e.getMessage
+      case MINIMAL | STANDARD if e.getErrorClass == null =>
+        toJsonString { g =>
+          g.writeStartObject()
+          g.writeStringField("errorClass", "LEGACY")
+          g.writeObjectFieldStart("messageParameters")
+          g.writeStringField("message", e.getMessage)
+          g.writeEndObject()
+          g.writeArrayFieldStart("queryContext")
+          g.writeEndArray()
+          g.writeEndObject()
+        }
+      case MINIMAL | STANDARD =>
+        val errorClass = e.getErrorClass
+        assert(e.getParameterNames.size == e.getMessageParameters.size,
+          "Number of message parameter names and values must be the same")
+        toJsonString { generator =>
+          val g = generator.useDefaultPrettyPrinter()
+          g.writeStartObject()
+          g.writeStringField("errorClass", errorClass)
+          val errorSubClass = e.getErrorSubClass
+          if (errorSubClass != null) g.writeStringField("errorSubClass", errorSubClass)
+          if (format == STANDARD) {
+            val errorInfo = errorClassToInfoMap.getOrElse(errorClass,
+              throw SparkException.internalError(s"Cannot find the error class '$errorClass'"))
+            g.writeStringField("message", errorInfo.messageFormat)
+          }
+          val sqlState = e.getSqlState
+          if (sqlState != null) g.writeStringField("sqlState", sqlState)
+          g.writeObjectFieldStart("messageParameters")
+          (e.getParameterNames zip e.getMessageParameters).foreach { case (name, value) =>
+            g.writeStringField(name, value)
+          }
+          g.writeEndObject()
+          g.writeArrayFieldStart("queryContext")

Review Comment:
   I will remove it.





[GitHub] [spark] cloud-fan commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r948706245


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -3876,6 +3876,15 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
+  val ERROR_MESSAGE_FORMAT = buildConf("spark.sql.error.messageFormat")
+    .doc("When PRETTY, the error message consists of textual representation of error class, " +
+      "message and query context. The MINIMAL and STANDARD formats are pretty JSON formats where " +
+      "STANDARD includes an additional JSON field `message`.")

Review Comment:
   SGTM





[GitHub] [spark] cloud-fan commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r948707966


##########
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/ThriftServerWithSparkContextSuite.scala:
##########
@@ -149,6 +149,64 @@ trait ThriftServerWithSparkContextSuite extends SharedThriftServer {
       }
     }
   }
+
+  test("formats of error messages") {
+    val sql = "select 1 / 0"
+    withCLIServiceClient() { client =>
+      val sessionHandle = client.openSession(user, "")
+      val confOverlay = new java.util.HashMap[java.lang.String, java.lang.String]
+      val exec: String => OperationHandle = client.executeStatement(sessionHandle, _, confOverlay)
+
+      exec(s"set ${SQLConf.ANSI_ENABLED.key}=true")
+      exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.PRETTY}")
+      val e1 = intercept[HiveSQLException](exec(sql))
+      // scalastyle:off line.size.limit
+      assert(e1.getMessage ===
+        """Error running query: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
+          |== SQL(line 1, position 8) ==
+          |select 1 / 0
+          |       ^^^^^
+          |""".stripMargin)
+
+      exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.MINIMAL}")
+      val e2 = intercept[HiveSQLException](exec(sql))
+      assert(e2.getMessage ===
+        """Error running query: {
+          |  "errorClass" : "DIVIDE_BY_ZERO",
+          |  "sqlState" : "22012",
+          |  "messageParameters" : {
+          |    "config" : "\"spark.sql.ansi.enabled\""
+          |  },
+          |  "queryContext" : [ {
+          |    "objectType" : "",
+          |    "objectName" : "",
+          |    "startIndex" : 8,
+          |    "stopIndex" : 12,
+          |    "fragment" : "1 / "

Review Comment:
   Is this bug thrift-server only? If yes, we can probably create a ticket and ask someone who knows the thriftserver better to fix it.





[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r948725197


##########
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/ThriftServerWithSparkContextSuite.scala:
##########
@@ -149,6 +149,64 @@ trait ThriftServerWithSparkContextSuite extends SharedThriftServer {
       }
     }
   }
+
+  test("formats of error messages") {
+    val sql = "select 1 / 0"
+    withCLIServiceClient() { client =>
+      val sessionHandle = client.openSession(user, "")
+      val confOverlay = new java.util.HashMap[java.lang.String, java.lang.String]
+      val exec: String => OperationHandle = client.executeStatement(sessionHandle, _, confOverlay)
+
+      exec(s"set ${SQLConf.ANSI_ENABLED.key}=true")
+      exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.PRETTY}")
+      val e1 = intercept[HiveSQLException](exec(sql))
+      // scalastyle:off line.size.limit
+      assert(e1.getMessage ===
+        """Error running query: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
+          |== SQL(line 1, position 8) ==
+          |select 1 / 0
+          |       ^^^^^
+          |""".stripMargin)
+
+      exec(s"set ${SQLConf.ERROR_MESSAGE_FORMAT.key}=${ErrorMessageFormat.MINIMAL}")
+      val e2 = intercept[HiveSQLException](exec(sql))
+      assert(e2.getMessage ===
+        """Error running query: {
+          |  "errorClass" : "DIVIDE_BY_ZERO",
+          |  "sqlState" : "22012",
+          |  "messageParameters" : {
+          |    "config" : "\"spark.sql.ansi.enabled\""
+          |  },
+          |  "queryContext" : [ {
+          |    "objectType" : "",
+          |    "objectName" : "",
+          |    "startIndex" : 8,
+          |    "stopIndex" : 12,
+          |    "fragment" : "1 / "

Review Comment:
   It is not thrift-server specific:
   ```scala
       withSQLConf(SQLConf.ANSI_ENABLED.key -> "true") {
         val e = intercept[SparkArithmeticException] {
           sql("select 1 / 0").collect()
         }
         println("'" + e.getQueryContext()(0).fragment() + "'")
       }
   
   '1 / '
   ```





[GitHub] [spark] MaxGekk commented on pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on PR #37520:
URL: https://github.com/apache/spark/pull/37520#issuecomment-1216216727

   @srielau @anchovYu Please take a look at the PR.




[GitHub] [spark] cloud-fan commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r946938322


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -135,4 +143,41 @@ private[spark] object SparkThrowableHelper {
   def isInternalError(errorClass: String): Boolean = {
     errorClass == "INTERNAL_ERROR"
   }
+
+  def getMessage(e: SparkThrowable with Throwable, format: ErrorMessageFormat.Value): String = {
+    import ErrorMessageFormat._
+    format match {
+      case PRETTY => e.getMessage
+      case MINIMAL | STANDARD if e.getErrorClass == null =>
+        val jValue = ("errorClass" -> "legacy") ~
+          ("messageParameters" -> JObject(List("message" -> JString(e.getMessage)))) ~
+          ("queryContext" -> JArray(List.empty))
+        compact(render(jValue))
+      case MINIMAL | STANDARD =>
+        val errorClass = e.getErrorClass
+        val message = if (format == STANDARD) {
+          val errorInfo = errorClassToInfoMap.getOrElse(errorClass,
+            throw new IllegalArgumentException(s"Cannot find error class '$errorClass'"))
+          Some(errorInfo.messageFormat)
+        } else None
+        assert(e.getParameterNames.size == e.getMessageParameters.size,
+          "Number of message parameter names and values must be the same")
+        val jValue = ("errorClass" -> errorClass) ~
+          ("errorSubClass" -> Option(e.getErrorSubClass)) ~
+          ("message" -> message) ~
+          ("sqlState" -> Option(e.getSqlState)) ~
+          ("messageParameters" ->
+            JObject((e.getParameterNames zip e.getMessageParameters.map(JString)).toList)) ~
+          ("queryContext" -> JArray(
+            e.getQueryContext.map(c => JObject(
+              "objectType" -> JString(c.objectType()),
+              "objectName" -> JString(c.objectName()),
+              "startIndex" -> JInt(c.startIndex()),
+              "stopIndex" -> JInt(c.stopIndex()),
+              "fragment" -> JString(c.fragment()))).toList)
+          )
+        val rendered = render(jValue)
+        if (format == MINIMAL) compact(rendered) else pretty(rendered)

Review Comment:
   Shall we always use the pretty JSON output instead of the compact one?








[GitHub] [spark] MaxGekk commented on a diff in pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on code in PR #37520:
URL: https://github.com/apache/spark/pull/37520#discussion_r947039507


##########
core/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##########
@@ -135,4 +143,41 @@ private[spark] object SparkThrowableHelper {
   def isInternalError(errorClass: String): Boolean = {
     errorClass == "INTERNAL_ERROR"
   }
+
+  def getMessage(e: SparkThrowable with Throwable, format: ErrorMessageFormat.Value): String = {
+    import ErrorMessageFormat._
+    format match {
+      case PRETTY => e.getMessage
+      case MINIMAL | STANDARD if e.getErrorClass == null =>
+        val jValue = ("errorClass" -> "legacy") ~
+          ("messageParameters" -> JObject(List("message" -> JString(e.getMessage)))) ~
+          ("queryContext" -> JArray(List.empty))
+        compact(render(jValue))
+      case MINIMAL | STANDARD =>
+        val errorClass = e.getErrorClass
+        val message = if (format == STANDARD) {
+          val errorInfo = errorClassToInfoMap.getOrElse(errorClass,
+            throw new IllegalArgumentException(s"Cannot find error class '$errorClass'"))
+          Some(errorInfo.messageFormat)
+        } else None
+        assert(e.getParameterNames.size == e.getMessageParameters.size,
+          "Number of message parameter names and values must be the same")
+        val jValue = ("errorClass" -> errorClass) ~
+          ("errorSubClass" -> Option(e.getErrorSubClass)) ~
+          ("message" -> message) ~
+          ("sqlState" -> Option(e.getSqlState)) ~
+          ("messageParameters" ->
+            JObject((e.getParameterNames zip e.getMessageParameters.map(JString)).toList)) ~
+          ("queryContext" -> JArray(
+            e.getQueryContext.map(c => JObject(
+              "objectType" -> JString(c.objectType()),
+              "objectName" -> JString(c.objectName()),
+              "startIndex" -> JInt(c.startIndex()),
+              "stopIndex" -> JInt(c.stopIndex()),
+              "fragment" -> JString(c.fragment()))).toList)
+          )
+        val rendered = render(jValue)
+        if (format == MINIMAL) compact(rendered) else pretty(rendered)

Review Comment:
   I assumed that MINIMAL means the minimal size of the resulting JSON, which will be parsed by machinery. What's the use case for the MINIMAL form from your point of view? Do you mean that it could improve the readability of golden files?
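
   To make the "parsed by machinery" point concrete, a hypothetical consumer (not part of the PR) could read the MINIMAL output with a plain Jackson `ObjectMapper` instead of pattern-matching the pretty text:
   ```scala
   import com.fasterxml.jackson.databind.ObjectMapper

   object ReadMinimalErrorExample {
     def main(args: Array[String]): Unit = {
       // A made-up MINIMAL payload, trimmed to two fields for the example.
       val json = """{"errorClass":"DIVIDE_BY_ZERO","sqlState":"22012"}"""
       val root = new ObjectMapper().readTree(json)
       println(root.get("errorClass").asText()) // DIVIDE_BY_ZERO
       println(root.get("sqlState").asText())   // 22012
     }
   }
   ```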





[GitHub] [spark] MaxGekk commented on pull request #37520: [SPARK-40098][SQL] Format error messages in the Thrift Server

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on PR #37520:
URL: https://github.com/apache/spark/pull/37520#issuecomment-1217135789

   > For Json format output, I am ok with 0-based or 1-based. Either way makes sense...
   
   If we assume that the messages in the JSON format will be parsed by machinery, it would be more natural to begin the indexes from 0, IMHO.

