You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/05/12 01:54:10 UTC

[GitHub] [spark] HeartSaVioR opened a new pull request, #41150: [SPARK-43482][SS] Expand QueryTerminatedEvent to contain error class if it exists in exception

HeartSaVioR opened a new pull request, #41150:
URL: https://github.com/apache/spark/pull/41150

   ### What changes were proposed in this pull request?
   
   This adds a new field in QueryTerminatedEvent about "error class" if the information exists in the exception. Note that it doesn't retrieve the information from StreamingQueryException, because it is marked as it's own error class and we want to know the causing exception.
   
   ### Why are the changes needed?
   
   Custom streaming query listeners can get the "categorized" error as cause of the query termination, which can be used to determine trends of the errors, aggregation, etc.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, users having customer streaming query listener will have additional information in termination event.
   
   ### How was this patch tested?
   
   Modified UTs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a diff in pull request #41150: [SPARK-43482][SS] Expand QueryTerminatedEvent to contain error class if it exists in exception

Posted by "HeartSaVioR (via GitHub)" <gi...@apache.org>.
HeartSaVioR commented on code in PR #41150:
URL: https://github.com/apache/spark/pull/41150#discussion_r1193464953


##########
sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala:
##########
@@ -154,11 +154,22 @@ object StreamingQueryListener {
    * @param runId A query id that is unique for every start/restart. See `StreamingQuery.runId()`.
    * @param exception The exception message of the query if the query was terminated
    *                  with an exception. Otherwise, it will be `None`.
+   * @param errorClassOnException The error class from the exception if the query was terminated
+   *                              with an exception which is a part of error class framework.

Review Comment:
   Do we think it'd be better to not mention about error class framework and say "this gives an error classification if the exception is available and classified"? If then I can avoid mentioning error class framework.
   
   @HyukjinKwon WDYT? cc. @MaxGekk 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #41150: [SPARK-43482][SS] Expand QueryTerminatedEvent to contain error class if it exists in exception

Posted by "HeartSaVioR (via GitHub)" <gi...@apache.org>.
HeartSaVioR commented on PR #41150:
URL: https://github.com/apache/spark/pull/41150#issuecomment-1544992096

   cc. @zsxwing @viirya @HyukjinKwon Please take a look. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a diff in pull request #41150: [SPARK-43482][SS] Expand QueryTerminatedEvent to contain error class if it exists in exception

Posted by "HeartSaVioR (via GitHub)" <gi...@apache.org>.
HeartSaVioR commented on code in PR #41150:
URL: https://github.com/apache/spark/pull/41150#discussion_r1193464953


##########
sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala:
##########
@@ -154,11 +154,22 @@ object StreamingQueryListener {
    * @param runId A query id that is unique for every start/restart. See `StreamingQuery.runId()`.
    * @param exception The exception message of the query if the query was terminated
    *                  with an exception. Otherwise, it will be `None`.
+   * @param errorClassOnException The error class from the exception if the query was terminated
+   *                              with an exception which is a part of error class framework.

Review Comment:
   Do we think it'd be better to say this gives an error classification if the exception is available and classified, without mentioning error class framework? If then I can avoid mentioning error class framework.
   
   @HyukjinKwon WDYT? cc. @MaxGekk 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a diff in pull request #41150: [SPARK-43482][SS] Expand QueryTerminatedEvent to contain error class if it exists in exception

Posted by "HeartSaVioR (via GitHub)" <gi...@apache.org>.
HeartSaVioR commented on code in PR #41150:
URL: https://github.com/apache/spark/pull/41150#discussion_r1193464953


##########
sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala:
##########
@@ -154,11 +154,22 @@ object StreamingQueryListener {
    * @param runId A query id that is unique for every start/restart. See `StreamingQuery.runId()`.
    * @param exception The exception message of the query if the query was terminated
    *                  with an exception. Otherwise, it will be `None`.
+   * @param errorClassOnException The error class from the exception if the query was terminated
+   *                              with an exception which is a part of error class framework.

Review Comment:
   Do we think it'd be better to say "this gives an error classification if the exception is available and classified", which means, avoiding to mention about error class framework? If then I can avoid mentioning error class framework.
   
   @HyukjinKwon WDYT? cc. @MaxGekk 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #41150: [SPARK-43482][SS] Expand QueryTerminatedEvent to contain error class if it exists in exception

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #41150:
URL: https://github.com/apache/spark/pull/41150#discussion_r1194519170


##########
sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala:
##########
@@ -154,11 +154,22 @@ object StreamingQueryListener {
    * @param runId A query id that is unique for every start/restart. See `StreamingQuery.runId()`.
    * @param exception The exception message of the query if the query was terminated
    *                  with an exception. Otherwise, it will be `None`.
+   * @param errorClassOnException The error class from the exception if the query was terminated
+   *                              with an exception which is a part of error class framework.

Review Comment:
   I think it's fine .. it's sort of documented ..e.g., https://github.com/apache/spark/blob/master/docs/sql-error-conditions-connect-error-class.md



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a diff in pull request #41150: [SPARK-43482][SS] Expand QueryTerminatedEvent to contain error class if it exists in exception

Posted by "HeartSaVioR (via GitHub)" <gi...@apache.org>.
HeartSaVioR commented on code in PR #41150:
URL: https://github.com/apache/spark/pull/41150#discussion_r1193454180


##########
sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala:
##########
@@ -154,11 +154,22 @@ object StreamingQueryListener {
    * @param runId A query id that is unique for every start/restart. See `StreamingQuery.runId()`.
    * @param exception The exception message of the query if the query was terminated
    *                  with an exception. Otherwise, it will be `None`.
+   * @param errorClassOnException The error class from the exception if the query was terminated
+   *                              with an exception which is a part of error class framework.

Review Comment:
   SparkException and SparkThrowable are all public API where we expose to API doc. It's just that we lack about documentation for these classes. That's something we have to fix; we could probably refer to SparkThrowable.getErrorClass here, but it's not documented so wouldn't help unfortunately.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon closed pull request #41150: [SPARK-43482][SS] Expand QueryTerminatedEvent to contain error class if it exists in exception

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #41150: [SPARK-43482][SS] Expand QueryTerminatedEvent to contain error class if it exists in exception
URL: https://github.com/apache/spark/pull/41150


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #41150: [SPARK-43482][SS] Expand QueryTerminatedEvent to contain error class if it exists in exception

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #41150:
URL: https://github.com/apache/spark/pull/41150#discussion_r1193292405


##########
python/pyspark/sql/streaming/listener.py:
##########
@@ -295,6 +299,17 @@ def exception(self) -> Optional[str]:
         """
         return self._exception
 
+    @property
+    def errorClassOnException(self) -> Optional[str]:
+        """
+        The error class from the exception if the query was terminated
+        with an exception which is a part of error class framework.
+        If the query was terminated without an exception, or the
+        exception is not a part of error class framework, it will be
+        `None`.

Review Comment:
   ```suggestion
           `None`.
   
           .. versionadded:: 3.5.0
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #41150: [SPARK-43482][SS] Expand QueryTerminatedEvent to contain error class if it exists in exception

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #41150:
URL: https://github.com/apache/spark/pull/41150#issuecomment-1548872475

   Merged to master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on a diff in pull request #41150: [SPARK-43482][SS] Expand QueryTerminatedEvent to contain error class if it exists in exception

Posted by "HeartSaVioR (via GitHub)" <gi...@apache.org>.
HeartSaVioR commented on code in PR #41150:
URL: https://github.com/apache/spark/pull/41150#discussion_r1194535759


##########
sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala:
##########
@@ -154,11 +154,22 @@ object StreamingQueryListener {
    * @param runId A query id that is unique for every start/restart. See `StreamingQuery.runId()`.
    * @param exception The exception message of the query if the query was terminated
    *                  with an exception. Otherwise, it will be `None`.
+   * @param errorClassOnException The error class from the exception if the query was terminated
+   *                              with an exception which is a part of error class framework.

Review Comment:
   https://spark.apache.org/docs/latest/sql-error-conditions.html
   
   Above is the landing page on website. It's in SQL guide doc.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] viirya commented on a diff in pull request #41150: [SPARK-43482][SS] Expand QueryTerminatedEvent to contain error class if it exists in exception

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #41150:
URL: https://github.com/apache/spark/pull/41150#discussion_r1193388602


##########
sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala:
##########
@@ -154,11 +154,22 @@ object StreamingQueryListener {
    * @param runId A query id that is unique for every start/restart. See `StreamingQuery.runId()`.
    * @param exception The exception message of the query if the query was terminated
    *                  with an exception. Otherwise, it will be `None`.
+   * @param errorClassOnException The error class from the exception if the query was terminated
+   *                              with an exception which is a part of error class framework.

Review Comment:
   Is error class framework a public API that end-users understand or can find document?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org