You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "amahussein (via GitHub)" <gi...@apache.org> on 2023/05/04 19:03:04 UTC

[GitHub] [spark] amahussein opened a new pull request, #41050: [SPARK-43340][CORE] Handle missing stack-trace field in eventlogs

amahussein opened a new pull request, #41050:
URL: https://github.com/apache/spark/pull/41050

   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
     8. If you want to add or modify an error type or message, please read the guideline first in
        'core/src/main/resources/error/README.md'.
   -->
   
   ### What changes were proposed in this pull request?
   
   This PR fixes a regression introduced by #36885 which broke JsonProtocol's ability to handle missing fields from exception field. old eventlogs missing a `Stack Trace` will raise a NPE.  
   As a result, SHS misinterprets  failed-jobs/SQLs as `Active/Incomplete` 
   
   This PR solves this problem by checking the JsonNode for null. If it is null, an empty array of `StackTraceElements`
   
   
   ### Why are the changes needed?
   
   Fix a case which prevents the history server from identifying failed jobs if the stacktrace was not set.
   
   Example eventlog
   
   ```
   {
      "Event":"SparkListenerJobEnd",
      "Job ID":31,
      "Completion Time":1616171909785,
      "Job Result":{
         "Result":"JobFailed",
         "Exception":{
            "Message":"Job aborted"
         }
      }
   } 
   ```
   
   **Original behavior:**
   
   The job is marked as `incomplete`
   
   Error from the SHS logs:
   
   ```
   23/05/01 21:57:16 INFO FsHistoryProvider: Parsing file:/tmp/nds_q86_fail_test to re-build UI...
   23/05/01 21:57:17 ERROR ReplayListenerBus: Exception parsing Spark event log: file:/tmp/nds_q86_fail_test
   java.lang.NullPointerException
       at org.apache.spark.util.JsonProtocol$JsonNodeImplicits.extractElements(JsonProtocol.scala:1589)
       at org.apache.spark.util.JsonProtocol$.stackTraceFromJson(JsonProtocol.scala:1558)
       at org.apache.spark.util.JsonProtocol$.exceptionFromJson(JsonProtocol.scala:1569)
       at org.apache.spark.util.JsonProtocol$.jobResultFromJson(JsonProtocol.scala:1423)
       at org.apache.spark.util.JsonProtocol$.jobEndFromJson(JsonProtocol.scala:967)
       at org.apache.spark.util.JsonProtocol$.sparkEventFromJson(JsonProtocol.scala:878)
       at org.apache.spark.util.JsonProtocol$.sparkEventFromJson(JsonProtocol.scala:865)
   ....
   23/05/01 21:57:17 ERROR ReplayListenerBus: Malformed line #24368: {"Event":"SparkListenerJobEnd","Job ID":31,"Completion Time":1616171909785,"Job Result":{"Result":"JobFailed","Exception":
   {"Message":"Job aborted"}
   }}
   ```
   
   **After the fix:**
   
   Job 31 is marked as `failedJob`
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   Added new unit test in JsonProtocolSuite.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] amahussein commented on pull request #41050: [SPARK-43340][CORE] Handle missing stack-trace field in eventlogs

Posted by "amahussein (via GitHub)" <gi...@apache.org>.
amahussein commented on PR #41050:
URL: https://github.com/apache/spark/pull/41050#issuecomment-1536393665

   > Thank u for pinging me. Will take a look tonight.
   
   Thanks @dongjoon-hyun 
   I am not sure I can find the reason behind the build errors though
   ```
   #35 ERROR: failed to push ghcr.io/amahussein/apache-spark-ci-image:master-4886449337: unexpected status: 403 Forbidden
   
   #36 [auth] amahussein/apache-spark-ci-image:pull,push apache/spark/apache-spark-github-action-image-cache:pull token for ghcr.io
   #36 DONE 0.0s
   ------
    > exporting to image:
   ------
   ERROR: failed to solve: failed to push ghcr.io/amahussein/apache-spark-ci-image:master-4886449337: unexpected status: 403 Forbidden
   Error: buildx failed with: ERROR: failed to solve: failed to push ghcr.io/amahussein/apache-spark-ci-image:master-4886449337: unexpected status: 403 Forbidden
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] amahussein commented on pull request #41050: [SPARK-43340][CORE] Handle missing stack-trace field in eventlogs

Posted by "amahussein (via GitHub)" <gi...@apache.org>.
amahussein commented on PR #41050:
URL: https://github.com/apache/spark/pull/41050#issuecomment-1535282609

   @JoshRosen and @dongjoon-hyun Can you please check the changes?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41050: [SPARK-43340][CORE] Handle missing stack-trace field in eventlogs

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #41050:
URL: https://github.com/apache/spark/pull/41050#discussion_r1186440414


##########
core/src/main/scala/org/apache/spark/util/JsonProtocol.scala:
##########
@@ -1555,13 +1555,18 @@ private[spark] object JsonProtocol {
   }
 
   def stackTraceFromJson(json: JsonNode): Array[StackTraceElement] = {
-    json.extractElements.map { line =>
-      val declaringClass = line.get("Declaring Class").extractString
-      val methodName = line.get("Method Name").extractString
-      val fileName = jsonOption(line.get("File Name")).map(_.extractString).orNull
-      val lineNumber = line.get("Line Number").extractInt
-      new StackTraceElement(declaringClass, methodName, fileName, lineNumber)
-    }.toArray
+    jsonOption(json)
+      .map(
+        _.extractElements
+          .map { line =>
+            val declaringClass = line.get("Declaring Class").extractString
+            val methodName = line.get("Method Name").extractString
+            val fileName = jsonOption(line.get("File Name")).map(_.extractString).orNull
+            val lineNumber = line.get("Line Number").extractInt
+            new StackTraceElement(declaringClass, methodName, fileName, lineNumber)
+          }
+          .toArray)
+      .getOrElse(Array[StackTraceElement]())

Review Comment:
   ditto.
   ```scala
   -   }
   -   .toArray)
   - .getOrElse(Array[StackTraceElement]())
   + }..toArray).getOrElse(Array[StackTraceElement]())
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #41050: [SPARK-43340][CORE] Handle missing stack-trace field in eventlogs

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #41050:
URL: https://github.com/apache/spark/pull/41050#issuecomment-1536881780

   Welcome to the Apache Spark community, @amahussein .
   I added you to the Apache Spark contributor group and assigned SPARK-43340 to you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41050: [SPARK-43340][CORE] Handle missing stack-trace field in eventlogs

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #41050:
URL: https://github.com/apache/spark/pull/41050#discussion_r1186439516


##########
core/src/main/scala/org/apache/spark/util/JsonProtocol.scala:
##########
@@ -1555,13 +1555,18 @@ private[spark] object JsonProtocol {
   }
 
   def stackTraceFromJson(json: JsonNode): Array[StackTraceElement] = {
-    json.extractElements.map { line =>
-      val declaringClass = line.get("Declaring Class").extractString
-      val methodName = line.get("Method Name").extractString
-      val fileName = jsonOption(line.get("File Name")).map(_.extractString).orNull
-      val lineNumber = line.get("Line Number").extractInt
-      new StackTraceElement(declaringClass, methodName, fileName, lineNumber)
-    }.toArray
+    jsonOption(json)
+      .map(
+        _.extractElements
+          .map { line =>

Review Comment:
   shall we keep as a single line?
   ```scala
   -    jsonOption(json)
   -      .map(
   -        _.extractElements
   -          .map { line =>
   +    jsonOption(json).map(_.extractElements.map { line =>
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #41050: [SPARK-43340][CORE] Handle missing stack-trace field in eventlogs

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #41050:
URL: https://github.com/apache/spark/pull/41050#issuecomment-1536757870

   To @tgravescs , I also hit that issue before when I re-forked my Spark repository. Like 403 error means, it's permission issue in GitHub setting. But, I forgot the details. At that time, @Yikun helped me because he setup the pipeline in the community~
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] JoshRosen commented on pull request #41050: [SPARK-43340][CORE] Handle missing stack-trace field in eventlogs

Posted by "JoshRosen (via GitHub)" <gi...@apache.org>.
JoshRosen commented on PR #41050:
URL: https://github.com/apache/spark/pull/41050#issuecomment-1536958447

   LGTM as well. Thanks for fixing this!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun closed pull request #41050: [SPARK-43340][CORE] Handle missing stack-trace field in eventlogs

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun closed pull request #41050: [SPARK-43340][CORE] Handle missing stack-trace field in eventlogs
URL: https://github.com/apache/spark/pull/41050


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #41050: [SPARK-43340][CORE] Handle missing stack-trace field in eventlogs

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #41050:
URL: https://github.com/apache/spark/pull/41050#issuecomment-1535324629

   Thank u for pinging me. Will take a look tonight.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #41050: [SPARK-43340][CORE] Handle missing stack-trace field in eventlogs

Posted by "tgravescs (via GitHub)" <gi...@apache.org>.
tgravescs commented on PR #41050:
URL: https://github.com/apache/spark/pull/41050#issuecomment-1536741822

   @dongjoon-hyun  do you know about what might be missing for permissions for his build error here? 
   
    (ERROR: failed to solve: failed to push ghcr.io/amahussein/apache-spark-ci-image:master-4897157804: unexpected status: 403 Forbidden)
   
   I haven't seen that before.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on pull request #41050: [SPARK-43340][CORE] Handle missing stack-trace field in eventlogs

Posted by "Yikun (via GitHub)" <gi...@apache.org>.
Yikun commented on PR #41050:
URL: https://github.com/apache/spark/pull/41050#issuecomment-1536985087

   > ERROR: failed to solve: failed to push ghcr.io/amahussein/apache-spark-ci-image:master-4897157804: unexpected status: 403 Forbidden
   
   There are two possible issue we met before:
   1. Github Action flaky issue: https://github.com/orgs/community/discussions/32184
   2. Permisssion seeting issue. By default, write permission already included to your GITHUB_TOKEN, but if you set it manually (see also [1], [2]), it will failed, current CI will first build infra and push to your ghcr so write permission is required.
   [1] https://github.blog/changelog/2021-04-20-github-actions-control-permissions-for-github_token/
   [2] https://github.com/users/amahussein/packages/container/apache-spark-ci-image/settings
   
   >  Like 403 error means, it was a permission issue in GitHub setting. But, I forgot the details. At that time, @Yikun helped me because he did setup the pipeline in the community~
   
   Yes, we do some configure on https://github.com/apache/spark/pull/37745#issuecomment-1234063012 , but after that I found it was a flaky issue of github aciton.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org