Posted to reviews@spark.apache.org by mridulm <gi...@git.apache.org> on 2018/08/24 22:53:25 UTC
[GitHub] spark pull request #22209: [SPARK-24415][Core] Fixed the aggregated stage me...
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r212772176
--- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
@@ -350,11 +350,16 @@ private[spark] class AppStatusListener(
val e = it.next()
if (job.stageIds.contains(e.getKey()._1)) {
val stage = e.getValue()
- stage.status = v1.StageStatus.SKIPPED
- job.skippedStages += stage.info.stageId
- job.skippedTasks += stage.info.numTasks
- it.remove()
- update(stage, now)
+ // Only update the stage if it has not finished already
+ if (v1.StageStatus.ACTIVE.equals(stage.status) ||
+ v1.StageStatus.PENDING.equals(stage.status)) {
+ stage.status = v1.StageStatus.SKIPPED
+ job.skippedStages += stage.info.stageId
+ job.skippedTasks += stage.info.numTasks
+ job.activeStages -= 1
+ it.remove()
--- End diff --
To clarify, by 'this' I meant a job end event being received before the stage end event (for a stage that is part of the job).
I was not referring to task end events (those can arrive after the stage or job end).
Thanks for clarifying, @vanzin. Given that the snippet is not trying to recover from dropped events, I am wondering why non-skipped stages would even be in the list: I would expect all of them to be skipped?
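For reference, here is a minimal, self-contained sketch of the case the guard in the diff handles: a job end event arrives while a stage that has already finished is still in the live list. Everything in it (`Stage`, `Job`, `onJobEnd`, the `Status` objects) is a hypothetical simplification, not the actual `AppStatusListener` code or Spark's `v1.StageStatus` API, and it leaves out the `activeStages` bookkeeping and the `update(stage, now)` call.

```scala
import scala.collection.mutable

// Hypothetical, simplified model; none of these types are Spark's real classes.
object SkippedStagesSketch {
  sealed trait Status
  case object Pending extends Status
  case object Active extends Status
  case object Complete extends Status
  case object Skipped extends Status

  final class Stage(val stageId: Int, val numTasks: Int, var status: Status)

  final class Job(val stageIds: Set[Int]) {
    var skippedStages: Set[Int] = Set.empty
    var skippedTasks: Int = 0
  }

  // On job end, mark as skipped only those live stages of the job that
  // have not finished (still Active or Pending). Finished stages are left alone.
  def onJobEnd(job: Job, liveStages: mutable.Map[Int, Stage]): Unit = {
    val unfinished = liveStages.values.filter { s =>
      job.stageIds.contains(s.stageId) &&
        (s.status == Active || s.status == Pending)
    }.toList // materialize first, since the map is mutated below

    unfinished.foreach { s =>
      s.status = Skipped
      job.skippedStages += s.stageId
      job.skippedTasks += s.numTasks
      liveStages.remove(s.stageId)
    }
  }

  def main(args: Array[String]): Unit = {
    // Stage 1 has already completed but is still in the live list (assumed scenario).
    val live = mutable.Map(
      1 -> new Stage(1, 4, Complete),
      2 -> new Stage(2, 8, Active),
      3 -> new Stage(3, 2, Pending))
    val job = new Job(Set(1, 2, 3))

    onJobEnd(job, live)
    println(s"skipped stages: ${job.skippedStages}, skipped tasks: ${job.skippedTasks}")
    println(s"still live: ${live.keySet}")
  }
}
```

Without the status check, stage 1 in this sketch would be relabelled as skipped even though it completed. Whether such a stage can actually be in the list at that point is the open question above.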