Posted to issues@spark.apache.org by "Baoxu Shi (JIRA)" <ji...@apache.org> on 2014/06/22 00:43:24 UTC

[jira] [Updated] (SPARK-2228) onStageSubmitted does not properly called so NoSuchElement will be thrown in onStageCompleted

     [ https://issues.apache.org/jira/browse/SPARK-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Baoxu Shi updated SPARK-2228:
-----------------------------

    Summary: onStageSubmitted does not properly called so NoSuchElement will be thrown in onStageCompleted  (was: onStageSubmitted does not properly called so NoSuchElement will throw in onStageCompleted)

> onStageSubmitted does not properly called so NoSuchElement will be thrown in onStageCompleted
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-2228
>                 URL: https://issues.apache.org/jira/browse/SPARK-2228
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.0, 1.1.0
>            Reporter: Baoxu Shi
>
> We use `saveAsObjectFile` and `objectFile` to cut off lineage during iterative computation, but after several hundred iterations a `NoSuchElementException` is thrown. We checked the code and traced the problem to `org.apache.spark.ui.jobs.JobProgressListener`. When `onStageCompleted` is called, the `stageId` cannot be found in `stageIdToPool`, although it does exist in the other HashMaps. We therefore think `onStageSubmitted` is not being called properly: Spark added the stage but failed to send the message to the listeners, so the error occurs when the `finish` message is sent to them.
> This problem causes a huge number of `active stages` to be shown in the `SparkUI`, which is really annoying. It does not appear to affect the final result, according to my test code.
> I'm willing to help solve this problem. Any idea which part I should change? I assume `org.apache.spark.scheduler.SparkListenerBus` has something to do with it, but it looks fine to me.
> FYI, here is test code that reproduces the problem. I do not know how to post code here with highlighting, so I put it in a gist to keep the issue clean.
> https://gist.github.com/bxshi/b5c0fe0ae089c75a39bd
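
For illustration only, the failure mode described in the report can be modelled with a small stand-alone sketch. It is not the actual `JobProgressListener` source: `ToyStageListener`, `activeStages`, `onStageCompletedStrict`, `onStageCompletedSafe` and `ListenerDemo` are hypothetical names, and the event signatures are simplified to plain values. It only assumes that the completion handler looks up `stageIdToPool` with a strict `HashMap.apply`, which throws when the matching submission event never reached the listener.

```scala
import scala.collection.mutable
import scala.util.Try

// Hypothetical, simplified stand-in for a listener that tracks stages by ID.
// It is NOT the real org.apache.spark.ui.jobs.JobProgressListener.
class ToyStageListener {
  val stageIdToPool = mutable.HashMap[Int, String]()
  val activeStages = mutable.HashSet[Int]()

  // Records the pool for a stage and marks the stage as active.
  def onStageSubmitted(stageId: Int, pool: String): Unit = {
    stageIdToPool(stageId) = pool
    activeStages += stageId
  }

  // Strict lookup: HashMap.apply throws NoSuchElementException
  // when the key was never inserted.
  def onStageCompletedStrict(stageId: Int): String = {
    val pool = stageIdToPool(stageId)
    activeStages -= stageId
    pool
  }

  // Defensive lookup: a missing key yields None instead of an exception.
  def onStageCompletedSafe(stageId: Int): Option[String] = {
    activeStages -= stageId
    stageIdToPool.remove(stageId)
  }
}

object ListenerDemo {
  def main(args: Array[String]): Unit = {
    val listener = new ToyStageListener

    // Normal case: submission is seen before completion.
    listener.onStageSubmitted(1, "default")
    println(listener.onStageCompletedStrict(1))

    // Stage 2 was never submitted to this listener, so the strict lookup fails.
    println(Try(listener.onStageCompletedStrict(2)).isFailure)

    // The defensive variant tolerates the missing submission.
    println(listener.onStageCompletedSafe(2))
  }
}
```

A defensive lookup like `onStageCompletedSafe` would only hide the symptom in the UI; it would not explain why the submission event went missing, which is the question the report raises.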



--
This message was sent by Atlassian JIRA
(v6.2#6252)