You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@spark.apache.org by pw...@apache.org on 2013/12/12 08:11:51 UTC

[20/50] git commit: Merge pull request #190 from markhamstra/Stages4Jobs

Merge pull request #190 from markhamstra/Stages4Jobs

stageId <--> jobId mapping in DAGScheduler

Okay, I think this one is ready to go -- or at least it's ready for review and discussion.  It's a carry-over of https://github.com/mesos/spark/pull/842 with updates for the newer job cancellation functionality.  The prior discussion still applies.  I've actually changed the job cancellation flow a bit: Instead of ``cancelTasks`` going to the TaskScheduler and then ``taskSetFailed`` coming back to the DAGScheduler (resulting in ``abortStage`` there), the DAGScheduler now takes care of figuring out which stages should be cancelled, tells the TaskScheduler to cancel tasks for those stages, then does the cleanup within the DAGScheduler directly without the need for any further prompting by the TaskScheduler.

I know of three outstanding issues, each of which can and should, I believe, be handled in follow-up pull requests:

1) https://spark-project.atlassian.net/browse/SPARK-960
2) JobLogger should be re-factored to eliminate duplication
3) Related to 2), the WebUI should also become a consumer of the DAGScheduler's new understanding of the relationship between jobs and stages so that it can display progress indication and the like grouped by job.  Right now, some of this information is just being sent out as part of ``SparkListenerJobStart`` messages, but more or different job <--> stage information may need to be exported from the DAGScheduler to meet listeners needs.

Except for the eventQueue -> Actor commit, the rest can be cherry-picked almost cleanly into branch-0.8.  A little merging is needed in MapOutputTracker and the DAGScheduler.  Merged versions of those files are in https://github.com/markhamstra/incubator-spark/tree/aba2b40ce04ee9b7b9ea260abb6f09e050142d43

Note that between the recent Actor change in the DAGScheduler and the cleaning up of DAGScheduler data structures on job completion in this PR, some races have been introduced into the DAGSchedulerSuite.  Those tests usually pass, and I don't think that better-behaved code that doesn't directly inspect DAGScheduler data structures should be seeing any problems, but I'll work on fixing DAGSchedulerSuite as either an addition to this PR or as a separate request.

UPDATE: Fixed the race that I introduced.  Created a JIRA issue (SPARK-965) for the one that was introduced with the switch to eventProcessorActor in the DAGScheduler.


Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/e0392343
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/e0392343
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/e0392343

Branch: refs/heads/scala-2.10
Commit: e0392343a026d632bac0df0ad4f399fce742c151
Parents: bfa6860 403234d
Author: Matei Zaharia <ma...@eecs.berkeley.edu>
Authored: Fri Dec 6 11:49:59 2013 -0800
Committer: Matei Zaharia <ma...@eecs.berkeley.edu>
Committed: Fri Dec 6 11:49:59 2013 -0800

----------------------------------------------------------------------
 .../org/apache/spark/MapOutputTracker.scala     |   8 +-
 .../apache/spark/scheduler/DAGScheduler.scala   | 278 +++++++++++++++----
 .../spark/scheduler/DAGSchedulerEvent.scala     |   3 +-
 .../apache/spark/scheduler/SparkListener.scala  |   2 +-
 .../scheduler/cluster/ClusterScheduler.scala    |   4 +-
 .../cluster/ClusterTaskSetManager.scala         |   2 +-
 .../spark/scheduler/local/LocalScheduler.scala  |  27 +-
 .../org/apache/spark/JobCancellationSuite.scala |   4 +-
 .../spark/scheduler/DAGSchedulerSuite.scala     |  43 ++-
 9 files changed, 280 insertions(+), 91 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/e0392343/core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterScheduler.scala
----------------------------------------------------------------------