Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 05:37:40 UTC

[jira] [Resolved] (SPARK-2581) complete or withdraw visitedStages optimization in DAGScheduler’s stageDependsOn

     [ https://issues.apache.org/jira/browse/SPARK-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-2581.
---------------------------------
    Resolution: Incomplete

> complete or withdraw visitedStages optimization in DAGScheduler’s stageDependsOn
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-2581
>                 URL: https://issues.apache.org/jira/browse/SPARK-2581
>             Project: Spark
>          Issue Type: Improvement
>          Components: Scheduler
>            Reporter: Aaron Staple
>            Priority: Minor
>              Labels: bulk-closed
>
> Right now the visitedStages HashSet is populated with stages, but it is never queried to skip stages that have already been visited. It may make sense to check whether a mapStage has been visited before visiting it again, as the nearby visitedRdds check does for RDDs. Alternatively, the existing visitedRdds check may already optimize this function sufficiently, in which case visitedStages can simply be removed.
> See discussion here: https://github.com/apache/spark/pull/1362#discussion-diff-15018046L1107
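
The trade-off described above can be illustrated with a small toy model. This is a minimal sketch in Python, not Spark's Scala code: the Rdd and Stage classes, the stage_depends_on function, and the check_stages flag are all hypothetical names made up for illustration. It assumes a traversal that marks each RDD as visited and follows shuffle dependencies into their map stage, and it counts how often an extra visited-stages check would actually skip work.

```python
from dataclasses import dataclass, field


# Hypothetical toy types; eq=False keeps identity-based hashing so
# instances can be stored in sets.
@dataclass(eq=False)
class Rdd:
    name: str
    narrow_deps: list = field(default_factory=list)   # parent Rdds, same stage
    shuffle_deps: list = field(default_factory=list)  # parent Stages across a shuffle


@dataclass(eq=False)
class Stage:
    name: str
    rdd: Rdd


def stage_depends_on(stage, target, check_stages=False):
    """Return (depends, stage_hits) for this toy traversal.

    stage_hits counts how many times the optional visited-stages check
    skipped a map stage that had already been seen.
    """
    if stage is target:
        return True, 0
    visited_rdds = set()
    visited_stages = set()
    stage_hits = 0

    def visit(rdd):
        nonlocal stage_hits
        if rdd in visited_rdds:  # the RDD-level check
            return
        visited_rdds.add(rdd)
        for map_stage in rdd.shuffle_deps:
            if check_stages and map_stage in visited_stages:
                stage_hits += 1  # the stage-level check the issue asks about
                continue
            visited_stages.add(map_stage)
            visit(map_stage.rdd)
        for parent in rdd.narrow_deps:
            visit(parent)

    visit(stage.rdd)
    return target.rdd in visited_rdds, stage_hits


# Diamond shape: two RDDs in the final stage both read from the same map stage.
map_rdd = Rdd("map_rdd")
map_stage = Stage("map", map_rdd)
a = Rdd("a", shuffle_deps=[map_stage])
b = Rdd("b", shuffle_deps=[map_stage])
final_stage = Stage("final", Rdd("final_rdd", narrow_deps=[a, b]))

with_check = stage_depends_on(final_stage, map_stage, check_stages=True)
without_check = stage_depends_on(final_stage, map_stage, check_stages=False)
```

In this toy model both variants report the same dependency result. The stage-level check only avoids one call to visit that the RDD-level check would have returned from immediately, which is the situation the issue asks to either exploit or clean up.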


