
Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/07/30 19:03:05 UTC

[jira] [Resolved] (SPARK-2089) With YARN, preferredNodeLocalityData isn't honored

     [ https://issues.apache.org/jira/browse/SPARK-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-2089.
------------------------------
    Resolution: Won't Fix

> With YARN, preferredNodeLocalityData isn't honored 
> ---------------------------------------------------
>
>                 Key: SPARK-2089
>                 URL: https://issues.apache.org/jira/browse/SPARK-2089
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.0.0
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>            Priority: Critical
>
> When running in YARN cluster mode, apps can pass preferred locality data when constructing a SparkContext.  This data dictates where executor containers are requested.
> This is currently broken because of a race condition.  The Spark-YARN code runs the user class and waits for it to start up a SparkContext.  During its initialization, the SparkContext creates a YarnClusterScheduler, which notifies a monitor in the Spark-YARN code.  The Spark-YARN code then immediately fetches the preferredNodeLocationData from the SparkContext and uses it to start requesting containers.
> But in the SparkContext constructor that takes the preferredNodeLocationData, setting preferredNodeLocationData comes after the rest of the initialization, so if the Spark-YARN code comes around quickly enough after being notified, the data it fetches is the empty, unset version.  This occurred during all of my runs.
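
The ordering problem described above can be modelled in a few lines. The following is a minimal Python sketch, not Spark code: SketchContext, on_scheduler_created, resume and run_sketch are hypothetical names invented for illustration, and the resume event is an artificial device that forces the losing interleaving so the outcome is reproducible. In the real system the ordering depends on thread timing.

```python
import threading


class SketchContext:
    """Hypothetical stand-in for SparkContext; this is not Spark's real class."""

    def __init__(self, preferred_node_location_data, on_scheduler_created, resume):
        # The field starts out as the "empty unset version".
        self.preferred_node_location_data = {}
        # Stand-in for the rest of initialization, during which the
        # scheduler is created and the waiting code is notified.
        on_scheduler_created(self)
        # Assumption for reproducibility: pause until the waiter has
        # fetched, which forces the interleaving the report describes.
        resume.wait(timeout=5)
        # The field is assigned only after everything else has run.
        self.preferred_node_location_data = preferred_node_location_data


def run_sketch():
    created = threading.Event()  # plays the role of the monitor
    resume = threading.Event()
    holder = {}

    def on_scheduler_created(ctx):
        holder["ctx"] = ctx
        created.set()

    def user_class():
        # The user class constructs the context with locality data.
        holder["final"] = SketchContext(
            {"host1": {"split-0"}}, on_scheduler_created, resume
        )

    user_thread = threading.Thread(target=user_class)
    user_thread.start()

    # Stand-in for the Spark-YARN code: wait for the notification,
    # then fetch the locality data immediately.
    created.wait(timeout=5)
    fetched = dict(holder["ctx"].preferred_node_location_data)

    resume.set()
    user_thread.join()
    return fetched, holder["final"].preferred_node_location_data
```

Calling run_sketch() returns the data seen at fetch time and the data held once construction finishes. In this sketch the first is empty and the second holds the locality map, because the assignment happens after the notification. Moving the assignment above the on_scheduler_created call would make the fetch see the data.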



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
