Posted to issues@spark.apache.org by "Sandy Ryza (JIRA)" <ji...@apache.org> on 2014/10/07 10:23:34 UTC

[jira] [Commented] (SPARK-3797) Run the shuffle service inside the YARN NodeManager as an AuxiliaryService

    [ https://issues.apache.org/jira/browse/SPARK-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161636#comment-14161636 ] 

Sandy Ryza commented on SPARK-3797:
-----------------------------------

I'm not necessarily opposed to this, but I wanted to bring up some of the drawbacks of running a Spark shuffle service inside YARN NodeManagers, as well as an alternative. (For concreteness, a sketch of the AuxiliaryService hook in question follows the list of drawbacks.)

* *Dependencies*. We will need to avoid dependency conflicts between Spark's shuffle service and the rest of the NodeManager. It's worth keeping in mind that the NodeManager may include Hadoop server-side dependencies we haven't had to deal with in the past, since we depend only on hadoop-client, and that its dependencies will also need to coexist with those of other auxiliary services like MapReduce's and Tez's. Unlike in Spark, where we place Spark jars ahead of Hadoop jars so that Spark's versions take precedence, NodeManagers presumably run with Hadoop jars in front.
* *Resource management*. YARN will soon support some disk I/O isolation and scheduling (YARN-2139). Running inside the NodeManager means the I/O spent serving shuffle data can't be accounted for under that mechanism.
* *Deployment*. Where currently "installing" Spark on YARN means at most placing a Spark assembly jar on HDFS, this would require deploying Spark bits to every node in the cluster.
* *Rolling upgrades*. Proposed YARN work will allow containers to keep running while NodeManagers restart. With Spark depending on the NodeManager to serve shuffle data, these upgrades would interfere with running Spark applications in situations where they otherwise would not.
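
To make the NodeManager-side approach concrete, here is a minimal sketch using Hadoop's AuxiliaryService API, the same mechanism MapReduce's ShuffleHandler uses. The class name, service key, and method bodies are illustrative, not Spark's actual implementation:

{code:java}
import java.nio.ByteBuffer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.server.api.ApplicationInitializationContext;
import org.apache.hadoop.yarn.server.api.ApplicationTerminationContext;
import org.apache.hadoop.yarn.server.api.AuxiliaryService;

public class SparkShuffleService extends AuxiliaryService {

  public SparkShuffleService() {
    // This name must match the aux-service key registered in yarn-site.xml:
    //   yarn.nodemanager.aux-services = spark_shuffle
    //   yarn.nodemanager.aux-services.spark_shuffle.class = SparkShuffleService
    super("spark_shuffle");
  }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    // A real implementation would start the shuffle block server here.
    // This is where the dependency concern bites: everything the server
    // pulls in must coexist with the NodeManager's own classpath, which
    // puts Hadoop jars first.
    super.serviceInit(conf);
  }

  @Override
  public void initializeApplication(ApplicationInitializationContext context) {
    // Invoked when an application's first container starts on this node;
    // a real implementation would register the app's shuffle credentials.
  }

  @Override
  public void stopApplication(ApplicationTerminationContext context) {
    // Clean up per-application state (registered executors, secrets).
  }

  @Override
  public ByteBuffer getMetaData() {
    // Handed back to the ApplicationMaster when it starts containers;
    // typically carries the port the shuffle server listens on.
    return ByteBuffer.allocate(0);
  }
}
{code}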

The other option worth considering is to run the shuffle service in containers that sit beside the executor(s) on each node. This avoids all the problems above, but brings a couple of its own:
* Under many cluster configurations, YARN requires each container to occupy at least a minimum amount of memory and CPU. The shuffle service, which would use little of either, would tie up those resources unnecessarily.
* Scheduling becomes more difficult. Spark would need two different containers scheduled on any node where it wants to run (a sketch of such a request follows this list). Once YARN has container resizing (YARN-1197), this could be mitigated by running both processes inside a single container: if Spark wanted to kill an executor, it could tell the executor process to exit and then shrink the container to the size of the shuffle service.
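
For the sidecar route, the ApplicationMaster-side request might look like the sketch below. The sizes, priority, and helper name are illustrative; the point is that the request can't go below the cluster's minimum allocation, which is exactly the waste described in the first bullet:

{code:java}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class ShuffleSidecarRequests {
  /** Ask for a small shuffle-service container pinned to an executor's node. */
  static void requestShuffleSidecar(AMRMClient<ContainerRequest> amClient,
                                    String executorHost) {
    // The shuffle server needs far less than even this, but the scheduler
    // rounds every request up to yarn.scheduler.minimum-allocation-mb.
    Resource tiny = Resource.newInstance(256 /* MB */, 1 /* vcore */);

    // relaxLocality = false: the container is useless anywhere else, since
    // it must share local disks with the executor it serves.
    ContainerRequest sidecar = new ContainerRequest(
        tiny,
        new String[] { executorHost },
        null /* racks */,
        Priority.newInstance(1),
        false /* relaxLocality */);

    amClient.addContainerRequest(sidecar);
  }
}
{code}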

> Run the shuffle service inside the YARN NodeManager as an AuxiliaryService
> --------------------------------------------------------------------------
>
>                 Key: SPARK-3797
>                 URL: https://issues.apache.org/jira/browse/SPARK-3797
>             Project: Spark
>          Issue Type: Sub-task
>          Components: YARN
>            Reporter: Patrick Wendell
>            Assignee: Andrew Or
>
> It's also worth considering running the shuffle service in a YARN container beside the executor(s) on each node.


