Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2021/03/22 23:06:00 UTC

[jira] [Commented] (SPARK-34828) YARN Shuffle Service: Support configurability of aux service name and service-specific config overrides

    [ https://issues.apache.org/jira/browse/SPARK-34828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17306638#comment-17306638 ] 

Apache Spark commented on SPARK-34828:
--------------------------------------

User 'xkrogen' has created a pull request for this issue:
https://github.com/apache/spark/pull/31936

> YARN Shuffle Service: Support configurability of aux service name and service-specific config overrides
> -------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-34828
>                 URL: https://issues.apache.org/jira/browse/SPARK-34828
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle, YARN
>    Affects Versions: 3.1.1
>            Reporter: Erik Krogen
>            Priority: Major
>
> In some cases it may be desirable to run multiple instances of the Spark Shuffle Service, each using a different version of Spark. This can be helpful, for example, when running a YARN cluster with a mixed workload of applications running multiple Spark versions, since a given version of the shuffle service is not always compatible with other versions of Spark. (See SPARK-27780 for more detail.)
> YARN versions since 2.9.0 support the ability to run shuffle services within an isolated classloader (see YARN-4577), meaning multiple Spark versions can coexist within a single NodeManager.
> To support this from the Spark side, we need to make two enhancements:
> * Make the name of the shuffle service configurable. Currently it is hard-coded to be {{spark_shuffle}} on both the client and server side. The server-side name is not actually used anywhere, as it is the value within {{yarn.nodemanager.aux-services}} which the NodeManager considers to be the definitive name. However, if you change this value in the configs, the hard-coded name within the client will no longer match. So, the client-side name needs to be configurable.
> * Add a way to separately configure the two shuffle service instances. Since configurations such as the port number are taken from the NodeManager config, both instances will try to use the same port, which obviously won't work. So, we need to provide a way to selectively configure each shuffle service instance. I will go into details on my proposal for how to achieve this within the PR.
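
The two problems described in the list above can be illustrated with a small Python model. This is only a sketch of the failure modes, not Spark or YARN code: the aux service names spark_shuffle_a and spark_shuffle_b, the dictionary standing in for the NodeManager config, and all helper functions are hypothetical. It assumes the port is read from spark.shuffle.service.port, and it does not show the solution proposed in the PR.

```python
# Hypothetical model of the problem; none of this is Spark or YARN code.

# Stand-in for the NodeManager config shared by every aux service.
# The two service names are assumptions made up for this illustration.
NODEMANAGER_CONF = {
    "yarn.nodemanager.aux-services": "spark_shuffle_a,spark_shuffle_b",
    "spark.shuffle.service.port": "7337",
}

# The name the issue says is hard-coded on the client side.
CLIENT_HARDCODED_NAME = "spark_shuffle"


def registered_services(conf):
    """Return the aux service names the NodeManager treats as definitive."""
    raw = conf["yarn.nodemanager.aux-services"]
    return [name.strip() for name in raw.split(",") if name.strip()]


def client_can_find_service(conf, client_name=CLIENT_HARDCODED_NAME):
    """Problem 1: the client's name must match a registered aux service."""
    return client_name in registered_services(conf)


def port_assignments(conf):
    """Problem 2: every instance reads the same port from the shared config."""
    port = int(conf["spark.shuffle.service.port"])
    return {name: port for name in registered_services(conf)}


def conflicting_ports(assignments):
    """Group service names by port and keep ports claimed more than once."""
    by_port = {}
    for name, port in assignments.items():
        by_port.setdefault(port, []).append(name)
    return {port: names for port, names in by_port.items() if len(names) > 1}


if __name__ == "__main__":
    print("client finds its service:", client_can_find_service(NODEMANAGER_CONF))
    print("port conflicts:", conflicting_ports(port_assignments(NODEMANAGER_CONF)))
```

In this model the client lookup fails because neither registered name equals the hard-coded one, and both instances end up bound to the same port, which is why the issue asks for a configurable name and per-service config overrides.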



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
