Posted to issues@spark.apache.org by "Min Shen (Jira)" <ji...@apache.org> on 2020/09/17 17:37:00 UTC

[jira] [Created] (SPARK-32919) Add support in Spark driver to coordinate the shuffle map stage in push-based shuffle by selecting external shuffle services for merging shuffle partitions

Min Shen created SPARK-32919:
--------------------------------

             Summary: Add support in Spark driver to coordinate the shuffle map stage in push-based shuffle by selecting external shuffle services for merging shuffle partitions
                 Key: SPARK-32919
                 URL: https://issues.apache.org/jira/browse/SPARK-32919
             Project: Spark
          Issue Type: Sub-task
          Components: Shuffle, Spark Core
    Affects Versions: 3.1.0
            Reporter: Min Shen


At the beginning of a shuffle map stage, the driver needs to select external shuffle services to act as the mergers of the shuffle partitions for the corresponding shuffle.

We currently make this selection using the immediately available information about current and past executor locations. Ideally, this would be behind a pluggable interface so that we can potentially leverage information tracked outside of a Spark application for better load balancing or for a disaggregated deployment environment.
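
To make the idea of a pluggable selection interface concrete, here is a minimal sketch in Python. It is an illustration only, not Spark's actual API: the names MergerSelector, ExecutorLocationSelector, and select_mergers are hypothetical, and the policy shown (prefer hosts with active executors, then fall back to hosts that ran executors in the past) is an assumption about how the location information could be used.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class MergerSelector(ABC):
    """Hypothetical plug-in point: chooses shuffle service hosts to act as mergers."""

    @abstractmethod
    def select_mergers(self, num_mergers: int) -> List[str]:
        """Return at most num_mergers distinct host names."""


class ExecutorLocationSelector(MergerSelector):
    """Assumed default policy based only on executor location information."""

    def __init__(self, active_hosts: Sequence[str], past_hosts: Sequence[str]):
        self.active_hosts = list(active_hosts)  # hosts with executors right now
        self.past_hosts = list(past_hosts)      # hosts that ran executors earlier

    def select_mergers(self, num_mergers: int) -> List[str]:
        chosen: List[str] = []
        # Active hosts come first; past hosts only fill the remaining slots.
        for host in self.active_hosts + self.past_hosts:
            if len(chosen) >= num_mergers:
                break
            if host not in chosen:  # skip hosts that appear in both lists
                chosen.append(host)
        return chosen
```

A different implementation of the same interface could consult information tracked outside the application (for example, a cluster-wide load view) without changing the caller in the driver.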



