
Posted to issues@spark.apache.org by "L. C. Hsieh (Jira)" <ji...@apache.org> on 2021/04/11 06:39:00 UTC

[jira] [Assigned] (SPARK-35022) Task Scheduling Plugin in Spark

     [ https://issues.apache.org/jira/browse/SPARK-35022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

L. C. Hsieh reassigned SPARK-35022:
-----------------------------------

    Assignee: L. C. Hsieh

> Task Scheduling Plugin in Spark
> -------------------------------
>
>                 Key: SPARK-35022
>                 URL: https://issues.apache.org/jira/browse/SPARK-35022
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 3.2.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>
> The Spark scheduler assigns tasks to executors in an indeterminate way. There is a locality configuration, but it exists for data locality purposes. In general, we cannot tell the scheduler where a task should run. Normally this is not a problem, because a typical task is executor-agnostic. But for special tasks, such as stateful tasks in Structured Streaming, the state store is maintained on the executor side. Changing a task's location means reloading checkpoint data from the last batch. This hurts performance and also imposes limitations when we want to implement advanced features in Structured Streaming.
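
To make the cost described in the issue concrete, here is a toy Python model. It is not Spark code and not the proposed plugin API; every name in it (run_batches, sticky, make_arbitrary, the executor labels) is hypothetical. It assumes each stateful partition keeps its state on whichever executor ran it last, and counts how often a task lands elsewhere and would have to reload state from the checkpoint.

```python
import random


def run_batches(num_partitions, executors, num_batches, pick_executor):
    """Count state reloads in a toy model of stateful micro-batches.

    pick_executor(partition, executors, state_location) returns the executor
    chosen for that partition's task in the current batch.
    """
    state_location = {}  # partition -> executor that currently holds its state
    reloads = 0
    for _ in range(num_batches):
        for partition in range(num_partitions):
            executor = pick_executor(partition, executors, state_location)
            previous = state_location.get(partition)
            if previous is not None and previous != executor:
                # The task moved: in this model the state is rebuilt
                # from the last batch's checkpoint.
                reloads += 1
            state_location[partition] = executor
    return reloads


def make_arbitrary(seed):
    """Placement with no control over location (assumption: uniform random)."""
    rng = random.Random(seed)

    def pick(partition, executors, state_location):
        return rng.choice(executors)

    return pick


def sticky(partition, executors, state_location):
    """Placement that prefers the executor already holding the state."""
    default = executors[partition % len(executors)]
    return state_location.get(partition, default)


if __name__ == "__main__":
    executors = ["exec-1", "exec-2", "exec-3"]
    print("arbitrary placement reloads:",
          run_batches(4, executors, 10, make_arbitrary(0)))
    print("sticky placement reloads:",
          run_batches(4, executors, 10, sticky))
```

In this model the sticky policy never triggers a reload, while arbitrary placement triggers one whenever a partition's task changes executor. A real scheduler also has to weigh executor loss, resource availability, and data locality, none of which are modeled here.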


