Posted to issues@spark.apache.org by "Wenchen Fan (Jira)" <ji...@apache.org> on 2020/08/18 06:52:00 UTC

[jira] [Updated] (SPARK-32518) CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources

     [ https://issues.apache.org/jira/browse/SPARK-32518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan updated SPARK-32518:
--------------------------------
    Fix Version/s: 3.0.1

> CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-32518
>                 URL: https://issues.apache.org/jira/browse/SPARK-32518
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: wuyi
>            Assignee: wuyi
>            Priority: Major
>             Fix For: 3.0.1, 3.1.0
>
>
> Currently, CoarseGrainedSchedulerBackend.maxNumConcurrentTasks only considers CPUs when computing the maximum number of concurrent tasks. This can cause the application to hang when a barrier stage requires extra custom resources but the cluster doesn't have enough of them: because maxNumConcurrentTasks does not check the other custom resources, the barrier stage can still be submitted to the TaskSchedulerImpl, but the TaskSchedulerImpl then cannot launch tasks for the barrier stage because calculateAvailableSlots (which does check all kinds of resources) reports insufficient task slots.
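
To illustrate the issue conceptually, here is a minimal Scala sketch of a resource-aware slot count. The types ExecutorInfo and TaskResourceRequest and the function below are simplified, hypothetical stand-ins rather than the actual Spark internals; the point is only that the per-executor slot count must be the minimum over all requested resources, not just CPU cores.

    // Hypothetical, simplified sketch -- not the real CoarseGrainedSchedulerBackend code.
    case class ExecutorInfo(cores: Int, resources: Map[String, Int])        // e.g. Map("gpu" -> 1)
    case class TaskResourceRequest(cpusPerTask: Int, resourcesPerTask: Map[String, Int])

    def maxNumConcurrentTasks(executors: Seq[ExecutorInfo], req: TaskResourceRequest): Int =
      executors.map { exec =>
        // Slots limited by the CPU cores on this executor.
        val cpuSlots = exec.cores / req.cpusPerTask
        // Slots limited by each custom resource the tasks request (0 if the executor lacks it).
        val resourceSlots = req.resourcesPerTask.map { case (name, perTask) =>
          exec.resources.getOrElse(name, 0) / perTask
        }
        // An executor can run only as many tasks as its scarcest resource allows.
        (cpuSlots +: resourceSlots.toSeq).min
      }.sum

    // Example: two executors with 8 cores but 1 GPU each, and tasks needing 1 CPU and 1 GPU.
    // Counting only CPUs yields 16 slots; counting GPUs as well yields the correct answer, 2:
    //   maxNumConcurrentTasks(Seq(ExecutorInfo(8, Map("gpu" -> 1)),
    //                             ExecutorInfo(8, Map("gpu" -> 1))),
    //                         TaskResourceRequest(1, Map("gpu" -> 1)))  // == 2

Under this sketch, a CPU-only count would report far more slots than the cluster can actually honor for a barrier stage that also needs GPUs, which is how the hang described above arises.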



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org