Posted to issues@spark.apache.org by "Weiwei Yang (Jira)" <ji...@apache.org> on 2022/01/28 01:08:00 UTC

[jira] [Commented] (SPARK-36060) Support backing off dynamic allocation increases if resources are "stuck"

    [ https://issues.apache.org/jira/browse/SPARK-36060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483504#comment-17483504 ] 

Weiwei Yang commented on SPARK-36060:
-------------------------------------

hi [~holden] 

We have seen the same issue. The flag {{spark.kubernetes.allocation.maxPendingPods}} introduced by SPARK-36052 helps mitigate it. Do you think there is anything else we can do besides this?
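For reference, the cap from SPARK-36052 can be set at submit time; the master URL and the value 50 below are illustrative only, not recommendations:

```shell
# Cap how many executor pod requests may sit pending at once
# (spark.kubernetes.allocation.maxPendingPods, added by SPARK-36052).
# The value 50 is just an example; tune it for your cluster.
spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.kubernetes.allocation.maxPendingPods=50 \
  ...
```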

> Support backing off dynamic allocation increases if resources are "stuck"
> -------------------------------------------------------------------------
>
>                 Key: SPARK-36060
>                 URL: https://issues.apache.org/jira/browse/SPARK-36060
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Kubernetes
>    Affects Versions: 3.2.0
>            Reporter: Holden Karau
>            Priority: Major
>
> In an over-subscribed environment we may enter a situation where our requests for more pods are not going to be fulfilled. Adding requests for even more pods will not help and may slow down the scheduler. We should detect this situation and hold off on increasing pod requests until the scheduler allocates more pods to us. We have a limited version of this in the Kube scheduler itself, but it would be better to plumb this all the way through to the dynamic allocation (DA) logic.
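The back-off idea described in the issue could be sketched roughly as below. This is a hypothetical illustration, not actual Spark code; all names (`step`, `MAX_PENDING_PODS`, etc.) are made up for the example:

```python
# Hypothetical back-off gate for executor pod requests: if earlier requests
# look "stuck" (pods pending but none granted), stop asking for more and
# widen the back-off window; once the scheduler makes progress, reset it.

INITIAL_BACKOFF_S = 1.0
MAX_BACKOFF_S = 60.0
MAX_PENDING_PODS = 100  # analogous in spirit to spark.kubernetes.allocation.maxPendingPods


def step(pending_pods, backoff_s, pods_granted_since_last_step, want_more):
    """Return (allow_new_request, new_backoff_s)."""
    stuck = pods_granted_since_last_step == 0 and pending_pods > 0
    if stuck:
        # Requests look stuck: back off exponentially instead of piling
        # more pod requests onto an already saturated scheduler.
        return False, min(backoff_s * 2, MAX_BACKOFF_S)
    # Scheduler made progress: reset the back-off window and allow a new
    # request as long as we stay under the pending-pod cap.
    return want_more and pending_pods < MAX_PENDING_PODS, INITIAL_BACKOFF_S
```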



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org