Posted to yarn-issues@hadoop.apache.org by "Michael Zeoli (Jira)" <ji...@apache.org> on 2021/03/19 17:58:00 UTC

[jira] [Commented] (YARN-6538) Inter Queue preemption is not happening when DRF is configured

    [ https://issues.apache.org/jira/browse/YARN-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305084#comment-17305084 ] 

Michael Zeoli commented on YARN-6538:
-------------------------------------

As we transition from Fair Scheduler to Capacity Scheduler, we're running into what we believe is this same issue.  We typically assign 1 core to our executors, since our workloads are memory bound and multiple cores per container offer no performance increase.  Under Fair Scheduler, preemption worked well for us.  Under Capacity Scheduler, we see situations where jobs are starved for AMs and/or executors when they should otherwise receive their minimum guaranteed capacity via resources preempted from jobs in other queues.

While our configuration may be uncommon, it's certainly a valid use case in the grand scheme of YARN and Spark, and this bug seems to create significant issues where none existed before (under Fair Scheduler).
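
For reference, the relevant parts of our configuration look roughly like the following. These are the standard CapacityScheduler and preemption property names; the exact values are illustrative of our environment rather than a verbatim dump:

{code}
<!-- capacity-scheduler.xml: use DRF so vcores are considered alongside memory -->
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>

<!-- yarn-site.xml: enable the inter-queue preemption monitor and policy -->
<property>
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.monitor.policies</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy</value>
</property>
{code}

On the Spark side we run with {{spark.executor.cores=1}} and comparatively large executor memory, so our containers are vcore-light and memory-heavy.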

 

 

> Inter Queue preemption is not happening when DRF is configured
> --------------------------------------------------------------
>
>                 Key: YARN-6538
>                 URL: https://issues.apache.org/jira/browse/YARN-6538
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler, scheduler preemption
>    Affects Versions: 2.8.0
>            Reporter: Sunil G
>            Assignee: Sunil G
>            Priority: Major
>
> Consider a cluster capacity of <memory:3TB, vCores:168>, i.e. memory is plentiful while vcores are comparatively scarce. If applications have enough demand, vcores can be exhausted first.
> Inter-queue preemption should ideally kick in once vcores are over-utilized. However, preemption is not happening.
> Analysis:
> In {{AbstractPreemptableResourceCalculator.computeFixpointAllocation}}, 
> {code}
>     // assign all cluster resources until no more demand, or no resources are
>     // left
>     while (!orderedByNeed.isEmpty() && Resources.greaterThan(rc, totGuarant,
>         unassigned, Resources.none())) {
> {code}
>  this loop will continue iterating even when the remaining vcores are 0 (because memory is still positive). As a result, idealAssigned accumulates more vcores than the cluster actually has, which leads to the no-preemption cases.
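
To make the quoted analysis concrete, below is a minimal standalone sketch of how that while-condition behaves under DRF. The class name and the mid-loop numbers are invented for illustration; only the cluster capacity (<memory:3TB, vCores:168>) comes from the description above, and the Hadoop classes used ({{Resources}}, {{DominantResourceCalculator}}, {{Resource}}) are the ones referenced in the snippet.

{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.DominantResourceCalculator;
import org.apache.hadoop.yarn.util.resource.ResourceCalculator;
import org.apache.hadoop.yarn.util.resource.Resources;

public class DrfLoopConditionSketch {
  public static void main(String[] args) {
    ResourceCalculator rc = new DominantResourceCalculator();

    // Total guaranteed capacity from the description: <memory:3TB, vCores:168>.
    Resource totGuarant = Resource.newInstance(3L * 1024 * 1024, 168);

    // Hypothetical mid-loop state: every vcore has already been assigned,
    // but a large amount of memory is still unassigned.
    Resource unassigned = Resource.newInstance(2L * 1024 * 1024, 0);

    // The while-condition from computeFixpointAllocation. With DRF the
    // comparison is based on the dominant share, and memory is still
    // positive, so this stays true and the loop keeps handing out
    // vcores that do not exist into idealAssigned.
    boolean keepLooping =
        Resources.greaterThan(rc, totGuarant, unassigned, Resources.none());
    System.out.println("loop continues: " + keepLooping); // prints true
  }
}
{code}

Because {{DominantResourceCalculator}} compares by dominant share, {{unassigned}} with zero vcores but positive memory still compares greater than {{Resources.none()}}, so the fix-point loop keeps running and continues to add vcores to {{idealAssigned}}.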


