Posted to yarn-issues@hadoop.apache.org by "Tao Yang (JIRA)" <ji...@apache.org> on 2018/11/21 09:58:00 UTC

[jira] [Created] (YARN-9043) Inter-queue preemption sometimes starves an underserved queue when using DominantResourceCalculator

Tao Yang created YARN-9043:
------------------------------

             Summary: Inter-queue preemption sometimes starves an underserved queue when using DominantResourceCalculator
                 Key: YARN-9043
                 URL: https://issues.apache.org/jira/browse/YARN-9043
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacityscheduler
    Affects Versions: 3.3.0
            Reporter: Tao Yang
            Assignee: Tao Yang


To reproduce this problem in a unit test, set up a cluster with resource <40,18> and create three queues and their apps:
 * queue a: guaranteed=<10,10>, used=<6,10> by app1
 * queue b: guaranteed=<20,6>, used=<20,8> by app2
 * queue c: guaranteed=<10,2>, used=<0,0>, pending=<1,1>

Queue c is underserved and queue b exceeds its guarantee by 2 CPU units, so we expect app2 in queue b to be preempted, but nothing happens.

This problem is related to Resources#greaterThan/lessThan. The comparison between two resources is based on the resource/cluster-resource ratio inside DominantResourceCalculator#compare, so a low-weight resource may be ignored. For the scenario in the unit test, take the comparison between the ideal assigned resource and the used resource of queue b:
 * cluster resource is <40,18>
 * ideal assigned resource of queue b is <20,6>, ideal-assigned-resource / cluster-resource = <20, 6> / <40, 18> = max(20/40, 6/18) = 0.5
 * used resource of queue b is <20, 8>, used-resource / cluster-resource = <20, 8> / <40, 18> = max(20/40, 8/18) = 0.5
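
The arithmetic above can be checked with a minimal Java sketch. It assumes resources are plain two-element arrays and that only the dominant (largest) share is compared, as described in this report. The class and method names are hypothetical, and this is a simplified model rather than the actual DominantResourceCalculator code.

```java
// Simplified model of a dominant-share comparison.
// Hypothetical names; this is NOT Hadoop's DominantResourceCalculator.
final class DominantShareDemo {

    // Dominant share: the largest per-type ratio resource[i] / cluster[i].
    static double dominantShare(long[] resource, long[] cluster) {
        double max = 0.0;
        for (int i = 0; i < resource.length; i++) {
            max = Math.max(max, (double) resource[i] / cluster[i]);
        }
        return max;
    }

    // Assumption: "greater than" looks only at the dominant share,
    // so differences in the non-dominant resource are not seen.
    static boolean greaterThanByDominantShare(long[] lhs, long[] rhs, long[] cluster) {
        return dominantShare(lhs, cluster) > dominantShare(rhs, cluster);
    }

    public static void main(String[] args) {
        long[] cluster = {40, 18};
        long[] idealAssigned = {20, 6}; // queue b
        long[] used = {20, 8};          // queue b

        System.out.println(dominantShare(idealAssigned, cluster)); // 0.5
        System.out.println(dominantShare(used, cluster));          // 0.5
        // Both shares are 0.5, so used is not considered greater.
        System.out.println(greaterThanByDominantShare(used, idealAssigned, cluster)); // false
    }
}
```

Both resources reduce to the same dominant share of 0.5, so the extra 2 units in the second component never influence the result.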

The result of {{Resources.greaterThan(rc, clusterResource, used, idealAssigned)}} is therefore false instead of true. Several other places have the same problem, so preemption cannot happen with the current logic.

To solve this problem, I propose adding a ResourceCalculator#isAnyMajorResourceGreaterThan method. The DominantResourceCalculator implementation would compare every resource type between two resources and return true if any major resource type of the left resource is greater than that of the right resource. Resources#greaterThan would then be replaced with it in the places in inter-queue preemption that have this problem.
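
A minimal sketch of the per-type comparison proposed here, assuming resources are plain arrays and that every listed type counts as a major resource. The class name and the array-based signature are hypothetical; the method eventually added to ResourceCalculator may look different.

```java
// Hypothetical sketch of a per-type "any greater than" check.
// Assumption: all array entries are treated as major resource types.
final class AnyMajorResourceDemo {

    // Returns true if at least one component of lhs exceeds
    // the corresponding component of rhs.
    static boolean isAnyMajorResourceGreaterThan(long[] lhs, long[] rhs) {
        for (int i = 0; i < lhs.length; i++) {
            if (lhs[i] > rhs[i]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[] idealAssigned = {20, 6}; // queue b
        long[] used = {20, 8};          // queue b

        // The second component differs (8 > 6), so the overuse is detected.
        System.out.println(isAnyMajorResourceGreaterThan(used, idealAssigned)); // true
        System.out.println(isAnyMajorResourceGreaterThan(idealAssigned, used)); // false
    }
}
```

With the values from the reproduction case, the used resource <20,8> is reported as greater than the ideal assigned resource <20,6>, which is the outcome that inter-queue preemption needs for queue b.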


