Posted to yarn-issues@hadoop.apache.org by "Dmitry (Jira)" <ji...@apache.org> on 2022/06/23 01:19:00 UTC
[jira] [Created] (YARN-11194) FairShare preemption doesn't enforce fairness between sibling in some cases
Dmitry created YARN-11194:
-----------------------------
Summary: FairShare preemption doesn't enforce fairness between sibling in some cases
Key: YARN-11194
URL: https://issues.apache.org/jira/browse/YARN-11194
Project: Hadoop YARN
Issue Type: Bug
Components: fairscheduler, scheduler preemption
Affects Versions: 3.2.1
Environment: hadoop yarn 3.2.1
Reporter: Dmitry
Queue hierarchy:
root (cluster: 30GB, 30 vcores)
* q1 (maxResources: 10GB, 10 vcores)
** q1.1 (weight: 1)
** q1.2 (weight: 9)
* q2
* q3
Steps:
# app1 with a demand of 100GB/100 vcores is added to q1.1 and gets 10GB/10 vcores
## q1 reaches its max
# app2 with a demand of 1000GB/1000 vcores is added to q2 and gets 20GB/20 vcores
## the cluster now runs at 100% capacity
# app3 with a demand of 100GB/100 vcores is added to q1.2
{*}Expected{*}: fair share preemption preempts containers so that app3 (q1.2) gets 9GB/9 vcores. The containers need to be preempted from app1 (q1.1) so that q1 doesn't exceed its max resources.
{*}Observed{*}: app3 is starving
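The expected 9GB figure follows from splitting q1's 10GB max between its children by weight (1:9). Below is a minimal sketch of that arithmetic in Python. It is a simplified water-filling model written for this report, not the FairScheduler's actual fair share computation, and the function name weighted_shares is hypothetical. Only memory (GB) is modeled; vcores behave the same way in this scenario.

```python
from fractions import Fraction


def weighted_shares(total, weights, demands):
    """Split `total` among queues by weight, capping each queue at its demand.

    Simplified water-filling model (an assumption for illustration, not
    YARN code): queues whose demand fits inside their weighted slice are
    satisfied first, and what is left is redistributed among the rest.
    """
    shares = {}
    active = dict(weights)
    remaining = Fraction(total)
    while active:
        weight_sum = sum(active.values())
        # Queues whose whole demand fits in their weighted slice of the remainder.
        satisfied = [q for q, w in active.items()
                     if demands[q] <= remaining * w / weight_sum]
        if not satisfied:
            # Everyone wants more than their slice: split strictly by weight.
            for q, w in active.items():
                shares[q] = remaining * w / weight_sum
            break
        for q in satisfied:
            shares[q] = Fraction(demands[q])
            remaining -= shares[q]
            del active[q]
    return shares


# q1 is capped at 10GB; both children demand far more than that.
q1_children = weighted_shares(
    total=10,
    weights={"q1.1": 1, "q1.2": 9},
    demands={"q1.1": 100, "q1.2": 100},
)
print({q: float(s) for q, s in q1_children.items()})
```

Under this model q1.1 is entitled to 1GB and q1.2 to 9GB, so app1 holding all 10GB of q1 leaves app3 starved by exactly the 9GB mentioned above.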
Some observations:
# We see some preemption happening from app2 (q2) that matches app3's starvation (9GB/9 vcores in this case). This may suggest that app3 preempts from app2 but can't use the preempted containers due to this [check|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java#L1098]
# Eliminating max on q1 helps to resolve the issue
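The two observations above are consistent with a max-resources check on q1 rejecting every container offered to q1.2. The following Python sketch models that suspicion. It is a toy model under stated assumptions, not the code at the linked line: the names PARENT, MAX_GB and blocking_queue are hypothetical, only memory (GB) is tracked, and containers are assumed to be 1GB each.

```python
# Hypothetical model of the queue tree from this report (memory only, in GB).
PARENT = {"q1.1": "q1", "q1.2": "q1", "q1": "root", "q2": "root", "q3": "root"}
MAX_GB = {"root": 30, "q1": 10}  # queues without an entry are treated as uncapped


def blocking_queue(leaf, container_gb, usage_gb):
    """Return the first queue on the path leaf -> root whose max would be
    exceeded by adding the container, or None if the container fits."""
    queue = leaf
    while queue is not None:
        cap = MAX_GB.get(queue)
        if cap is not None and usage_gb[queue] + container_gb > cap:
            return queue
        queue = PARENT.get(queue)
    return None


# State after step 2: app1 holds 10GB in q1.1, app2 holds 20GB in q2.
usage = {"root": 30, "q1": 10, "q1.1": 10, "q1.2": 0, "q2": 20, "q3": 0}

# Case A: 9GB is preempted from app2 (q2), as observed.
after_q2_preemption = dict(usage, root=21, q2=11)
print(blocking_queue("q1.2", 1, after_q2_preemption))  # q1 is still at its max

# Case B: 9GB is preempted from app1 (q1.1), as expected.
after_q11_preemption = dict(usage, root=21, q1=1, **{"q1.1": 1})
print(blocking_queue("q1.2", 1, after_q11_preemption))  # nothing blocks
```

In case A the freed capacity can never reach q1.2 because q1 stays at 10GB, which would explain both the repeated preemption from app2 and why removing the max on q1 makes the problem disappear.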
Notes:
# This is an oversimplified version of our production setup. I can provide more details if needed.
# I have a heap dump of the issue that I can't share because of our policy, but I can look up some state if needed.
Thanks!