Posted to yarn-issues@hadoop.apache.org by "Jonathan Hung (Jira)" <ji...@apache.org> on 2019/08/29 18:36:00 UTC
[jira] [Comment Edited] (YARN-9770) Create a queue ordering policy which picks child queues with equal probability

    [ https://issues.apache.org/jira/browse/YARN-9770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918853#comment-16918853 ] 

Jonathan Hung edited comment on YARN-9770 at 8/29/19 6:35 PM:
--------------------------------------------------------------

Hi [~eepayne], at a high level, here is what we observed: an app with more than 10k container requests gets submitted to an underutilized queue A, and queue A then takes up the scheduler's allocations for 5-10 seconds. Once A's utilization reaches that of the other queues (e.g. queue B), queue B starts getting allocations too. However, queue B allocates to apps in FIFO order, so if the apps at the head of B's FIFO queue are at least medium-sized, those apps consume all of the allocations in queue B.
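
As a rough illustration of the behavior described above, the following Java sketch simulates an ordering that always allocates to the least-utilized queue. It is not CapacityScheduler code: SimQueue, allocate, and the capacities and starting usage are hypothetical, and one allocation is treated as one container.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UtilizationOrderingSketch {
    /** Hypothetical stand-in for a leaf queue: used containers against a fixed capacity. */
    static final class SimQueue {
        final String name;
        final int capacity;
        int used;

        SimQueue(String name, int capacity, int used) {
            this.name = name;
            this.capacity = capacity;
            this.used = used;
        }

        double utilization() {
            return (double) used / capacity;
        }
    }

    /**
     * Hands out `allocations` containers one at a time, always to the queue
     * with the lowest utilization (ties go to the earlier queue in the list).
     * Returns the queue names in the order they received allocations.
     */
    static List<String> allocate(List<SimQueue> queues, int allocations) {
        List<String> order = new ArrayList<>();
        for (int i = 0; i < allocations; i++) {
            SimQueue pick = queues.get(0);
            for (SimQueue q : queues) {
                if (q.utilization() < pick.utilization()) {
                    pick = q;
                }
            }
            pick.used++;
            order.add(pick.name);
        }
        return order;
    }

    public static void main(String[] args) {
        // Assumed numbers: A is underutilized (10%), B is busier (60%).
        List<SimQueue> queues = Arrays.asList(
            new SimQueue("A", 100, 10),
            new SimQueue("B", 100, 60));
        List<String> order = allocate(queues, 55);
        int firstB = order.indexOf("B");
        System.out.println("Allocations before queue B is first served: " + firstB);
    }
}
```

With these assumed numbers, queue A receives every allocation until its utilization catches up with queue B's; only then does B start getting a share.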

While underutilized queues are receiving allocations, highly utilized queues are not, yet they keep receiving app submissions, which increases activeUsers in those highly utilized queues.
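
The issue description quoted at the end of this message ties rising activeUsers to shrinking user limits when minimum-user-limit-percent is small. A simplified sketch of that relationship follows, assuming each active user gets an even share of the queue with minimum-user-limit-percent as the floor. The real CapacityScheduler computation takes more inputs (for example user-limit-factor), so UserLimitSketch and userLimitPercent are illustrative names only.

```java
class UserLimitSketch {
    /**
     * Simplified per-user limit as a percentage of the queue:
     * an even split among active users, but never below the configured
     * minimum-user-limit-percent. This is an approximation, not the
     * scheduler's actual formula.
     */
    static double userLimitPercent(int activeUsers, double minimumUserLimitPercent) {
        if (activeUsers <= 0) {
            throw new IllegalArgumentException("activeUsers must be positive");
        }
        double evenShare = 100.0 / activeUsers;
        return Math.max(evenShare, minimumUserLimitPercent);
    }

    public static void main(String[] args) {
        double minimumUserLimitPercent = 10.0; // the example value from the issue description
        for (int activeUsers : new int[] {1, 2, 5, 10, 20}) {
            System.out.println(activeUsers + " active users -> "
                + userLimitPercent(activeUsers, minimumUserLimitPercent) + "% per user");
        }
    }
}
```

Under this approximation, each additional active user in a starved queue lowers the per-user limit until it bottoms out at the configured minimum.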

Another thing we observed is that if an underutilized queue has high container churn, its utilization remains low, so it continues to consume a majority of the scheduler's overall container allocations, which exacerbates the starvation problem.

Attached a screenshot (activeUsers_overlay) which shows activeUsers for an impacted queue (blue is post-YARN-9770, red is pre-YARN-9770).


> Create a queue ordering policy which picks child queues with equal probability
> ------------------------------------------------------------------------------
>
>                 Key: YARN-9770
>                 URL: https://issues.apache.org/jira/browse/YARN-9770
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Jonathan Hung
>            Assignee: Jonathan Hung
>            Priority: Major
>              Labels: release-blocker
>         Attachments: YARN-9770.001.patch, YARN-9770.002.patch, YARN-9770.003.patch, activeUsers_overlay.png
>
>
> Ran some simulations with the default queue_utilization_ordering_policy:
> An underutilized queue that receives an application with many (thousands of) resource requests will hog scheduler allocations for a long time (on the order of a minute). In the meantime, apps keep getting submitted to all other queues, which increases activeUsers in those queues and, if minimum-user-limit-percent is configured to a small value (e.g. 10%), drops their user limit to small values.
> To avoid this issue, we assign to queues with equal probability, so that no queue goes without allocations for a long time.
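
One way to picture "assign to queues with equal probability" is a uniform shuffle of the child queues on each scheduling pass, so every queue is equally likely to be offered resources first. The Java sketch below is an assumption about the general approach, not the code in the attached patches; RandomQueueOrderingSketch, assignmentIterator, and firstPickCounts are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Random;

class RandomQueueOrderingSketch {
    /**
     * Returns an iterator over the child queues in uniformly random order.
     * The input list is copied first, so the caller's list is not reordered.
     */
    static <T> Iterator<T> assignmentIterator(List<T> childQueues, Random random) {
        List<T> shuffled = new ArrayList<>(childQueues);
        Collections.shuffle(shuffled, random);
        return shuffled.iterator();
    }

    /** Counts how often each queue is offered resources first over many passes. */
    static Map<String, Integer> firstPickCounts(List<String> queues, int rounds, Random random) {
        Map<String, Integer> counts = new HashMap<>();
        for (String q : queues) {
            counts.put(q, 0);
        }
        for (int i = 0; i < rounds; i++) {
            String first = assignmentIterator(queues, random).next();
            counts.put(first, counts.get(first) + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> queues = Arrays.asList("A", "B", "C");
        // Each queue should come first in roughly a third of the passes,
        // regardless of how utilized it is.
        System.out.println(firstPickCounts(queues, 30000, new Random()));
    }
}
```

Because the order ignores utilization, a queue with thousands of pending requests cannot monopolize consecutive scheduling passes; the trade-off is that underutilized queues no longer catch up as quickly.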


