Posted to yarn-issues@hadoop.apache.org by "Wangda Tan (JIRA)" <ji...@apache.org> on 2016/05/01 20:13:13 UTC

[jira] [Commented] (YARN-4280) CapacityScheduler reservations may not prevent indefinite postponement on a busy cluster

    [ https://issues.apache.org/jira/browse/YARN-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265860#comment-15265860 ] 

Wangda Tan commented on YARN-4280:
----------------------------------

[~jlowe],

Agreed. In fact, even with preemption enabled, a reservation sometimes cannot be made because of this issue.

For example:
- At time T1, the scheduler selects two containers (c1, c2) from Q1 for preemption; they will be preempted at T1 + 10 sec (assuming the kill-wait is 10 sec).
- At T1 + 5 sec, the AM releases c2 on its own, and the scheduler allocates the freed resource to Q1.
- At the same time (T1 + 5 sec), the preemption policy selects a new set of containers to preempt. Assume they are c1 and c3: c1 will still be preempted at T1 + 10 sec (unchanged), and c3 will be preempted at T1 + 15 sec.
- At T1 + 10 sec, the scheduler kills c1 and allocates the resource back to Q1.
- At the same time (T1 + 10 sec), the preemption policy selects a new set of containers to preempt. Assume they are c3 and c4: c3 will be preempted at T1 + 15 sec, and c4 at T1 + 20 sec.
- The scheduler can keep preempting containers, but all of the preempted resources go back to Q1 because of YARN-4280.
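
The timeline above can be replayed with a small toy model. This is only a sketch, not CapacityScheduler code: the function name, the 1 GB container size, the 2 GB request from the other queue, and the 5-second policy interval are all assumptions made for illustration.

```python
from itertools import count

def simulate_preemption(can_reserve, horizon=60, step=5, kill_wait=10, need_gb=2):
    """Toy replay of the timeline; returns (time the waiting queue is served or None,
    number of containers Q1 lost). All sizes and intervals are assumptions."""
    new_id = count(1)
    q1 = [f"c{next(new_id)}" for _ in range(4)]  # Q1's 1 GB containers, oldest first
    deadline = {}   # container -> time at which the scheduler kills it
    held_gb = 0     # freed memory pinned for the waiting queue (needs a reservation)
    lost = 0
    for t in range(0, horizon + 1, step):
        # Containers whose kill deadline has arrived, plus the AM's voluntary
        # release of c2 at T1 + 5 sec from the example.
        gone = [c for c in q1 if deadline.get(c, horizon + 1) <= t]
        if t == step and "c2" in q1 and "c2" not in gone:
            gone.append("c2")
        for c in gone:
            q1.remove(c)
            deadline.pop(c, None)
            lost += 1
            if can_reserve:
                held_gb += 1                   # memory stays pinned for the waiter
            else:
                q1.append(f"c{next(new_id)}")  # memory flows straight back to Q1
        if held_gb >= need_gb:
            return t, lost
        # Preemption policy: keep enough containers marked to cover the shortfall.
        # Already-marked containers keep their original deadline.
        for c in q1[: need_gb - held_gb]:
            deadline.setdefault(c, t + kill_wait)
    return None, lost

print(simulate_preemption(can_reserve=False))
print(simulate_preemption(can_reserve=True))
```

In this model, without a working reservation Q1 loses one container every 5 seconds and the waiting queue is never served; with a reservation, the two freed gigabytes accumulate and the request is satisfied at T1 + 10 sec.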

I have been thinking about how to solve this issue for a while. The problem with allowing a single reserved container to exceed the queue's maximum capacity is that, in a small cluster, one large container can be a large proportion of the cluster, so the queue's maximum capacity could be exceeded by a wide margin.
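
To put rough numbers on that concern, here is a small arithmetic sketch. The cluster sizes, the 50% maximum capacity, and the 8 GB container are made-up values, and the helper name is hypothetical.

```python
def overshoot_pct_of_cluster(cluster_gb, max_capacity_pct, used_gb, reserved_gb):
    """Percent of the whole cluster by which used + reserved exceeds the queue limit."""
    limit_gb = cluster_gb * max_capacity_pct / 100.0
    excess_gb = max(0.0, used_gb + reserved_gb - limit_gb)
    return 100.0 * excess_gb / cluster_gb

# The same 8 GB reservation, 1 GB below the queue limit, on two cluster sizes.
small = overshoot_pct_of_cluster(32, 50, 15, 8)
large = overshoot_pct_of_cluster(3200, 50, 1599, 8)
print(f"small cluster: {small:.2f}% over, large cluster: {large:.2f}% over")
```

With these assumed values, the 32 GB cluster ends up about 22% of the cluster over the queue's limit, while the 3200 GB cluster is only about 0.2% over.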

I like the idea you mentioned in this comment: https://issues.apache.org/jira/browse/YARN-4280?focusedCommentId=14971154&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14971154
We could introduce a "pre-reservation" state that waits until the queue has enough headroom with respect to maximum-capacity. During the wait, some queues in the hierarchy would be locked. However, the implementation seems tricky, and I don't have a good idea for implementing it cleanly.

> CapacityScheduler reservations may not prevent indefinite postponement on a busy cluster
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-4280
>                 URL: https://issues.apache.org/jira/browse/YARN-4280
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 2.6.1, 2.8.0, 2.7.1
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>
> Consider the following scenario:
> There are 2 queues, A (25% of the total capacity) and B (75%); both can run at total cluster capacity. There are 2 applications: appX runs on queue A and always asks for 1 GB containers (non-AM), and appY runs on queue B and asks for 2 GB containers.
> The user limit is high enough for the application to reach 100% of the cluster resource. 
> appX is running at total cluster capacity, filling it with 1 GB containers and releasing only one container at a time. appY comes in with a request for a 2 GB container, but only 1 GB is free. Ideally, since appY is in the underserved queue, it has higher priority and should reserve for its 2 GB request. Since this request puts the alloc+reserve above the total capacity of the cluster, the reservation is not made. appX comes in with a 1 GB request, and since 1 GB is still available, the request is allocated.
> This can continue indefinitely causing priority inversion.
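
The loop in the scenario above can be sketched as a toy simulation. It assumes a single 8 GB node and a simplified check that charges the full 2 GB request against the cluster limit before allowing a reservation; the function and parameter names are hypothetical and do not come from the CapacityScheduler code.

```python
def simulate(rounds, allow_reserve_over_limit, cluster_gb=8, request_gb=2):
    """Return the round in which appY gets its container, or None if it starves."""
    used_by_x = cluster_gb   # appX fills the cluster with 1 GB containers
    reserved = False         # whether appY holds a reservation on the node
    for round_no in range(1, rounds + 1):
        used_by_x -= 1                   # appX releases one 1 GB container
        free = cluster_gb - used_by_x    # memory not held by appX
        if free >= request_gb:
            return round_no              # appY's 2 GB container finally fits
        # appY tries to reserve; the full request is charged against the limit.
        fits_limit = used_by_x + request_gb <= cluster_gb
        if not reserved and (allow_reserve_over_limit or fits_limit):
            reserved = True
        if not reserved:
            used_by_x += free            # no reservation: appX takes the free 1 GB
    return None

print(simulate(100, allow_reserve_over_limit=False))
print(simulate(100, allow_reserve_over_limit=True))
```

In this model, when the reservation is refused appY never runs no matter how many rounds pass, whereas allowing the reservation to exceed the limit lets appY run as soon as appX releases a second container.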


