Posted to yarn-issues@hadoop.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2016/01/19 22:59:39 UTC

[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

    [ https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107542#comment-15107542 ] 

Jason Lowe commented on YARN-4610:
----------------------------------

I believe the issue is in LeafQueue#assignToUser.  That method modifies the amount needed to unreserve for a particular user when that user hits the resource limit.  However, the amount needed to unreserve is never reset to zero for the next iteration of the loop, so subsequent apps for different users can end up not receiving containers because the scheduler mistakenly thinks it needs to unreserve based on that stale value.
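
To illustrate the pattern described above, here is a minimal, self-contained sketch. It is not the actual CapacityScheduler code: the class StaleUnreserveSketch, the App type, the USER_LIMIT constant, and the schedule method are all hypothetical stand-ins, and the resource model is reduced to plain integers. It only shows how a value carried across loop iterations can block apps belonging to other users, and how resetting it at the top of each iteration avoids that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model; none of these names come from Hadoop.
class StaleUnreserveSketch {

    // Stand-in for an application waiting in the queue.
    static final class App {
        final String name;
        final int userUsed;   // resources the app's user already holds
        final int request;    // size of the container being requested
        final int reserved;   // resources this app could unreserve elsewhere

        App(String name, int userUsed, int request, int reserved) {
            this.name = name;
            this.userUsed = userUsed;
            this.request = request;
            this.reserved = reserved;
        }
    }

    static final int USER_LIMIT = 10; // assumed per-user limit

    // Returns the names of the apps that received a container.
    static List<String> schedule(List<App> apps, boolean resetEachIteration) {
        List<String> allocated = new ArrayList<>();
        int amountNeededUnreserve = 0;
        for (App app : apps) {
            if (resetEachIteration) {
                // The fix: start every app with a clean value.
                amountNeededUnreserve = 0;
            }
            int overLimit = app.userUsed + app.request - USER_LIMIT;
            if (overLimit > 0) {
                // The user is over its limit: record how much must be unreserved.
                amountNeededUnreserve = overLimit;
            }
            if (amountNeededUnreserve > app.reserved) {
                // Cannot unreserve enough, so this app gets nothing this pass.
                continue;
            }
            allocated.add(app.name);
        }
        return allocated;
    }

    static List<App> demoApps() {
        return Arrays.asList(
            new App("appA-user1", 10, 1, 0),  // user1 is already at its limit
            new App("appB-user2", 0, 1, 0));  // user2 is well under its limit
    }

    public static void main(String[] args) {
        System.out.println("without reset: " + schedule(demoApps(), false));
        System.out.println("with reset:    " + schedule(demoApps(), true));
    }
}
```

In this sketch, appB-user2 is skipped when the value is not reset, even though its user is under the limit, because the amount left over from appA-user1 is still in effect.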

> Reservations continue looking for one app causes other apps to starve
> ---------------------------------------------------------------------
>
>                 Key: YARN-4610
>                 URL: https://issues.apache.org/jira/browse/YARN-4610
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.7.1
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Blocker
>
> CapacityScheduler's LeafQueue has "reservations continue looking" logic that allows an application to unreserve elsewhere to fulfil a container request on a node that has available space.  However, in 2.7 that logic seems to break allocations for subsequent apps in the queue.  Once a user hits its user limit, subsequent apps in the queue belonging to other users receive containers at a significantly reduced rate.
