Posted to yarn-issues@hadoop.apache.org by "Tao Jie (JIRA)" <ji...@apache.org> on 2016/04/15 04:16:25 UTC

[jira] [Commented] (YARN-3126) FairScheduler: queue's usedResource is always more than the maxResource limit

    [ https://issues.apache.org/jira/browse/YARN-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242298#comment-15242298 ] 

Tao Jie commented on YARN-3126:
-------------------------------

I think this issue is quite common, and we have hit the same problem.
The root cause is that the max-limit check during assignment should compare *current usage* + *resource to assign* against *max resource limit*. However, when we have resources to assign to a queue, we know only *current resource usage* and *max resource limit*; we don't know *resource to assign* until we assign the resource to an appAttempt.
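To illustrate the gap, here is a minimal sketch, not the actual FairScheduler code. The class and method names (QueueLimitSketch, preCheck, fitsAfterAssign) are hypothetical, resources are reduced to memory in MB, and the numbers are the ones from the issue description (13G used, 16G max, 4G requested).

```java
// Hypothetical sketch for illustration only; not taken from FairScheduler.
public class QueueLimitSketch {

    // Check made before the request size is known: looks only at current usage.
    static boolean preCheck(long usedMb, long maxMb) {
        return usedMb <= maxMb;
    }

    // Check that also accounts for the container about to be assigned.
    static boolean fitsAfterAssign(long usedMb, long requestMb, long maxMb) {
        return usedMb + requestMb <= maxMb;
    }

    public static void main(String[] args) {
        long usedMb = 13L * 1024;
        long maxMb = 16L * 1024;
        long requestMb = 4L * 1024;

        // The queue passes the usage-only check...
        System.out.println(preCheck(usedMb, maxMb));                   // true
        // ...but assigning the 4G container would push it to 17G.
        System.out.println(fitsAfterAssign(usedMb, requestMb, maxMb)); // false
    }
}
```

The first check can pass while the second fails, which is exactly the window in which a queue ends up over its limit.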
This patch seems to add an additional check (checkQueueResourceLimit) on the *leaf queue* when assigning to an AppAttempt, but the *parent queue* resource usage may still exceed its max resource limit.
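One way to cover the parent queues as well would be to walk up the hierarchy once the request size is known. The following is only a sketch under assumptions: the Queue class, its fields, and fitsInHierarchy are hypothetical stand-ins rather than FSQueue APIs, and the queue sizes are made-up numbers.

```java
// Hypothetical sketch for illustration only; Queue here is not FSQueue.
public class QueueHierarchySketch {

    static final class Queue {
        final String name;
        final Queue parent;   // null for the root queue
        final long maxMb;
        long usedMb;

        Queue(String name, Queue parent, long maxMb, long usedMb) {
            this.name = name;
            this.parent = parent;
            this.maxMb = maxMb;
            this.usedMb = usedMb;
        }
    }

    // Returns true only if the request fits under the max limit of the
    // leaf queue and of every ancestor up to the root.
    static boolean fitsInHierarchy(Queue leaf, long requestMb) {
        for (Queue q = leaf; q != null; q = q.parent) {
            if (q.usedMb + requestMb > q.maxMb) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Queue parent = new Queue("root.parent", null, 20L * 1024, 18L * 1024);
        Queue leaf = new Queue("root.parent.leaf", parent, 16L * 1024, 10L * 1024);

        // A 4G request fits in the leaf (14G of 16G) but not in the
        // parent (22G of 20G), so a leaf-only check would let it through.
        System.out.println(fitsInHierarchy(leaf, 4L * 1024)); // false
        System.out.println(fitsInHierarchy(leaf, 2L * 1024)); // true
    }
}
```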
Also, we already have *FSQueue.assignContainerPreCheck* for the max resource limit. If we add a new check, the former one seems unnecessary here.
[~kasha], I would like to hear your thoughts.

> FairScheduler: queue's usedResource is always more than the maxResource limit
> -----------------------------------------------------------------------------
>
>                 Key: YARN-3126
>                 URL: https://issues.apache.org/jira/browse/YARN-3126
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.3.0
>         Environment: hadoop2.3.0. fair scheduler. spark 1.1.0. 
>            Reporter: Xia Hu
>              Labels: BB2015-05-TBR, assignContainer, fairscheduler, resources
>             Fix For: trunk-win
>
>         Attachments: resourcelimit-02.patch, resourcelimit-test.patch, resourcelimit.patch
>
>
> When submitting a Spark application (in both spark-on-yarn-cluster and spark-on-yarn-client modes), the queue's usedResources assigned by the FairScheduler can always exceed the queue's maxResources limit.
> From reading the FairScheduler code, I suppose this happens because the requested resources are not checked when a container is assigned.
> Here are the details:
> 1. Choose a queue. In this step, assignContainerPreCheck checks whether the queue's usedResource is bigger than its max.
> 2. Then choose an app in that queue.
> 3. Then choose a container. Here is the problem: there is no check whether this container would push the queue's resources over its max limit. If a queue's usedResource is 13G and the maxResource limit is 16G, a container asking for 4G of resources may still be assigned successfully.
> This problem will always happen with Spark applications, because we can ask for different container resources in different applications.
> By the way, I have already applied the patch from YARN-2083.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)