Posted to yarn-issues@hadoop.apache.org by "Naganarasimha G R (JIRA)" <ji...@apache.org> on 2016/01/11 02:05:39 UTC

[jira] [Updated] (YARN-3945) maxApplicationsPerUser is wrongly calculated

     [ https://issues.apache.org/jira/browse/YARN-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naganarasimha G R updated YARN-3945:
------------------------------------
    Attachment: YARN-3945.V1.003.patch

Hi [~wangda], 
I have a few questions about your earlier example:
bq. A queue has 100 guaranteed resource (capacity=max-capacity=100). And minimum-user-limit=25.
There're 4 users in the queue, they're using u1=40, u2=30, u3=20, u4=10 resources. After a while, u3 finished its application, so there're 20 available resources. Only u2 and u1 are asking resources. So the user-limit = max(1/#active-user, 25/100) = 50. So it is possible u2 get all available resource, and usage becomes u1=40, u2=50, u3=0, u4=10. This is very unfair to me. And I think currently we cannot relieve this issue via tuning minimum-user-limit.
bq. To be fairer, when there is any available resource, we should give it to users that have demand while also respecting their usage (e.g. we should give the 20 available resources to u4, making the usage u1=40, u2=30, u3=0, u4=30).
Here, IMO, since as you mentioned *u2 and u1 are asking resources*, only these 2 users are active. So, according to the existing rule, we cap their resources using the minimum user limit, and hence u4 gets nothing. But if u4 were also asking for resources, the limit would have become 1/3 = 33%, and hence u2/u4 would have got the available resources.
Of course, this doesn't bring the fine-grained fairness of the new approach you propose (??fair share : same as how fair scheduler computes fair share, for users within a queue, it will be a new option like *enable-user-fair-share*.??), but nevertheless it gives some fairness to an extent.
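
To make the arithmetic in this example concrete, here is a minimal sketch. It is not the CapacityScheduler code; the class and method names are hypothetical, and it assumes a queue of 100 resource units so that percentages and units coincide.

```java
// Hypothetical illustration of the rule quoted above:
// user-limit = max(1 / #active-user, minimum-user-limit / 100), expressed in percent.
final class UserLimitPercentSketch {

    // Percent of the queue a single active user may hold.
    static int userLimitPercent(int minimumUserLimitPercent, int activeUsers) {
        if (activeUsers <= 0) {
            throw new IllegalArgumentException("activeUsers must be positive");
        }
        int equalShare = 100 / activeUsers; // integer division, e.g. 100 / 3 = 33
        return Math.max(equalShare, minimumUserLimitPercent);
    }

    // How much of the free resource a user could still take under the limit
    // (assumption: the queue has 100 units, so limit percent == limit units).
    static int headroom(int limit, int usage, int available) {
        return Math.max(0, Math.min(available, limit - usage));
    }

    public static void main(String[] args) {
        // Only u1 and u2 are active: the limit is 50, so u2 (usage 30) may take all 20 free units.
        int twoActive = userLimitPercent(25, 2);
        System.out.println("limit=" + twoActive + " u2 headroom=" + headroom(twoActive, 30, 20));

        // If u4 were also active: the limit drops to 33, so u2 is nearly capped and u4 can grow.
        int threeActive = userLimitPercent(25, 3);
        System.out.println("limit=" + threeActive
                + " u2 headroom=" + headroom(threeActive, 30, 20)
                + " u4 headroom=" + headroom(threeActive, 10, 20));
    }
}
```

With two active users the cap is 50, which is why u2 can absorb all 20 free units; with three active users the cap falls to 33 and most of the free capacity is only reachable by u4.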

But anyway, this is a new approach and would take a considerable amount of time to come into the picture. How about fixing the current issue here and handling the new feature in a separate JIRA?

Coming to the existing issue, I have attached a patch whose calculation is similar to the AM resource calculation in {{getUserAMResourceLimitPerPartition}}, but I am skeptical about this too, because here we are doing
{{queueCapacities.getMaxAMResourcePercentage(nodePartition) * effectiveUserLimit * userLimitFactor}}
whereas the formula you shared earlier is
{{user-limit = min(queue-capacity * user-limit-factor, current-capacity * max(user-limit / 100, 1 / #active-user))}}
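
For reference, the user-limit formula quoted above can be sketched as follows. This is only an illustration under stated assumptions: the class, method, and parameter names are hypothetical, resources are modelled as plain doubles, and it is not taken from the patch or from the scheduler source.

```java
// Hypothetical sketch of:
// user-limit = min(queue-capacity * user-limit-factor,
//                  current-capacity * max(user-limit / 100, 1 / #active-user))
final class UserLimitFormulaSketch {

    static double userLimit(double queueCapacity,
                            double currentCapacity,
                            double userLimitPercent,
                            double userLimitFactor,
                            int activeUsers) {
        if (activeUsers <= 0) {
            throw new IllegalArgumentException("activeUsers must be positive");
        }
        // Share of the current capacity: the larger of the configured minimum
        // and an equal split among the active users.
        double share = Math.max(userLimitPercent / 100.0, 1.0 / activeUsers);
        // Hard ceiling derived from the user-limit-factor.
        double ceiling = queueCapacity * userLimitFactor;
        return Math.min(ceiling, currentCapacity * share);
    }

    public static void main(String[] args) {
        // Queue of 100 units, minimum-user-limit 25, factor 1, two active users.
        System.out.println(userLimit(100, 100, 25, 1.0, 2));
        // Same queue with five active users: the configured minimum takes over.
        System.out.println(userLimit(100, 100, 25, 1.0, 5));
    }
}
```

The point of the sketch is that the number of active users enters through the {{max}} term, while the user-limit-factor only acts as a ceiling; the two factors are not simply multiplied together.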

> maxApplicationsPerUser is wrongly calculated
> --------------------------------------------
>
>                 Key: YARN-3945
>                 URL: https://issues.apache.org/jira/browse/YARN-3945
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.7.1
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>         Attachments: YARN-3945.20150728-1.patch, YARN-3945.20150729-1.patch, YARN-3945.V1.003.patch
>
>
> maxApplicationsPerUser is currently calculated with the formula
> {{maxApplicationsPerUser = (int)(maxApplications * (userLimit / 100.0f) * userLimitFactor)}}, but the description of userLimit is:
> {quote}
> Each queue enforces a limit on the percentage of resources allocated to a user at any given time, if there is demand for resources. The user limit can vary between a minimum and maximum value.{color:red} The former (the minimum value) is set to this property value {color} and the latter (the maximum value) depends on the number of users who have submitted applications. For e.g., suppose the value of this property is 25. If two users have submitted applications to a queue, no single user can use more than 50% of the queue resources. If a third user submits an application, no single user can use more than 33% of the queue resources. With 4 or more users, no user can use more than 25% of the queue's resources. A value of 100 implies no user limits are imposed. The default is 100. Value is specified as an integer.
> {quote}
> A configuration value that defines the minimum limit should not be used in a formula that calculates the maximum number of applications for a user.
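
A small sketch of the formula from the description shows the concern. The class name and the input values are hypothetical and chosen only for illustration; the expression itself is the one quoted in the description.

```java
// Hypothetical illustration of the formula quoted in the issue description.
final class MaxAppsPerUserSketch {

    // Same expression as in the description; it takes no account of how many users are active.
    static int maxApplicationsPerUser(int maxApplications, int userLimit, float userLimitFactor) {
        return (int) (maxApplications * (userLimit / 100.0f) * userLimitFactor);
    }

    public static void main(String[] args) {
        // Assumed values: 10000 applications allowed in the queue, user limit 25, factor 1.
        // Even a single user in the queue is capped at a quarter of the applications,
        // although by the description that user could use up to 100% of the resources.
        System.out.println(maxApplicationsPerUser(10000, 25, 1.0f));
        // With the default user limit of 100, the cap equals the queue's maximum.
        System.out.println(maxApplicationsPerUser(10000, 100, 1.0f));
    }
}
```

Because the result depends only on the configured minimum, it stays the same whether one user or ten users have submitted applications, which is the mismatch the description points out.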



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)