Posted to yarn-issues@hadoop.apache.org by "Jonathan Hung (JIRA)" <ji...@apache.org> on 2017/07/14 00:58:00 UTC
[jira] [Updated] (YARN-6818) User limit per partition is not
honored in branch-2.7 >=
[ https://issues.apache.org/jira/browse/YARN-6818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Hung updated YARN-6818:
--------------------------------
Attachment: YARN-6818-branch-2.7.001.patch
> User limit per partition is not honored in branch-2.7 >=
> --------------------------------------------------------
>
> Key: YARN-6818
> URL: https://issues.apache.org/jira/browse/YARN-6818
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Jonathan Hung
> Assignee: Jonathan Hung
> Attachments: YARN-6818-branch-2.7.001.patch
>
>
> We are seeing an issue where the user limit factor does not cap the amount of resources a user can consume in a queue within a partition. Suppose a queue has access to partition X, the user's used resources in the default partition are 0, and the user's used resources in partition X are at the partition's user limit. As far as I can tell, this is the problematic code (in LeafQueue.java):
> {noformat}
>     if (Resources
>         .greaterThan(resourceCalculator, clusterResource,
>             user.getUsed(label),
>             limit)) {
>       // if enabled, check to see if could we potentially use this node instead
>       // of a reserved node if the application has reserved containers
>       if (this.reservationsContinueLooking) {
>         if (Resources.lessThanOrEqual(
>             resourceCalculator,
>             clusterResource,
>             Resources.subtract(user.getUsed(), application.getCurrentReservation()),
>             limit)) {
>           if (LOG.isDebugEnabled()) {
>             LOG.debug("User " + userName + " in queue " + getQueueName()
>                 + " will exceed limit based on reservations - " + " consumed: "
>                 + user.getUsed() + " reserved: "
>                 + application.getCurrentReservation() + " limit: " + limit);
>           }
>           Resource amountNeededToUnreserve = Resources.subtract(user.getUsed(label), limit);
>           // we can only acquire a new container if we unreserve first since we ignored the
>           // user limit. Choose the max of user limit or what was previously set by max
>           // capacity.
>           currentResoureLimits.setAmountNeededUnreserve(Resources.max(resourceCalculator,
>               clusterResource, currentResoureLimits.getAmountNeededUnreserve(),
>               amountNeededToUnreserve));
>           return true;
>         }
>       }
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("User " + userName + " in queue " + getQueueName()
>             + " will exceed limit - " + " consumed: "
>             + user.getUsed() + " limit: " + limit);
>       }
>       return false;
>     }
> {noformat}
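> First, the outer check sees that the used resources in partition X are greater than the partition's user limit. The reservation check then also succeeds: it evaluates {{user.getUsed() - application.getCurrentReservation() <= limit}}, where {{user.getUsed()}} without a label returns the usage in the default partition (0 in this scenario), so the method returns true.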
> One fix is to change {{Resources.subtract(user.getUsed(), application.getCurrentReservation())}} to {{Resources.subtract(user.getUsed(label), application.getCurrentReservation())}}.
> This doesn't seem to be a problem in branch-2.8 and higher since YARN-3356 introduces this check:
> {noformat}
>     if (this.reservationsContinueLooking && checkReservations
>         && label.equals(CommonNodeLabelsManager.NO_LABEL)) {
> {noformat}
> so in this case getting the used resources in default partition seems to be correct.
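
The scenario described above can be reproduced with a small stand-alone model. This is only a sketch under stated assumptions, not the real LeafQueue code: the class `UserLimitSketch`, its `User` type, and the three `canAssign*` methods are hypothetical names, resources are reduced to a single `long` (memory in MB), the `checkReservations` flag from branch-2.8 is left out, and the final `return true` for a user under the limit is assumed because the quoted excerpt ends before it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the quoted check; not the real LeafQueue API.
class UserLimitSketch {
    // Stand-in for CommonNodeLabelsManager.NO_LABEL (the default partition).
    static final String NO_LABEL = "";

    // Tracks one user's usage per partition, as a single long (assumed MB).
    static class User {
        private final Map<String, Long> usedByLabel = new HashMap<>();

        void setUsed(String label, long mb) { usedByLabel.put(label, mb); }

        long getUsed(String label) { return usedByLabel.getOrDefault(label, 0L); }

        // Like the no-argument call in the quoted code: default partition only.
        long getUsed() { return getUsed(NO_LABEL); }
    }

    // branch-2.7 behaviour as described: the reservation check reads the
    // default partition instead of the partition being scheduled.
    static boolean canAssignBuggy(User user, String label, long reserved,
                                  long limit, boolean continueLooking) {
        if (user.getUsed(label) > limit) {
            if (continueLooking && user.getUsed() - reserved <= limit) {
                return true;
            }
            return false;
        }
        return true; // assumed fall-through when the user is under the limit
    }

    // Proposed fix: the reservation check uses the same partition's usage.
    static boolean canAssignFixed(User user, String label, long reserved,
                                  long limit, boolean continueLooking) {
        if (user.getUsed(label) > limit) {
            if (continueLooking && user.getUsed(label) - reserved <= limit) {
                return true;
            }
            return false;
        }
        return true;
    }

    // branch-2.8 style guard: the reservation path runs only for NO_LABEL,
    // so reading default-partition usage there is consistent.
    static boolean canAssignGuarded(User user, String label, long reserved,
                                    long limit, boolean continueLooking) {
        if (user.getUsed(label) > limit) {
            if (continueLooking && label.equals(NO_LABEL)
                    && user.getUsed() - reserved <= limit) {
                return true;
            }
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        User user = new User();
        user.setUsed("X", 120L); // over an assumed limit of 100 in partition X
        System.out.println("branch-2.7 check: " + canAssignBuggy(user, "X", 0L, 100L, true));
        System.out.println("with label fix:   " + canAssignFixed(user, "X", 0L, 100L, true));
        System.out.println("branch-2.8 guard: " + canAssignGuarded(user, "X", 0L, 100L, true));
    }
}
```

With 120 used in partition X, nothing used in the default partition, no reservation, and a limit of 100, the branch-2.7 variant allows the assignment while the other two variants reject it. The label-aware variant still allows it when unreserving would bring the user back under the limit, for example with a reservation of 30.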