Posted to yarn-issues@hadoop.apache.org by "Ryan Williams (JIRA)" <ji...@apache.org> on 2016/03/23 00:58:25 UTC
[jira] [Commented] (YARN-4477) FairScheduler: Handle condition
which can result in an infinite loop in attemptScheduling.
[ https://issues.apache.org/jira/browse/YARN-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15207567#comment-15207567 ]
Ryan Williams commented on YARN-4477:
-------------------------------------
{quote} if multiple assign is enabled and maxAssign is unlimited, this while loop would never break. {quote}
I am seeing this with multiple assign disabled; is that known to be possible? Running 2.6.0-cdh5.5.1.
> FairScheduler: Handle condition which can result in an infinite loop in attemptScheduling.
> ------------------------------------------------------------------------------------------
>
> Key: YARN-4477
> URL: https://issues.apache.org/jira/browse/YARN-4477
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Reporter: Tao Jie
> Assignee: Tao Jie
> Fix For: 2.8.0
>
> Attachments: YARN-4477.001.patch, YARN-4477.002.patch, YARN-4477.003.patch, YARN-4477.004.patch
>
>
> This problem was introduced by YARN-4270, which added a limit on reservations.
> In FSAppAttempt.reserve():
> {code}
> if (!reservationExceedsThreshold(node, type)) {
>   LOG.info("Making reservation: node=" + node.getNodeName() +
>       " app_id=" + getApplicationId());
>   if (!alreadyReserved) {
>     getMetrics().reserveResource(getUser(), container.getResource());
>     RMContainer rmContainer =
>         super.reserve(node, priority, null, container);
>     node.reserveResource(this, priority, rmContainer);
>     setReservation(node);
>   } else {
>     RMContainer rmContainer = node.getReservedContainer();
>     super.reserve(node, priority, rmContainer, container);
>     node.reserveResource(this, priority, rmContainer);
>     setReservation(node);
>   }
> }
> {code}
> If the reservation would exceed the threshold, no reservation is set on the current node.
> But in attemptScheduling in FairScheduler:
> {code}
> while (node.getReservedContainer() == null) {
>   boolean assignedContainer = false;
>   if (!queueMgr.getRootQueue().assignContainer(node).equals(
>       Resources.none())) {
>     assignedContainers++;
>     assignedContainer = true;
>   }
>
>   if (!assignedContainer) { break; }
>   if (!assignMultiple) { break; }
>   if ((assignedContainers >= maxAssign) && (maxAssign > 0)) { break; }
> }
> {code}
> assignContainer(node) still returns FairScheduler.CONTAINER_RESERVED, which is not
> equal to Resources.none().
> As a result, if multiple assign is enabled and maxAssign is unlimited, this while loop would never break.
> I suppose that assignContainer(node) should return Resources.none() rather than CONTAINER_RESERVED when the attempt does not take the reservation because of the limit.
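The loop behavior described above can be reproduced in isolation. The following is a minimal sketch, not the Hadoop code: the Result enum, the Node class, the buggy flag, and the cap parameter are hypothetical stand-ins introduced only for illustration. Result.NONE plays the role of Resources.none(), and the cap exists solely so the sketch terminates; the loop quoted in the description has no such guard.

```java
public class AttemptSchedulingSketch {
    // Hypothetical stand-ins for Resources.none() and FairScheduler.CONTAINER_RESERVED.
    enum Result { NONE, CONTAINER_RESERVED }

    // Hypothetical node model: only tracks whether a reservation was recorded.
    static class Node {
        boolean reserved = false;
    }

    // Models an attempt whose reservation is rejected by the threshold check,
    // so the node is never marked as reserved.
    // buggy=true:  still reports CONTAINER_RESERVED (behavior described in the issue).
    // buggy=false: reports NONE (the change proposed in the issue).
    static Result assignContainer(Node node, boolean buggy) {
        return buggy ? Result.CONTAINER_RESERVED : Result.NONE;
    }

    // Mirrors the structure of the loop quoted from attemptScheduling.
    // Returns the number of iterations, or -1 if the safety cap is exceeded.
    static int attemptScheduling(Node node, boolean assignMultiple,
                                 int maxAssign, boolean buggy, int cap) {
        int assignedContainers = 0;
        int iterations = 0;
        while (!node.reserved) {
            if (++iterations > cap) {
                return -1; // would spin forever without this guard
            }
            boolean assignedContainer = false;
            if (assignContainer(node, buggy) != Result.NONE) {
                assignedContainers++;
                assignedContainer = true;
            }
            if (!assignedContainer) { break; }
            if (!assignMultiple) { break; }
            if ((assignedContainers >= maxAssign) && (maxAssign > 0)) { break; }
        }
        return iterations;
    }

    public static void main(String[] args) {
        // Multiple assign enabled, maxAssign unlimited, CONTAINER_RESERVED returned.
        System.out.println(attemptScheduling(new Node(), true, -1, true, 1000));
        // Same settings, but NONE is returned when the reservation is not taken.
        System.out.println(attemptScheduling(new Node(), true, -1, false, 1000));
        // Multiple assign disabled, CONTAINER_RESERVED returned.
        System.out.println(attemptScheduling(new Node(), false, -1, true, 1000));
    }
}
```

In this simplified model, only the first call hits the cap; the other two leave the loop after a single pass. The model covers only the loop quoted in the description and says nothing about other code paths.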