Posted to yarn-issues@hadoop.apache.org by "Hadoop QA (Jira)" <ji...@apache.org> on 2022/02/24 08:49:00 UTC
[jira] [Commented] (YARN-11082) Use node label resource as denominator to decide which resource is dominated
[ https://issues.apache.org/jira/browse/YARN-11082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497262#comment-17497262 ]
Hadoop QA commented on YARN-11082:
----------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 8s{color} | {color:red}{color} | {color:red} YARN-11082 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-11082 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13040419/YARN-11082.patch |
| Console output | https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/1273/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
This message was automatically generated.
> Use node label resource as denominator to decide which resource is dominated
> -----------------------------------------------------------------------------
>
> Key: YARN-11082
> URL: https://issues.apache.org/jira/browse/YARN-11082
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Affects Versions: 3.1.1
> Reporter: Bo Li
> Priority: Major
> Fix For: 3.1.1
>
> Attachments: YARN-11082.patch
>
>
> We used the cluster resource as the denominator to decide which resource is dominant in AbstractCSQueue#canAssignToThisQueue. However, the nodes in our cluster are configured differently.
> {quote}2021-12-09 10:24:37,069 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator: assignedContainer application attempt=appattempt_1637412555366_1588993_000001 container=null queue=root.a.a1.a2 clusterResource=<memory:175117312, vCores:40222> type=RACK_LOCAL requestedPartition=xx
> 2021-12-09 10:24:37,069 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue: Used resource=<memory:3381248, vCores:687> exceeded maxResourceLimit of the queue =<memory:3420315, vCores:687>
> 2021-12-09 10:24:37,069 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Failed to accept allocation proposal
> {quote}
> We can see that even though root.a.a1.a2 has used 687 of its 687 vcores, the following check in AbstractCSQueue#canAssignToThisQueue still returns false:
> {quote}
> Resources.greaterThanOrEqual(resourceCalculator, clusterResource,
> usedExceptKillable, currentLimitResource)
> {quote}
> clusterResource = <memory:175117312, vCores:40222>
> usedExceptKillable = <memory:3381248, vCores:687>
> currentLimitResource = <memory:3420315, vCores:687>
> currentLimitResource:
> memory : 3381248/175117312 = 0.01930847362
> vCores : 687/40222 = 0.01708020486
> usedExceptKillable:
> memory : 3384320/175117312 = 0.01932601615
> vCores : 688/40222 = 0.01710506687
> DRF will treat memory as the dominant resource and return false in this scenario.
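The comparison described in the issue can be reproduced with a small standalone sketch. This is not the Hadoop implementation: `DominantShareSketch`, `dominantShare`, and `greaterThanOrEqual` are hypothetical names, and the logic is a simplified assumption of how a dominant-resource comparison behaves (it compares only the largest share of each operand and ignores any tie-breaking on the remaining resources). The inputs are the `clusterResource`, `usedExceptKillable`, and `currentLimitResource` values quoted in the description.

```java
final class DominantShareSketch {

    /** Largest per-resource share of the denominator (the "dominant share"). */
    public static double dominantShare(long memory, long vcores,
                                       long denomMemory, long denomVcores) {
        double memoryShare = (double) memory / denomMemory;
        double vcoreShare = (double) vcores / denomVcores;
        return Math.max(memoryShare, vcoreShare);
    }

    /** Simplified stand-in: true if lhs's dominant share is at least rhs's. */
    public static boolean greaterThanOrEqual(long denomMemory, long denomVcores,
                                             long lhsMemory, long lhsVcores,
                                             long rhsMemory, long rhsVcores) {
        return dominantShare(lhsMemory, lhsVcores, denomMemory, denomVcores)
            >= dominantShare(rhsMemory, rhsVcores, denomMemory, denomVcores);
    }

    public static void main(String[] args) {
        // Values copied from the issue description.
        long clusterMemory = 175117312L, clusterVcores = 40222L;
        long usedMemory = 3381248L, usedVcores = 687L;
        long limitMemory = 3420315L, limitVcores = 687L;

        // Per-resource shares relative to the whole cluster.
        System.out.println("used memory share   = " + (double) usedMemory / clusterMemory);
        System.out.println("used vCores share   = " + (double) usedVcores / clusterVcores);
        System.out.println("limit memory share  = " + (double) limitMemory / clusterMemory);
        System.out.println("limit vCores share  = " + (double) limitVcores / clusterVcores);

        // Memory has the larger share on both sides, so only memory is compared.
        boolean atOrOverLimit = greaterThanOrEqual(
            clusterMemory, clusterVcores,
            usedMemory, usedVcores,
            limitMemory, limitVcores);
        System.out.println("used >= limit: " + atOrOverLimit); // false with these inputs
    }
}
```

With the cluster total as the denominator, memory has the larger share for both operands, so the fact that vCores are fully used (687 of 687) never enters the comparison.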
--
This message was sent by Atlassian Jira
(v8.20.1#820001)