Posted to yarn-issues@hadoop.apache.org by "Robert Kanter (JIRA)" <ji...@apache.org> on 2017/03/02 01:29:45 UTC
[jira] [Commented] (YARN-6050) AMs can't be scheduled on racks or nodes
[ https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891420#comment-15891420 ]
Robert Kanter commented on YARN-6050:
-------------------------------------
[~leftnoteasy], I've been working on fixing this, and here's the code I currently have:
{code:java}
public static int getApplicableNodeCountForAM(RMContext rmContext,
    Configuration conf, List<ResourceRequest> amReqs) {
  // Collect the nodes named by rack- or node-level requests (anything but ANY).
  Set<NodeId> nodesForReqs = new HashSet<>();
  for (ResourceRequest amReq : amReqs) {
    if (amReq.getRelaxLocality() &&
        !amReq.getResourceName().equals(ResourceRequest.ANY)) {
      nodesForReqs.addAll(
          rmContext.getScheduler().getClusterNodeIdsByResourceName(
              amReq.getResourceName()));
    }
  }

  if (YarnConfiguration.areNodeLabelsEnabled(conf)) {
    RMNodeLabelsManager labelManager = rmContext.getNodeLabelManager();
    // A null or blank label expression means the default (no-label) partition.
    String amNodeLabelExpression = amReqs.get(0).getNodeLabelExpression();
    amNodeLabelExpression = (amNodeLabelExpression == null
        || amNodeLabelExpression.trim().isEmpty())
        ? RMNodeLabelsManager.NO_LABEL : amNodeLabelExpression;
    Map<String, Set<NodeId>> labelsToNodes =
        labelManager.getLabelsToNodes(
            Collections.singleton(amNodeLabelExpression));
    if (labelsToNodes.containsKey(amNodeLabelExpression)) {
      Set<NodeId> nodesForLabels = labelsToNodes.get(amNodeLabelExpression);
      if (nodesForReqs.isEmpty()) {
        return nodesForLabels.size();
      }
      // Count only the nodes that satisfy both the label and the requests.
      return Sets.intersection(nodesForLabels, nodesForReqs).size();
    }
  }

  // No usable label information: fall back to the requests, or the whole cluster.
  if (nodesForReqs.isEmpty()) {
    return rmContext.getScheduler().getNumClusterNodes();
  }
  return nodesForReqs.size();
}
{code}
Basically, I'm collecting the {{NodeId}}s for each of the resource requests and for the node label, and then finding the ones that satisfy both. The problem is that {{getLabelsToNodes}} returns _all_ {{NodeId}}s, not just the active ones. For example, if {{nodeA:1234}} has label "label1", then {{getLabelsToNodes("label1")}} returns both {{nodeA:1234}} and {{nodeA:0}}. We don't want {{nodeA:0}} here; in fact, {{getActiveNMCountPerLabel("label1")}} returns {{1}}, not {{2}}. Looking at {{RMNodeLabel}}, it only tracks the count of active nodes for the label, not the {{NodeId}}s themselves. Do you know if there was a reason for that? If not, I could refactor {{RMNodeLabel}} to keep track of the active {{NodeId}}s, and that would solve my problem.
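To make the mismatch concrete, here is a minimal, self-contained sketch of the counting problem. It assumes the extra {{nodeA:0}} entry is a host-level placeholder that can be recognized by its port of 0. {{SimpleNodeId}}, {{dropWildcardPorts}}, and {{countApplicable}} are hypothetical names made up for this example, not YARN APIs.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class ActiveNodeFilter {
  /** Hypothetical stand-in for YARN's NodeId: a host plus a port. */
  static final class SimpleNodeId {
    final String host;
    final int port;

    SimpleNodeId(String host, int port) {
      this.host = host;
      this.port = port;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof SimpleNodeId)) {
        return false;
      }
      SimpleNodeId other = (SimpleNodeId) o;
      return port == other.port && host.equals(other.host);
    }

    @Override
    public int hashCode() {
      return Objects.hash(host, port);
    }
  }

  /** Returns a copy of the set without port-0 entries (assumed to be placeholders). */
  static Set<SimpleNodeId> dropWildcardPorts(Set<SimpleNodeId> nodes) {
    Set<SimpleNodeId> result = new HashSet<>();
    for (SimpleNodeId node : nodes) {
      if (node.port != 0) {
        result.add(node);
      }
    }
    return result;
  }

  /**
   * Counts nodes that carry the label and, when rack/node requests exist,
   * also appear in the set of nodes named by those requests.
   */
  static int countApplicable(Set<SimpleNodeId> nodesForLabel,
      Set<SimpleNodeId> nodesForReqs) {
    Set<SimpleNodeId> candidates = dropWildcardPorts(nodesForLabel);
    if (nodesForReqs.isEmpty()) {
      return candidates.size();
    }
    candidates.retainAll(nodesForReqs);
    return candidates.size();
  }
}
```

With a label set containing {{nodeA:1234}} and {{nodeA:0}} and no rack/node requests, {{countApplicable}} reports one node rather than two. Filtering on port 0 does not prove that the remaining nodes are actually running, so this is not a substitute for tracking the active {{NodeId}}s in {{RMNodeLabel}}.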
> AMs can't be scheduled on racks or nodes
> ----------------------------------------
>
> Key: YARN-6050
> URL: https://issues.apache.org/jira/browse/YARN-6050
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.9.0, 3.0.0-alpha2
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Attachments: YARN-6050.001.patch, YARN-6050.002.patch, YARN-6050.003.patch, YARN-6050.004.patch, YARN-6050.005.patch, YARN-6050.006.patch, YARN-6050.007.patch, YARN-6050.008.patch
>
>
> YARN itself supports rack/node aware scheduling for AMs; however, there are currently two problems:
> # To specify hard or soft rack/node requests, you have to specify more than one {{ResourceRequest}}. For example, if you want to schedule an AM only on "rackA", you have to create two {{ResourceRequest}}s, like this:
> {code}
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, true);
> {code}
> The problem is that the YARN API doesn't actually allow you to specify more than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}. The current behavior is to either build one from {{getResource}} or use the one from {{getAMContainerResourceRequest}} directly, depending on whether {{getAMContainerResourceRequest}} returns null. We'll need to add a third method, say {{getAMContainerResourceRequests}}, which works with a list of {{ResourceRequest}}s so that clients can specify multiple resource requests.
> # There are some places where things are hardcoded to overwrite what the client specifies. These are pretty straightforward to fix.
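As a rough illustration of the API change proposed in the description above, the sketch below shows one way the AM requests could be chosen once a list-valued option exists. Everything here is hypothetical: {{Req}} is a simplified stand-in for {{ResourceRequest}}, and the precedence order (list first, then the single request, then a default built from the bare resource) is an assumption, not confirmed behavior.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class AmRequestSelection {
  /** Same value as ResourceRequest.ANY in YARN. */
  static final String ANY = "*";

  /** Hypothetical, simplified stand-in for YARN's ResourceRequest. */
  static final class Req {
    final String resourceName;
    final boolean relaxLocality;

    Req(String resourceName, boolean relaxLocality) {
      this.resourceName = resourceName;
      this.relaxLocality = relaxLocality;
    }
  }

  /**
   * Picks the AM requests to use. Assumed precedence: an explicit list,
   * then a single explicit request, then a default ANY request.
   */
  static List<Req> selectAmRequests(List<Req> requestList, Req singleRequest) {
    if (requestList != null && !requestList.isEmpty()) {
      return new ArrayList<>(requestList);
    }
    if (singleRequest != null) {
      return Collections.singletonList(singleRequest);
    }
    // Nothing specified: any node is acceptable.
    return Collections.singletonList(new Req(ANY, true));
  }

  /** Builds the pair of requests that restricts the AM to a single rack. */
  static List<Req> onlyOnRack(String rack) {
    List<Req> reqs = new ArrayList<>();
    reqs.add(new Req(ANY, false)); // do not fall back to arbitrary nodes
    reqs.add(new Req(rack, true)); // any node within the rack is fine
    return reqs;
  }
}
```

Calling {{selectAmRequests(onlyOnRack("rackA"), null)}} yields both requests, which is the case the single-request API cannot express.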