Posted to yarn-issues@hadoop.apache.org by "Robert Kanter (JIRA)" <ji...@apache.org> on 2017/03/02 01:29:45 UTC
[jira] [Commented] (YARN-6050) AMs can't be scheduled on racks or nodes

    [ https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891420#comment-15891420 ] 

Robert Kanter commented on YARN-6050:
-------------------------------------

[~leftnoteasy], I've been working on fixing this, and here's the code I currently have:
{code:java}
  public static int getApplicableNodeCountForAM(RMContext rmContext,
      Configuration conf, List<ResourceRequest> amReqs) {
    // Collect the nodes matched by each rack/node-specific request.
    Set<NodeId> nodesForReqs = new HashSet<>();
    for (ResourceRequest amReq : amReqs) {
      if (amReq.getRelaxLocality() &&
          !amReq.getResourceName().equals(ResourceRequest.ANY)) {
        nodesForReqs.addAll(
            rmContext.getScheduler().getClusterNodeIdsByResourceName(
                amReq.getResourceName()));
      }
    }

    if (YarnConfiguration.areNodeLabelsEnabled(conf)) {
      RMNodeLabelsManager labelManager = rmContext.getNodeLabelManager();
      // A null or blank label expression means the default (no-label) partition.
      String amNodeLabelExpression = amReqs.get(0).getNodeLabelExpression();
      amNodeLabelExpression = (amNodeLabelExpression == null
          || amNodeLabelExpression.trim().isEmpty())
              ? RMNodeLabelsManager.NO_LABEL : amNodeLabelExpression;
      Map<String, Set<NodeId>> labelsToNodes =
          labelManager.getLabelsToNodes(
              Collections.singleton(amNodeLabelExpression));
      if (labelsToNodes.containsKey(amNodeLabelExpression)) {
        Set<NodeId> nodesForLabels = labelsToNodes.get(amNodeLabelExpression);
        // No rack/node restriction: every node with the label is applicable.
        if (nodesForReqs.isEmpty()) {
          return nodesForLabels.size();
        }
        // Otherwise only nodes that satisfy both the label and the requests.
        return Sets.intersection(nodesForLabels, nodesForReqs).size();
      }
    }

    // No usable label information: fall back to the requests, or the cluster.
    if (nodesForReqs.isEmpty()) {
      return rmContext.getScheduler().getNumClusterNodes();
    }
    return nodesForReqs.size();
  }
{code}
Basically, I'm collecting the {{NodeId}}s that match each resource request and the {{NodeId}}s that match the node label, and then counting the ones that satisfy both.  The problem is that {{getLabelsToNodes}} returns _all_ {{NodeId}}s for the label, not just the active ones.  For example, if {{nodeA:1234}} has label "label1", then {{getLabelsToNodes("label1")}} returns both {{nodeA:1234}} and {{nodeA:0}}.  We don't want {{nodeA:0}} here, and in fact {{getActiveNMCountPerLabel("label1")}} returns only {{1}}, not {{2}}.  Looking at {{RMNodeLabel}}, it tracks only the count of active nodes for the label, not the {{NodeId}}s themselves.  Do you know if there was a reason for that?  If not, I could refactor {{RMNodeLabel}} to keep track of the active {{NodeId}}s, which would solve my problem.
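
For illustration only (this is not the patch): a minimal, self-contained sketch of the filtering described above. It uses plain "host:port" strings in place of {{NodeId}} and assumes the set of active node ids can be obtained from somewhere, such as the scheduler. The class and method names are hypothetical and are not part of YARN.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in, not YARN code: node ids are plain "host:port" strings.
final class ActiveNodeCountSketch {

  private ActiveNodeCountSketch() {
  }

  /** Returns a new set with the label's node ids that are also active. */
  static Set<String> activeNodesForLabel(Set<String> nodesForLabel,
      Set<String> activeNodes) {
    Set<String> result = new HashSet<>(nodesForLabel);
    // Drops entries such as "nodeA:0" that match no active node.
    result.retainAll(activeNodes);
    return result;
  }

  /**
   * Counts the nodes an AM could be placed on: active nodes carrying the
   * label, narrowed to the rack/node requests when any were given.
   */
  static int applicableNodeCount(Set<String> nodesForLabel,
      Set<String> activeNodes, Set<String> nodesForReqs) {
    Set<String> candidates = activeNodesForLabel(nodesForLabel, activeNodes);
    if (!nodesForReqs.isEmpty()) {
      candidates.retainAll(nodesForReqs);
    }
    return candidates.size();
  }
}
```

With the label set containing "nodeA:1234" and "nodeA:0" and only "nodeA:1234" active, this counts one node, which matches what {{getActiveNMCountPerLabel}} reports in the example above.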

> AMs can't be scheduled on racks or nodes
> ----------------------------------------
>
>                 Key: YARN-6050
>                 URL: https://issues.apache.org/jira/browse/YARN-6050
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0-alpha2
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: YARN-6050.001.patch, YARN-6050.002.patch, YARN-6050.003.patch, YARN-6050.004.patch, YARN-6050.005.patch, YARN-6050.006.patch, YARN-6050.007.patch, YARN-6050.008.patch
>
>
> YARN itself supports rack/node-aware scheduling for AMs; however, there are currently two problems:
> # To specify hard or soft rack/node requests, you have to specify more than one {{ResourceRequest}}.  For example, if you want to schedule an AM only on "rackA", you have to create two {{ResourceRequest}}s, like this:
> {code}
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, true);
> {code}
> The problem is that the YARN API doesn't actually allow you to specify more than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}.  The current behavior is to either build one from {{getResource}} or use the one from {{getAMContainerResourceRequest}} directly, depending on whether {{getAMContainerResourceRequest}} is null.  We'll need to add a third method, say {{getAMContainerResourceRequests}}, that works with a list of {{ResourceRequest}}s so that clients can specify multiple resource requests.
> # There are some places where things are hardcoded to overwrite what the client specifies.  These are pretty straightforward to fix.
