Posted to yarn-dev@hadoop.apache.org by "BELUGA BEHR (JIRA)" <ji...@apache.org> on 2018/04/17 16:07:00 UTC

[jira] [Created] (YARN-8170) Caching Node Rack Location

BELUGA BEHR created YARN-8170:
---------------------------------

             Summary: Caching Node Rack Location
                 Key: YARN-8170
                 URL: https://issues.apache.org/jira/browse/YARN-8170
             Project: Hadoop YARN
          Issue Type: New Feature
          Components: applications, nodemanager
    Affects Versions: 3.0.1
            Reporter: BELUGA BEHR


When the MapReduce ApplicationMaster is trying to assign Mappers to Nodes, it loops over all of the queued Mappers and looks up the ideal rack location of each Mapper.

Under the covers, the rack awareness script is called once per Mapper. The results are cached, but only for as long as the ApplicationMaster exists. That means the script gets called N times (once per Mapper) each time a new ApplicationMaster is launched. If the rack awareness script is complex or requires an external lookup, this can be a slow process and can even act as a denial-of-service (DoS) attack on the external lookup source.
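To make the cost concrete, here is a minimal, self-contained sketch of a cache whose lifetime is tied to a single process. It is not Hadoop code: PerProcessRackCache, the host-keyed map, and the Function standing in for the rack awareness script are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for a resolver whose cache lives only as long as one ApplicationMaster. */
class PerProcessRackCache {
    // Assumption: the cache is keyed by host name and is never shared across processes.
    private final Map<String, String> cache = new HashMap<>();
    // Stands in for the (possibly slow) rack awareness script.
    private final Function<String, String> script;
    private int scriptCalls = 0;

    PerProcessRackCache(Function<String, String> script) {
        this.script = script;
    }

    /** Returns the cached rack for the host, invoking the script only on a miss. */
    String resolve(String host) {
        String rack = cache.get(host);
        if (rack == null) {
            scriptCalls++;
            rack = script.apply(host);
            cache.put(host, rack);
        }
        return rack;
    }

    /** Number of times the script was actually invoked by this instance. */
    int scriptCalls() {
        return scriptCalls;
    }
}
```

Each new instance starts with an empty cache, so in this sketch every newly launched ApplicationMaster pays the full lookup cost again, no matter how many earlier instances already resolved the same hosts.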

There are at least a couple of ways to tackle this...
 # Add a DNSToSwitchMapping implementation that caches in an external cache (e.g., memcached) instead of memory, so that all ApplicationMasters can share the same cache and would rarely call the rack awareness script.
 # Like the shuffle service, add a new NodeManager auxiliary service which exposes a rack lookup API so that the NodeManagers are responsible for caching the rack locations. This would also require a DNSToSwitchMapping implementation that interacts with this new service.
 # Other?
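The first option could look roughly like the sketch below. It is an illustration under stated assumptions, not a proposed patch: SharedRackCache, InMemorySharedRackCache, and SharedCacheRackMapping are hypothetical names, the in-memory map merely stands in for an external store such as memcached, and the class does not implement Hadoop's DNSToSwitchMapping interface (it only mirrors the list-in, list-out shape of a resolve call) so that it runs without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical abstraction over an external cache such as memcached. */
interface SharedRackCache {
    String get(String host);            // returns null on a miss
    void put(String host, String rack);
}

/** In-memory stand-in so the sketch runs without a cache server. */
class InMemorySharedRackCache implements SharedRackCache {
    private final Map<String, String> entries = new ConcurrentHashMap<>();

    @Override
    public String get(String host) {
        return entries.get(host);
    }

    @Override
    public void put(String host, String rack) {
        entries.put(host, rack);
    }
}

/** Consults the shared cache first and falls back to the wrapped lookup on a miss. */
class SharedCacheRackMapping {
    private final SharedRackCache cache;
    // Stands in for the rack awareness script or another underlying lookup.
    private final Function<String, String> fallback;
    private int fallbackCalls = 0;

    SharedCacheRackMapping(SharedRackCache cache, Function<String, String> fallback) {
        this.cache = cache;
        this.fallback = fallback;
    }

    /** Resolves each host to a rack, in the same order as the input list. */
    List<String> resolve(List<String> hosts) {
        List<String> racks = new ArrayList<>(hosts.size());
        for (String host : hosts) {
            String rack = cache.get(host);
            if (rack == null) {
                fallbackCalls++;
                rack = fallback.apply(host);
                cache.put(host, rack);
            }
            racks.add(rack);
        }
        return racks;
    }

    /** Number of times this instance had to call the underlying lookup. */
    int fallbackCalls() {
        return fallbackCalls;
    }
}
```

With two SharedCacheRackMapping instances pointed at the same SharedRackCache, the second one resolves hosts that the first has already seen without touching the underlying lookup. A real implementation would also need to decide on entry expiry and on what to do when the external cache is unreachable, which this sketch leaves out.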

{code:java}
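// Resolving the rack for an allocated container's host; a cache miss falls through to the rack awareness script.
String host = allocated.getNodeId().getHost();
String rack = RackResolver.resolve(host).getNetworkLocation();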
{code}
[https://github.com/apache/hadoop/blob/453d48bdfbb67ed3e66c33c4aef239c3d7bdd3bc/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java#L1435-L1464]


