Posted to mapreduce-issues@hadoop.apache.org by "Hong Tang (JIRA)" <ji...@apache.org> on 2009/11/06 10:51:32 UTC
[jira] Commented: (MAPREDUCE-1191) JobInProgress.createCache()
should not add unknown hosts to the host-to-rack location mapping.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774256#action_12774256 ]
Hong Tang commented on MAPREDUCE-1191:
--------------------------------------
Related code - JobInProgress.createCache:
{code}
Map<Node, List<TaskInProgress>> createCache(
    Job.RawSplit[] splits, int maxLevel) {
  Map<Node, List<TaskInProgress>> cache =
      new IdentityHashMap<Node, List<TaskInProgress>>(maxLevel);
  for (int i = 0; i < splits.length; i++) {
    String[] splitLocations = splits[i].getLocations();
    if (splitLocations.length == 0) {
      nonLocalMaps.add(maps[i]);
      continue;
    }
    for (String host : splitLocations) {
      Node node = jobtracker.resolveAndAddToTopology(host); //< HERE host will always be added to internal hostnamesToNodeMap
      LOG.info("tip:" + maps[i].getTIPId() + " has split on node:" + node);
      for (int j = 0; j < maxLevel; j++) {
        List<TaskInProgress> hostMaps = cache.get(node);
        if (hostMaps == null) {
          hostMaps = new ArrayList<TaskInProgress>();
          cache.put(node, hostMaps);
          hostMaps.add(maps[i]);
        }
        // check whether the hostMaps already contains an entry for a TIP
        // This will be true for nodes that are racks and multiple nodes in
        // the rack contain the input for a tip. Note that if it already
        // exists in the hostMaps, it must be the last element there since
        // we process one TIP at a time sequentially in the split-size order
        if (hostMaps.get(hostMaps.size() - 1) != maps[i]) {
          hostMaps.add(maps[i]);
        }
        node = node.getParent();
      }
    }
  }
  return cache;
}
{code}
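One possible direction, as a minimal sketch rather than a patch: resolve split locations with a lookup that never inserts, so that host names supplied by a client cannot grow the mapping. Everything in this sketch is hypothetical and stands in for the real JobTracker structures: KnownHostTopology, register(), lookup() and knownLocations() are illustration names, not Hadoop APIs. It assumes that only the cluster side (for example, a tracker reporting in) is allowed to add entries.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the JobTracker's host-to-rack mapping. */
class KnownHostTopology {
    // host name -> rack path; filled only through register()
    private final Map<String, String> hostToRack = new HashMap<>();

    /** Trusted path (assumed: called by the cluster, never with client input). */
    void register(String host, String rack) {
        hostToRack.put(host, rack);
    }

    /** Lookup only: returns the rack, or null for an unknown host. Never inserts. */
    String lookup(String host) {
        return hostToRack.get(host);
    }

    /** Number of hosts currently held in the mapping. */
    int size() {
        return hostToRack.size();
    }

    /**
     * Returns the split locations that are already known, in order,
     * and silently drops the rest. The mapping is left unchanged.
     */
    List<String> knownLocations(String[] splitLocations) {
        List<String> known = new ArrayList<>();
        for (String host : splitLocations) {
            if (lookup(host) != null) {
                known.add(host);
            }
        }
        return known;
    }
}
```

With this shape, a split whose locations are all unknown yields an empty list, and the caller could treat it like the splitLocations.length == 0 case above (that is, as a non-local map). Whether dropping unknown hosts is acceptable for scheduling locality is a separate question that this sketch does not answer.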
> JobInProgress.createCache() should not add unknown hosts to the host-to-rack location mapping.
> ----------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-1191
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1191
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Hong Tang
>
> JobInProgress.createCache() currently adds any host name specified in the raw splits to rack "/default-rack" if it does not already know the mapping for that host. This is a bad idea because a malicious client can submit jobs with many maps whose locations are non-existent hosts, and thus use up the JobTracker's memory.