Posted to mapreduce-issues@hadoop.apache.org by "Hong Tang (JIRA)" <ji...@apache.org> on 2009/11/06 10:51:32 UTC

[jira] Commented: (MAPREDUCE-1191) JobInProgress.createCache() should not add unknown hosts to the host-to-rack location mapping.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774256#action_12774256 ] 

Hong Tang commented on MAPREDUCE-1191:
--------------------------------------

Related code - JobInProgress.createCache:
{code}
  Map<Node, List<TaskInProgress>> createCache(
                         Job.RawSplit[] splits, int maxLevel) {
    Map<Node, List<TaskInProgress>> cache =
      new IdentityHashMap<Node, List<TaskInProgress>>(maxLevel);
    
    for (int i = 0; i < splits.length; i++) {
      String[] splitLocations = splits[i].getLocations();
      if (splitLocations.length == 0) {
        nonLocalMaps.add(maps[i]);
        continue;
      }

      for(String host: splitLocations) {
        Node node = jobtracker.resolveAndAddToTopology(host); // <-- HERE: host is always added to the internal hostnamesToNodeMap
        LOG.info("tip:" + maps[i].getTIPId() + " has split on node:" + node);
        for (int j = 0; j < maxLevel; j++) {
          List<TaskInProgress> hostMaps = cache.get(node);
          if (hostMaps == null) {
            hostMaps = new ArrayList<TaskInProgress>();
            cache.put(node, hostMaps);
            hostMaps.add(maps[i]);
          }
          //check whether the hostMaps already contains an entry for a TIP
          //This will be true for nodes that are racks and multiple nodes in
          //the rack contain the input for a tip. Note that if it already
          //exists in the hostMaps, it must be the last element there since
          //we process one TIP at a time sequentially in the split-size order
          if (hostMaps.get(hostMaps.size() - 1) != maps[i]) {
            hostMaps.add(maps[i]);
          }
          node = node.getParent();
        }
      }
    }
    return cache;
  }
{code}
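
To make the concern concrete, here is a minimal, self-contained sketch. It is not Hadoop code: TopologySketch, addKnownHost, resolveAndAdd and resolveIfKnown are hypothetical names, and the plain HashMap only stands in for hostnamesToNodeMap. It contrasts a resolve-and-add lookup, which grows the table for every unknown host, with a lookup-only variant that leaves the table untouched.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the JobTracker's host-to-rack table; not Hadoop code. */
class TopologySketch {
  static final String DEFAULT_RACK = "/default-rack";

  // host name -> rack path, playing the role of hostnamesToNodeMap
  private final Map<String, String> hostToRack = new HashMap<String, String>();

  /** Registers a host that the cluster already knows about. */
  void addKnownHost(String host, String rack) {
    hostToRack.put(host, rack);
  }

  /** Behavior described in the issue: an unknown host is added under the default rack. */
  String resolveAndAdd(String host) {
    String rack = hostToRack.get(host);
    if (rack == null) {
      rack = DEFAULT_RACK;
      hostToRack.put(host, rack); // the table grows with every unknown name
    }
    return rack;
  }

  /** Lookup-only variant: returns null for an unknown host and never grows the table. */
  String resolveIfKnown(String host) {
    return hostToRack.get(host);
  }

  int size() {
    return hostToRack.size();
  }

  public static void main(String[] args) {
    TopologySketch topology = new TopologySketch();
    topology.addKnownHost("host1", "/rack1");

    // Split locations naming hosts that do not exist, as a malicious client might submit.
    for (int i = 0; i < 1000; i++) {
      topology.resolveIfKnown("bogus-" + i);
    }
    System.out.println("after lookup-only resolution: " + topology.size());

    for (int i = 0; i < 1000; i++) {
      topology.resolveAndAdd("bogus-" + i);
    }
    System.out.println("after resolve-and-add: " + topology.size());
  }
}
```

With a lookup-only call, createCache could skip any split location that resolves to null instead of caching it. Whether that is the right fix for the real JobTracker is an assumption of this sketch, not something established here.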

> JobInProgress.createCache() should not add unknown hosts to the host-to-rack location mapping.
> ----------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1191
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1191
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Hong Tang
>
> JobInProgress.createCache() currently adds host names specified in raw splits to rack "/default-rack" if it does not already know the mapping. This seems like a bad idea, because a malicious client can submit jobs with many maps whose locations are non-existent hosts and thereby use up the JobTracker's memory.
