Posted to common-dev@hadoop.apache.org by "Devaraj Das (JIRA)" <ji...@apache.org> on 2007/10/30 11:43:50 UTC
[jira] Issue Comment Edited: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538717 ] 

devaraj edited comment on HADOOP-1985 at 10/30/07 3:43 AM:
---------------------------------------------------------------

Some thoughts - 
1) Make the DNS->Switch mapping an interface class. 
    1.1) interface DNStoSwitchMap {
             public String resolve(String dnsname);
         }
    1.2) The switch string format is the same as it exists today (documented in https://issues.apache.org/jira/secure/attachment/12345251/Rack_aware_HDFS_proposal.pdf). That will make things work in the non-typical setup with 3+ levels of nodes.
    1.3) The default implementation of the interface, packaged with Hadoop, could simply look up a statically created table of DNS->switch mappings.

2) Today the DataNode takes its location as an argument. That would no longer be needed, so the associated code would go away.
3) The DataNode sends the location information as part of the registration. The NetworkTopology is derived at the NameNode. Using the interface mentioned in (1), the NameNode can create the topology all by itself.

4) The JobTracker creates the NetworkTopology for the TaskTrackers exactly as the NameNode does.
5) The JobTracker assigns tasks first on a node-locality basis, then on a rack-locality basis.
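
To make points 1.1 and 1.3 concrete, here is a minimal sketch of the interface together with a table-backed default implementation. Only DNStoSwitchMap and resolve come from the proposal; the class name StaticTableSwitchMap, the DEFAULT_SWITCH fallback and the example switch strings are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Interface from point 1.1.
interface DNStoSwitchMap {
    String resolve(String dnsname);
}

// Hypothetical default implementation for point 1.3:
// the DNS->switch table is built once and only read afterwards.
class StaticTableSwitchMap implements DNStoSwitchMap {
    // Assumed fallback for hosts missing from the table; the proposal does not specify one.
    static final String DEFAULT_SWITCH = "/default-rack";

    private final Map<String, String> table;

    StaticTableSwitchMap(Map<String, String> table) {
        // Defensive copy so later changes by the caller do not affect lookups.
        this.table = new HashMap<>(table);
    }

    @Override
    public String resolve(String dnsname) {
        return table.getOrDefault(dnsname, DEFAULT_SWITCH);
    }
}
```

Both the NameNode and the JobTracker could hold one instance of such a class and call resolve() whenever a DataNode or TaskTracker registers.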

In our environment, task placement based on "distance" (o.a.h.n.NetworkTopology.getDistance) is not very relevant, since we only have flat racks of machines. But we could make the framework ready for it as well (assuming it is not too much work).
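
The node-local, then rack-local ordering from point (5) can be sketched as follows for the flat-rack case. This is an illustration under assumptions, not the JobTracker's actual scheduling code: the LocalityPicker class, its method names and the plain host-to-switch map are all hypothetical, and each pending task is reduced to the single host that holds its input.

```java
import java.util.List;
import java.util.Map;

class LocalityPicker {
    // Locality level, smaller is closer: 0 = same node, 1 = same switch, 2 = anything else.
    static int localityLevel(String trackerHost, String trackerSwitch,
                             String dataHost, String dataSwitch) {
        if (trackerHost.equals(dataHost)) {
            return 0;
        }
        // An unknown (null) switch never counts as rack-local.
        if (trackerSwitch != null && trackerSwitch.equals(dataSwitch)) {
            return 1;
        }
        return 2;
    }

    // Returns the index of the pending task whose input host is closest to the
    // requesting tracker, or -1 if there are no pending tasks. Ties go to the
    // earliest task in the list.
    static int pickTask(String trackerHost, Map<String, String> hostToSwitch,
                        List<String> inputHosts) {
        String trackerSwitch = hostToSwitch.get(trackerHost);
        int best = -1;
        int bestLevel = Integer.MAX_VALUE;
        for (int i = 0; i < inputHosts.size(); i++) {
            String dataHost = inputHosts.get(i);
            int level = localityLevel(trackerHost, trackerSwitch,
                                      dataHost, hostToSwitch.get(dataHost));
            if (level < bestLevel) {
                best = i;
                bestLevel = level;
            }
        }
        return best;
    }
}
```

Supporting deeper topologies would mean replacing the three fixed levels with a real distance function, which is where something like NetworkTopology.getDistance would come in.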

Does the above make sense?

      was (Author: devaraj):
    Some thoughts - 
1) Make the DNS->Switch mapping an interface class. 
    1.1) interface DNStoSwitchMap {
               public String resolve(String dnsname);
            }
    1.2) The switch string format is the same as it exists today (documented in https://issues.apache.org/jira/secure/attachment/12345251/Rack_aware_HDFS_proposal.pdf). That will make things work in the non-typical setup with 3+ levels of nodes.
    1.3) The default implementation of the interface, packaged with hadoop, could simply look up a table of dns->switch mapping created statically. 

2) The DataNode, today, takes the location as an argument. This is not needed anymore, and hence the associated code would go away.
3) The DataNode sends the location information as part of the registration. The NetworkTopology is derived at the NameNode. Using the interface mentioned in (1), the NameNode can create the topology all by itself.

4) The JobTracker creates the NetworkTopology for the TaskTrackers exactly how the NameNode does it.
5) The JobTracker assigns tasks first on node-locality basis, then on rack-locality basis.

In our environment,  "distance-basis" (o.a.h.n.NetworkTopology.getDistance), isn't that much relevant. But we might make the framework ready for it as well (assuming it is not too much work). 

Does the above make sense?
  
> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>
> In order to implement switch locality in MapReduce, we need to have switch location in both the namenode and job tracker.  Currently the namenode asks the data nodes for this info and they run a local script to answer this question.  In our environment and others that I know of there is no reason to push this to each node.  It is easier to maintain a centralized script that maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch mappings and invokes a loadable class or a configurable system call to resolve unknown DNS to switch mappings.  We can then add this to the namenode to support the current block to switch mapping needs and simplify the data nodes.  We can also add this same callout to the job tracker and then implement rack locality logic there without needing to change the filesystem API or the split planning API.
> Not only is this the least intrusive path to building rack-local MR that I can identify, it is also compatible with future infrastructures that may derive topology on the fly, etc.
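
The caching class proposed in the description above might look roughly like the sketch below. It is only an illustration: CachingSwitchResolver is a hypothetical name, and a java.util.function.Function stands in for the "loadable class or configurable system call" that resolves unknown names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

class CachingSwitchResolver {
    // Known DNS name -> switch string mappings.
    private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
    // Stand-in for the pluggable lookup (loadable class or external script).
    private final Function<String, String> backend;

    CachingSwitchResolver(Function<String, String> backend) {
        this.backend = backend;
    }

    // Returns the cached switch for dnsName, asking the backend only on a miss.
    // If the backend returns null, nothing is cached and null is returned.
    String resolve(String dnsName) {
        return cache.computeIfAbsent(dnsName, backend);
    }
}
```

The same instance could serve both the namenode and the job tracker code paths, so the external lookup runs at most once per successfully resolved host name.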
