Posted to common-dev@hadoop.apache.org by "eric baldeschwieler (JIRA)" <ji...@apache.org> on 2007/10/02 20:39:50 UTC

[jira] Created: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Abstract node to switch mapping into a topology service class used by namenode and jobtracker
---------------------------------------------------------------------------------------------

                 Key: HADOOP-1985
                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
             Project: Hadoop
          Issue Type: New Feature
            Reporter: eric baldeschwieler


In order to implement switch locality in MapReduce, we need to have switch location in both the namenode and job tracker.  Currently the namenode asks the data nodes for this info and they run a local script to answer this question.  In our environment and others that I know of there is no reason to push this to each node.  It is easier to maintain a centralized script that maps node DNS names to switch strings.

I propose that we build a new class that caches known DNS name to switch mappings and invokes a loadable class or a configurable system call to resolve unknown DNS to switch mappings.  We can then add this to the namenode to support the current block to switch mapping needs and simplify the data nodes.  We can also add this same callout to the job tracker and then implement rack locality logic there without needing to change the filesystem API or the split planning API.

Not only is this the least intrusive path to building rack-local MR that I can identify, it is also compatible with future infrastructures that may derive topology on the fly, etc.
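
For illustration only, here is a minimal Java sketch of the kind of class proposed above: a cache in front of a pluggable resolver that is consulted only for unknown DNS names. The names SwitchResolver, CachingSwitchResolver and fromClassName are hypothetical and are not taken from any patch on this issue; the reflection-based loader merely stands in for the "loadable class" option.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface: maps a node's DNS name to a switch (rack) string.
interface SwitchResolver {
    String resolve(String dnsName);
}

// Caches known mappings and calls the delegate only for unknown names.
final class CachingSwitchResolver implements SwitchResolver {
    private final SwitchResolver delegate;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    CachingSwitchResolver(SwitchResolver delegate) {
        this.delegate = delegate;
    }

    // Assumption: the delegate class has a no-argument constructor.
    // This stands in for loading a resolver named in the configuration.
    static CachingSwitchResolver fromClassName(String className)
            throws ReflectiveOperationException {
        Object instance =
            Class.forName(className).getDeclaredConstructor().newInstance();
        return new CachingSwitchResolver((SwitchResolver) instance);
    }

    @Override
    public String resolve(String dnsName) {
        // The delegate runs at most once per name; if it returns null,
        // nothing is cached and the next call asks the delegate again.
        return cache.computeIfAbsent(dnsName, delegate::resolve);
    }
}
```

Under this sketch, both the namenode and the job tracker could hold one such object and share the same resolver implementation.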

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Open  (was: Patch Available)

The patch is out-of-sync with the trunk.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "eric baldeschwieler (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539020 ] 

eric baldeschwieler commented on HADOOP-1985:
---------------------------------------------

I agree that an exec should be required as a simple-to-configure option.


> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Patch Available  (was: Open)

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Attachment: 1985.v1.patch

Attached is a tested patch, which also has better documentation. One important test-related change is the way the patch handles multiple datanodes on the same machine. Since the namenode should be able to distinguish between them in terms of the dnsToRackId mapping, I added a configuration option called "slave.host.name" that takes effect only when the framework is run under junit. The same applies to the jobtracker/tasktrackers. Also, all communications to these dummy hostnames are redirected to "localhost" (indirectly via NetUtils.createSocketAddress).
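
As a rough illustration of the redirection idea described above (this is not the code in the patch), the sketch below keeps a table of dummy hostnames and substitutes the real target before an address is built. TestHostRedirect, redirect and addressFor are hypothetical names. The sketch uses an unresolved address so that it performs no DNS lookup; a real implementation would presumably resolve the target.

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical test helper: maps dummy hostnames to a real host.
final class TestHostRedirect {
    private static final Map<String, String> REDIRECTS = new ConcurrentHashMap<>();

    private TestHostRedirect() {}

    // Register a dummy hostname and the host that should be contacted instead.
    static void redirect(String dummyHost, String realHost) {
        REDIRECTS.put(dummyHost, realHost);
    }

    // Build an address, replacing a registered dummy hostname with its target.
    // Hosts that were never registered pass through unchanged.
    static InetSocketAddress addressFor(String host, int port) {
        String target = REDIRECTS.getOrDefault(host, host);
        return InetSocketAddress.createUnresolved(target, port);
    }
}
```

With this arrangement, several datanodes in one JVM can each report a distinct hostname (and so a distinct rack) while all traffic still goes to the local machine.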

Patch is up for review.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>         Attachments: 1985.new.patch, 1985.v1.patch

[jira] Issue Comment Edited: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558834#action_12558834 ] 

owen.omalley edited comment on HADOOP-1985 at 1/14/08 3:09 PM:
----------------------------------------------------------------

I'm worried about the time and memory performance of this. Have you run a sort with dfs cluster == map/reduce cluster and compared running times and job tracker memory size? We've already seen cases where the current pollForNewTask causes performance problems...

It bothers me that the maximum number of levels is hard-coded rather than configurable.

From a style point of view, I probably would have defined a new class rather than use nested java.util containers like List<Map<Node, List<TaskInProgress>>>. That way, if we change the representation later, the change won't be scattered through the code.  In particular, I can imagine wanting the data structure to be something like:
Map<String (rack name), RackInfo>, where RackInfo has a Map<String (hostname), List<TaskInProgress>>. Or even more tree-like...
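
A minimal sketch of that shape, assuming a two-level rack/host layout, might look like the following. PendingTask is a placeholder for TaskInProgress, and TaskLocalityIndex and its pick method are hypothetical names introduced here; none of this is from the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Placeholder for the real TaskInProgress; only an id is kept here.
final class PendingTask {
    final String id;
    PendingTask(String id) { this.id = id; }
}

// Per-rack view: hostname -> tasks whose input lives on that host.
final class RackInfo {
    private final Map<String, List<PendingTask>> tasksByHost = new HashMap<>();

    void add(String host, PendingTask task) {
        tasksByHost.computeIfAbsent(host, h -> new ArrayList<>()).add(task);
    }

    List<PendingTask> tasksOn(String host) {
        return tasksByHost.getOrDefault(host, Collections.emptyList());
    }

    // All tasks in this rack, in no particular host order.
    List<PendingTask> allTasks() {
        List<PendingTask> all = new ArrayList<>();
        for (List<PendingTask> tasks : tasksByHost.values()) {
            all.addAll(tasks);
        }
        return all;
    }
}

// Rack name -> RackInfo, hiding the nested maps behind one class.
final class TaskLocalityIndex {
    private final Map<String, RackInfo> racks = new HashMap<>();

    void add(String rack, String host, PendingTask task) {
        racks.computeIfAbsent(rack, r -> new RackInfo()).add(host, task);
    }

    // Prefer a task local to the host, then any task in the same rack;
    // returns null when the rack has no tasks at all.
    PendingTask pick(String rack, String host) {
        RackInfo info = racks.get(rack);
        if (info == null) {
            return null;
        }
        List<PendingTask> local = info.tasksOn(host);
        if (!local.isEmpty()) {
            return local.get(0);
        }
        List<PendingTask> inRack = info.allTasks();
        return inRack.isEmpty() ? null : inRack.get(0);
    }
}
```

Callers only see add and pick, so switching to a deeper tree later would not touch them.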

Did you need to change the definition of findNewTask? I don't see it in the patch.

This needs user documentation in forrest.

The javadoc on DNSToSwitchMapping.resolve should probably mention that implementations must cache their results if their operation is expensive. There isn't a way to clear or update that cache, though, which might be a problem at some point...

You don't really need the Scan example, you could use the GenericMRLoadGenerator with a -keepmap of 0.

In the longer term I think a configured mapping class would be useful: a class named
org.apache.hadoop.net.ConfiguredNodeMapping that lets you set the mapping in your config. Something like:

{code}
<property>
   <name>hadoop.configured.node.mapping</name>
   <value>host1=/rack1,host2=/rack1,host3=/rack4</value>
</property>
{code}
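
To make the suggested value format concrete, here is a small sketch of how a comma-separated host=rack string could be parsed into a lookup table. ConfiguredMappingParser is a hypothetical helper, not the proposed ConfiguredNodeMapping class, and the error handling is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for the "host1=/rack1,host2=/rack1" value format.
final class ConfiguredMappingParser {
    private ConfiguredMappingParser() {}

    static Map<String, String> parse(String value) {
        Map<String, String> mapping = new HashMap<>();
        if (value == null) {
            return mapping;
        }
        for (String rawEntry : value.split(",")) {
            String entry = rawEntry.trim();
            if (entry.isEmpty()) {
                continue; // tolerate stray or trailing commas
            }
            int eq = entry.indexOf('=');
            // Reject entries with a missing host or a missing rack.
            if (eq <= 0 || eq == entry.length() - 1) {
                throw new IllegalArgumentException("Bad mapping entry: " + entry);
            }
            mapping.put(entry.substring(0, eq).trim(),
                        entry.substring(eq + 1).trim());
        }
        return mapping;
    }
}
```

Parsing the example value above would give three entries, with host3 mapped to /rack4.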




> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12563045#action_12563045 ] 

Nigel Daley commented on HADOOP-1985:
-------------------------------------

I just added the release audit to the patch process.  It looks for an increase in the number of files that don't have proper license headers.  This patch is missing one for src/java/org/apache/hadoop/net/ScriptBasedMapping.java, which is why it got a -1.  Don't worry about fixing this for now.  I'll be fixing a number of these before we release 0.16.



> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v11.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556684#action_12556684 ] 

Hadoop QA commented on HADOOP-1985:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12372631/1985.v5.patch
against trunk revision .

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1503/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1503/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1503/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1503/console

This message is automatically generated.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Attachment: 1985.v11.patch

One more of those occasions when the patch went out-of-sync.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v11.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561794#action_12561794 ] 

Devaraj Das commented on HADOOP-1985:
-------------------------------------

I think the core tests failed due to HADOOP-2680 ("all datanodes are bad" ..). They pass on my machine.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533350 ] 

Runping Qi commented on HADOOP-1985:
------------------------------------


Yes, the DNS name (hostname) to switch id mapping should be managed just like the hostname to IP mapping.
The info should be available to the DFS namenode, datanodes and applications in the same way.
The job tracker uses this info for task assignment. In general, the DFS client should also use this info to decide where to fetch a needed block.


> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Patch Available  (was: Open)

Rerunning through hudson.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558626#action_12558626 ] 

Hadoop QA commented on HADOOP-1985:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12373076/1985.v6.patch
against trunk revision r611760.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests -1.  The patch failed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1581/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1581/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1581/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1581/console

This message is automatically generated.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Assignee: Doug Cutting  (was: Devaraj Das)
      Status: Patch Available  (was: Open)

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Doug Cutting
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Attachment: 1985.v10.patch

Thanks, Nigel, for pointing out that a doc build failure might be incorrectly reported as a core-tests failure (we should address that reporting issue). The previous patch had a problem in its Forrest doc; this patch fixes it.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561506#action_12561506 ] 

Hadoop QA commented on HADOOP-1985:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12373749/1985.v10.patch
against trunk revision r614301.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests -1.  The patch failed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1677/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1677/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1677/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1677/console

This message is automatically generated.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555578#action_12555578 ] 

Hadoop QA commented on HADOOP-1985:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12372415/1985.v4.patch
against trunk revision .

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests -1.  The patch failed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1454/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1454/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1454/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1454/console

This message is automatically generated.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Attachment: jobinprogress.patch

Here is a patch with some changes to the task cache data structure. It also includes changes that try to ensure that rack (and higher level) locality is preserved for failed/speculative tasks as well. It doesn't delete a TIP from a node cache until the TIP is complete, unless the node happens to be the host itself. The reasoning is that we should not delete TIPs from the rack-level cache, so that a failed/speculative TIP does not pay a performance penalty if some other tasktracker from the same rack gets to run it. We do delete TIPs from the host cache, since we don't execute the same TIP on a host that failed to execute it earlier.
I would highly appreciate a quick review of this one.
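
To make the pruning rule above concrete, here is a minimal sketch of the idea. It assumes a simplified model in which a TIP is just a string ID and each TIP is cached under one host and one rack; the class and method names (LocalityTaskCache, taskFailedOn, and so on) are made up for illustration and are not the ones in jobinprogress.patch.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a two-level task cache: one level keyed by host,
// one keyed by rack. TIPs are represented by plain string IDs here.
class LocalityTaskCache {
    private final Map<String, Set<String>> hostCache = new HashMap<>();
    private final Map<String, Set<String>> rackCache = new HashMap<>();

    // Registers a TIP whose input lives on the given host in the given rack.
    void addTask(String tip, String host, String rack) {
        hostCache.computeIfAbsent(host, k -> new LinkedHashSet<>()).add(tip);
        rackCache.computeIfAbsent(rack, k -> new LinkedHashSet<>()).add(tip);
    }

    // A failed attempt removes the TIP from that host's cache only, since the
    // same TIP is not retried on a host that failed it. The rack-level entry
    // stays, so another tasktracker in the rack can still pick it up locally.
    void taskFailedOn(String tip, String host) {
        Set<String> tips = hostCache.get(host);
        if (tips != null) {
            tips.remove(tip);
        }
    }

    // A completed TIP is removed from every cache level.
    void taskCompleted(String tip) {
        for (Set<String> tips : hostCache.values()) {
            tips.remove(tip);
        }
        for (Set<String> tips : rackCache.values()) {
            tips.remove(tip);
        }
    }

    boolean isCachedForHost(String tip, String host) {
        Set<String> tips = hostCache.get(host);
        return tips != null && tips.contains(tip);
    }

    boolean isCachedForRack(String tip, String rack) {
        Set<String> tips = rackCache.get(rack);
        return tips != null && tips.contains(tip);
    }
}
```

In this model, after taskFailedOn("tip_1", "host-a") the TIP is gone from host-a's cache but is still found under its rack, which is the behavior the patch is aiming for with failed/speculative tasks.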

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, jobinprogress.patch

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Open  (was: Patch Available)

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560675#action_12560675 ] 

Hadoop QA commented on HADOOP-1985:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12373500/1985.v9.patch
against trunk revision r613359.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests -1.  The patch failed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1648/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1648/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1648/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1648/console

This message is automatically generated.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Open  (was: Patch Available)

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v11.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12562648#action_12562648 ] 

Hadoop QA commented on HADOOP-1985:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12374045/1985.v11.patch
against trunk revision 614721.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests -1.  The patch failed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1679/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1679/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1679/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1679/console

This message is automatically generated.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v11.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Patch Available  (was: Open)

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Open  (was: Patch Available)

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch
>
>

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Patch Available  (was: Open)

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch
>
>

[jira] Issue Comment Edited: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538717 ] 

devaraj edited comment on HADOOP-1985 at 10/30/07 3:43 AM:
---------------------------------------------------------------

Some thoughts - 
1) Make the DNS->Switch mapping an interface class. 
    1.1) interface DNStoSwitchMap {
               public String resolve(String dnsname);
            }
    1.2) The switch string format is the same as it exists today (documented in https://issues.apache.org/jira/secure/attachment/12345251/Rack_aware_HDFS_proposal.pdf). That will make things work in the non-typical setup with 3+ levels of nodes.
    1.3) The default implementation of the interface, packaged with hadoop, could simply look up a table of dns->switch mapping created statically. 

2) The DataNode, today, takes the location as an argument. This is not needed anymore, and hence the associated code would go away.
3) Today, the DataNode sends its location information as part of registration, and the NetworkTopology is derived from it at the NameNode. Using the interface mentioned in (1), the NameNode can build the topology all by itself.

4) The JobTracker creates the NetworkTopology for the TaskTrackers exactly how the NameNode does it.
5) The JobTracker assigns tasks first on node-locality basis, then on rack-locality basis.

In our environment, task placement based on "distance" (o.a.h.n.NetworkTopology.getDistance) isn't very relevant, since we only have flat racks of machines. But we might make the framework ready for it as well (assuming it is not too much work).

Does the above make sense?
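
A minimal Java sketch of points 1.1 and 1.3 follows, combined with the caching idea from the issue description. Only the DNStoSwitchMap interface and its resolve method come from the comment above; StaticTableSwitchMap, CachedSwitchMap and the "/default-rack" fallback are hypothetical names chosen for illustration, not code from any attached patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Interface as proposed in 1.1.
interface DNStoSwitchMap {
  String resolve(String dnsname);
}

// Hypothetical default implementation (1.3): a lookup in a statically built table.
class StaticTableSwitchMap implements DNStoSwitchMap {
  // Assumed fallback for hosts missing from the table.
  static final String DEFAULT_RACK = "/default-rack";

  private final Map<String, String> table;

  StaticTableSwitchMap(Map<String, String> table) {
    this.table = new HashMap<>(table); // defensive copy
  }

  @Override
  public String resolve(String dnsname) {
    return table.getOrDefault(dnsname, DEFAULT_RACK);
  }
}

// Hypothetical wrapper that remembers answers, so an expensive resolver
// (for example an external script) is asked at most once per name.
class CachedSwitchMap implements DNStoSwitchMap {
  private final DNStoSwitchMap delegate;
  private final Map<String, String> cache = new ConcurrentHashMap<>();

  CachedSwitchMap(DNStoSwitchMap delegate) {
    this.delegate = delegate;
  }

  @Override
  public String resolve(String dnsname) {
    // The delegate must not return null; computeIfAbsent would not cache it.
    return cache.computeIfAbsent(dnsname, delegate::resolve);
  }
}
```

Under this sketch, both the NameNode and the JobTracker could hold one CachedSwitchMap and call resolve(hostname) whenever a DataNode or TaskTracker registers.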

      was (Author: devaraj):
    Some thoughts - 
1) Make the DNS->Switch mapping an interface class. 
    1.1) interface DNStoSwitchMap {
               public String resolve(String dnsname);
            }
    1.2) The switch string format is the same as it exists today (documented in https://issues.apache.org/jira/secure/attachment/12345251/Rack_aware_HDFS_proposal.pdf). That will make things work in the non-typical setup with 3+ levels of nodes.
    1.3) The default implementation of the interface, packaged with hadoop, could simply look up a table of dns->switch mapping created statically. 

2) The DataNode, today, takes the location as an argument. This is not needed anymore, and hence the associated code would go away.
3) The DataNode sends the location information as part of the registration. The NetworkTopology is derived at the NameNode. Using the interface mentioned in (1), the NameNode can create the topology all by itself.

4) The JobTracker creates the NetworkTopology for the TaskTrackers exactly how the NameNode does it.
5) The JobTracker assigns tasks first on node-locality basis, then on rack-locality basis.

In our environment,  "distance-basis" (o.a.h.n.NetworkTopology.getDistance), isn't that much relevant. But we might make the framework ready for it as well (assuming it is not too much work). 

Does the above make sense?
  
> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Attachment: 1985.v4.patch

HADOOP-2344 made this patch go out-of-sync. This is the updated one.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch
>
>

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12563041#action_12563041 ] 

Hadoop QA commented on HADOOP-1985:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12374045/1985.v11.patch
against trunk revision 615686.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit -1.  The applied patch generated 289 release audit warnings (more than the trunk's current 288 warnings).

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1690/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1690/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1690/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1690/console

This message is automatically generated.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v11.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch
>
>

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Patch Available  (was: Open)

Passing it through hudson.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch
>
>

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540233 ] 

Hairong Kuang commented on HADOOP-1985:
---------------------------------------

Currently in dfs, a datanode can get its network location either from the command line or by running a pluggable script at startup. The property is defined in the default configuration file as below.

<property>
  <name>dfs.network.script</name>
  <value></value>
  <description>
        Specifies a script name that print the network location path
        of the current machine.
  </description>
</property>
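
As an illustration of how a script-based lookup could move to a central place, here is a rough Java sketch that runs a configured command and reads one network location path from its output. ScriptLocationResolver is a hypothetical helper, not a Hadoop class. Passing the host name as the last argument is also an assumption taken from the issue description; the existing dfs.network.script is described as reporting only the current machine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: resolves a host name by invoking an external command.
class ScriptLocationResolver {
  private final List<String> command;

  ScriptLocationResolver(List<String> command) {
    this.command = new ArrayList<>(command);
  }

  // Runs "<command...> <dnsname>" and returns the first line of output, trimmed.
  String resolve(String dnsname) {
    List<String> argv = new ArrayList<>(command);
    argv.add(dnsname); // assumption: the script takes the host name as an argument
    try {
      Process p = new ProcessBuilder(argv).redirectErrorStream(true).start();
      String firstLine;
      try (BufferedReader r = new BufferedReader(
          new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
        firstLine = r.readLine();
        while (r.readLine() != null) {
          // drain remaining output so the child process cannot block
        }
      }
      int exit = p.waitFor();
      if (exit != 0 || firstLine == null || firstLine.trim().isEmpty()) {
        throw new IllegalStateException(
            "location script failed for " + dnsname + " (exit code " + exit + ")");
      }
      return firstLine.trim();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }
}
```

Starting a process per lookup is slow, which is one reason the issue description asks for a cache in front of whatever resolver is configured.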


> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552127 ] 

Hadoop QA commented on HADOOP-1985:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12371728/1985.v2.patch
against trunk revision r604451.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1356/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1356/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1356/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1356/console

This message is automatically generated.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch
>
>

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Attachment: 1985.v2.patch

Fixed the findbugs issues.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch
>
>

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12551804 ] 

Hadoop QA commented on HADOOP-1985:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12371666/1985.v1.patch
against trunk revision r604058.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs -1.  The patch appears to introduce 2 new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1344/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1344/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1344/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1344/console

This message is automatically generated.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch
>
>

[jira] Assigned: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das reassigned HADOOP-1985:
-----------------------------------

    Assignee: Devaraj Das  (was: Doug Cutting)

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch
>
>

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Attachment: 1985.v9.patch

Patch attached with review comments incorporated.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch
>
>
> In order to implement switch locality in MapReduce, we need to have switch location in both the namenode and job tracker.  Currently the namenode asks the data nodes for this info and they run a local script to answer this question.  In our environment and others that I know of there is no reason to push this to each node.  It is easier to maintain a centralized script that maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch mappings and invokes a loadable class or a configurable system call to resolve unknown DNS to switch mappings.  We can then add this to the namenode to support the current block to switch mapping needs and simplify the data nodes.  We can also add this same callout to the job tracker and then implement rack locality logic there without needing to chane the filesystem API or the split planning API.
> Not only is this the least intrusive path to building racklocal MR I can ID, it is also future compatible to future infrastructures that may derive topology on the fly, etc, etc...

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558834#action_12558834 ] 

Owen O'Malley commented on HADOOP-1985:
---------------------------------------

I'm worried about the time and memory performance of this. Have you run a sort with dfs cluster == map/reduce cluster and compared running times and job tracker memory size? We've already seen cases where the current pollForNewTask causes performance problems...

It bothers me that the max levels is hard coded rather than configurable.

From a style point of view, I probably would have defined a new class rather than use nested java.util containers like List<Map<Node, List<TaskInProgress>>>. That way if we change the representation later it won't be scattered through the code.  In particular, I can imagine wanting to have the data structure be something like:
Map<String (rack name), RackInfo> where RackInfo has a Map<String (hostname), List<TaskInProgress>>. Or even more tree-like...
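
To make the suggestion concrete, here is a minimal sketch of what such a wrapper could look like. The names Tip, TaskLocalityIndex and the method names are hypothetical, Tip is only a placeholder for the real TaskInProgress, and nothing here is taken from the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Placeholder for the real TaskInProgress; hypothetical, for illustration only.
class Tip {
    final String id;
    Tip(String id) { this.id = id; }
}

// Hypothetical per-rack record: hostname -> tasks whose input lives on that host.
class RackInfo {
    private final Map<String, List<Tip>> tasksByHost = new HashMap<>();

    void add(String host, Tip tip) {
        tasksByHost.computeIfAbsent(host, h -> new ArrayList<>()).add(tip);
    }

    List<Tip> tasksOn(String host) {
        return tasksByHost.getOrDefault(host, new ArrayList<>());
    }

    // All tasks anywhere in this rack, used for the rack-local fallback.
    List<Tip> allTasks() {
        List<Tip> all = new ArrayList<>();
        for (List<Tip> tips : tasksByHost.values()) {
            all.addAll(tips);
        }
        return all;
    }
}

// Hypothetical wrapper that keeps the nested maps out of the calling code.
class TaskLocalityIndex {
    private final Map<String, RackInfo> racks = new HashMap<>();

    void add(String rack, String host, Tip tip) {
        racks.computeIfAbsent(rack, r -> new RackInfo()).add(host, tip);
    }

    // Prefer a task on the same host, then any task in the same rack; null if none.
    Tip find(String rack, String host) {
        RackInfo info = racks.get(rack);
        if (info == null) {
            return null;
        }
        List<Tip> local = info.tasksOn(host);
        if (!local.isEmpty()) {
            return local.get(0);
        }
        List<Tip> rackLocal = info.allTasks();
        return rackLocal.isEmpty() ? null : rackLocal.get(0);
    }
}
```

With this shape, callers only see add and find, so switching to a deeper tree later would not touch them.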

Did you need to change the definition of findNewTask? I don't see it in the patch.

This needs user documentation in Forrest.

The Javadoc on DNSToSwitchMapping.resolve should probably mention that implementations must cache if their operation is expensive. Although there isn't a way to clear or update that cache, which might be a problem at some point...

You don't really need the Scan example, you could use the GenericMRLoadGenerator with a -keepmap of 0.

In the longer term I think a configured mapping class would be useful. A class named
org.apache.hadoop.net.ConfiguredNodeMapping that let you set the mapping in your config. Something like:

{code}
<property>
   <name>hadoop.configured.node.mapping</name>
   <value>host1=/rack1,host2=/rack1,host3=/rack4</value>
</property>
{code}

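A rough sketch of how such a class could parse that property value follows. The class name ConfiguredNodeMappingSketch is hypothetical, the "/default-rack" fallback is an assumption, and the value is passed in as a plain string rather than read from a Hadoop Configuration so the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a configuration-driven host-to-rack table.
class ConfiguredNodeMappingSketch {
    // Assumed fallback for hosts that are not listed in the configuration.
    static final String DEFAULT_RACK = "/default-rack";

    private final Map<String, String> rackByHost = new HashMap<>();

    // Parses a value such as "host1=/rack1,host2=/rack1,host3=/rack4".
    ConfiguredNodeMappingSketch(String configValue) {
        if (configValue == null) {
            return;
        }
        for (String entry : configValue.split(",")) {
            String trimmed = entry.trim();
            int eq = trimmed.indexOf('=');
            if (eq <= 0 || eq == trimmed.length() - 1) {
                continue; // skip malformed entries such as "host" or "=/rack"
            }
            String host = trimmed.substring(0, eq).trim();
            String rack = trimmed.substring(eq + 1).trim();
            rackByHost.put(host, rack);
        }
    }

    String resolve(String host) {
        return rackByHost.getOrDefault(host, DEFAULT_RACK);
    }
}
```

Silently skipping malformed entries is only one possible choice; failing fast at startup may be preferable on a real cluster.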

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch
>
>
> In order to implement switch locality in MapReduce, we need to have switch location in both the namenode and job tracker.  Currently the namenode asks the data nodes for this info and they run a local script to answer this question.  In our environment and others that I know of there is no reason to push this to each node.  It is easier to maintain a centralized script that maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch mappings and invokes a loadable class or a configurable system call to resolve unknown DNS to switch mappings.  We can then add this to the namenode to support the current block to switch mapping needs and simplify the data nodes.  We can also add this same callout to the job tracker and then implement rack locality logic there without needing to chane the filesystem API or the split planning API.
> Not only is this the least intrusive path to building racklocal MR I can ID, it is also future compatible to future infrastructures that may derive topology on the fly, etc, etc...

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538717 ] 

Devaraj Das commented on HADOOP-1985:
-------------------------------------

Some thoughts - 
1) Make the DNS->Switch mapping an interface class. 
    1.1) interface DNStoSwitchMap {
             public String resolve(String dnsname);
         }
    1.2) The switch string format is the same as it exists today (documented in https://issues.apache.org/jira/secure/attachment/12345251/Rack_aware_HDFS_proposal.pdf). That will make things work in the non-typical setup with 3+ levels of nodes.
    1.3) The default implementation of the interface, packaged with hadoop, could simply look up a table of dns->switch mapping created statically. 

2) The DataNode, today, takes the location as an argument. This is not needed anymore, and hence the associated code would go away.
3) Today the DataNode sends the location information as part of the registration, and the NetworkTopology is derived at the NameNode. Using the interface mentioned in (1), the NameNode can create the topology all by itself.

4) The JobTracker creates the NetworkTopology for the TaskTrackers exactly how the NameNode does it.
5) The JobTracker assigns tasks first on node-locality basis, then on rack-locality basis.

In our environment, "distance-basis" (o.a.h.n.NetworkTopology.getDistance) isn't that relevant. But we might make the framework ready for it as well (assuming it is not too much work).

Does the above make sense?
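
A minimal sketch of points 1, 3 and 4, assuming the interface from 1.1 as written. StaticTableMap, TopologySketch and the "/default-rack" fallback are hypothetical names used only for illustration; the real topology class is NetworkTopology, which this does not try to reproduce.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// The interface as proposed in point 1.1.
interface DNStoSwitchMap {
    String resolve(String dnsname);
}

// Point 1.3: a default implementation backed by a statically created table.
class StaticTableMap implements DNStoSwitchMap {
    static final String DEFAULT_RACK = "/default-rack"; // assumed fallback

    private final Map<String, String> table = new HashMap<>();

    void put(String dnsname, String switchPath) {
        table.put(dnsname, switchPath);
    }

    @Override
    public String resolve(String dnsname) {
        return table.getOrDefault(dnsname, DEFAULT_RACK);
    }
}

// Points 3 and 4: the master (NameNode or JobTracker) groups registering
// nodes by their resolved switch without asking the nodes themselves.
class TopologySketch {
    private final DNStoSwitchMap mapping;
    private final Map<String, SortedSet<String>> hostsBySwitch = new TreeMap<>();

    TopologySketch(DNStoSwitchMap mapping) {
        this.mapping = mapping;
    }

    void register(String dnsname) {
        String switchPath = mapping.resolve(dnsname);
        hostsBySwitch.computeIfAbsent(switchPath, s -> new TreeSet<>()).add(dnsname);
    }

    boolean sameSwitch(String a, String b) {
        return mapping.resolve(a).equals(mapping.resolve(b));
    }

    SortedSet<String> hostsOn(String switchPath) {
        return hostsBySwitch.getOrDefault(switchPath, new TreeSet<>());
    }
}
```

Because both masters would share the same mapping object type, the JobTracker could build its view of the TaskTrackers the same way the NameNode builds its view of the DataNodes.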

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>
> In order to implement switch locality in MapReduce, we need to have switch location in both the namenode and job tracker.  Currently the namenode asks the data nodes for this info and they run a local script to answer this question.  In our environment and others that I know of there is no reason to push this to each node.  It is easier to maintain a centralized script that maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch mappings and invokes a loadable class or a configurable system call to resolve unknown DNS to switch mappings.  We can then add this to the namenode to support the current block to switch mapping needs and simplify the data nodes.  We can also add this same callout to the job tracker and then implement rack locality logic there without needing to chane the filesystem API or the split planning API.
> Not only is this the least intrusive path to building racklocal MR I can ID, it is also future compatible to future infrastructures that may derive topology on the fly, etc, etc...

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Attachment: 1985.v5.patch

Fixed a problem with NetworkTopology's getNode method.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch
>
>
> In order to implement switch locality in MapReduce, we need to have switch location in both the namenode and job tracker.  Currently the namenode asks the data nodes for this info and they run a local script to answer this question.  In our environment and others that I know of there is no reason to push this to each node.  It is easier to maintain a centralized script that maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch mappings and invokes a loadable class or a configurable system call to resolve unknown DNS to switch mappings.  We can then add this to the namenode to support the current block to switch mapping needs and simplify the data nodes.  We can also add this same callout to the job tracker and then implement rack locality logic there without needing to chane the filesystem API or the split planning API.
> Not only is this the least intrusive path to building racklocal MR I can ID, it is also future compatible to future infrastructures that may derive topology on the fly, etc, etc...

[jira] Issue Comment Edited: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558940#action_12558940 ] 

devaraj edited comment on HADOOP-1985 at 1/15/08 1:56 AM:
--------------------------------------------------------------

bq. I'm worried about the time and memory performance of this. Have you run a sort with dfs cluster == map/reduce cluster and compared running times and job tracker memory size? We've already seen cases where the current pollForNewTask causes performance problems... 

I assume you meant findNewTask giving performance problems. To clarify (for the benefit of others), the JobTracker would consume more memory due to two reasons:
1) The NetworkTopology is created here. This cannot be avoided, right?
2) Multiple cache levels are maintained. Currently (in the existing codebase), we maintain only one cache level (host to maps). This patch adds a cache at each level to make lookups efficient. The number of levels is currently a compile-time constant set to two (host, rack). But the caches are just mappings from Node to references to objects in the NetworkTopology. Are you referring to these additional caches when you say memory performance may be a problem?

The running time performance is helped by the caches. If the TIP is present in some cache, the complexity of finding it is O(num-levels), since it takes O(1) at each level to find a TIP for a TaskTracker, no? The linear search for a TIP (on a cache miss) is there even in the existing codebase. The only addition here is the lookups at the higher levels when the number of levels is more than 1.
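
To illustrate the O(num-levels) argument, here is a simplified sketch of the list-of-maps idea. TopoNode and PendingTask are hypothetical stand-ins for Node and TaskInProgress, and the sketch leaves out everything the patch has to handle around it (removing assigned tasks, checking that a TIP is still runnable, the linear-search fallback).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal stand-in for a topology node: a host whose parent is its rack.
class TopoNode {
    final String name;
    final TopoNode parent;
    TopoNode(String name, TopoNode parent) {
        this.name = name;
        this.parent = parent;
    }
}

// Placeholder for TaskInProgress.
class PendingTask {
    final String id;
    PendingTask(String id) { this.id = id; }
}

class LevelCaches {
    // caches.get(0): host -> tasks, caches.get(1): rack -> tasks, and so on.
    private final List<Map<TopoNode, List<PendingTask>>> caches = new ArrayList<>();

    LevelCaches(int numLevels) {
        for (int i = 0; i < numLevels; i++) {
            caches.add(new HashMap<>());
        }
    }

    // Record a task under the host holding its input and under each ancestor.
    void add(TopoNode host, PendingTask task) {
        TopoNode node = host;
        for (int level = 0; level < caches.size() && node != null; level++) {
            caches.get(level).computeIfAbsent(node, n -> new ArrayList<>()).add(task);
            node = node.parent;
        }
    }

    // One hash lookup per level, so a cache hit costs O(num-levels).
    PendingTask find(TopoNode trackerHost) {
        TopoNode node = trackerHost;
        for (int level = 0; level < caches.size() && node != null; level++) {
            List<PendingTask> tasks = caches.get(level).get(node);
            if (tasks != null && !tasks.isEmpty()) {
                return tasks.get(0);
            }
            node = node.parent;
        }
        return null; // cache miss: the caller falls back to a linear search
    }
}
```

The maps hold only references, so the extra memory per level is one map entry per node plus one list slot per cached task.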

I did run the sort with dfs-cluster == map/reduce-cluster and the numbers were very comparable. Nothing concerning there.

bq. It bothers me that the max levels is hard coded rather than configurable.

I was thinking that the most typical cases would require just two levels (host, rack), and that's why I made this a compile-time constant. But if it makes sense to make that runtime configurable, I can enable that behavior.

bq. From a style point of view, I probably would have defined a new class rather than use nested java.utils containers like List<Map<Node, List<TaskInProgress>>>. That way if we change the representation later it won't be scattered through the code. In particular, I can imagine wanting to have the data structure be something like: Map<String (rack name), RackInfo> and RackInfo has a Map<String (hostname), List<TaskInProgress> >. Or even more tree-like...

How about providing get/set APIs to the existing data structure? The data structure works for all cases with an arbitrary number of levels (host, rack, switch, datacenter, ...), since it is a list of mappings from Node to TIPs. I didn't want to introduce Strings in the mapping since the NetworkTopology provides a _Node_ abstraction for everything. If we went to Strings, we would have an additional step of getting the Node from the String name (and vice versa), parsing strings to get to the Node, etc., which can be easily avoided by having the mappings based on Node.

bq. Did you need to change the definition of findNewTask? I don't see it in the patch.

Yes, I changed the definition of findNewTask. In the patch look for _Find a new task to run._ The diff doesn't have the line _findNewTask_. It just has the comment above it.

bq. This needs user documentation in forrest.

I have that in the 1985.v6.patch. Look for cluster_setup.xml and hdfs_design.xml, where I talk about how rack config can be set up. Did you mean something else?

bq. The java doc on DNSToSwitchMapping.resolve should probably mention that they must cache if their operation is expensive. Although there isn't a way to clear or update that cache, which might be a problem at some point...

Agreed regarding the documentation on the cache part. The update of the cache could be handled by the implementation of DNSToSwitchMapping, no? I can imagine a case where the implementation starts a thread that periodically contacts some service and updates its cache. This is transparent to clients calling DNSToSwitchMapping.resolve.
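
As a sketch of that idea only: the class below keeps a snapshot of the table and replaces it from a background thread. RefreshingMapping, its Supplier-based source and the "/default-rack" fallback are all hypothetical; it does not implement the real DNSToSwitchMapping interface, and the "service" is just whatever the supplier returns.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical mapping whose cache is refreshed behind the caller's back.
class RefreshingMapping {
    static final String DEFAULT_RACK = "/default-rack"; // assumed fallback

    private final Supplier<Map<String, String>> source;
    // Replaced wholesale on refresh; volatile makes the new snapshot visible to readers.
    private volatile Map<String, String> cache = new HashMap<>();
    private ScheduledExecutorService scheduler;

    RefreshingMapping(Supplier<Map<String, String>> source) {
        this.source = source;
        refresh();
    }

    // Take a fresh snapshot from the source (for example, a directory service).
    void refresh() {
        cache = new HashMap<>(source.get());
    }

    // Start a daemon thread that refreshes the snapshot every periodSeconds (> 0).
    synchronized void start(long periodSeconds) {
        if (scheduler != null) {
            return;
        }
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "rack-mapping-refresh");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleAtFixedRate(() -> {
            try {
                refresh();
            } catch (RuntimeException e) {
                // Keep serving the stale snapshot; an uncaught exception would cancel the schedule.
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    synchronized void stop() {
        if (scheduler != null) {
            scheduler.shutdownNow();
            scheduler = null;
        }
    }

    // Callers never see the refresh; they only read the current snapshot.
    String resolve(String host) {
        return cache.getOrDefault(host, DEFAULT_RACK);
    }
}
```

A caller would construct it once, call start with a period in seconds, and keep calling resolve as before.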

bq. You don't really need the Scan example, you could use the GenericMRLoadGenerator with a -keepmap of 0.

Okay.

bq. In the longer term I think a configured mapping class would be useful. A class named org.apache.hadoop.net.ConfiguredNodeMapping that let you set the mapping in your config.

In the patch this is handled by a specific implementation of the DNSToSwitchMapping called StaticMapping, and that provides an API to set up the mapping from host to rackid (used in testcases). But I think I should be able to set things in the configuration and StaticMapping could initialize itself with the mapping provided there. I'll look at that.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch
>
>
> In order to implement switch locality in MapReduce, we need to have switch location in both the namenode and job tracker.  Currently the namenode asks the data nodes for this info and they run a local script to answer this question.  In our environment and others that I know of there is no reason to push this to each node.  It is easier to maintain a centralized script that maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch mappings and invokes a loadable class or a configurable system call to resolve unknown DNS to switch mappings.  We can then add this to the namenode to support the current block to switch mapping needs and simplify the data nodes.  We can also add this same callout to the job tracker and then implement rack locality logic there without needing to chane the filesystem API or the split planning API.
> Not only is this the least intrusive path to building racklocal MR I can ID, it is also future compatible to future infrastructures that may derive topology on the fly, etc, etc...

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538872 ] 

Allen Wittenauer commented on HADOOP-1985:
------------------------------------------

Just a few notes.

From an ops perspective, it is important that this mapping be highly pluggable in an easy way.  The ability to have hadoop call some sort of executable (not necessarily a script) means we can do fancy things with /etc/netmasks or LDAP lookups or ...  Ideally, every sort of mapping would have a callout rather than having one big one. KISS is important here.  [Remember, most admins--myself included--are not hardcore Java people.]

FWIW, most implementations of autofs include similar functionality called executable maps where the key is passed to an exec and the exec returns the location of the mount.  So the practice has at least a little bit of traction.  [In fact, auto.net aka /net on Linux uses this method.]
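
For the Java folks, a rough sketch of what such a callout could look like, in the spirit of an executable map: the key (hostname) is passed to an executable and the first line it prints is taken as the location. CachedCalloutMapping, the one-argument calling convention and the "/default-rack" fallback are assumptions for illustration, not what the patch does.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical mapping: cache known hosts, call out only for unknown ones.
class CachedCalloutMapping {
    static final String DEFAULT_RACK = "/default-rack"; // assumed fallback

    private final Function<String, String> callout;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    CachedCalloutMapping(Function<String, String> callout) {
        this.callout = callout;
    }

    String resolve(String host) {
        String cached = cache.get(host);
        if (cached != null) {
            return cached;
        }
        String rack = callout.apply(host);
        if (rack == null || rack.trim().isEmpty()) {
            return DEFAULT_RACK; // failures are not cached, so the next call retries
        }
        String trimmed = rack.trim();
        cache.put(host, trimmed);
        return trimmed;
    }

    // Callout that runs an executable with the host as its only argument and
    // returns the first line of its output, or null on any failure.
    static Function<String, String> executable(String path) {
        return host -> {
            try {
                Process p = new ProcessBuilder(path, host).redirectErrorStream(true).start();
                String first = null;
                try (BufferedReader out = new BufferedReader(
                        new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = out.readLine()) != null) { // drain so the child cannot block
                        if (first == null) {
                            first = line;
                        }
                    }
                }
                return p.waitFor() == 0 ? first : null;
            } catch (IOException e) {
                return null;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        };
    }
}
```

The executable path would come from site configuration, so swapping an /etc/netmasks lookup for an LDAP one means swapping the program, not touching any Java.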

Additionally, I think moving this functionality to the namenode makes this significantly easier to manage as a grid scales up.  There is also the issue of whether the namenode should 'trust' the datanode to report the proper location.  I understand that the datanode and namenode have to trust each other at some point during node bringup, but I think it makes a lot of sense to let the namenode be in charge of data locality.

Hopefully this was helpful.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>
> In order to implement switch locality in MapReduce, we need to have switch location in both the namenode and job tracker.  Currently the namenode asks the data nodes for this info and they run a local script to answer this question.  In our environment and others that I know of there is no reason to push this to each node.  It is easier to maintain a centralized script that maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch mappings and invokes a loadable class or a configurable system call to resolve unknown DNS to switch mappings.  We can then add this to the namenode to support the current block to switch mapping needs and simplify the data nodes.  We can also add this same callout to the job tracker and then implement rack locality logic there without needing to chane the filesystem API or the split planning API.
> Not only is this the least intrusive path to building racklocal MR I can ID, it is also future compatible to future infrastructures that may derive topology on the fly, etc, etc...

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Fix Version/s: 0.16.0

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch
>
>
> In order to implement switch locality in MapReduce, we need to have switch location in both the namenode and job tracker.  Currently the namenode asks the data nodes for this info and they run a local script to answer this question.  In our environment and others that I know of there is no reason to push this to each node.  It is easier to maintain a centralized script that maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch mappings and invokes a loadable class or a configurable system call to resolve unknown DNS to switch mappings.  We can then add this to the namenode to support the current block to switch mapping needs and simplify the data nodes.  We can also add this same callout to the job tracker and then implement rack locality logic there without needing to chane the filesystem API or the split planning API.
> Not only is this the least intrusive path to building racklocal MR I can ID, it is also future compatible to future infrastructures that may derive topology on the fly, etc, etc...

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Open  (was: Patch Available)

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch
>
>
> In order to implement switch locality in MapReduce, we need to have switch location in both the namenode and job tracker.  Currently the namenode asks the data nodes for this info and they run a local script to answer this question.  In our environment and others that I know of there is no reason to push this to each node.  It is easier to maintain a centralized script that maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch mappings and invokes a loadable class or a configurable system call to resolve unknown DNS to switch mappings.  We can then add this to the namenode to support the current block to switch mapping needs and simplify the data nodes.  We can also add this same callout to the job tracker and then implement rack locality logic there without needing to chane the filesystem API or the split planning API.
> Not only is this the least intrusive path to building racklocal MR I can ID, it is also future compatible to future infrastructures that may derive topology on the fly, etc, etc...

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554686 ] 

Devaraj Das commented on HADOOP-1985:
-------------------------------------

I ran the Scan benchmark attached in the patch (the benchmark just scans inputs; no sort/shuffle/reduce). 

The input data was generated on a cluster of ~300 machines. RandomWriter was run with the following config: 40 maps per host, each map configured to generate 1 GB, DFS block size 256 MB. The input data set was thus around 11.6 TB.

Another cluster of ~900 nodes, with its dfs pointing to the earlier 300 node cluster, was used to run the Scan benchmark. The number of maps was equal to the number of dfs blocks in the input.

The two clusters had common racks but no common nodes. With the rack-aware patch, the scan program took 25 minutes (with 90% rack-local tasks); without the patch, the scan took around 35 minutes, so the patch gave a ~30% improvement.





> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch
>
>
> In order to implement switch locality in MapReduce, we need to have switch location in both the namenode and job tracker.  Currently the namenode asks the data nodes for this info and they run a local script to answer this question.  In our environment and others that I know of there is no reason to push this to each node.  It is easier to maintain a centralized script that maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch mappings and invokes a loadable class or a configurable system call to resolve unknown DNS to switch mappings.  We can then add this to the namenode to support the current block to switch mapping needs and simplify the data nodes.  We can also add this same callout to the job tracker and then implement rack locality logic there without needing to chane the filesystem API or the split planning API.
> Not only is this the least intrusive path to building racklocal MR I can ID, it is also future compatible to future infrastructures that may derive topology on the fly, etc, etc...

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Attachment: 1985.new.patch

Attached is one version of the patch. It hasn't been tested on large clusters yet. Here are the main changes:
1) DFS part updated to use the newly defined DNSToSwitchMapping interface.
  1.1) The datanode doesn't send the switch info as part of registration; rather, the namenode gets that info through DNSToSwitchMapping.resolve.
2) The default implementation of DNSToSwitchMapping assumes script-based resolution (ScriptBasedMapping). If the script is not defined, then DEFAULT_RACK is assumed.
3) The JobTracker maintains the network topology and updates it (if required) whenever a tasktracker sends a heartbeat.
4) The JobInProgress maintains a compile-time-configurable number of task caches. For example, the first-level cache is the map of leaf-level Nodes to TIPs, the second level is the map of rack-level nodes to the TIPs in that rack, the third level is for the level above rack, and so on. The default number of caches is hardcoded to 2.
5) At runtime, findNewTask uses these caches to assign a task to a checked-in tasktracker.
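
As a rough illustration of points 1 and 2, here is a minimal sketch of a caching host-to-rack resolver that falls back to a default rack when no resolver is configured. It is not the code from the patch: SwitchMapping is a hypothetical stand-in for DNSToSwitchMapping, CachingSwitchMapping and the Function-based resolver are invented for the example, and the value "/default-rack" for DEFAULT_RACK is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical stand-in for the DNSToSwitchMapping interface named above. */
interface SwitchMapping {
  /** Returns one rack path per input host name, in the same order. */
  List<String> resolve(List<String> hostNames);
}

/** Caches known answers and calls a pluggable resolver only for unknown hosts. */
class CachingSwitchMapping implements SwitchMapping {
  static final String DEFAULT_RACK = "/default-rack"; // assumed value

  private final Map<String, String> cache = new ConcurrentHashMap<>();
  // Stands in for the script or loadable class; null means nothing is configured.
  private final Function<String, String> rawResolver;

  CachingSwitchMapping(Function<String, String> rawResolver) {
    this.rawResolver = rawResolver;
  }

  @Override
  public List<String> resolve(List<String> hostNames) {
    List<String> racks = new ArrayList<>(hostNames.size());
    for (String host : hostNames) {
      // The resolver runs at most once per host; later calls hit the cache.
      racks.add(cache.computeIfAbsent(host, this::lookup));
    }
    return racks;
  }

  private String lookup(String host) {
    if (rawResolver == null) {
      return DEFAULT_RACK; // no script defined
    }
    String rack = rawResolver.apply(host);
    return rack == null ? DEFAULT_RACK : rack; // unknown host
  }
}
```

Both the namenode and the JobTracker could share one such object, so a given host is resolved once per process rather than on every registration or heartbeat.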

Patch up for review.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>         Attachments: 1985.new.patch
>
>

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Open  (was: Patch Available)

Cancelling the patch to get the latest patch through Hudson.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch
>
>

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Patch Available  (was: Open)

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v11.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch
>
>

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560889#action_12560889 ] 

Hadoop QA commented on HADOOP-1985:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12373500/1985.v9.patch
against trunk revision r613499.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests -1.  The patch failed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1665/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1665/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1665/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1665/console

This message is automatically generated.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch
>
>

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Patch Available  (was: Open)

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch
>
>

[jira] Issue Comment Edited: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558940#action_12558940 ] 

devaraj edited comment on HADOOP-1985 at 1/14/08 10:59 PM:
---------------------------------------------------------------

bq. I'm worried about the time and memory performance of this. Have you run a sort with dfs cluster == map/reduce cluster and compared running times and job tracker memory size? We've already seen cases where the current pollForNewTask causes performance problems... 

I assume you meant findNewTask giving performance problems. To clarify (for the benefit of others), the JobTracker would consume more memory due to two reasons:
1) The NetworkTopology is created here. This cannot be avoided, right?
2) Multiple cache levels are maintained. Currently, we maintain only one cache level (host to maps). This patch adds one cache per level to do efficient lookups; the number of levels is a compile-time constant, currently set to two (host, rack). But the caches are just mappings from Node to references to objects in the NetworkTopology. Are you referring to these additional caches when you say memory performance may be a problem?

The running time performance is helped by the caches. It takes O(1) at each level to find a TIP for a TaskTracker, no? The linear search for a TIP (on a cache miss) is there even currently. The only additional thing here is the lookup when the level is more than 1.
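
To make the per-level lookup concrete, here is a minimal sketch of a list of Node-keyed maps, one per topology level, with a lookup that walks from the tracker's host up to its rack. It is an illustration under assumptions, not the patch's code: TopoNode is a simplified stand-in for the NetworkTopology Node, and LevelCaches, add and findClosest are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified stand-in for a topology node: a host's parent is its rack. */
class TopoNode {
  final String name;
  final TopoNode parent; // null at the top of this toy topology

  TopoNode(String name, TopoNode parent) {
    this.name = name;
    this.parent = parent;
  }
}

/** One map per level: index 0 is keyed by host, index 1 by rack, and so on. */
class LevelCaches<T> {
  private final List<Map<TopoNode, List<T>>> levels = new ArrayList<>();

  LevelCaches(int numLevels) {
    for (int i = 0; i < numLevels; i++) {
      levels.add(new HashMap<>());
    }
  }

  /** Registers a task under the host holding its input and under each ancestor. */
  void add(TopoNode host, T task) {
    TopoNode n = host;
    for (int i = 0; i < levels.size() && n != null; i++, n = n.parent) {
      levels.get(i).computeIfAbsent(n, k -> new ArrayList<>()).add(task);
    }
  }

  /** Returns a task as close to the tracker as possible, or null on a miss. */
  T findClosest(TopoNode trackerHost) {
    TopoNode n = trackerHost;
    for (int i = 0; i < levels.size() && n != null; i++, n = n.parent) {
      List<T> tasks = levels.get(i).get(n); // one hash lookup per level
      if (tasks != null && !tasks.isEmpty()) {
        return tasks.get(0);
      }
    }
    return null; // caller falls back to the existing linear search
  }
}
```

With two levels, a tracker on the same rack as the input but on a different host misses at level 0 and hits at level 1, which is the rack-local case.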

I did run the sort with dfs-cluster == map/reduce-cluster and the numbers were very comparable. Nothing concerning there..

bq. It bothers me that the max levels is hard coded rather than configurable.

I was thinking that the most typical cases would require just two levels (host, rack), and that's why I made this a compile-time constant. But if it makes sense to make that runtime configurable, I can enable that behavior.

bq. From a style point of view, I probably would have defined a new class rather than use nested java.util containers like List<Map<Node, List<TaskInProgress>>>. That way if we change the representation later it won't be scattered through the code. In particular, I can imagine wanting to have the data structure be something like: Map<String (rack name), RackInfo> and RackInfo has a Map<String (hostname), List<TaskInProgress> >. Or even more tree-like...

How about providing get/set APIs to the existing data structure? The data structure works for all cases with an arbitrary number of levels (host, rack, switch, datacenter, ...), since it is a list of mappings from Node to TIPs. I didn't want to introduce Strings in the mapping since the NetworkTopology provides a _Node_ abstraction for everything. If we went to Strings, we would have the additional step of getting the Node from the String name (and vice versa), parsing strings to get to the Node, etc., which can be easily avoided by having the mappings based on Node.

bq. Did you need to change the definition of findNewTask? I don't see it in the patch.

Yes, I changed the definition of findNewTask. In the patch look for _Find a new task to run._ The diff doesn't have the line _findNewTask_. It just has the comment above it.

bq. This needs user documentation in forrest.

I have that in the 1985.v6.patch. Look for cluster_setup.xml and hdfs_design.xml, where I talk about how rack config can be set up. Did you mean something else?

bq. The java doc on DNSToSwitchMapping.resolve should probably mention that they must cache if their operation is expensive. Although there isn't a way to clear or update that cache, which might be a problem at some point...

Agreed regarding the documentation on the cache part. The update of the cache could be handled by the implementation of DNSToSwitchMapping, no? I can imagine a case, where the implementation starts a thread that periodically contacts some service and updates its cache. This is transparent to clients calling DNSToSwitchMapping.resolve.
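
A minimal sketch of that idea, assuming a hypothetical topology source exposed as a Supplier of host-to-rack entries. RefreshingRackCache and its methods are invented for illustration and the "/default-rack" fallback is an assumption; only the java.util.concurrent calls are real APIs.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Keeps a host-to-rack snapshot and replaces it periodically in the background. */
class RefreshingRackCache {
  private final Supplier<Map<String, String>> source; // hypothetical topology service
  private volatile Map<String, String> snapshot = Collections.emptyMap();
  private ScheduledExecutorService scheduler;

  RefreshingRackCache(Supplier<Map<String, String>> source) {
    this.source = source;
    refresh(); // load once so resolve() has data before the first scheduled run
  }

  /** Copies the current state of the source and swaps it in atomically. */
  void refresh() {
    snapshot = new HashMap<>(source.get());
  }

  /** Starts a daemon thread that refreshes the snapshot at a fixed rate. */
  synchronized void start(long periodSeconds) {
    if (scheduler != null) {
      return; // already running
    }
    scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "rack-cache-refresh");
      t.setDaemon(true);
      return t;
    });
    scheduler.scheduleAtFixedRate(() -> {
      try {
        refresh();
      } catch (RuntimeException e) {
        // Keep the old snapshot; an uncaught exception would cancel future runs.
      }
    }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
  }

  synchronized void stop() {
    if (scheduler != null) {
      scheduler.shutdownNow();
      scheduler = null;
    }
  }

  /** Callers never see the refresh; they only read the latest snapshot. */
  String resolve(String host) {
    return snapshot.getOrDefault(host, "/default-rack");
  }
}
```

Swapping a whole snapshot through a volatile field keeps resolve lock-free, at the cost of rebuilding the map on every refresh.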

bq. You don't really need the Scan example, you could use the GenericMRLoadGenerator with a -keepmap of 0.

Okay.

bq. In the longer term I think a configured mapping class would be useful. A class named org.apache.hadoop.net.ConfiguredNodeMapping that let you set the mapping in your config.

In the patch this is handled by a specific implementation of the DNSToSwitchMapping called StaticMapping, and that provides an API to set up the mapping from host to rackid (used in testcases). But I think I should be able to set things in the configuration and StaticMapping could initialize itself with the mapping provided there. I'll look at that.
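
One way this could look, sketched under assumptions: the mapping arrives as a single configuration string of comma-separated host=rack pairs. The format, the class name ConfiguredRackMapping and the "/default-rack" fallback are all hypothetical; this is not StaticMapping's actual API.

```java
import java.util.HashMap;
import java.util.Map;

/** Builds a fixed host-to-rack table from one configuration string. */
class ConfiguredRackMapping {
  private final Map<String, String> hostToRack = new HashMap<>();

  /** @param spec for example "host1=/rack1,host2=/rack2" (hypothetical format) */
  ConfiguredRackMapping(String spec) {
    if (spec == null || spec.trim().isEmpty()) {
      return; // nothing configured: every host resolves to the default rack
    }
    for (String entry : spec.split(",")) {
      String[] kv = entry.trim().split("=", 2);
      if (kv.length != 2 || kv[0].trim().isEmpty() || kv[1].trim().isEmpty()) {
        throw new IllegalArgumentException("Bad mapping entry: " + entry);
      }
      hostToRack.put(kv[0].trim(), kv[1].trim());
    }
  }

  String resolve(String host) {
    return hostToRack.getOrDefault(host, "/default-rack");
  }
}
```

Failing fast on a malformed entry is a design choice here; silently skipping it would put hosts on the default rack without any warning.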

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch
>
>

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Attachment: 1985.v6.patch

Updated patch.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch
>
>

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558940#action_12558940 ] 

Devaraj Das commented on HADOOP-1985:
-------------------------------------

bq. I'm worried about the time and memory performance of this. Have you run a sort with dfs cluster == map/reduce cluster and compared running times and job tracker memory size? We've already seen cases where the current pollForNewTask causes performance problems... 

I assume you meant findNewTask giving performance problems. To clarify (for the benefit of others), the JobTracker would consume more memory due to two reasons:
1) The NetworkTopology is created here. This cannot be avoided, right?
2) Multiple cache levels are maintained. Currently, we maintain only one cache level (host to maps). This patch adds one cache per level to do efficient lookups; the number of levels is a compile-time constant, currently set to two (host, rack). But the caches are just mappings from Node to references to objects in the NetworkTopology. Are you referring to these additional caches when you say memory performance may be a problem?

The running time performance is helped by the caches. It takes O(1) at each level to find a TIP for a TaskTracker, no? The linear search for a TIP (on a cache miss) is there even currently. The only additional thing here is the lookup when the level is more than 1.

I did run the sort with dfs-cluster == map/reduce-cluster and the numbers were very comparable. Nothing concerning there..

bq. It bothers me that the max levels is hard coded rather than configurable.

I was thinking that the most typical cases would require just two levels (host, rack), and that's why I made this a compile-time constant. But if it makes sense to make that runtime configurable, I can enable that behavior.

bq. From a style point of view, I probably would have defined a new class rather than use nested java.util containers like List<Map<Node, List<TaskInProgress>>>. That way if we change the representation later it won't be scattered through the code. In particular, I can imagine wanting to have the data structure be something like:
Map<String (rack name), RackInfo> and RackInfo has a Map<String (hostname), List<TaskInProgress> >. Or even more tree-like...

How about providing get/set APIs to the existing data structure? The data structure works for all cases with an arbitrary number of levels (host, rack, switch, datacenter, ...), since it is a list of mappings from Node to TIPs. I didn't want to introduce Strings in the mapping since the NetworkTopology provides a _Node_ abstraction for everything. If we went to Strings, we would have the additional step of getting the Node from the String name (and vice versa), parsing strings to get to the Node, etc., which can be easily avoided by having the mappings based on Node.

bq. Did you need to change the definition of findNewTask? I don't see it in the patch.

Yes, I changed the definition of findNewTask. In the patch look for _Find a new task to run._ The diff doesn't have the line _findNewTask_. It just has the comment above it.

bq. This needs user documentation in forrest.

I have that in the 1985.v6.patch. Look for cluster_setup.xml and hdfs_design.xml, where I talk about how rack config can be set up. Did you mean something else?

bq. The java doc on DNSToSwitchMapping.resolve should probably mention that they must cache if their operation is expensive. Although there isn't a way to clear or update that cache, which might be a problem at some point...

Agreed regarding the documentation on the cache part. The update of the cache could be handled by the implementation of DNSToSwitchMapping, no? I can imagine a case, where the implementation starts a thread that periodically contacts some service and updates its cache. This is transparent to clients calling DNSToSwitchMapping.resolve.

bq. You don't really need the Scan example, you could use the GenericMRLoadGenerator with a -keepmap of 0.

Okay.

bq. In the longer term I think a configured mapping class would be useful. A class named
org.apache.hadoop.net.ConfiguredNodeMapping that let you set the mapping in your config.

In the patch this is handled by a specific implementation of the DNSToSwitchMapping called StaticMapping, and that provides an API to set up the mapping from host to rackid (used in testcases). But I think I should be able to set things in the configuration and StaticMapping could initialize itself with the mapping provided there. I'll look at that.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch
>
>

[jira] Assigned: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das reassigned HADOOP-1985:
-----------------------------------

    Assignee: Devaraj Das

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "eric baldeschwieler (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538857 ] 

eric baldeschwieler commented on HADOOP-1985:
---------------------------------------------

works for me

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Open  (was: Patch Available)

core-tests seems to have failed, but I am not able to find out what failed using the link http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1648/testReport/ . Also, the tests passed on my machine. Cancelling the patch to run it through Hudson again.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch

[jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12563047#action_12563047 ] 

Nigel Daley commented on HADOOP-1985:
-------------------------------------

I just added the release audit to the patch process. It looks for an increase in the number of files that don't have proper license headers. This patch is missing one for src/java/org/apache/hadoop/net/ScriptBasedMapping.java, which is why it got a -1. Don't worry about fixing this for now. I'll be fixing a number of these before we release 0.16.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v11.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Patch Available  (was: Open)

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Component/s: mapred
                 dfs

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Status: Open  (was: Patch Available)

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch

[jira] Updated: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1985:
--------------------------------

    Attachment: 1985.v3.patch

Fixed an issue in the Scan benchmark.

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch, 1985.v3.patch