Posted to common-issues@hadoop.apache.org by "Christian Kunz (JIRA)" <ji...@apache.org> on 2009/11/29 19:38:20 UTC

[jira] Created: (HADOOP-6397) topology script called with host names instead of IP addresses

topology script called with host names instead of IP addresses
--------------------------------------------------------------

                 Key: HADOOP-6397
                 URL: https://issues.apache.org/jira/browse/HADOOP-6397
             Project: Hadoop Common
          Issue Type: Bug
          Components: util
    Affects Versions: 0.20.1
            Reporter: Christian Kunz


According to the 'Hadoop Rack Awareness' section of http://hadoop.apache.org/common/docs/r0.20.0/cluster_setup.html, the script or program configured through topology.script.file.name can assume that names are passed in as IP addresses.

This seems to work for HDFS, but the rack awareness of the JobTracker does not always work, depending on the host names of the cluster, because of a bug in the method normalizeHostName of org.apache.hadoop.net.NetUtils.java:

The method returns the name as-is when it decides that the string passed in is already an IP address, but the check it actually performs is only whether the string starts with a valid hex character! A host name that begins with a digit or with a letter from a to f is therefore passed to the topology script unresolved.
There is no need to check at all, because InetAddress.getByName is smart enough not to do a DNS lookup when the name is an IP address.
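
To make the report concrete, here is a minimal Java sketch of the behaviour described above. It is not the Hadoop source: the class and method names (NormalizeHostNameSketch, firstCharIsHexDigit, normalize) and the example host names are made up for illustration. It only assumes that the faulty check looks at the first character, as described, and that the fix delegates to InetAddress.getByName.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class NormalizeHostNameSketch {

    // The kind of check the report describes: the name is treated as an IP
    // address if its first character is a valid hex digit.
    // (Assumes a non-empty name; charAt(0) throws on an empty string.)
    public static boolean firstCharIsHexDigit(String name) {
        return Character.digit(name.charAt(0), 16) != -1;
    }

    // Suggested alternative: always delegate to InetAddress.getByName, which
    // parses IP literals itself, and fall back to the input when the name
    // cannot be resolved.
    public static String normalize(String name) {
        try {
            return InetAddress.getByName(name).getHostAddress();
        } catch (UnknownHostException e) {
            return name;
        }
    }

    public static void main(String[] args) {
        // Hypothetical names: the second one starts with 'c' and is wrongly
        // classified as an IP address by the first-character check.
        System.out.println(firstCharIsHexDigit("10.1.2.3"));           // true
        System.out.println(firstCharIsHexDigit("cafe01.example.com")); // true
        System.out.println(firstCharIsHexDigit("node7.example.com"));  // false
        // An IP literal is returned unchanged, without a lookup.
        System.out.println(normalize("127.0.0.1"));
    }
}
```

With the first-character check, whether a host is resolved depends on the spelling of its name, which matches the "depending on the host names of the cluster" symptom.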

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6397) topology script called with host names instead of IP addresses

Posted by "Scott Chen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806204#action_12806204 ] 

Scott Chen commented on HADOOP-6397:
------------------------------------

Added a unit test and used an IP pattern to check whether the name is already an IP address.

@Christian: I am also changing our internal trunk for this part, so I have modified the patch. I hope you don't mind.
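
For readers without the attachment, a pattern-based check could look roughly like the sketch below. The attached patch is not reproduced here, so the regular expression and the names IpPatternSketch and isIpv4Literal are assumptions; the sketch covers dotted-quad IPv4 only.

```java
import java.util.regex.Pattern;

public class IpPatternSketch {

    // One decimal octet in the range 0-255, without leading zeros.
    private static final String OCTET =
        "(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])";

    // Four octets separated by dots; matches() below requires the whole
    // string to fit, so trailing text such as a domain is rejected.
    private static final Pattern IPV4 =
        Pattern.compile(OCTET + "\\." + OCTET + "\\." + OCTET + "\\." + OCTET);

    public static boolean isIpv4Literal(String name) {
        return IPV4.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isIpv4Literal("127.0.0.1"));          // true
        System.out.println(isIpv4Literal("cafe01.example.com")); // false
        System.out.println(isIpv4Literal("256.1.1.1"));          // false
    }
}
```

Compared with the first-character test, a whole-string pattern cannot mistake a host name such as cafe01.example.com for an address. It does not recognise IPv6 literals, which would still go through the resolver.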



[jira] Commented: (HADOOP-6397) topology script called with host names instead of IP addresses

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806214#action_12806214 ] 

Hadoop QA commented on HADOOP-6397:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12431738/HADOOP-6397-v2.patch
  against trunk revision 904339.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/298/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/298/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/298/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/298/console

This message is automatically generated.



[jira] Commented: (HADOOP-6397) topology script called with host names instead of IP addresses

Posted by "Scott Chen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806187#action_12806187 ] 

Scott Chen commented on HADOOP-6397:
------------------------------------

The patch looks good to me.
A simple unit test could use "localhost" and "127.0.0.1" for basic checking.

In FSNamesystem.java, resolveNetworkLocation():
    if (dnsToSwitchMapping instanceof CachedDNSToSwitchMapping) {
      // get the node's IP address
      names.add(node.getHost());
    } else {
This kind of logic should be hidden behind the interface, which should always receive the same input (host names).
The code above passes an IP address to CachedDNSToSwitchMapping but passes a host name to other implementations.
It would be good if this part could also be fixed in this patch.
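
One way to read this suggestion is sketched below in Java. Everything here is hypothetical: RackMapping, NormalizingMapping, the fixed name table, and the rack names stand in for the real Hadoop types and data, and the sketch makes no claim about how DNSToSwitchMapping is actually structured. The point is only that callers always pass host names and the conversion to IP addresses happens inside the implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

public class UniformInputSketch {

    /** Hypothetical rack-mapping interface: callers always pass host names. */
    public interface RackMapping {
        List<String> resolve(List<String> hostNames);
    }

    /** Wraps a mapping that expects IP addresses and converts names itself. */
    public static final class NormalizingMapping implements RackMapping {
        private final UnaryOperator<String> toIp;
        private final RackMapping ipBased;

        public NormalizingMapping(UnaryOperator<String> toIp, RackMapping ipBased) {
            this.toIp = toIp;
            this.ipBased = ipBased;
        }

        @Override
        public List<String> resolve(List<String> hostNames) {
            List<String> ips = new ArrayList<>();
            for (String hostName : hostNames) {
                ips.add(toIp.apply(hostName));
            }
            return ipBased.resolve(ips);
        }
    }

    /** Demo wiring with a fixed name table instead of DNS. */
    public static List<String> demo() {
        Map<String, String> table = new HashMap<>();
        table.put("cafe01.example.com", "10.1.2.3");

        // Stand-in for a topology script that only understands IP addresses.
        RackMapping ipBased = ips -> {
            List<String> racks = new ArrayList<>();
            for (String ip : ips) {
                racks.add(ip.startsWith("10.1.") ? "/rack1" : "/default-rack");
            }
            return racks;
        };

        RackMapping mapping =
            new NormalizingMapping(name -> table.getOrDefault(name, name), ipBased);
        return mapping.resolve(
            Arrays.asList("cafe01.example.com", "node7.example.com"));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [/rack1, /default-rack]
    }
}
```

With this shape the caller no longer needs an instanceof test to decide what to pass in.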



[jira] Updated: (HADOOP-6397) topology script called with host names instead of IP addresses

Posted by "Scott Chen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen updated HADOOP-6397:
-------------------------------

    Affects Version/s:     (was: 0.20.1)
                       0.22.0
               Status: Patch Available  (was: Open)



[jira] Updated: (HADOOP-6397) topology script called with host names instead of IP addresses

Posted by "Jakob Homan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jakob Homan updated HADOOP-6397:
--------------------------------

        Status: Resolved  (was: Patch Available)
    Resolution: Duplicate

Resolving as duplicate as this was independently discovered and fixed in HADOOP-6682 (although without the unit test, which would have been nice).  Further improvement to this function was delegated to HADOOP-6824.



[jira] Commented: (HADOOP-6397) topology script called with host names instead of IP addresses

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783599#action_12783599 ] 

Steve Loughran commented on HADOOP-6397:
----------------------------------------

Well spotted. Assuming that any hostname beginning with a-f is hex is not that sensible.

I thought initially that this would be trivial to write tests for, but of course, it depends on DNS working and the hostname passed in resolving. 
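
One possible way around the DNS dependency, sketched in Java under stated assumptions: pass the lookup in as a parameter so that a test can substitute a fake. The Resolver interface, the fake() helper, and the example host names are hypothetical and not part of NetUtils; InetAddress.getByAddress builds an address object without any name-service lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class InjectableResolverSketch {

    /** Hypothetical seam that lets a test replace real DNS. */
    public interface Resolver {
        InetAddress resolve(String name) throws UnknownHostException;
    }

    /** Normalizes a name to an IP address using the supplied resolver. */
    public static String normalize(String name, Resolver resolver) {
        try {
            return resolver.resolve(name).getHostAddress();
        } catch (UnknownHostException e) {
            return name; // unresolved names pass through unchanged
        }
    }

    /** Fake resolver: knows one host, rejects the rest, never touches DNS. */
    public static Resolver fake() {
        return name -> {
            if (name.equals("cafe01.example.com")) {
                return InetAddress.getByAddress(name, new byte[] {10, 1, 2, 3});
            }
            throw new UnknownHostException(name);
        };
    }

    public static void main(String[] args) {
        System.out.println(normalize("cafe01.example.com", fake()));  // 10.1.2.3
        System.out.println(normalize("unknown.example.com", fake())); // unknown.example.com
        // Production code could pass the real lookup instead:
        System.out.println(normalize("127.0.0.1", InetAddress::getByName));
    }
}
```

A test written against the fake exercises the "name starts with a hex character" case deterministically, without requiring that any particular host name resolves on the build machine.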



[jira] Updated: (HADOOP-6397) topology script called with host names instead of IP addresses

Posted by "Christian Kunz (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Kunz updated HADOOP-6397:
-----------------------------------

    Attachment: HADOOP-6397.patch

Patch for trunk. It also works for hadoop-0.20.1 when "java" is replaced by "core" in the file names.



[jira] Updated: (HADOOP-6397) topology script called with host names instead of IP addresses

Posted by "Scott Chen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen updated HADOOP-6397:
-------------------------------

    Attachment: HADOOP-6397-v2.patch



[jira] Commented: (HADOOP-6397) topology script called with host names instead of IP addresses

Posted by "Scott Chen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806192#action_12806192 ] 

Scott Chen commented on HADOOP-6397:
------------------------------------

Sorry, I forgot that FSNamesystem.java is actually in HDFS. We should file another JIRA for that after this one is fixed.
