Posted to common-issues@hadoop.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2010/04/25 01:41:50 UTC

[jira] Created: (HADOOP-6722) NetUtils.connect should check that it hasn't connected a socket to itself

NetUtils.connect should check that it hasn't connected a socket to itself
-------------------------------------------------------------------------

                 Key: HADOOP-6722
                 URL: https://issues.apache.org/jira/browse/HADOOP-6722
             Project: Hadoop Common
          Issue Type: Bug
          Components: util
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon


I had no idea this was possible, but it turns out that a TCP connection will be established in the rare case that the local side of the socket binds to the ephemeral port that you later try to connect to. This can show up on very rare occasions when an RPC client tries to connect to a daemon running on the same node while that daemon is down. To see what I'm talking about, run "while true ; do telnet localhost 60020 ; done" on a multicore box and wait several minutes.

This can be easily detected in NetUtils.connect by making sure the local address/port is not equal to the remote address/port.
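
For illustration, the check described above could look roughly like the following. This is a minimal sketch and not the contents of the attached patch: the class name SelfConnectCheck and both method names are hypothetical, and only standard java.net calls are used.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper, not the real NetUtils code.
public class SelfConnectCheck {

    /** Returns true when the socket's local address/port equals its remote address/port. */
    public static boolean isSelfConnected(Socket socket) {
        return socket.isConnected()
            && socket.getLocalPort() == socket.getPort()
            && socket.getLocalAddress().equals(socket.getInetAddress());
    }

    /** Connects, then rejects the connection if the socket ended up talking to itself. */
    public static void connect(Socket socket, InetSocketAddress endpoint, int timeoutMs)
            throws IOException {
        socket.connect(endpoint, timeoutMs);
        if (isSelfConnected(socket)) {
            socket.close();
            throw new ConnectException(
                "Socket connected to itself at " + endpoint
                + "; the target is probably not listening");
        }
    }

    public static void main(String[] args) throws IOException {
        // Normal case: a real listener on an ephemeral loopback port.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket()) {
            connect(client,
                new InetSocketAddress(server.getInetAddress(), server.getLocalPort()), 1000);
            System.out.println("self-connected: " + isSelfConnected(client));
        }
    }
}
```

Throwing ConnectException makes the self-connection look like an ordinary "nothing is listening" failure, so a caller's existing retry logic would presumably handle it without changes.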

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6722) NetUtils.connect should check that it hasn't connected a socket to itself

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-6722:
------------------------------

           Status: Resolved  (was: Patch Available)
     Hadoop Flags: [Reviewed]
    Fix Version/s: 0.22.0
       Resolution: Fixed

I've just committed this. Thanks Todd!


[jira] Commented: (HADOOP-6722) NetUtils.connect should check that it hasn't connected a socket to itself

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860743#action_12860743 ] 

Todd Lipcon commented on HADOOP-6722:
-------------------------------------

btw, Allen: I was able to reproduce on OSX. It just takes a while sometimes.


[jira] Commented: (HADOOP-6722) NetUtils.connect should check that it hasn't connected a socket to itself

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860742#action_12860742 ] 

Todd Lipcon commented on HADOOP-6722:
-------------------------------------

Hudson bot isn't commenting automatically anymore, but results here:
hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/478/

     [exec] +1 overall.  Here are the results of testing the latest attachment
     [exec]   http://issues.apache.org/jira/secure/attachment/12442761/hadoop-6722.txt
     [exec]   against trunk revision 937577.
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 2 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
     [exec] 
     [exec]     +1 core tests.  The patch passed core unit tests.
     [exec] 
     [exec]     +1 contrib tests.  The patch passed contrib unit tests.


[jira] Commented: (HADOOP-6722) NetUtils.connect should check that it hasn't connected a socket to itself

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860632#action_12860632 ] 

Hadoop QA commented on HADOOP-6722:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12442761/hadoop-6722.txt
  against trunk revision 937577.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 2 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/478/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/478/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/478/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/478/console

This message is automatically generated.


[jira] Commented: (HADOOP-6722) NetUtils.connect should check that it hasn't connected a socket to itself

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860631#action_12860631 ] 

Allen Wittenauer commented on HADOOP-6722:
------------------------------------------

FWIW, I can't duplicate this on Solaris or Mac OS X, both of which are based upon BSD sockets.  So I'm guessing this is a bug in the Linux TCP/IP stack.


[jira] Updated: (HADOOP-6722) NetUtils.connect should check that it hasn't connected a socket to itself

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-6722:
--------------------------------

    Status: Patch Available  (was: Open)


[jira] Commented: (HADOOP-6722) NetUtils.connect should check that it hasn't connected a socket to itself

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861236#action_12861236 ] 

Tom White commented on HADOOP-6722:
-----------------------------------

+1


[jira] Commented: (HADOOP-6722) NetUtils.connect should check that it hasn't connected a socket to itself

Posted by "Hong Tang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860633#action_12860633 ] 

Hong Tang commented on HADOOP-6722:
-----------------------------------

http://www.rampa.sk/static/tcpLoopConnect.html

This is not a bug, but the usefulness of this feature is certainly questionable.


[jira] Commented: (HADOOP-6722) NetUtils.connect should check that it hasn't connected a socket to itself

Posted by "Hong Tang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860634#action_12860634 ] 

Hong Tang commented on HADOOP-6722:
-----------------------------------

Also, it seems somebody experienced this problem on FreeBSD too: http://osdir.com/ml/freebsd.devel.hackers/2002-05/msg00209.html


[jira] Updated: (HADOOP-6722) NetUtils.connect should check that it hasn't connected a socket to itself

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-6722:
--------------------------------

    Attachment: hadoop-6722.txt

Here's a patch which detects this situation and throws a ConnectException.

The test case manufactures the problem by binding to an ephemeral port and then trying to connect to itself.

(fwiw, I actually did run into this issue while testing HBase under failure injection scenarios - a client tried to open an RPC connection to the HBase server, but got itself instead, and was very unhappy)
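
A rough sketch of how such a test could manufacture the problem is below. It is an assumption-laden illustration, not the test from hadoop-6722.txt: the class SelfConnectRepro and the method trySelfConnect are hypothetical names, and whether the self-connection is actually established depends on the operating system's TCP stack, so the outcome is reported rather than asserted.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical reproduction helper, not the test shipped with the patch.
public class SelfConnectRepro {

    /**
     * Binds a socket to an ephemeral loopback port, then connects it to its own
     * address. Returns true if the OS established the self-connection, false if
     * the connect attempt was refused or timed out.
     */
    public static boolean trySelfConnect() throws IOException {
        try (Socket socket = new Socket()) {
            // Port 0 asks the OS to pick a free ephemeral port.
            socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            InetSocketAddress self =
                new InetSocketAddress(socket.getLocalAddress(), socket.getLocalPort());
            try {
                socket.connect(self, 1000);
            } catch (IOException e) {
                return false; // this platform did not allow the self-connection
            }
            // Same condition a connect-time check would test for.
            return socket.getLocalPort() == socket.getPort()
                && socket.getLocalAddress().equals(socket.getInetAddress());
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("self-connect established: " + trySelfConnect());
    }
}
```

Binding first makes the collision deterministic, so there is no need to wait for the kernel to happen to pick the target port as in the telnet loop.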


[jira] Commented: (HADOOP-6722) NetUtils.connect should check that it hasn't connected a socket to itself

Posted by "Hong Tang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860624#action_12860624 ] 

Hong Tang commented on HADOOP-6722:
-----------------------------------

I was bitten by this "feature" of TCP in my past life too. :)
