Posted to common-dev@hadoop.apache.org by "Aroop Maliakkal (JIRA)" <ji...@apache.org> on 2008/03/03 12:24:51 UTC

[jira] Created: (HADOOP-2924) HOD is trying to bring up task tracker on port which is already in close_wait state

HOD is trying to bring up task tracker on  port which is already in close_wait state
------------------------------------------------------------------------------------

                 Key: HADOOP-2924
                 URL: https://issues.apache.org/jira/browse/HADOOP-2924
             Project: Hadoop Core
          Issue Type: Improvement
          Components: contrib/hod
            Reporter: Aroop Maliakkal


When bringing up a task tracker on a randomly chosen port, HOD does not check whether that port is in the CLOSE_WAIT state. As a result, starting the task tracker fails with an address bind error on that port. We can avoid this error by checking for CLOSE_WAIT on the port before starting the task tracker.



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-2924) HOD is trying to bring up task tracker on port which is already in close_wait state

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-2924:
-------------------------------------

        Fix Version/s: 0.17.0
             Assignee: Vinod Kumar Vavilapalli
             Priority: Critical  (was: Major)
           Issue Type: Bug  (was: Improvement)
    Affects Version/s: 0.16.0



[jira] Commented: (HADOOP-2924) HOD is trying to bring up task tracker on port which is already in close_wait state

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579414#action_12579414 ] 

Hudson commented on HADOOP-2924:
--------------------------------

Integrated in Hadoop-trunk #431 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/431/])



[jira] Commented: (HADOOP-2924) HOD is trying to bring up task tracker on port which is already in close_wait state

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576901#action_12576901 ] 

Hemanth Yamijala commented on HADOOP-2924:
------------------------------------------

A couple of comments:

- We should print the error message from socket.error. This will help us debug problems, given how fundamental a change we are making to the port detection method.
- We should also decrement the retry count when we get socket.error. This is not done in the current code either, but I think it is clearly intended.
- I recommend removing the unused methods. I have checked that they are indeed unused.
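
The first two suggestions can be sketched as a retry loop. This is only an illustration and not the code in the attached patch: the function name find_free_port, its parameters, and the log argument are hypothetical, and it assumes the bind-based port check discussed in this issue.

```python
import random
import socket
import sys


def find_free_port(low, high, retries=30, host="", log=sys.stderr):
    """Return a port in [low, high] that could be bound, or None.

    Hypothetical helper: the name, signature and logging are assumptions
    made for illustration, not HOD's actual API.
    """
    while retries > 0:
        port = random.randint(low, high)
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            s.bind((host, port))
            return port
        except socket.error as err:
            # Report the underlying error so that failures can be debugged.
            log.write("could not bind to port %d: %s\n" % (port, err))
            # Count the failed attempt so the loop is guaranteed to end.
            retries -= 1
        finally:
            s.close()
    return None
```

Without the decrement, a host on which every candidate port is busy would keep the loop running forever. With it, the caller gets None after the configured number of attempts and can report the failure.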



[jira] Commented: (HADOOP-2924) HOD is trying to bring up task tracker on port which is already in close_wait state

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578920#action_12578920 ] 

Hadoop QA commented on HADOOP-2924:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12377678/HADOOP-2924.1
against trunk revision 619744.

    @author +1.  The patch does not contain any @author tags.

    tests included -1.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1962/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1962/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1962/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1962/console

This message is automatically generated.



[jira] Updated: (HADOOP-2924) HOD is trying to bring up task tracker on port which is already in close_wait state

Posted by "Vinod Kumar Vavilapalli (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated HADOOP-2924:
--------------------------------------------

    Attachment: HADOOP-2924.1

Included the above changes. Also increased the default number of retries from 30 to 900 (which takes around 6.2 secs in the worst case, where none of the 900 ports tried is available).



[jira] Commented: (HADOOP-2924) HOD is trying to bring up task tracker on port which is already in close_wait state

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575574#action_12575574 ] 

Hemanth Yamijala commented on HADOOP-2924:
------------------------------------------

This could also happen if the host is already using the port as the source port for a connection.



[jira] Updated: (HADOOP-2924) HOD is trying to bring up task tracker on port which is already in close_wait state

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-2924:
--------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Vinod!



[jira] Updated: (HADOOP-2924) HOD is trying to bring up task tracker on port which is already in close_wait state

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-2924:
-------------------------------------

    Status: Patch Available  (was: Open)

Running through Hudson.



[jira] Updated: (HADOOP-2924) HOD is trying to bring up task tracker on port which is already in close_wait state

Posted by "Vinod Kumar Vavilapalli (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated HADOOP-2924:
--------------------------------------------

    Attachment: HADOOP-2924

This happens because of the way HOD currently looks for a free port: it tries to connect to the port and, if it gets an exception, treats the port as free. That check gives the wrong answer for ports in use as source ports and for ports in the CLOSE_WAIT state, as noted above.

This patch changes the implementation to test for a free port by binding to it and checking that no bind exception is raised. I tested this behaviour, and also ran tests to make sure that ports found free this way are still free and usable by the time the Hadoop daemons actually bind to them.

Marked some unused methods as UNUSED.
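
The difference between the two checks can be reproduced with a small script. This is a sketch under stated assumptions, not the patch itself: all function names here are hypothetical, it runs against the loopback address only, and it assumes the default socket options (no SO_REUSEADDR).

```python
import socket

HOST = "127.0.0.1"


def connect_probe_says_free(port):
    """Connect-based check: a failed connect is taken to mean 'free'."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(1.0)
    try:
        s.connect((HOST, port))
        return False  # something accepted the connection
    except socket.error:
        return True   # nothing is listening, so the check assumes 'free'
    finally:
        s.close()


def bind_probe_says_free(port):
    """Bind-based check: the port is free only if bind() succeeds."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((HOST, port))
        return True
    except socket.error:
        return False
    finally:
        s.close()


def make_close_wait_port():
    """Leave an accepted connection half-closed on a port nobody listens on."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((HOST, 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect((HOST, port))
    conn, _ = listener.accept()
    client.close()     # the peer closes first, so conn moves to CLOSE_WAIT
    listener.close()   # no listener is left on the port
    return port, conn  # conn is deliberately left open by the caller


def make_source_port():
    """Use a port as the source port of an outgoing connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((HOST, 0))
    listener.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.bind((HOST, 0))  # pick the source port explicitly
    client.connect(listener.getsockname())
    return client.getsockname()[1], (listener, client)


if __name__ == "__main__":
    for name, make in (("CLOSE_WAIT", make_close_wait_port),
                       ("source port", make_source_port)):
        port, keep_alive = make()
        print("%s port %d: connect check says free=%s, bind check says free=%s"
              % (name, port, connect_probe_says_free(port),
                 bind_probe_says_free(port)))
```

In both cases nothing is listening on the port, so the connect-based check is refused and reports the port as free, while bind() fails because the port still belongs to an open socket. A socket that was only bound and then closed leaves no lingering state behind, which is consistent with the observation that ports found free this way remain usable by the daemons.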



[jira] Commented: (HADOOP-2924) HOD is trying to bring up task tracker on port which is already in close_wait state

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578374#action_12578374 ] 

Hemanth Yamijala commented on HADOOP-2924:
------------------------------------------

+1 for the changes.

Going to mark this Patch Available so that it runs through Hudson.
