Posted to common-issues@hadoop.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2013/08/27 22:24:53 UTC

[jira] [Updated] (HADOOP-9898) Set SO_KEEPALIVE on all our sockets

     [ https://issues.apache.org/jira/browse/HADOOP-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-9898:
--------------------------------

    Attachment: hadoop-9898.txt
    
> Set SO_KEEPALIVE on all our sockets
> -----------------------------------
>
>                 Key: HADOOP-9898
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9898
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc, net
>    Affects Versions: 3.0.0
>            Reporter: Todd Lipcon
>            Priority: Minor
>         Attachments: hadoop-9898.txt
>
>
> We recently saw a case where network problems between slaves and the NN caused ESTABLISHED TCP connections to pile up and leak on the NN side. It looks like the RST packets were getting dropped, so the clients thought the connections were closed while they hung open forever on the server.
> Setting the SO_KEEPALIVE option on our sockets would prevent this kind of leak from going unchecked.
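
For illustration only, here is a minimal sketch of what enabling SO_KEEPALIVE looks like with the plain java.net API. It is not the content of the attached patch; the class and method names (KeepAliveSketch, acceptWithKeepAlive, connectWithKeepAlive) are hypothetical. Note that the option only asks the OS to probe idle connections; the probe timing is an OS setting (on Linux the default idle time before the first probe is two hours), so a dead peer is detected eventually rather than immediately.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper, not taken from the Hadoop code base.
class KeepAliveSketch {

    // Server side: accept a connection and enable SO_KEEPALIVE on it, so the
    // OS eventually notices a peer that vanished without a FIN/RST arriving.
    public static Socket acceptWithKeepAlive(ServerSocket server) throws IOException {
        Socket s = server.accept();
        s.setKeepAlive(true);
        return s;
    }

    // Client side: enable SO_KEEPALIVE before connecting.
    public static Socket connectWithKeepAlive(InetAddress host, int port) throws IOException {
        Socket s = new Socket();
        s.setKeepAlive(true);
        s.connect(new InetSocketAddress(host, port));
        return s;
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the OS pick a free port for this demo.
        try (ServerSocket server = new ServerSocket(0, 50, loopback);
             Socket client = connectWithKeepAlive(loopback, server.getLocalPort());
             Socket accepted = acceptWithKeepAlive(server)) {
            System.out.println("client keepalive: " + client.getKeepAlive());
            System.out.println("server keepalive: " + accepted.getKeepAlive());
        }
    }
}
```

In the scenario described above, the server-side call is the one that matters: it is the accepted sockets on the NN that would otherwise stay ESTABLISHED indefinitely.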

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira