Posted to common-issues@hadoop.apache.org by "Daryn Sharp (JIRA)" <ji...@apache.org> on 2013/09/12 19:46:55 UTC

[jira] [Updated] (HADOOP-9955) RPC idle connection closing is extremely inefficient

     [ https://issues.apache.org/jira/browse/HADOOP-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HADOOP-9955:
--------------------------------

    Attachment: HADOOP-9955.patch

Here's a proposed patch with no additional tests.  Existing tests verify that idle connections are closed, but I can add more if the approach is acceptable and the reviewer feels more tests are necessary.

I'm using a background TimerTask to sweep a thread-safe hash set.  The hash set makes connection removal cheap and provides a thread-safe iterator for the background timer.
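
For readers without the patch at hand, here is a minimal sketch of the general idea: a timer-driven sweep over a concurrent set. It is not the code in HADOOP-9955.patch. The names Conn, ConnectionManager, sweep, and maxIdleMillis are hypothetical, and building the set with Collections.newSetFromMap over a ConcurrentHashMap is an assumption about one reasonable way to get a thread-safe hash set.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.Set;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for an RPC connection; not Hadoop's actual class. */
class Conn {
    private volatile long lastContact;
    private volatile boolean closed;

    Conn(long now) { lastContact = now; }
    void touch(long now) { lastContact = now; }
    boolean isIdle(long now, long maxIdleMillis) { return now - lastContact > maxIdleMillis; }
    void close() { closed = true; }
    boolean isClosed() { return closed; }
}

/** Hypothetical manager that tracks connections and sweeps idle ones. */
class ConnectionManager {
    // Set view over a ConcurrentHashMap: expected O(1) add/remove, and an
    // iterator that is safe to use while other threads modify the set.
    private final Set<Conn> connections =
        Collections.newSetFromMap(new ConcurrentHashMap<Conn, Boolean>());
    private final Timer timer = new Timer("idle-sweeper", true); // daemon thread
    private final long maxIdleMillis;

    ConnectionManager(long maxIdleMillis) { this.maxIdleMillis = maxIdleMillis; }

    void add(Conn c) { connections.add(c); }
    boolean remove(Conn c) { return connections.remove(c); }
    int size() { return connections.size(); }

    /** Closes and removes every connection idle at time 'now'; returns the count. */
    int sweep(long now) {
        int closedCount = 0;
        for (Iterator<Conn> it = connections.iterator(); it.hasNext();) {
            Conn c = it.next();
            if (c.isIdle(now, maxIdleMillis)) {
                it.remove(); // no global lock; other threads can still add/remove
                c.close();
                closedCount++;
            }
        }
        return closedCount;
    }

    /** Schedules the sweep on the background timer. */
    void start(long periodMillis) {
        timer.schedule(new TimerTask() {
            @Override public void run() { sweep(System.currentTimeMillis()); }
        }, periodMillis, periodMillis);
    }

    void stop() { timer.cancel(); }
}
```

In this sketch the listener thread never performs the scan itself, and a reader or responder that wants to close a connection only calls remove, which does not wait for the sweep to finish.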

I question whether we should still limit the maximum number of connections to nuke per sweep.  Presumably the limit is only there to bound the time spent locking the linked list?
                
> RPC idle connection closing is extremely inefficient
> ----------------------------------------------------
>
>                 Key: HADOOP-9955
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9955
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: ipc
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HADOOP-9955.patch
>
>
> The RPC server listener loops accepting connections, distributing the new connections to socket readers, and then conditionally & periodically performs a scan for idle connections.  The idle scan chooses a _random index range_ to scan in a _synchronized linked list_.
> With 20k+ connections, walking the range of indices in the linked list is extremely expensive.  During the sweep, other threads (socket responder and readers) that want to close connections are blocked, and no new connections are being accepted.
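
To illustrate why the indexed walk described in the issue is costly, here is a small sketch. It is not the actual Hadoop Server code: IdleScanCost and its methods are hypothetical, and the list holds last-contact timestamps rather than real connections. Each get(i) on a java.util.LinkedList traverses nodes from the nearer end of the list, so scanning k indices in a list of n entries costs O(k * n) node hops, while a single iterator pass over the same range costs O(n) at worst.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

/** Hypothetical illustration of indexed vs. iterator scans of a linked list. */
class IdleScanCost {
    /** Indexed scan: each get(i) re-walks the list, so the range costs O(k * n). */
    static int countIdleByIndex(List<Long> lastContact, int start, int end,
                                long now, long maxIdle) {
        int idle = 0;
        synchronized (lastContact) { // in this sketch the lock is held for the whole walk
            for (int i = start; i <= end; i++) {
                if (now - lastContact.get(i) > maxIdle) {
                    idle++;
                }
            }
        }
        return idle;
    }

    /** Iterator scan: one positioning walk, then one hop per element. */
    static int countIdleByIterator(List<Long> lastContact, int start, int end,
                                   long now, long maxIdle) {
        int idle = 0;
        synchronized (lastContact) {
            ListIterator<Long> it = lastContact.listIterator(start);
            for (int i = start; i <= end && it.hasNext(); i++) {
                if (now - it.next() > maxIdle) {
                    idle++;
                }
            }
        }
        return idle;
    }
}
```

Both methods return the same count; only the number of node traversals differs. Either way, any thread that needs the same lock to add or remove a connection waits until the scan finishes, which matches the blocking described above.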
