Posted to common-issues@hadoop.apache.org by "Kihwal Lee (JIRA)" <ji...@apache.org> on 2016/03/01 20:47:18 UTC

[jira] [Commented] (HADOOP-12861) RPC client fails too quickly when server connection limit is reached

    [ https://issues.apache.org/jira/browse/HADOOP-12861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15174305#comment-15174305 ] 

Kihwal Lee commented on HADOOP-12861:
-------------------------------------

Due to the way the IPC Client does SASL negotiation, once a connection is established, an immediately following connection reset is handled by {{handleSaslConnectionFailure()}}.  This throws an {{IOException}}, for which {{FailoverOnNetworkExceptionRetry.shouldRetry}} prescribes {{RetryAction.FAILOVER_AND_RETRY}} with no delay. The client can burn through its retries almost instantly and fail permanently.
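
For illustration, here is a minimal, self-contained sketch of the failure mode described above. This is not the Hadoop source: the class, the delay values, and the backoff formula are invented for the example. It only mirrors the distinction the comment draws, i.e. connect-type failures get a delayed retry while the bare {{IOException}} from the SASL layer retries with zero delay.

{code:java}
import java.io.IOException;
import java.net.ConnectException;

// Illustrative sketch only -- not the Hadoop source. Delay values and the
// backoff formula are invented; the point is the two branches: a
// ConnectException backs off, a bare IOException (the reset surfaced by
// handleSaslConnectionFailure()) retries immediately.
public class RetryDecisionSketch {

  static final long BASE_DELAY_MS = 1000; // hypothetical backoff base
  static final int MAX_FAILOVERS = 15;    // hypothetical retry budget

  // Exponential backoff, capped at 30s; values are illustrative.
  static long backoff(int failovers) {
    return Math.min(BASE_DELAY_MS * (1L << Math.min(failovers, 6)), 30_000L);
  }

  // Returns the sleep before the next attempt, or -1 for permanent failure.
  static long delayFor(Exception e, int failovers) {
    if (failovers >= MAX_FAILOVERS) {
      return -1;                        // budget exhausted: FAIL
    }
    if (e instanceof ConnectException) {
      return backoff(failovers);        // refused/timed out: delayed retry
    }
    if (e instanceof IOException) {
      return 0;                         // reset during SASL: no delay
    }
    return -1;
  }

  public static void main(String[] args) {
    // A reset surfaces as a bare IOException, so every retry fires
    // back-to-back and the budget is burned through in milliseconds.
    for (int failover = 0; failover < 3; failover++) {
      System.out.println("reset  -> sleep "
          + delayFor(new IOException("Connection reset"), failover) + " ms");
      System.out.println("refuse -> sleep "
          + delayFor(new ConnectException("refused"), failover) + " ms");
    }
  }
}
{code}

Running the sketch shows the reset path sleeping 0 ms on every attempt while the refused path sleeps 1s, 2s, 4s, ..., which is why the reset case exhausts its retries so much faster.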

> RPC client fails too quickly when server connection limit is reached
> --------------------------------------------------------------------
>
>                 Key: HADOOP-12861
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12861
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 2.7.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>
> The NN's RPC server immediately closes new client connections when a connection limit is reached. The client rapidly retries a small number of times with no delay, which causes clients to fail quickly. If the connection is refused or times out, the connection retry policy retries with backoff. Clients should treat a reset connection as a connection failure so that the connection retry policy is used.
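
The description above points at the fix direction: re-classify a reset seen during SASL negotiation as a connect-type failure so the retry policy's backoff branch applies. The following is a hypothetical sketch of that idea only; the helper name and shape are invented and this is not the actual change to {{Client.java}}.

{code:java}
import java.io.IOException;
import java.net.ConnectException;

// Hypothetical sketch of the proposed direction, not the real patch:
// wrap a reset observed during SASL negotiation as a ConnectException
// so FailoverOnNetworkExceptionRetry takes its backoff branch instead
// of the zero-delay generic-IOException branch.
public class SaslResetRewrap {
  static IOException asConnectionFailure(IOException e) {
    String msg = String.valueOf(e.getMessage());
    if (msg.contains("Connection reset") || msg.contains("Broken pipe")) {
      ConnectException ce =
          new ConnectException("Server closed connection: " + msg);
      ce.initCause(e); // preserve the original reset for diagnostics
      return ce;       // the retry policy now backs off before retrying
    }
    return e;          // unrelated SASL failures keep their original type
  }
}
{code}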



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)