Posted to common-issues@hadoop.apache.org by "Chris Li (JIRA)" <ji...@apache.org> on 2014/10/13 20:21:35 UTC

[jira] [Commented] (HADOOP-10597) Evaluate if we can have RPC client back off when server is under heavy load

    [ https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169689#comment-14169689 ] 

Chris Li commented on HADOOP-10597:
-----------------------------------

Hi [~mingma], thanks for adding some numbers. If I understand the graph correctly, the latency spike is a result of maxing out the call queue's capacity, which FairCallQueue will not solve, since FCQ has no choice but to enqueue a call somewhere. Just to double check, were all these calls made under the same user? I'd guess that RPC client backoff would work just as well with FairCallQueue disabled, since it solves a different problem: alleviating a full queue. I do agree with Steve that we'll want some fuzz on the retry method, since linear backoff could cause load to become periodic over time.
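
To illustrate the "fuzz" idea, here is a minimal sketch of exponential backoff with random jitter. It is an assumption about how such a policy could look, not the approach taken in the attached patches; the class name JitteredBackoff and the parameters baseMillis and capMillis are hypothetical.

```java
import java.util.Random;

/**
 * Hypothetical helper (not part of Hadoop): exponential backoff where the
 * actual sleep is drawn at random below an exponentially growing ceiling,
 * so that clients rejected at the same moment do not retry in lockstep.
 */
final class JitteredBackoff {
    private final long baseMillis;
    private final long capMillis;
    private final Random random;

    JitteredBackoff(long baseMillis, long capMillis, Random random) {
        if (baseMillis <= 0 || capMillis < baseMillis) {
            throw new IllegalArgumentException("need 0 < baseMillis <= capMillis");
        }
        this.baseMillis = baseMillis;
        this.capMillis = capMillis;
        this.random = random;
    }

    /** Ceiling for a 0-based attempt: min(capMillis, baseMillis * 2^attempt). */
    long ceilingMillis(int attempt) {
        long ceiling = baseMillis;
        for (int i = 0; i < attempt && ceiling < capMillis; i++) {
            // Double the ceiling, guarding against overflow and the cap.
            ceiling = (ceiling > capMillis / 2) ? capMillis : ceiling * 2;
        }
        return ceiling;
    }

    /** Random sleep between 0 and the ceiling for this attempt. */
    long sleepMillis(int attempt) {
        long ceiling = ceilingMillis(attempt);
        return Math.min(ceiling, (long) (random.nextDouble() * ceiling));
    }
}
```

A client could call sleepMillis(attempt) each time the server signals that it is overloaded and wait that long before retrying. Because every client draws its own random value, retries spread out over the interval instead of arriving together.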


> Evaluate if we can have RPC client back off when server is under heavy load
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-10597
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10597
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>         Attachments: HADOOP-10597-2.patch, HADOOP-10597.patch, MoreRPCClientBackoffEvaluation.pdf, RPCClientBackoffDesignAndEvaluation.pdf
>
>
> Currently, if an application hits NN too hard, RPC requests will be in a blocking state, assuming OS connections don't run out. Alternatively, RPC or NN can throw some well-defined exception back to the client based on certain policies when it is under heavy load; the client will understand such an exception and do exponential backoff, as another implementation of RetryInvocationHandler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)