Posted to common-issues@hadoop.apache.org by "Chris Li (JIRA)" <ji...@apache.org> on 2014/11/06 22:50:37 UTC

[jira] [Commented] (HADOOP-10597) Evaluate if we can have RPC client back off when server is under heavy load

    [ https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201000#comment-14201000 ] 

Chris Li commented on HADOOP-10597:
-----------------------------------

Cool. I think this is a good feature to have.

One small question about the code:
+        LOG.warn("Element " + e + " was queued properly." +
+                "But client is asked to retry.");

From my brief study of the code, it seems like isCallQueued is passed pretty deep in order to maintain some sort of reference count on how many pending requests each handler has waiting client-side to retry. Does this count always balance to zero? What if a client makes a request, is denied, and then terminates before it can make a request that successfully queues?

Also, under what conditions will the element be queued correctly while the client still gets a retry?

Also, kind of a small thing: instead of mutating oldValue and then calling recentBackOffCount.set(oldValue), it would be clearer to create a new variable newValue and call recentBackOffCount.set(newValue). Or perhaps just rename the oldValue variable to something that doesn't imply immutability.
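
A rough sketch of that naming suggestion, for illustration only. Only the names recentBackOffCount, oldValue and newValue come from the discussion above. The BackOffCounter class, the AtomicLong type and the halving rule are assumptions made to keep the example self-contained, not what the patch actually does.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the handler state; not a class from the patch.
class BackOffCounter {
    // Assumed to be an AtomicLong; the real field type may differ.
    private final AtomicLong recentBackOffCount;

    BackOffCounter(long initial) {
        this.recentBackOffCount = new AtomicLong(initial);
    }

    // Halving is an assumed update rule, used only so there is something to compute.
    void decay() {
        long oldValue = recentBackOffCount.get(); // read once, never reassigned
        long newValue = oldValue / 2;             // the derived value gets its own name
        recentBackOffCount.set(newValue);
    }

    long current() {
        return recentBackOffCount.get();
    }
}
```

If the field really is an AtomicLong and the update needs to be atomic, recentBackOffCount.updateAndGet(v -> v / 2) (Java 8+) would avoid the intermediate variables altogether.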

> Evaluate if we can have RPC client back off when server is under heavy load
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-10597
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10597
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>         Attachments: HADOOP-10597-2.patch, HADOOP-10597.patch, MoreRPCClientBackoffEvaluation.pdf, RPCClientBackoffDesignAndEvaluation.pdf
>
>
> Currently, if an application hits the NN too hard, RPC requests will be in a blocking state, assuming OS connections don't run out. Alternatively, RPC or the NN can throw some well-defined exception back to the client, based on certain policies, when it is under heavy load; the client will understand such an exception and do exponential back off, as another implementation of RetryInvocationHandler.
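
The client-side behaviour described in the issue summary above can be sketched as follows. This is a minimal illustration under stated assumptions, not the design from the attached patches: ServerTooBusyException and BackoffRetry are hypothetical names, the delay parameters are arbitrary, and Hadoop's own RetryInvocationHandler is deliberately not used here.

```java
import java.util.concurrent.Callable;

// Hypothetical "server is overloaded" signal; not an actual Hadoop exception.
class ServerTooBusyException extends Exception {
    ServerTooBusyException(String msg) {
        super(msg);
    }
}

// Hypothetical helper showing capped exponential back off on the client.
class BackoffRetry {
    // Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
    static long delayMillis(int attempt, long baseMillis, long capMillis) {
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < capMillis; i++) {
            // Jump straight to the cap instead of doubling past it (or overflowing).
            delay = (delay > capMillis / 2) ? capMillis : delay * 2;
        }
        return Math.min(delay, capMillis);
    }

    // Runs the call, sleeping and retrying only when the server says it is busy.
    static <T> T call(Callable<T> rpc, int maxRetries, long baseMillis, long capMillis)
            throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return rpc.call();
            } catch (ServerTooBusyException e) {
                if (attempt >= maxRetries) {
                    throw e; // give up and surface the overload to the caller
                }
                Thread.sleep(delayMillis(attempt, baseMillis, capMillis));
            }
        }
    }
}
```

A production version would normally add random jitter to the delay so that many clients rejected at the same moment do not all retry in lockstep.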



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)