Posted to issues@hbase.apache.org by "Yu Li (JIRA)" <ji...@apache.org> on 2018/03/28 18:46:00 UTC

[jira] [Commented] (HBASE-20303) RS RPC server should not allow the response queue size to be too large

    [ https://issues.apache.org/jira/browse/HBASE-20303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16417910#comment-16417910 ] 

Yu Li commented on HBASE-20303:
-------------------------------

Some background: this is something Chance found and resolved in our production environment for last year's (2017) Singles' Day. Applications using the async client may fail to process results quickly enough, so responses accumulate in the server's response queue, which caused full GC.

> RS RPC server should not allow the response queue size to be too large
> ----------------------------------------------------------------------
>
>                 Key: HBASE-20303
>                 URL: https://issues.apache.org/jira/browse/HBASE-20303
>             Project: HBase
>          Issue Type: Improvement
>          Components: rpc
>    Affects Versions: 3.0.0
>         Environment: 2000+ Region Servers
>            Reporter: Chance Li
>            Assignee: Chance Li
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: HBASE-20303.master.v1.patch
>
>
> With async clients, in some scenarios the RS RPC server can run into full GC because of a large response queue.
>  Netty provides a WriteBufferHighWaterMark on each channel, but this doesn't meet the RS's needs (consider 5k ~ 10k sockets on one RS server: at 10M per channel it would need 50G ~ 100G of heap).
> The RS RPC server will add a global response buffer water mark (2G by default). When this threshold is reached, the RS will not serve any requests.
>  The RS RPC server will also add a water mark per channel (100M by default), because a client that reaches it is most likely abnormal.
> We created a unit test to simulate the abnormal case: a client that has only 1 socket can cause the RS server to occupy up to 100M of heap.
>  
>  Notes:
>  1. For client compatibility, we still use the existing exception (CALL_QUEUE_TOO_BIG_EXCEPTION), but the error message is different.
>  2. Not for SimpleRpcServer, because it's rarely used.
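
For readers following along, the two-level water mark described above can be sketched roughly as follows. This is only an illustration of the accounting idea, not code from the attached patch: the class ResponseWaterMarks, its method names, and the use of a String channel id are all hypothetical, and the real change would hook into the Netty-based RPC server rather than a standalone class.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch (not from the HBASE-20303 patch): tracks bytes of
 * responses that are queued but not yet written, globally and per channel.
 */
public class ResponseWaterMarks {
    private final long globalLimit;   // the description suggests 2G by default
    private final long channelLimit;  // the description suggests 100M by default
    private final AtomicLong globalPending = new AtomicLong();
    private final ConcurrentHashMap<String, AtomicLong> perChannel =
        new ConcurrentHashMap<>();

    public ResponseWaterMarks(long globalLimit, long channelLimit) {
        this.globalLimit = globalLimit;
        this.channelLimit = channelLimit;
    }

    /** Returns true if a new request arriving on this channel may be served. */
    public boolean canAccept(String channelId) {
        if (globalPending.get() >= globalLimit) {
            return false; // global water mark reached: stop serving requests
        }
        AtomicLong pending = perChannel.get(channelId);
        // per-channel water mark: a slow or abnormal client only blocks itself
        return pending == null || pending.get() < channelLimit;
    }

    /** Call when a response of the given size is queued for the channel. */
    public void responseQueued(String channelId, long bytes) {
        globalPending.addAndGet(bytes);
        perChannel.computeIfAbsent(channelId, k -> new AtomicLong())
                  .addAndGet(bytes);
    }

    /** Call when a queued response has been flushed to the socket. */
    public void responseWritten(String channelId, long bytes) {
        globalPending.addAndGet(-bytes);
        AtomicLong pending = perChannel.get(channelId);
        if (pending != null) {
            pending.addAndGet(-bytes);
        }
    }

    /** Call when the channel closes; releases whatever it still had queued. */
    public void channelClosed(String channelId) {
        AtomicLong pending = perChannel.remove(channelId);
        if (pending != null) {
            globalPending.addAndGet(-pending.get());
        }
    }

    public long globalPendingBytes() {
        return globalPending.get();
    }
}
```

In this sketch a caller would check canAccept() before dispatching a request and, if it returns false, reject the call with the existing CALL_QUEUE_TOO_BIG_EXCEPTION mentioned in note 1. Keeping one global counter plus a small per-channel counter avoids reserving a large buffer for every socket, which is the heap problem the per-channel Netty water mark runs into at 5k ~ 10k sockets.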



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)