
Posted to issues@hbase.apache.org by "Tianying Chang (JIRA)" <ji...@apache.org> on 2016/02/03 18:59:39 UTC

[jira] [Commented] (HBASE-15155) Show All RPC handler tasks stops working after cluster is under heavy load for a while

    [ https://issues.apache.org/jira/browse/HBASE-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15130776#comment-15130776 ] 

Tianying Chang commented on HBASE-15155:
----------------------------------------

We deployed the patch to our production cluster running 0.94.26, and it has been working fine under heavy traffic for over a week now.

> Show All RPC handler tasks stops working after cluster is under heavy load for a while
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-15155
>                 URL: https://issues.apache.org/jira/browse/HBASE-15155
>             Project: HBase
>          Issue Type: Bug
>          Components: monitoring
>    Affects Versions: 0.98.0, 1.0.0, 0.94.19
>            Reporter: Tianying Chang
>            Assignee: Tianying Chang
>         Attachments: hbase-15155.patch
>
>
> After we upgraded from 0.94.7 to 0.94.26 and 1.0, we found that the "Show All RPC handler status" link on the RS web UI stops working after the RS has run in a production cluster under relatively high load for several days.
> It turns out to be a bug introduced by https://issues.apache.org/jira/browse/HBASE-10312. The BoundedFIFOBuffer causes the RPC handler statuses to be overwritten/removed permanently when there is a spike of non-RPC task statuses that exceeds MAX_SIZE (1000). So once the RS has experienced "high" load even once, the RPC status monitoring is gone for good, until the RS is restarted.
> We added a unit test that reproduces this, and the fix passes the test.
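
The eviction described above can be illustrated with a minimal standalone sketch. This is not the HBase TaskMonitor code and not the attached patch; the class name, the method names, the "RpcHandler-" / "OtherTask-" labels and the handler count of 10 are all hypothetical, and a plain ArrayDeque stands in for the bounded FIFO buffer. Only the MAX_SIZE of 1000 comes from the description. It shows how long-lived entries that are registered once are pushed out by a burst of short-lived entries and never come back.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: a hypothetical stand-in for a bounded FIFO task list.
public class BoundedFifoEvictionSketch {
    // Capacity taken from the issue description (MAX_SIZE = 1000).
    static final int MAX_SIZE = 1000;

    // Oldest entry sits at the head, newest at the tail.
    static final Deque<String> tasks = new ArrayDeque<>();

    // Add a task status; when the buffer is full, drop the oldest entry first.
    static void addTask(String name) {
        if (tasks.size() >= MAX_SIZE) {
            tasks.removeFirst();
        }
        tasks.addLast(name);
    }

    // Count the entries that represent RPC handlers (hypothetical naming scheme).
    static long countRpcHandlers() {
        return tasks.stream().filter(t -> t.startsWith("RpcHandler-")).count();
    }

    public static void main(String[] args) {
        // RPC handler statuses are registered once, at startup (assumed count: 10).
        for (int i = 0; i < 10; i++) {
            addTask("RpcHandler-" + i);
        }
        System.out.println("RPC handlers before spike: " + countRpcHandlers());

        // A spike of MAX_SIZE non-RPC tasks fills the buffer and evicts the handlers.
        for (int i = 0; i < MAX_SIZE; i++) {
            addTask("OtherTask-" + i);
        }
        System.out.println("RPC handlers after spike: " + countRpcHandlers());
    }
}
```

In this sketch the handler entries are the oldest items in the buffer, so they are the first to be dropped once the spike fills it. Because nothing re-registers them afterwards, the count stays at zero, which mirrors the "gone until the RS is restarted" behaviour reported above.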


