Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2011/03/22 01:01:05 UTC

[jira] [Assigned] (HBASE-3680) Publish more metrics about mslab

     [ https://issues.apache.org/jira/browse/HBASE-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans reassigned HBASE-3680:
-----------------------------------------

    Assignee: Todd Lipcon

Hoping Todd can take a quick look.

> Publish more metrics about mslab
> --------------------------------
>
>                 Key: HBASE-3680
>                 URL: https://issues.apache.org/jira/browse/HBASE-3680
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Todd Lipcon
>             Fix For: 0.92.0
>
>
> We have been using mslab on all our clusters for a while now, and it seems to OOME or send us into GC loops of death a lot more often than it used to. For example, one RS with mslab enabled and 7GB of heap died of an OOME this afternoon; it had 0.55GB in the block cache and 2.03GB in the memstores, which doesn't account for much. It could be that mslab lost a lot of space in incomplete 2MB blocks, but without metrics we can't really tell. Compactions were running at the time of the OOME, and I see block cache activity. The average load on that cluster is 531.
> We should at least publish the total size of all those blocks, and maybe even take action based on it (like forcing a flush).
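
To make the request concrete, here is a minimal sketch of the kind of accounting the description asks for: total bytes held by mslab blocks, and how much of that is unused. The class MslabChunkStats, its method names, and the waste limit are all hypothetical and are not part of HBase; the only figure taken from the description is the 2MB block size.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical accounting helper for illustration; not an HBase class. */
public class MslabChunkStats {
    // Block size taken from the 2MB figure in the issue description.
    public static final long CHUNK_SIZE_BYTES = 2L * 1024 * 1024;

    private final AtomicLong chunksAllocated = new AtomicLong();
    private final AtomicLong bytesUsed = new AtomicLong();

    /** Call once each time a new 2MB block is allocated. */
    public void onChunkAllocated() {
        chunksAllocated.incrementAndGet();
    }

    /** Call with the number of bytes copied into a block. */
    public void onBytesCopied(long n) {
        bytesUsed.addAndGet(n);
    }

    /** Total heap held by all blocks, whether full or not. */
    public long totalChunkBytes() {
        return chunksAllocated.get() * CHUNK_SIZE_BYTES;
    }

    /** Space reserved by blocks but not holding any data. */
    public long wastedBytes() {
        return totalChunkBytes() - bytesUsed.get();
    }

    /** Assumed policy: suggest a flush once waste exceeds a caller-chosen limit. */
    public boolean shouldForceFlush(long wasteLimitBytes) {
        return wastedBytes() > wasteLimitBytes;
    }
}
```

Publishing totalChunkBytes() and wastedBytes() as metrics would show whether incomplete blocks account for the heap that the block cache and memstore numbers leave unexplained. A real implementation would also need to subtract blocks that are released after a flush, which this sketch leaves out.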

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira