Posted to issues@hbase.apache.org by "Varun Sharma (JIRA)" <ji...@apache.org> on 2013/06/26 09:20:20 UTC

[jira] [Commented] (HBASE-8370) Report data block cache hit rates apart from aggregate cache hit rates

    [ https://issues.apache.org/jira/browse/HBASE-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693798#comment-13693798 ] 

Varun Sharma commented on HBASE-8370:
-------------------------------------

Here are some stats for this JIRA. I am arguing that the block cache hit ratio reported on a region server does not mean much.

"tbl.feeds.cf.home.bt.Index.fsBlockReadCnt" : 46864,
"tbl.feeds.cf.home.bt.Index.fsBlockReadCacheHitCnt" : 46864

Index Block cache hit ratio = 100 %

"tbl.feeds.cf.home.bt.Data.fsBlockReadCacheHitCnt" : 202
"tbl.feeds.cf.home.bt.Data.fsBlockReadCnt" : 247

Data Block cache hit ratio = 202 / 247 = 82 %

Overall cache hit ratio = (46864 + 202) / (46864 + 247) = 99 %

Since index blocks are hit often, their hit rate is 100 % and their hit count is also high, so they dominate the aggregate. The number we actually care about is 82 %, which is the hit rate on data blocks. However, we continue to show 99 % on the region server console instead. I think we need to fix that number. Please let me know if folks object to this.
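
To make the arithmetic above easy to re-check, here is a minimal sketch in Python that recomputes the three ratios from the counters quoted in this comment. The function name hit_ratio and the dictionary layout are made up for illustration; they are not HBase APIs, and this is not how the region server computes its metric.

```python
def hit_ratio(hits, reads):
    """Return the cache hit ratio as a percentage, or None if there were no reads."""
    if reads == 0:
        return None
    return 100.0 * hits / reads


# Counters copied from the per-block-type metrics quoted in this comment.
# The dictionary layout is an illustrative assumption, not an HBase structure.
metrics = {
    "Index": {"reads": 46864, "hits": 46864},
    "Data": {"reads": 247, "hits": 202},
}

index_pct = hit_ratio(metrics["Index"]["hits"], metrics["Index"]["reads"])
data_pct = hit_ratio(metrics["Data"]["hits"], metrics["Data"]["reads"])

# The aggregate ratio sums hits and reads across block types, so the block
# type with the most reads (here, Index) dominates the result.
overall_pct = hit_ratio(
    sum(m["hits"] for m in metrics.values()),
    sum(m["reads"] for m in metrics.values()),
)

print(f"index:   {index_pct:.1f} %")
print(f"data:    {data_pct:.1f} %")
print(f"overall: {overall_pct:.1f} %")
```

With these counters the data block ratio comes out a little under 82 % while the aggregate stays above 99 %, because index reads outnumber data reads by roughly 190 to 1.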
                
> Report data block cache hit rates apart from aggregate cache hit rates
> ----------------------------------------------------------------------
>
>                 Key: HBASE-8370
>                 URL: https://issues.apache.org/jira/browse/HBASE-8370
>             Project: HBase
>          Issue Type: Improvement
>          Components: metrics
>            Reporter: Varun Sharma
>            Assignee: Varun Sharma
>            Priority: Minor
>
> Attaching from mail to dev@hbase.apache.org
> I am wondering whether the HBase cachingHitRatio metric that the region server UI shows can be broken down by data blocks. I always see this number to be very high, and that could be exaggerated by the fact that each lookup hits the index blocks and bloom filter blocks in the block cache before retrieving the data block. This could be artificially inflating the cache hit ratio.
> Assuming the above is correct, do we already have a more obscure metric that reports the cache hit ratio for data blocks alone? If not, my sense is that it would be pretty valuable to add one.
