Posted to issues@hbase.apache.org by "Jason Rutherglen (JIRA)" <ji...@apache.org> on 2011/06/29 00:31:29 UTC

[jira] [Commented] (HBASE-4038) Hot Region : Write Diagnosis

    [ https://issues.apache.org/jira/browse/HBASE-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056825#comment-13056825 ] 

Jason Rutherglen commented on HBASE-4038:
-----------------------------------------

@Nicolas Hot row handling would benefit greatly from a row-level LRU cache, as described in the BigTable paper.  With a row cache, the 'cost' of the hotness (seeking into the block) is reduced to a hash lookup.  Agreed, though, that some general diagnosis may be required before turning on row caching.
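
To make the row cache idea concrete, here is a minimal sketch in Python rather than HBase code. The class name RowCache, the load_row callback (a stand-in for the slow path that seeks into a block), and the hit/miss counters are all hypothetical and only illustrate the idea; nothing here reflects an existing HBase API.

```python
from collections import OrderedDict


class RowCache:
    """Hypothetical row-level LRU cache: row key -> cached row."""

    def __init__(self, capacity, load_row):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.load_row = load_row  # assumed slow path, e.g. a block seek
        self.rows = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, row_key):
        if row_key in self.rows:
            # Hit: a hash lookup plus a recency update, no block seek.
            self.rows.move_to_end(row_key)
            self.hits += 1
            return self.rows[row_key]
        # Miss: take the slow path once, then remember the row.
        self.misses += 1
        row = self.load_row(row_key)
        self.rows[row_key] = row
        if len(self.rows) > self.capacity:
            self.rows.popitem(last=False)  # evict the least recently used row
        return row

    def invalidate(self, row_key):
        # A write to the row would have to drop the cached copy.
        self.rows.pop(row_key, None)
```

The hit and miss counters are one possible input to the diagnosis mentioned above: a row that dominates the hit count is a candidate hot row.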

> Hot Region : Write Diagnosis
> ----------------------------
>
>                 Key: HBASE-4038
>                 URL: https://issues.apache.org/jira/browse/HBASE-4038
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, regionserver
>    Affects Versions: 0.92.0
>            Reporter: Nicolas Spiegelberg
>            Assignee: Riley Patterson
>            Priority: Minor
>
> We should provide a basic way for end users to operationally diagnose hot row problems.  Thinking about a 2-phase approach:
> 1. Diagnose hot regions
> 2. Inspect those regions/servers to find the hot rows.
> To diagnose hot regions, we could query the master or regionservers for these regions + sort.  To inspect the regions for hot rows, we could write another script to analyze the HLogs on a server and basically do: sort log | uniq -c | sort -rn | head
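
The two phases in the description amount to ranking regions by write volume and then counting row keys within the busiest ones. A rough Python sketch follows, under loud assumptions: the function names hot_regions and hot_rows are hypothetical, the per-region write counts are assumed to have been collected already, and the row keys are assumed to have been extracted from the HLogs as one text key per edit (the real log format is not plain text).

```python
from collections import Counter


def hot_regions(write_counts, n=10):
    """Phase 1: rank regions by write count, busiest first.

    write_counts is an assumed mapping of region name -> number of writes.
    """
    ranked = sorted(write_counts.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


def hot_rows(row_keys, n=10):
    """Phase 2: count how often each row key appears and keep the top n.

    row_keys is an assumed iterable of row keys, one per logged edit.
    """
    counts = Counter(key.strip() for key in row_keys if key.strip())
    return counts.most_common(n)
```

For example, hot_rows(open(path), n=20) would rank the keys in a text file of extracted row keys, where path is whatever file the extraction step produced.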

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira