Posted to issues@hbase.apache.org by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org> on 2011/06/28 00:47:29 UTC

[jira] [Created] (HBASE-4038) Hot Region Diagnosis

Hot Region Diagnosis
--------------------

                 Key: HBASE-4038
                 URL: https://issues.apache.org/jira/browse/HBASE-4038
             Project: HBase
          Issue Type: Improvement
          Components: client, regionserver
    Affects Versions: 0.92.0
            Reporter: Nicolas Spiegelberg
            Assignee: Nicolas Spiegelberg
            Priority: Minor


We should provide a basic way for end users to operationally diagnose hot row problems.  Thinking about a 2-phase approach:

1. Diagnose hot regions
2. Inspect those regions/servers to find the hot rows.

To diagnose hot regions, we could query the master or regionservers for their regions and sort them by load.  To inspect the regions for hot rows, we could write another script to analyze the HLogs on a server and basically do: sort log | uniq -c | sort -rn | head
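
The shell pipeline above amounts to a frequency count over row keys. A minimal Python sketch of the same idea follows, assuming the HLog entries have already been dumped to text with one row key per line. The dump format and the function name `hottest_rows` are hypothetical and not part of HBase.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def hottest_rows(row_keys: Iterable[str], limit: int = 10) -> List[Tuple[str, int]]:
    """Return the `limit` most frequent row keys with their counts.

    `row_keys` is assumed to yield one row key per log entry (hypothetical
    dump format); blank lines are ignored.
    """
    counts = Counter(key.strip() for key in row_keys if key.strip())
    # most_common() sorts by count, highest first.
    return counts.most_common(limit)


if __name__ == "__main__":
    sample = ["row-a", "row-b", "row-a", "row-c", "row-a", "row-b"]
    for key, count in hottest_rows(sample, limit=2):
        print(count, key)
```

Counting in a hash map avoids the full sort of the log that the shell version needs, at the cost of holding one counter per distinct row key in memory.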

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4038) Hot Region Diagnosis

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055984#comment-13055984 ] 

Jason Rutherglen commented on HBASE-4038:
-----------------------------------------

Couldn't this be done by keeping track of 'hot' blocks?  Statistics on the usage of blocks via the LRU block cache?


[jira] [Commented] (HBASE-4038) Hot Region : Write Diagnosis

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056825#comment-13056825 ] 

Jason Rutherglen commented on HBASE-4038:
-----------------------------------------

@Nicolas Hot row handling would benefit greatly from a row-level LRU cache, as described in the Bigtable paper.  With a row cache, the 'cost' of the hotness (seeking into the block) would be reduced to a hash lookup.  Agreed, though, that general diagnosis will/could be required to decide whether to turn on row caching.
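
To illustrate the row cache idea, here is a minimal sketch of a row-level LRU cache in Python. It is not HBase code: the class `RowCache`, the `load_row` callback (standing in for the block seek), and the hit/miss counters are all assumptions made for the example.

```python
from collections import OrderedDict
from typing import Callable


class RowCache:
    """Hypothetical row-level LRU cache keyed by row key."""

    def __init__(self, capacity: int, load_row: Callable[[bytes], bytes]):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._load_row = load_row  # stand-in for the expensive block seek
        self._rows: "OrderedDict[bytes, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, row_key: bytes) -> bytes:
        if row_key in self._rows:
            # Cache hit: a hash lookup, then mark the row most recently used.
            self.hits += 1
            self._rows.move_to_end(row_key)
            return self._rows[row_key]
        # Cache miss: pay the full read cost once, then remember the row.
        self.misses += 1
        value = self._load_row(row_key)
        self._rows[row_key] = value
        if len(self._rows) > self._capacity:
            self._rows.popitem(last=False)  # evict the least recently used row
        return value

    def invalidate(self, row_key: bytes) -> None:
        """Drop a row after a write so readers never see a stale value."""
        self._rows.pop(row_key, None)
```

The hit and miss counters double as a cheap diagnostic: a high hit rate concentrated on a few keys is itself a signal that those rows are hot.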


[jira] [Updated] (HBASE-4038) Hot Region : Write Diagnosis

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HBASE-4038:
---------------------------------------

    Summary: Hot Region : Write Diagnosis  (was: Hot Region Diagnosis)


[jira] [Commented] (HBASE-4038) Hot Region Diagnosis

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056662#comment-13056662 ] 

Todd Lipcon commented on HBASE-4038:
------------------------------------

Analyzing the HLogs only gets you hot write, not hot read, right?

An RPC sampling approach would be nice. For example, a boolean which can be flipped at runtime to enable reservoir sampling of RPCs, and a servlet which can dump a representative set from the last few minutes.
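
A rough sketch of what such a sampler could look like, using the classic reservoir sampling scheme (Algorithm R). The class name `ReservoirSampler`, the `enabled` flag, and the `dump` method are hypothetical stand-ins for the runtime boolean and the servlet. The sketch is single-threaded and does not limit the sample to the last few minutes; a real version would need locking and periodic resets.

```python
import random
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class ReservoirSampler(Generic[T]):
    """Keeps a uniform random sample of up to `capacity` items from a stream."""

    def __init__(self, capacity: int, rng: Optional[random.Random] = None):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.enabled = False  # the boolean flipped at runtime
        self._seen = 0
        self._sample: List[T] = []
        self._rng = rng or random.Random()

    def offer(self, item: T) -> None:
        """Called once per RPC; a no-op while sampling is disabled."""
        if not self.enabled:
            return
        self._seen += 1
        if len(self._sample) < self.capacity:
            self._sample.append(item)
            return
        # Keep the new item with probability capacity / seen by drawing a
        # slot in [0, seen) and replacing only if it falls in the reservoir.
        slot = self._rng.randrange(self._seen)
        if slot < self.capacity:
            self._sample[slot] = item

    def dump(self) -> List[T]:
        """Return a copy of the current sample (what a servlet might render)."""
        return list(self._sample)
```

Because the reservoir has a fixed size, the memory cost stays constant no matter how many RPCs pass through while sampling is enabled.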


[jira] [Commented] (HBASE-4038) Hot Region Diagnosis

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056701#comment-13056701 ] 

Nicolas Spiegelberg commented on HBASE-4038:
--------------------------------------------

@Todd:  Correct.  We also have a separate internal task to look at a runtime-enabled sampling approach for hot read diagnosis.  However, right now, our main applications with non-uniform distribution are heavily write-dominant so write analysis is more important for us.

@Jason: Tracking Block Cache usage would give us hot read analysis.  You would have the same problem where there is not a 1:1 Block:Row mapping, so you would need further investigation either way.  Really, you want general read/write request stats so you know which servers to drill down into.  Note that the metrics necessary for this approach could also be used by the load balancer.
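
As a sketch of the first phase (finding which regions and servers to drill down into), the snippet below ranks regions by request counts. The `RegionStats` record and its field names are assumptions for illustration; how the counts would actually be collected from the master or regionservers is left open here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RegionStats:
    """Hypothetical per-region request counters."""
    region: str
    server: str
    read_requests: int
    write_requests: int


def rank_regions(stats: Iterable[RegionStats], by: str = "write",
                 limit: int = 5) -> List[RegionStats]:
    """Return the `limit` busiest regions by read or write request count."""
    if by not in ("read", "write"):
        raise ValueError("by must be 'read' or 'write'")
    if by == "read":
        key = lambda s: s.read_requests
    else:
        key = lambda s: s.write_requests
    return sorted(stats, key=key, reverse=True)[:limit]


if __name__ == "__main__":
    sample = [
        RegionStats("r1", "rs-1", read_requests=10, write_requests=900),
        RegionStats("r2", "rs-1", read_requests=500, write_requests=20),
        RegionStats("r3", "rs-2", read_requests=30, write_requests=40),
    ]
    for s in rank_regions(sample, by="write", limit=2):
        print(s.server, s.region, s.write_requests)
```

The same per-region counters could feed a load balancer, which is why keeping read and write counts separate seems worthwhile.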


[jira] [Assigned] (HBASE-4038) Hot Region Diagnosis

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg reassigned HBASE-4038:
------------------------------------------

    Assignee: Riley Patterson  (was: Nicolas Spiegelberg)
