Posted to issues@hbase.apache.org by "Enis Soztutar (Commented) (JIRA)" <ji...@apache.org> on 2012/04/19 01:32:40 UTC
[jira] [Commented] (HBASE-5349) Automagically tweak global memstore and block cache sizes based on workload

    [ https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257085#comment-13257085 ] 

Enis Soztutar commented on HBASE-5349:
--------------------------------------

I have been thinking about this, and I think we can have a shot at a simple implementation. Let me summarize what I have in mind before starting the implementation: 
Goals: 
 - Provide min and max heap percentages for the block cache (the memstore more or less has them already). I think we should keep the min/max sanity bounds, and disable auto-tuning when the two are equal. 
 - Optimize the use of available memory for changing workloads (mostly writes during the day, a lot of reads once an MR job starts, etc). For example, roughly 10 minutes after a large write job starts, region servers should have tuned themselves for a write workload. 
Non-goals: 
 - find the optimum mem-utilization algorithm
 - introduce a bunch of other parameters, to get rid of the current ones
 - make it very experimental so that nobody enables it in production. 

Ideally, to optimize the usage of the available memory, we should predict the future workload (possibly from the past workload), and devise a model capturing all the costs associated with block cache hits / misses, flushes, compactions, etc. But such a model would be very complex to build properly.

I have checked Hypertable's implementation, and it seems that they decide whether the load is read heavy or write heavy by comparing counters against some hard-coded values, and then increment / decrement the mem limits, much like what Zhihong proposes above. I also want to start with something similar. 

Implementation layer: 
 - Currently the global memstore limit is a soft limit; we may have to make it a hard limit (blocking writes).
 - We should enable incrementing / decrementing and setting the global memstore and block cache maximum limits. We do not have live configuration changes, but regardless of auto-tuning, we should be able to set those manually online. 
 - Periodically we should check the past workload (like the past 10 min), and depending on whether it is write heavy or read heavy (from metrics), adjust the mem limits in small steps. 
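
To make the periodic check concrete, here is a rough sketch in Java of what I have in mind. It is only an illustration under assumptions: the class HeapTunerSketch, its methods, the 70% / 30% thresholds, and the use of whole heap percentages are all made up for this example and are not existing HBase code or configuration. The read and write counts are assumed to come from region server metrics for the last window.

```java
/** Hypothetical sketch only; none of these names come from HBase. */
class HeapTunerSketch {
    // Assumed thresholds on the share of requests that are writes.
    static final int WRITE_HEAVY_PCT = 70;
    static final int READ_HEAVY_PCT = 30;

    // Sanity bounds and current limits, as whole percentages of the heap.
    private final int cacheMin, cacheMax, memMin, memMax;
    private final int step;          // how far one adjustment may move a limit
    private int cachePct, memstorePct;

    HeapTunerSketch(int cacheMin, int cacheMax, int memMin, int memMax,
                    int cachePct, int memstorePct, int step) {
        this.cacheMin = cacheMin;
        this.cacheMax = cacheMax;
        this.memMin = memMin;
        this.memMax = memMax;
        this.cachePct = cachePct;
        this.memstorePct = memstorePct;
        this.step = step;
    }

    /** Called once per window (e.g. every 10 min) with that window's counters. */
    void tune(long readRequests, long writeRequests) {
        // Equal min and max bounds mean auto-tuning is disabled.
        if (cacheMin == cacheMax || memMin == memMax) {
            return;
        }
        long total = readRequests + writeRequests;
        if (total == 0) {
            return; // idle window, nothing to learn from
        }
        long writePct = writeRequests * 100 / total;
        if (writePct >= WRITE_HEAVY_PCT) {
            move(step);   // write heavy: grow memstore, shrink block cache
        } else if (writePct <= READ_HEAVY_PCT) {
            move(-step);  // read heavy: grow block cache, shrink memstore
        }
        // Mixed workloads leave the limits where they are.
    }

    /** Moves heap share from block cache to memstore (delta > 0) or back. */
    private void move(int delta) {
        // Room left before either limit would cross its sanity bound.
        int room = delta > 0
            ? Math.min(memMax - memstorePct, cachePct - cacheMin)
            : Math.min(memstorePct - memMin, cacheMax - cachePct);
        int applied = Math.min(Math.abs(delta), Math.max(room, 0));
        if (delta < 0) {
            applied = -applied;
        }
        memstorePct += applied;
        cachePct -= applied;
    }

    int memstorePct() { return memstorePct; }

    int blockCachePct() { return cachePct; }
}
```

The sum of the two limits stays constant, so one side only grows by what the other gives up, and neither leaves its min/max bounds. A real version would also have to apply the new limits online, which is the second point above.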

What do you guys think? Still worth pursuing?
                
> Automagically tweak global memstore and block cache sizes based on workload
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-5349
>                 URL: https://issues.apache.org/jira/browse/HBASE-5349
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.92.0
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.96.0
>
>
> Hypertable does a neat thing where it changes the size given to the CellCache (our MemStores) and Block Cache based on the workload. If you need an image, scroll down at the bottom of this link: http://www.hypertable.com/documentation/architecture/
> That'd be one less thing to configure.
