Posted to dev@hbase.apache.org by "Wellington Chevreuil (Jira)" <ji...@apache.org> on 2019/12/09 15:49:00 UTC

[jira] [Resolved] (HBASE-23073) Add an optional costFunction to balance regions according to a capacity rule

     [ https://issues.apache.org/jira/browse/HBASE-23073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wellington Chevreuil resolved HBASE-23073.
------------------------------------------
    Resolution: Fixed

Merged into branch-1. Thanks for the contribution, [~PierreZ]!

> Add an optional costFunction to balance regions according to a capacity rule
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-23073
>                 URL: https://issues.apache.org/jira/browse/HBASE-23073
>             Project: HBase
>          Issue Type: New Feature
>          Components: master
>    Affects Versions: 3.0.0
>            Reporter: Pierre Zemb
>            Assignee: Pierre Zemb
>            Priority: Minor
>             Fix For: 3.0.0, 2.3.0, 1.6.0
>
>         Attachments: HBASE-23073.branch-1.0002.patch, HBASE-23073.branch-1.001.patch
>
>
> Based on the work in [HBASE-22618|https://issues.apache.org/jira/browse/HBASE-22618], users can now load custom costFunctions inside the main balancer used by HBase. As an example, we would like to add upstream an optional cost function called HeterogeneousRegionCountCostFunction that deals with our issue: how to balance regions according to the capacity of each RS, instead of using the RegionCountSkewCostFunction, which tries to avoid skew.
> A rule file is loaded from HDFS before balancing. It contains lines of rules. A rule is composed of a regexp for hostname, and a limit. For example, we could have:
> * rs[0-9] 200
> * rs1[0-9] 50 
> RegionServers with a hostname matching the first rule will have a limit of 200, and those matching the second a limit of 50. If there's no match, a default is set.
> Thanks to the rules, we have two pieces of information: the max number of regions for this cluster, and the rules for each server. HeterogeneousBalancer will try to balance regions according to each server's capacity.
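
To make the rule format above concrete, here is a minimal sketch in Python (not the Java code from the patch). It assumes that a rule's regexp must match the whole hostname and that the first matching rule wins; the description does not state either. The names parse_rules, limit_for and DEFAULT_LIMIT are hypothetical, and the default value is a placeholder, since the description only says that "a default is set".

```python
import re

# Placeholder value: the description only says "a default is set".
DEFAULT_LIMIT = 200


def parse_rules(text):
    """Parse lines of '<hostname regexp> <limit>' into (pattern, limit) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last run of whitespace: regexp on the left, limit on the right.
        regexp, limit = line.rsplit(None, 1)
        rules.append((re.compile(regexp), int(limit)))
    return rules


def limit_for(hostname, rules, default=DEFAULT_LIMIT):
    """Return the limit of the first rule whose regexp matches the whole hostname."""
    for pattern, limit in rules:
        # Assumption: whole-hostname match, so rs[0-9] does not also catch rs10.
        if pattern.fullmatch(hostname):
            return limit
    return default


RULES = parse_rules("rs[0-9] 200\nrs1[0-9] 50")
```

With these rules, limit_for("rs3", RULES) falls under the first rule and limit_for("rs12", RULES) under the second. With a prefix search instead of a whole-hostname match, rs[0-9] would also match "rs12", which is why the matching semantics matter here.
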
> Let's take an example. Let's say that we have 20 RS:
>     10 RS, named rs0 through rs9, loaded with 60 regions each, and each can handle 200 regions.
>     10 RS, named rs10 through rs19, loaded with 60 regions each, and each can support 50 regions.
> Based on the following rules: 
>     rs[0-9] 200
>     rs1[0-9] 50
> The second group is overloaded (60 regions against a limit of 50), whereas the first group has plenty of space (60 against 200). Moving a region from the second group to the first should provide a lower cost.
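
The arithmetic of this example can be checked with a short Python sketch. The cost formula below (the sum of each server's distance from the cluster-wide usage ratio) is a hypothetical stand-in chosen for illustration; it is not claimed to be the formula used by HeterogeneousRegionCountCostFunction. The helper names are likewise made up.

```python
# Limits implied by the rules in the example: rs0..rs9 -> 200, rs10..rs19 -> 50.
limits = {f"rs{i}": (200 if i < 10 else 50) for i in range(20)}
# Every RegionServer currently holds 60 regions.
regions = {host: 60 for host in limits}


def usage(host, regions):
    """Fraction of a server's limit that is in use (1.0 means exactly full)."""
    return regions[host] / limits[host]


def illustrative_cost(regions):
    """Sum of each server's distance from the cluster-wide usage ratio.

    Hypothetical formula for illustration only, not HBase's actual cost.
    """
    target = sum(regions.values()) / sum(limits.values())
    return sum(abs(usage(h, regions) - target) for h in regions)


def move(regions, src, dst):
    """Return a copy of the assignment with one region moved from src to dst."""
    after = dict(regions)
    after[src] -= 1
    after[dst] += 1
    return after


before = illustrative_cost(regions)
relieved = illustrative_cost(move(regions, "rs10", "rs0"))  # overloaded -> spare capacity
worsened = illustrative_cost(move(regions, "rs0", "rs10"))  # spare capacity -> overloaded
```

The cluster holds 1200 regions against a total limit of 2500, so the cluster-wide usage is 48%. Servers in the first group sit at 30% and servers in the second at 120%. Under this illustrative cost, moving a region from rs10 to rs0 gives a lower value than the starting state, and the opposite move gives a higher one.
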



--
This message was sent by Atlassian Jira
(v8.3.4#803005)