Posted to dev@hbase.apache.org by "Andrew Kyle Purtell (Jira)" <ji...@apache.org> on 2022/06/16 18:11:00 UTC

[jira] [Resolved] (HBASE-9601) Use a faster balancer when the imbalance is in the hundreds of regions

     [ https://issues.apache.org/jira/browse/HBASE-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Kyle Purtell resolved HBASE-9601.
----------------------------------------
    Resolution: Later

> Use a faster balancer when the imbalance is in the hundreds of regions
> ----------------------------------------------------------------------
>
>                 Key: HBASE-9601
>                 URL: https://issues.apache.org/jira/browse/HBASE-9601
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer
>    Affects Versions: 0.96.0
>            Reporter: Jean-Daniel Cryans
>            Priority: Major
>
> Something I'm noticing is that the new balancer is good at optimizing the balance when it needs to move a handful of regions, but once the imbalance is in the hundreds then it might be better if we used something speedier.
> For example, I have a small 5-node cluster that I use for testing with a lot of regions, 10k to be precise. The average load should be 2000 regions per RS, but killing one RS pushes the average to 2500, and when the RS comes back there are 2000 regions to move. When I call the balancer it spends 30 seconds to move only 150-300 regions, so getting back to a good cluster-wide balance takes me a few runs; if the balancer were doing it by itself (with the 5-minute wait), it could take hours. That may not be a bad thing in prod, although getting your locality back might be a good idea, as well as offloading the machines.
> So it seems that we need to be able to detect this situation and only balance based on load average, and maybe locality (so that the original regions would move back, hopefully).
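
A minimal sketch of the "balance only on load average" idea from the description above, worked on the 5-node, 10k-region example. This is not HBase's balancer code: the class LoadAverageBalancerSketch, its Move type, and the methods regionsToMove and plan are hypothetical names introduced for illustration, and the sketch ignores locality entirely. It counts regions per server, gives every server the floor of the average (plus one for the first few servers if the total does not divide evenly), and pairs overloaded servers with underloaded ones.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in HBase.
final class LoadAverageBalancerSketch {

    /** A planned transfer of `count` regions from one server to another. */
    static final class Move {
        final String from;
        final String to;
        final int count;

        Move(String from, String to, int count) {
            this.from = from;
            this.to = to;
            this.count = count;
        }
    }

    /** Target region count per server: floor(average), plus one for the first (total % n) servers. */
    static Map<String, Integer> targets(Map<String, Integer> loads) {
        Map<String, Integer> result = new LinkedHashMap<>();
        int n = loads.size();
        if (n == 0) {
            return result;
        }
        int total = 0;
        for (int v : loads.values()) {
            total += v;
        }
        int floor = total / n;
        int remainder = total % n;
        int i = 0;
        for (String server : loads.keySet()) {
            result.put(server, floor + (i < remainder ? 1 : 0));
            i++;
        }
        return result;
    }

    /** Number of regions sitting above their server's target; a caller could compare
     *  this against an assumed threshold to detect the "hundreds of regions" case. */
    static int regionsToMove(Map<String, Integer> loads) {
        Map<String, Integer> t = targets(loads);
        int surplus = 0;
        for (Map.Entry<String, Integer> e : loads.entrySet()) {
            surplus += Math.max(0, e.getValue() - t.get(e.getKey()));
        }
        return surplus;
    }

    /** Pairs overloaded servers with underloaded ones until every server is at its target. */
    static List<Move> plan(Map<String, Integer> loads) {
        Map<String, Integer> t = targets(loads);
        List<String> names = new ArrayList<>(loads.keySet());
        int[] delta = new int[names.size()];
        for (int i = 0; i < delta.length; i++) {
            delta[i] = loads.get(names.get(i)) - t.get(names.get(i));
        }
        List<Move> moves = new ArrayList<>();
        int donor = 0;
        int receiver = 0;
        while (true) {
            while (donor < delta.length && delta[donor] <= 0) donor++;
            while (receiver < delta.length && delta[receiver] >= 0) receiver++;
            if (donor == delta.length || receiver == delta.length) {
                break;
            }
            int count = Math.min(delta[donor], -delta[receiver]);
            moves.add(new Move(names.get(donor), names.get(receiver), count));
            delta[donor] -= count;
            delta[receiver] += count;
        }
        return moves;
    }
}
```

With loads of 2500 regions on each of four servers and 0 on the restarted one, regionsToMove reports 2000 and plan produces four moves of 500 regions each toward the empty server, matching the numbers in the description. Which 500 regions each server gives up is left open here; the description suggests locality could guide that choice.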


