Posted to issues@hbase.apache.org by "Clara Xiong (Jira)" <ji...@apache.org> on 2021/10/01 20:15:00 UTC

[jira] [Comment Edited] (HBASE-25625) StochasticBalancer CostFunctions needs a better way to evaluate resource distribution

    [ https://issues.apache.org/jira/browse/HBASE-25625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17423413#comment-17423413 ] 

Clara Xiong edited comment on HBASE-25625 at 10/1/21, 8:14 PM:
---------------------------------------------------------------

The problem surfaced again on a few different clusters where the balancer keeps getting triggered by co-hosted replicas: https://issues.apache.org/jira/browse/HBASE-26309

When the balancer has to satisfy other constraints, an even region count distribution simply cannot be guaranteed, as in the existing test case TestStochasticLoadBalancerRegionReplicaWithRacks. Because replica distribution has a much higher weight than region count skew, servers on racks with fewer servers tend to get more regions than servers on racks with more servers.

In this test case, servers 0 and 1 are on the same rack, while servers 2 and 3 each have their own rack, because the servers cannot be placed completely evenly across racks. The resulting region count distribution can be [2, 2, 4, 4] or [1, 3, 4, 4], so that no replicas of the same region are co-hosted on the first rack. That means the first two servers have to carry fewer regions each. With the current algorithm, the region count skew costs of the two plans are the same, because only the linear deviation from the ideal average is considered. It can get much more extreme when we have 8 servers for this test case: [1, 3, 3, 3, 5] or [2, 2, 3, 3, 5], depending on the random walk. But since the algorithm assigns both the same region count skew cost, the balancer can get stuck at the former. The more servers we have, as long as the region server counts per rack are not completely even (which happens all the time), the more variation we will see in the results depending on the random walk. And once we reach the extreme case, the balancer is stuck because the cost function says moving gains nothing.

I am proposing using the sum of squared deviations for the load cost functions, in line with the replica cost functions. We don't need the standard deviation itself, so we can keep it simple and fast.
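To make the difference concrete, here is a minimal standalone Java sketch (not the actual CostFunction code; the class and method names are made up for illustration) that computes both the current linear-deviation measure and the proposed sum-of-squared-deviations measure for the two candidate plans above:

    import java.util.Arrays;

    // Standalone sketch: compares the current linear-deviation measure with the
    // proposed sum-of-squared-deviations measure for two candidate region plans.
    public class SkewCostSketch {

        // Current approach: sum of |count - mean| per region server.
        static double linearDeviation(int[] regionsPerServer) {
            double mean = Arrays.stream(regionsPerServer).average().orElse(0);
            double sum = 0;
            for (int c : regionsPerServer) {
                sum += Math.abs(c - mean);
            }
            return sum;
        }

        // Proposed approach: sum of (count - mean)^2 per region server.
        static double squaredDeviation(int[] regionsPerServer) {
            double mean = Arrays.stream(regionsPerServer).average().orElse(0);
            double sum = 0;
            for (int c : regionsPerServer) {
                sum += (c - mean) * (c - mean);
            }
            return sum;
        }

        public static void main(String[] args) {
            int[] evenPlan = {2, 2, 4, 4};   // fewer regions on the two co-racked servers
            int[] skewedPlan = {1, 3, 4, 4}; // same rack constraint satisfied, worse skew

            // Linear deviation: 4.0 for both plans, so the balancer sees no gain in moving.
            System.out.println(linearDeviation(evenPlan) + " vs " + linearDeviation(skewedPlan));
            // Squared deviation: 4.0 vs 6.0, so the more even plan is strictly cheaper.
            System.out.println(squaredDeviation(evenPlan) + " vs " + squaredDeviation(skewedPlan));
        }
    }

With squared deviations the two plans are no longer cost-equivalent, so the random walk has a reason to move from [1, 3, 4, 4] toward [2, 2, 4, 4].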



> StochasticBalancer CostFunctions needs a better way to evaluate resource distribution
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-25625
>                 URL: https://issues.apache.org/jira/browse/HBASE-25625
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer, master
>            Reporter: Clara Xiong
>            Assignee: Clara Xiong
>            Priority: Major
>
> Currently, CostFunctions including RegionCountSkewCostFunction, PrimaryRegionCountSkewCostFunction and all the load cost functions calculate the unevenness of the distribution as the sum of deviations per region server. This simple implementation works when the cluster is small, but when the cluster gets larger with more region servers and regions, it doesn't work well with hot spots or a small number of unbalanced servers. The proposal is to use the standard deviation of the count per region server to capture the existence of a small portion of region servers with overwhelming load/allocation.
> TableSkewCostFunction uses the sum, over all tables, of the maximum per-server deviation as the measure of unevenness. This doesn't work in a very common operational scenario. Say we have 100 regions on 50 nodes, two on each, and we add 50 new nodes that have 0 each. The max deviation from the mean is 1, compared to 99 in the worst-case scenario of all 100 regions on a single server. The normalized cost is 1/99 ≈ 0.01, below the default threshold of 0.05, so the balancer wouldn't move. The proposal is to use the standard deviation of the count per region server to detect this scenario, generating a cost of 3.1/31 = 0.1 in this case.
> A patch is in testing and will follow shortly.
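As a hypothetical, numbers-only sketch of the scenario in the description above (the normalization here is a plain ratio of each measure to its worst-case value, which may not match the balancer's actual normalization code):

    import java.util.Arrays;

    public class TableSkewExample {

        // Population standard deviation of region counts per server.
        static double stdDev(int[] counts) {
            double mean = Arrays.stream(counts).average().orElse(0);
            double ss = 0;
            for (int c : counts) {
                ss += (c - mean) * (c - mean);
            }
            return Math.sqrt(ss / counts.length);
        }

        public static void main(String[] args) {
            int servers = 100, regions = 100;

            // 50 old nodes keep 2 regions each, 50 new nodes have 0.
            int[] counts = new int[servers];
            Arrays.fill(counts, 0, 50, 2);

            double mean = (double) regions / servers;  // 1.0

            // Max-deviation measure: 1 here vs 99 in the worst case (all regions on one node).
            System.out.println("max-deviation cost: " + (1.0 / (regions - mean)));          // ~0.01

            // Standard-deviation measure: 1.0 here vs ~9.95 in the worst case.
            int[] worst = new int[servers];
            worst[0] = regions;
            System.out.println("std-deviation cost: " + (stdDev(counts) / stdDev(worst)));  // ~0.1
        }
    }

The max-deviation cost (~0.01) stays below the 0.05 threshold even though half the cluster is empty, while the standard-deviation cost (~0.1) clears it.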



--
This message was sent by Atlassian Jira
(v8.3.4#803005)