Posted to issues@hbase.apache.org by "Xiaolin Ha (Jira)" <ji...@apache.org> on 2021/05/17 09:38:00 UTC

[jira] [Comment Edited] (HBASE-25739) TableSkewCostFunction need to use aggregated deviation

    [ https://issues.apache.org/jira/browse/HBASE-25739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346017#comment-17346017 ] 

Xiaolin Ha edited comment on HBASE-25739 at 5/17/21, 9:37 AM:
--------------------------------------------------------------

Hi [~claraxiong], in our discussion on [https://github.com/apache/hbase/pull/3260] you mentioned that you used the same aggregation as RegionCountSkewCostFunction for each table. If "hbase.master.loadbalance.bytable" is set to true, I think the cost of RegionCountSkewCostFunction is already computed per table; do you agree? On our production clusters we set "hbase.master.loadbalance.bytable" to true, and in the balancing results so far every table has been balanced well. So I'm confused by your remark that the byTable option doesn't work on your clusters. Could you explain a bit?



> TableSkewCostFunction need to use aggregated deviation
> ------------------------------------------------------
>
>                 Key: HBASE-25739
>                 URL: https://issues.apache.org/jira/browse/HBASE-25739
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Balancer, master
>            Reporter: Clara Xiong
>            Assignee: Clara Xiong
>            Priority: Major
>         Attachments: TEST-org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancerBalanceCluster.xml, org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancerBalanceCluster.txt
>
>
> TableSkewCostFunction uses the sum, over all tables, of the maximum deviation of the per-server region count as its measure of unevenness. This fails in a very common operational scenario. Say we have 100 regions on 50 nodes, two on each. We add 50 new nodes, each with 0 regions. The mean is now 1 region per server, so the max deviation from the mean is 1, compared to 99 in the worst case of all 100 regions on a single server. The normalized cost is 1/99 ≈ 0.010, below the default threshold of 0.05, so the balancer does not move any regions. The proposal is to use the aggregated deviation of the per-server region count to detect this scenario, which generates a cost of 100/198 ≈ 0.5 in this case.
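
The arithmetic in the description can be checked with a small sketch. This is not HBase code: the function names below are hypothetical, and the normalization (dividing by the value obtained when every region sits on one server) is an assumption inferred from the 1/99 and 100/198 figures quoted above. It handles a single table only.

```python
def max_deviation_cost(counts):
    """Largest deviation from the mean region count, normalized by the
    worst case (all regions on one server). Hypothetical helper."""
    total = sum(counts)
    mean = total / len(counts)
    observed = max(abs(c - mean) for c in counts)
    worst = total - mean  # one server holds everything
    return observed / worst if worst > 0 else 0.0


def aggregated_deviation_cost(counts):
    """Sum of absolute deviations from the mean, normalized by the same
    worst-case layout. Hypothetical helper."""
    total = sum(counts)
    n = len(counts)
    mean = total / n
    observed = sum(abs(c - mean) for c in counts)
    # One server is (total - mean) above the mean; the other n - 1
    # servers are each `mean` below it.
    worst = (total - mean) + (n - 1) * mean
    return observed / worst if worst > 0 else 0.0


# 100 regions: two on each of 50 old nodes, none on 50 new nodes.
counts = [2] * 50 + [0] * 50
print(round(max_deviation_cost(counts), 3))         # 1/99
print(round(aggregated_deviation_cost(counts), 3))  # 100/198
```

Under these assumptions the max-deviation cost stays below the 0.05 threshold mentioned in the description, while the aggregated cost is well above it.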


