You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2018/03/13 15:43:00 UTC

[jira] [Commented] (HBASE-20186) Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup

    [ https://issues.apache.org/jira/browse/HBASE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16397127#comment-16397127 ] 

Ted Yu commented on HBASE-20186:
--------------------------------

bq. could be reduced to O (m * logn)

Seems to be meaningful speedup.

Looking forward to the patch.

> Improve RSGroupBasedLoadBalancer#balanceCluster() to be more efficient when calculating cluster state for each rsgroup
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-20186
>                 URL: https://issues.apache.org/jira/browse/HBASE-20186
>             Project: HBase
>          Issue Type: Improvement
>          Components: rsgroup
>         Environment: In RSGroupBasedLoadBalancer
> {code}
> public List<RegionPlan> balanceCluster(Map<ServerName, List<RegionInfo>> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions which have been already placed according to the rsgroup assignment, and it is calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite efficient, as it does not make good use of the fact that servers in RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
>     if(curr.getAddress().equals(sName)) {
>       groupClusterState.put(curr, correctedState.get(curr));
>     }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each rsgroup in average, the code above has time complexity as O(m * n), while using TreeSet's contains(), the time complexity could be reduced to O (m * logn).
>            Reporter: Xiang Li
>            Assignee: Xiang Li
>            Priority: Minor
>
> In RSGroupBasedLoadBalancer
> {code}
> public List<RegionPlan> balanceCluster(Map<ServerName, List<RegionInfo>> clusterState)
> {code}
> The second half of the function is to calculate region move plan for regions which have been already placed according to the rsgroup assignment, and it is calculated one rsgroup after another.
> The following logic to check if a server belongs to the rsgroup is not quite efficient, as it does not make good use of the fact that servers in RSGroupInfo is a TreeSet.
> {code}
> for (Address sName : info.getServers()) {
>   for(ServerName curr: clusterState.keySet()) {
>     if(curr.getAddress().equals(sName)) {
>       groupClusterState.put(curr, correctedState.get(curr));
>     }
>   }
> }
> {code}
> Given there are m region servers in the cluster and n region servers for each rsgroup in average, the code above has time complexity as O(m * n), while using TreeSet's contains(), the time complexity could be reduced to O (m * logn).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)