Posted to issues@hbase.apache.org by "Wang Qiang (JIRA)" <ji...@apache.org> on 2013/07/15 07:44:49 UTC
[jira] [Commented] (HBASE-8432) a table with unbalanced regions will balance indefinitely with the 'org.apache.hadoop.hbase.master.DefaultLoadBalancer'

    [ https://issues.apache.org/jira/browse/HBASE-8432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708251#comment-13708251 ] 

Wang Qiang commented on HBASE-8432:
-----------------------------------

    // If we still have regions to dish out, assign underloaded to max
    if (0 < regionsToMove.size()) {
      for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server :
        serversByLoad.entrySet()) {
        int regionCount = server.getKey().getLoad();
        if(regionCount >= max) {
          break;
        }
        addRegionPlan(regionsToMove, fetchFromTail,
          server.getKey().getServerName(), regionsToReturn);
        if (emptyRegionServerPresent) {
          fetchFromTail = !fetchFromTail;
        }
        if (regionsToMove.isEmpty()) {
          break;
        }
      }
    }
The code above should be changed to the code below. As written, it does not check the server's BalanceInfo, so regions already added to a server are not counted against max:

    // If we still have regions to dish out, assign underloaded to max
    if (0 < regionsToMove.size()) {
      for (Map.Entry<ServerAndLoad, List<HRegionInfo>> server :
        serversByLoad.entrySet()) {
        int regionCount = server.getKey().getLoad();
        if(regionCount >= max) {
          break;
        }
        BalanceInfo balanceInfo = serverBalanceInfo.get(server.getKey().getServerName());
        if(balanceInfo != null) {
          regionCount += balanceInfo.getNumRegionsAdded();
        }
        if(regionCount >= max) {
          continue;
        }
        addRegionPlan(regionsToMove, fetchFromTail,
          server.getKey().getServerName(), regionsToReturn);
        if (emptyRegionServerPresent) {
          fetchFromTail = !fetchFromTail;
        }
        if (regionsToMove.isEmpty()) {
          break;
        }
      }
    }
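
The difference between the two loops above can be seen in a stripped-down model. This is only a sketch under stated assumptions: BalancerSketch, assign, alreadyAdded and countAdded are hypothetical names that do not exist in HBase, server loads are plain integers, and each visited server receives at most one pending region (standing in for addRegionPlan). alreadyAdded plays the role of BalanceInfo.getNumRegionsAdded().

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BalancerSketch {
    /**
     * Hands out pending regions, one per server, visiting servers in map order
     * (assumed to be sorted by ascending original load). Returns the region
     * count of every server after the pass.
     *
     * countAdded = false models the original loop; true models the proposed fix.
     */
    static Map<String, Integer> assign(Map<String, Integer> load,
                                       Map<String, Integer> alreadyAdded,
                                       int pending, int max, boolean countAdded) {
        // Start from original load plus regions added earlier in the same run.
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : load.entrySet()) {
            result.put(e.getKey(), e.getValue() + alreadyAdded.getOrDefault(e.getKey(), 0));
        }
        for (Map.Entry<String, Integer> e : load.entrySet()) {
            if (pending == 0) {
                break;
            }
            int regionCount = e.getValue();
            if (regionCount >= max) {
                break; // servers are sorted, so the rest are at max as well
            }
            if (countAdded) {
                regionCount += alreadyAdded.getOrDefault(e.getKey(), 0);
                if (regionCount >= max) {
                    continue; // already filled up earlier in this run
                }
            }
            result.merge(e.getKey(), 1, Integer::sum);
            pending--;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> load = new LinkedHashMap<>();
        load.put("rs1", 0);
        load.put("rs2", 0);
        Map<String, Integer> added = new LinkedHashMap<>();
        added.put("rs1", 1); // rs1 already received one region in this run
        System.out.println(assign(load, added, 1, 1, false)); // prints {rs1=2, rs2=0}
        System.out.println(assign(load, added, 1, 1, true));  // prints {rs1=1, rs2=1}
    }
}
```

In this model the original loop pushes rs1 above max, which would leave it overloaded again on the next balancer run, while the patched loop skips rs1 and gives the region to rs2.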
                
> a table with unbalanced regions will balance indefinitely with the 'org.apache.hadoop.hbase.master.DefaultLoadBalancer'
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8432
>                 URL: https://issues.apache.org/jira/browse/HBASE-8432
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer
>    Affects Versions: 0.94.5
>         Environment: Linux 2.6.32-el5.x86_64
>            Reporter: Wang Qiang
>            Priority: Critical
>         Attachments: patch_20130425_01.txt
>
>
> A table in my cluster has unbalanced regions, as shown below (the cluster has 20 regionservers and the table has 12 regions):
> http://hadoopdev19.cm6:60030/	1
> http://hadoopdev8.cm6:60030/	2
> http://hadoopdev17.cm6:60030/	1
> http://hadoopdev12.cm6:60030/	1
> http://hadoopdev5.cm6:60030/	1
> http://hadoopdev9.cm6:60030/	1
> http://hadoopdev22.cm6:60030/	1
> http://hadoopdev11.cm6:60030/	1
> http://hadoopdev21.cm6:60030/	1
> http://hadoopdev16.cm6:60030/	1
> http://hadoopdev10.cm6:60030/	1
> With 'org.apache.hadoop.hbase.master.DefaultLoadBalancer', the table is still unbalanced after 5 rounds of load balancing:
> http://hadoopdev3.cm6:60030/	1
> http://hadoopdev20.cm6:60030/	1
> http://hadoopdev4.cm6:60030/	2
> http://hadoopdev18.cm6:60030/	1
> http://hadoopdev12.cm6:60030/	1
> http://hadoopdev14.cm6:60030/	1
> http://hadoopdev15.cm6:60030/	1
> http://hadoopdev6.cm6:60030/	1
> http://hadoopdev13.cm6:60030/	1
> http://hadoopdev11.cm6:60030/	1
> http://hadoopdev10.cm6:60030/	1
> http://hadoopdev19.cm6:60030/	1
> http://hadoopdev17.cm6:60030/	1
> http://hadoopdev8.cm6:60030/	1
> http://hadoopdev5.cm6:60030/	1
> http://hadoopdev12.cm6:60030/	1
> http://hadoopdev22.cm6:60030/	1
> http://hadoopdev11.cm6:60030/	1
> http://hadoopdev21.cm6:60030/	1
> http://hadoopdev7.cm6:60030/	2
> http://hadoopdev10.cm6:60030/	1
> http://hadoopdev16.cm6:60030/	1
> http://hadoopdev3.cm6:60030/	1
> http://hadoopdev20.cm6:60030/	1
> http://hadoopdev4.cm6:60030/	1
> http://hadoopdev18.cm6:60030/	2
> http://hadoopdev12.cm6:60030/	1
> http://hadoopdev14.cm6:60030/	1
> http://hadoopdev15.cm6:60030/	1
> http://hadoopdev6.cm6:60030/	1
> http://hadoopdev13.cm6:60030/	1
> http://hadoopdev11.cm6:60030/	1
> http://hadoopdev10.cm6:60030/	1
> http://hadoopdev19.cm6:60030/	1
> http://hadoopdev8.cm6:60030/	1
> http://hadoopdev17.cm6:60030/	1
> http://hadoopdev12.cm6:60030/	1
> http://hadoopdev5.cm6:60030/	1
> http://hadoopdev22.cm6:60030/	1
> http://hadoopdev11.cm6:60030/	1
> http://hadoopdev7.cm6:60030/	1
> http://hadoopdev21.cm6:60030/	2
> http://hadoopdev16.cm6:60030/	1
> http://hadoopdev10.cm6:60030/	1
> http://hadoopdev3.cm6:60030/	1
> http://hadoopdev20.cm6:60030/	1
> http://hadoopdev18.cm6:60030/	1
> http://hadoopdev4.cm6:60030/	1
> http://hadoopdev12.cm6:60030/	1
> http://hadoopdev15.cm6:60030/	1
> http://hadoopdev14.cm6:60030/	2
> http://hadoopdev6.cm6:60030/	1
> http://hadoopdev13.cm6:60030/	1
> http://hadoopdev11.cm6:60030/	1
> http://hadoopdev10.cm6:60030/	1
> From the output above, we can also see that some regions were moved even though they did not need to be. Stepping into 'org.apache.hadoop.hbase.master.DefaultLoadBalancer.balanceCluster()', I found that 'maxToTake' is calculated incorrectly.
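
For reference, the imbalance in the listings above can be checked with a little arithmetic. This is a sketch that assumes a balanced table keeps every server between the floor and the ceiling of the average region count; RegionBounds and bounds are hypothetical names, not HBase methods.

```java
class RegionBounds {
    /** Returns {min, max} regions per server under the floor/ceiling assumption. */
    static int[] bounds(int numRegions, int numServers) {
        int min = numRegions / numServers;                        // floor of the average
        int max = (numRegions % numServers == 0) ? min : min + 1; // ceiling of the average
        return new int[] {min, max};
    }

    public static void main(String[] args) {
        int[] b = bounds(12, 20); // 12 regions, 20 regionservers, as reported
        System.out.println("min=" + b[0] + " max=" + b[1]); // prints "min=0 max=1"
    }
}
```

Under that assumption, any server holding 2 regions of this table is above max, which matches the listings: after every run exactly one server still holds 2 regions.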
