Posted to issues@hbase.apache.org by "Dave Revell (JIRA)" <ji...@apache.org> on 2012/06/30 20:32:42 UTC

[jira] [Created] (HBASE-6298) Region balancer not balancing

Dave Revell created HBASE-6298:
----------------------------------

             Summary: Region balancer not balancing
                 Key: HBASE-6298
                 URL: https://issues.apache.org/jira/browse/HBASE-6298
             Project: HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 0.94.0
            Reporter: Dave Revell
         Attachments: master_startup.log

Despite regions being unbalanced, the load balancer takes no action. On my cluster the least-loaded regionserver has 33 regions and the most-loaded regionserver has 44 regions. My cluster has 1084 regions and 29 servers. It might be relevant that a 30th server used to belong to the cluster but was removed.

The master log has some strange entries when the balancer runs. The attached log file was generated by restarting the master, then running "balancer" in the shell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6298) Region balancer not balancing

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404637#comment-13404637 ] 

stack commented on HBASE-6298:
------------------------------

It's disturbing that the master is so bad at math.  Do you see where it went awry earlier in the logs?  Where the master had regions as > 1k and then all of a sudden started thinking there was only one region out on the cluster?  Good on you D.
                

[jira] [Commented] (HBASE-6298) Region balancer not balancing

Posted by "Dave Revell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404634#comment-13404634 ] 

Dave Revell commented on HBASE-6298:
------------------------------------

@stack: yes, the problem was occurring before the restart. That's why I restarted the master, to try to get the master to start balancing again.
                

[jira] [Commented] (HBASE-6298) Region balancer not balancing

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405143#comment-13405143 ] 

Zhihong Ted Yu commented on HBASE-6298:
---------------------------------------

This should have been asked on the dev mailing list first.
There is a slop factor used by the load balancer (defaulting to 20%):
{code}
  public void setConf(Configuration conf) {
    this.slop = conf.getFloat("hbase.regions.slop", (float) 0.2);
{code}
The average number of regions per server is about 37 for Dave's cluster (1084 / 29).
(44-37)/37=19%, which is within the 20% slop.

@Dave:
I suggest you tighten "hbase.regions.slop"
                

[jira] [Commented] (HBASE-6298) Region balancer not balancing

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404939#comment-13404939 ] 

stack commented on HBASE-6298:
------------------------------

@Dave Want to try what the lads suggest?  Change config and restart master.  See if that fixes it?  If so, there is a problem w/ the table balancer it seems.
                

[jira] [Comment Edited] (HBASE-6298) Region balancer not balancing

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405143#comment-13405143 ] 

Zhihong Ted Yu edited comment on HBASE-6298 at 7/2/12 5:25 PM:
---------------------------------------------------------------

This should have been asked on the dev mailing list first.
There is a slop factor used by the load balancer (defaulting to 20%):
{code}
  public void setConf(Configuration conf) {
    this.slop = conf.getFloat("hbase.regions.slop", (float) 0.2);
{code}
The log message cited above came from this check:
{code}
    int floor = (int) Math.floor(average * (1 - slop));
    int ceiling = (int) Math.ceil(average * (1 + slop));
    if (serversByLoad.lastKey().getLoad() <= ceiling &&
       serversByLoad.firstKey().getLoad() >= floor) {
{code}

The average number of regions per server is about 37 for Dave's cluster (1084 / 29).
(44-37)/37=19%, which is within the 20% slop.

@Dave:
I suggest you tighten "hbase.regions.slop"
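
To make the arithmetic concrete, here is a minimal standalone sketch of the check quoted above, applied to the numbers from the issue description (1084 regions, 29 servers, least-loaded 33, most-loaded 44). The class and method names are made up for illustration and this is not the HBase source; only the floor/ceiling formula is taken from the snippet above.

```java
public class SlopCheck {
    // Mirrors the floor/ceiling test quoted above; the names here are illustrative only.
    static boolean withinSlop(int regions, int servers, int leastLoaded, int mostLoaded, float slop) {
        float average = (float) regions / servers;
        int floor = (int) Math.floor(average * (1 - slop));
        int ceiling = (int) Math.ceil(average * (1 + slop));
        return mostLoaded <= ceiling && leastLoaded >= floor;
    }

    public static void main(String[] args) {
        // Numbers from the issue description with the default slop of 0.2.
        // The average is about 37.38, so floor = 29 and ceiling = 45.
        System.out.println(withinSlop(1084, 29, 33, 44, 0.2f)); // true: 33 >= 29 and 44 <= 45
    }
}
```

Since both extremes sit inside the band, the check passes and the balancer considers the cluster balanced.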
                

[jira] [Resolved] (HBASE-6298) Region balancer not balancing

Posted by "Dave Revell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dave Revell resolved HBASE-6298.
--------------------------------

    Resolution: Invalid

Closing as invalid because there was never really a bug.
                

[jira] [Commented] (HBASE-6298) Region balancer not balancing

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404630#comment-13404630 ] 

stack commented on HBASE-6298:
------------------------------

Was the problem evident before the master restart, Dave?  It's plain that the new master joining the cluster is getting a wonky picture of cluster state:

{code}
2012-06-30 14:11:46,326 - INFO  [IPC Server handler 4 on 7050:DefaultLoadBalancer@248] - Skipping load balancing because balanced cluster; servers=29 regions=1 average=0.03448276 mostloaded=1 leastloaded=0
{code}
                

[jira] [Commented] (HBASE-6298) Region balancer not balancing

Posted by "rajeshbabu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404717#comment-13404717 ] 

rajeshbabu commented on HBASE-6298:
-----------------------------------

Load balancing is being skipped because the "hbase.master.loadbalance.bytable" property is set to true (it is true by default). When it is true, the balancer runs table by table: only that table's regions and the corresponding servers are considered for balancing. In these scenarios, sometimes even starting a new region server makes no difference. If you set "hbase.master.loadbalance.bytable" to false, load will be balanced across all servers.
Hope that is clear.
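
A rough sketch of the difference, under the assumption that by-table mode simply applies the same slop check to each table's regions separately. The table names and per-server counts below are invented for illustration, and the code is not taken from HBase.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ByTableSketch {
    // Same floor/ceiling test as the balancer snippet quoted in this thread.
    static boolean withinSlop(int[] regionsPerServer, float slop) {
        int total = 0, least = Integer.MAX_VALUE, most = Integer.MIN_VALUE;
        for (int n : regionsPerServer) {
            total += n;
            least = Math.min(least, n);
            most = Math.max(most, n);
        }
        float average = (float) total / regionsPerServer.length;
        int floor = (int) Math.floor(average * (1 - slop));
        int ceiling = (int) Math.ceil(average * (1 + slop));
        return most <= ceiling && least >= floor;
    }

    public static void main(String[] args) {
        // Hypothetical 4-server cluster; each array is one table's region count per server.
        // Three single-region tables all happen to live on the first server.
        Map<String, int[]> tables = new LinkedHashMap<>();
        tables.put("t1", new int[] { 1, 0, 0, 0 });
        tables.put("t2", new int[] { 1, 0, 0, 0 });
        tables.put("t3", new int[] { 1, 0, 0, 0 });

        int[] clusterWide = new int[4];
        for (Map.Entry<String, int[]> e : tables.entrySet()) {
            // By-table mode: each table is judged on its own regions only.
            System.out.println(e.getKey() + " balanced=" + withinSlop(e.getValue(), 0.2f));
            for (int i = 0; i < clusterWide.length; i++) {
                clusterWide[i] += e.getValue()[i];
            }
        }
        // Cluster-wide mode: one check over the summed counts {3, 0, 0, 0}.
        System.out.println("cluster balanced=" + withinSlop(clusterWide, 0.2f));
    }
}
```

Each single-region table passes on its own (floor 0, ceiling 1), which is the same shape as the "regions=1 ... mostloaded=1 leastloaded=0" log line, while the summed counts fail the check.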
                

[jira] [Commented] (HBASE-6298) Region balancer not balancing

Posted by "Dave Revell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405399#comment-13405399 ] 

Dave Revell commented on HBASE-6298:
------------------------------------

I changed hbase.regions.slop to 0.1 and the imbalance lessened. So I see no reason to suspect a bug here. I was just unduly scared by the log messages and default slop setting.

Thanks @Zhihong and everyone for your help.
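
For anyone landing here with the same question, a small sketch of how the tolerated band narrows when the slop is lowered, using the region counts from the description. It reuses the floor/ceiling formula from Zhihong's comment; the class name and the two slop values compared are arbitrary choices for illustration.

```java
public class SlopSweep {
    // Returns {floor, ceiling} for a given average and slop, per the formula quoted in this thread.
    static int[] band(float average, float slop) {
        int floor = (int) Math.floor(average * (1 - slop));
        int ceiling = (int) Math.ceil(average * (1 + slop));
        return new int[] { floor, ceiling };
    }

    public static void main(String[] args) {
        float average = 1084f / 29; // about 37.38 regions per server
        int least = 33, most = 44;  // from the issue description
        for (float slop : new float[] { 0.2f, 0.1f }) {
            int[] b = band(average, slop);
            boolean balanced = most <= b[1] && least >= b[0];
            System.out.println("slop=" + slop + " floor=" + b[0] + " ceiling=" + b[1]
                + " consideredBalanced=" + balanced);
        }
    }
}
```

With 0.2 the band is 29 to 45, so 33 and 44 both fall inside. With 0.1 it is 33 to 42, so a server with 44 regions is over the ceiling and the balancer has work to do.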
                

[jira] [Updated] (HBASE-6298) Region balancer not balancing

Posted by "Dave Revell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dave Revell updated HBASE-6298:
-------------------------------

    Attachment: master_startup.log

Attached log file
                

[jira] [Commented] (HBASE-6298) Region balancer not balancing

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404890#comment-13404890 ] 

ramkrishna.s.vasudevan commented on HBASE-6298:
-----------------------------------------------

@Dave/Stack
As Rajesh suggested, please see the hbase.master.loadbalance.bytable property. I think you have 31 tables? A few with 86 regions, a few with 178 regions and a few with 1 region.
                