Posted to issues@hbase.apache.org by "Bryan Beaudreault (JIRA)" <ji...@apache.org> on 2015/12/22 00:45:46 UTC

[jira] [Commented] (HBASE-9310) Remove slop for Stochastic load balancer

    [ https://issues.apache.org/jira/browse/HBASE-9310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067272#comment-15067272 ] 

Bryan Beaudreault commented on HBASE-9310:
------------------------------------------

Are we sure this is actually fixed?

All this patch seems to have done is make the StochasticLoadBalancer default its slop to 0.001F. Looking at the code, that is not enough to guarantee that the slop check is disabled.

Here is a log line from my cluster, where I have tried slop values of 0.001 and smaller, and the balancer still does not run:

{quote}
2015-12-21 23:38:01,161 TRACE org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer: Skipping load balancing because balanced cluster; servers=14 regions=4501 average=321.5 mostloaded=322 leastloaded=321
{quote}

At this point I looked at the code and saw:

{code}
float average = cs.getLoadAverage(); // for logging
int floor = (int) Math.floor(average * (1 - slop));
int ceiling = (int) Math.ceil(average * (1 + slop));
if (!(cs.getMaxLoad() > ceiling || cs.getMinLoad() < floor)) {
  NavigableMap<ServerAndLoad, List<HRegionInfo>> serversByLoad = cs.getServersByLoad();
  if (LOG.isTraceEnabled()) {
    // If nothing to balance, then don't say anything unless trace-level logging.
    LOG.trace("Skipping load balancing because balanced cluster; " +
      "servers=" + cs.getNumServers() +
      " regions=" + cs.getNumRegions() + " average=" + average +
      " mostloaded=" + serversByLoad.lastKey().getLoad() +
      " leastloaded=" + serversByLoad.firstKey().getLoad());
  }
  return false;
}
return true;
{code}

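For the log line above with slop=0.001, ceiling = Math.ceil(321.5 * 1.001) = Math.ceil(321.8215) = 322 and floor = Math.floor(321.5 * 0.999) = Math.floor(321.1785) = 321. My most loaded server has 322 regions and my least loaded has 321, so neither bound is exceeded and the balancer exits without running.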
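
To double-check the arithmetic, here is a minimal standalone sketch of the same floor/ceiling test. The class name SlopCheckSketch and the needsBalance signature are made up for illustration and are not HBase APIs; the numbers come from the log line above. Because the average is 321.5, floor and ceiling come out as 321 and 322 for any small non-negative slop, including 0, which would explain why lowering the value made no difference on my cluster.

```java
public class SlopCheckSketch {

    // Hypothetical helper that mirrors the floor/ceiling test quoted above.
    // Returns true when the cluster would be considered out of balance.
    static boolean needsBalance(float average, int maxLoad, int minLoad, float slop) {
        int floor = (int) Math.floor(average * (1 - slop));
        int ceiling = (int) Math.ceil(average * (1 + slop));
        return maxLoad > ceiling || minLoad < floor;
    }

    public static void main(String[] args) {
        // Values taken from the TRACE log line: average=321.5, mostloaded=322, leastloaded=321.
        float average = 321.5f;
        int mostLoaded = 322;
        int leastLoaded = 321;

        // Try the patched default, a smaller value, and no slop at all.
        for (float slop : new float[] {0.001f, 0.0001f, 0f}) {
            System.out.println("slop=" + slop
                + " needsBalance=" + needsBalance(average, mostLoaded, leastLoaded, slop));
        }
    }
}
```

Each of the three slop values reports needsBalance=false for this cluster, so the early exit is driven by the rounding of the average rather than by the slop setting.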

Since the cost functions already account for other factors such as request load and locality, we should be able to disable the slop check altogether.

Should I create a new JIRA or re-open this one?

> Remove slop for Stochastic load balancer
> ----------------------------------------
>
>                 Key: HBASE-9310
>                 URL: https://issues.apache.org/jira/browse/HBASE-9310
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer
>    Affects Versions: 0.98.0, 0.95.2
>            Reporter: Elliott Clark
>            Assignee: Elliott Clark
>             Fix For: 0.98.0, 0.96.0
>
>         Attachments: HBASE-9310-0.patch
>
>
> The new load balancer already has the idea of some slop built in.  We shouldn't have two layers of it.


