Posted to issues@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2014/12/30 09:04:13 UTC

[jira] [Reopened] (HBASE-12762) Region with no hfiles will have the highest locality cost in LocalityCostFunction

     [ https://issues.apache.org/jira/browse/HBASE-12762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell reopened HBASE-12762:
------------------------------------

No fix version for 0.98 was set on this issue but it was committed there. 

TestShell has been failing on 0.98 since this change; please see https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/728 and https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/729

> Region with no hfiles will have the highest locality cost in LocalityCostFunction
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-12762
>                 URL: https://issues.apache.org/jira/browse/HBASE-12762
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer
>    Affects Versions: 0.99.2
>            Reporter: cuijianwei
>            Assignee: cuijianwei
>            Priority: Minor
>             Fix For: 1.0.0, 2.0.0, 0.98.10, 1.1.0
>
>         Attachments: HBASE-12762-trunk.patch
>
>
> The locality cost of a region is computed in LocalityCostFunction.cost as:
> {code}
> double cost() {
>         ...
>         int index = -1;
>         for (int j = 0; j < regionLocations.length; j++) {
>           if (regionLocations[j] >= 0 && regionLocations[j] == serverIndex) {
>             index = j;
>             break;
>           }
>         }
>         if (index < 0) {
>           cost += 1;  // ==> region with no hfiles will have the highest cost
>         } else {
>           cost += (double) index / (double) regionLocations.length;
>         }
>         ...
>     }
> {code}
> A region with no hfiles (such as an empty region) gets the highest cost, which represents the worst case: a region located on a server with no locality for its hfiles. However, this might actually be the best case, because there are no hlogs for the region. Although the absolute cost value won't affect the balancing process, would it be more reasonable to give such regions zero cost? For example:
> {code}
>    ...
>         if (index < 0) {
>           if (regionLocations.length > 0) { //  ==> only consider regions with hfiles
>             cost += 1;
>           }
>         } else {
>           cost += (double) index / (double) regionLocations.length;
>         }
>    ...
> {code} 
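
To make the difference between the two variants quoted above concrete, here is a minimal, self-contained sketch of the per-region cost term. It is not the actual LocalityCostFunction code: the class name LocalityCostSketch and the methods findIndex, currentCost and proposedCost are hypothetical, and it assumes regionLocations holds server indexes ordered from best to worst locality, with an empty array standing for a region that has no hfiles.

```java
public class LocalityCostSketch {

    // Hypothetical helper: position of serverIndex in regionLocations, or -1 if absent.
    static int findIndex(int[] regionLocations, int serverIndex) {
        for (int j = 0; j < regionLocations.length; j++) {
            if (regionLocations[j] >= 0 && regionLocations[j] == serverIndex) {
                return j;
            }
        }
        return -1;
    }

    // Per-region cost as in the first quoted snippet: a missing server always costs 1.
    static double currentCost(int[] regionLocations, int serverIndex) {
        int index = findIndex(regionLocations, serverIndex);
        if (index < 0) {
            return 1.0;
        }
        return (double) index / (double) regionLocations.length;
    }

    // Per-region cost as in the proposed snippet: a region with no hfiles
    // (assumed here to mean an empty regionLocations array) contributes 0.
    static double proposedCost(int[] regionLocations, int serverIndex) {
        int index = findIndex(regionLocations, serverIndex);
        if (index < 0) {
            return regionLocations.length > 0 ? 1.0 : 0.0;
        }
        return (double) index / (double) regionLocations.length;
    }

    public static void main(String[] args) {
        int[] empty = {};            // region with no hfiles
        int[] ranked = {3, 7, 5, 1}; // servers ordered by locality, best first

        System.out.println("empty, current:   " + currentCost(empty, 7));
        System.out.println("empty, proposed:  " + proposedCost(empty, 7));
        System.out.println("ranked, server 5: " + proposedCost(ranked, 5));
        System.out.println("ranked, server 9: " + proposedCost(ranked, 9));
    }
}
```

Under these assumptions the two variants differ only for the empty-array case; regions that have hfiles but are hosted on a server holding none of their blocks still get the maximum cost of 1.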


