Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2011/03/01 20:11:37 UTC

[jira] Created: (HBASE-3586) Randomize the selection of regions to balance

Randomize the selection of regions to balance
---------------------------------------------

                 Key: HBASE-3586
                 URL: https://issues.apache.org/jira/browse/HBASE-3586
             Project: HBase
          Issue Type: Improvement
    Affects Versions: 0.90.1
            Reporter: Jean-Daniel Cryans
            Priority: Critical
             Fix For: 0.90.2


Currently LoadBalancer goes through the list of regions per RS and grabs the first few to balance. This is not bad in itself, but that list is often naturally sorted, since an RS that boots opens its regions in a sequential, sorted order (the list comes from .META.). This means we balance regions starting from an almost sorted list.

We discovered this because one of our internal users created a new table whose name starts with the letter "p". It has grown to 100 regions in the last few hours, and all of them are served by 1 region server. Looking at the master's log, the balancer has moved as many regions off that region server, but they are all from the same table, whose name starts with the letter "a" (and the regions that were moved all come one after the other).

The part of the code that should be modified is:
{code}
for (HRegionInfo hri: regions) {
  // Don't rebalance meta regions.
  if (hri.isMetaRegion()) continue; 
  regionsToMove.add(new RegionPlan(hri, serverInfo, null));
  numTaken++;
  if (numTaken >= numToOffload) break;
}
{code}
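
As a rough illustration of the randomization idea (not the committed patch), the loop above could shuffle a copy of the region list before taking entries from it. The Region class, the pickRegionsToMove method and the sample table names below are hypothetical stand-ins for HRegionInfo and the LoadBalancer internals, so that the sketch compiles on its own:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class RandomRegionPicker {

    /** Hypothetical stand-in for HRegionInfo: a name plus a meta flag. */
    static final class Region {
        final String name;
        final boolean meta;

        Region(String name, boolean meta) {
            this.name = name;
            this.meta = meta;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    /**
     * Picks up to numToOffload non-meta regions in random order, instead of
     * taking the first few entries of the (often sorted) input list.
     */
    static List<Region> pickRegionsToMove(List<Region> regions, int numToOffload, Random rng) {
        // Shuffle a copy so that the caller's list keeps its order.
        List<Region> shuffled = new ArrayList<>(regions);
        Collections.shuffle(shuffled, rng);

        List<Region> regionsToMove = new ArrayList<>();
        for (Region r : shuffled) {
            if (regionsToMove.size() >= numToOffload) break;
            // Don't rebalance meta regions.
            if (r.meta) continue;
            regionsToMove.add(r);
        }
        return regionsToMove;
    }

    public static void main(String[] args) {
        // One server carrying a meta region plus regions of tables "a" and "p".
        List<Region> regions = new ArrayList<>();
        regions.add(new Region(".META.", true));
        for (int i = 0; i < 5; i++) regions.add(new Region("a," + i, false));
        for (int i = 0; i < 5; i++) regions.add(new Region("p," + i, false));

        // The picked regions vary from run to run and are no longer a sorted prefix.
        System.out.println(pickRegionsToMove(regions, 4, new Random()));
    }
}
```

Unlike the original loop, this sketch checks the count before adding a region, so a numToOffload of 0 moves nothing. Passing the Random in explicitly lets a test use a fixed seed.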

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003823#comment-13003823 ] 

stack commented on HBASE-3586:
------------------------------

Thanks for review Ted.

> Improve the selection of regions to balance
> -------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: 3586-randomize-v2.txt, 3586-randomize.txt, HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch, hbase-3586-table-creation.txt, hbase-3586-with-sort.txt

[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003110#comment-13003110 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

Although random selection statistically achieves balance, I still prefer a deterministic approach.
Consider the following variant of my patches.
Instead of using the following loop to fill out regionsToMove:
{code}
      for (int i = sz-1; i >= 0; i--) {
        HRegionInfo hri = regions.get(i);
{code}
We alternate between the head and tail of regions (picking both young and old ones).

We still keep the following sort():
{code}
    Collections.sort(regionsToMove, rpComparator);
{code}
We then iterate through underloadedServers repeatedly, alternating between two actions:
picking one region from the head of regionsToMove in passes 1, 3, 5, etc.
picking one region from the tail of regionsToMove in passes 2, 4, 6, etc.

E.g. suppose RS1 has regions with region IDs 42, 54, 105 and 201,
and RS2 has regions with region IDs 34, 104, 110 and 154.
Suppose we need to offload some regions to RS3 and RS4, which don't carry any regions yet.

regionsToMove would contain regions (201, 42, 154, 34) before sorting and regions (201, 154, 42, 34) after sorting.
Then we assign
region 201 to RS3, region 154 to RS4 (pass 1, from the head)
region 34 to RS3, region 42 to RS4 (pass 2, from the tail)

Please comment.
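
The alternating scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (AlternatingBalancerSketch, pickHeadAndTail, assign) are made up for the example, regions are reduced to their region IDs, each server's list is assumed to be sorted by ascending ID, and rpComparator is assumed to sort by descending region ID (youngest first), which is what the worked example implies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AlternatingBalancerSketch {

    /**
     * Takes up to n region IDs from a list sorted by ascending ID,
     * alternating between the tail (youngest) and the head (oldest).
     */
    static List<Long> pickHeadAndTail(List<Long> sortedRegionIds, int n) {
        List<Long> picked = new ArrayList<>();
        int head = 0;
        int tail = sortedRegionIds.size() - 1;
        boolean fromTail = true;
        while (picked.size() < n && head <= tail) {
            picked.add(fromTail ? sortedRegionIds.get(tail--) : sortedRegionIds.get(head++));
            fromTail = !fromTail;
        }
        return picked;
    }

    /**
     * Hands regions to the underloaded servers in passes: odd passes take
     * from the head of the sorted list, even passes take from the tail.
     */
    static Map<String, List<Long>> assign(List<Long> regionsToMove, List<String> servers) {
        List<Long> sorted = new ArrayList<>(regionsToMove);
        // Assumed ordering: largest (youngest) region ID first.
        Collections.sort(sorted, Collections.reverseOrder());

        Map<String, List<Long>> plan = new LinkedHashMap<>();
        for (String s : servers) plan.put(s, new ArrayList<>());
        if (servers.isEmpty()) return plan; // nothing to assign to

        int head = 0;
        int tail = sorted.size() - 1;
        boolean fromHead = true;
        while (head <= tail) {
            for (String s : servers) {
                if (head > tail) break;
                plan.get(s).add(fromHead ? sorted.get(head++) : sorted.get(tail--));
            }
            fromHead = !fromHead;
        }
        return plan;
    }

    public static void main(String[] args) {
        List<Long> regionsToMove = new ArrayList<>();
        regionsToMove.addAll(pickHeadAndTail(Arrays.asList(42L, 54L, 105L, 201L), 2)); // RS1
        regionsToMove.addAll(pickHeadAndTail(Arrays.asList(34L, 104L, 110L, 154L), 2)); // RS2
        System.out.println(regionsToMove); // prints [201, 42, 154, 34]
        System.out.println(assign(regionsToMove, Arrays.asList("RS3", "RS4")));
        // prints {RS3=[201, 34], RS4=[154, 42]}
    }
}
```

With the region IDs from the example, the sketch reproduces the assignment listed above: RS3 receives 201 and 34, RS4 receives 154 and 42.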


[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003093#comment-13003093 ] 

stack commented on HBASE-3586:
------------------------------

Yeah, agree w/ Ryan that we should try random as a baseline.  Let's do other policies in a different issue (thanks for testing your patch, Ted, and finding that it may be suboptimal).




[jira] Updated: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3586:
--------------------------

    Attachment: hbase-3586-with-sort.txt

Added import statement.


[jira] Updated: (HBASE-3586) Improve the selection of regions to balance

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-3586:
----------------------------------

       Tags:   (was: noob)
    Summary: Improve the selection of regions to balance  (was: Randomize the selection of regions to balance)


[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003612#comment-13003612 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

Random selection approach would make debugging harder.

A patch using random selection is welcome.


[jira] Updated: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-3586:
----------------------------------

    Attachment:     (was: HBASE-3586-by-region-age.patch)


[jira] Updated: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-3586:
----------------------------------

    Attachment: HBASE-3586-by-region-age.patch

Something like so?


[jira] Reopened: (HBASE-3586) Improve the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu reopened HBASE-3586:
---------------------------


The patch actually aggravates unbalanced load in a situation where one or more region servers weren't assigned (any) region and a table with multiple regions is created.
All the new regions from this table would be assigned to the region servers which didn't carry any region.


[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003074#comment-13003074 ] 

ryan rawson commented on HBASE-3586:
------------------------------------

Can you try a random approach? Often random can be more predictable and not
have weak edge cases that different use patterns can tickle.



[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006283#comment-13006283 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

I ran the random region selector on staging cluster (15 region servers, 4600 regions).
In terms of region count, load is balanced.
I observed that a few region servers received around 0 requests even though they carried some regions of the table which was actively written to.


[jira] Commented: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001047#comment-13001047 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

See my comment at 29/Jan/11 00:53 in HBASE-3373


[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003686#comment-13003686 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

I feel sweat on my back :-)

I am still a beginner. Let's make HBase better together.


[jira] Commented: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002354#comment-13002354 ] 

Andrew Purtell commented on HBASE-3586:
---------------------------------------

@Ted: So maybe you should post a patch here? :-)

> Randomize the selection of regions to balance
> ---------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch
>
>

[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005589#comment-13005589 ] 

Hudson commented on HBASE-3586:
-------------------------------

Integrated in HBase-TRUNK #1781 (See [https://hudson.apache.org/hudson/job/HBase-TRUNK/1781/])
    

> Improve the selection of regions to balance
> -------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: 3586-randomize-v2.txt, 3586-randomize.txt, HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch, hbase-3586-table-creation.txt, hbase-3586-with-sort.txt
>
>

[jira] Updated: (HBASE-3586) Improve the selection of regions to balance

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3586:
-------------------------

    Attachment: 3586-randomize.txt

How is this? It adds a test too.
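
The attached patch is not reproduced in this thread, so what follows is only a minimal sketch of what randomized selection could look like, not the patch itself. It assumes the per-server region list can be copied and shuffled before the loop from the issue description runs. `RandomRegionPicker`, `pick`, and `isMeta` are hypothetical names, and plain strings stand in for `HRegionInfo`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch: region names stand in for HRegionInfo objects. */
final class RandomRegionPicker {

  /** Picks up to numToOffload non-meta regions in random order. */
  static List<String> pick(List<String> regions, int numToOffload, Random rng) {
    // Copy first so the caller's (naturally sorted) list is left untouched.
    List<String> shuffled = new ArrayList<String>(regions);
    Collections.shuffle(shuffled, rng);

    List<String> regionsToMove = new ArrayList<String>();
    for (String region : shuffled) {
      if (regionsToMove.size() >= numToOffload) break;
      // Don't rebalance meta regions.
      if (isMeta(region)) continue;
      regionsToMove.add(region);
    }
    return regionsToMove;
  }

  // Assumption for this sketch only: catalog regions are recognized by name prefix.
  static boolean isMeta(String region) {
    return region.startsWith(".META.") || region.startsWith("-ROOT-");
  }
}
```

Passing the `Random` in makes the choice reproducible in a test; with a sorted input such as regions of table "a" followed by regions of table "p", the picked set is no longer biased toward the front of the list.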

> Improve the selection of regions to balance
> -------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: 3586-randomize.txt, HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch, hbase-3586-table-creation.txt, hbase-3586-with-sort.txt
>
>

[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003092#comment-13003092 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

One of the goals of the load balancer is to minimize the number of region movements so that scanner timeouts are minimized.
The current approach of balancing young regions aligns better with this goal.
With the random approach, we may need to keep moving some regions to achieve a balanced load.

> Improve the selection of regions to balance
> -------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch, hbase-3586-table-creation.txt, hbase-3586-with-sort.txt
>
>

[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003600#comment-13003600 ] 

stack commented on HBASE-3586:
------------------------------

I'd suggest we check in random for now, then experiment thereafter?

> Improve the selection of regions to balance
> -------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch, hbase-3586-table-creation.txt, hbase-3586-with-sort.txt
>
>

[jira] Updated: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3586:
--------------------------

    Attachment:     (was: hbase-3586-with-sort.txt)

> Randomize the selection of regions to balance
> ---------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch, hbase-3586-with-sort.txt
>
>

[jira] Commented: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001093#comment-13001093 ] 

Jean-Daniel Cryans commented on HBASE-3586:
-------------------------------------------

Ah ok that wasn't clear, well let's try it out!

> Randomize the selection of regions to balance
> ---------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.90.2
>
>

[jira] Updated: (HBASE-3586) Improve the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3586:
--------------------------

    Attachment: hbase-3586-table-creation.txt

> Improve the selection of regions to balance
> -------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch, hbase-3586-table-creation.txt, hbase-3586-with-sort.txt
>
>

[jira] Updated: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-3586:
----------------------------------

    Attachment: HBASE-3586-by-region-age.patch

> Randomize the selection of regions to balance
> ---------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3586-by-region-age.patch
>
>

[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006493#comment-13006493 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

Right.
I am continuing my effort toward better load balancing through HBASE-3507, etc.

> Improve the selection of regions to balance
> -------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: 3586-randomize-v2.txt, 3586-randomize.txt, HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch, hbase-3586-table-creation.txt, hbase-3586-with-sort.txt
>
>

[jira] Updated: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3586:
--------------------------

    Attachment: hbase-3586-with-sort.txt

Per Andrew's request, here is my patch.
Pardon any whitespace issues.

> Randomize the selection of regions to balance
> ---------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch, hbase-3586-with-sort.txt
>
>

[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003075#comment-13003075 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

@Ryan: I think you're referring to my first patch. I partially agree. We may provide (at least two) policies: one favoring moving young regions and the other doing random region selection.

I think my second patch establishes the conditions for the first to function as expected.
I would like to hear about other usage patterns that are not covered by either patch.

> Improve the selection of regions to balance
> -------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch, hbase-3586-table-creation.txt, hbase-3586-with-sort.txt
>
>

[jira] Commented: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001087#comment-13001087 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

That's what I meant. LoadBalancer should select/move young regions.
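
As a rough illustration of the "young regions first" idea (not the attached patch), the sketch below sorts a copy of a server's regions by creation time, newest first, and takes the first few. `YoungRegionPicker`, `Region`, and `createdAt` are hypothetical names; the sketch simply assumes each region exposes a creation timestamp as a `long`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of selecting the youngest regions to offload. */
final class YoungRegionPicker {

  /** Stand-in for HRegionInfo: a name plus an assumed creation timestamp. */
  static final class Region {
    final String name;
    final long createdAt;

    Region(String name, long createdAt) {
      this.name = name;
      this.createdAt = createdAt;
    }
  }

  /** Returns up to numToOffload regions, most recently created first. */
  static List<Region> pick(List<Region> regions, int numToOffload) {
    // Sort a copy so the caller's list keeps its original order.
    List<Region> byAge = new ArrayList<Region>(regions);
    Collections.sort(byAge, new Comparator<Region>() {
      public int compare(Region a, Region b) {
        return Long.compare(b.createdAt, a.createdAt); // newest first
      }
    });
    int n = Math.max(0, Math.min(numToOffload, byAge.size()));
    return new ArrayList<Region>(byAge.subList(0, n));
  }
}
```

In the scenario from the issue description, the regions of the freshly created "p" table would sort ahead of the older "a" regions, so they would be the ones offered for movement.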

> Randomize the selection of regions to balance
> ---------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.90.2
>
>

[jira] Commented: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001075#comment-13001075 ] 

Jean-Daniel Cryans commented on HBASE-3586:
-------------------------------------------

You're talking about sorting regionsToMove by creation time; in this JIRA I'm talking about filling that structure by choosing regions in a slightly different way. Unless that's what you meant too: that we should choose regions from those lists according to their creation time?

> Randomize the selection of regions to balance
> ---------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.90.2
>
>

[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003099#comment-13003099 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

I assume the above situation happened without my first patch :-)
Can you describe how that server came to have a lower-than-average number of regions?

There are two scenarios which would make my first patch suboptimal:
1. After the cluster starts up, some region servers aren't immediately detected by the master as being online. Therefore they carry no regions.
2. A region server crashes and is brought back up by a monitoring script.
If one or more tables get created immediately after either of the above two scenarios, the balancer would assign (young) regions from the new table to the few region servers which didn't carry regions.



[jira] Commented: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002332#comment-13002332 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

Looking at how HRegionInfo gets added into AssignmentManager.servers:
{code} 
  private void addToServers(final HServerInfo hsi, final HRegionInfo hri) {
    List<HRegionInfo> hris = servers.get(hsi);
    if (hris == null) {
      hris = new ArrayList<HRegionInfo>();
      servers.put(hsi, hris);
    }
    hris.add(hri);
  }
{code} 
I think we can traverse List<HRegionInfo> in reverse order because young regions are added to the tail of the List.
I have this:
{code}
      int sz = regions.size();
      for (int i = sz - 1; i >= 0; i--) {
        HRegionInfo hri = regions.get(i);
{code}
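
To make the reverse traversal concrete, here is a minimal, self-contained sketch. It is not the attached patch: the class name NewestFirstSelection and the method pickFromTail are hypothetical, plain strings stand in for HRegionInfo, and the meta-region check from the original loop is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the balancer's selection loop; not HBase code.
final class NewestFirstSelection {

  /** Picks up to numToOffload entries, walking the list from tail to head. */
  static <T> List<T> pickFromTail(List<T> regions, int numToOffload) {
    List<T> picked = new ArrayList<T>();
    // Entries appended last (the youngest regions) sit at the tail of the list.
    for (int i = regions.size() - 1; i >= 0 && picked.size() < numToOffload; i--) {
      picked.add(regions.get(i));
    }
    return picked;
  }

  public static void main(String[] args) {
    List<String> regions = new ArrayList<String>();
    regions.add("a1");
    regions.add("a2");
    regions.add("p1");
    regions.add("p2");
    System.out.println(pickFromTail(regions, 2));
  }
}
```

With the list above, the two entries taken are the ones added last, so the regions of the newer table are the ones offered to the balancer.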


[jira] Commented: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002360#comment-13002360 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

I can try.
But I need to trim down my changes for HBASE-3373 :-)
I don't want to repeat the experience of my previous submission, so I will wait for other devs' comments first.


[jira] Updated: (HBASE-3586) Improve the selection of regions to balance

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3586:
-------------------------

    Attachment: 3586-randomize-v2.txt

Here is a version that does not include the stray FSUtils changes that polluted the previous patch.


[jira] Commented: (HBASE-3586) Randomize the selection of regions to balance

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002465#comment-13002465 ] 

stack commented on HBASE-3586:
------------------------------

+1 on Ted's patch.  LGTM.


[jira] Updated: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-3586:
----------------------------------

    Attachment: HBASE-3586-by-region-age.patch

Sure, that's lighter weight. Here is your idea as a patch.


[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003065#comment-13003065 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

Although balanceCluster() can be made more complex by considering both old and new regions, the new patch achieves the same effect.
When creating a table with multiple regions, I check whether there are online region servers which don't carry any regions (this can be relaxed by introducing a threshold which separates overloaded and underloaded servers). If there are such servers, balance() is called to balance the (relatively old) regions.
Since assignmentManager.assignUserRegions() uses round-robin assignment, the cluster would still be balanced when createTable() returns.
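
As a rough illustration of the two steps described above (detect servers that carry no regions, then spread the new table's regions round-robin), here is a small sketch. Everything in it is an assumption made for illustration: RoundRobinSketch, hasIdleServer and assign are hypothetical names, servers and regions are plain strings, and it does not reproduce what assignUserRegions() actually does internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and types are not taken from HBase.
final class RoundRobinSketch {

  /** Returns true if at least one server currently carries no regions. */
  static boolean hasIdleServer(Map<String, List<String>> assignments) {
    for (List<String> regions : assignments.values()) {
      if (regions.isEmpty()) {
        return true;
      }
    }
    return false;
  }

  /** Deals the regions out to the servers one at a time, in order. */
  static Map<String, List<String>> assign(List<String> regions, List<String> servers) {
    Map<String, List<String>> plan = new LinkedHashMap<String, List<String>>();
    for (String server : servers) {
      plan.put(server, new ArrayList<String>());
    }
    if (servers.isEmpty()) {
      return plan;
    }
    for (int i = 0; i < regions.size(); i++) {
      // Region i goes to server (i mod number of servers).
      plan.get(servers.get(i % servers.size())).add(regions.get(i));
    }
    return plan;
  }
}
```

Dealing regions out this way keeps the per-server counts for the new table within one of each other, which is why the cluster would still look balanced by region count after the table is created.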


[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003703#comment-13003703 ] 

stack commented on HBASE-3586:
------------------------------

I opened a new issue for continuing the above balancer explorations: HBASE-3609


[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003681#comment-13003681 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

Overall +1

For TestLoadBalancer.java, I would use a simpler form because LoadBalancer.randomize() is only called once for each server in balanceCluster():
{code}
        List<HRegionInfo> copy = new ArrayList<HRegionInfo>(original);
        List<HRegionInfo> randomized = LoadBalancer.randomize(copy);
        // The shuffled list should not come back in the original order.
        assertFalse(e.getKey().toString() + " has identical region list",
            original.equals(randomized));
{code}
I removed some logging which should have been done using LOG.info().
I also ran TestAdmin which passed.

This approach is smart and should avoid common pitfalls statistically.

Let's observe its efficacy in production. 
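
For readers who have not opened the attachment, the randomized selection can be sketched roughly as below. This is an illustration under stated assumptions, not the committed code: RandomizedSelection and pickRandom are hypothetical names, strings stand in for HRegionInfo, the meta-region check is omitted, and java.util.Collections.shuffle is used as one plausible way to randomize the list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of shuffle-then-take; not the patch attached to this issue.
final class RandomizedSelection {

  /** Shuffles a copy of the region list and takes the first numToOffload entries. */
  static <T> List<T> pickRandom(List<T> regions, int numToOffload, Random rng) {
    // Work on a copy so the caller's list keeps its original order.
    List<T> shuffled = new ArrayList<T>(regions);
    Collections.shuffle(shuffled, rng);
    int n = Math.max(0, Math.min(numToOffload, shuffled.size()));
    return new ArrayList<T>(shuffled.subList(0, n));
  }

  public static void main(String[] args) {
    List<String> regions = new ArrayList<String>();
    regions.add("a1");
    regions.add("a2");
    regions.add("a3");
    regions.add("p1");
    regions.add("p2");
    regions.add("p3");
    System.out.println(pickRandom(regions, 3, new Random()));
  }
}
```

Because every region has the same chance of being picked, a run of adjacent regions from one table is no longer favored just because it sits at the head of the list. Note that a shuffle can occasionally return the original order, so a test asserting that the two lists differ can fail by chance on very short lists.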



[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006471#comment-13006471 ] 

stack commented on HBASE-3586:
------------------------------

@Ted So you tested the patch that was submitted here?  The random assignment?  The balancer only considers the count of regions, not the load on those regions, so yes, I'd imagine that it's possible some regions would be taking no load.
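
To show what balancing on region count alone means, here is a simplified sketch. It is an assumption-laden illustration, not the LoadBalancer algorithm: CountBasedOffload and regionsToOffload are hypothetical names, and the rule used (offload whatever exceeds the rounded-up cluster average) is a deliberate simplification.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the real balancer's thresholds may differ.
final class CountBasedOffload {

  /** For each server, returns how many regions exceed the rounded-up average. */
  static Map<String, Integer> regionsToOffload(Map<String, Integer> regionCounts) {
    Map<String, Integer> result = new LinkedHashMap<String, Integer>();
    if (regionCounts.isEmpty()) {
      return result;
    }
    int total = 0;
    for (int count : regionCounts.values()) {
      total += count;
    }
    int servers = regionCounts.size();
    // Ceiling of the average, using integer arithmetic.
    int max = (total + servers - 1) / servers;
    for (Map.Entry<String, Integer> e : regionCounts.entrySet()) {
      result.put(e.getKey(), Math.max(0, e.getValue() - max));
    }
    return result;
  }
}
```

Nothing in this calculation looks at requests per region, so a server can end up with an average number of regions that are all idle, which matches the observation above.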


[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003683#comment-13003683 ] 

Andrew Purtell commented on HBASE-3586:
---------------------------------------

@Ted: Thank you for your contributions.


[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003094#comment-13003094 ] 

stack commented on HBASE-3586:
------------------------------

Oh, just to say that at our house, we had a pathological condition where the balancer ended up assigning all regions for a table to one server; then, as the table split, the balancer kept landing the new splits back on the same origin server.  After chatting about it, random selection would have sufficed for our case.


[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003665#comment-13003665 ] 

ryan rawson commented on HBASE-3586:
------------------------------------

+1 lgtm


[jira] Resolved: (HBASE-3586) Improve the selection of regions to balance

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-3586.
--------------------------

    Resolution: Fixed

Committed patch to branch and trunk.  Thanks for the review, Ryan.  Ted, let's open a new issue to explore other balancing algorithms.


[jira] Resolved: (HBASE-3586) Improve the selection of regions to balance

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-3586.
-----------------------------------

      Resolution: Fixed
        Assignee: Ted Yu
    Hadoop Flags: [Reviewed]

Committed to trunk and 0.90 branch.

> Improve the selection of regions to balance
> -------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch, hbase-3586-with-sort.txt

[jira] Commented: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002370#comment-13002370 ] 

Andrew Purtell commented on HBASE-3586:
---------------------------------------

I wouldn't wait for more comments; you already have someone giving you a +1 on the approach. Next we need a patch to test.

> Randomize the selection of regions to balance
> ---------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch

[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003611#comment-13003611 ] 

Hudson commented on HBASE-3586:
-------------------------------

Integrated in HBase-TRUNK #1771 (See [https://hudson.apache.org/hudson/job/HBase-TRUNK/1771/])
    

> Improve the selection of regions to balance
> -------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch, hbase-3586-table-creation.txt, hbase-3586-with-sort.txt

[jira] Commented: (HBASE-3586) Randomize the selection of regions to balance

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002345#comment-13002345 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

I also have the following:
{code}
    // put young regions at the beginning of regionsToMove
    Collections.sort(regionsToMove, rpComparator);

    // Walk down least loaded, filling each to the min
{code}
where RegionPlanComparator is defined as:
{code}
  // Orders plans by descending region id (typically the region's creation
  // timestamp), so the youngest regions sort first.
  static class RegionPlanComparator implements Comparator<RegionPlan> {
    @Override
    public int compare(RegionPlan l, RegionPlan r) {
      long diff = r.getRegionInfo().getRegionId() - l.getRegionInfo().getRegionId();
      if (diff < 0) return -1;
      if (diff > 0) return 1;
      return 0;
    }
  }
  static RegionPlanComparator rpComparator = new RegionPlanComparator();
{code}
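
For anyone who wants to try the ordering outside of HBase, here is a minimal standalone sketch of the same idea. It is only an illustration under assumptions: Plan, YOUNGEST_FIRST and pick are hypothetical stand-ins, not HBase classes, and the region id is assumed to grow with creation time. It sorts the candidates youngest first and then takes the first numToOffload of them, the way the loop in the description takes the first few.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class YoungestFirstSketch {
  /** Hypothetical stand-in for a region plan; only the region id matters here. */
  static final class Plan {
    final String table;
    final long regionId; // assumed to grow with creation time
    Plan(String table, long regionId) {
      this.table = table;
      this.regionId = regionId;
    }
  }

  /** Descending by region id: newer (larger id) regions come first. */
  static final Comparator<Plan> YOUNGEST_FIRST = new Comparator<Plan>() {
    @Override
    public int compare(Plan l, Plan r) {
      return Long.compare(r.regionId, l.regionId);
    }
  };

  /** Returns the table names of the first numToOffload plans, youngest first. */
  static List<String> pick(List<Plan> plans, int numToOffload) {
    List<Plan> sorted = new ArrayList<Plan>(plans); // leave the caller's list untouched
    Collections.sort(sorted, YOUNGEST_FIRST);
    List<String> picked = new ArrayList<String>();
    for (Plan p : sorted) {
      if (picked.size() >= numToOffload) break;
      picked.add(p.table);
    }
    return picked;
  }

  public static void main(String[] args) {
    List<Plan> plans = Arrays.asList(
        new Plan("a", 100L), new Plan("a", 200L),
        new Plan("p", 900L), new Plan("p", 800L));
    System.out.println(pick(plans, 2));
  }
}
```

With the sample data in main, the two regions of the newer table "p" are picked ahead of the older regions of table "a", which is the behaviour the sort is meant to give.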


> Randomize the selection of regions to balance
> ---------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch
