You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2011/03/08 00:40:59 UTC

[jira] Created: (HBASE-3609) Improve the selection of regions to balance; part 2

Improve the selection of regions to balance; part 2
---------------------------------------------------

                 Key: HBASE-3609
                 URL: https://issues.apache.org/jira/browse/HBASE-3609
             Project: HBase
          Issue Type: Improvement
            Reporter: stack


See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018729#comment-13018729 ] 

stack commented on HBASE-3609:
------------------------------

Looking at patch, I see removal of testRandomizer.  Would be nice if a new test of new functionality took its place.

I like how you use regionid figuring region age.  Would suggest you add a little documentation that you depend on regionid being a timestamp (would also clarify what you are up to here).

Do you have to do this:

{code}+      boolean emptyRegionServerPresent) {code}

So this algorithm balances exclusively by age?   No randomness?

Would be nice if you described the algo in a comment included in the patch.

How we know this algo better than what we have?

Patch looks good otherwise Ted.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Updated: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment: hbase-3609-by-region-age.txt

Sort regions before fetching any region

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019204#comment-13019204 ] 

stack commented on HBASE-3609:
------------------------------

bq. Empty server can be detected within balanceCluster(). But this detection has been performed by Master, hence the flag.

Thats fair.  Its nice having the balance invocation method simple as possible though.

bq. The static regionId helps make each region Id unique. I actually utilized this fact to debug my code.

I missed that it was being used.

bq. Preliminary response from Stan Barton showed improvement over random selector.
I am waiting for further feedback from gaojinchao@huawei.com and Stan.

I think that if you get good feedback from others, that'll help getting this patch committed.

Good stuff Ted.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004254#comment-13004254 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

This is close to what I described in HBASE-3586.
I used MinMaxPriorityQueue to alternately choose regions from head and tail of regionsToMove.

Both TestLoadBalancer and TestAdmin pass.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Updated: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment: 3609-alternate.txt

Improved region selection using Double-Ended Priority Queues

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020075#comment-13020075 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

A brief summary can be found here: http://zhihongyu.blogspot.com/2011/04/load-balancer-in-hbase-090.html

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004140#comment-13004140 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

If someone can point me to an implementation of Double-Ended Priority Queue in Java, I can use it in the three loops involving regionsToMove.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Updated: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment: hbase-3609.txt

Initial attempt.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003842#comment-13003842 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

hbase-3609.txt implemented first part of the idea expressed in HBASE-3586 @ 06/Mar/11 07:10
The alternation between head and tail of regions list happens after each addition to regionsToMove. So this should provide good distribution of young and old regions.

The implementation for second part needs more refinement because regionsToMove is manipulated in three loops in balanceCluster().

TestLoadBalancer passes.

Please review.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004627#comment-13004627 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

I am thinking about adding this enum:
{code}
  public static enum BalancerPolicy {
	  BALANCE_REGION_COUNT,
	  BALANCE_REQUESTS_PER_MINUTE,
  }
{code}
Shall we introduce hbase.balancer.policy in hbase-default.xml which reflects the above enum ?


> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019193#comment-13019193 ] 

stack commented on HBASE-3609:
------------------------------

Ted:

Why do we pass the emptyRS flag?  Didn't we just insert an HServerInfo for the server with not regions into the assignments Map?  Isn't an HSI that has an empty array of Regions enough of a flag such that you don't need this extra boolean?

Is this used?

{code}
+  static int regionId = 0;
{code}

The new javadoc helps.

Otherwise patch looks good Ted.  It seems to be an ornamentation on what we had previous; rather than taking regionservers at random, it alternatively takes the newest and then the oldest off the regionserver -- is that right?  (Is that explained in the patch?  I don't think I can see it).  You've also added this enhancement: "Basically I find the new regions and put them on different underloaded servers. Previously one underloaded server would be filled up before the next underloaded server is considered."  Any chance of your proving the last enhancement with a unit test?

All the other load balancer tests pass?

Thanks.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020184#comment-13020184 ] 

stack commented on HBASE-3609:
------------------------------

@Ted I'm game for committing this to branch if you add javadoc explaining your new balance algorithm; the rules you've implemented.  Boil down your blog and use that.  I'm uneasy about the lack of unit test but if its too hard to do, I think its OK getting the balancer changes in now into TRUNK before we start in trying to harden 0.92.

Good work.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004082#comment-13004082 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

hbase-3609-by-region-age.txt is the refined version.
This accounts for the out-of-band regions which were assigned to the server after some other region server crashed.
In that scenario, random selection may not work very well because the regions on individual region server are not sorted.
Both TestLoadBalancer and TestAdmin pass.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004141#comment-13004141 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

I will try http://download.oracle.com/javase/6/docs/api/java/util/Deque.html

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment:     (was: 3609-alternate.txt)

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019560#comment-13019560 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

>From Stan Barton who helps me experiment with my changes:

There is no easy
way to check how many regions are assigned to particular RS, so will
probably need to write some small parser to prove that.

I think we should backport HBASE-3704 (at least Regions by Region Server) to 0.90.3 so that people can easily tell how (un)even the load is distributed.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment: 3609-double-alternation.txt

Removed the changes in Master.
The boolean flag can be computed in the first loop of balanceCluster()

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Work started: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HBASE-3609 started by Ted Yu.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Assigned] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-3609:
----------------------------

    Assignee: stack  (was: Ted Yu)

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: stack
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment:     (was: 3609-double-alternation.txt)

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004591#comment-13004591 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

I was looking for a class which can store moving average of requests/second.
Shall we add that policy in another JIRA ?

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004911#comment-13004911 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

Depending on the chosen policy, I plan to introduce interface, RegionPlanHolder, which abstracts the concrete data structures holding RegionPlans:
{code}
  interface RegionPlanHolder {
	  RegionPlanHolder create();
	  int size();
	  void add(RegionPlan rp);
	  // removes the head element
	  RegionPlan remove();
	  // removes the tail element
	  RegionPlan removeLast();
	  // retrieves element at index 
	  RegionPlan get(int index);
  }
{code}
Then regionsToMove would be declared as RegionPlanHolder so that implementation detail such as the following would be hidden in balanceCluster().
{code}
    MinMaxPriorityQueue<RegionPlan> regionsToMove = MinMaxPriorityQueue.orderedBy(rpComparator).create();
{code}


> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004966#comment-13004966 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

Since request count (HBASE-3507) is contained in HRegion, not HRegionInfo, there is more work to be done so that decaying moving average of request rate (in Ryan's term) is available to LoadBalancer.
This can be done in other JIRAs.

I suggest 3609-alternate.txt be checked in first.


> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment: 3609-double-alternation.txt

Made changes to utilize the randomizer which shuffles the list of underloaded region servers.
This way we can avoid distributing offloaded regions to few region servers. 

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment:     (was: 3609-double-alternation.txt)

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018438#comment-13018438 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

>From Stan Barton:
Apr 8th:
I can see that the regionserver that gets all the inserts makes pauses
time to time - I checked the log and it is because (I assume) the
regions of other tables from this RS are re-assigned elsewhere. When I
started inserting, the table was empty, now it has ~20 regions, all
assigned to one RS, that kicks the scale out property out of the
system. The insertion process is still running, if you are interested
in some other info.

Apr 11th:
I have downloaded the 0.90.2 candidate but I regret to inform you,
that it did not help to solve the issue. I am re-doing the tests and
again, all the newly created regions are being assigned to a single
RS. It is a real performance killer.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3609:
-------------------------

    Fix Version/s: 0.92.0
         Assignee: Ted Yu  (was: stack)
     Release Note: Balancer improvements over the previous purely random assignment balancer; distributes evenly from oldest and newest regions and does round robin loading a newly added server.
     Hadoop Flags: [Reviewed]
           Status: Patch Available  (was: In Progress)

Marking patch available 

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>             Fix For: 0.92.0
>
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment: 3609-double-alternation.txt

This patch combines 3609-empty-RS.txt with my earlier enhancement.
Basically I find the new regions and put them on different underloaded servers. Previously one underloaded server would be filled up before the next underloaded server is considered.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment:     (was: 3609-double-alternation.txt)

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004393#comment-13004393 ] 

stack commented on HBASE-3609:
------------------------------

Patch looks good Ted.  How you propose we add it in?  It'd be put in place instead of the randomizing balancing?  Or should it be a configuration where we load different balancing algoritms.  For example, I can imagine that some would like to balance based off requests/second.... rather than region count. 

Oh, you think a test would make sense for your stuff?

Good stuff.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment: 3609-empty-RS.txt

Add javadoc for emptyRegionServerPresent.
Also explained how the new mechanism works.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment:     (was: 3609-double-alternation.txt)

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018725#comment-13018725 ] 

stack commented on HBASE-3609:
------------------------------

I wonder why all regions of a table are assigned to one regionserver when we added randomization to the balancer in 0.90.2?  Didn't we? (i.e. HBASE-3586)

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018735#comment-13018735 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

For unit test, the region Ids created by mockClusterServers() have the same value - because all the regions are created at almost the same time.
It will take a little more time to devise appropriate test case generation and validate the balancing moves.

For the moment, I still base my improvement on the existing framework where region count plays dominant factor.

One of the goals of this JIRA is to remove randomness from LoadBalancer so that we can deterministically produce near-optimal balancing actions.
The new parameter, emptyRegionServerPresent, helps decide whether we should move old and new regions to other servers.

I will upload a new patch where I describe the above in detail. I even plan to blog about the history of HBASE-3586 and this JIRA.

To validate my latest patch, I need a little help from community participants. Our use case creates hbase tables frequently in our flow with pre-split regions. Since those regions get round-robin assigned initially, it is not easy to reproduce what Stan experienced.

Thanks for the review and suggestion, Stack.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019199#comment-13019199 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

Empty server can be detected within balanceCluster(). But this detection has been performed by Master, hence the flag.

The static regionId helps make each region Id unique. I actually utilized this fact to debug my code.

Let me think more about writing unit test(s) that verifies the distribution over underloaded servers.

Both TestLoadBalancer and TestAdmin pass.

Preliminary response from Stan Barton showed improvement over random selector.
I am waiting for further feedback from gaojinchao@huawei.com and Stan.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Assigned: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu reassigned HBASE-3609:
-----------------------------

    Assignee: Ted Yu

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment: 3609-empty-RS.txt

This version takes into account whether a new region server has just been discovered by master.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020211#comment-13020211 ] 

stack commented on HBASE-3609:
------------------------------

I am +1 on committing.  The doc is good.  Will wait to commit for a day or so in case others have opinion on this patch.  If you get feedback from others on how this balance change works for them, include it in the issue Ted.  Good stuff.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004967#comment-13004967 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

The goal of 3609-alternate.txt is to balance the 'ages' (reflected by sum of region Ids) of regions across region servers.
Currently HServerLoad doesn't contain HRegionInfo information. This makes validating the above goal less straightforward.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment:     (was: 3609-alternate.txt)

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment: 3609-double-alternation.txt

Added detailed javadoc for balanceCluster() describing what this JIRA has done.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004392#comment-13004392 ] 

stack commented on HBASE-3609:
------------------------------

On upgrading guava, thats no problem.  Just include in your patch a change to pom.xml that moves the guava version forward Ted.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment: 3609-double-alternation.txt

Modified unit test so that regions generated have unique regionIds.
This is first step toward better verification of balancing results.
Also fixed a bug according to Stanislav Barton's feedback.


> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment:     (was: 3609-double-alternation.txt)

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment: 3609-double-alternation.txt

Removed statements used for debugging.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004153#comment-13004153 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

Can we upgrade Guava so that I can use this ?
http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/MinMaxPriorityQueue.html

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment: 3609-alternate.txt

Use latest Guava release.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3609:
--------------------------

    Attachment:     (was: 3609-empty-RS.txt)

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3609:
-------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed. Thanks for the patch Ted Yu

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>             Fix For: 0.92.0
>
>         Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018728#comment-13018728 ] 

Ted Yu commented on HBASE-3609:
-------------------------------

More background for Stan's cluster:
There're at least 600 regions on each region server. When 30 new regions were created on the same region server, random selector only chose 3 out of the 30 new regions for reassignment. The other region selection was from inactive (old) regions. This is expected behavior because new and old regions were selected equally probably.

Basically we traded some optimization for safety of not overloading a newly discovered region server.

My latest patch avoids the above behavior. At the same time, it also can deal with a region server which just joined the cluster and should be assigned both old and new regions.

> Improve the selection of regions to balance; part 2
> ---------------------------------------------------
>
>                 Key: HBASE-3609
>                 URL: https://issues.apache.org/jira/browse/HBASE-3609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Ted Yu
>         Attachments: 3609-alternate.txt, 3609-empty-RS.txt, hbase-3609-by-region-age.txt, hbase-3609.txt
>
>
> See 'HBASE-3586  Improve the selection of regions to balance' for discussion of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira