Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2011/01/05 22:38:45 UTC
[jira] Created: (HBASE-3422) Balancer will willing try to rebalance
thousands of regions in one go; needs an upper bound added.
Balancer will willing try to rebalance thousands of regions in one go; needs an upper bound added.
--------------------------------------------------------------------------------------------------
Key: HBASE-3422
URL: https://issues.apache.org/jira/browse/HBASE-3422
Project: HBase
Issue Type: Improvement
Affects Versions: 0.90.0
Reporter: stack
See HBASE-3420. Therein, a wonky cluster had 5k regions on one server and < 1k on others. Balancer ran and wanted to redistribute 3k+ all in one go. Madness.
If there is a load of rebalancing to be done, it should be done somewhat piecemeal. At a minimum, we need an upper bound on the number of regions rebalanced at a time.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3422) Balancer will willing try to
rebalance thousands of regions in one go; needs an upper bound added.
Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008124#comment-13008124 ]
Ted Yu commented on HBASE-3422:
-------------------------------
How about introducing hbase.balancer.maxregions.perround?
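A minimal sketch of what a per-round cap could look like, assuming the proposed hbase.balancer.maxregions.perround value has already been read into an int. The class and method names are hypothetical, and treating a non-positive value as "no limit" is my assumption, not something decided in this issue.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: keep at most maxPerRound plans for one balancer round. */
final class PerRoundCap {
  /**
   * Returns the first maxPerRound plans. A value <= 0 is treated as
   * "no limit" (an assumption made for this sketch).
   */
  static <P> List<P> capPlans(List<P> plans, int maxPerRound) {
    if (maxPerRound <= 0 || plans.size() <= maxPerRound) {
      return plans;
    }
    // Plans beyond the cap are dropped; a later round would recompute them.
    return plans.subList(0, maxPerRound);
  }

  public static void main(String[] args) {
    List<Integer> plans = new ArrayList<>();
    for (int i = 0; i < 3000; i++) {
      plans.add(i);
    }
    System.out.println(capPlans(plans, 100).size());
  }
}
```

With a cap like this, the 3k+ moves from HBASE-3420 would be spread over many balancer rounds instead of one.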
[jira] Commented: (HBASE-3422) Balancer will try to rebalance
thousands of regions in one go; needs an upper bound added.
Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008673#comment-13008673 ]
Ted Yu commented on HBASE-3422:
-------------------------------
Summary of chat with Stack on IRC (see http://pastebin.com/4uK9M1Z7):
Since it is difficult to estimate the appropriate number of regions to balance in one invocation of balance(), I resort to respecting hbase.balancer.period.
Another option would be to limit the execution time of balance() to a certain percentage of hbase.balancer.period.
But that would introduce another parameter, which complicates our scenario.
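The idea of respecting hbase.balancer.period can be sketched as a loop that stops executing plans once a time budget is spent. This is not the attached patch; the class, method, and parameter names are hypothetical, and the clock is injected only so the behaviour can be checked without sleeping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

/** Hypothetical sketch: execute plans until a time budget is used up. */
final class TimeBoundedBalance {
  /**
   * Executes plans in order and returns the ones that were executed.
   * budgetMs would be derived from hbase.balancer.period.
   */
  static <P> List<P> runWithinBudget(List<P> plans, long budgetMs,
      LongSupplier clockMs, Consumer<P> executor) {
    List<P> done = new ArrayList<>();
    long start = clockMs.getAsLong();
    for (P plan : plans) {
      if (clockMs.getAsLong() - start >= budgetMs) {
        break; // out of time; remaining plans wait for the next balancer run
      }
      executor.accept(plan);
      done.add(plan);
    }
    return done;
  }

  public static void main(String[] args) {
    List<String> plans = List.of("r1", "r2", "r3");
    List<String> done = runWithinBudget(plans, 1000L,
        System::currentTimeMillis, p -> System.out.println("balance " + p));
    System.out.println(done.size() + " plan(s) executed");
  }
}
```

The check happens before each plan, so a single slow plan can still overshoot the budget by its own duration.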
[jira] Updated: (HBASE-3422) Balancer will willing try to rebalance
thousands of regions in one go; needs an upper bound added.
Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-3422:
--------------------------
Attachment: hbase-3422.txt
First attempt at using heuristics to decide whether executing the next RegionPlan would make a single balance() call too long.
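The heuristic in hbase-3422.txt is not reproduced in this thread. One possible shape, shown purely as an illustration, is to predict the cost of the next plan from the average cost of the plans already executed in this call. All names here are hypothetical.

```java
/** Hypothetical heuristic: would running one more plan overrun the cutoff? */
final class NextPlanHeuristic {
  /**
   * @param elapsedMs time spent so far in this balance() call
   * @param plansDone number of plans already executed in this call
   * @param cutoffMs  maximum time one balance() call should take
   */
  static boolean shouldRunNext(long elapsedMs, int plansDone, long cutoffMs) {
    if (plansDone == 0) {
      // No samples yet: run the first plan unless the cutoff has already passed.
      return elapsedMs < cutoffMs;
    }
    long avgPerPlanMs = elapsedMs / plansDone;
    // Assume the next plan costs about as much as the average so far.
    return elapsedMs + avgPerPlanMs <= cutoffMs;
  }

  public static void main(String[] args) {
    System.out.println(shouldRunNext(900L, 9, 1000L));
    System.out.println(shouldRunNext(950L, 9, 1000L));
  }
}
```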
[jira] Commented: (HBASE-3422) Balancer will try to rebalance
thousands of regions in one go; needs an upper bound added.
Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008702#comment-13008702 ]
Ted Yu commented on HBASE-3422:
-------------------------------
Related unit tests: TestMasterObserver, TestMultiParallel, TestLoadBalancer and TestRegionRebalancing all pass.
[jira] Resolved: (HBASE-3422) Balancer will try to rebalance
thousands of regions in one go; needs an upper bound added.
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack resolved HBASE-3422.
--------------------------
Resolution: Fixed
Fix Version/s: 0.92.0
Hadoop Flags: [Reviewed]
Let's try it, Ted. On commit I added the ability to set in config an explicit limit on how long the balancer will run, but by default this is not specified. I also added logging (DEBUG) for when the balancer is cut off because it ran out of time.
Thanks for the patch, Ted. Committed to TRUNK.
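A sketch of how an optional explicit limit might be resolved, using java.util.Properties in place of the HBase configuration class. The thread does not name the new property, so the key "hbase.balancer.max.balancing" below is an assumption, as are the 300000 ms default period and the choice to fall back to the full period when no limit is set.

```java
import java.util.Properties;

/** Hypothetical resolution of the balancer cutoff from configuration. */
final class BalancerCutoff {
  static final String PERIOD_KEY = "hbase.balancer.period";
  // Assumed property name; this thread does not say what the committed key is.
  static final String CUTOFF_KEY = "hbase.balancer.max.balancing";
  // Assumed default period in milliseconds.
  static final long ASSUMED_DEFAULT_PERIOD_MS = 300_000L;

  /** Uses the explicit limit if set, otherwise falls back to the balancer period. */
  static long cutoffMs(Properties conf) {
    long periodMs = Long.parseLong(
        conf.getProperty(PERIOD_KEY, Long.toString(ASSUMED_DEFAULT_PERIOD_MS)));
    String explicit = conf.getProperty(CUTOFF_KEY);
    return explicit == null ? periodMs : Long.parseLong(explicit);
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty(PERIOD_KEY, "60000");
    System.out.println(cutoffMs(conf));
  }
}
```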
[jira] Assigned: (HBASE-3422) Balancer will willing try to
rebalance thousands of regions in one go; needs an upper bound added.
Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu reassigned HBASE-3422:
-----------------------------
Assignee: Ted Yu
[jira] Commented: (HBASE-3422) Balancer will willing try to
rebalance thousands of regions in one go; needs an upper bound added.
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008131#comment-13008131 ]
stack commented on HBASE-3422:
------------------------------
Sounds good Ted. Should not apply to the bulk assign on startup though. Good stuff.
[jira] Commented: (HBASE-3422) Balancer will willing try to
rebalance thousands of regions in one go; needs an upper bound added.
Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008447#comment-13008447 ]
Ted Yu commented on HBASE-3422:
-------------------------------
Currently it is possible for one HMaster.balance() call to last longer than hbase.balancer.period.
We should limit the execution time of HMaster.balance() to hbase.balancer.period.
Is this equivalent to introducing hbase.balancer.maxregions.perround?
[jira] Updated: (HBASE-3422) Balancer will try to rebalance
thousands of regions in one go; needs an upper bound added.
Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-3422:
--------------------------
Component/s: master
Summary: Balancer will try to rebalance thousands of regions in one go; needs an upper bound added. (was: Balancer will willing try to rebalance thousands of regions in one go; needs an upper bound added.)
[jira] Commented: (HBASE-3422) Balancer will willing try to
rebalance thousands of regions in one go; needs an upper bound added.
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008286#comment-13008286 ]
stack commented on HBASE-3422:
------------------------------
@Ted I like that idea.
[jira] Work started: (HBASE-3422) Balancer will willing try to
rebalance thousands of regions in one go; needs an upper bound added.
Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on HBASE-3422 started by Ted Yu.
[jira] Commented: (HBASE-3422) Balancer will try to rebalance
thousands of regions in one go; needs an upper bound added.
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008890#comment-13008890 ]
Hudson commented on HBASE-3422:
-------------------------------
Integrated in HBase-TRUNK #1798 (See [https://hudson.apache.org/hudson/job/HBase-TRUNK/1798/])
HBASE-3422 Balancer will try to rebalance thousands of regions in one go; needs an upper bound added
[jira] Commented: (HBASE-3422) Balancer will willing try to
rebalance thousands of regions in one go; needs an upper bound added.
Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008132#comment-13008132 ]
Ted Yu commented on HBASE-3422:
-------------------------------
In terms of putting an upper bound on the time each call to HMaster.balance() takes, I think the master should establish some metric for how long it takes to execute plans.
Here is the related code:
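{code}
// Every plan returned by the balancer is executed in one pass,
// with no bound on the number of plans or on elapsed time.
List<RegionPlan> plans = this.balancer.balanceCluster(assignments);
if (plans != null && !plans.isEmpty()) {
  for (RegionPlan plan: plans) {
    LOG.info("balance " + plan);
    this.assignmentManager.balance(plan);
  }
}
{code}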
If the metric is collected for assignmentManager.balance() calls, balancer.balanceCluster() can make use of the metric and adjust the maximum number of regions assigned in one round.
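A rough sketch of such a metric, assuming the master records the duration of each assignmentManager.balance() call and the balancer then asks how many plans fit in one period. The class and its methods are hypothetical and do not exist in HBase under these names.

```java
/** Hypothetical running average of plan execution time. */
final class PlanTimeMetric {
  private long totalMs;
  private long count;

  /** Records how long one plan took to execute. */
  void record(long durationMs) {
    totalMs += durationMs;
    count++;
  }

  /**
   * Estimates how many plans fit in periodMs at the average duration seen so far.
   * Returns fallbackMax when there are no usable samples, and never less than 1.
   */
  int maxPlansPerRound(long periodMs, int fallbackMax) {
    if (count == 0 || totalMs == 0) {
      return fallbackMax;
    }
    double avgMs = (double) totalMs / count;
    long fit = (long) (periodMs / avgMs);
    return (int) Math.max(1L, Math.min((long) Integer.MAX_VALUE, fit));
  }

  public static void main(String[] args) {
    PlanTimeMetric metric = new PlanTimeMetric();
    metric.record(200L);
    metric.record(400L);
    System.out.println(metric.maxPlansPerRound(300_000L, 50));
  }
}
```

A plain running average reacts slowly if plan times change; a windowed or decayed average may fit better, but that is a design choice this thread does not settle.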