You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@helix.apache.org by mkscrg <gi...@git.apache.org> on 2016/11/03 16:29:43 UTC
[GitHub] helix pull request #59: helix-core: AutoRebalanceStrategy should sort by par...
GitHub user mkscrg opened a pull request:
https://github.com/apache/helix/pull/59
helix-core: AutoRebalanceStrategy should sort by partition count and state priority
This is an alternative fix to #58. Here `AutoRebalanceStrategy` uses state priorities in addition to partition count to sort live nodes, rather than dropping no-priority states from the current mapping.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mkscrg/helix auto-sort-with-priority
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/helix/pull/59.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #59
----
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] helix issue #59: helix-core: AutoRebalanceStrategy should sort by partition ...
Posted by kishoreg <gi...@git.apache.org>.
Github user kishoreg commented on the issue:
https://github.com/apache/helix/pull/59
Maybe we should keep the partition count map in place.I am thinking about the case where new nodes get added, will the new change stop redistributing partitions.
A simple test case to try is
2 nodes (n1,n2) with 6 partitions
add a node(n3)
we should see one partition from n1,n2 move to n3. We already have that test case.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] helix issue #59: helix-core: AutoRebalanceStrategy should sort by partition ...
Posted by lei-xia <gi...@git.apache.org>.
Github user lei-xia commented on the issue:
https://github.com/apache/helix/pull/59
@mkscrg we already generated the 0.6.6 release, will include this fix into our next release. We will have more frequent releases from now on, and next release is expected to be out in another 1 month or so. If you really need this to be included in our official release and can not wait another month, please let us know and we can discuss internally to see whether we can build a hot release for you.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] helix pull request #59: helix-core: AutoRebalanceStrategy should sort by par...
Posted by lei-xia <gi...@git.apache.org>.
Github user lei-xia commented on a diff in the pull request:
https://github.com/apache/helix/pull/59#discussion_r88552224
--- Diff: helix-core/src/main/java/org/apache/helix/controller/rebalancer/strategy/AutoRebalanceStrategy.java ---
@@ -791,43 +795,61 @@ public String getLocation(int partitionId, int replicaId, int numPartitions, int
}
/**
- * Sorter for live nodes that sorts firstly according to the number of partitions currently
- * registered against a node (more partitions means sort earlier), then by node name.
+ * Sorter for live nodes that sorts
+ * - first by the sum of "state scores" of partitions currently registered to a node
+ * (more higher-priority partitions means sort earlier)
+ * - then by node name.
* This prevents unnecessarily moving partitions due to the capacity assignment
* unnecessarily reducing the capacity of lower down elements.
*/
private static class CurrentStateNodeComparator implements Comparator<String> {
/**
- * The number of partitions that are active for each participant.
+ * The sum of state scores for partitions on each . (Map<ParticipantId, Integer>)
+ * State scores are the reverse of state priorities, s.t. states with lower priority numbers have higher scores.
*/
- private final Map<String, Integer> partitionCounts;
+ private final Map<String, Integer> participantStatePriorities;
/**
* Create it.
* @param currentMapping The current mapping of partitions to participants.
+ * @param stateModelDef The resource's associated state model.
*/
- public CurrentStateNodeComparator(Map<String, Map<String, String>> currentMapping) {
- partitionCounts = new HashMap<String, Integer>();
- for (Entry<String, Map<String, String>> entry : currentMapping.entrySet()) {
- for (String participantId : entry.getValue().keySet()) {
- Integer existing = partitionCounts.get(participantId);
- partitionCounts.put(participantId, existing != null ? existing + 1 : 1);
+ public CurrentStateNodeComparator(Map<String, Map<String, String>> currentMapping,
+ StateModelDefinition stateModelDef) {
+ Map<String, Integer> stateScores = getStateScores(stateModelDef);
+
+ participantStatePriorities = new HashMap<String, Integer>();
+ for (Map<String, String> participantStates : currentMapping.values()) {
+ for (String participantId : participantStates.keySet()) {
+ int existing = participantStatePriorities.getOrDefault(participantId, 0);
+ participantStatePriorities.put(participantId, existing + stateScores.getOrDefault(participantId, 0));
--- End diff --
same here.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] helix issue #59: helix-core: AutoRebalanceStrategy should sort by partition ...
Posted by mkscrg <gi...@git.apache.org>.
Github user mkscrg commented on the issue:
https://github.com/apache/helix/pull/59
@lei-xia OK sounds good. Will keep in touch :+1:
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] helix pull request #59: helix-core: AutoRebalanceStrategy should sort by par...
Posted by lei-xia <gi...@git.apache.org>.
Github user lei-xia commented on a diff in the pull request:
https://github.com/apache/helix/pull/59#discussion_r88552181
--- Diff: helix-core/src/main/java/org/apache/helix/controller/rebalancer/strategy/AutoRebalanceStrategy.java ---
@@ -791,43 +795,61 @@ public String getLocation(int partitionId, int replicaId, int numPartitions, int
}
/**
- * Sorter for live nodes that sorts firstly according to the number of partitions currently
- * registered against a node (more partitions means sort earlier), then by node name.
+ * Sorter for live nodes that sorts
+ * - first by the sum of "state scores" of partitions currently registered to a node
+ * (more higher-priority partitions means sort earlier)
+ * - then by node name.
* This prevents unnecessarily moving partitions due to the capacity assignment
* unnecessarily reducing the capacity of lower down elements.
*/
private static class CurrentStateNodeComparator implements Comparator<String> {
/**
- * The number of partitions that are active for each participant.
+ * The sum of state scores for partitions on each . (Map<ParticipantId, Integer>)
+ * State scores are the reverse of state priorities, s.t. states with lower priority numbers have higher scores.
*/
- private final Map<String, Integer> partitionCounts;
+ private final Map<String, Integer> participantStatePriorities;
/**
* Create it.
* @param currentMapping The current mapping of partitions to participants.
+ * @param stateModelDef The resource's associated state model.
*/
- public CurrentStateNodeComparator(Map<String, Map<String, String>> currentMapping) {
- partitionCounts = new HashMap<String, Integer>();
- for (Entry<String, Map<String, String>> entry : currentMapping.entrySet()) {
- for (String participantId : entry.getValue().keySet()) {
- Integer existing = partitionCounts.get(participantId);
- partitionCounts.put(participantId, existing != null ? existing + 1 : 1);
+ public CurrentStateNodeComparator(Map<String, Map<String, String>> currentMapping,
+ StateModelDefinition stateModelDef) {
+ Map<String, Integer> stateScores = getStateScores(stateModelDef);
+
+ participantStatePriorities = new HashMap<String, Integer>();
+ for (Map<String, String> participantStates : currentMapping.values()) {
+ for (String participantId : participantStates.keySet()) {
+ int existing = participantStatePriorities.getOrDefault(participantId, 0);
--- End diff --
Also, you are using Java 8 feature, Helix still use Java 6 now, please refactor this.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] helix issue #59: helix-core: AutoRebalanceStrategy should sort by partition ...
Posted by lei-xia <gi...@git.apache.org>.
Github user lei-xia commented on the issue:
https://github.com/apache/helix/pull/59
Looks good to me. Could you please include a test?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] helix issue #59: helix-core: AutoRebalanceStrategy should sort by partition ...
Posted by mkscrg <gi...@git.apache.org>.
Github user mkscrg commented on the issue:
https://github.com/apache/helix/pull/59
@kishoreg confirmed I can run individual tests with e.g. `mvn clean test -Dtest=TestAutoRebalance -DfailIfNoTests=false`. The full `helix-core` test suite hits OOM errors on my machine, but that may be a local config issue.
@lei-xia can we get this into the 0.6.6 release?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] helix issue #59: helix-core: AutoRebalanceStrategy should sort by partition ...
Posted by kishoreg <gi...@git.apache.org>.
Github user kishoreg commented on the issue:
https://github.com/apache/helix/pull/59
LGTM. @lei-xia can you also take a look at this.
Regarding test case, standard mvn command to run a specific test should work. mvn clean install should work. What is the failure you are seeing? maybe we should discuss that on mailing list
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] helix issue #59: helix-core: AutoRebalanceStrategy should sort by partition ...
Posted by mkscrg <gi...@git.apache.org>.
Github user mkscrg commented on the issue:
https://github.com/apache/helix/pull/59
I think this strategy accounts for partition count and priority:
- it assigns each state a "score", which is just a reverse numbering of the priority-ordered state list
- it ranks nodes according to the sum of the state scores of the partitions on the node
I modified my test code to your 2/6 scenario, and it worked as expected:
```
Setting up cluster THE_CLUSTER
Starting CONTROLLER
Starting NODE_0
Starting NODE_1
Adding resource R
Rebalancing resource R
Transition: NODE_0 OFFLINE to ONLINE for R_0
Transition: NODE_1 OFFLINE to ONLINE for R_3
Transition: NODE_1 OFFLINE to ONLINE for R_5
Transition: NODE_1 OFFLINE to ONLINE for R_1
Transition: NODE_0 OFFLINE to ONLINE for R_4
Transition: NODE_0 OFFLINE to ONLINE for R_2
Cluster state after setup:
NODE_0 NODE_1
R_0 ONLINE null
R_1 null ONLINE
R_2 ONLINE null
R_3 null ONLINE
R_4 ONLINE null
R_5 null ONLINE
------------------------------------------------------------
Starting NODE_2
Transition: NODE_0 ONLINE to OFFLINE for R_2
Transition: NODE_1 ONLINE to OFFLINE for R_5
Transition: NODE_0 OFFLINE to DROPPED for R_2
Transition: NODE_1 OFFLINE to DROPPED for R_5
Transition: NODE_2 OFFLINE to ONLINE for R_5
Transition: NODE_2 OFFLINE to ONLINE for R_2
Cluster state after starting third node:
NODE_0 NODE_1 NODE_2
R_0 ONLINE null null
R_1 null ONLINE null
R_2 null null ONLINE
R_3 null ONLINE null
R_4 ONLINE null null
R_5 null null ONLINE
------------------------------------------------------------
```
I can gist the test code if helpful.
You mentioned
> We already have that test case.
How do you run the test cases? I've noticed the `build` script has them disabled, and a simple `mvn clean install` fails.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] helix issue #59: helix-core: AutoRebalanceStrategy should sort by partition ...
Posted by lei-xia <gi...@git.apache.org>.
Github user lei-xia commented on the issue:
https://github.com/apache/helix/pull/59
Could you please also rebase to the HEAD of the branch? Thanks!
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] helix issue #59: helix-core: AutoRebalanceStrategy should sort by partition ...
Posted by kishoreg <gi...@git.apache.org>.
Github user kishoreg commented on the issue:
https://github.com/apache/helix/pull/59
Any update on this PR. I think we are seeing similar issues in one of our projects
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---