You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2021/01/20 22:08:20 UTC

[GitHub] [helix] NealSun96 opened a new pull request #1615: Nealsun/baseline movement constraint

NealSun96 opened a new pull request #1615:
URL: https://github.com/apache/helix/pull/1615


   ### Issues
   
   - [x] My PR addresses the following Helix issues and references them in the PR description:
   
   Resolves #1614 
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI changes:
   
   ```
   This PR splits `PartitionMovementConstraint` into separate constraints that control baseline movements and bestPossible movements respectively. The current implementation offers a unified control for both baseline movements and bestPossible movements, but it's growing more apparent that a separate control is beneficial. For example, allowing more aggressive baseline movements while limiting bestPossible movements is suitable for clusters that are migrating from non-Waged (the partitions of which are usually not balanced) to Waged. The splitting unlocks movement controls in global rebalance and partial rebalance to allow maximum flexibility. 
   ```
   ### Tests
   
   - The following is the result of the "mvn test" command on the appropriate module:
   
   
   ```
   [ERROR] Tests run: 1258, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 5,088.618 s <<< FAILURE! - in TestSuite
   [ERROR] testLackEnoughInstances(org.apache.helix.integration.rebalancer.CrushRebalancers.TestCrushAutoRebalanceTopoplogyAwareDisabled)  Time elapsed: 0.288 s  <<< FAILURE!
   org.apache.helix.HelixException: Failed to drop instance: localhost_12920. Retry times: 3
           at org.apache.helix.integration.rebalancer.CrushRebalancers.TestCrushAutoRebalanceTopoplogyAwareDisabled.testLackEnoughInstances(TestCrushAutoRebalanceTopoplogyAwareDisabled.java:86)
   Caused by: org.apache.helix.zookeeper.exception.ZkClientException: Failed to delete /CLUSTER_TestCrushAutoRebalanceTopoplogyAwareDisabled/INSTANCES/localhost_12920
           at org.apache.helix.integration.rebalancer.CrushRebalancers.TestCrushAutoRebalanceTopoplogyAwareDisabled.testLackEnoughInstances(TestCrushAutoRebalanceTopoplogyAwareDisabled.java:86)
   Caused by: org.apache.helix.zookeeper.zkclient.exception.ZkException: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /CLUSTER_TestCrushAutoRebalanceTopoplogyAwareDisabled/INSTANCES/localhost_12920
           at org.apache.helix.integration.rebalancer.CrushRebalancers.TestCrushAutoRebalanceTopoplogyAwareDisabled.testLackEnoughInstances(TestCrushAutoRebalanceTopoplogyAwareDisabled.java:86)
   Caused by: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /CLUSTER_TestCrushAutoRebalanceTopoplogyAwareDisabled/INSTANCES/localhost_12920
           at org.apache.helix.integration.rebalancer.CrushRebalancers.TestCrushAutoRebalanceTopoplogyAwareDisabled.testLackEnoughInstances(TestCrushAutoRebalanceTopoplogyAwareDisabled.java:86)
   
   [ERROR] test(org.apache.helix.integration.TestDisableCustomCodeRunner)  Time elapsed: 4.588 s  <<< FAILURE!
   java.lang.AssertionError: expected:<false> but was:<true>
           at org.apache.helix.integration.TestDisableCustomCodeRunner.test(TestDisableCustomCodeRunner.java:236)
   
   [INFO] 
   [INFO] Results:
   [INFO] 
   [ERROR] Failures: 
   [ERROR]   TestDisableCustomCodeRunner.test:236 expected:<false> but was:<true>
   [ERROR] org.apache.helix.integration.rebalancer.CrushRebalancers.TestCrushAutoRebalanceTopoplogyAwareDisabled.testLackEnoughInstances(org.apache.helix.integration.rebalancer.CrushRebalancers.TestCrushAutoRebalanceTopoplogyAwareDisabled)
   [INFO]   Run 1: PASS
   [ERROR]   Run 2: TestCrushAutoRebalanceTopoplogyAwareDisabled.testLackEnoughInstances:86->TestCrushAutoRebalanceNonRack.testLackEnoughInstances:281 ยป Helix
   [INFO] 
   [INFO] 
   [ERROR] Tests run: 1257, Failures: 2, Errors: 0, Skipped: 0
   [INFO] 
   [INFO] ------------------------------------------------------------------------
   [INFO] BUILD FAILURE
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time:  01:24 h
   [INFO] Finished at: 2021-01-20T12:06:57-08:00
   [INFO] ------------------------------------------------------------------------
   ```
   Rerun
   ```
   [INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.068 s - in TestSuite
   [INFO] 
   [INFO] Results:
   [INFO] 
   [INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0
   [INFO] 
   [INFO] ------------------------------------------------------------------------
   [INFO] BUILD SUCCESS
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time:  40.001 s
   [INFO] Finished at: 2021-01-20T13:42:05-08:00
   [INFO] ------------------------------------------------------------------------
   ```
   
   
   ### Documentation (Optional)
   
   - In case of new functionality, my PR adds documentation in the following wiki page:
   
   (Link the GitHub wiki you added)
   
   ### Commits
   
   - My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Code Quality
   
   - My diff has been formatted using helix-style.xml 
   (helix-style-intellij.xml if IntelliJ IDE is used)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1615: Baseline movement constraint

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1615:
URL: https://github.com/apache/helix/pull/1615#discussion_r562311594



##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/BestPossibleStatePartitionMovementConstraint.java
##########
@@ -0,0 +1,20 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+import java.util.Map;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterModelProvider;
+
+/**
+ * Evaluate the proposed assignment according to the potential partition movements cost.
+ * The cost is evaluated based on the existing best possible state assignment.
+ */
+public class BestPossibleStatePartitionMovementConstraint extends PartitionMovementConstraint {

Review comment:
       nit, since the 2 pipelines are taking different things as the best possible assignments (global pipeline uses the baseline, and the partial rebalance pipeline uses the best possible assignments), let's name the method more generic to avoid confusion. I suggest just call it PartitionMovementConstraint. Since any change means a movement or ST.
   
   On the other hand, we can change the "BaselinePartitionMovementConstraint" to be something like BaselineConvergeConstraint. The larger score or larger constraint weight means more converging to the baseline.

##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ConstraintBasedAlgorithmFactory.java
##########
@@ -67,13 +68,15 @@ public static RebalanceAlgorithm getInstance(
         preferences.getOrDefault(ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT, 1);
 
     List<SoftConstraint> softConstraints = ImmutableList
-        .of(new PartitionMovementConstraint(), new InstancePartitionsCountConstraint(),
-            new ResourcePartitionAntiAffinityConstraint(),
+        .of(new BaselinePartitionMovementConstraint(),
+            new BestPossibleStatePartitionMovementConstraint(),
+            new InstancePartitionsCountConstraint(), new ResourcePartitionAntiAffinityConstraint(),
             new ResourceTopStateAntiAffinityConstraint(), new MaxCapacityUsageInstanceConstraint());
     Map<SoftConstraint, Float> softConstraintsWithWeight = Maps.toMap(softConstraints, key -> {
       String name = key.getClass().getSimpleName();
       float weight = MODEL.get(name);
-      return name.equals(PartitionMovementConstraint.class.getSimpleName()) ?
+      return name.equals(BaselinePartitionMovementConstraint.class.getSimpleName()) || name

Review comment:
       Put "BaselinePartitionMovementConstraint" here might be not right. A larger weight of "BaselinePartitionMovementConstraint" result might indicate more movements, right? Unfortunately, it is not evenness as well. Maybe we shall introduce the 3rd preference value for this one. Or we just don't adjust it's weight based on the rebalance preference.
   
   Either way works for this PR, since we will have a better idea after tuning. And for tuning, we won't need to use the rebalance preference. We can adjust the weight of the constraint directly.

##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/PartitionMovementConstraint.java
##########
@@ -31,65 +31,37 @@
 /**
  * Evaluate the proposed assignment according to the potential partition movements cost.
  * The cost is evaluated based on the difference between the old assignment and the new assignment.
- * In detail, we consider the following two previous assignments as the base.
- * - Baseline assignment that is calculated regardless of the node state (online/offline).
- * - Previous Best Possible assignment.
- * Any change to these two assignments will increase the partition movements cost, so that the
+ * Any change from the old assignment will increase the partition movements cost, so that the
  * evaluated score will become lower.
  */
-class PartitionMovementConstraint extends SoftConstraint {
+abstract class PartitionMovementConstraint extends SoftConstraint {
   private static final double MAX_SCORE = 1f;
   private static final double MIN_SCORE = 0f;
   // The scale factor to adjust score when the proposed allocation partially matches the assignment
   // plan but will require a state transition (with partition movement).
   // TODO: these factors will be tuned based on user's preference
   private static final double STATE_TRANSITION_COST_FACTOR = 0.5;
-  private static final double MOVEMENT_COST_FACTOR = 0.25;
 
   PartitionMovementConstraint() {
     super(MAX_SCORE, MIN_SCORE);
   }
 
   /**
-   * @return 1 if the proposed assignment completely matches the previous best possible assignment
-   *         (or baseline assignment if the replica is newly added).
+   * @return 1 if the proposed assignment completely matches the previous assignment.
    *         STATE_TRANSITION_COST_FACTOR if the proposed assignment's allocation matches the
-   *         previous Best Possible assignment (or baseline assignment if the replica is newly
-   *         added) but state does not match.
-   *         MOVEMENT_COST_FACTOR if the proposed assignment's allocation matches the baseline
-   *         assignment only, but not matches the previous best possible assignment.
-   *         0 if the proposed assignment is a pure random movement.
+   *         previous assignment but state does not match.
    */
   @Override
   protected double getAssignmentScore(AssignableNode node, AssignableReplica replica,
       ClusterContext clusterContext) {
-    Map<String, String> bestPossibleAssignment =
-        getStateMap(replica, clusterContext.getBestPossibleAssignment());
-    Map<String, String> baselineAssignment =
-        getStateMap(replica, clusterContext.getBaselineAssignment());
-    String nodeName = node.getInstanceName();
-    String state = replica.getReplicaState();
-
-    if (bestPossibleAssignment.isEmpty()) {
-      // If bestPossibleAssignment of the replica is empty, indicating this is a new replica.
-      // Then the baseline is the only reference.
-      return calculateAssignmentScore(nodeName, state, baselineAssignment);
-    } else {
-      // Else, for minimizing partition movements or state transitions, prioritize the proposed
-      // assignment that matches the previous Best Possible assignment.
-      double score = calculateAssignmentScore(nodeName, state, bestPossibleAssignment);
-      // If no Best Possible assignment matches, check the baseline assignment.
-      if (score == 0 && baselineAssignment.containsKey(nodeName)) {
-        // Although not desired, the proposed assignment that matches the baseline is still better
-        // than a random movement. So try to evaluate the score with the MOVEMENT_COST_FACTOR
-        // punishment.
-        score = MOVEMENT_COST_FACTOR;
-      }
-      return score;
-    }
+    return calculateAssignmentScore(node.getInstanceName(), replica.getReplicaState(),
+        getReplicaStateMap(replica, clusterContext));
   }
 
-  private Map<String, String> getStateMap(AssignableReplica replica,
+  protected abstract Map<String, String> getReplicaStateMap(AssignableReplica replica,

Review comment:
       nit, The name of this method is very similar to the method below. And the functionality is very close. To avoid confusion, shall we call this one getStateMap, and the following one getStateMapFromAssignment? There should be a better name, please put more thoughts there : )

##########
File path: helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestPartitionMovementConstraint.java
##########
@@ -55,126 +57,55 @@ public void init() {
   }
 
   @Test
-  public void testGetAssignmentScoreWhenBestPossibleBaselineMissing() {
+  public void testGetAssignmentScore() {
+    // missing best possible and baseline
     when(_clusterContext.getBaselineAssignment()).thenReturn(Collections.emptyMap());
     when(_clusterContext.getBestPossibleAssignment()).thenReturn(Collections.emptyMap());
-    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    double score = _baselineConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
     double normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+        _baselineConstraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertEquals(score, 0.0);
+    Assert.assertEquals(normalizedScore, 0.0);
+    score =
+        _bestPossibleStateConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    normalizedScore = _bestPossibleStateConstraint
+        .getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
     Assert.assertEquals(score, 0.0);
     Assert.assertEquals(normalizedScore, 0.0);
-  }
 
-  @Test
-  public void testGetAssignmentScoreWhenBestPossibleBaselineSame() {
     ResourceAssignment mockResourceAssignment = mock(ResourceAssignment.class);
     when(mockResourceAssignment.getReplicaMap(new Partition(PARTITION)))
         .thenReturn(ImmutableMap.of(INSTANCE, "Master"));
     Map<String, ResourceAssignment> assignmentMap =
         ImmutableMap.of(RESOURCE, mockResourceAssignment);
     when(_clusterContext.getBaselineAssignment()).thenReturn(assignmentMap);
     when(_clusterContext.getBestPossibleAssignment()).thenReturn(assignmentMap);
+
     // when the calculated states are both equal to the replica's current state
     when(_testReplica.getReplicaState()).thenReturn("Master");
-    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
-    double normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
-
-    Assert.assertEquals(score, 1.0);
-    Assert.assertEquals(normalizedScore, 1.0);
-    // when the calculated states are both different from the replica's current state
-    when(_testReplica.getReplicaState()).thenReturn("Slave");
-    score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    score = _baselineConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
     normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
-
-    Assert.assertEquals(score, 0.5);
-    Assert.assertEquals(normalizedScore, 0.5);
-  }
-
-  @Test
-  public void testGetAssignmentScoreWhenBestPossibleBaselineOpposite() {
-    String instanceNameA = INSTANCE + "A";
-    String instanceNameB = INSTANCE + "B";
-    String instanceNameC = INSTANCE + "C";
-    AssignableNode testAssignableNode = mock(AssignableNode.class);
-
-    ResourceAssignment bestPossibleResourceAssignment = mock(ResourceAssignment.class);
-    when(bestPossibleResourceAssignment.getReplicaMap(new Partition(PARTITION)))
-        .thenReturn(ImmutableMap.of(instanceNameA, "Master", instanceNameB, "Slave"));
-    when(_clusterContext.getBestPossibleAssignment())
-        .thenReturn(ImmutableMap.of(RESOURCE, bestPossibleResourceAssignment));
-    ResourceAssignment baselineResourceAssignment = mock(ResourceAssignment.class);
-    when(baselineResourceAssignment.getReplicaMap(new Partition(PARTITION)))
-        .thenReturn(ImmutableMap.of(instanceNameA, "Slave", instanceNameC, "Master"));
-    when(_clusterContext.getBaselineAssignment())
-        .thenReturn(ImmutableMap.of(RESOURCE, baselineResourceAssignment));
-
-    // when the replica's state matches with best possible
-    when(testAssignableNode.getInstanceName()).thenReturn(instanceNameA);
-    when(_testReplica.getReplicaState()).thenReturn("Master");
-    double score =
-        _constraint.getAssignmentScore(testAssignableNode, _testReplica, _clusterContext);
-    double normalizedScore =
-        _constraint.getAssignmentNormalizedScore(testAssignableNode, _testReplica, _clusterContext);
+        _baselineConstraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
     Assert.assertEquals(score, 1.0);
     Assert.assertEquals(normalizedScore, 1.0);
-
-    // when the replica's allocation matches with best possible
-    when(testAssignableNode.getInstanceName()).thenReturn(instanceNameB);
-    when(_testReplica.getReplicaState()).thenReturn("Master");
-    score = _constraint.getAssignmentScore(testAssignableNode, _testReplica, _clusterContext);
-    normalizedScore =
-        _constraint.getAssignmentNormalizedScore(testAssignableNode, _testReplica, _clusterContext);
-    Assert.assertEquals(score, 0.5);
-    Assert.assertEquals(normalizedScore, 0.5);
-
-    // when the replica's state matches with baseline only
-    when(testAssignableNode.getInstanceName()).thenReturn(instanceNameC);
-    when(_testReplica.getReplicaState()).thenReturn("Master");
-    score = _constraint.getAssignmentScore(testAssignableNode, _testReplica, _clusterContext);
-    normalizedScore =
-        _constraint.getAssignmentNormalizedScore(testAssignableNode, _testReplica, _clusterContext);
-    // The calculated score is lower than previous value cause the replica's state matches with
-    // best possible is preferred
-    Assert.assertEquals(score, 0.25);
-    Assert.assertEquals(normalizedScore, 0.25);
-
-    // when the replica's allocation matches with baseline only
-    when(testAssignableNode.getInstanceName()).thenReturn(instanceNameC);
-    when(_testReplica.getReplicaState()).thenReturn("Slave");
-    score = _constraint.getAssignmentScore(testAssignableNode, _testReplica, _clusterContext);
-    normalizedScore =
-        _constraint.getAssignmentNormalizedScore(testAssignableNode, _testReplica, _clusterContext);
-    // The calculated score is lower than previous value cause the replica's state matches with
-    // best possible is preferred
-    Assert.assertEquals(score, 0.25);
-    Assert.assertEquals(normalizedScore, 0.25);
-  }
-
-  @Test
-  public void testGetAssignmentScoreWhenBestPossibleMissing() {
-    ResourceAssignment mockResourceAssignment = mock(ResourceAssignment.class);
-    when(mockResourceAssignment.getReplicaMap(new Partition(PARTITION)))
-        .thenReturn(ImmutableMap.of(INSTANCE, "Master"));
-    Map<String, ResourceAssignment> assignmentMap =
-        ImmutableMap.of(RESOURCE, mockResourceAssignment);
-    when(_clusterContext.getBaselineAssignment()).thenReturn(assignmentMap);
-    when(_clusterContext.getBestPossibleAssignment()).thenReturn(Collections.emptyMap());
-    // when the calculated states are both equal to the replica's current state
-    when(_testReplica.getReplicaState()).thenReturn("Master");
-    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
-    double normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
-
+    score =
+        _bestPossibleStateConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    normalizedScore = _bestPossibleStateConstraint
+        .getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
     Assert.assertEquals(score, 1.0);
     Assert.assertEquals(normalizedScore, 1.0);
+
     // when the calculated states are both different from the replica's current state
     when(_testReplica.getReplicaState()).thenReturn("Slave");
-    score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    score = _baselineConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
     normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
-
+        _baselineConstraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertEquals(score, 0.5);

Review comment:
       This PR is fine. But this reminds me of one issue. So if a placement matches both assignment records (but not the state), then this placement would be evaluated as the same score as another candidate with the state, and allocation matches one of the records. In this case, this result will most possibly cause an unnecessary state change. I actually see some flip-flop leader state movements due to this.
   
   So we shall ensure the following 4 levels of scoring at least to avoid this scenario,
   1. no match - 0
   2. the allocation match one record - less than 0.5, maybe 0.3
   3. both state and allocation match - the highest score 1
   
   In this case, if these 2 constraints are configured in the same way, we have 5 levels combined.
   0 - no match in both records
   0.3 - allocation matches one record
   0.6 - allocation matches two records
   1.3 - allocation matches one and state matches the other record
   2 - state matches both records
   
   There is still a chance that if the user configured the constraint weights so the assumption above is broken again. So I guess we need more work in this area.

##########
File path: helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/ClusterModelTestHelper.java
##########
@@ -42,7 +42,8 @@ public ClusterModel getDefaultClusterModel() throws IOException {
     Set<AssignableNode> assignableNodes = generateNodes(testCache);
 
     ClusterContext context =
-        new ClusterContext(assignableReplicas, assignableNodes, Collections.emptyMap(), Collections.emptyMap());
+        new ClusterContext(assignableReplicas, assignableNodes, Collections.emptyMap(),

Review comment:
       Avoid trivial code changes?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] kaisun2000 commented on pull request #1615: Baseline movement constraint

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on pull request #1615:
URL: https://github.com/apache/helix/pull/1615#issuecomment-772758036


   Two things:
   1/ Now we use github test run results as standard, which in this case is ok. Shall we update the Test description?
   2/ `TestPreferenceListAsQueue.testParallelismInStateModel` did not fail before. Can you double check this change won't really affect the correctness of this test?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] kaisun2000 edited a comment on pull request #1615: Baseline movement constraint

Posted by GitBox <gi...@apache.org>.
kaisun2000 edited a comment on pull request #1615:
URL: https://github.com/apache/helix/pull/1615#issuecomment-772758036


   Two things:
   1/ Now we use github test run results as standard, which in this case is ok. Shall we update the Test description?
   2/ `TestPreferenceListAsQueue.testParallelismInStateModel` did not fail before. Can you double check this change won't really affect the correctness of this test? I did have a look and don't see any potential possibility, but it would be good for you to double check too. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on pull request #1615: Baseline movement constraint

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on pull request #1615:
URL: https://github.com/apache/helix/pull/1615#issuecomment-768769133


   @jiajunwang Could you create a branch for me to push to? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 closed pull request #1615: Baseline movement constraint

Posted by GitBox <gi...@apache.org>.
NealSun96 closed pull request #1615:
URL: https://github.com/apache/helix/pull/1615


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1615: Baseline movement constraint

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1615:
URL: https://github.com/apache/helix/pull/1615#discussion_r561397256



##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ConstraintBasedAlgorithmFactory.java
##########
@@ -67,13 +68,14 @@ public static RebalanceAlgorithm getInstance(
         preferences.getOrDefault(ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT, 1);
 
     List<SoftConstraint> softConstraints = ImmutableList
-        .of(new PartitionMovementConstraint(), new InstancePartitionsCountConstraint(),
-            new ResourcePartitionAntiAffinityConstraint(),
+        .of(new PartitionMovementConstraintGR(), new PartitionMovementConstraintPR(),
+            new InstancePartitionsCountConstraint(), new ResourcePartitionAntiAffinityConstraint(),
             new ResourceTopStateAntiAffinityConstraint(), new MaxCapacityUsageInstanceConstraint());
     Map<SoftConstraint, Float> softConstraintsWithWeight = Maps.toMap(softConstraints, key -> {
       String name = key.getClass().getSimpleName();
       float weight = MODEL.get(name);
-      return name.equals(PartitionMovementConstraint.class.getSimpleName()) ?
+      return name.equals(PartitionMovementConstraintGR.class.getSimpleName()) || name

Review comment:
       In this case, I think they are both "instance of" PartitionMovementConstraint, right? Will that check be simpler?

##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ConstraintBasedAlgorithmFactory.java
##########
@@ -38,7 +38,8 @@
   private static final Map<String, Float> MODEL = new HashMap<String, Float>() {
     {
       // The default setting
-      put(PartitionMovementConstraint.class.getSimpleName(), 2f);
+      put(PartitionMovementConstraintGR.class.getSimpleName(), 2f);

Review comment:
       Let's give these 2 constraints more meaningful names.
   
   Also does this PR change the distribution of the weight among the constraint? I will take a look at the code below.

##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/PartitionMovementConstraint.java
##########
@@ -39,7 +39,7 @@
  */
 class PartitionMovementConstraint extends SoftConstraint {

Review comment:
       Make it an abstract class?

##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/ClusterContext.java
##########
@@ -135,6 +140,10 @@ public float getEstimatedMaxUtilization() {
     return _estimatedMaxUtilization;
   }
 
+  public ClusterModelProvider.RebalanceScopeType getScopeType() {

Review comment:
       As we discussed offline, this type will break the module design. It introduces a detail in one component to the other one. It would be better to make the constraint algorithm independent.
   Please have a try to split the constraint in the other way, baseline check constraint vs. best possible check constrant. Then we can tune the constraint weight to achieve our goal.

##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/BestPossibleStatePartitionMovementConstraint.java
##########
@@ -0,0 +1,20 @@
+package org.apache.helix.controller.rebalancer.waged.constraints;
+
+import java.util.Map;
+
+import org.apache.helix.controller.rebalancer.waged.model.AssignableNode;
+import org.apache.helix.controller.rebalancer.waged.model.AssignableReplica;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterContext;
+import org.apache.helix.controller.rebalancer.waged.model.ClusterModelProvider;
+
+/**
+ * Evaluate the proposed assignment according to the potential partition movements cost.
+ * The cost is evaluated based on the existing best possible state assignment.
+ */
+public class BestPossibleStatePartitionMovementConstraint extends PartitionMovementConstraint {

Review comment:
       nit, since the 2 pipelines are taking different things as the best possible assignments (global pipeline uses the baseline, and the partial rebalance pipeline uses the best possible assignments), let's name the method more generic to avoid confusion. I suggest just call it PartitionMovementConstraint. Since any change means a movement or ST.
   
   On the other hand, we can change the "BaselinePartitionMovementConstraint" to be something like BaselineConvergeConstraint. The larger score or larger constraint weight means more converging to the baseline.

##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ConstraintBasedAlgorithmFactory.java
##########
@@ -67,13 +68,15 @@ public static RebalanceAlgorithm getInstance(
         preferences.getOrDefault(ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT, 1);
 
     List<SoftConstraint> softConstraints = ImmutableList
-        .of(new PartitionMovementConstraint(), new InstancePartitionsCountConstraint(),
-            new ResourcePartitionAntiAffinityConstraint(),
+        .of(new BaselinePartitionMovementConstraint(),
+            new BestPossibleStatePartitionMovementConstraint(),
+            new InstancePartitionsCountConstraint(), new ResourcePartitionAntiAffinityConstraint(),
             new ResourceTopStateAntiAffinityConstraint(), new MaxCapacityUsageInstanceConstraint());
     Map<SoftConstraint, Float> softConstraintsWithWeight = Maps.toMap(softConstraints, key -> {
       String name = key.getClass().getSimpleName();
       float weight = MODEL.get(name);
-      return name.equals(PartitionMovementConstraint.class.getSimpleName()) ?
+      return name.equals(BaselinePartitionMovementConstraint.class.getSimpleName()) || name

Review comment:
       Put "BaselinePartitionMovementConstraint" here might be not right. A larger weight of "BaselinePartitionMovementConstraint" result might indicate more movements, right? Unfortunately, it is not evenness as well. Maybe we shall introduce the 3rd preference value for this one. Or we just don't adjust it's weight based on the rebalance preference.
   
   Either way works for this PR, since we will have a better idea after tuning. And for tuning, we won't need to use the rebalance preference. We can adjust the weight of the constraint directly.

##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/PartitionMovementConstraint.java
##########
@@ -31,65 +31,37 @@
 /**
  * Evaluate the proposed assignment according to the potential partition movements cost.
  * The cost is evaluated based on the difference between the old assignment and the new assignment.
- * In detail, we consider the following two previous assignments as the base.
- * - Baseline assignment that is calculated regardless of the node state (online/offline).
- * - Previous Best Possible assignment.
- * Any change to these two assignments will increase the partition movements cost, so that the
+ * Any change from the old assignment will increase the partition movements cost, so that the
  * evaluated score will become lower.
  */
-class PartitionMovementConstraint extends SoftConstraint {
+abstract class PartitionMovementConstraint extends SoftConstraint {
   private static final double MAX_SCORE = 1f;
   private static final double MIN_SCORE = 0f;
   // The scale factor to adjust score when the proposed allocation partially matches the assignment
   // plan but will require a state transition (with partition movement).
   // TODO: these factors will be tuned based on user's preference
   private static final double STATE_TRANSITION_COST_FACTOR = 0.5;
-  private static final double MOVEMENT_COST_FACTOR = 0.25;
 
   PartitionMovementConstraint() {
     super(MAX_SCORE, MIN_SCORE);
   }
 
   /**
-   * @return 1 if the proposed assignment completely matches the previous best possible assignment
-   *         (or baseline assignment if the replica is newly added).
+   * @return 1 if the proposed assignment completely matches the previous assignment.
    *         STATE_TRANSITION_COST_FACTOR if the proposed assignment's allocation matches the
-   *         previous Best Possible assignment (or baseline assignment if the replica is newly
-   *         added) but state does not match.
-   *         MOVEMENT_COST_FACTOR if the proposed assignment's allocation matches the baseline
-   *         assignment only, but not matches the previous best possible assignment.
-   *         0 if the proposed assignment is a pure random movement.
+   *         previous assignment but state does not match.
    */
   @Override
   protected double getAssignmentScore(AssignableNode node, AssignableReplica replica,
       ClusterContext clusterContext) {
-    Map<String, String> bestPossibleAssignment =
-        getStateMap(replica, clusterContext.getBestPossibleAssignment());
-    Map<String, String> baselineAssignment =
-        getStateMap(replica, clusterContext.getBaselineAssignment());
-    String nodeName = node.getInstanceName();
-    String state = replica.getReplicaState();
-
-    if (bestPossibleAssignment.isEmpty()) {
-      // If bestPossibleAssignment of the replica is empty, indicating this is a new replica.
-      // Then the baseline is the only reference.
-      return calculateAssignmentScore(nodeName, state, baselineAssignment);
-    } else {
-      // Else, for minimizing partition movements or state transitions, prioritize the proposed
-      // assignment that matches the previous Best Possible assignment.
-      double score = calculateAssignmentScore(nodeName, state, bestPossibleAssignment);
-      // If no Best Possible assignment matches, check the baseline assignment.
-      if (score == 0 && baselineAssignment.containsKey(nodeName)) {
-        // Although not desired, the proposed assignment that matches the baseline is still better
-        // than a random movement. So try to evaluate the score with the MOVEMENT_COST_FACTOR
-        // punishment.
-        score = MOVEMENT_COST_FACTOR;
-      }
-      return score;
-    }
+    return calculateAssignmentScore(node.getInstanceName(), replica.getReplicaState(),
+        getReplicaStateMap(replica, clusterContext));
   }
 
-  private Map<String, String> getStateMap(AssignableReplica replica,
+  protected abstract Map<String, String> getReplicaStateMap(AssignableReplica replica,

Review comment:
       nit, The name of this method is very similar to the method below. And the functionality is very close. To avoid confusion, shall we call this one getStateMap, and the following one getStateMapFromAssignment? There should be a better name, please put more thoughts there : )

##########
File path: helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestPartitionMovementConstraint.java
##########
@@ -55,126 +57,55 @@ public void init() {
   }
 
   @Test
-  public void testGetAssignmentScoreWhenBestPossibleBaselineMissing() {
+  public void testGetAssignmentScore() {
+    // missing best possible and baseline
     when(_clusterContext.getBaselineAssignment()).thenReturn(Collections.emptyMap());
     when(_clusterContext.getBestPossibleAssignment()).thenReturn(Collections.emptyMap());
-    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    double score = _baselineConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
     double normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+        _baselineConstraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertEquals(score, 0.0);
+    Assert.assertEquals(normalizedScore, 0.0);
+    score =
+        _bestPossibleStateConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    normalizedScore = _bestPossibleStateConstraint
+        .getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
     Assert.assertEquals(score, 0.0);
     Assert.assertEquals(normalizedScore, 0.0);
-  }
 
-  @Test
-  public void testGetAssignmentScoreWhenBestPossibleBaselineSame() {
     ResourceAssignment mockResourceAssignment = mock(ResourceAssignment.class);
     when(mockResourceAssignment.getReplicaMap(new Partition(PARTITION)))
         .thenReturn(ImmutableMap.of(INSTANCE, "Master"));
     Map<String, ResourceAssignment> assignmentMap =
         ImmutableMap.of(RESOURCE, mockResourceAssignment);
     when(_clusterContext.getBaselineAssignment()).thenReturn(assignmentMap);
     when(_clusterContext.getBestPossibleAssignment()).thenReturn(assignmentMap);
+
     // when the calculated states are both equal to the replica's current state
     when(_testReplica.getReplicaState()).thenReturn("Master");
-    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
-    double normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
-
-    Assert.assertEquals(score, 1.0);
-    Assert.assertEquals(normalizedScore, 1.0);
-    // when the calculated states are both different from the replica's current state
-    when(_testReplica.getReplicaState()).thenReturn("Slave");
-    score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    score = _baselineConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
     normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
-
-    Assert.assertEquals(score, 0.5);
-    Assert.assertEquals(normalizedScore, 0.5);
-  }
-
-  @Test
-  public void testGetAssignmentScoreWhenBestPossibleBaselineOpposite() {
-    String instanceNameA = INSTANCE + "A";
-    String instanceNameB = INSTANCE + "B";
-    String instanceNameC = INSTANCE + "C";
-    AssignableNode testAssignableNode = mock(AssignableNode.class);
-
-    ResourceAssignment bestPossibleResourceAssignment = mock(ResourceAssignment.class);
-    when(bestPossibleResourceAssignment.getReplicaMap(new Partition(PARTITION)))
-        .thenReturn(ImmutableMap.of(instanceNameA, "Master", instanceNameB, "Slave"));
-    when(_clusterContext.getBestPossibleAssignment())
-        .thenReturn(ImmutableMap.of(RESOURCE, bestPossibleResourceAssignment));
-    ResourceAssignment baselineResourceAssignment = mock(ResourceAssignment.class);
-    when(baselineResourceAssignment.getReplicaMap(new Partition(PARTITION)))
-        .thenReturn(ImmutableMap.of(instanceNameA, "Slave", instanceNameC, "Master"));
-    when(_clusterContext.getBaselineAssignment())
-        .thenReturn(ImmutableMap.of(RESOURCE, baselineResourceAssignment));
-
-    // when the replica's state matches with best possible
-    when(testAssignableNode.getInstanceName()).thenReturn(instanceNameA);
-    when(_testReplica.getReplicaState()).thenReturn("Master");
-    double score =
-        _constraint.getAssignmentScore(testAssignableNode, _testReplica, _clusterContext);
-    double normalizedScore =
-        _constraint.getAssignmentNormalizedScore(testAssignableNode, _testReplica, _clusterContext);
+        _baselineConstraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
     Assert.assertEquals(score, 1.0);
     Assert.assertEquals(normalizedScore, 1.0);
-
-    // when the replica's allocation matches with best possible
-    when(testAssignableNode.getInstanceName()).thenReturn(instanceNameB);
-    when(_testReplica.getReplicaState()).thenReturn("Master");
-    score = _constraint.getAssignmentScore(testAssignableNode, _testReplica, _clusterContext);
-    normalizedScore =
-        _constraint.getAssignmentNormalizedScore(testAssignableNode, _testReplica, _clusterContext);
-    Assert.assertEquals(score, 0.5);
-    Assert.assertEquals(normalizedScore, 0.5);
-
-    // when the replica's state matches with baseline only
-    when(testAssignableNode.getInstanceName()).thenReturn(instanceNameC);
-    when(_testReplica.getReplicaState()).thenReturn("Master");
-    score = _constraint.getAssignmentScore(testAssignableNode, _testReplica, _clusterContext);
-    normalizedScore =
-        _constraint.getAssignmentNormalizedScore(testAssignableNode, _testReplica, _clusterContext);
-    // The calculated score is lower than previous value cause the replica's state matches with
-    // best possible is preferred
-    Assert.assertEquals(score, 0.25);
-    Assert.assertEquals(normalizedScore, 0.25);
-
-    // when the replica's allocation matches with baseline only
-    when(testAssignableNode.getInstanceName()).thenReturn(instanceNameC);
-    when(_testReplica.getReplicaState()).thenReturn("Slave");
-    score = _constraint.getAssignmentScore(testAssignableNode, _testReplica, _clusterContext);
-    normalizedScore =
-        _constraint.getAssignmentNormalizedScore(testAssignableNode, _testReplica, _clusterContext);
-    // The calculated score is lower than previous value cause the replica's state matches with
-    // best possible is preferred
-    Assert.assertEquals(score, 0.25);
-    Assert.assertEquals(normalizedScore, 0.25);
-  }
-
-  @Test
-  public void testGetAssignmentScoreWhenBestPossibleMissing() {
-    ResourceAssignment mockResourceAssignment = mock(ResourceAssignment.class);
-    when(mockResourceAssignment.getReplicaMap(new Partition(PARTITION)))
-        .thenReturn(ImmutableMap.of(INSTANCE, "Master"));
-    Map<String, ResourceAssignment> assignmentMap =
-        ImmutableMap.of(RESOURCE, mockResourceAssignment);
-    when(_clusterContext.getBaselineAssignment()).thenReturn(assignmentMap);
-    when(_clusterContext.getBestPossibleAssignment()).thenReturn(Collections.emptyMap());
-    // when the calculated states are both equal to the replica's current state
-    when(_testReplica.getReplicaState()).thenReturn("Master");
-    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
-    double normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
-
+    score =
+        _bestPossibleStateConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    normalizedScore = _bestPossibleStateConstraint
+        .getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
     Assert.assertEquals(score, 1.0);
     Assert.assertEquals(normalizedScore, 1.0);
+
     // when the calculated states are both different from the replica's current state
     when(_testReplica.getReplicaState()).thenReturn("Slave");
-    score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    score = _baselineConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
     normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
-
+        _baselineConstraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertEquals(score, 0.5);

Review comment:
       This PR is fine. But this reminds me of one issue. So if a placement matches both assignment records (but not the state), then this placement would be evaluated as the same score as another candidate with the state, and allocation matches one of the records. In this case, this result will most possibly cause an unnecessary state change. I actually see some flip-flop leader state movements due to this.
   
   So we shall ensure the following 4 levels of scoring at least to avoid this scenario,
   1. no match - 0
   2. the allocation match one record - less than 0.5, maybe 0.3
   3. both state and allocation match - the highest score 1
   
   In this case, if these 2 constraints are configured in the same way, we have 5 levels combined.
   0 - no match in both records
   0.3 - allocation matches one record
   0.6 - allocation matches two records
   1.3 - allocation matches one and state matches the other record
   2 - state matches both records
   
   There is still a chance that if the user configured the constraint weights so the assumption above is broken again. So I guess we need more work in this area.

##########
File path: helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/ClusterModelTestHelper.java
##########
@@ -42,7 +42,8 @@ public ClusterModel getDefaultClusterModel() throws IOException {
     Set<AssignableNode> assignableNodes = generateNodes(testCache);
 
     ClusterContext context =
-        new ClusterContext(assignableReplicas, assignableNodes, Collections.emptyMap(), Collections.emptyMap());
+        new ClusterContext(assignableReplicas, assignableNodes, Collections.emptyMap(),

Review comment:
       Avoid trivial code changes?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] kaisun2000 edited a comment on pull request #1615: Baseline movement constraint

Posted by GitBox <gi...@apache.org>.
kaisun2000 edited a comment on pull request #1615:
URL: https://github.com/apache/helix/pull/1615#issuecomment-772758036


   Two things:
   1/ Now we use github test run results as standard, which in this case is ok. Shall we update the Test description?
   2/ `TestPreferenceListAsQueue.testParallelismInStateModel` did not fail before. Can you double check this change won't really affect the correctness of this test? I did have a look and don't see any potential possibility, but it would be good for you to double check too. Just give it another eye.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1615: Baseline movement constraint

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1615:
URL: https://github.com/apache/helix/pull/1615#discussion_r561397256



##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ConstraintBasedAlgorithmFactory.java
##########
@@ -67,13 +68,14 @@ public static RebalanceAlgorithm getInstance(
         preferences.getOrDefault(ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT, 1);
 
     List<SoftConstraint> softConstraints = ImmutableList
-        .of(new PartitionMovementConstraint(), new InstancePartitionsCountConstraint(),
-            new ResourcePartitionAntiAffinityConstraint(),
+        .of(new PartitionMovementConstraintGR(), new PartitionMovementConstraintPR(),
+            new InstancePartitionsCountConstraint(), new ResourcePartitionAntiAffinityConstraint(),
             new ResourceTopStateAntiAffinityConstraint(), new MaxCapacityUsageInstanceConstraint());
     Map<SoftConstraint, Float> softConstraintsWithWeight = Maps.toMap(softConstraints, key -> {
       String name = key.getClass().getSimpleName();
       float weight = MODEL.get(name);
-      return name.equals(PartitionMovementConstraint.class.getSimpleName()) ?
+      return name.equals(PartitionMovementConstraintGR.class.getSimpleName()) || name

Review comment:
       In this case, I think they are both "instance of" PartitionMovementConstraint, right? Will that check be simpler?

##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ConstraintBasedAlgorithmFactory.java
##########
@@ -38,7 +38,8 @@
   private static final Map<String, Float> MODEL = new HashMap<String, Float>() {
     {
       // The default setting
-      put(PartitionMovementConstraint.class.getSimpleName(), 2f);
+      put(PartitionMovementConstraintGR.class.getSimpleName(), 2f);

Review comment:
       Let's give these 2 constraints more meaningful names.
   
   Also does this PR change the distribution of the weight among the constraint? I will take a look at the code below.

##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/PartitionMovementConstraint.java
##########
@@ -39,7 +39,7 @@
  */
 class PartitionMovementConstraint extends SoftConstraint {

Review comment:
       Make it an abstract class?

##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/ClusterContext.java
##########
@@ -135,6 +140,10 @@ public float getEstimatedMaxUtilization() {
     return _estimatedMaxUtilization;
   }
 
+  public ClusterModelProvider.RebalanceScopeType getScopeType() {

Review comment:
       As we discussed offline, this type will break the module design. It introduces a detail in one component to the other one. It would be better to make the constraint algorithm independent.
   Please have a try to split the constraint in the other way, baseline check constraint vs. best possible check constrant. Then we can tune the constraint weight to achieve our goal.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1615: Baseline movement constraint

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1615:
URL: https://github.com/apache/helix/pull/1615#discussion_r565789006



##########
File path: helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/constraints/TestPartitionMovementConstraint.java
##########
@@ -55,126 +57,55 @@ public void init() {
   }
 
   @Test
-  public void testGetAssignmentScoreWhenBestPossibleBaselineMissing() {
+  public void testGetAssignmentScore() {
+    // missing best possible and baseline
     when(_clusterContext.getBaselineAssignment()).thenReturn(Collections.emptyMap());
     when(_clusterContext.getBestPossibleAssignment()).thenReturn(Collections.emptyMap());
-    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    double score = _baselineConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
     double normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+        _baselineConstraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertEquals(score, 0.0);
+    Assert.assertEquals(normalizedScore, 0.0);
+    score =
+        _bestPossibleStateConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    normalizedScore = _bestPossibleStateConstraint
+        .getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
     Assert.assertEquals(score, 0.0);
     Assert.assertEquals(normalizedScore, 0.0);
-  }
 
-  @Test
-  public void testGetAssignmentScoreWhenBestPossibleBaselineSame() {
     ResourceAssignment mockResourceAssignment = mock(ResourceAssignment.class);
     when(mockResourceAssignment.getReplicaMap(new Partition(PARTITION)))
         .thenReturn(ImmutableMap.of(INSTANCE, "Master"));
     Map<String, ResourceAssignment> assignmentMap =
         ImmutableMap.of(RESOURCE, mockResourceAssignment);
     when(_clusterContext.getBaselineAssignment()).thenReturn(assignmentMap);
     when(_clusterContext.getBestPossibleAssignment()).thenReturn(assignmentMap);
+
     // when the calculated states are both equal to the replica's current state
     when(_testReplica.getReplicaState()).thenReturn("Master");
-    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
-    double normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
-
-    Assert.assertEquals(score, 1.0);
-    Assert.assertEquals(normalizedScore, 1.0);
-    // when the calculated states are both different from the replica's current state
-    when(_testReplica.getReplicaState()).thenReturn("Slave");
-    score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    score = _baselineConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
     normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
-
-    Assert.assertEquals(score, 0.5);
-    Assert.assertEquals(normalizedScore, 0.5);
-  }
-
-  @Test
-  public void testGetAssignmentScoreWhenBestPossibleBaselineOpposite() {
-    String instanceNameA = INSTANCE + "A";
-    String instanceNameB = INSTANCE + "B";
-    String instanceNameC = INSTANCE + "C";
-    AssignableNode testAssignableNode = mock(AssignableNode.class);
-
-    ResourceAssignment bestPossibleResourceAssignment = mock(ResourceAssignment.class);
-    when(bestPossibleResourceAssignment.getReplicaMap(new Partition(PARTITION)))
-        .thenReturn(ImmutableMap.of(instanceNameA, "Master", instanceNameB, "Slave"));
-    when(_clusterContext.getBestPossibleAssignment())
-        .thenReturn(ImmutableMap.of(RESOURCE, bestPossibleResourceAssignment));
-    ResourceAssignment baselineResourceAssignment = mock(ResourceAssignment.class);
-    when(baselineResourceAssignment.getReplicaMap(new Partition(PARTITION)))
-        .thenReturn(ImmutableMap.of(instanceNameA, "Slave", instanceNameC, "Master"));
-    when(_clusterContext.getBaselineAssignment())
-        .thenReturn(ImmutableMap.of(RESOURCE, baselineResourceAssignment));
-
-    // when the replica's state matches with best possible
-    when(testAssignableNode.getInstanceName()).thenReturn(instanceNameA);
-    when(_testReplica.getReplicaState()).thenReturn("Master");
-    double score =
-        _constraint.getAssignmentScore(testAssignableNode, _testReplica, _clusterContext);
-    double normalizedScore =
-        _constraint.getAssignmentNormalizedScore(testAssignableNode, _testReplica, _clusterContext);
+        _baselineConstraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
     Assert.assertEquals(score, 1.0);
     Assert.assertEquals(normalizedScore, 1.0);
-
-    // when the replica's allocation matches with best possible
-    when(testAssignableNode.getInstanceName()).thenReturn(instanceNameB);
-    when(_testReplica.getReplicaState()).thenReturn("Master");
-    score = _constraint.getAssignmentScore(testAssignableNode, _testReplica, _clusterContext);
-    normalizedScore =
-        _constraint.getAssignmentNormalizedScore(testAssignableNode, _testReplica, _clusterContext);
-    Assert.assertEquals(score, 0.5);
-    Assert.assertEquals(normalizedScore, 0.5);
-
-    // when the replica's state matches with baseline only
-    when(testAssignableNode.getInstanceName()).thenReturn(instanceNameC);
-    when(_testReplica.getReplicaState()).thenReturn("Master");
-    score = _constraint.getAssignmentScore(testAssignableNode, _testReplica, _clusterContext);
-    normalizedScore =
-        _constraint.getAssignmentNormalizedScore(testAssignableNode, _testReplica, _clusterContext);
-    // The calculated score is lower than previous value cause the replica's state matches with
-    // best possible is preferred
-    Assert.assertEquals(score, 0.25);
-    Assert.assertEquals(normalizedScore, 0.25);
-
-    // when the replica's allocation matches with baseline only
-    when(testAssignableNode.getInstanceName()).thenReturn(instanceNameC);
-    when(_testReplica.getReplicaState()).thenReturn("Slave");
-    score = _constraint.getAssignmentScore(testAssignableNode, _testReplica, _clusterContext);
-    normalizedScore =
-        _constraint.getAssignmentNormalizedScore(testAssignableNode, _testReplica, _clusterContext);
-    // The calculated score is lower than previous value cause the replica's state matches with
-    // best possible is preferred
-    Assert.assertEquals(score, 0.25);
-    Assert.assertEquals(normalizedScore, 0.25);
-  }
-
-  @Test
-  public void testGetAssignmentScoreWhenBestPossibleMissing() {
-    ResourceAssignment mockResourceAssignment = mock(ResourceAssignment.class);
-    when(mockResourceAssignment.getReplicaMap(new Partition(PARTITION)))
-        .thenReturn(ImmutableMap.of(INSTANCE, "Master"));
-    Map<String, ResourceAssignment> assignmentMap =
-        ImmutableMap.of(RESOURCE, mockResourceAssignment);
-    when(_clusterContext.getBaselineAssignment()).thenReturn(assignmentMap);
-    when(_clusterContext.getBestPossibleAssignment()).thenReturn(Collections.emptyMap());
-    // when the calculated states are both equal to the replica's current state
-    when(_testReplica.getReplicaState()).thenReturn("Master");
-    double score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
-    double normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
-
+    score =
+        _bestPossibleStateConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    normalizedScore = _bestPossibleStateConstraint
+        .getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
     Assert.assertEquals(score, 1.0);
     Assert.assertEquals(normalizedScore, 1.0);
+
     // when the calculated states are both different from the replica's current state
     when(_testReplica.getReplicaState()).thenReturn("Slave");
-    score = _constraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
+    score = _baselineConstraint.getAssignmentScore(_testNode, _testReplica, _clusterContext);
     normalizedScore =
-        _constraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
-
+        _baselineConstraint.getAssignmentNormalizedScore(_testNode, _testReplica, _clusterContext);
+    Assert.assertEquals(score, 0.5);

Review comment:
       This needs experiments later, resolve for now. 

##########
File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/constraints/ConstraintBasedAlgorithmFactory.java
##########
@@ -67,13 +68,15 @@ public static RebalanceAlgorithm getInstance(
         preferences.getOrDefault(ClusterConfig.GlobalRebalancePreferenceKey.LESS_MOVEMENT, 1);
 
     List<SoftConstraint> softConstraints = ImmutableList
-        .of(new PartitionMovementConstraint(), new InstancePartitionsCountConstraint(),
-            new ResourcePartitionAntiAffinityConstraint(),
+        .of(new BaselinePartitionMovementConstraint(),
+            new BestPossibleStatePartitionMovementConstraint(),
+            new InstancePartitionsCountConstraint(), new ResourcePartitionAntiAffinityConstraint(),
             new ResourceTopStateAntiAffinityConstraint(), new MaxCapacityUsageInstanceConstraint());
     Map<SoftConstraint, Float> softConstraintsWithWeight = Maps.toMap(softConstraints, key -> {
       String name = key.getClass().getSimpleName();
       float weight = MODEL.get(name);
-      return name.equals(PartitionMovementConstraint.class.getSimpleName()) ?
+      return name.equals(BaselinePartitionMovementConstraint.class.getSimpleName()) || name

Review comment:
       Needs experiments later, resolve for now. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on pull request #1615: Baseline movement constraint

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on pull request #1615:
URL: https://github.com/apache/helix/pull/1615#issuecomment-768825693


   > @jiajunwang Could you create a branch for me to push to?
   
   I have created **wagedImprove** for you @NealSun96 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org