Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2020/10/07 04:11:03 UTC

[GitHub] [helix] kaisun2000 opened a new pull request #1449: Wait till verify #1448

kaisun2000 opened a new pull request #1449:
URL: https://github.com/apache/helix/pull/1449


   ### Issues
   
   - [x] My PR addresses the following Helix issues and references them in the PR description:
   
   fix #1448 part 1
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI changes:
   
       HelixClusterVerifier verify() and related methods may return
       prematurely. The reason is that they check the converged stable
       condition too early, before the controller has had a chance to
       make any change. Basically, the previous stable state is mistaken
       for the expected next stable state.
       
       We fix this issue by adding a waitTillVerify() timeout at
       construction time of the verifier.
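
The race described above can be sketched in plain Java. This is only an illustration under stated assumptions, not Helix code: `DelayedPollingVerifier`, `SimulatedCluster`, `PrematureVerifyDemo`, and all timing values are hypothetical stand-ins. The toy "controller" reacts to a change only after a short delay, so a verifier that checks immediately still sees the old stable state.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a cluster verifier; this is NOT Helix's implementation.
final class DelayedPollingVerifier {
  private final BooleanSupplier isStable; // true when the observed state looks converged
  private final long waitTillVerifyMs;    // delay applied before the first check

  DelayedPollingVerifier(BooleanSupplier isStable, long waitTillVerifyMs) {
    this.isStable = isStable;
    this.waitTillVerifyMs = waitTillVerifyMs;
  }

  boolean verifyByPolling(long timeoutMs, long periodMs) throws InterruptedException {
    // Give the controller a chance to react before the first check.
    Thread.sleep(waitTillVerifyMs);
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (isStable.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() >= deadline) {
        return false;
      }
      Thread.sleep(periodMs);
    }
  }
}

// Toy cluster: stays "stable" in the old state until the controller reacts.
final class SimulatedCluster {
  volatile boolean stable = true;
  volatile int generation = 0; // 0 = previous stable state, 1 = next stable state

  void triggerChange(long reactionDelayMs, long convergeMs) {
    Thread controller = new Thread(() -> {
      try {
        Thread.sleep(reactionDelayMs); // controller has not noticed the change yet
        stable = false;                // rebalancing in progress
        Thread.sleep(convergeMs);
        generation = 1;                // written before 'stable' so readers see it
        stable = true;
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    controller.setDaemon(true);
    controller.start();
  }
}

final class PrematureVerifyDemo {
  // Returns the generation observed when verification succeeded, or -1 on timeout.
  static int generationSeenWhenVerified(long waitTillVerifyMs) {
    SimulatedCluster cluster = new SimulatedCluster();
    cluster.triggerChange(100, 600);
    DelayedPollingVerifier verifier =
        new DelayedPollingVerifier(() -> cluster.stable, waitTillVerifyMs);
    try {
      return verifier.verifyByPolling(5_000, 50) ? cluster.generation : -1;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }
}
```

With no initial wait the sketch reports success while the toy cluster is still in its previous state. With a wait longer than the simulated reaction delay, success is reported only after the next stable state is reached.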
   
   ### Tests
   
   - [x] The following tests are written for this issue:
   
   github run
   - [ ] The following is the result of the "mvn test" command on the appropriate module:
   
   (Before CI test pass, please copy & paste the result of "mvn test")
   
   ### Documentation (Optional)
   
   - In case of new functionality, my PR adds documentation in the following wiki page:
   
   (Link the GitHub wiki you added)
   
   ### Commits
   
   - My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Code Quality
   
   - My diff has been formatted using helix-style.xml 
   (helix-style-intellij.xml if IntelliJ IDE is used)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] zhangmeng916 commented on a change in pull request #1449: HelixClusterVerifier verify() with default waitTillVerify time -- part one

Posted by GitBox <gi...@apache.org>.
zhangmeng916 commented on a change in pull request #1449:
URL: https://github.com/apache/helix/pull/1449#discussion_r501244706



##########
File path: helix-core/src/test/java/org/apache/helix/controller/changedetector/TestResourceChangeDetector.java
##########
@@ -438,7 +441,7 @@ public void testResetSnapshots() {
     Assert.assertEquals(
         changeDetector.getAdditionsByType(ChangeType.IDEAL_STATE).size() + changeDetector
             .getChangesByType(ChangeType.IDEAL_STATE).size() + changeDetector
-            .getRemovalsByType(ChangeType.IDEAL_STATE).size(), 2);
+            .getRemovalsByType(ChangeType.IDEAL_STATE).size(), 1);

Review comment:
       Does this number change mean that what we previously tested was wrong? How do we justify the change?






[GitHub] [helix] kaisun2000 commented on a change in pull request #1449: HelixClusterVerifier verify() with default waitTillVerify time -- part one

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on a change in pull request #1449:
URL: https://github.com/apache/helix/pull/1449#discussion_r501286984



##########
File path: helix-core/src/test/java/org/apache/helix/integration/rebalancer/TestSemiAutoRebalance.java
##########
@@ -92,33 +95,11 @@ public void beforeClass() throws InterruptedException {
     _controller = new ClusterControllerManager(ZK_ADDR, CLUSTER_NAME, controllerName);
     _controller.syncStart();
 
-    Thread.sleep(1000);
-
-    // verify ideal state and external view
-    IdealState idealState = _accessor.getProperty(_keyBuilder.idealStates(DB_NAME));
-    Assert.assertNotNull(idealState);
-    Assert.assertEquals(idealState.getNumPartitions(), PARTITION_NUMBER);
-    for (String partition : idealState.getPartitionSet()) {
-      List<String> preferenceList = idealState.getPreferenceList(partition);
-      Assert.assertNotNull(preferenceList);
-      Assert.assertEquals(preferenceList.size(), REPLICA_NUMBER);
-    }
-
-    ExternalView externalView = _accessor.getProperty(_keyBuilder.externalView(DB_NAME));
-    Assert.assertNotNull(externalView);
-    Assert.assertEquals(externalView.getPartitionSet().size(), PARTITION_NUMBER);
-    for (String partition : externalView.getPartitionSet()) {
-      Map<String, String> stateMap = externalView.getStateMap(partition);
-      Assert.assertEquals(stateMap.size(), REPLICA_NUMBER);
-
-      int masters = 0;
-      for (String state : stateMap.values()) {
-        if (state.equals(MasterSlaveSMD.States.MASTER.name())) {
-          ++masters;
-        }
-      }
-      Assert.assertEquals(masters, 1);
-    }
+    ZkHelixClusterVerifier verifier = new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME)

Review comment:
       Replaced by line 102. The previous approach is really too old, older even than the deprecated ClusterVerifier, and it sometimes does not work.
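
As a rough illustration of why a fixed `Thread.sleep(1000)` followed by one-shot assertions can be flaky, here is a minimal sketch of the alternative: re-check a condition until it holds or a timeout expires. `PollingAssertions`, `waitUntil`, and `hasOneMasterPerPartition` are hypothetical names, and the nested map is only a plain-Java stand-in for an external view, not a Helix type.

```java
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical helper; not part of Helix or its test utilities.
final class PollingAssertions {
  private PollingAssertions() {}

  // Re-evaluates the condition every intervalMs until it holds or timeoutMs elapses.
  static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long intervalMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    try {
      while (!condition.getAsBoolean()) {
        if (System.nanoTime() >= deadline) {
          return false;
        }
        Thread.sleep(intervalMs);
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  // view: partition -> (instance -> state). Mirrors the removed manual checks:
  // expected partition count, expected replica count, exactly one MASTER each.
  static boolean hasOneMasterPerPartition(
      Map<String, Map<String, String>> view, int partitions, int replicas) {
    if (view.size() != partitions) {
      return false;
    }
    for (Map<String, String> stateMap : view.values()) {
      long masters = stateMap.values().stream().filter("MASTER"::equals).count();
      if (stateMap.size() != replicas || masters != 1) {
        return false;
      }
    }
    return true;
  }
}
```

A test would then call something like `waitUntil(() -> hasOneMasterPerPartition(readView(), p, r), timeout, interval)`, where `readView()` is whatever the test uses to fetch the current state (also an assumption here).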






[GitHub] [helix] zhangmeng916 commented on a change in pull request #1449: HelixClusterVerifier verify() with default waitTillVerify time -- part one

Posted by GitBox <gi...@apache.org>.
zhangmeng916 commented on a change in pull request #1449:
URL: https://github.com/apache/helix/pull/1449#discussion_r501247614



##########
File path: helix-core/src/test/java/org/apache/helix/integration/TestDisableCustomCodeRunner.java
##########
@@ -209,9 +214,7 @@ public void test() throws Exception {
 
     // Re-enable custom-code runner
     admin.enableResource(clusterName, customCodeRunnerResource, true);
-    result = ClusterStateVerifier.verifyByZkCallback(
-        new ClusterStateVerifier.BestPossAndExtViewZkVerifier(ZK_ADDR, clusterName));
-    Assert.assertTrue(result);
+    Assert.assertTrue(verifier.verifyByPolling());

Review comment:
       Just to make sure: besides waiting for some time before verifying, is the verifyByPolling function the same as what we did previously?






[GitHub] [helix] kaisun2000 commented on pull request #1449: HelixClusterVerifier verify() with default waitTillVerify time -- part one

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on pull request #1449:
URL: https://github.com/apache/helix/pull/1449#issuecomment-705251143


   This diff is approved. Please help merge it into trunk.
   
   >fix #1448 part 1
   HelixClusterVerifier verify() and related methods may return
   prematurely. The reason is that they check the converged stable
   condition too early, before the controller has had a chance to
   make any change. Basically, the previous stable state is mistaken
   for the expected next stable state.
   
   We fix this issue by adding a waitTillVerify() timeout at
   construction time of the verifier.




[GitHub] [helix] kaisun2000 commented on a change in pull request #1449: HelixClusterVerifier verify() with default waitTillVerify time -- part one

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on a change in pull request #1449:
URL: https://github.com/apache/helix/pull/1449#discussion_r501280305



##########
File path: helix-core/src/test/java/org/apache/helix/controller/changedetector/TestResourceChangeDetector.java
##########
@@ -438,7 +441,7 @@ public void testResetSnapshots() {
     Assert.assertEquals(
         changeDetector.getAdditionsByType(ChangeType.IDEAL_STATE).size() + changeDetector
             .getChangesByType(ChangeType.IDEAL_STATE).size() + changeDetector
-            .getRemovalsByType(ChangeType.IDEAL_STATE).size(), 2);
+            .getRemovalsByType(ChangeType.IDEAL_STATE).size(), 1);

Review comment:
       See lines 418 and 419:
   ```
         // remove newly added resource/ideastate
         _gSetupTool.getClusterManagementTool().dropResource(CLUSTER_NAME, resourceName);
   ```
   
   The resource newly added in the previous test is not really valid (confirmed with JJ before); left in place, it would break this test. So in this diff it is removed, and the number here is adjusted accordingly.
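
To make the arithmetic concrete, here is a small sketch of how a leftover resource could inflate a sum of additions, changes, and removals between two snapshots. It is an assumption-laden toy, not the real ResourceChangeDetector: `SnapshotDiff`, the resource names, and the version numbers are all hypothetical, and the actual test scenario may differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical snapshot differ; NOT Helix's ResourceChangeDetector.
final class SnapshotDiff {
  final Set<String> additions = new TreeSet<>();
  final Set<String> changes = new TreeSet<>();
  final Set<String> removals = new TreeSet<>();

  // Snapshots map a resource name to a version number.
  SnapshotDiff(Map<String, Integer> oldSnapshot, Map<String, Integer> newSnapshot) {
    for (Map.Entry<String, Integer> entry : newSnapshot.entrySet()) {
      Integer oldVersion = oldSnapshot.get(entry.getKey());
      if (oldVersion == null) {
        additions.add(entry.getKey());
      } else if (!oldVersion.equals(entry.getValue())) {
        changes.add(entry.getKey());
      }
    }
    for (String name : oldSnapshot.keySet()) {
      if (!newSnapshot.containsKey(name)) {
        removals.add(name);
      }
    }
  }

  int total() {
    return additions.size() + changes.size() + removals.size();
  }

  // Compares a snapshot taken before a change with one taken after it.
  // If an earlier test left a resource behind, it appears in the old snapshot
  // and shows up as an extra removal once it disappears.
  static int totalWithLeftover(boolean leftoverPresent) {
    Map<String, Integer> before = new HashMap<>();
    before.put("TestDB", 1);
    if (leftoverPresent) {
      before.put("LeftoverResource", 1);
    }
    Map<String, Integer> after = new HashMap<>();
    after.put("TestDB", 2); // the one intended change
    return new SnapshotDiff(before, after).total();
  }
}
```

In this toy setup the total is 2 when the leftover resource is present and 1 once it has been dropped beforehand, which is the kind of adjustment the comment above describes.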






[GitHub] [helix] kaisun2000 commented on a change in pull request #1449: HelixClusterVerifier verify() with default waitTillVerify time -- part one

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on a change in pull request #1449:
URL: https://github.com/apache/helix/pull/1449#discussion_r501285799



##########
File path: helix-core/src/test/java/org/apache/helix/integration/TestEnableCompression.java
##########
@@ -111,10 +111,14 @@ public void testEnableCompressionResource() throws Exception {
     }
 
     BestPossibleExternalViewVerifier verifier =
-        new BestPossibleExternalViewVerifier.Builder(clusterName).setZkAddr(ZK_ADDR)
-            .setExpectLiveInstances(expectedLiveInstances).setResources(expectedResources).build();
-    boolean result = verifier.verify(120000L);
-    Assert.assertTrue(result);
+        new BestPossibleExternalViewVerifier.Builder(clusterName).setZkClient(_gZkClient)
+            .setExpectLiveInstances(expectedLiveInstances).setResources(expectedResources)
+            .setWaitTillVerify(TestHelper.DEFAULT_REBALANCE_PROCESSING_WAIT_TIME)
+            .build();
+
+    System.out.println("before TestEnableCompression verify by polling");

Review comment:
       Removed.






[GitHub] [helix] kaisun2000 commented on a change in pull request #1449: HelixClusterVerifier verify() with default waitTillVerify time -- part one

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on a change in pull request #1449:
URL: https://github.com/apache/helix/pull/1449#discussion_r501288604



##########
File path: helix-core/src/test/java/org/apache/helix/integration/rebalancer/TestAutoRebalance.java
##########
@@ -164,9 +169,19 @@ public void testAutoRebalance() throws Exception {
     // kill 1 node
     _participants[0].syncStop();
 
-    boolean result = ClusterStateVerifier
-        .verifyByZkCallback(new ExternalViewBalancedVerifier(_gZkClient, CLUSTER_NAME, TEST_DB));
-    Assert.assertTrue(result);
+    ZkHelixClusterVerifier verifierTestDb = new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME)
+        .setResources(new HashSet<>(Collections.singleton(TEST_DB)))
+        .setZkClient(_gZkClient)
+        .setWaitTillVerify(TestHelper.DEFAULT_REBALANCE_PROCESSING_WAIT_TIME)
+        .build();
+    Assert.assertTrue(verifierTestDb.verifyByPolling());
+
+    ZkHelixClusterVerifier verifierDb2 = new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME)

Review comment:
       Changed to "verifierClusterDb2", as the purpose is to validate cluster Db2.






[GitHub] [helix] kaisun2000 commented on a change in pull request #1449: HelixClusterVerifier verify() with default waitTillVerify time -- part one

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on a change in pull request #1449:
URL: https://github.com/apache/helix/pull/1449#discussion_r501282200



##########
File path: helix-core/src/test/java/org/apache/helix/integration/TestDisableCustomCodeRunner.java
##########
@@ -209,9 +214,7 @@ public void test() throws Exception {
 
     // Re-enable custom-code runner
     admin.enableResource(clusterName, customCodeRunnerResource, true);
-    result = ClusterStateVerifier.verifyByZkCallback(
-        new ClusterStateVerifier.BestPossAndExtViewZkVerifier(ZK_ADDR, clusterName));
-    Assert.assertTrue(result);
+    Assert.assertTrue(verifier.verifyByPolling());

Review comment:
       The new BestPossibleExternalViewVerifier replaces the deprecated ClusterStateVerifier and BestPossAndExtViewZkVerifier.








[GitHub] [helix] zhangmeng916 commented on a change in pull request #1449: HelixClusterVerifier verify() with default waitTillVerify time -- part one

Posted by GitBox <gi...@apache.org>.
zhangmeng916 commented on a change in pull request #1449:
URL: https://github.com/apache/helix/pull/1449#discussion_r501244706



##########
File path: helix-core/src/test/java/org/apache/helix/controller/changedetector/TestResourceChangeDetector.java
##########
@@ -438,7 +441,7 @@ public void testResetSnapshots() {
     Assert.assertEquals(
         changeDetector.getAdditionsByType(ChangeType.IDEAL_STATE).size() + changeDetector
             .getChangesByType(ChangeType.IDEAL_STATE).size() + changeDetector
-            .getRemovalsByType(ChangeType.IDEAL_STATE).size(), 2);
+            .getRemovalsByType(ChangeType.IDEAL_STATE).size(), 1);

Review comment:
       Does this number change mean that what we previously tested was wrong? How do we justify it?

##########
File path: helix-core/src/test/java/org/apache/helix/integration/TestDisableCustomCodeRunner.java
##########
@@ -209,9 +214,7 @@ public void test() throws Exception {
 
     // Re-enable custom-code runner
     admin.enableResource(clusterName, customCodeRunnerResource, true);
-    result = ClusterStateVerifier.verifyByZkCallback(
-        new ClusterStateVerifier.BestPossAndExtViewZkVerifier(ZK_ADDR, clusterName));
-    Assert.assertTrue(result);
+    Assert.assertTrue(verifier.verifyByPolling());

Review comment:
       Just to make sure: besides waiting for some time before verifying, is the verifyByPolling function the same as what was done previously?

##########
File path: helix-core/src/test/java/org/apache/helix/integration/rebalancer/TestSemiAutoRebalance.java
##########
@@ -92,33 +95,11 @@ public void beforeClass() throws InterruptedException {
     _controller = new ClusterControllerManager(ZK_ADDR, CLUSTER_NAME, controllerName);
     _controller.syncStart();
 
-    Thread.sleep(1000);
-
-    // verify ideal state and external view
-    IdealState idealState = _accessor.getProperty(_keyBuilder.idealStates(DB_NAME));
-    Assert.assertNotNull(idealState);
-    Assert.assertEquals(idealState.getNumPartitions(), PARTITION_NUMBER);
-    for (String partition : idealState.getPartitionSet()) {
-      List<String> preferenceList = idealState.getPreferenceList(partition);
-      Assert.assertNotNull(preferenceList);
-      Assert.assertEquals(preferenceList.size(), REPLICA_NUMBER);
-    }
-
-    ExternalView externalView = _accessor.getProperty(_keyBuilder.externalView(DB_NAME));
-    Assert.assertNotNull(externalView);
-    Assert.assertEquals(externalView.getPartitionSet().size(), PARTITION_NUMBER);
-    for (String partition : externalView.getPartitionSet()) {
-      Map<String, String> stateMap = externalView.getStateMap(partition);
-      Assert.assertEquals(stateMap.size(), REPLICA_NUMBER);
-
-      int masters = 0;
-      for (String state : stateMap.values()) {
-        if (state.equals(MasterSlaveSMD.States.MASTER.name())) {
-          ++masters;
-        }
-      }
-      Assert.assertEquals(masters, 1);
-    }
+    ZkHelixClusterVerifier verifier = new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME)

Review comment:
       Has the previous verification been moved somewhere else?

##########
File path: helix-core/src/test/java/org/apache/helix/integration/TestEnableCompression.java
##########
@@ -111,10 +111,14 @@ public void testEnableCompressionResource() throws Exception {
     }
 
     BestPossibleExternalViewVerifier verifier =
-        new BestPossibleExternalViewVerifier.Builder(clusterName).setZkAddr(ZK_ADDR)
-            .setExpectLiveInstances(expectedLiveInstances).setResources(expectedResources).build();
-    boolean result = verifier.verify(120000L);
-    Assert.assertTrue(result);
+        new BestPossibleExternalViewVerifier.Builder(clusterName).setZkClient(_gZkClient)
+            .setExpectLiveInstances(expectedLiveInstances).setResources(expectedResources)
+            .setWaitTillVerify(TestHelper.DEFAULT_REBALANCE_PROCESSING_WAIT_TIME)
+            .build();
+
+    System.out.println("before TestEnableCompression verify by polling");

Review comment:
       Remove this line.

##########
File path: helix-core/src/test/java/org/apache/helix/integration/TestEnableCompression.java
##########
@@ -111,10 +111,14 @@ public void testEnableCompressionResource() throws Exception {
     }
 
     BestPossibleExternalViewVerifier verifier =
-        new BestPossibleExternalViewVerifier.Builder(clusterName).setZkAddr(ZK_ADDR)
-            .setExpectLiveInstances(expectedLiveInstances).setResources(expectedResources).build();
-    boolean result = verifier.verify(120000L);
-    Assert.assertTrue(result);
+        new BestPossibleExternalViewVerifier.Builder(clusterName).setZkClient(_gZkClient)
+            .setExpectLiveInstances(expectedLiveInstances).setResources(expectedResources)
+            .setWaitTillVerify(TestHelper.DEFAULT_REBALANCE_PROCESSING_WAIT_TIME)
+            .build();
+
+    System.out.println("before TestEnableCompression verify by polling");
+    boolean result = verifier.verifyByPolling(20 * 60 * 1000, 2000);

Review comment:
       Can we define the number somewhere instead of using a math expression?
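
One possible shape for this, as a sketch only: named constants built with `java.util.concurrent.TimeUnit`. The class name `VerifierTimeouts` and both constant names are hypothetical and do not come from the PR.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical holder for verification timing constants; names are illustrative only.
final class VerifierTimeouts {
  // Overall time allowed for verification: 20 minutes, expressed in milliseconds.
  static final long VERIFY_TIMEOUT_MS = TimeUnit.MINUTES.toMillis(20);

  // Interval between two consecutive checks: 2 seconds, expressed in milliseconds.
  static final long VERIFY_POLL_PERIOD_MS = TimeUnit.SECONDS.toMillis(2);

  private VerifierTimeouts() {}
}
```

The call site would then read `verifier.verifyByPolling(VERIFY_TIMEOUT_MS, VERIFY_POLL_PERIOD_MS)`, which keeps the unit visible in the name instead of in an inline expression.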

##########
File path: helix-core/src/test/java/org/apache/helix/integration/rebalancer/TestAutoRebalance.java
##########
@@ -164,9 +169,19 @@ public void testAutoRebalance() throws Exception {
     // kill 1 node
     _participants[0].syncStop();
 
-    boolean result = ClusterStateVerifier
-        .verifyByZkCallback(new ExternalViewBalancedVerifier(_gZkClient, CLUSTER_NAME, TEST_DB));
-    Assert.assertTrue(result);
+    ZkHelixClusterVerifier verifierTestDb = new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME)
+        .setResources(new HashSet<>(Collections.singleton(TEST_DB)))
+        .setZkClient(_gZkClient)
+        .setWaitTillVerify(TestHelper.DEFAULT_REBALANCE_PROCESSING_WAIT_TIME)
+        .build();
+    Assert.assertTrue(verifierTestDb.verifyByPolling());
+
+    ZkHelixClusterVerifier verifierDb2 = new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME)

Review comment:
       Please change Db2 to a better name.






[GitHub] [helix] alirezazamani merged pull request #1449: HelixClusterVerifier verify() with default waitTillVerify time -- part one

Posted by GitBox <gi...@apache.org>.
alirezazamani merged pull request #1449:
URL: https://github.com/apache/helix/pull/1449


   

