Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2021/02/05 22:34:51 UTC

[GitHub] [helix] jiajunwang commented on a change in pull request #1631: fix TestCrushAutoRebalanceNonRack.testLackEnoughInstances unstable issue (#1630)

jiajunwang commented on a change in pull request #1631:
URL: https://github.com/apache/helix/pull/1631#discussion_r571287384



##########
File path: helix-core/src/test/java/org/apache/helix/integration/rebalancer/CrushRebalancers/TestCrushAutoRebalanceNonRack.java
##########
@@ -264,6 +266,12 @@ public void testLackEnoughInstances(String rebalanceStrategyName, String rebalan
     System.out.println("TestLackEnoughInstances " + rebalanceStrategyName);
     enablePersistBestPossibleAssignment(_gZkClient, CLUSTER_NAME, true);
 
+    // Dropping an instance through the admin tools and the controller sending a message to that
+    // same instance are fundamentally asynchronous, so this race condition can also happen in
+    // production. For now, we stabilize the test by disabling and then re-enabling the controller
+    // as a workaround. A new design is needed to fundamentally resolve the exposed issue.
+    _controller.syncStop();

Review comment:
       Stopping the controller will work, but it is not what we do in production.
   Why not use maintenance mode, as I originally suggested?
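
   For illustration, the maintenance-mode alternative could follow an enter/act/exit
   pattern. The sketch below is a minimal, self-contained Java example and not the actual
   test change: `ClusterAdmin` and `runInMaintenanceMode` are hypothetical names introduced
   here, with the method shape modeled on Helix's
   `HelixAdmin.enableMaintenanceMode(clusterName, enabled, reason)`. Whether maintenance
   mode actually closes this particular race is an assumption that would need to be
   verified against the test.

```java
import java.util.ArrayList;
import java.util.List;

class MaintenanceModeSketch {

  /** Hypothetical stand-in for the cluster admin API; this is not a Helix type. */
  interface ClusterAdmin {
    void enableMaintenanceMode(String clusterName, boolean enabled, String reason);
  }

  /**
   * Enters maintenance mode, runs the action, and always exits maintenance mode,
   * even if the action throws.
   */
  static void runInMaintenanceMode(
      ClusterAdmin admin, String clusterName, String reason, Runnable action) {
    admin.enableMaintenanceMode(clusterName, true, reason);
    try {
      action.run();
    } finally {
      admin.enableMaintenanceMode(clusterName, false, reason);
    }
  }

  public static void main(String[] args) {
    // A recording fake so the call order can be inspected without a real cluster.
    List<String> calls = new ArrayList<>();
    ClusterAdmin recordingAdmin =
        (cluster, enabled, reason) -> calls.add("maintenance=" + enabled);

    runInMaintenanceMode(
        recordingAdmin, "TestCluster", "drop instance", () -> calls.add("drop instance"));

    System.out.println(calls);
  }
}
```

   The try/finally keeps the cluster from being left in maintenance mode if the drop
   step fails, which matters in a shared test cluster.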



