You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2020/10/13 05:40:26 UTC

[GitHub] [helix] jiajunwang commented on a change in pull request #1227: Implement thread leakage check, fixing major thread leakage test and some flaky test

jiajunwang commented on a change in pull request #1227:
URL: https://github.com/apache/helix/pull/1227#discussion_r503667622



##########
File path: helix-core/src/test/java/org/apache/helix/controller/changedetector/TestResourceChangeDetector.java
##########
@@ -425,13 +426,19 @@ public void testIgnoreNonTopologyChanges() {
   @Test(dependsOnMethods = "testIgnoreNonTopologyChanges")
   public void testResetSnapshots() {
     // ensure the cluster converged before the test to ensure IS is not modified unexpectedly
-    HelixClusterVerifier _clusterVerifier =
+
+    ZkHelixClusterVerifier _clusterVerifier =

Review comment:
       Why? HelixClusterVerifier has close() method.

##########
File path: helix-core/src/test/java/org/apache/helix/integration/TestDisableCustomCodeRunner.java
##########
@@ -228,7 +228,7 @@ public void test() throws Exception {
       }
     }
     Assert.assertNotNull(leader);
-    for (String instance : callbacks.keySet()) {
+/*    for (String instance : callbacks.keySet()) {

Review comment:
       Why? I think these are test logic.

##########
File path: helix-core/src/test/java/org/apache/helix/integration/TestPartitionMovementThrottle.java
##########
@@ -278,8 +278,7 @@ public void testThrottleOnlyClusterLevelAnyType() {
     ClusterLiveNodesVerifier liveNodesVerifier =
         new ClusterLiveNodesVerifier(_gZkClient, CLUSTER_NAME,
             Lists.transform(Arrays.asList(_participants), MockParticipantManager::getInstanceName));
-    Assert.assertTrue(liveNodesVerifier.verifyByZkCallback(1000));
-    Assert.assertTrue(_clusterVerifier.verifyByPolling());
+    Assert.assertTrue(liveNodesVerifier.verifyByZkCallback(5000));

Review comment:
       Again, why removing test logic?

##########
File path: helix-core/src/test/java/org/apache/helix/integration/common/ZkStandAloneCMTestBase.java
##########
@@ -94,6 +95,8 @@ public void beforeClass() throws Exception {
 
   @AfterClass
   public void afterClass() throws Exception {
+    String testClassName = this.getShortClassName();

Review comment:
       Can we add it to ZkTestBase? The @AfterClass methods with different names will be executed one by one.

##########
File path: helix-core/src/test/java/org/apache/helix/integration/rebalancer/CrushRebalancers/TestCrushAutoRebalance.java
##########
@@ -298,7 +298,7 @@ private void validateZoneAndTagIsolation(IdealState is, ExternalView ev, int exp
       Assert.assertEquals(assignedZones.size(), expectedReplica);
     }
   }
-
+/* These blank test would cause @AfterClass not get invoked.

Review comment:
       I don't think so. It is delayed but not ignored. Is it a concern if the after class is invoked with some delay?
   
   If so, then all the tests that are developed with a parent test class have this potential issue. Then we need to fix it with a different strategy.

##########
File path: helix-core/src/test/java/org/apache/helix/integration/messaging/TestP2PNoDuplicatedMessage.java
##########
@@ -174,7 +174,10 @@ public void testP2PStateTransitionEnabled() {
       verifyP2PEnabled(startTime);
     }
 
-    Assert.assertEquals(p2pTrigged, total);
+    // Discussed with Meng who originally created this one. The success rate really depends on how

Review comment:
       Who is "Meng"? This code is for everyone to review. Please be aware of and follow the open-source code convention.

##########
File path: helix-core/src/test/java/org/apache/helix/manager/zk/client/TestHelixZkClient.java
##########
@@ -110,33 +110,39 @@ public void testSharingZkClient() throws Exception {
       @Override
       public void handleDataChange(String s, Object o) {
         notificationCountA[0]++;
+        System.out.println("sharedZkClient A increased n0 with path: " + s);

Review comment:
       These outputs are too verbose. Do we really need them?

##########
File path: helix-core/src/test/java/org/apache/helix/manager/zk/client/TestHelixZkClient.java
##########
@@ -37,7 +37,7 @@
   @Test
   public void testZkConnectionManager() {
     final String TEST_ROOT = "/testZkConnectionManager/IDEALSTATES";

Review comment:
       I think the TEST_ROOT already contains the test method name.

##########
File path: helix-core/src/test/java/org/apache/helix/integration/spectator/TestRoutingTableProviderPeriodicRefresh.java
##########
@@ -56,7 +56,7 @@
 
   private static final String STATE_MODEL = BuiltInStateModelDefinitions.MasterSlave.name();
   private static final String TEST_DB = "TestDB";
-  private static final String CLASS_NAME = TestRoutingTableProvider.class.getSimpleName();
+  private static final String CLASS_NAME = TestRoutingTableProviderPeriodicRefresh.class.getSimpleName();

Review comment:
       getShortClassName();

##########
File path: helix-core/src/test/resources/log4j.properties
##########
@@ -37,4 +44,12 @@ log4j.logger.org.apache=ERROR
 log4j.logger.com.noelios=ERROR
 log4j.logger.org.restlet=ERROR
 
+
+#log4j.logger.org.apache.helix.controller.dataproviders.WorkflowControllerDataProvider=INFO

Review comment:
       remove

##########
File path: helix-core/src/test/java/org/apache/helix/integration/messaging/TestBatchMessage.java
##########
@@ -187,15 +194,18 @@ public void testChangeBatchMessageMode() throws Exception {
       participants[i] = new MockParticipantManager(ZK_ADDR, clusterName, instanceName);
       participants[i].syncStart();
     }
-
-    result =
-        ClusterStateVerifier.verifyByZkCallback(new BestPossAndExtViewZkVerifier(ZK_ADDR,
-            clusterName));
-    Assert.assertTrue(result);
-    // Change to three is because there is an extra factory registered
-    // So one extra NO_OP message send
-    Assert.assertTrue(listener._maxNumberOfChildren <= 3,
-        "Should get no more than 2 messages (O->S and S->M)");
+    BestPossAndExtViewZkVerifier verifier1 = new BestPossAndExtViewZkVerifier(ZK_ADDR, clusterName);

Review comment:
       Please mind the naming.

##########
File path: helix-core/src/test/java/org/apache/helix/integration/common/ZkStandAloneCMTestBase.java
##########
@@ -85,6 +85,7 @@ public void beforeClass() throws Exception {
     _clusterVerifier = new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME).setZkClient(_gZkClient)
         .setWaitTillVerify(TestHelper.DEFAULT_REBALANCE_PROCESSING_WAIT_TIME)
         .build();    Assert.assertTrue(_clusterVerifier.verifyByPolling());
+    Assert.assertTrue(_clusterVerifier.verifyByPolling());

Review comment:
       Duplicate code.

##########
File path: helix-core/src/test/java/org/apache/helix/integration/TestZkConnectionLost.java
##########
@@ -66,6 +66,8 @@
   private ClusterSetup _setupTool;
   private HelixZkClient _zkClient;
 
+  private static final int RESTART_CNT = 2;

Review comment:
       It was 4.

##########
File path: helix-core/src/test/java/org/apache/helix/integration/messaging/TestBatchMessage.java
##########
@@ -264,10 +274,16 @@ public void testSubMsgExecutionFail() throws Exception {
     Map<String, Map<String, String>> errStates = new HashMap<String, Map<String, String>>();
     errStates.put("TestDB0", new HashMap<String, String>());
     errStates.get("TestDB0").put(errPartition, masterOfPartition0);
-    boolean result =
-        ClusterStateVerifier.verifyByPolling(new ClusterStateVerifier.BestPossAndExtViewZkVerifier(
-            ZK_ADDR, clusterName, errStates));
-    Assert.assertTrue(result);
+
+    ClusterStateVerifier.BestPossAndExtViewZkVerifier verifier = new ClusterStateVerifier.BestPossAndExtViewZkVerifier(

Review comment:
       nit, why the other BestPossAndExtViewZkVerifier does not need the prefix "ClusterStateVerifier."?

##########
File path: helix-core/src/test/java/org/apache/helix/tools/TestClusterSetup.java
##########
@@ -61,56 +61,162 @@
   protected static final String STATE_MODEL = "MasterSlave";
   protected static final String TEST_NODE = "testnode_1";
 
-  private ClusterSetup _clusterSetup;
+  private ClusterSetup _clusterSetup = null;
 
   private static String[] createArgs(String str) {
     String[] split = str.split("[ ]+");
     System.out.println(Arrays.toString(split));
     return split;
   }
 
-  @BeforeClass()
+  @BeforeClass
   public void beforeClass() throws Exception {
     System.out
         .println("START TestClusterSetup.beforeClass() " + new Date(System.currentTimeMillis()));
+    _clusterSetup = new ClusterSetup(ZK_ADDR);
   }
 
-  @AfterClass()
+  @AfterClass
   public void afterClass() {
+    String testClassName = this.getShortClassName();
+    System.out.println("AfterClass: " + testClassName + " of TestClusterSetup called.");
+
     deleteCluster(CLUSTER_NAME);
+    _clusterSetup.close();
     System.out.println("END TestClusterSetup.afterClass() " + new Date(System.currentTimeMillis()));
   }
 
-  @BeforeMethod()
+  @BeforeMethod
   public void setup() {
+    // System.out.println("@BeforeMethod TestClusterSetup beforeMethod called. ");

Review comment:
       Remove

##########
File path: helix-core/src/test/java/org/apache/helix/monitoring/TestClusterStatusMonitorLifecycle.java
##########
@@ -291,7 +292,14 @@ public void testClusterStatusMonitorLifecycle() throws Exception {
         Query.not(Query.match(Query.attr("SensorName"), Query.value("MessageQueueStatus.*"))),
         exp1);
 
-    Assert.assertTrue(TestHelper.verify(() -> ManagementFactory.getPlatformMBeanServer()
-        .queryMBeans(new ObjectName("ClusterStatus:*"), exp2).isEmpty(), 3000));
+    // Note, the _asyncTasksThreadPool shutting down logic in GenericHelixController is best effort
+    // there is not guarantee that all threads in the pool is gone. MOstly they will, but not always.

Review comment:
       typo, MOstly -> Mostly

##########
File path: helix-core/src/test/java/org/apache/helix/integration/rebalancer/WagedRebalancer/TestWagedRebalance.java
##########
@@ -665,7 +642,6 @@ public void testRebalancerReset() throws Exception {
     // After reset done, the rebalancer will try to rebalance all the partitions since it has
     // forgotten the previous state.
     // TODO remove this sleep after fix https://github.com/apache/helix/issues/526

Review comment:
       Remove this too.

##########
File path: helix-core/src/test/java/org/apache/helix/manager/zk/client/TestHelixZkClient.java
##########
@@ -82,8 +82,8 @@ public void testZkConnectionManager() {
 
   @Test(dependsOnMethods = "testZkConnectionManager")
   public void testSharingZkClient() throws Exception {
-    final String TEST_ROOT = "/testSharingZkClient/IDEALSTATES";
-    final String TEST_PATH = TEST_ROOT + TEST_NODE;
+    final String TEST_ROOT = "/testSharingZkClient/IDEALSTATES1";

Review comment:
       Why?

##########
File path: helix-core/src/test/java/org/apache/helix/integration/multizk/TestMultiZkHelixJavaApis.java
##########
@@ -222,6 +232,17 @@ public void afterClass() throws Exception {
       } else {
         System.clearProperty(MetadataStoreRoutingConstants.MSDS_SERVER_ENDPOINT_KEY);
       }
+
+      boolean status = false;

Review comment:
       Please rebase, I think this change has been in.

##########
File path: helix-core/src/test/java/org/apache/helix/tools/TestClusterSetup.java
##########
@@ -221,18 +327,18 @@ public void testRebalanceCluster() throws Exception {
     verifyReplication(_gZkClient, CLUSTER_NAME, TEST_DB, 4);
   }
 
-  @Test()
+  @Test(dependsOnMethods = "testAddClusterWithValidCloudConfig")
   public void testParseCommandLinesArgs() throws Exception {
     // wipe ZK
-    _gZkClient.deleteRecursively("/" + CLUSTER_NAME);
-    _clusterSetup = new ClusterSetup(ZK_ADDR);
+    //_gZkClient.deleteRecursively("/" + CLUSTER_NAME);
+    //_clusterSetup = new ClusterSetup(ZK_ADDR);
 
     ClusterSetup
         .processCommandLineArgs(createArgs("-zkSvr " + ZK_ADDR + " --addCluster " + CLUSTER_NAME));
 
     // wipe again
     _gZkClient.deleteRecursively("/" + CLUSTER_NAME);
-    _clusterSetup = new ClusterSetup(ZK_ADDR);
+    //_clusterSetup = new ClusterSetup(ZK_ADDR);

Review comment:
       remvoe

##########
File path: helix-core/src/test/java/org/apache/helix/tools/TestClusterSetup.java
##########
@@ -221,18 +327,18 @@ public void testRebalanceCluster() throws Exception {
     verifyReplication(_gZkClient, CLUSTER_NAME, TEST_DB, 4);
   }
 
-  @Test()
+  @Test(dependsOnMethods = "testAddClusterWithValidCloudConfig")
   public void testParseCommandLinesArgs() throws Exception {
     // wipe ZK
-    _gZkClient.deleteRecursively("/" + CLUSTER_NAME);
-    _clusterSetup = new ClusterSetup(ZK_ADDR);
+    //_gZkClient.deleteRecursively("/" + CLUSTER_NAME);
+    //_clusterSetup = new ClusterSetup(ZK_ADDR);

Review comment:
       Remove

##########
File path: helix-core/src/test/java/org/apache/helix/integration/spectator/TestRoutingTableProviderFromTargetEV.java
##########
@@ -113,13 +113,17 @@ public void afterClass() throws Exception {
     }
     deleteCluster(CLUSTER_NAME);
   }
+  @Test
+  public void testHead() {
+    Assert.assertTrue(true);
+  }

Review comment:
       Why? I guess you are playing some trick. Could you please explain?

##########
File path: helix-core/src/test/java/org/apache/helix/monitoring/TestClusterStatusMonitorLifecycle.java
##########
@@ -291,7 +292,14 @@ public void testClusterStatusMonitorLifecycle() throws Exception {
         Query.not(Query.match(Query.attr("SensorName"), Query.value("MessageQueueStatus.*"))),
         exp1);
 
-    Assert.assertTrue(TestHelper.verify(() -> ManagementFactory.getPlatformMBeanServer()
-        .queryMBeans(new ObjectName("ClusterStatus:*"), exp2).isEmpty(), 3000));
+    // Note, the _asyncTasksThreadPool shutting down logic in GenericHelixController is best effort
+    // there is not guarantee that all threads in the pool is gone. MOstly they will, but not always.
+    // see https://github.com/apache/helix/issues/1280
+    boolean result = TestHelper.verify(() -> ManagementFactory.getPlatformMBeanServer()
+        .queryMBeans(new ObjectName("ClusterStatus:*"), exp2).isEmpty(), TestHelper.WAIT_DURATION);
+
+    if (!result) {
+      System.out.println("controllers shutting down failed to tear down all _asyncTasksThreadPools");
+    }

Review comment:
       Don't do it, please. If it causes the test to become unstable, we can fix it separately. You are removing this test case now.

##########
File path: helix-core/src/test/java/org/apache/helix/tools/TestClusterSetup.java
##########
@@ -61,56 +61,162 @@
   protected static final String STATE_MODEL = "MasterSlave";
   protected static final String TEST_NODE = "testnode_1";
 
-  private ClusterSetup _clusterSetup;
+  private ClusterSetup _clusterSetup = null;
 
   private static String[] createArgs(String str) {
     String[] split = str.split("[ ]+");
     System.out.println(Arrays.toString(split));
     return split;
   }
 
-  @BeforeClass()
+  @BeforeClass
   public void beforeClass() throws Exception {
     System.out
         .println("START TestClusterSetup.beforeClass() " + new Date(System.currentTimeMillis()));
+    _clusterSetup = new ClusterSetup(ZK_ADDR);
   }
 
-  @AfterClass()
+  @AfterClass
   public void afterClass() {
+    String testClassName = this.getShortClassName();
+    System.out.println("AfterClass: " + testClassName + " of TestClusterSetup called.");
+
     deleteCluster(CLUSTER_NAME);
+    _clusterSetup.close();
     System.out.println("END TestClusterSetup.afterClass() " + new Date(System.currentTimeMillis()));
   }
 
-  @BeforeMethod()
+  @BeforeMethod
   public void setup() {
+    // System.out.println("@BeforeMethod TestClusterSetup beforeMethod called. ");
+    try {
+      _gZkClient.deleteRecursively("/" + CLUSTER_NAME);
+      _clusterSetup.addCluster(CLUSTER_NAME, true);
+    } catch (Exception e) {
+      System.out.println("@BeforeMethod TestClusterSetup exception:" + e);
+    }
+  }
 
-    _gZkClient.deleteRecursively("/" + CLUSTER_NAME);
-    _clusterSetup = new ClusterSetup(ZK_ADDR);
-    _clusterSetup.addCluster(CLUSTER_NAME, true);
+  @Test(expectedExceptions = HelixException.class)
+  public void testAddClusterWithInvalidCloudConfig() throws Exception {
+    String className = TestHelper.getTestClassName();
+    String methodName = TestHelper.getTestMethodName();
+    String clusterName = className + "_" + methodName;
+
+    CloudConfig.Builder cloudConfigInitBuilder = new CloudConfig.Builder();
+    cloudConfigInitBuilder.setCloudEnabled(true);
+    List<String> sourceList = new ArrayList<String>();
+    sourceList.add("TestURL");
+    cloudConfigInitBuilder.setCloudInfoSources(sourceList);
+    cloudConfigInitBuilder.setCloudProvider(CloudProvider.CUSTOMIZED);
+
+    CloudConfig cloudConfigInit = cloudConfigInitBuilder.build();
+
+    // Since setCloudInfoProcessorName is missing, this add cluster call will throw an exception
+    _clusterSetup.addCluster(clusterName, false, cloudConfigInit);
+  }
+
+  @Test(dependsOnMethods = "testAddClusterWithInvalidCloudConfig")
+  public void testAddClusterWithValidCloudConfig() throws Exception {
+    String className = TestHelper.getTestClassName();
+    String methodName = TestHelper.getTestMethodName();
+    String clusterName = className + "_" + methodName;
+
+    CloudConfig.Builder cloudConfigInitBuilder = new CloudConfig.Builder();
+    cloudConfigInitBuilder.setCloudEnabled(true);
+    cloudConfigInitBuilder.setCloudID("TestID");
+    List<String> sourceList = new ArrayList<String>();
+    sourceList.add("TestURL");
+    cloudConfigInitBuilder.setCloudInfoSources(sourceList);
+    cloudConfigInitBuilder.setCloudInfoProcessorName("TestProcessorName");
+    cloudConfigInitBuilder.setCloudProvider(CloudProvider.CUSTOMIZED);
+
+    CloudConfig cloudConfigInit = cloudConfigInitBuilder.build();
+
+    _clusterSetup.addCluster(clusterName, false, cloudConfigInit);
+
+    // Read CloudConfig from Zookeeper and check the content
+    ConfigAccessor _configAccessor = new ConfigAccessor(_gZkClient);
+    CloudConfig cloudConfigFromZk = _configAccessor.getCloudConfig(clusterName);
+    Assert.assertTrue(cloudConfigFromZk.isCloudEnabled());
+    Assert.assertEquals(cloudConfigFromZk.getCloudID(), "TestID");
+    List<String> listUrlFromZk = cloudConfigFromZk.getCloudInfoSources();
+    Assert.assertEquals(listUrlFromZk.get(0), "TestURL");
+    Assert.assertEquals(cloudConfigFromZk.getCloudInfoProcessorName(), "TestProcessorName");
+    Assert.assertEquals(cloudConfigFromZk.getCloudProvider(), CloudProvider.CUSTOMIZED.name());
+  }
+
+  @Test(dependsOnMethods = "testAddClusterWithValidCloudConfig")
+  public void testAddClusterAzureProvider() throws Exception {
+    String className = TestHelper.getTestClassName();
+    String methodName = TestHelper.getTestMethodName();
+    String clusterName = className + "_" + methodName;
+
+    CloudConfig.Builder cloudConfigInitBuilder = new CloudConfig.Builder();
+    cloudConfigInitBuilder.setCloudEnabled(true);
+    cloudConfigInitBuilder.setCloudID("TestID");
+    cloudConfigInitBuilder.setCloudProvider(CloudProvider.AZURE);
+
+    CloudConfig cloudConfigInit = cloudConfigInitBuilder.build();
+
+    _clusterSetup.addCluster(clusterName, false, cloudConfigInit);
+
+    // Read CloudConfig from Zookeeper and check the content
+    ConfigAccessor _configAccessor = new ConfigAccessor(ZK_ADDR);
+    CloudConfig cloudConfigFromZk = _configAccessor.getCloudConfig(clusterName);
+    Assert.assertTrue(cloudConfigFromZk.isCloudEnabled());
+    Assert.assertEquals(cloudConfigFromZk.getCloudID(), "TestID");
+    List<String> listUrlFromZk = cloudConfigFromZk.getCloudInfoSources();
+
+    // Since it is Azure, topology information should have been populated.
+    ClusterConfig clusterConfig = _configAccessor.getClusterConfig(clusterName);
+    Assert.assertEquals(clusterConfig.getTopology(), AzureConstants.AZURE_TOPOLOGY);
+    Assert.assertEquals(clusterConfig.getFaultZoneType(), AzureConstants.AZURE_FAULT_ZONE_TYPE);
+    Assert.assertTrue(clusterConfig.isTopologyAwareEnabled());
+
+    // Since provider is not customized, CloudInfoSources and CloudInfoProcessorName will be null.
+    Assert.assertNull(listUrlFromZk);
+    Assert.assertNull(cloudConfigFromZk.getCloudInfoProcessorName());
+    Assert.assertEquals(cloudConfigFromZk.getCloudProvider(), CloudProvider.AZURE.name());
   }
 
-  @Test
+  // Note, with mvn 3.6.1, we have a nasty bug that running "mvn test" under helix-core,
+  // all the bellow test will not be invoked. Also, @AfterClass cleanup of this class and

Review comment:
       I mentioned once that I think the below tests are not "not invoked". They are just delayed. Have you verified that if they never been triggered during the whole test suite?

##########
File path: helix-core/src/test/java/org/apache/helix/model/TestResourceConfig.java
##########
@@ -34,6 +34,11 @@
   private static final ObjectMapper _objectMapper = new ObjectMapper();
 
   @Test
+  public void testHead() {

Review comment:
       Again, curious about this trick. Wait for the explanation.

##########
File path: helix-core/src/test/java/org/apache/helix/tools/TestClusterSetup.java
##########
@@ -61,56 +61,162 @@
   protected static final String STATE_MODEL = "MasterSlave";
   protected static final String TEST_NODE = "testnode_1";
 
-  private ClusterSetup _clusterSetup;
+  private ClusterSetup _clusterSetup = null;
 
   private static String[] createArgs(String str) {
     String[] split = str.split("[ ]+");
     System.out.println(Arrays.toString(split));
     return split;
   }
 
-  @BeforeClass()
+  @BeforeClass
   public void beforeClass() throws Exception {
     System.out
         .println("START TestClusterSetup.beforeClass() " + new Date(System.currentTimeMillis()));
+    _clusterSetup = new ClusterSetup(ZK_ADDR);
   }
 
-  @AfterClass()
+  @AfterClass
   public void afterClass() {
+    String testClassName = this.getShortClassName();
+    System.out.println("AfterClass: " + testClassName + " of TestClusterSetup called.");
+
     deleteCluster(CLUSTER_NAME);
+    _clusterSetup.close();
     System.out.println("END TestClusterSetup.afterClass() " + new Date(System.currentTimeMillis()));
   }
 
-  @BeforeMethod()
+  @BeforeMethod
   public void setup() {
+    // System.out.println("@BeforeMethod TestClusterSetup beforeMethod called. ");
+    try {
+      _gZkClient.deleteRecursively("/" + CLUSTER_NAME);
+      _clusterSetup.addCluster(CLUSTER_NAME, true);
+    } catch (Exception e) {
+      System.out.println("@BeforeMethod TestClusterSetup exception:" + e);
+    }
+  }
 
-    _gZkClient.deleteRecursively("/" + CLUSTER_NAME);
-    _clusterSetup = new ClusterSetup(ZK_ADDR);
-    _clusterSetup.addCluster(CLUSTER_NAME, true);
+  @Test(expectedExceptions = HelixException.class)

Review comment:
       The following method movements are not easy to review since I don't know what has been changed. Could you move them to the original place so we can compare side by side?

##########
File path: helix-core/src/test/resources/log4j.properties
##########
@@ -16,7 +16,12 @@
 # specific language governing permissions and limitations
 # under the License.
 #
+
+
+# Set root logger level to DEBUG and its only appender to R.
 log4j.rootLogger=ERROR, C
+# log4j.rootLogger=ERROR, C, R

Review comment:
       remove

##########
File path: helix-core/src/test/java/org/apache/helix/tools/TestHelixAdminCli.java
##########
@@ -568,7 +568,9 @@ public void testDeactivateCluster() throws Exception {
     String command =
         "-zkSvr " + ZK_ADDR + " -activateCluster " + clusterName + " " + grandClusterName + " true";
     ClusterSetup.processCommandLineArgs(command.split("\\s+"));
-    Thread.sleep(500);
+
+    // wait long enough enough so that grand_cluster converge
+    Thread.sleep(20000);

Review comment:
       Verifier polling the grand_cluster?

##########
File path: helix-core/src/test/java/org/apache/helix/tools/TestClusterStateVerifier.java
##########
@@ -114,19 +118,30 @@ public void testResourceSubset() throws InterruptedException {
     Thread.sleep(1000);
     _admin.enableCluster(_clusterName, false);
     _admin.enableInstance(_clusterName, "localhost_12918", true);
-    boolean result =
-        ClusterStateVerifier.verifyByZkCallback(new BestPossAndExtViewZkVerifier(ZK_ADDR,
-            _clusterName, null, Sets.newHashSet(RESOURCES[1])));
-    Assert.assertTrue(result);
+    BestPossAndExtViewZkVerifier verifier = new BestPossAndExtViewZkVerifier(ZK_ADDR,
+        _clusterName, null, Sets.newHashSet(RESOURCES[1]));
+    boolean result = false;
+    try {
+      result = ClusterStateVerifier.verifyByZkCallback(verifier);
+      Assert.assertTrue(result);
+    } finally {
+      verifier.close();
+    }
+
     String[] args = {
         "--zkSvr", ZK_ADDR, "--cluster", _clusterName, "--resources", RESOURCES[1]
     };
     result = ClusterStateVerifier.verifyState(args);
     Assert.assertTrue(result);
 
     // But the full cluster verification should fail
-    boolean fullResult = new BestPossAndExtViewZkVerifier(ZK_ADDR, _clusterName).verify();
-    Assert.assertFalse(fullResult);
+    BestPossAndExtViewZkVerifier verifier1 = new BestPossAndExtViewZkVerifier(ZK_ADDR, _clusterName);

Review comment:
       verifier = ...

##########
File path: helix-core/src/test/resources/log4j.properties
##########
@@ -27,6 +32,8 @@ log4j.appender.R=org.apache.log4j.RollingFileAppender
 log4j.appender.R.layout=org.apache.log4j.PatternLayout
 log4j.appender.R.layout.ConversionPattern=%5p [%C:%M] (%F:%L) - %m%n
 log4j.appender.R.File=target/ClusterManagerLogs/log.txt
+log4j.appender.R.MaxBackupIndex=200

Review comment:
       I think we discussed, there is no need to configure MaxBackupIndex, right?

##########
File path: helix-core/src/test/java/org/apache/helix/tools/TestHelixAdminCli.java
##########
@@ -600,6 +602,15 @@ public void testDeactivateCluster() throws Exception {
       }
     }
 
+    boolean leaderNotExists = TestHelper.verify(() -> {
+      boolean isLeaderExists = _gZkClient.exists(path);
+      if (isLeaderExists) {
+        System.out.println("mysteriou leader out");

Review comment:
       typo? "mysteriou"
   
   This print is too verbose. Please remove.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org