Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2020/04/29 18:47:24 UTC

[GitHub] [helix] alirezazamani opened a new pull request #981: Stabilizing 4 flaky tests

alirezazamani opened a new pull request #981:
URL: https://github.com/apache/helix/pull/981


   ### Issues
   - [x] My PR addresses the following Helix issues and references them in the PR title:
   Fixes #977 
   Fixes #978 
   Fixes #979 
   Fixes #980 
   
   ### Description
   - [x] Here are some details about my PR, including screenshots of any UI changes:
   
   TestJobFailure was unstable because we fetch the ExternalView of a resource, and if the controller has not populated the ExternalView yet, we hit a NullPointerException.
   
   TestRebalanceRunningTask was unstable. In this PR, we make sure that masters exist on two different nodes (the master has been switched to the new instance) before we check the assigned participants.
   
   TestRebalanceStopAndResume was unstable because it relied on Thread.sleep. Instead of stopping the workflow after a fixed delay, we first make sure that the workflow and job are IN_PROGRESS and then stop the workflow.
   
   TestTaskSchedulingTwoCurrent has been stabilized by making sure that the master has been switched to the new instance after modifying the IdealState (IS). After that, we make sure that the task is assigned to the correct instance, that it is not switched to the new instance, and that cancel is not called incorrectly.
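
The pattern behind these fixes, polling for a condition instead of sleeping for a fixed time, can be sketched in plain Java. The helper below is a hypothetical stand-in for TestHelper.verify: its name (`waitUntil`), signature, and 10 ms poll interval are assumptions for illustration, and the "controller" is simulated by a background thread rather than a real Helix controller.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

// Hypothetical polling helper; not the Helix TestHelper implementation.
final class PollingSketch {
  private PollingSketch() {}

  /** Re-checks the condition every 10 ms until it holds or timeoutMs elapses. */
  static boolean waitUntil(BooleanSupplier condition, long timeoutMs) {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.currentTimeMillis() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(10);
      } catch (InterruptedException e) {
        // Preserve the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) throws Exception {
    // Simulates a controller that publishes a job state asynchronously.
    AtomicReference<String> jobState = new AtomicReference<>();
    Thread controller = new Thread(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      jobState.set("IN_PROGRESS");
    });
    controller.start();

    // A null state means "not populated yet", so the check returns false
    // instead of throwing a NullPointerException.
    boolean reached = waitUntil(() -> "IN_PROGRESS".equals(jobState.get()), 2000);
    controller.join();
    System.out.println("reached IN_PROGRESS: " + reached);
  }
}
```

Because the check treats a missing value as "not ready yet", the same shape covers both the null ExternalView case and the wait for IN_PROGRESS.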
   
   
   ### Tests
   - [x] The following is the result of the "mvn test" command on the appropriate module:
   Test Result:
   [INFO] Results:
   [INFO] 
   [ERROR] Failures: 
   [ERROR]   TestTaskRebalancer.timeouts:200 expected:<true> but was:<false>
   [ERROR]   TestWorkflowTermination.testWorkflowRunningTimeout:131->verifyWorkflowCleanup:257 expected:<true> but was:<false>
   [ERROR]   TestClusterVerifier.testResourceSubset:225 expected:<false> but was:<true>
   [INFO] 
   [ERROR] Tests run: 1144, Failures: 3, Errors: 0, Skipped: 0
   [INFO] 
   [INFO] ------------------------------------------------------------------------
   [INFO] BUILD FAILURE
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time:  01:18 h
   [INFO] Finished at: 2020-04-29T11:25:52-07:00
   [INFO] ------------------------------------------------------------------------
   
   The failed tests passed when I ran them individually. They also need to be stabilized later.
   
   ### Commits
   
   - [x] My commits all reference appropriate Apache Helix GitHub issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Code Quality
   
   - [x] My diff has been formatted using helix-style.xml


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] alirezazamani commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r418322817



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestTaskSchedulingTwoCurrentStates.java
##########
@@ -138,6 +139,18 @@ public void testTargetedTaskTwoCurrentStates() throws Exception {
     JobQueue.Builder jobQueue = TaskTestUtil.buildJobQueue(jobQueueName);
     jobQueue.enqueueJob("JOB0", jobBuilder0);
 
+    // Make sure master has been correctly switched to Participant1
+    boolean isMasterSwitchedToCorrectInstance = TestHelper.verify(() -> {
+      ExternalView externalView =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      if (externalView == null) {
+        return false;
+      }
+      Map<String, String> stateMap = externalView.getStateMap(DATABASE + "_0");
+      return "MASTER".equals(stateMap.get(PARTICIPANT_PREFIX + "_" + (_startPort + 1)));

Review comment:
       Added null check.






[GitHub] [helix] alirezazamani commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r417695236



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestTaskSchedulingTwoCurrentStates.java
##########
@@ -138,6 +139,17 @@ public void testTargetedTaskTwoCurrentStates() throws Exception {
     JobQueue.Builder jobQueue = TaskTestUtil.buildJobQueue(jobQueueName);
     jobQueue.enqueueJob("JOB0", jobBuilder0);
 
+    // Make sure master has been correctly switched to Participant1
+    boolean isMasterSwitchedToCorrectInstance = TestHelper.verify(() -> {
+      ExternalView externalView = _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      Map <String, String> stateMap = externalView.getStateMap(DATABASE + "_0");

Review comment:
       Yeah, it is possible. I added a new check for a null EV.






[GitHub] [helix] narendly commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r417562574



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestTaskSchedulingTwoCurrentStates.java
##########
@@ -138,6 +139,18 @@ public void testTargetedTaskTwoCurrentStates() throws Exception {
     JobQueue.Builder jobQueue = TaskTestUtil.buildJobQueue(jobQueueName);
     jobQueue.enqueueJob("JOB0", jobBuilder0);
 
+    // Make sure master has been correctly switched to Participant1
+    boolean isMasterSwitchedToCorrectInstance = TestHelper.verify(() -> {
+      ExternalView externalView = _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      Map <String, String> stateMap = externalView.getStateMap(DATABASE + "_0");
+      String state = stateMap.getOrDefault(PARTICIPANT_PREFIX + "_" + (_startPort + 1), null);
+      if (state == null) {
+        return false;
+      }

Review comment:
       If you're expecting null, there's really no reason to use `getOrDefault`?

##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestTaskSchedulingTwoCurrentStates.java
##########
@@ -138,6 +139,18 @@ public void testTargetedTaskTwoCurrentStates() throws Exception {
     JobQueue.Builder jobQueue = TaskTestUtil.buildJobQueue(jobQueueName);
     jobQueue.enqueueJob("JOB0", jobBuilder0);
 
+    // Make sure master has been correctly switched to Participant1
+    boolean isMasterSwitchedToCorrectInstance = TestHelper.verify(() -> {
+      ExternalView externalView = _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      Map <String, String> stateMap = externalView.getStateMap(DATABASE + "_0");
+      String state = stateMap.getOrDefault(PARTICIPANT_PREFIX + "_" + (_startPort + 1), null);
+      if (state == null) {
+        return false;
+      }
+      return (state.equals("MASTER"));

Review comment:
       redundant parentheses






[GitHub] [helix] narendly commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r418251181



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestJobFailure.java
##########
@@ -118,8 +118,15 @@ public void testNormalJobFailure(String comment, List<String> taskStates,
   }
 
   private Map<String, Map<String, String>> createPartitionConfig(List<String> taskStates,
-      List<String> expectedTaskEndingStates) {
+      List<String> expectedTaskEndingStates) throws Exception {
     Map<String, Map<String, String>> targetPartitionConfigs = new HashMap<>();
+    // Make sure external view has been created for the resource
+    Assert.assertTrue(TestHelper.verify(() -> {
+      ExternalView externalView =
+          _manager.getClusterManagmentTool().getResourceExternalView(CLUSTER_NAME, DB_NAME);
+      return (externalView != null);

Review comment:
       Parentheses aren't needed here.

##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestRebalanceRunningTask.java
##########
@@ -261,6 +263,31 @@ public void testFixedTargetTaskAndDisabledRebalanceAndNodeAdded() throws Interru
         new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME).setZkClient(_gZkClient)
             .setResources(Sets.newHashSet(DATABASE)).build();
     Assert.assertTrue(clusterVerifier.verify(10 * 1000));
+
+    // Wait until master is switched to new instance and two masters existed on two different instance

Review comment:
       Nit: "two masters exist on two different instances"

##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestRebalanceRunningTask.java
##########
@@ -261,6 +263,31 @@ public void testFixedTargetTaskAndDisabledRebalanceAndNodeAdded() throws Interru
         new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME).setZkClient(_gZkClient)
             .setResources(Sets.newHashSet(DATABASE)).build();
     Assert.assertTrue(clusterVerifier.verify(10 * 1000));
+
+    // Wait until master is switched to new instance and two masters existed on two different instance
+    boolean isMasterOnTwoDifferentNodes = TestHelper.verify(() -> {
+      HashSet<String> masterInstances = new HashSet<>();
+      ExternalView externalView =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      if (externalView == null) {
+        return false;
+      }
+      Map<String, String> stateMap0 = externalView.getStateMap(DATABASE + "_0");
+      Map<String, String> stateMap1 = externalView.getStateMap(DATABASE + "_1");
+      for (Map.Entry<String, String> entry : stateMap0.entrySet()) {
+        if (entry.getValue().equals("MASTER")) {
+          masterInstances.add(entry.getKey());
+        }
+      }
+      for (Map.Entry<String, String> entry : stateMap1.entrySet()) {
+        if (entry.getValue().equals("MASTER")) {
+          masterInstances.add(entry.getKey());
+        }
+      }
+      return (masterInstances.size() == 2);

Review comment:
       Parentheses not needed?

##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestRebalanceRunningTask.java
##########
@@ -261,6 +263,31 @@ public void testFixedTargetTaskAndDisabledRebalanceAndNodeAdded() throws Interru
         new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME).setZkClient(_gZkClient)
             .setResources(Sets.newHashSet(DATABASE)).build();
     Assert.assertTrue(clusterVerifier.verify(10 * 1000));
+
+    // Wait until master is switched to new instance and two masters existed on two different instance
+    boolean isMasterOnTwoDifferentNodes = TestHelper.verify(() -> {
+      HashSet<String> masterInstances = new HashSet<>();

Review comment:
       Could we declare it as Set instead of HashSet here?

##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestTaskSchedulingTwoCurrentStates.java
##########
@@ -138,6 +139,18 @@ public void testTargetedTaskTwoCurrentStates() throws Exception {
     JobQueue.Builder jobQueue = TaskTestUtil.buildJobQueue(jobQueueName);
     jobQueue.enqueueJob("JOB0", jobBuilder0);
 
+    // Make sure master has been correctly switched to Participant1
+    boolean isMasterSwitchedToCorrectInstance = TestHelper.verify(() -> {
+      ExternalView externalView =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      if (externalView == null) {
+        return false;
+      }
+      Map<String, String> stateMap = externalView.getStateMap(DATABASE + "_0");
+      return "MASTER".equals(stateMap.get(PARTICIPANT_PREFIX + "_" + (_startPort + 1)));

Review comment:
       Do we need to do a null check on stateMap here?
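
A guard at both levels could look like the sketch below. It assumes that the per-partition lookup can return null when the partition is not yet present, which is worth confirming against the Helix version in use. The nested map is only a stand-in for an ExternalView, and the partition and instance names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// The nested map (partition -> instance -> state) stands in for an ExternalView;
// it is an assumption for illustration, not the Helix class itself.
final class StateMapGuardSketch {
  private StateMapGuardSketch() {}

  /** Returns true only if the instance is MASTER for the partition; missing data means "not yet". */
  static boolean isMaster(Map<String, Map<String, String>> view, String partition,
      String instance) {
    if (view == null) {
      return false; // view not published yet
    }
    Map<String, String> stateMap = view.get(partition);
    if (stateMap == null) {
      return false; // partition not present in the view yet
    }
    // Constant-first equals also covers an instance that is missing from the map.
    return "MASTER".equals(stateMap.get(instance));
  }

  public static void main(String[] args) {
    Map<String, Map<String, String>> view = new HashMap<>();
    System.out.println(isMaster(null, "TestDB_0", "localhost_12919"));
    System.out.println(isMaster(view, "TestDB_0", "localhost_12919"));

    Map<String, String> states = new HashMap<>();
    states.put("localhost_12919", "MASTER");
    view.put("TestDB_0", states);
    System.out.println(isMaster(view, "TestDB_0", "localhost_12919"));
  }
}
```

Inside a polling verifier, returning false for missing data simply causes another retry, so the extra check costs little.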






[GitHub] [helix] alirezazamani commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r418322555



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestJobFailure.java
##########
@@ -118,8 +118,15 @@ public void testNormalJobFailure(String comment, List<String> taskStates,
   }
 
   private Map<String, Map<String, String>> createPartitionConfig(List<String> taskStates,
-      List<String> expectedTaskEndingStates) {
+      List<String> expectedTaskEndingStates) throws Exception {
     Map<String, Map<String, String>> targetPartitionConfigs = new HashMap<>();
+    // Make sure external view has been created for the resource
+    Assert.assertTrue(TestHelper.verify(() -> {
+      ExternalView externalView =
+          _manager.getClusterManagmentTool().getResourceExternalView(CLUSTER_NAME, DB_NAME);
+      return (externalView != null);

Review comment:
       Done.






[GitHub] [helix] alirezazamani commented on pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on pull request #981:
URL: https://github.com/apache/helix/pull/981#issuecomment-622148103


   This PR is ready to be merged, approved by @narendly.
   
   Final commit message:
   
   Title:
   Stabilizing 4 flaky tests
   
   Body:
   Four tests have been stabilized in this commit. These tests are:
   1-TestJobFailure
   2-TestRebalanceRunningTask
   3-TestTaskRebalancerStopResume
   4-TestTaskSchedulingTwoCurrentStates




[GitHub] [helix] alirezazamani commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r418322612



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestRebalanceRunningTask.java
##########
@@ -261,6 +263,31 @@ public void testFixedTargetTaskAndDisabledRebalanceAndNodeAdded() throws Interru
         new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME).setZkClient(_gZkClient)
             .setResources(Sets.newHashSet(DATABASE)).build();
     Assert.assertTrue(clusterVerifier.verify(10 * 1000));
+
+    // Wait until master is switched to new instance and two masters existed on two different instance
+    boolean isMasterOnTwoDifferentNodes = TestHelper.verify(() -> {
+      HashSet<String> masterInstances = new HashSet<>();

Review comment:
       Done.

##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestRebalanceRunningTask.java
##########
@@ -261,6 +263,31 @@ public void testFixedTargetTaskAndDisabledRebalanceAndNodeAdded() throws Interru
         new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME).setZkClient(_gZkClient)
             .setResources(Sets.newHashSet(DATABASE)).build();
     Assert.assertTrue(clusterVerifier.verify(10 * 1000));
+
+    // Wait until master is switched to new instance and two masters existed on two different instance
+    boolean isMasterOnTwoDifferentNodes = TestHelper.verify(() -> {
+      HashSet<String> masterInstances = new HashSet<>();
+      ExternalView externalView =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      if (externalView == null) {
+        return false;
+      }
+      Map<String, String> stateMap0 = externalView.getStateMap(DATABASE + "_0");
+      Map<String, String> stateMap1 = externalView.getStateMap(DATABASE + "_1");
+      for (Map.Entry<String, String> entry : stateMap0.entrySet()) {
+        if (entry.getValue().equals("MASTER")) {
+          masterInstances.add(entry.getKey());
+        }
+      }
+      for (Map.Entry<String, String> entry : stateMap1.entrySet()) {
+        if (entry.getValue().equals("MASTER")) {
+          masterInstances.add(entry.getKey());
+        }
+      }
+      return (masterInstances.size() == 2);

Review comment:
       Done.

##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestRebalanceRunningTask.java
##########
@@ -261,6 +263,31 @@ public void testFixedTargetTaskAndDisabledRebalanceAndNodeAdded() throws Interru
         new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME).setZkClient(_gZkClient)
             .setResources(Sets.newHashSet(DATABASE)).build();
     Assert.assertTrue(clusterVerifier.verify(10 * 1000));
+
+    // Wait until master is switched to new instance and two masters existed on two different instance

Review comment:
       Fixed.






[GitHub] [helix] pkuwm commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
pkuwm commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r417666330



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestRebalanceRunningTask.java
##########
@@ -261,6 +263,28 @@ public void testFixedTargetTaskAndDisabledRebalanceAndNodeAdded() throws Interru
         new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME).setZkClient(_gZkClient)
             .setResources(Sets.newHashSet(DATABASE)).build();
     Assert.assertTrue(clusterVerifier.verify(10 * 1000));
+
+    // Wait until master is switched to new instance and two masters existed on two different instance
+    boolean isMasterOnTwoDifferentNodes = TestHelper.verify(() -> {
+      HashSet<String> masterInstances = new HashSet<>();
+      ExternalView externalView =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      Map<String, String> stateMap0 = externalView.getStateMap(DATABASE + "_0");
+      Map<String, String> stateMap1 = externalView.getStateMap(DATABASE + "_1");
+      for (Map.Entry<String, String> entry : stateMap0.entrySet()) {
+        if (entry.getValue().equals("MASTER")) {
+          masterInstances.add(entry.getKey());
+        }
+      }
+      for (Map.Entry<String, String> entry : stateMap1.entrySet()) {
+        if (entry.getValue().equals("MASTER")) {
+          masterInstances.add(entry.getKey());
+        }
+      }
+      return (masterInstances.size() == 2);

Review comment:
       A counter is enough, no need to create a HashSet, right?
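
Whether a counter is enough depends on what the check has to establish. The sketch below, with hypothetical class, method, and node names, contrasts the two: a counter tallies MASTER entries, while a set records which distinct instances hold them. If the same instance were master of both partitions, the counter would still read 2, so it could not by itself show that the masters sit on two different instances.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only; the per-partition maps mirror instance -> state lookups.
final class MasterCountSketch {
  private MasterCountSketch() {}

  /** Counts MASTER entries across all partitions, regardless of which instance holds them. */
  static int countMasterEntries(List<Map<String, String>> stateMaps) {
    int count = 0;
    for (Map<String, String> stateMap : stateMaps) {
      for (String state : stateMap.values()) {
        if ("MASTER".equals(state)) {
          count++;
        }
      }
    }
    return count;
  }

  /** Collects the distinct instances that are MASTER for at least one partition. */
  static Set<String> distinctMasterInstances(List<Map<String, String>> stateMaps) {
    Set<String> instances = new HashSet<>();
    for (Map<String, String> stateMap : stateMaps) {
      for (Map.Entry<String, String> entry : stateMap.entrySet()) {
        if ("MASTER".equals(entry.getValue())) {
          instances.add(entry.getKey());
        }
      }
    }
    return instances;
  }

  public static void main(String[] args) {
    // Both partitions mastered by the same (hypothetical) node.
    List<Map<String, String>> sameNode = Arrays.asList(
        Collections.singletonMap("node_A", "MASTER"),
        Collections.singletonMap("node_A", "MASTER"));
    System.out.println("entries: " + countMasterEntries(sameNode));
    System.out.println("distinct instances: " + distinctMasterInstances(sameNode).size());
  }
}
```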

##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestJobFailure.java
##########
@@ -118,8 +118,15 @@ public void testNormalJobFailure(String comment, List<String> taskStates,
   }
 
   private Map<String, Map<String, String>> createPartitionConfig(List<String> taskStates,
-      List<String> expectedTaskEndingStates) {
+      List<String> expectedTaskEndingStates) throws Exception {
     Map<String, Map<String, String>> targetPartitionConfigs = new HashMap<>();
+    // Make sure external view is has been created for the resource
+    boolean isExternalViewCreated = TestHelper.verify(() -> {
+      ExternalView externalView =
+          _manager.getClusterManagmentTool().getResourceExternalView(CLUSTER_NAME, DB_NAME);
+      return (externalView != null);
+    }, TestHelper.WAIT_DURATION);
+    Assert.assertTrue(isExternalViewCreated);

Review comment:
       Nit, I would just do `assertTrue(TestHelper.verify());` to get rid of the temp boolean `isExternalViewCreated` var to minimize variable scope. Same for following usages.

##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestTaskSchedulingTwoCurrentStates.java
##########
@@ -138,6 +139,17 @@ public void testTargetedTaskTwoCurrentStates() throws Exception {
     JobQueue.Builder jobQueue = TaskTestUtil.buildJobQueue(jobQueueName);
     jobQueue.enqueueJob("JOB0", jobBuilder0);
 
+    // Make sure master has been correctly switched to Participant1
+    boolean isMasterSwitchedToCorrectInstance = TestHelper.verify(() -> {
+      ExternalView externalView = _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      Map <String, String> stateMap = externalView.getStateMap(DATABASE + "_0");

Review comment:
       Is it possible for externalView to be null?

##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestTaskSchedulingTwoCurrentStates.java
##########
@@ -138,6 +139,17 @@ public void testTargetedTaskTwoCurrentStates() throws Exception {
     JobQueue.Builder jobQueue = TaskTestUtil.buildJobQueue(jobQueueName);
     jobQueue.enqueueJob("JOB0", jobBuilder0);
 
+    // Make sure master has been correctly switched to Participant1
+    boolean isMasterSwitchedToCorrectInstance = TestHelper.verify(() -> {
+      ExternalView externalView = _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      Map <String, String> stateMap = externalView.getStateMap(DATABASE + "_0");
+      if (!stateMap.containsKey(PARTICIPANT_PREFIX + "_" + (_startPort + 1))) {
+        return false;
+      }
+      return stateMap.get(PARTICIPANT_PREFIX + "_" + (_startPort + 1)).equals("MASTER");

Review comment:
       These checks could be simplified to `"MASTER".equals(stateMap.get(PARTICIPANT_PREFIX + "_" + (_startPort + 1)));`
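
As a standalone illustration of this simplification (class and method names are hypothetical), the three variants discussed in this thread can be compared side by side. Calling `equals` on the constant makes a missing key harmless, because `"MASTER".equals(null)` is simply false.

```java
import java.util.HashMap;
import java.util.Map;

// Three ways to ask "is this key mapped to MASTER?", from most to least verbose.
final class NullSafeEqualsSketch {
  private NullSafeEqualsSketch() {}

  static boolean viaContainsKey(Map<String, String> map, String key) {
    if (!map.containsKey(key)) {
      return false;
    }
    // Note: this would still throw if the key were mapped to a null value.
    return map.get(key).equals("MASTER");
  }

  static boolean viaGetOrDefault(Map<String, String> map, String key) {
    // getOrDefault(key, null) yields the same result as get(key).
    String state = map.getOrDefault(key, null);
    if (state == null) {
      return false;
    }
    return state.equals("MASTER");
  }

  static boolean viaConstantFirst(Map<String, String> map, String key) {
    // Handles both a missing key and a null value without extra branches.
    return "MASTER".equals(map.get(key));
  }

  public static void main(String[] args) {
    Map<String, String> stateMap = new HashMap<>();
    stateMap.put("node_A", "MASTER");
    stateMap.put("node_B", "SLAVE");
    for (String key : new String[] {"node_A", "node_B", "node_C"}) {
      System.out.println(key + ": " + viaContainsKey(stateMap, key) + " "
          + viaGetOrDefault(stateMap, key) + " " + viaConstantFirst(stateMap, key));
    }
  }
}
```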






[GitHub] [helix] alirezazamani commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r417693833



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestJobFailure.java
##########
@@ -118,8 +118,15 @@ public void testNormalJobFailure(String comment, List<String> taskStates,
   }
 
   private Map<String, Map<String, String>> createPartitionConfig(List<String> taskStates,
-      List<String> expectedTaskEndingStates) {
+      List<String> expectedTaskEndingStates) throws Exception {
     Map<String, Map<String, String>> targetPartitionConfigs = new HashMap<>();
+    // Make sure external view is has been created for the resource
+    boolean isExternalViewCreated = TestHelper.verify(() -> {
+      ExternalView externalView =
+          _manager.getClusterManagmentTool().getResourceExternalView(CLUSTER_NAME, DB_NAME);
+      return (externalView != null);
+    }, TestHelper.WAIT_DURATION);
+    Assert.assertTrue(isExternalViewCreated);

Review comment:
       Good suggestion. Changed.






[GitHub] [helix] narendly commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r417562878



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestTaskSchedulingTwoCurrentStates.java
##########
@@ -138,6 +139,18 @@ public void testTargetedTaskTwoCurrentStates() throws Exception {
     JobQueue.Builder jobQueue = TaskTestUtil.buildJobQueue(jobQueueName);
     jobQueue.enqueueJob("JOB0", jobBuilder0);
 
+    // Make sure master has been correctly switched to Participant1
+    boolean isMasterSwitchedToCorrectInstance = TestHelper.verify(() -> {
+      ExternalView externalView = _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      Map <String, String> stateMap = externalView.getStateMap(DATABASE + "_0");
+      String state = stateMap.getOrDefault(PARTICIPANT_PREFIX + "_" + (_startPort + 1), null);
+      if (state == null) {
+        return false;
+      }

Review comment:
       In this case, a simple containsKey check would do.






[GitHub] [helix] alirezazamani commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r417573487



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestTaskSchedulingTwoCurrentStates.java
##########
@@ -138,6 +139,18 @@ public void testTargetedTaskTwoCurrentStates() throws Exception {
     JobQueue.Builder jobQueue = TaskTestUtil.buildJobQueue(jobQueueName);
     jobQueue.enqueueJob("JOB0", jobBuilder0);
 
+    // Make sure master has been correctly switched to Participant1
+    boolean isMasterSwitchedToCorrectInstance = TestHelper.verify(() -> {
+      ExternalView externalView = _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      Map <String, String> stateMap = externalView.getStateMap(DATABASE + "_0");
+      String state = stateMap.getOrDefault(PARTICIPANT_PREFIX + "_" + (_startPort + 1), null);
+      if (state == null) {
+        return false;
+      }

Review comment:
       Fixed. I simplified the logic.






[GitHub] [helix] alirezazamani commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r417573487



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestTaskSchedulingTwoCurrentStates.java
##########
@@ -138,6 +139,18 @@ public void testTargetedTaskTwoCurrentStates() throws Exception {
     JobQueue.Builder jobQueue = TaskTestUtil.buildJobQueue(jobQueueName);
     jobQueue.enqueueJob("JOB0", jobBuilder0);
 
+    // Make sure master has been correctly switched to Participant1
+    boolean isMasterSwitchedToCorrectInstance = TestHelper.verify(() -> {
+      ExternalView externalView = _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      Map <String, String> stateMap = externalView.getStateMap(DATABASE + "_0");
+      String state = stateMap.getOrDefault(PARTICIPANT_PREFIX + "_" + (_startPort + 1), null);
+      if (state == null) {
+        return false;
+      }

Review comment:
       Fixed. Replaced with containsKey
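
For reference, the presence check and the state comparison can probably be folded into one null-safe expression. The following is only a sketch on plain `java.util` maps, not the code that was merged: the class name, the `isMaster` helper, and the instance names are hypothetical, and the extra null guard assumes the partition's state map may itself be missing while the ExternalView is still being populated.

```java
import java.util.HashMap;
import java.util.Map;

public class MasterCheckSketch {

  // Hypothetical helper: true only if the instance is present in the state map
  // and its state is MASTER. Putting the constant first keeps the comparison
  // null-safe, so no separate containsKey or null check is needed.
  static boolean isMaster(Map<String, String> stateMap, String instanceName) {
    return stateMap != null && "MASTER".equals(stateMap.get(instanceName));
  }

  public static void main(String[] args) {
    // Stand-in for the partition state map; instance names are made up.
    Map<String, String> stateMap = new HashMap<>();
    stateMap.put("localhost_12918", "SLAVE");
    stateMap.put("localhost_12919", "MASTER");

    System.out.println(isMaster(stateMap, "localhost_12919")); // true
    System.out.println(isMaster(stateMap, "localhost_12918")); // false (SLAVE)
    System.out.println(isMaster(stateMap, "localhost_12920")); // false (absent)
    System.out.println(isMaster(null, "localhost_12919"));     // false (no map yet)
  }
}
```

Inside a polling lambda, returning false for the "not there yet" cases lets the verifier simply retry instead of throwing.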

##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestTaskSchedulingTwoCurrentStates.java
##########
@@ -138,6 +139,18 @@ public void testTargetedTaskTwoCurrentStates() throws Exception {
     JobQueue.Builder jobQueue = TaskTestUtil.buildJobQueue(jobQueueName);
     jobQueue.enqueueJob("JOB0", jobBuilder0);
 
+    // Make sure master has been correctly switched to Participant1
+    boolean isMasterSwitchedToCorrectInstance = TestHelper.verify(() -> {
+      ExternalView externalView = _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      Map <String, String> stateMap = externalView.getStateMap(DATABASE + "_0");
+      String state = stateMap.getOrDefault(PARTICIPANT_PREFIX + "_" + (_startPort + 1), null);
+      if (state == null) {
+        return false;
+      }
+      return (state.equals("MASTER"));

Review comment:
       Fixed.






[GitHub] [helix] alirezazamani commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r417694705



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestTaskSchedulingTwoCurrentStates.java
##########
@@ -138,6 +139,17 @@ public void testTargetedTaskTwoCurrentStates() throws Exception {
     JobQueue.Builder jobQueue = TaskTestUtil.buildJobQueue(jobQueueName);
     jobQueue.enqueueJob("JOB0", jobBuilder0);
 
+    // Make sure master has been correctly switched to Participant1
+    boolean isMasterSwitchedToCorrectInstance = TestHelper.verify(() -> {
+      ExternalView externalView = _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      Map <String, String> stateMap = externalView.getStateMap(DATABASE + "_0");
+      if (!stateMap.containsKey(PARTICIPANT_PREFIX + "_" + (_startPort + 1))) {
+        return false;
+      }
+      return stateMap.get(PARTICIPANT_PREFIX + "_" + (_startPort + 1)).equals("MASTER");

Review comment:
       Fixed. Thanks.






[GitHub] [helix] alirezazamani commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r417692442



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestRebalanceRunningTask.java
##########
@@ -261,6 +263,28 @@ public void testFixedTargetTaskAndDisabledRebalanceAndNodeAdded() throws Interru
         new BestPossibleExternalViewVerifier.Builder(CLUSTER_NAME).setZkClient(_gZkClient)
             .setResources(Sets.newHashSet(DATABASE)).build();
     Assert.assertTrue(clusterVerifier.verify(10 * 1000));
+
+    // Wait until master is switched to new instance and two masters existed on two different instance
+    boolean isMasterOnTwoDifferentNodes = TestHelper.verify(() -> {
+      HashSet<String> masterInstances = new HashSet<>();
+      ExternalView externalView =
+          _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      Map<String, String> stateMap0 = externalView.getStateMap(DATABASE + "_0");
+      Map<String, String> stateMap1 = externalView.getStateMap(DATABASE + "_1");
+      for (Map.Entry<String, String> entry : stateMap0.entrySet()) {
+        if (entry.getValue().equals("MASTER")) {
+          masterInstances.add(entry.getKey());
+        }
+      }
+      for (Map.Entry<String, String> entry : stateMap1.entrySet()) {
+        if (entry.getValue().equals("MASTER")) {
+          masterInstances.add(entry.getKey());
+        }
+      }
+      return (masterInstances.size() == 2);

Review comment:
       Please note that we want to make sure that the masters for the two different resources exist on two different instances. A counter would also work. However, I still think this implementation is more readable, especially considering that this is only a test.
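
To illustrate the set-versus-counter point, here is a small sketch on plain `java.util` collections, not the test code itself. The class name, the `masterInstances` helper, and the instance names are hypothetical. A set records which instances hold a master, so two masters on the same instance collapse to one entry, whereas a plain counter would still report two.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DistinctMastersSketch {

  // Hypothetical helper: collects the instances that hold a MASTER replica in
  // any of the given state maps. Null maps are skipped so the check can be
  // polled safely while the view is still incomplete.
  static Set<String> masterInstances(List<Map<String, String>> stateMaps) {
    Set<String> masters = new HashSet<>();
    for (Map<String, String> stateMap : stateMaps) {
      if (stateMap == null) {
        continue;
      }
      for (Map.Entry<String, String> entry : stateMap.entrySet()) {
        if ("MASTER".equals(entry.getValue())) {
          masters.add(entry.getKey());
        }
      }
    }
    return masters;
  }

  public static void main(String[] args) {
    // Made-up instance names; one map per state map read in the test.
    Map<String, String> stateMap0 = new HashMap<>();
    stateMap0.put("localhost_12918", "MASTER");
    stateMap0.put("localhost_12919", "SLAVE");

    Map<String, String> stateMap1 = new HashMap<>();
    stateMap1.put("localhost_12918", "SLAVE");
    stateMap1.put("localhost_12919", "MASTER");

    // Masters sit on two different instances, so the set has two entries.
    System.out.println(masterInstances(Arrays.asList(stateMap0, stateMap1)).size()); // 2

    // Both masters on the same instance: a counter would say 2, the set says 1.
    stateMap1.put("localhost_12918", "MASTER");
    stateMap1.put("localhost_12919", "SLAVE");
    System.out.println(masterInstances(Arrays.asList(stateMap0, stateMap1)).size()); // 1
  }
}
```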







[GitHub] [helix] narendly commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r417557790



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestJobFailure.java
##########
@@ -118,8 +118,15 @@ public void testNormalJobFailure(String comment, List<String> taskStates,
   }
 
   private Map<String, Map<String, String>> createPartitionConfig(List<String> taskStates,
-      List<String> expectedTaskEndingStates) {
+      List<String> expectedTaskEndingStates) throws Exception {
     Map<String, Map<String, String>> targetPartitionConfigs = new HashMap<>();
+    // Make sure external view is has been created to the resource

Review comment:
       Make sure external view has been created for the resource...
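
As background for this hunk: per the PR description, the test hit a NullPointerException when the ExternalView had not yet been populated. The general remedy is to poll until the value is non-null instead of reading it once. The sketch below shows that pattern in isolation. The `waitUntil` helper, its parameters, and the `AtomicReference` stand-in for the external view are all hypothetical; this is not Helix's `TestHelper.verify`.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

public class WaitUntilSketch {

  // Hypothetical helper: re-evaluates the condition every pollMs milliseconds
  // until it holds or timeoutMs has elapsed. Returns whether it ever held.
  static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long pollMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (System.nanoTime() < deadline) {
      if (condition.getAsBoolean()) {
        return true;
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt flag and give up early.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    // One last check at the deadline.
    return condition.getAsBoolean();
  }

  public static void main(String[] args) {
    // Stand-in for a view that some other component publishes asynchronously.
    AtomicReference<String> view = new AtomicReference<>();

    Thread publisher = new Thread(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      view.set("published");
    });
    publisher.start();

    // Poll for a non-null view rather than dereferencing it immediately.
    boolean created = waitUntil(() -> view.get() != null, 2000, 10);
    System.out.println(created);
  }
}
```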






[GitHub] [helix] alirezazamani commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r417694705



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestTaskSchedulingTwoCurrentStates.java
##########
@@ -138,6 +139,17 @@ public void testTargetedTaskTwoCurrentStates() throws Exception {
     JobQueue.Builder jobQueue = TaskTestUtil.buildJobQueue(jobQueueName);
     jobQueue.enqueueJob("JOB0", jobBuilder0);
 
+    // Make sure master has been correctly switched to Participant1
+    boolean isMasterSwitchedToCorrectInstance = TestHelper.verify(() -> {
+      ExternalView externalView = _gSetupTool.getClusterManagementTool().getResourceExternalView(CLUSTER_NAME, DATABASE);
+      Map <String, String> stateMap = externalView.getStateMap(DATABASE + "_0");
+      if (!stateMap.containsKey(PARTICIPANT_PREFIX + "_" + (_startPort + 1))) {
+        return false;
+      }
+      return stateMap.get(PARTICIPANT_PREFIX + "_" + (_startPort + 1)).equals("MASTER");

Review comment:
       I don't see much difference, but I changed it as you suggested.







[GitHub] [helix] alirezazamani commented on a change in pull request #981: Stabilizing 4 flaky tests

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #981:
URL: https://github.com/apache/helix/pull/981#discussion_r417573281



##########
File path: helix-core/src/test/java/org/apache/helix/integration/task/TestJobFailure.java
##########
@@ -118,8 +118,15 @@ public void testNormalJobFailure(String comment, List<String> taskStates,
   }
 
   private Map<String, Map<String, String>> createPartitionConfig(List<String> taskStates,
-      List<String> expectedTaskEndingStates) {
+      List<String> expectedTaskEndingStates) throws Exception {
     Map<String, Map<String, String>> targetPartitionConfigs = new HashMap<>();
+    // Make sure external view is has been created to the resource

Review comment:
       Done.



