Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2020/07/20 00:33:29 UTC

[GitHub] [helix] zhangmeng916 opened a new pull request #1157: fix TestRebalancePipeline test

zhangmeng916 opened a new pull request #1157:
URL: https://github.com/apache/helix/pull/1157


   ### Issues
   
   - [X] My PR addresses the following Helix issues and references them in the PR description:
   
   (#1156)
   
   ### Description
   
   - [X] Here are some details about my PR, including screenshots of any UI changes:
   The test testDuplicateMsg in TestRebalancePipeline currently uses an invalid session id. Here is the offending call:
   setCurrentState(clusterName, "localhost_0", resourceName, resourceName + "_0", "session_1", "SLAVE");
   The wrong session id makes the assertions that follow it meaningless. This PR uses the correct session id, liveInstances.get(0).getEphemeralOwner(), instead. While fixing this, we also found an NPE thrown during the pipeline that causes the pipeline to stop; it is due to a missing `setAsyncTasksThreadPool` call. This PR fixes that issue as well.
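
To illustrate why a made-up session id can make later assertions meaningless, here is a minimal, self-contained sketch. It assumes (this is not stated in the description) that state is only honored when it was recorded under the instance's live session id. `CurrentStateStore`, `readForLiveSession`, the `0x1001` session value, and the `TestDB_0` partition name are hypothetical stand-ins, not Helix APIs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a current-state store; this is not the Helix API.
class CurrentStateStore {
  // Key is "instance/sessionId/partition"; value is the partition state.
  private final Map<String, String> states = new HashMap<>();

  void setCurrentState(String instance, String sessionId, String partition, String state) {
    states.put(key(instance, sessionId, partition), state);
  }

  // Assumed behavior: only state written under the live session is visible.
  String readForLiveSession(String instance, String liveSessionId, String partition) {
    return states.get(key(instance, liveSessionId, partition));
  }

  private static String key(String instance, String sessionId, String partition) {
    return instance + "/" + sessionId + "/" + partition;
  }
}

class SessionIdDemo {
  public static void main(String[] args) {
    // Hypothetical value standing in for liveInstances.get(0).getEphemeralOwner().
    String liveSessionId = "0x1001";
    CurrentStateStore store = new CurrentStateStore();

    // Written under a made-up session id: the reader never sees it (prints null).
    store.setCurrentState("localhost_0", "session_1", "TestDB_0", "SLAVE");
    System.out.println(store.readForLiveSession("localhost_0", liveSessionId, "TestDB_0"));

    // Written under the live session id: the reader sees it (prints SLAVE).
    store.setCurrentState("localhost_0", liveSessionId, "TestDB_0", "SLAVE");
    System.out.println(store.readForLiveSession("localhost_0", liveSessionId, "TestDB_0"));
  }
}
```

Under that assumption, a test that writes state with "session_1" exercises an empty view of the cluster, so its assertions can pass or fail for reasons unrelated to the behavior under test.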
   
   ### Tests
   
   - [ ] The following is the result of the "mvn test" command on the appropriate module:
   
   ### Commits
   
   - [X] My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Code Quality
   
   - [X] My diff has been formatted using helix-style.xml 
   (helix-style-intellij.xml if IntelliJ IDE is used)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] narendly commented on a change in pull request #1157: fix TestRebalancePipeline test

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #1157:
URL: https://github.com/apache/helix/pull/1157#discussion_r457680552



##########
File path: helix-core/src/test/java/org/apache/helix/controller/stages/TestRebalancePipeline.java
##########
@@ -61,8 +70,12 @@ public void testDuplicateMsg() {
     HelixManager manager = new DummyClusterManager(clusterName, accessor);
     ClusterEvent event = new ClusterEvent(ClusterEventType.Unknown);
     event.addAttribute(AttributeName.helixmanager.name(), manager);
-    event.addAttribute(AttributeName.ControllerDataProvider.name(),
-        new ResourceControllerDataProvider());
+    ResourceControllerDataProvider dataCache = new ResourceControllerDataProvider();
+    // The AsyncTasksThreadPool needs to be set, otherwise to start pending message cleanup job
+    // will throw NPE and stop the pipeline. TODO: https://github.com/apache/helix/issues/1158
+    _executorService = Executors.newSingleThreadExecutor();

Review comment:
       `_executorService` does not have to be a private field. We could probably just use a local reference and shut it down at the end of the method.








[GitHub] [helix] narendly commented on a change in pull request #1157: fix TestRebalancePipeline test

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #1157:
URL: https://github.com/apache/helix/pull/1157#discussion_r456996863



##########
File path: helix-core/src/test/java/org/apache/helix/controller/stages/TestRebalancePipeline.java
##########
@@ -61,8 +62,9 @@ public void testDuplicateMsg() {
     HelixManager manager = new DummyClusterManager(clusterName, accessor);
     ClusterEvent event = new ClusterEvent(ClusterEventType.Unknown);
     event.addAttribute(AttributeName.helixmanager.name(), manager);
-    event.addAttribute(AttributeName.ControllerDataProvider.name(),
-        new ResourceControllerDataProvider());
+    ResourceControllerDataProvider dataCache = new ResourceControllerDataProvider();
+    dataCache.setAsyncTasksThreadPool(Executors.newSingleThreadExecutor());

Review comment:
       Could you add a comment here stating why we need to set the async tasks thread pool? What happens if we don't? I ask because it doesn't look like we did this in the previous version.






[GitHub] [helix] narendly commented on a change in pull request #1157: fix TestRebalancePipeline test

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #1157:
URL: https://github.com/apache/helix/pull/1157#discussion_r457680264



##########
File path: helix-core/src/test/java/org/apache/helix/controller/stages/TestRebalancePipeline.java
##########
@@ -45,10 +47,17 @@
 import org.apache.helix.model.Message;
 import org.apache.helix.model.Partition;
 import org.testng.Assert;
+import org.testng.annotations.AfterClass;
 import org.testng.annotations.Test;
 
 public class TestRebalancePipeline extends ZkUnitTestBase {
   private final String _className = getShortClassName();
+  private ExecutorService _executorService;

Review comment:
       Should we avoid declaring this variable at class scope? Since it isn't used throughout the class, we could just create the reference in the test method and shut it down at the very end of the method. That way, we don't need to add the afterClass() method.






[GitHub] [helix] zhangmeng916 commented on a change in pull request #1157: fix TestRebalancePipeline test

Posted by GitBox <gi...@apache.org>.
zhangmeng916 commented on a change in pull request #1157:
URL: https://github.com/apache/helix/pull/1157#discussion_r457525016



##########
File path: helix-core/src/test/java/org/apache/helix/controller/stages/TestRebalancePipeline.java
##########
@@ -61,8 +62,9 @@ public void testDuplicateMsg() {
     HelixManager manager = new DummyClusterManager(clusterName, accessor);
     ClusterEvent event = new ClusterEvent(ClusterEventType.Unknown);
     event.addAttribute(AttributeName.helixmanager.name(), manager);
-    event.addAttribute(AttributeName.ControllerDataProvider.name(),
-        new ResourceControllerDataProvider());
+    ResourceControllerDataProvider dataCache = new ResourceControllerDataProvider();
+    dataCache.setAsyncTasksThreadPool(Executors.newSingleThreadExecutor());

Review comment:
       Our controller sets this field properly during construction: cache.setAsyncTasksThreadPool(_asyncTasksThreadPool);
   So I think the code itself should be safe.






[GitHub] [helix] narendly merged pull request #1157: fix TestRebalancePipeline test

Posted by GitBox <gi...@apache.org>.
narendly merged pull request #1157:
URL: https://github.com/apache/helix/pull/1157


   




[GitHub] [helix] narendly commented on a change in pull request #1157: fix TestRebalancePipeline test

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #1157:
URL: https://github.com/apache/helix/pull/1157#discussion_r457631317



##########
File path: helix-core/src/test/java/org/apache/helix/controller/stages/TestRebalancePipeline.java
##########
@@ -61,8 +62,11 @@ public void testDuplicateMsg() {
     HelixManager manager = new DummyClusterManager(clusterName, accessor);
     ClusterEvent event = new ClusterEvent(ClusterEventType.Unknown);
     event.addAttribute(AttributeName.helixmanager.name(), manager);
-    event.addAttribute(AttributeName.ControllerDataProvider.name(),
-        new ResourceControllerDataProvider());
+    ResourceControllerDataProvider dataCache = new ResourceControllerDataProvider();
+    // The AsyncTasksThreadPool needs to be set, otherwise to start pending message cleanup job
+    // will throw NPE and stop the pipeline. TODO: https://github.com/apache/helix/issues/1158
+    dataCache.setAsyncTasksThreadPool(Executors.newSingleThreadExecutor());

Review comment:
       Another important thing we should do here is to shut down the threads/thread pool. Otherwise, there will be a thread pool/thread leak throughout the test suite. This has been a real pain for Helix's test suite, so let's make sure we don't leak it here.
   
   We could pull out this single thread executor to create a reference and shut it down at the end of the test.
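
A minimal sketch of that suggestion, using only java.util.concurrent: the executor is a method-local reference and is shut down in a finally block so the thread is released even if the body throws. `runWithLocalExecutor` is a hypothetical helper; in the actual test the executor would presumably be passed to dataCache.setAsyncTasksThreadPool(...) at the marked line.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class LocalExecutorSketch {
  // Runs a task on a method-local executor and always shuts the executor down.
  // Returns true if the pool terminated within the timeout.
  static boolean runWithLocalExecutor(Runnable task) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      // In the real test, the executor would be handed to the data cache here.
      executor.submit(task);
    } finally {
      // Stops accepting new work; already-submitted tasks still run to completion.
      executor.shutdown();
    }
    try {
      return executor.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    }
  }

  public static void main(String[] args) {
    boolean terminated = runWithLocalExecutor(() -> System.out.println("cleanup job ran"));
    System.out.println("terminated: " + terminated);
  }
}
```

Keeping the reference local ties the pool's lifetime to the one test that needs it, which avoids both a class-level field and a separate teardown method.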






[GitHub] [helix] narendly commented on a change in pull request #1157: fix TestRebalancePipeline test

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #1157:
URL: https://github.com/apache/helix/pull/1157#discussion_r457124770



##########
File path: helix-core/src/test/java/org/apache/helix/controller/stages/TestRebalancePipeline.java
##########
@@ -61,8 +62,9 @@ public void testDuplicateMsg() {
     HelixManager manager = new DummyClusterManager(clusterName, accessor);
     ClusterEvent event = new ClusterEvent(ClusterEventType.Unknown);
     event.addAttribute(AttributeName.helixmanager.name(), manager);
-    event.addAttribute(AttributeName.ControllerDataProvider.name(),
-        new ResourceControllerDataProvider());
+    ResourceControllerDataProvider dataCache = new ResourceControllerDataProvider();
+    dataCache.setAsyncTasksThreadPool(Executors.newSingleThreadExecutor());

Review comment:
       I see. Although it is not directly related to this PR, do you think we should consider handling this NPE? It seems like something we should handle more gracefully (we probably won't hit this in production, so the urgency isn't high). I'm fine with creating an issue that gives clear context and says what to fix, so that someone can pick it up in the future.
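
One possible shape for "handling it more gracefully" is a null guard that skips the clean-up job and reports the problem instead of throwing. This is only a sketch under assumptions: `CleanUpScheduler` and `scheduleCleanUp` are hypothetical names and do not mirror the actual Helix classes or the eventual fix.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch; names do not mirror the actual Helix classes.
class CleanUpScheduler {
  private final ExecutorService asyncTasksThreadPool; // may be null if never configured

  CleanUpScheduler(ExecutorService asyncTasksThreadPool) {
    this.asyncTasksThreadPool = asyncTasksThreadPool;
  }

  // Returns true if the job was submitted, false if it was skipped.
  boolean scheduleCleanUp(Runnable cleanUpJob) {
    if (asyncTasksThreadPool == null) {
      // Report and skip rather than letting a NullPointerException stop the caller.
      System.err.println("No async tasks thread pool configured; skipping clean-up.");
      return false;
    }
    asyncTasksThreadPool.submit(cleanUpJob);
    return true;
  }
}

class CleanUpSchedulerDemo {
  public static void main(String[] args) {
    // Without a pool the job is skipped and the caller keeps running.
    System.out.println(new CleanUpScheduler(null).scheduleCleanUp(() -> { }));

    // With a pool the job is submitted.
    ExecutorService pool = Executors.newSingleThreadExecutor();
    System.out.println(new CleanUpScheduler(pool).scheduleCleanUp(() -> { }));
    pool.shutdown();
  }
}
```

Whether skipping silently, logging, or failing fast with a clearer message is the right behavior is a design decision for the follow-up issue.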








[GitHub] [helix] narendly commented on a change in pull request #1157: fix TestRebalancePipeline test

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #1157:
URL: https://github.com/apache/helix/pull/1157#discussion_r457125877



##########
File path: helix-core/src/test/java/org/apache/helix/controller/stages/TestRebalancePipeline.java
##########
@@ -61,8 +62,11 @@ public void testDuplicateMsg() {
     HelixManager manager = new DummyClusterManager(clusterName, accessor);
     ClusterEvent event = new ClusterEvent(ClusterEventType.Unknown);
     event.addAttribute(AttributeName.helixmanager.name(), manager);
-    event.addAttribute(AttributeName.ControllerDataProvider.name(),
-        new ResourceControllerDataProvider());
+    ResourceControllerDataProvider dataCache = new ResourceControllerDataProvider();
+    // The AsyncTasksThreadPool needs to be set, otherwise to start pending message cleanup job
+    // will throw NPE and stop the pipeline.

Review comment:
       Once we create an issue, you could link that issue number in the comment so we could track it better.






[GitHub] [helix] zhangmeng916 commented on pull request #1157: fix TestRebalancePipeline test

Posted by GitBox <gi...@apache.org>.
zhangmeng916 commented on pull request #1157:
URL: https://github.com/apache/helix/pull/1157#issuecomment-661276855


   This PR is ready to be merged, approved by @narendly 
   Final commit message:
   Fix TestRebalancePipeline using correct session id and cache initialization.




[GitHub] [helix] narendly commented on pull request #1157: fix TestRebalancePipeline test

Posted by GitBox <gi...@apache.org>.
narendly commented on pull request #1157:
URL: https://github.com/apache/helix/pull/1157#issuecomment-660765154


   Could you put "fixes #N" in the Issues section? That way, the issue gets closed automatically when the PR is merged into master :)




[GitHub] [helix] zhangmeng916 commented on a change in pull request #1157: fix TestRebalancePipeline test

Posted by GitBox <gi...@apache.org>.
zhangmeng916 commented on a change in pull request #1157:
URL: https://github.com/apache/helix/pull/1157#discussion_r457015475



##########
File path: helix-core/src/test/java/org/apache/helix/controller/stages/TestRebalancePipeline.java
##########
@@ -61,8 +62,9 @@ public void testDuplicateMsg() {
     HelixManager manager = new DummyClusterManager(clusterName, accessor);
     ClusterEvent event = new ClusterEvent(ClusterEventType.Unknown);
     event.addAttribute(AttributeName.helixmanager.name(), manager);
-    event.addAttribute(AttributeName.ControllerDataProvider.name(),
-        new ResourceControllerDataProvider());
+    ResourceControllerDataProvider dataCache = new ResourceControllerDataProvider();
+    dataCache.setAsyncTasksThreadPool(Executors.newSingleThreadExecutor());

Review comment:
       Sure, I'll add one. I found this issue because the test actually failed once I changed the session id to a valid one. While debugging, I found that the Executor had not been set up, so the periodic clean-up job for pending messages threw an NPE, which caused the pipeline to stop.
   Details:
   1    [main] ERROR org.apache.helix.common.ZkTestBase  - Exception while executing pipeline:org.apache.helix.controller.pipeline.Pipeline@9597028. Will not continue to next pipeline
   java.lang.NullPointerException at org.apache.helix.controller.stages.MessageGenerationPhase.schedulePendingMessageCleanUp(MessageGenerationPhase.java:339)
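
The failure mode in the log above can be sketched in isolation: a stage dereferences a thread pool that was never set, the resulting NullPointerException is caught by the pipeline runner, and the remaining stages never run. `MiniPipeline` and its stage names are hypothetical and do not reproduce Helix's Pipeline or MessageGenerationPhase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical mini-pipeline; it does not reproduce Helix's Pipeline class.
class MiniPipeline {
  // Runs the stages in order and stops at the first exception.
  // Returns the names of the stages that completed.
  static List<String> run(ExecutorService asyncPool) {
    List<String> completed = new ArrayList<>();
    try {
      completed.add("readClusterData");
      // Dereferencing an unset (null) pool throws NullPointerException here.
      asyncPool.submit(() -> { });
      completed.add("messageGeneration");
      completed.add("messageDispatch");
    } catch (RuntimeException e) {
      System.err.println("Exception while executing pipeline, will not continue: " + e);
    }
    return completed;
  }

  public static void main(String[] args) {
    // Pool never set: only the first stage completes.
    System.out.println(run(null));

    // Pool set: all three stages complete.
    ExecutorService pool = Executors.newSingleThreadExecutor();
    System.out.println(run(pool));
    pool.shutdown();
  }
}
```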






[GitHub] [helix] zhangmeng916 commented on pull request #1157: fix TestRebalancePipeline test

Posted by GitBox <gi...@apache.org>.
zhangmeng916 commented on pull request #1157:
URL: https://github.com/apache/helix/pull/1157#issuecomment-661992655


   This PR is ready to be merged, approved by @narendly
   Final commit message:
   Fix TestRebalancePipeline using correct session id and cache initialization.




[GitHub] [helix] narendly commented on pull request #1157: fix TestRebalancePipeline test

Posted by GitBox <gi...@apache.org>.
narendly commented on pull request #1157:
URL: https://github.com/apache/helix/pull/1157#issuecomment-661278267


   In the test testDuplicateMsg from TestRebalancePipeline, currently an invalid session id is used.
   The wrong session id makes the following test and assertion meaningless. This PR uses the correct session id liveInstances.get(0).getEphemeralOwner() instead. 
   
   Also, we found an NPE thrown during the pipeline, causing the pipeline to stop, because no thread pool had been set via setAsyncTasksThreadPool. This PR works around the issue by giving it a thread pool and adds an issue reference in the comment block.



