You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2020/10/29 00:31:45 UTC

[GitHub] [helix] jiajunwang opened a new pull request #1497: Fix TestRoutingTableProviderPeriodicRefresh.

jiajunwang opened a new pull request #1497:
URL: https://github.com/apache/helix/pull/1497


   ### Issues
   
   - [X] My PR addresses the following Helix issues and references them in the PR description:
   
   Fixes #1250
   
   ### Description
   
   - [X] Here are some details about my PR, including screenshots of any UI changes:
   
   The test was unstable and do not clean up the environment one the failure. This PR tolerates more test results to be accepted as a good result.
   Also, enhance the cleanup logic to prevent leakages.
   
   ### Tests
   
   - [X] The following tests are written for this issue:
   
   TestRoutingTableProviderPeriodicRefresh passed
   
   - [X] The following is the result of the "mvn test" command on the appropriate module:
   
   NA, pure test changes.
   
   ### Documentation (Optional)
   
   - In case of new functionality, my PR adds documentation in the following wiki page:
   
   (Link the GitHub wiki you added)
   
   ### Commits
   
   - My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Code Quality
   
   - My diff has been formatted using helix-style.xml 
   (helix-style-intellij.xml if IntelliJ IDE is used)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1497: Fix TestRoutingTableProviderPeriodicRefresh.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1497:
URL: https://github.com/apache/helix/pull/1497#discussion_r514475999



##########
File path: helix-core/src/test/java/org/apache/helix/integration/spectator/TestRoutingTableProviderPeriodicRefresh.java
##########
@@ -64,6 +62,8 @@
   private static final int PARTITION_NUMBER = 20;
   private static final int REPLICA_NUMBER = 3;
 
+  private static final long REFRESH_PERIOD_MS = 1000L;

Review comment:
       Works in theory. But I prefer to keep the main test logic the same. I don't have a strong motivation to optimize this test since as Kai mentioned, we shall rewrite it eventually. Just want to make it work first.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1497: Fix TestRoutingTableProviderPeriodicRefresh.

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1497:
URL: https://github.com/apache/helix/pull/1497#discussion_r514429911



##########
File path: helix-core/src/test/java/org/apache/helix/integration/spectator/TestRoutingTableProviderPeriodicRefresh.java
##########
@@ -203,34 +208,32 @@ synchronized int getRefreshCount() {
 
   @Test
   public void testPeriodicRefresh() throws InterruptedException {
-    // Wait so that initial refreshes finish (not triggered by periodic refresh timer)
-    Thread.sleep(1000L);
+    // Wait to ensure that the initial refreshes finish (not triggered by periodic refresh timer)
+    Thread.sleep(REFRESH_PERIOD_MS * 2);
 
     // Test short refresh
     int prevRefreshCount = _routingTableProvider.getRefreshCount();
-    // Wait for one timer duration
-    Thread.sleep(1000L);
-    // The timer should have gone off, incrementing the refresh count
-    Assert.assertEquals(_routingTableProvider.getRefreshCount(), prevRefreshCount + 1);
+    // Wait for more than one timer duration
+    Thread.sleep((long) (REFRESH_PERIOD_MS * 1.5));
+    // The timer should have gone off, incrementing the refresh count by one or two depends on the
+    // timing.
+    int newRefreshCount = _routingTableProvider.getRefreshCount();
+    Assert.assertTrue(
+        newRefreshCount == prevRefreshCount + 1 || newRefreshCount == prevRefreshCount + 2);

Review comment:
       This comment is tied with my other comment. If we already know it's flaky between 1 or 2 refreshes, why don't we just `assert(+1 || +2)`, while keeping `sleep(1000)`? 
   
   If we change it to `sleep(1500)`, we are opening doors to "slow refresh" cases. If a refresh doesn't respect the refresh period, and instead took around 1500, the previous version of the test will catch it, but now it won't. Like you said, maybe even 1500 is fine since there is no hard requirement, we are just ensuring the refresh happens, but it feels to me that correctness check is relaxed. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1497: Fix TestRoutingTableProviderPeriodicRefresh.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1497:
URL: https://github.com/apache/helix/pull/1497#discussion_r514477645



##########
File path: helix-core/src/test/java/org/apache/helix/integration/spectator/TestRoutingTableProviderPeriodicRefresh.java
##########
@@ -203,34 +208,32 @@ synchronized int getRefreshCount() {
 
   @Test
   public void testPeriodicRefresh() throws InterruptedException {
-    // Wait so that initial refreshes finish (not triggered by periodic refresh timer)
-    Thread.sleep(1000L);
+    // Wait to ensure that the initial refreshes finish (not triggered by periodic refresh timer)
+    Thread.sleep(REFRESH_PERIOD_MS * 2);
 
     // Test short refresh
     int prevRefreshCount = _routingTableProvider.getRefreshCount();
-    // Wait for one timer duration
-    Thread.sleep(1000L);
-    // The timer should have gone off, incrementing the refresh count
-    Assert.assertEquals(_routingTableProvider.getRefreshCount(), prevRefreshCount + 1);
+    // Wait for more than one timer duration
+    Thread.sleep((long) (REFRESH_PERIOD_MS * 1.5));
+    // The timer should have gone off, incrementing the refresh count by one or two depends on the
+    // timing.
+    int newRefreshCount = _routingTableProvider.getRefreshCount();
+    Assert.assertTrue(
+        newRefreshCount == prevRefreshCount + 1 || newRefreshCount == prevRefreshCount + 2);

Review comment:
       As I replied to Lei, this PR is not to improve the test. This is mainly for unblocking the other devs since the test cannot pass. Alternatively, we can disable it. But I prefer the current version. At least it checks some of the logic.
   
   We will need to address this by re-writing the test logic.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang merged pull request #1497: Fix TestRoutingTableProviderPeriodicRefresh.

Posted by GitBox <gi...@apache.org>.
jiajunwang merged pull request #1497:
URL: https://github.com/apache/helix/pull/1497


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1497: Fix TestRoutingTableProviderPeriodicRefresh.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1497:
URL: https://github.com/apache/helix/pull/1497#discussion_r513928840



##########
File path: helix-core/src/test/java/org/apache/helix/integration/spectator/TestRoutingTableProviderPeriodicRefresh.java
##########
@@ -203,34 +208,32 @@ synchronized int getRefreshCount() {
 
   @Test
   public void testPeriodicRefresh() throws InterruptedException {
-    // Wait so that initial refreshes finish (not triggered by periodic refresh timer)
-    Thread.sleep(1000L);
+    // Wait to ensure that the initial refreshes finish (not triggered by periodic refresh timer)
+    Thread.sleep(REFRESH_PERIOD_MS * 2);
 
     // Test short refresh
     int prevRefreshCount = _routingTableProvider.getRefreshCount();
-    // Wait for one timer duration
-    Thread.sleep(1000L);
-    // The timer should have gone off, incrementing the refresh count
-    Assert.assertEquals(_routingTableProvider.getRefreshCount(), prevRefreshCount + 1);

Review comment:
       I guess it is just because of some timing issues. When you set up a time window that equals the interval, you might have one or two refreshes happen.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1497: Fix TestRoutingTableProviderPeriodicRefresh.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1497:
URL: https://github.com/apache/helix/pull/1497#discussion_r513932686



##########
File path: helix-core/src/test/java/org/apache/helix/integration/spectator/TestRoutingTableProviderPeriodicRefresh.java
##########
@@ -203,34 +208,32 @@ synchronized int getRefreshCount() {
 
   @Test
   public void testPeriodicRefresh() throws InterruptedException {
-    // Wait so that initial refreshes finish (not triggered by periodic refresh timer)
-    Thread.sleep(1000L);
+    // Wait to ensure that the initial refreshes finish (not triggered by periodic refresh timer)
+    Thread.sleep(REFRESH_PERIOD_MS * 2);
 
     // Test short refresh
     int prevRefreshCount = _routingTableProvider.getRefreshCount();
-    // Wait for one timer duration
-    Thread.sleep(1000L);
-    // The timer should have gone off, incrementing the refresh count
-    Assert.assertEquals(_routingTableProvider.getRefreshCount(), prevRefreshCount + 1);
+    // Wait for more than one timer duration
+    Thread.sleep((long) (REFRESH_PERIOD_MS * 1.5));
+    // The timer should have gone off, incrementing the refresh count by one or two depends on the
+    // timing.
+    int newRefreshCount = _routingTableProvider.getRefreshCount();
+    Assert.assertTrue(
+        newRefreshCount == prevRefreshCount + 1 || newRefreshCount == prevRefreshCount + 2);

Review comment:
       The java schedule task method that we are using is not accurate anyway. Moreover, the main test purpose here is to ensure the refresh happens. If it is faster a little bit, I guess we are still good.
   On the other hand, if the refresh does respect the interval, so during 1.5 seconds, it happens more than 2 times, then the test will still fail.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on pull request #1497: Fix TestRoutingTableProviderPeriodicRefresh.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on pull request #1497:
URL: https://github.com/apache/helix/pull/1497#issuecomment-718983405


   Approved by @alirezazamani , I will merge this PR to unblock the test. But the issue is kept opening, since we still need a better solution.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] lei-xia commented on a change in pull request #1497: Fix TestRoutingTableProviderPeriodicRefresh.

Posted by GitBox <gi...@apache.org>.
lei-xia commented on a change in pull request #1497:
URL: https://github.com/apache/helix/pull/1497#discussion_r514336457



##########
File path: helix-core/src/test/java/org/apache/helix/integration/spectator/TestRoutingTableProviderPeriodicRefresh.java
##########
@@ -64,6 +62,8 @@
   private static final int PARTITION_NUMBER = 20;
   private static final int REPLICA_NUMBER = 3;
 
+  private static final long REFRESH_PERIOD_MS = 1000L;

Review comment:
       why 1000ms, could 500ms, or even 100ms work? That will reduce the test time.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1497: Fix TestRoutingTableProviderPeriodicRefresh.

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1497:
URL: https://github.com/apache/helix/pull/1497#discussion_r514524453



##########
File path: helix-core/src/test/java/org/apache/helix/integration/spectator/TestRoutingTableProviderPeriodicRefresh.java
##########
@@ -203,34 +208,32 @@ synchronized int getRefreshCount() {
 
   @Test
   public void testPeriodicRefresh() throws InterruptedException {
-    // Wait so that initial refreshes finish (not triggered by periodic refresh timer)
-    Thread.sleep(1000L);
+    // Wait to ensure that the initial refreshes finish (not triggered by periodic refresh timer)
+    Thread.sleep(REFRESH_PERIOD_MS * 2);
 
     // Test short refresh
     int prevRefreshCount = _routingTableProvider.getRefreshCount();
-    // Wait for one timer duration
-    Thread.sleep(1000L);
-    // The timer should have gone off, incrementing the refresh count
-    Assert.assertEquals(_routingTableProvider.getRefreshCount(), prevRefreshCount + 1);
+    // Wait for more than one timer duration
+    Thread.sleep((long) (REFRESH_PERIOD_MS * 1.5));
+    // The timer should have gone off, incrementing the refresh count by one or two depends on the
+    // timing.
+    int newRefreshCount = _routingTableProvider.getRefreshCount();
+    Assert.assertTrue(
+        newRefreshCount == prevRefreshCount + 1 || newRefreshCount == prevRefreshCount + 2);

Review comment:
       Talked offline: since periodic refresh can happen slightly less, equally, or slightly more frequent than the configured wait period, I agree with @jiajunwang 's change to use both `sleep(1500)` and `assert(+1 || +2)`. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1497: Fix TestRoutingTableProviderPeriodicRefresh.

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1497:
URL: https://github.com/apache/helix/pull/1497#discussion_r513843732



##########
File path: helix-core/src/test/java/org/apache/helix/integration/spectator/TestRoutingTableProviderPeriodicRefresh.java
##########
@@ -203,34 +208,32 @@ synchronized int getRefreshCount() {
 
   @Test
   public void testPeriodicRefresh() throws InterruptedException {
-    // Wait so that initial refreshes finish (not triggered by periodic refresh timer)
-    Thread.sleep(1000L);
+    // Wait to ensure that the initial refreshes finish (not triggered by periodic refresh timer)
+    Thread.sleep(REFRESH_PERIOD_MS * 2);
 
     // Test short refresh
     int prevRefreshCount = _routingTableProvider.getRefreshCount();
-    // Wait for one timer duration
-    Thread.sleep(1000L);
-    // The timer should have gone off, incrementing the refresh count
-    Assert.assertEquals(_routingTableProvider.getRefreshCount(), prevRefreshCount + 1);

Review comment:
       Could you elaborate on the reason of failure before? Is it because periodic refresh sometimes take a bit more than 1000, so it's possible that `newRefreshCount == prevRefreshCount`; or, is it because periodic refresh may take less time, so it's possible that `newRefreshCount == prevRefreshCount + 2`?

##########
File path: helix-core/src/test/java/org/apache/helix/integration/spectator/TestRoutingTableProviderPeriodicRefresh.java
##########
@@ -203,34 +208,32 @@ synchronized int getRefreshCount() {
 
   @Test
   public void testPeriodicRefresh() throws InterruptedException {
-    // Wait so that initial refreshes finish (not triggered by periodic refresh timer)
-    Thread.sleep(1000L);
+    // Wait to ensure that the initial refreshes finish (not triggered by periodic refresh timer)
+    Thread.sleep(REFRESH_PERIOD_MS * 2);
 
     // Test short refresh
     int prevRefreshCount = _routingTableProvider.getRefreshCount();
-    // Wait for one timer duration
-    Thread.sleep(1000L);
-    // The timer should have gone off, incrementing the refresh count
-    Assert.assertEquals(_routingTableProvider.getRefreshCount(), prevRefreshCount + 1);
+    // Wait for more than one timer duration
+    Thread.sleep((long) (REFRESH_PERIOD_MS * 1.5));
+    // The timer should have gone off, incrementing the refresh count by one or two depends on the
+    // timing.
+    int newRefreshCount = _routingTableProvider.getRefreshCount();
+    Assert.assertTrue(
+        newRefreshCount == prevRefreshCount + 1 || newRefreshCount == prevRefreshCount + 2);

Review comment:
       While this eliminates flakiness, would it be too loose to ensure correctness?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org