You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/06/10 16:51:08 UTC

[GitHub] [pulsar] heesung-sn opened a new pull request, #16011: [improve][tests] improved flaky test runs

heesung-sn opened a new pull request, #16011:
URL: https://github.com/apache/pulsar/pull/16011

   
   <!--
   ### Contribution Checklist
     
     - PR title format should be *[type][component] summary*. For details, see *[Guideline - Pulsar PR Naming Convention](https://docs.google.com/document/d/1d8Pw6ZbWk-_pCKdOmdvx9rnhPiyuxwq60_TrD68d7BA/edit#heading=h.trs9rsex3xom)*. 
   
     - Fill out the template below to describe the changes contributed by the pull request. That will give reviewers the context they need to do the review.
     
     - Each pull request should address only one issue, not mix up code from multiple issues.
     
     - Each commit in the pull request has a meaningful commit message
   
     - Once all items of the checklist are addressed, remove the above text and this checklist, leaving only the filled out template below.
   
   **(The sections below can be removed for hotfixes of typos)**
   -->
   
   *(If this PR fixes a github issue, please add `Fixes #<xyz>`.)*
   
   Fixes #<xyz>
   
   *(or if this PR is one task of a github issue, please add `Master Issue: #<xyz>` to link to the master issue.)*
   
   Master Issue: #<xyz>
   
   ### Motivation
   
   Fixing flaky tests.
   
   ### Modifications
   
   - improved `PulsarFunctionTlsTests` by reordering tearDown() logic
   - improved `ManagedLedgerFactoryImpl.shutdown()` by closing cacheEviction threads early
   - improved `TestPulsarConnector` memory consumption by removing unnecessary spy()
   - improved `PulsarFunctionsTest` run by using receive() instead of receive(30, TimeUnit.SECONDS);
   
   
   ### Verifying this change
   
   - [x] Make sure that the change passes the CI checks.
   
   
   This change is already covered by existing tests.
   
   ### Does this pull request potentially affect one of the following parts:
   
   *If `yes` was chosen, please highlight the changes*
   
     - Dependencies (does it add or upgrade a dependency): (yes / **no**)
     - The public API: (yes / **no**)
     - The schema: (yes / **no** / don't know)
     - The default values of configurations: (yes / **no**)
     - The wire protocol: (yes / **no**)
     - The rest endpoints: (yes / **no**)
     - The admin cli options: (yes / **no**)
     - Anything that affects deployment: (yes / **no** / don't know)
   
   ### Documentation
   
   Check the box below or label this PR directly.
   
   Need to update docs? 
   
   - [ ] `doc-required` 
   (Your PR needs to update docs and you will update later)
     
   - [x] `doc-not-needed` 
   (Please explain why)
     
   - [ ] `doc` 
   (Your PR contains doc changes)
   
   - [ ] `doc-complete`
   (Docs have been already added)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] lhotari commented on a diff in pull request #16011: [improve][tests] improved flaky test runs

Posted by GitBox <gi...@apache.org>.
lhotari commented on code in PR #16011:
URL: https://github.com/apache/pulsar/pull/16011#discussion_r895055538


##########
tests/integration/src/test/java/org/apache/pulsar/tests/integration/functions/PulsarFunctionsTest.java:
##########
@@ -1584,7 +1584,7 @@ private void publishAndConsumeMessages(String inputTopic,
         }
 
         for (int i = 0; i < numMessages; i++) {
-            Message<byte[]> msg = consumer.receive(30, TimeUnit.SECONDS);
+            Message<byte[]> msg = consumer.receive();

Review Comment:
   Is this change necessary? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] heesung-sn commented on pull request #16011: [improve][tests] improved flaky test runs

Posted by GitBox <gi...@apache.org>.
heesung-sn commented on PR #16011:
URL: https://github.com/apache/pulsar/pull/16011#issuecomment-1155627552

   > remove the timeout change from this PR
   @lhotari 
   I removed it. Please check this again by any chance.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] heesung-sn commented on a diff in pull request #16011: [improve][tests] improved flaky test runs

Posted by GitBox <gi...@apache.org>.
heesung-sn commented on code in PR #16011:
URL: https://github.com/apache/pulsar/pull/16011#discussion_r895068717


##########
tests/integration/src/test/java/org/apache/pulsar/tests/integration/functions/PulsarFunctionsTest.java:
##########
@@ -1584,7 +1584,7 @@ private void publishAndConsumeMessages(String inputTopic,
         }
 
         for (int i = 0; i < numMessages; i++) {
-            Message<byte[]> msg = consumer.receive(30, TimeUnit.SECONDS);
+            Message<byte[]> msg = consumer.receive();

Review Comment:
   Because the test env is hectic with many concurrent jobs, I think we could see timeouts from time to time. If this sync receive response is really delayed, the test will eventually fail. So, yes, I think this will improve the test stability.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] heesung-sn commented on a diff in pull request #16011: [improve][tests] improved flaky test runs

Posted by GitBox <gi...@apache.org>.
heesung-sn commented on code in PR #16011:
URL: https://github.com/apache/pulsar/pull/16011#discussion_r895237689


##########
tests/integration/src/test/java/org/apache/pulsar/tests/integration/functions/PulsarFunctionsTest.java:
##########
@@ -1584,7 +1584,7 @@ private void publishAndConsumeMessages(String inputTopic,
         }
 
         for (int i = 0; i < numMessages; i++) {
-            Message<byte[]> msg = consumer.receive(30, TimeUnit.SECONDS);
+            Message<byte[]> msg = consumer.receive();

Review Comment:
   I don't have the clear answer here. 
   
   To me, 2 vCPUs might not be enough for the e2e tests, especially when running all pulsar components with dockers. I could be wrong here.
   
   If this timeout issue started happening only recently, then I agree that we have a bug here. Please let me know if we do not want this change. The intention is to make the test more stable for other PRs.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] lhotari merged pull request #16011: [improve][tests] improved flaky test runs

Posted by GitBox <gi...@apache.org>.
lhotari merged PR #16011:
URL: https://github.com/apache/pulsar/pull/16011


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] heesung-sn commented on a diff in pull request #16011: [improve][tests] improved flaky test runs

Posted by GitBox <gi...@apache.org>.
heesung-sn commented on code in PR #16011:
URL: https://github.com/apache/pulsar/pull/16011#discussion_r895068717


##########
tests/integration/src/test/java/org/apache/pulsar/tests/integration/functions/PulsarFunctionsTest.java:
##########
@@ -1584,7 +1584,7 @@ private void publishAndConsumeMessages(String inputTopic,
         }
 
         for (int i = 0; i < numMessages; i++) {
-            Message<byte[]> msg = consumer.receive(30, TimeUnit.SECONDS);
+            Message<byte[]> msg = consumer.receive();

Review Comment:
   Because the test env can be hectic with many concurrent jobs, I think we could see timeouts from time to time. If the response is really delayed, the test will eventually fail. So, yes, I think this will improve the test stability.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] heesung-sn commented on a diff in pull request #16011: [improve][tests] improved flaky test runs

Posted by GitBox <gi...@apache.org>.
heesung-sn commented on code in PR #16011:
URL: https://github.com/apache/pulsar/pull/16011#discussion_r895068717


##########
tests/integration/src/test/java/org/apache/pulsar/tests/integration/functions/PulsarFunctionsTest.java:
##########
@@ -1584,7 +1584,7 @@ private void publishAndConsumeMessages(String inputTopic,
         }
 
         for (int i = 0; i < numMessages; i++) {
-            Message<byte[]> msg = consumer.receive(30, TimeUnit.SECONDS);
+            Message<byte[]> msg = consumer.receive();

Review Comment:
   Because the test env can be really hectic, I think we could see timeouts from time to time. If the response is really delayed, the test will eventually fail. So, yes, I think this will improve the test stability.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] heesung-sn commented on a diff in pull request #16011: [improve][tests] improved flaky test runs

Posted by GitBox <gi...@apache.org>.
heesung-sn commented on code in PR #16011:
URL: https://github.com/apache/pulsar/pull/16011#discussion_r895068717


##########
tests/integration/src/test/java/org/apache/pulsar/tests/integration/functions/PulsarFunctionsTest.java:
##########
@@ -1584,7 +1584,7 @@ private void publishAndConsumeMessages(String inputTopic,
         }
 
         for (int i = 0; i < numMessages; i++) {
-            Message<byte[]> msg = consumer.receive(30, TimeUnit.SECONDS);
+            Message<byte[]> msg = consumer.receive();

Review Comment:
   Because the test env is hectic with many concurrent jobs, I think we could see timeouts from time to time. If the response is really delayed, the test will eventually fail. So, yes, I think this will improve the test stability.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] lhotari commented on a diff in pull request #16011: [improve][tests] improved flaky test runs

Posted by GitBox <gi...@apache.org>.
lhotari commented on code in PR #16011:
URL: https://github.com/apache/pulsar/pull/16011#discussion_r895121446


##########
tests/integration/src/test/java/org/apache/pulsar/tests/integration/functions/PulsarFunctionsTest.java:
##########
@@ -1584,7 +1584,7 @@ private void publishAndConsumeMessages(String inputTopic,
         }
 
         for (int i = 0; i < numMessages; i++) {
-            Message<byte[]> msg = consumer.receive(30, TimeUnit.SECONDS);
+            Message<byte[]> msg = consumer.receive();

Review Comment:
   The test env has reasonable computing resources in CI. Each build job has 2 CPUs and 7GB RAM. I think that it's a bug if receiving a message takes more than 30 seconds. Why would message receiving take more than that?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] lhotari commented on a diff in pull request #16011: [improve][tests] improved flaky test runs

Posted by GitBox <gi...@apache.org>.
lhotari commented on code in PR #16011:
URL: https://github.com/apache/pulsar/pull/16011#discussion_r896436565


##########
tests/integration/src/test/java/org/apache/pulsar/tests/integration/functions/PulsarFunctionsTest.java:
##########
@@ -1584,7 +1584,7 @@ private void publishAndConsumeMessages(String inputTopic,
         }
 
         for (int i = 0; i < numMessages; i++) {
-            Message<byte[]> msg = consumer.receive(30, TimeUnit.SECONDS);
+            Message<byte[]> msg = consumer.receive();

Review Comment:
   > To me, 2 vCPUs might not be enough for the e2e tests, especially when running all pulsar components with dockers. I could be wrong here.
   
   2 vCPUs and 7GB RAM is plenty of computation power & RAM for the tests that we have.
   
   > If this timeout issue started happening only recently, then I agree that we have a bug here. Please let me know if we do not want this change. The intention is to make the test more stable for other PRs.
   
   I am not convinced that we should remove the timeout. The problem must be investigated, and the root cause should be fixed. There must be a bug in production code if 30 seconds isn't sufficient.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org