You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2020/06/25 17:24:53 UTC

[GitHub] [kafka] C0urante opened a new pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors

C0urante opened a new pull request #8928:
URL: https://github.com/apache/kafka/pull/8928


   [Jira](https://issues.apache.org/jira/projects/KAFKA/issues/KAFKA-10192)
   
   The `testBlockInConnectorStop` test is failing semi-frequently on Jenkins. It's difficult to verify the cause without complete logs and I'm unable to reproduce locally, but I suspect the cause may be that the Connect worker hasn't completed startup yet by the time the test begins and so the initial REST request to create a connector times out with a 500 error. This isn't an issue for normal tests but we artificially reduce the REST request timeout for these tests as some requests are meant to exhaust that timeout.
   
   The changes here use a small hack to verify that the worker has started and is ready to handle all types of REST requests before tests start.
   
   If possible, we might want to run Jenkins tests on this PR 5-10 times to see if the changes have any effect.
   
   cc @abbccdda 
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kkonstantine commented on a change in pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors

Posted by GitBox <gi...@apache.org>.
kkonstantine commented on a change in pull request #8928:
URL: https://github.com/apache/kafka/pull/8928#discussion_r450292688



##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/integration/BlockingConnectorTest.java
##########
@@ -76,6 +78,15 @@ public void setup() {
 
         // start the clusters
         connect.start();
+
+        // wait for the Connect REST API to become available. necessary because of the reduced REST
+        // request timeout; otherwise, we may get an unexpected 500 with our first real REST request
+        // if the worker is still getting on its feet.
+        waitForCondition(

Review comment:
       Should we just hit the endpoint that lists connectors to verify that the worker is ready to serve REST requests? 
   
   That's what we've been doing in system tests: 
   https://github.com/apache/kafka/blob/trunk/tests/kafkatest/services/connect.py#L110
   
   Given that this is a valid endpoint and doesn't need an artificial connector name seems less hacky and we could include this condition overall in the utils. Wdyt?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] abbccdda commented on pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors

Posted by GitBox <gi...@apache.org>.
abbccdda commented on pull request #8928:
URL: https://github.com/apache/kafka/pull/8928#issuecomment-650224041


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] C0urante commented on a change in pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors

Posted by GitBox <gi...@apache.org>.
C0urante commented on a change in pull request #8928:
URL: https://github.com/apache/kafka/pull/8928#discussion_r449789091



##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/integration/BlockingConnectorTest.java
##########
@@ -76,6 +78,15 @@ public void setup() {
 
         // start the clusters
         connect.start();
+
+        // wait for the Connect REST API to become available. necessary because of the reduced REST
+        // request timeout; otherwise, we may get an unexpected 500 with our first real REST request
+        // if the worker is still getting on its feet.
+        waitForCondition(

Review comment:
       That won't quite buy us what we need. The litmus test for a worker being available with that call is that its [root resource returns a valid ServerInfo response](https://github.com/apache/kafka/blob/72042f26aff396ed85d02e7e312fd07ea2cc7617/connect/runtime/src/test/java/org/apache/kafka/connect/util/clusters/EmbeddedConnectCluster.java#L269-L270), but this doesn't guarantee that the worker has completed startup (reading to the end of internal topics, specifically) and so calls to the REST API that have to be handled in the `DistributedHerder::tick` method may still block and, because of the reduced timeouts for this test, fail.
   
   This isn't a great solution but as far as I can tell there's no official way to determine if a worker has completed startup or not via the REST API, so issuing an info request for a non-existent connector is used instead.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] C0urante commented on a change in pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors

Posted by GitBox <gi...@apache.org>.
C0urante commented on a change in pull request #8928:
URL: https://github.com/apache/kafka/pull/8928#discussion_r449797010



##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/integration/BlockingConnectorTest.java
##########
@@ -76,6 +78,15 @@ public void setup() {
 
         // start the clusters
         connect.start();
+
+        // wait for the Connect REST API to become available. necessary because of the reduced REST
+        // request timeout; otherwise, we may get an unexpected 500 with our first real REST request
+        // if the worker is still getting on its feet.
+        waitForCondition(

Review comment:
       I didn't change the assertions initially for two reasons:
   
   1. This test is the only one that artificially reduces the REST request timeout from the usual 90 seconds to just 5 seconds; it seemed fairly unlikely that the distinction between "any REST request is ready to be handled by the herder" and "all REST requests are ready to be handled by the herder" would matter in any test that still uses the normal 90 second timeout. Herders not starting in 5 seconds isn't too surprising; herders not starting in 90 seconds is probably a sign of something going wrong.
   
   2. The solution here, while valid for this test, is a little hacky. Issuing a request for info of a non-existent connector and waiting for the response status to transition from 500 to 404 works if you know that the connector doesn't exist, but not necessarily if you aren't certain that that connector shouldn't exist. We'd probably want to wait for the status to become either 404 (herder has completed startup and connector doesn't exist) or 200 (herder has completed startup and connector does exist) and use some obscure name like `"test-for-herder-startup"`, but even then, it's possible that there may be unexpected side effects to these requests if they're used for all integration tests instead of just ones where the conditions are more controlled.
   
   LMK what you think; I'm willing to add this logic to the embedded cluster assertions if there's a case to be made for how it'd benefit more than just this test.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kkonstantine commented on a change in pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors

Posted by GitBox <gi...@apache.org>.
kkonstantine commented on a change in pull request #8928:
URL: https://github.com/apache/kafka/pull/8928#discussion_r449789475



##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/integration/BlockingConnectorTest.java
##########
@@ -76,6 +78,15 @@ public void setup() {
 
         // start the clusters
         connect.start();
+
+        // wait for the Connect REST API to become available. necessary because of the reduced REST
+        // request timeout; otherwise, we may get an unexpected 500 with our first real REST request
+        // if the worker is still getting on its feet.
+        waitForCondition(

Review comment:
       Should we change the assertions then? I'd assume this will be useful to other tests as well.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kkonstantine commented on a change in pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors

Posted by GitBox <gi...@apache.org>.
kkonstantine commented on a change in pull request #8928:
URL: https://github.com/apache/kafka/pull/8928#discussion_r451832657



##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/integration/BlockingConnectorTest.java
##########
@@ -76,6 +78,15 @@ public void setup() {
 
         // start the clusters
         connect.start();
+
+        // wait for the Connect REST API to become available. necessary because of the reduced REST
+        // request timeout; otherwise, we may get an unexpected 500 with our first real REST request
+        // if the worker is still getting on its feet.
+        waitForCondition(

Review comment:
       Interesting observation. Of course, hitting the leader with a request doesn't tell you that other workers have started, so that's applicable in tests like this one, which start only one worker here, etc. This doesn't seem to be a race condition we encounter often, so I'm fine with an ad hoc specific fix here given the reduced timeout. I'd be surprised if it took noticeable time to load the services after the herder is submitted to its executor. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kkonstantine commented on a change in pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors

Posted by GitBox <gi...@apache.org>.
kkonstantine commented on a change in pull request #8928:
URL: https://github.com/apache/kafka/pull/8928#discussion_r449787441



##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/integration/BlockingConnectorTest.java
##########
@@ -76,6 +78,15 @@ public void setup() {
 
         // start the clusters
         connect.start();
+
+        // wait for the Connect REST API to become available. necessary because of the reduced REST
+        // request timeout; otherwise, we may get an unexpected 500 with our first real REST request
+        // if the worker is still getting on its feet.
+        waitForCondition(

Review comment:
       We have `EmbeddedConnectClusterAssertions#assertExactlyNumWorkersAreUp`
   Should we use this high level assertion to confirm that the workers are up?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kkonstantine commented on pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors

Posted by GitBox <gi...@apache.org>.
kkonstantine commented on pull request #8928:
URL: https://github.com/apache/kafka/pull/8928#issuecomment-657751413


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kkonstantine merged pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors in integration tests

Posted by GitBox <gi...@apache.org>.
kkonstantine merged pull request #8928:
URL: https://github.com/apache/kafka/pull/8928


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] C0urante commented on pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors in integration tests

Posted by GitBox <gi...@apache.org>.
C0urante commented on pull request #8928:
URL: https://github.com/apache/kafka/pull/8928#issuecomment-658279963


   Thanks Konstantine!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] abbccdda commented on pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors

Posted by GitBox <gi...@apache.org>.
abbccdda commented on pull request #8928:
URL: https://github.com/apache/kafka/pull/8928#issuecomment-649998256


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kkonstantine commented on pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors

Posted by GitBox <gi...@apache.org>.
kkonstantine commented on pull request #8928:
URL: https://github.com/apache/kafka/pull/8928#issuecomment-655764773


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] C0urante commented on a change in pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors

Posted by GitBox <gi...@apache.org>.
C0urante commented on a change in pull request #8928:
URL: https://github.com/apache/kafka/pull/8928#discussion_r451008207



##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/integration/BlockingConnectorTest.java
##########
@@ -76,6 +78,15 @@ public void setup() {
 
         // start the clusters
         connect.start();
+
+        // wait for the Connect REST API to become available. necessary because of the reduced REST
+        // request timeout; otherwise, we may get an unexpected 500 with our first real REST request
+        // if the worker is still getting on its feet.
+        waitForCondition(

Review comment:
       That hits the in-memory config state and doesn't require a herder request, so the availability of that endpoint doesn't guarantee that the herder has finished startup unfortunatey.

##########
File path: connect/runtime/src/test/java/org/apache/kafka/connect/integration/BlockingConnectorTest.java
##########
@@ -76,6 +78,15 @@ public void setup() {
 
         // start the clusters
         connect.start();
+
+        // wait for the Connect REST API to become available. necessary because of the reduced REST
+        // request timeout; otherwise, we may get an unexpected 500 with our first real REST request
+        // if the worker is still getting on its feet.
+        waitForCondition(

Review comment:
       That hits the in-memory config state and doesn't require a herder request, so the availability of that endpoint doesn't guarantee that the herder has finished startup unfortunately.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kkonstantine commented on pull request #8928: KAFKA-10192: Wait for REST API to become available before testing blocked connectors in integration tests

Posted by GitBox <gi...@apache.org>.
kkonstantine commented on pull request #8928:
URL: https://github.com/apache/kafka/pull/8928#issuecomment-657958290


   This PR affected only integration tests. Merged to `trunk` and `2.6`


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org