You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2022/06/14 01:18:09 UTC

[GitHub] [kafka] C0urante opened a new pull request, #12291: KAFKA-13987: Isolate REST request timeout changes in Connect integration tests

C0urante opened a new pull request, #12291:
URL: https://github.com/apache/kafka/pull/12291

   This causes the artificial reductions in the Connect REST request timeout to be more isolated. Specifically, they now only take place in the tests that need them (instead of any tests that happen to be running after the reduction has taken place and before it has been reset), and they are only performed for the requests that are expected to time out, before being immediately reset. This should help reduce spurious test failures (especially in slow environments like Jenkins) for all Connect integration tests that interact with the REST API, not just the `BlockingConnectorTest` test suite.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] C0urante commented on pull request #12291: KAFKA-13987: Isolate REST request timeout changes in Connect integration tests

Posted by GitBox <gi...@apache.org>.
C0urante commented on PR #12291:
URL: https://github.com/apache/kafka/pull/12291#issuecomment-1155178954

   Thanks Bruno!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] cadonna merged pull request #12291: KAFKA-13987: Isolate REST request timeout changes in Connect integration tests

Posted by GitBox <gi...@apache.org>.
cadonna merged PR #12291:
URL: https://github.com/apache/kafka/pull/12291


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] cadonna commented on pull request #12291: KAFKA-13987: Isolate REST request timeout changes in Connect integration tests

Posted by GitBox <gi...@apache.org>.
cadonna commented on PR #12291:
URL: https://github.com/apache/kafka/pull/12291#issuecomment-1164332752

   I encountered the following failure:
   ```
   Build / JDK 17 and Scala 2.13 / org.apache.kafka.connect.integration.ConnectorRestartApiIntegrationTest.testMultiWorkerRestartOnlyConnector
   ```
   @C0urante Should that test be fixed by this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] cadonna commented on a diff in pull request #12291: KAFKA-13987: Isolate REST request timeout changes in Connect integration tests

Posted by GitBox <gi...@apache.org>.
cadonna commented on code in PR #12291:
URL: https://github.com/apache/kafka/pull/12291#discussion_r896615107


##########
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectResource.java:
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.connect.runtime.rest.resources;
+
+import java.util.concurrent.TimeUnit;
+
+/**
+ * This interface defines shared logic for all Connect REST resources.
+ */
+public interface ConnectResource {
+
+    // TODO: This should not be so long. However, due to potentially long rebalances that may have to wait a full
+    // session timeout to complete, during which we cannot serve some requests. Ideally we could reduce this, but
+    // we need to consider all possible scenarios this could fail. It might be ok to fail with a timeout in rare cases,
+    // but currently a worker simply leaving the group can take this long as well.
+    long DEFAULT_REST_REQUEST_TIMEOUT_MS = TimeUnit.SECONDS.toMillis(90);

Review Comment:
   If this timeout depends on session timeout, shouldn't it be defined in terms of session timeout? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] C0urante commented on pull request #12291: KAFKA-13987: Isolate REST request timeout changes in Connect integration tests

Posted by GitBox <gi...@apache.org>.
C0urante commented on PR #12291:
URL: https://github.com/apache/kafka/pull/12291#issuecomment-1162439015

   @ijuma since @rhauch and @kkonstantine appear to be busy, do you know anyone else who might be willing to review a flaky test fix for Connect?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] C0urante commented on pull request #12291: KAFKA-13987: Isolate REST request timeout changes in Connect integration tests

Posted by GitBox <gi...@apache.org>.
C0urante commented on PR #12291:
URL: https://github.com/apache/kafka/pull/12291#issuecomment-1164497410

   @cadonna it depends on the cause of the failure. If the cause is that a REST request timed out during the test, that might indicate a problem with the fix here. If it's a different cause, it's probably unrelated to this fix.
   
   I think the failure you're referring to had the error message "java.lang.AssertionError: Failed to stop connector and tasks within 120000ms", which is unrelated to the changes here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] C0urante commented on pull request #12291: KAFKA-13987: Isolate REST request timeout changes in Connect integration tests

Posted by GitBox <gi...@apache.org>.
C0urante commented on PR #12291:
URL: https://github.com/apache/kafka/pull/12291#issuecomment-1163466722

   👍 sounds good, thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] C0urante commented on pull request #12291: KAFKA-13987: Isolate REST request timeout changes in Connect integration tests

Posted by GitBox <gi...@apache.org>.
C0urante commented on PR #12291:
URL: https://github.com/apache/kafka/pull/12291#issuecomment-1154607749

   @cadonna this is a follow-up from the earlier work in https://github.com/apache/kafka/pull/12191. It's not urgent but it should help reduce some flakiness; if you have time would you mind taking a look? If not I can probably find someone else who can review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] C0urante commented on a diff in pull request #12291: KAFKA-13987: Isolate REST request timeout changes in Connect integration tests

Posted by GitBox <gi...@apache.org>.
C0urante commented on code in PR #12291:
URL: https://github.com/apache/kafka/pull/12291#discussion_r896810101


##########
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectResource.java:
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.connect.runtime.rest.resources;
+
+import java.util.concurrent.TimeUnit;
+
+/**
+ * This interface defines shared logic for all Connect REST resources.
+ */
+public interface ConnectResource {
+
+    // TODO: This should not be so long. However, due to potentially long rebalances that may have to wait a full
+    // session timeout to complete, during which we cannot serve some requests. Ideally we could reduce this, but
+    // we need to consider all possible scenarios this could fail. It might be ok to fail with a timeout in rare cases,
+    // but currently a worker simply leaving the group can take this long as well.
+    long DEFAULT_REST_REQUEST_TIMEOUT_MS = TimeUnit.SECONDS.toMillis(90);

Review Comment:
   It might be time to revisit that TODO, but I was only copying the comment and value from an existing part of the code base: https://github.com/apache/kafka/blob/4fcfd9ddc4a8da3d4cfbb69268c06763352e29a9/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectorsResource.java#L84-L88
   
   FWIW I haven't heard too much fuss about the REST request timeout to date; the only time that timeouts have been an issue in my experience is when the worker is completely hosed because its herder thread is blocked on something.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] cadonna commented on pull request #12291: KAFKA-13987: Isolate REST request timeout changes in Connect integration tests

Posted by GitBox <gi...@apache.org>.
cadonna commented on PR #12291:
URL: https://github.com/apache/kafka/pull/12291#issuecomment-1164514251

   Then let's merge!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] C0urante commented on pull request #12291: KAFKA-13987: Isolate REST request timeout changes in Connect integration tests

Posted by GitBox <gi...@apache.org>.
C0urante commented on PR #12291:
URL: https://github.com/apache/kafka/pull/12291#issuecomment-1164523097

   💯 thanks Bruno!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org