You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/03/24 18:24:24 UTC

[GitHub] [kafka] C0urante opened a new pull request #10396: KAFKA-12474: Handle failure to write new session keys gracefully

C0urante opened a new pull request #10396:
URL: https://github.com/apache/kafka/pull/10396


   [Jira](https://issues.apache.org/jira/browse/KAFKA-12474)
   
   If a distributed worker fails to write (or read back) a new session key to/from the config topic, it dies.
   
   This fix softens the blow a bit by instead restarting the herder tick loop anew and forcing a read to the end of the config topic until the worker is able to successfully read to the end.
   
   At this point, if the worker was able to successfully write a new session key in its first attempt, it will have read that key back from the config topic and will not write a new key during the next tick iteration. If it was not able to write that key at all, it will try again to write a new key (if it is still the leader).
   
   Verified with new unit tests for both cases (failure to write, failure to read back after write).
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] C0urante commented on a change in pull request #10396: KAFKA-12474: Handle failure to write new session keys gracefully

Posted by GitBox <gi...@apache.org>.
C0urante commented on a change in pull request #10396:
URL: https://github.com/apache/kafka/pull/10396#discussion_r603744554



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java
##########
@@ -363,10 +363,16 @@ public void tick() {
         if (checkForKeyRotation(now)) {
             log.debug("Distributing new session key");
             keyExpiration = Long.MAX_VALUE;
-            configBackingStore.putSessionKey(new SessionKey(
-                keyGenerator.generateKey(),
-                now
-            ));
+            try {
+                configBackingStore.putSessionKey(new SessionKey(
+                    keyGenerator.generateKey(),
+                    now
+                ));
+            } catch (Exception e) {
+                log.warn("Failed to write new session key to config topic; forcing a read to the end of the config topic before possibly retrying");

Review comment:
       Good point; considering the other `WARN`- and `ERROR`-level messages that get emitted with this exact code path, it should be fine to downgrade this to `INFO`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] rhauch commented on pull request #10396: KAFKA-12474: Handle failure to write new session keys gracefully

Posted by GitBox <gi...@apache.org>.
rhauch commented on pull request #10396:
URL: https://github.com/apache/kafka/pull/10396#issuecomment-811994433


   Not sure why the build results are not showing up, but the build passed on JDK 8 and ARM, and failed unrelated tests on JDK 15. https://ci-builds.apache.org/job/Kafka/job/kafka-pr/view/change-requests/job/PR-10396/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] rhauch commented on a change in pull request #10396: KAFKA-12474: Handle failure to write new session keys gracefully

Posted by GitBox <gi...@apache.org>.
rhauch commented on a change in pull request #10396:
URL: https://github.com/apache/kafka/pull/10396#discussion_r603575020



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java
##########
@@ -363,10 +363,16 @@ public void tick() {
         if (checkForKeyRotation(now)) {
             log.debug("Distributing new session key");
             keyExpiration = Long.MAX_VALUE;
-            configBackingStore.putSessionKey(new SessionKey(
-                keyGenerator.generateKey(),
-                now
-            ));
+            try {
+                configBackingStore.putSessionKey(new SessionKey(
+                    keyGenerator.generateKey(),
+                    now
+                ));
+            } catch (Exception e) {
+                log.warn("Failed to write new session key to config topic; forcing a read to the end of the config topic before possibly retrying");

Review comment:
       Is this worthy of a warning message rather than an info-level message, especially if we think the herder can automatically recover from typical causes of this (e.g., transient network issues, transient broker issues, etc.)?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] rhauch merged pull request #10396: KAFKA-12474: Handle failure to write new session keys gracefully

Posted by GitBox <gi...@apache.org>.
rhauch merged pull request #10396:
URL: https://github.com/apache/kafka/pull/10396


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] C0urante commented on pull request #10396: KAFKA-12474: Handle failure to write new session keys gracefully

Posted by GitBox <gi...@apache.org>.
C0urante commented on pull request #10396:
URL: https://github.com/apache/kafka/pull/10396#issuecomment-806057436


   @gharris1727 @ncliang either of you care to take a look?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org