You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/04/08 01:19:11 UTC

[GitHub] [kafka] kpatelatwork opened a new pull request #10503: Kafka 9988: Suppress uncaught exceptions in log messages during task shutdown

kpatelatwork opened a new pull request #10503:
URL: https://github.com/apache/kafka/pull/10503


   *Uncaught exceptions logged during task stop were misleading because the task is already on its way of being shutdown.*
   
   *The suppression of exception causes a change in behavior as the caller method now calls statusLisner.onShutdown instead of statusLisner.onFailure which is the right behavior. A new test was added to test the right behavior for uncaught exception during shutdown and existing test was modified to test uncaught exception during normal execution*
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kpatelatwork commented on a change in pull request #10503: KAFKA-9988: Suppress uncaught exceptions in log messages during task shutdown

Posted by GitBox <gi...@apache.org>.
kpatelatwork commented on a change in pull request #10503:
URL: https://github.com/apache/kafka/pull/10503#discussion_r610322098



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java
##########
@@ -185,8 +185,12 @@ private void doRun() throws InterruptedException {
 
             execute();
         } catch (Throwable t) {
-            log.error("{} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted", this, t);
-            throw t;
+            if (!stopping && !cancelled) {

Review comment:
       @C0urante are these flags meant to be hidden or wdyt if we just add "stopped={}, cancelled={} " to the message instead of adding more if/else to the code?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kkonstantine merged pull request #10503: KAFKA-9988: Suppress uncaught exceptions in log messages during Connect task shutdown

Posted by GitBox <gi...@apache.org>.
kkonstantine merged pull request #10503:
URL: https://github.com/apache/kafka/pull/10503


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] C0urante commented on a change in pull request #10503: KAFKA-9988: Suppress uncaught exceptions in log messages during task shutdown

Posted by GitBox <gi...@apache.org>.
C0urante commented on a change in pull request #10503:
URL: https://github.com/apache/kafka/pull/10503#discussion_r610330282



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java
##########
@@ -185,8 +185,12 @@ private void doRun() throws InterruptedException {
 
             execute();
         } catch (Throwable t) {
-            log.error("{} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted", this, t);
-            throw t;
+            if (!stopping && !cancelled) {

Review comment:
       Hmm... I don't believe "cancelled" is a term we've used in public-facing surfaces in the past. For example, when a task takes too long to shut down now and we have to cancel it, we log the message that "Graceful stop... failed": https://github.com/apache/kafka/blob/5964401bf9aab611bd4a072941bd1c927e044258/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L866
   
   Personally I think the additional code complexity is worth it; the original ticket mentions a case where these messages confuse users because they're generated for cancelled tasks, so I'd rather err on the side of making things as obvious as possible to them.
   
   It might be possible to keep things simple and eliminate branches by tweaking the message to make it clear that newer task instances won't be impacted by this failure, though. A possible downside to this is that it might be confusing if there are no newer instances that will be brought up on the worker (because the connector has been deleted, the number of tasks has been reduced, or the task has been reassigned to another worker). But with some careful wording we might be able to avoid misleading people into thinking that this message implies there's already another instance running.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] C0urante commented on a change in pull request #10503: KAFKA-9988: Suppress uncaught exceptions in log messages during task shutdown

Posted by GitBox <gi...@apache.org>.
C0urante commented on a change in pull request #10503:
URL: https://github.com/apache/kafka/pull/10503#discussion_r610316462



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java
##########
@@ -185,8 +185,12 @@ private void doRun() throws InterruptedException {
 
             execute();
         } catch (Throwable t) {
-            log.error("{} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted", this, t);
-            throw t;
+            if (!stopping && !cancelled) {

Review comment:
       I was thinking we could log different messages based on whether only `stopping` or both `stopping` and `cancelled` were true. If `cancelled` is true, we should make sure to let people know that there might be a newer instance of this task already running, and that the log message isn't indicative that that newer instance has failed. If only `stopping` is true, then the existing log message should suffice. Does that make sense?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kpatelatwork commented on a change in pull request #10503: KAFKA-9988: Suppress uncaught exceptions in log messages during task shutdown

Posted by GitBox <gi...@apache.org>.
kpatelatwork commented on a change in pull request #10503:
URL: https://github.com/apache/kafka/pull/10503#discussion_r610857516



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java
##########
@@ -185,8 +185,12 @@ private void doRun() throws InterruptedException {
 
             execute();
         } catch (Throwable t) {
-            log.error("{} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted", this, t);
-            throw t;
+            if (!stopping && !cancelled) {

Review comment:
       @C0urante  please check now, I reworded the message and added a new case for cancelled vs stopped after our morning discussion.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kkonstantine commented on a change in pull request #10503: KAFKA-9988: Suppress uncaught exceptions in log messages during task shutdown

Posted by GitBox <gi...@apache.org>.
kkonstantine commented on a change in pull request #10503:
URL: https://github.com/apache/kafka/pull/10503#discussion_r610945519



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java
##########
@@ -185,8 +185,16 @@ private void doRun() throws InterruptedException {
 
             execute();
         } catch (Throwable t) {
-            log.error("{} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted", this, t);
-            throw t;
+            if (!stopping && !cancelled) {

Review comment:
       Should we just write this as: 
   ```
   if (cancelled) {
   } else if(stopping) {
   } else {
   }
   ```
   ?
   Should be equivalent but a bit easier to follow. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kpatelatwork commented on a change in pull request #10503: KAFKA-9988: Suppress uncaught exceptions in log messages during task shutdown

Posted by GitBox <gi...@apache.org>.
kpatelatwork commented on a change in pull request #10503:
URL: https://github.com/apache/kafka/pull/10503#discussion_r610062836



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java
##########
@@ -185,8 +185,15 @@ private void doRun() throws InterruptedException {
 
             execute();
         } catch (Throwable t) {
-            log.error("{} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted", this, t);
-            throw t;
+            if (!stopping) {
+                log.error("{} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted", this, t);
+                throw t;
+            } else {
+                log.warn("{} During stop, task threw an uncaught and unrecoverable exception: {}.", this, t.getMessage());

Review comment:
       Thanks for the review. I implemented all the suggestions. @C0urante  please review again.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] C0urante commented on a change in pull request #10503: KAFKA-9988: Suppress uncaught exceptions in log messages during task shutdown

Posted by GitBox <gi...@apache.org>.
C0urante commented on a change in pull request #10503:
URL: https://github.com/apache/kafka/pull/10503#discussion_r610857776



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java
##########
@@ -185,8 +185,12 @@ private void doRun() throws InterruptedException {
 
             execute();
         } catch (Throwable t) {
-            log.error("{} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted", this, t);
-            throw t;
+            if (!stopping && !cancelled) {

Review comment:
       Looks good to me!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kpatelatwork commented on a change in pull request #10503: KAFKA-9988: Suppress uncaught exceptions in log messages during task shutdown

Posted by GitBox <gi...@apache.org>.
kpatelatwork commented on a change in pull request #10503:
URL: https://github.com/apache/kafka/pull/10503#discussion_r610322098



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java
##########
@@ -185,8 +185,12 @@ private void doRun() throws InterruptedException {
 
             execute();
         } catch (Throwable t) {
-            log.error("{} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted", this, t);
-            throw t;
+            if (!stopping && !cancelled) {

Review comment:
       are these flags internal or can we just add "stopped={}, cancelled={} " to the message instead of adding more if/else to the code?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] C0urante commented on a change in pull request #10503: KAFKA-9988: Suppress uncaught exceptions in log messages during task shutdown

Posted by GitBox <gi...@apache.org>.
C0urante commented on a change in pull request #10503:
URL: https://github.com/apache/kafka/pull/10503#discussion_r609882468



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java
##########
@@ -185,8 +185,15 @@ private void doRun() throws InterruptedException {
 
             execute();
         } catch (Throwable t) {
-            log.error("{} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted", this, t);
-            throw t;
+            if (!stopping) {
+                log.error("{} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted", this, t);
+                throw t;
+            } else {
+                log.warn("{} During stop, task threw an uncaught and unrecoverable exception: {}.", this, t.getMessage());

Review comment:
       Nit 1: "During stop" may be interpreted as "During an invocation of `Task::stop`", which isn't necessarily the case here if, for example, shutdown is scheduled but the task fails during a call to `put` or `poll` before `stop` can even be invoked.
   
   Nit 2: Also, this might be more of a personal thing, but I'm wondering if "unrecoverable" really adds much here since, even if the task were to somehow "recover", it'd still just shut down.
   
   Nit 3: I think it's still valuable to preserve the entire stack trace instead of only revealing with `DEBUG`-level logging. Any time a task throws an uncaught exception, it should be cause for concern, as there may be issues with the quality of the connector, the health of the Kafka cluster, or the configuration of the worker. And since task restarts have to be manually triggered, it doesn't seem likely that this'd lead to logs being flooded with stack traces. 
   
   WDYT about something like `log.warn("{} After being scheduled for shutdown, task threw an uncaught exception.", this, t);`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] kpatelatwork commented on a change in pull request #10503: KAFKA-9988: Suppress uncaught exceptions in log messages during task shutdown

Posted by GitBox <gi...@apache.org>.
kpatelatwork commented on a change in pull request #10503:
URL: https://github.com/apache/kafka/pull/10503#discussion_r611116918



##########
File path: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java
##########
@@ -185,8 +185,16 @@ private void doRun() throws InterruptedException {
 
             execute();
         } catch (Throwable t) {
-            log.error("{} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted", this, t);
-            throw t;
+            if (!stopping && !cancelled) {

Review comment:
       Very good suggestion and this made the flow more simple.  I have implemented it. @kkonstantine  please check the PR again.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org