You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "Sanjana Kaundinya (Jira)" <ji...@apache.org> on 2020/05/13 20:42:00 UTC

[jira] [Created] (KAFKA-9988) Log incorrectly reports task has failed when task takes too long to shutdown

Sanjana Kaundinya created KAFKA-9988:
----------------------------------------

             Summary: Log incorrectly reports task has failed when task takes too long to shutdown
                 Key: KAFKA-9988
                 URL: https://issues.apache.org/jira/browse/KAFKA-9988
             Project: Kafka
          Issue Type: Bug
          Components: KafkaConnect
            Reporter: Sanjana Kaundinya


If the OffsetStorageReader is closed while the task is trying to shutdown, and the task is trying to access the offsets from the OffsetStorageReader, then we see the following in the logs.

{code:java}
[2020-05-05 05:28:58,937] ERROR WorkerSourceTask{id=replicator-18} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
org.apache.kafka.connect.errors.ConnectException: Failed to fetch offsets.
        at org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offsets(OffsetStorageReaderImpl.java:114)
        at org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offset(OffsetStorageReaderImpl.java:63)
        at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:205)
        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kafka.connect.errors.ConnectException: Offset reader closed while attempting to read offsets. This is likely because the task was been scheduled to stop but has taken longer than the graceful shutdown period to do so.
        at org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offsets(OffsetStorageReaderImpl.java:103)
        ... 14 more
[2020-05-05 05:28:58,937] ERROR WorkerSourceTask{id=replicator-18} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask)
{code}

This is a bit misleading, because the task is already on its way of being shutdown, and doesn't actually need manual intervention to be restarted. We can see that as later on in the logs we see that it throws another unrecoverable exception.

{code:java}
[2020-05-05 05:40:39,361] ERROR WorkerSourceTask{id=replicator-18} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
{code}

If we know a task is on its way of shutting down, we should not throw a ConnectException and instead log a warning so that we don't log false negatives.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)