Posted to jira@kafka.apache.org by "Adem Efe Gencer (Jira)" <ji...@apache.org> on 2020/04/29 01:32:00 UTC

[jira] [Created] (KAFKA-9930) Prevent ReplicaFetcherThread from throwing UnknownTopicOrPartitionException upon topic creation and deletion.

Adem Efe Gencer created KAFKA-9930:
--------------------------------------

             Summary: Prevent ReplicaFetcherThread from throwing UnknownTopicOrPartitionException upon topic creation and deletion.
                 Key: KAFKA-9930
                 URL: https://issues.apache.org/jira/browse/KAFKA-9930
             Project: Kafka
          Issue Type: Bug
          Components: logging
    Affects Versions: 2.5.0, 2.4.0, 2.3.0, 2.2.0, 2.1.0, 2.0.0, 1.1.0, 1.0.0, 0.11.0.0, 0.10.0.0
            Reporter: Adem Efe Gencer
            Assignee: Adem Efe Gencer


When does UnknownTopicOrPartitionException typically occur?
 * Upon topic creation, a follower broker of a new partition starts its replica fetcher before the prospective leader broker of that partition receives the leadership information from the controller. Apache Kafka has an open issue about this (see KAFKA-6221).
 * Upon topic deletion, a follower broker of a to-be-deleted partition starts its replica fetcher after the leader broker of that partition has processed the deletion information from the controller.
 * As expected, clusters with frequent topic creation and deletion report UnknownTopicOrPartitionException more frequently.

What is the impact?
 * Exception tracking systems treat the error logs that contain UnknownTopicOrPartitionException as exceptions. This produces a lot of noise for a transient condition that is expected to recover by itself and that is a natural consequence of asynchronous state propagation in Kafka.

Why not move it to a log level lower than warn?
 * Although it is typically transient, UnknownTopicOrPartitionException may also indicate a real issue if it does not resolve itself after a short period of time. To keep such cases detectable, we set the log level to warn.
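
The intended behavior can be illustrated with a minimal, self-contained sketch. It is not Kafka's actual code: the class FetchErrorLogLevel, its Level enum, the stand-in exception class, and the message format are all hypothetical, and the assumption that every other fetch error stays at error level is made only for illustration.

```java
// Hypothetical sketch; none of these names are Kafka's real classes or methods.
final class FetchErrorLogLevel {

    // Simplified log levels, used here instead of a real logging framework.
    enum Level { WARN, ERROR }

    // Stand-in for the broker-side exception of the same name (assumption).
    static final class UnknownTopicOrPartitionException extends RuntimeException {
        UnknownTopicOrPartitionException(String message) {
            super(message);
        }
    }

    // Pick the log level for an error seen by a replica fetcher.
    // The unknown-topic-or-partition case is expected to be transient during
    // topic creation and deletion, so it is logged at WARN; everything else
    // is assumed to remain at ERROR.
    static Level levelFor(Throwable error) {
        if (error instanceof UnknownTopicOrPartitionException) {
            return Level.WARN;
        }
        return Level.ERROR;
    }

    // Build an illustrative log line; the wording is made up for this sketch.
    static String format(String topicPartition, Throwable error) {
        return levelFor(error) + " Error for partition " + topicPartition
                + ": " + error.getMessage();
    }
}
```

Keeping the message at warn level, rather than dropping it to info or debug, means a persistent occurrence can still be noticed by whoever monitors the warn-level logs.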


