Posted to jira@kafka.apache.org by "Jason Gustafson (Jira)" <ji...@apache.org> on 2020/05/19 16:40:00 UTC

[jira] [Commented] (KAFKA-10021) When reading to the end of the config log, check if fetch.max.wait.ms is greater than worker.sync.timeout.ms

    [ https://issues.apache.org/jira/browse/KAFKA-10021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111338#comment-17111338 ] 

Jason Gustafson commented on KAFKA-10021:
-----------------------------------------

For a little more detail, the problem today with the `readToLogEnd` function in `KafkaBasedLog` is that it can get blocked by `fetch.max.wait.ms`. This is because the connection that is used for finding the end offset is also shared by the consumer fetching from the log. If the topic has low volume, then it is in fact likely that the ListOffset request gets stuck behind a Fetch which is blocking on the broker. This can cause a timeout when syncing configs or even just slowness when reading offsets using `OffsetStorageReader`. The simplest fix would be to use a shared `AdminClient` to fetch the end offset instead of the consumer.

> When reading to the end of the config log, check if fetch.max.wait.ms is greater than worker.sync.timeout.ms
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-10021
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10021
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>            Reporter: Sanjana Kaundinya
>            Priority: Major
>
> Currently in the Connect code in DistributedHerder.java, we see the following piece of code
>  
> {{            if (!canReadConfigs && !readConfigToEnd(workerSyncTimeoutMs))
>                 return; // Safe to return and tick immediately because readConfigToEnd will do the backoff for us}}
> where the workerSyncTimeoutMs passed in is the timeout for reading to the end of the config log. This is a bug: we should check whether fetch.max.wait.ms is greater than worker.sync.timeout.ms and, if it is, use worker.sync.timeout.ms as the fetch.max.wait.ms. A better fix would be to use the AdminClient to read to the end of the log, but at a minimum we should check the configs.
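
The minimal config check described in the issue could look roughly like the sketch below. It is not taken from the Connect code base: the class name `FetchWaitClamp` and both method names are hypothetical, and the consumer properties are assumed to arrive as a plain `Map<String, Object>`. The only facts relied on are the config key `fetch.max.wait.ms` and its consumer default of 500 ms.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Kafka Connect: caps the consumer's
// fetch.max.wait.ms at the worker's sync timeout, as the issue proposes.
final class FetchWaitClamp {
    static final String FETCH_MAX_WAIT_MS = "fetch.max.wait.ms";
    // Kafka consumer default for fetch.max.wait.ms.
    static final int DEFAULT_FETCH_MAX_WAIT_MS = 500;

    // Returns the fetch wait to use: the configured (or default) value,
    // capped at workerSyncTimeoutMs and never negative.
    static int effectiveFetchMaxWaitMs(Map<String, Object> consumerProps, long workerSyncTimeoutMs) {
        Object configured = consumerProps.get(FETCH_MAX_WAIT_MS);
        int fetchMaxWaitMs = configured == null
                ? DEFAULT_FETCH_MAX_WAIT_MS
                : Integer.parseInt(configured.toString().trim());
        return (int) Math.max(0L, Math.min((long) fetchMaxWaitMs, workerSyncTimeoutMs));
    }

    // Returns a copy of the consumer properties with the capped value applied;
    // the caller's map is left untouched.
    static Map<String, Object> withClampedFetchWait(Map<String, Object> consumerProps, long workerSyncTimeoutMs) {
        Map<String, Object> copy = new HashMap<>(consumerProps);
        copy.put(FETCH_MAX_WAIT_MS, effectiveFetchMaxWaitMs(consumerProps, workerSyncTimeoutMs));
        return copy;
    }
}
```

With `fetch.max.wait.ms` set to 10000 and a sync timeout of 3000, `effectiveFetchMaxWaitMs` returns 3000; a configured value of 200 is left unchanged. A cap equal to the sync timeout only bounds the wait: a request queued behind a fetch that blocks for the full interval may still time out, which is why the AdminClient approach is described as the better fix.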



--
This message was sent by Atlassian Jira
(v8.3.4#803005)