Posted to jira@kafka.apache.org by "Cheng Tan (Jira)" <ji...@apache.org> on 2020/04/23 04:27:00 UTC

[jira] [Updated] (KAFKA-9893) Configurable TCP connection timeout and improve the initial metadata fetch

     [ https://issues.apache.org/jira/browse/KAFKA-9893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Tan updated KAFKA-9893:
-----------------------------
    Summary: Configurable TCP connection timeout and improve the initial metadata fetch  (was: Configurable TCP connection timeout for AdminClient)

> Configurable TCP connection timeout and improve the initial metadata fetch
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-9893
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9893
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Cheng Tan
>            Assignee: Cheng Tan
>            Priority: Major
>
> AdminClient does not currently allow connection timeouts to be configured, so we rely on the OS defaults to decide when a broker is unresponsive before selecting an alternate broker from the bootstrap list.
> When a connection times out during the initial TCP handshake and tcp_syn_retries is at its default (6), an unresponsive broker is not timed out until ~127s, while the client itself times out sooner (~120s), so the client gives up before it can try another broker.
> Reducing tcp_syn_retries should mitigate the issue, depending on the number of unresponsive brokers in the bootstrap list, but that setting applies system-wide; it would be better if connection timeouts could instead be configured for AdminClient.
> The use case where this came up was a customer performing DC failover tests with a stretch cluster.
>  
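The ~127s figure in the description can be reproduced with a little arithmetic. The sketch below is illustrative only and not part of the issue: it assumes an initial SYN retransmission timeout of 1 second that doubles on every retry (typical Linux behaviour), and the function names are hypothetical.

```python
def syn_timeout_seconds(syn_retries: int, initial_rto: float = 1.0) -> float:
    """Approximate time until connect() gives up on an unanswered SYN.

    Assumption: the first wait is `initial_rto` seconds and each later
    wait doubles (1, 2, 4, ...). The initial SYN plus `syn_retries`
    retransmissions give `syn_retries + 1` waits in total.
    """
    return sum(initial_rto * 2 ** attempt for attempt in range(syn_retries + 1))


def brokers_tried_before(client_timeout: float, syn_retries: int) -> int:
    """How many unresponsive brokers fully time out within the client timeout."""
    return int(client_timeout // syn_timeout_seconds(syn_retries))


if __name__ == "__main__":
    # Default tcp_syn_retries=6: 1+2+4+8+16+32+64 = 127 seconds.
    print(syn_timeout_seconds(6))        # 127.0
    # With a ~120s client timeout, not even one broker times out in time.
    print(brokers_tried_before(120, 6))  # 0
    # Lowering tcp_syn_retries to 3 gives 1+2+4+8 = 15 seconds per broker.
    print(brokers_tried_before(120, 3))  # 8
```

Under these assumptions the default leaves no room to fall back to a second bootstrap broker within the client timeout, which is why a client-side connection timeout would help without changing the system-wide setting.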



--
This message was sent by Atlassian Jira
(v8.3.4#803005)