Posted to dev@kafka.apache.org by "Luke Chen (Jira)" <ji...@apache.org> on 2022/04/15 03:20:00 UTC

[jira] [Resolved] (KAFKA-13653) Proactively discover alive brokers from bootstrap server lists when all nodes are down

     [ https://issues.apache.org/jira/browse/KAFKA-13653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Chen resolved KAFKA-13653.
-------------------------------
    Resolution: Duplicate

> Proactively discover alive brokers from bootstrap server lists when all nodes are down
> --------------------------------------------------------------------------------------
>
>                 Key: KAFKA-13653
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13653
>             Project: Kafka
>          Issue Type: Improvement
>          Components: clients
>    Affects Versions: 3.1.0
>            Reporter: Luke Chen
>            Priority: Major
>
> Currently, a client metadata update is triggered in 2 situations:
>  # a partition leader changes
>  # the metadata expires (default 5 mins)
> But sometimes the client and the brokers are started at the same time, and the client might discover only some of the brokers at first. If the discovered brokers then go down unexpectedly within 5 mins (before the metadata expires), the client has no chance to update its metadata, and from its point of view the whole cluster is down.
> Example:
> 1. brokerA is up
> 2. the producer starts, discovers brokerA, and updates its metadata
> 3. brokerB and brokerC come up (but the producer doesn't know about them, and the leader imbalance check interval (5 mins by default) has not elapsed)
> 4. the producer keeps producing data without error
> 5. brokerA goes down, let's say, 3 mins after the producer started
> 6. Now the producer cannot reach the cluster at all, even though brokerB and brokerC are up
>  
> We should proactively discover active brokers via the bootstrap server config when there are no known nodes left to connect to. So, in the above example, if bootstrap.servers is set to "brokerA_IP,brokerB_IP,brokerC_IP", then the client should be able to discover brokerB and brokerC after step 6.
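
The proposed behaviour can be illustrated with a minimal, self-contained sketch. It is not the Kafka client's actual implementation and uses no Kafka classes; the class BootstrapFallbackSketch and its methods are hypothetical names introduced only for illustration. It assumes the client first tries the nodes from its last metadata update and, only when none of them is reachable, falls back to the configured bootstrap list.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical model of the proposal; not part of the Kafka client API.
final class BootstrapFallbackSketch {
    // Addresses from the bootstrap.servers setting, in configured order.
    private final List<String> bootstrapServers;
    // Brokers learned from the most recent metadata update.
    private final Set<String> knownNodes = new LinkedHashSet<>();

    public BootstrapFallbackSketch(List<String> bootstrapServers) {
        this.bootstrapServers = new ArrayList<>(bootstrapServers);
    }

    // Replace the known nodes with the brokers reported by a metadata response.
    public void updateMetadata(List<String> discoveredBrokers) {
        knownNodes.clear();
        knownNodes.addAll(discoveredBrokers);
    }

    // Pick a node to connect to, given the set of brokers that are reachable.
    public Optional<String> pickNode(Set<String> reachable) {
        for (String node : knownNodes) {
            if (reachable.contains(node)) {
                return Optional.of(node);
            }
        }
        // Proposed fallback: no known node is reachable, so retry the bootstrap list.
        for (String node : bootstrapServers) {
            if (reachable.contains(node)) {
                return Optional.of(node);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        BootstrapFallbackSketch client =
                new BootstrapFallbackSketch(List.of("brokerA", "brokerB", "brokerC"));
        client.updateMetadata(List.of("brokerA"));            // step 2: only brokerA discovered
        System.out.println(client.pickNode(Set.of("brokerA")));            // steps 3-4
        System.out.println(client.pickNode(Set.of("brokerB", "brokerC"))); // steps 5-6
    }
}
```

Without the second loop, pickNode would return an empty result at step 6, which corresponds to the failure described in the example.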



--
This message was sent by Atlassian Jira
(v8.20.1#820001)