Posted to jira@kafka.apache.org by "James Olsen (Jira)" <ji...@apache.org> on 2021/12/26 21:37:00 UTC

[jira] [Commented] (KAFKA-13563) Consumer failure after rolling Broker upgrade

    [ https://issues.apache.org/jira/browse/KAFKA-13563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17465463#comment-17465463 ] 

James Olsen commented on KAFKA-13563:
-------------------------------------

[~showuon] I've attached a reproducer ({{kafka.zip}}).  It includes a {{docker-compose.yml}} that brings up a 3-node cluster and a {{Main}} class with Producer and Consumer.

P.S. The easiest way to find the current coordinator is to search the logs for `discovered`.
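
A minimal sketch of that log search, assuming the Client log has been captured to a file. The class {{CoordinatorLogCheck}} and its methods are hypothetical and are not part of the attached reproducer; the two message fragments are the ones quoted in the issue description and are matched case-insensitively. It reports the last line that announced a coordinator, and whether a "Rediscovery will be attempted" message was never followed by a "Discovered group coordinator" message (the symptom in step 7).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper (not from kafka.zip): scans client log lines for coordinator events. */
final class CoordinatorLogCheck {

    // Fragments quoted in the issue description, lower-cased for case-insensitive matching.
    static final String LOST = "rediscovery will be attempted";
    static final String FOUND = "discovered group coordinator";

    /** Returns the last line that reports a discovered coordinator, or null if there is none. */
    static String lastDiscovered(List<String> lines) {
        String last = null;
        for (String line : lines) {
            if (line.toLowerCase(Locale.ROOT).contains(FOUND)) {
                last = line;
            }
        }
        return last;
    }

    /** True if the most recent coordinator event is a loss with no discovery after it. */
    static boolean stuckInRediscovery(List<String> lines) {
        boolean pending = false;
        for (String line : lines) {
            String lower = line.toLowerCase(Locale.ROOT);
            if (lower.contains(LOST)) {
                pending = true;   // coordinator lost, rediscovery announced
            } else if (lower.contains(FOUND)) {
                pending = false;  // a coordinator was found afterwards
            }
        }
        return pending;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java CoordinatorLogCheck.java <client-log-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("Last discovery line: " + lastDiscovered(lines));
        System.out.println("Stuck in rediscovery: " + stuckInRediscovery(lines));
    }
}
```

Run it against the Client log after each Broker stop: the last discovery line shows which Broker to stop next, and a "stuck" result after the second failover would match the reported behaviour.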

> Consumer failure after rolling Broker upgrade
> ---------------------------------------------
>
>                 Key: KAFKA-13563
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13563
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>            Reporter: Luke Chen
>            Assignee: Luke Chen
>            Priority: Major
>         Attachments: kafka.zip
>
>
> This failure occurred again during this month's rolling OS security updates to the Brokers (no change to Broker version).  I have also been able to reproduce it locally with the following process:
>  
> 1. Start a 3 Broker cluster with a Topic having Replicas=3.
> 2. Start a Client with Producer and Consumer communicating over the Topic.
> 3. Stop the Broker that is acting as the Group Coordinator.
> 4. Observe successful Rediscovery of new Group Coordinator.
> 5. Restart the stopped Broker.
> 6. Stop the Broker that became the new Group Coordinator at step 4.
> 7. Observe "Rediscovery will be attempted" message but no "Discovered group coordinator" message.
>  
> In short, Group Coordinator Rediscovery only works for the first Broker failover, not for any subsequent failover.
>  
> I conducted tests using 2.7.1 servers.  The issue occurs with 2.7.1 and 2.7.2 Clients.  The issue does not occur with 2.5.1 and 2.7.0 Clients.  This makes me suspect that https://issues.apache.org/jira/browse/KAFKA-10793 introduced this issue.
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)