Posted to jira@kafka.apache.org by "Eran Levy (Jira)" <ji...@apache.org> on 2021/09/20 12:53:00 UTC

[jira] [Resolved] (KAFKA-10643) Static membership - repetitive PreparingRebalance with updating metadata for member reason

     [ https://issues.apache.org/jira/browse/KAFKA-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eran Levy resolved KAFKA-10643.
-------------------------------
    Resolution: Cannot Reproduce

> Static membership - repetitive PreparingRebalance with updating metadata for member reason
> ------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-10643
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10643
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.6.0
>            Reporter: Eran Levy
>            Priority: Major
>         Attachments: broker-4-11.csv, client-4-11.csv, client-d-9-11-11-2020.csv
>
>
> Kafka Streams 2.6.0, brokers version 2.6.0. The Kafka nodes are healthy and the Kafka Streams app is healthy. 
> Configured with static membership. 
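> For context, this is roughly how static membership is enabled on our side - a minimal sketch only; the topics, application id, bootstrap servers and the POD_NAME environment variable below are placeholders, not our real values: 
> {code:java}
> import java.util.Properties;
> import org.apache.kafka.clients.consumer.ConsumerConfig;
> import org.apache.kafka.streams.KafkaStreams;
> import org.apache.kafka.streams.StreamsBuilder;
> import org.apache.kafka.streams.StreamsConfig;
> import org.apache.kafka.streams.Topology;
>
> public class StaticMembershipSketch {
>     public static void main(String[] args) {
>         // Placeholder topology - the real app is more involved.
>         StreamsBuilder builder = new StreamsBuilder();
>         builder.stream("input-topic").to("output-topic");
>         Topology topology = builder.build();
>
>         Properties props = new Properties();
>         props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-stream");          // placeholder
>         props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");      // placeholder
>         // A stable per-pod instance id (e.g. the StatefulSet pod name) enables static membership
>         // on the main consumer.
>         props.put(StreamsConfig.consumerPrefix(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG),
>                   System.getenv("POD_NAME"));                                  // placeholder env var
>         // Give a restarted pod time to rejoin before the coordinator evicts it and rebalances.
>         props.put(StreamsConfig.consumerPrefix(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG), 60000);
>
>         KafkaStreams streams = new KafkaStreams(topology, props);
>         streams.start();
>     }
> }
> {code}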
> Every 10 minutes (I assume because of topic.metadata.refresh.interval.ms), I see the following group coordinator log for different stream consumers: 
> INFO [GroupCoordinator 2]: Preparing to rebalance group **--**-stream in state PreparingRebalance with old generation 12244 (__consumer_offsets-45) (reason: Updating metadata for member ****-stream-11-1-013edd56-ed93-4370-b07c-1c29fbe72c9a) (kafka.coordinator.group.GroupCoordinator)
> and right after that the following log: 
> INFO [GroupCoordinator 2]: Assignment received from leader for group **-**-stream for generation 12246 (kafka.coordinator.group.GroupCoordinator)
>  
> I looked a bit at the Kafka code and I'm not sure I understand why this is happening - does this line describe the situation reported here in the "reason:"? [https://github.com/apache/kafka/blob/7ca299b8c0f2f3256c40b694078e422350c20d19/core/src/main/scala/kafka/coordinator/group/GroupCoordinator.scala#L311]
> I also don't see it happening this often in other Kafka Streams applications that we have. 
> The only suspicious thing I see is that around every hour, different pods of that Kafka Streams application log this exception: 
> {"timestamp":"2020-10-25T06:44:20.414Z","level":"INFO","thread":"**-**-stream-94561945-4191-4a07-ac1b-07b27e044402-StreamThread-1","logger":"org.apache.kafka.clients.FetchSessionHandler","message":"[Consumer clientId=**-**-stream-94561945-4191-4a07-ac1b-07b27e044402-StreamThread-1-restore-consumer, groupId=null] Error sending fetch request (sessionId=34683236, epoch=2872) to node 3:","context":"default","exception":"org.apache.kafka.common.errors.DisconnectException: null\n"}
> I came across this strange behaviour after I started to investigate a rebalance that got stuck after one of the members left the group - the only thing I found is that, maybe because of these frequent PreparingRebalance states, the app might be affected by this bug - KAFKA-9752?
> I don't understand why it happens; it didn't happen before I applied static membership to that Kafka Streams application (around 2 weeks ago). 
> I will be happy if you can help me.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)