Posted to jira@kafka.apache.org by "David Jacot (Jira)" <ji...@apache.org> on 2021/06/17 12:06:00 UTC
[jira] [Resolved] (KAFKA-12890) Consumer group stuck in `CompletingRebalance`
[ https://issues.apache.org/jira/browse/KAFKA-12890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Jacot resolved KAFKA-12890.
---------------------------------
Fix Version/s: (was: 2.8.1)
               (was: 2.7.2)
               (was: 2.6.3)
Reviewer: Jason Gustafson
Resolution: Fixed
I will backport the patch to the 2.8 branch as well.
> Consumer group stuck in `CompletingRebalance`
> ---------------------------------------------
>
> Key: KAFKA-12890
> URL: https://issues.apache.org/jira/browse/KAFKA-12890
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 2.7.0, 2.6.1, 2.8.0, 2.7.1, 2.6.2
> Reporter: David Jacot
> Assignee: David Jacot
> Priority: Blocker
> Fix For: 3.0.0
>
>
> We have recently seen multiple consumer groups stuck in `CompletingRebalance`. It appears that the members of those groups never receive the assignment from the group leader, so the groups remain stuck in this state forever.
> When a group transitions to the `CompletingRebalance` state, the group coordinator sets up a `DelayedHeartbeat` for each member of the group. It does so to ensure that the member sends a sync request within the session timeout. If it does not, the group coordinator rebalances the group. Note that `DelayedHeartbeat` is used here for this purpose. A `DelayedHeartbeat` is also completed when the member heartbeats.
> The issue is that https://github.com/apache/kafka/pull/8834 changed the heartbeat logic to allow members to heartbeat while the group is in the `CompletingRebalance` state. This was not allowed before. Now, if a member starts to heartbeat while the group is in `CompletingRebalance`, the heartbeat request completes the pending `DelayedHeartbeat` that was set up to catch a missing sync request. Therefore, if the sync request never comes, the group coordinator no longer notices.
> We need to bring that behavior back somehow.
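
The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Kafka's coordinator code: `ToyCoordinator`, `expect_sync`, `simulate`, the 10-second session timeout, and the 3-second heartbeat interval are all hypothetical names and values chosen for illustration. The single per-member deadline stands in for the pending `DelayedHeartbeat`.

```python
from dataclasses import dataclass, field
from typing import Dict

SESSION_TIMEOUT_MS = 10_000  # assumed value, for illustration only


@dataclass
class ToyCoordinator:
    """Toy model of a single group; NOT Kafka's actual group coordinator."""

    allow_heartbeat_in_completing_rebalance: bool
    state: str = "CompletingRebalance"
    now_ms: int = 0
    # member id -> deadline (ms) of its pending delayed heartbeat
    pending: Dict[str, int] = field(default_factory=dict)

    def expect_sync(self, member: str) -> None:
        """Arm the timer that is meant to catch a missing sync request."""
        self.pending[member] = self.now_ms + SESSION_TIMEOUT_MS

    def heartbeat(self, member: str) -> bool:
        """Return True if the heartbeat was accepted."""
        if (self.state == "CompletingRebalance"
                and not self.allow_heartbeat_in_completing_rebalance):
            return False  # rejected, so the sync timer keeps running
        if member in self.pending:
            # An accepted heartbeat completes the pending timer and arms a
            # fresh one, which pushes the deadline out again.
            self.pending[member] = self.now_ms + SESSION_TIMEOUT_MS
        return True

    def advance(self, ms: int) -> None:
        """Move the clock forward and fire any expired timer."""
        self.now_ms += ms
        expired = [m for m, d in self.pending.items() if d <= self.now_ms]
        if expired and self.state == "CompletingRebalance":
            self.pending.clear()
            self.state = "PreparingRebalance"  # coordinator rebalances


def simulate(allow_heartbeat: bool) -> str:
    """One member heartbeats every 3 s for 30 s but never sends a sync."""
    coordinator = ToyCoordinator(allow_heartbeat)
    coordinator.expect_sync("member-1")
    for _ in range(10):
        coordinator.advance(3_000)
        coordinator.heartbeat("member-1")
    return coordinator.state
```

In this model, `simulate(False)` lets the timer expire and the group leaves `CompletingRebalance`, while `simulate(True)` keeps resetting the timer so the missing sync request is never detected.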
--
This message was sent by Atlassian Jira
(v8.3.4#803005)