Posted to jira@kafka.apache.org by "Manikumar (JIRA)" <ji...@apache.org> on 2018/11/09 10:46:00 UTC

[jira] [Updated] (KAFKA-7126) Reduce number of rebalance for large consumer groups after a topic is created

     [ https://issues.apache.org/jira/browse/KAFKA-7126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikumar updated KAFKA-7126:
-----------------------------
    Fix Version/s:     (was: 2.0.0)
                   2.0.1

> Reduce number of rebalance for large consumer groups after a topic is created
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-7126
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7126
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Dong Lin
>            Assignee: Jon Lee
>            Priority: Major
>             Fix For: 2.0.1, 2.1.0
>
>         Attachments: 1.diff
>
>
> For a group of 200 MirrorMaker consumers with pattern-based topic subscription, a single topic creation caused 50 rebalances for each of these consumers over a 5-minute period. This causes the MM to lag significantly during those 5 minutes, and the clusters may be considerably out of sync during this period.
> Ideally we would like to trigger only 1 rebalance in the MM group after a topic is created. And conceptually it should be doable.
>  
> Here is the explanation of this repeated consumer rebalance based on the consumer rebalance logic in the latest Kafka code:
> 1) A topic with 10 partitions is created in the cluster, and it matches the subscription pattern of the MM consumers.
> 2) The leader of the MM consumer group detects the new topic after a metadata refresh and triggers a rebalance.
> 3) At time T0, the first rebalance finishes. 10 consumers are each assigned 1 partition of this topic. The other 190 consumers are not assigned any partition of this topic. At this moment, the newly created topic appears in `ConsumerCoordinator.subscriptions.subscription` for those consumers that were assigned a partition of this topic or that refreshed metadata before time T0.
> 4) In the common case, half of the consumers have refreshed metadata before the leader of the consumer group refreshed its metadata. Thus around 100 + 10 = 110 consumers have the newly created topic in `ConsumerCoordinator.subscriptions.subscription`. The other 90 consumers do not have this topic in `ConsumerCoordinator.subscriptions.subscription`.
> 5) For those 90 consumers, if any consumer refreshes metadata, it will add this topic to `ConsumerCoordinator.subscriptions.subscription`, which causes `ConsumerCoordinator.rejoinNeededOrPending()` to return true and triggers another rebalance. If a few consumers refresh metadata almost at the same time, they will jointly trigger one rebalance. Otherwise, they each trigger a separate rebalance.
> 6) The default metadata.max.age.ms is 5 minutes. Thus in the worst case, which is probably also the average case if the number of consumers in the group is large, the last consumer will refresh its metadata 5 minutes after T0, and rebalances will keep recurring throughout this 5-minute interval.
>  
>  
>  
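The repeated rebalances described in steps 5 and 6 can be approximated with a small simulation. This is only a rough sketch, not Kafka code: the function names are hypothetical, the stale consumers are assumed to refresh metadata at uniformly random times within metadata.max.age.ms, and refreshes that land within an assumed rebalance duration (3 seconds here, an arbitrary value) are assumed to merge into a single rebalance.

```python
import random


def count_rebalances(refresh_times_s, rebalance_duration_s):
    """Count rebalances, merging refreshes that fall inside one rebalance window.

    Assumption: a refresh that happens while a rebalance is already in
    progress joins that rebalance instead of starting a new one.
    """
    rebalances = 0
    window_end = float("-inf")
    for t in sorted(refresh_times_s):
        if t > window_end:
            # This refresh adds the new topic to the consumer's subscription
            # and starts a new rebalance.
            rebalances += 1
            window_end = t + rebalance_duration_s
    return rebalances


def simulate(num_stale_consumers=90, metadata_max_age_s=300.0,
             rebalance_duration_s=3.0, seed=0):
    """Draw one refresh time per stale consumer and count the rebalances."""
    rng = random.Random(seed)
    refresh_times = [rng.uniform(0.0, metadata_max_age_s)
                     for _ in range(num_stale_consumers)]
    return count_rebalances(refresh_times, rebalance_duration_s)


if __name__ == "__main__":
    print("rebalances after T0:", simulate())
```

Under these assumptions the count can never exceed the number of stale consumers (90 in the example above), and it drops to 1 only when all refreshes fall inside a single rebalance window.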



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)