Posted to users@kafka.apache.org by xiaobo lee <le...@gmail.com> on 2014/03/21 03:55:56 UTC

rebalance problem when use High Level Consumer API

The consumer API example is here:
https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example

I read the source code of the consumer
[kafka.consumer.ZookeeperConsumerConnector] and found that the rebalance is
performed inside each consumer. Suppose consumer A is watching the
/consumers/[group_id]/ids path and sees that a new consumer B has joined, so
A and B both rebalance. What happens if consumer C also joins before that
rebalance has finished and triggers yet another rebalance?

My concern is that when many consumers are started, every consumer will
rebalance again and again, which is inefficient.

I want to try to implement a centralized rebalance manager, but that would
not solve the problem of too many rebalances when consumers join one by
one. Can anyone help me?

Re: rebalance problem when use High Level Consumer API

Posted by Neha Narkhede <ne...@gmail.com>.
ZookeeperConsumerConnector is actually smart enough to avoid doing n
rebalances when n consumers start one after the other in quick succession.
It queues up requests for more rebalances while the current rebalance is in
progress, effectively reducing the number of rebalance attempts. Look for
watcherExecutorThread in ZookeeperConsumerConnector.

Thanks,
Neha
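
For readers following along, the queuing idea in the reply above can be
illustrated with a minimal sketch. This is not the Kafka source: the class
CoalescingRebalancer and the methods membershipChanged() and shutdown() are
hypothetical names chosen for illustration, and the sketch assumes the
mechanism amounts to a single background thread plus a "pending" flag.
Notifications that arrive while a rebalance is running only set the flag, so
any number of them lead to at most one follow-up rebalance.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: collapses bursts of membership changes into few rebalances. */
final class CoalescingRebalancer {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private final Runnable rebalance;   // the expensive operation to run
    private final Thread watcher;       // single thread that performs rebalances
    private boolean pending = false;    // true if a change arrived since the last run started
    private boolean running = true;

    CoalescingRebalancer(Runnable rebalance) {
        this.rebalance = rebalance;
        this.watcher = new Thread(this::loop, "watcher-executor-sketch");
        this.watcher.setDaemon(true);
        this.watcher.start();
    }

    /** Assumed to be called from a membership watcher; only sets a flag and returns. */
    void membershipChanged() {
        lock.lock();
        try {
            pending = true;     // repeated calls overwrite the same flag
            changed.signal();
        } finally {
            lock.unlock();
        }
    }

    /** Stops the watcher thread; a change still pending at this point is dropped. */
    void shutdown() {
        lock.lock();
        try {
            running = false;
            changed.signal();
        } finally {
            lock.unlock();
        }
        try {
            watcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void loop() {
        while (true) {
            lock.lock();
            try {
                while (running && !pending) {
                    changed.awaitUninterruptibly();
                }
                if (!running) {
                    return;
                }
                pending = false;    // consume every change seen so far in one go
            } finally {
                lock.unlock();
            }
            rebalance.run();        // runs outside the lock so notifications never block
        }
    }
}
```

With this shape, if ten consumers join while one rebalance is in progress,
the ten calls to membershipChanged() result in a single additional rebalance
once the current one completes, rather than ten.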

