Posted to users@kafka.apache.org by Cédric Chantepie <c....@yahoo.fr.INVALID> on 2017/05/17 23:59:06 UTC

Heartbeat failed w/ multi-group/consumers

Hi,

I have a test app using the Java consumer library with Kafka 0.10, storing offsets in Kafka.

This app manages 190 consumers across 19 different consumer groups, against 12 distinct topics (details below).

When a single app instance starts, with 40 partitions per topic, it takes about 1 minute to reach stable partition assignments for all the workers: a long time, but perhaps understandable.

When two instances of the app are started (same Kafka server setup), they go through an infinite rebalancing loop and consume nothing.

The test app only prints data for now, with no long processing. I've tried several settings (session.timeout, heartbeat, client.id, ...), but could not fix this.
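
For reference, the settings mentioned above correspond to the consumer properties `session.timeout.ms`, `heartbeat.interval.ms`, `client.id` and `group.id`. The sketch below builds them with plain `java.util.Properties`. The values, the `ConsumerSettings` class, the broker address and the `client.id` naming scheme are illustrative assumptions, not the configuration actually used in the test app. It checks one documented rule of thumb: the heartbeat interval must be lower than the session timeout, and is typically set to no more than a third of it.

```java
import java.util.Properties;

// Hypothetical helper: builds the properties for one consumer of one group.
final class ConsumerSettings {
    static Properties forConsumer(String groupId, int index,
                                  int sessionTimeoutMs, int heartbeatIntervalMs) {
        // Kafka requires heartbeat.interval.ms < session.timeout.ms;
        // this sketch enforces the stricter, commonly recommended 1/3 ratio.
        if (heartbeatIntervalMs * 3 > sessionTimeoutMs) {
            throw new IllegalArgumentException(
                "heartbeat.interval.ms should be at most 1/3 of session.timeout.ms");
        }
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed address
        props.setProperty("group.id", groupId);
        props.setProperty("client.id", groupId + "-" + index);    // assumed naming scheme
        props.setProperty("session.timeout.ms", Integer.toString(sessionTimeoutMs));
        props.setProperty("heartbeat.interval.ms", Integer.toString(heartbeatIntervalMs));
        return props;
    }

    public static void main(String[] args) {
        Properties p = forConsumer("worker-kind-1", 0, 30000, 3000);
        System.out.println(p.getProperty("client.id"));
    }
}
```

The resulting `Properties` object could be passed to a `KafkaConsumer` constructor together with the deserializer settings, which are omitted here.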

Is there a known issue related to such a case?

In more detail, this app simulates consumption by several kinds of worker/processor, with one distinct consumer group per kind.

Different kinds of worker, and therefore different consumer groups, may subscribe to the same topic.

The app therefore uses 19 distinct consumer groups (1 per simulated worker kind), against 12 different topics.

Moreover, for each kind of worker, 10 consumer instances are created and subscribe to the appropriate topic using the same consumer group.
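
The numbers described above can be checked with a little arithmetic. This is only a sketch: the `GroupLayout` class is hypothetical, and it assumes that each group reads a single 40-partition topic and that partitions are split evenly among the members of a group.

```java
// Hypothetical helper reproducing the layout described in this message.
final class GroupLayout {
    // Total consumers = groups x consumers per group x app instances.
    static int totalConsumers(int groups, int consumersPerGroup, int instances) {
        return groups * consumersPerGroup * instances;
    }

    // Partitions of one topic held by each member, assuming an even split.
    static int partitionsPerMember(int partitions, int membersInGroup) {
        return partitions / membersInGroup;
    }

    public static void main(String[] args) {
        System.out.println(totalConsumers(19, 10, 1));    // one instance: 190
        System.out.println(totalConsumers(19, 10, 2));    // two instances: 380
        System.out.println(partitionsPerMember(40, 10));  // one instance: 4
        System.out.println(partitionsPerMember(40, 20));  // two instances: 2
    }
}
```

Starting the second instance therefore doubles the membership of every one of the 19 groups, so all of them have to rebalance.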

Re: Heartbeat failed w/ multi-group/consumers

Posted by Guozhang Wang <wa...@gmail.com>.
Do you have any heavy logic in the rebalance callback? Some known
issues have been fixed in later versions, such as:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-62%3A+Allow+consumer+to+send+heartbeats+from+a+background+thread

(so that the consumer does not fall out of the group during rebalance callbacks)

https://cwiki.apache.org/confluence/display/KAFKA/KIP-134%3A+Delay+initial+consumer+group+rebalance

(avoid triggering the rebalance immediately upon getting the first member
of a new group)
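
For readers on newer versions, both KIPs surface as configuration: KIP-62 added the consumer setting `max.poll.interval.ms` (clients 0.10.1 and later), and KIP-134 added the broker setting `group.initial.rebalance.delay.ms` (brokers 0.11.0 and later). A minimal sketch follows, in which the `RebalanceTuning` class is hypothetical and any values passed in are illustrative, not recommendations:

```java
import java.util.Properties;

// Hypothetical helper grouping the settings introduced by the two KIPs.
final class RebalanceTuning {
    // Consumer side (KIP-62): heartbeats are sent from a background thread and
    // checked against session.timeout.ms, while max.poll.interval.ms bounds
    // the time allowed between two poll() calls.
    static Properties consumerProps(int sessionTimeoutMs, int maxPollIntervalMs) {
        Properties props = new Properties();
        props.setProperty("session.timeout.ms", Integer.toString(sessionTimeoutMs));
        props.setProperty("max.poll.interval.ms", Integer.toString(maxPollIntervalMs));
        return props;
    }

    // Broker side (KIP-134): this key belongs in the broker configuration
    // (server.properties), not in the consumer configuration.
    static Properties brokerProps(int initialRebalanceDelayMs) {
        Properties props = new Properties();
        props.setProperty("group.initial.rebalance.delay.ms",
                          Integer.toString(initialRebalanceDelayMs));
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps(10000, 300000));
        System.out.println(brokerProps(3000));
    }
}
```

Neither setting exists in 0.10.0, so using them implies upgrading the clients and the brokers respectively.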


Guozhang


On Wed, May 17, 2017 at 8:33 PM, Abhimanyu Nagrath <
abhimanyunagrath@gmail.com> wrote:

> I am also facing the same issue.
>
>
> Regards,
> Abhimanyu

Re: Heartbeat failed w/ multi-group/consumers

Posted by Abhimanyu Nagrath <ab...@gmail.com>.
I am also facing the same issue.


Regards,
Abhimanyu
