Posted to users@kafka.apache.org by Edvard Fagerholm <ed...@gmail.com> on 2023/06/08 10:29:16 UTC

Partition stickiness on consumer group rebalance?

Hello,

I couldn't find an answer in the documentation to the following. If a new
machine joins a consumer group and Kafka triggers a rebalance, will it
randomly reassign partitions or will it hand over partitions from existing
consumers to the newly joined one? In other words, will it attempt to move
as few partitions as possible between consumers?

The main implications concern local in-memory caches when scaling up the
number of machines in a consumer group, since scaling up will require
dropping the local caches for any partitions that were moved. This
would cause a load spike on any DBs that are being cached on the consumers.
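
To make the cache concern concrete, here is a minimal sketch of a per-partition cache that drops entries only for the partitions a consumer loses. The class PartitionCaches and its methods are hypothetical names for illustration, and partitions are plain integers here. In the real Java client the eviction would typically be driven from ConsumerRebalanceListener.onPartitionsRevoked, which is not shown.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionCaches {
    // partition number -> (key -> cached value); hypothetical layout
    private final Map<Integer, Map<String, String>> caches = new HashMap<>();

    // Stores a value in the cache that belongs to the given partition.
    public void put(int partition, String key, String value) {
        caches.computeIfAbsent(partition, p -> new HashMap<>()).put(key, value);
    }

    // Drops the caches of revoked partitions only; all others stay warm.
    public void onRevoked(Collection<Integer> revoked) {
        for (Integer partition : revoked) {
            caches.remove(partition);
        }
    }

    // Reports whether any cache exists for the given partition.
    public boolean isCached(int partition) {
        return caches.containsKey(partition);
    }

    public static void main(String[] args) {
        PartitionCaches caches = new PartitionCaches();
        caches.put(0, "user-1", "Alice");
        caches.put(1, "user-2", "Bob");
        caches.onRevoked(List.of(1)); // partition 1 moved to another consumer
        System.out.println(caches.isCached(0));
        System.out.println(caches.isCached(1));
    }
}
```

The fewer partitions a rebalance moves, the fewer of these caches have to be rebuilt from the DB.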

Best,
Edvard

Re: Partition stickiness on consumer group rebalance?

Posted by Andrew Grant <an...@gmail.com>.
Hey Edvard,

https://developer.confluent.io/learn-kafka/architecture/consumer-group-protocol/
is also a good read that might provide some more information.

Andrew


Re: Partition stickiness on consumer group rebalance?

Posted by Edvard Fagerholm <ed...@gmail.com>.
Hi Richard,

Thanks! That answers my question and Kafka also supports exactly what I'm
looking for.

Best,
Edvard


Re: Partition stickiness on consumer group rebalance?

Posted by Richard Bosch <ri...@axual.com>.
Hello Edvard,

This is handled by the consumer partition assignment strategy configuration.
The StickyAssignor and CooperativeStickyAssignor use an algorithm that
aims to leave assigned partitions with their current consumer.
You can read a bit about it here
https://kafka.apache.org/documentation/#consumerconfigs_partition.assignment.strategy
If you search for Kafka Consumer Assignor you can find a lot of
articles about this subject.

The nice thing about this is that the consumer controls this
behaviour; it is not fixed in the broker.
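
As a minimal sketch of that client-side setting, the consumer properties below select the CooperativeStickyAssignor through partition.assignment.strategy. The class name StickyConsumerConfig, the broker address, and the group id are placeholders for illustration. The properties are built with plain java.util.Properties so the snippet runs without the Kafka client on the classpath; in a real application they would be passed to the KafkaConsumer constructor.

```java
import java.util.Properties;

public class StickyConsumerConfig {
    // Builds consumer properties that opt in to the cooperative sticky assignor.
    public static Properties build(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // The assignor is chosen per consumer, not on the broker.
        props.setProperty("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.CooperativeStickyAssignor");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder address and group id, for illustration only.
        Properties props = build("localhost:9092", "cache-heavy-workers");
        System.out.println(props.getProperty("partition.assignment.strategy"));
    }
}
```

All members of a group should be configured with a compatible strategy, so a change like this is usually rolled out to every consumer in the group.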

Kind regards,

Richard Bosch
Developer Advocate
Axual BV
https://axual.com/


Re: Partition stickiness on consumer group rebalance?

Posted by Edvard Fagerholm <ed...@gmail.com>.
Sorry, but I know all of this and it doesn't answer the question. My
question is this: assume that I have 3 consumers (think of the KafkaConsumer
Java class):

Consumer 1 assigned to partitions {0, 1}
Consumer 2 assigned to partitions {2, 3}
Consumer 3 assigned to partitions {4, 5}

Now assume consumer 1 dies or leaves the consumer group for whatever reason.
Partitions are then rebalanced, and you could move only 0 and 1 to the
existing consumers:

Consumer 2 assigned to partitions {0, 2, 3}
Consumer 3 assigned to partitions {1, 4, 5}

However, you could also just shuffle the set {0, 1, 2, 3, 4, 5} and
randomly assign them, e.g.:

Consumer 2 assigned to partitions {1, 2, 5}
Consumer 3 assigned to partitions {0, 3, 4}

In the former case, only two partitions were moved to a new consumer, but
in the latter, 4 partitions were moved.
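
The two counts above can be checked with a small sketch that compares each partition's owner before and after the rebalance. The class MovedPartitions and the consumer labels c1, c2, c3 are made up for illustration; this is not how Kafka computes assignments, only a way to measure how many partitions changed hands.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MovedPartitions {
    // Inverts consumer -> partitions into partition -> consumer.
    static Map<Integer, String> ownerByPartition(Map<String, List<Integer>> assignment) {
        Map<Integer, String> owners = new HashMap<>();
        for (Map.Entry<String, List<Integer>> entry : assignment.entrySet()) {
            for (Integer partition : entry.getValue()) {
                owners.put(partition, entry.getKey());
            }
        }
        return owners;
    }

    // Counts partitions whose owner after the rebalance differs from before.
    public static int countMoved(Map<String, List<Integer>> before,
                                 Map<String, List<Integer>> after) {
        Map<Integer, String> oldOwners = ownerByPartition(before);
        int moved = 0;
        for (Map.Entry<Integer, String> entry : ownerByPartition(after).entrySet()) {
            if (!entry.getValue().equals(oldOwners.get(entry.getKey()))) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> before = Map.of(
                "c1", List.of(0, 1), "c2", List.of(2, 3), "c3", List.of(4, 5));
        Map<String, List<Integer>> sticky = Map.of(
                "c2", List.of(0, 2, 3), "c3", List.of(1, 4, 5));
        Map<String, List<Integer>> shuffled = Map.of(
                "c2", List.of(1, 2, 5), "c3", List.of(0, 3, 4));
        System.out.println(countMoved(before, sticky));   // 2
        System.out.println(countMoved(before, shuffled)); // 4
    }
}
```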

Does Kafka strive to do the former? Is there some form of algorithmic
guarantee for how the new partition assignment is computed, so I can build
a deployment strategy around it?

Best,
Edvard



Re: Partition stickiness on consumer group rebalance?

Posted by sunil chaudhari <su...@gmail.com>.
I will try to answer.
Rebalancing triggers when one or more consumers (clients) leave the group
for any reason.
The rule of thumb is that the number of partitions should equal the number
of consumer threads.
If there are 300 partitions assigned to one thread each, it won't rebalance
until some consumer is marked as dead.
How a consumer is marked as dead: Kafka doesn't receive a heartbeat from it
within the session timeout, or the consumer doesn't call poll for 5 mins
(both limits are defined on the client side).

If 20 consumers are dead from Kafka's perspective at time T1, it will
trigger a rebalance.
It will trigger a rebalance again at time T2, when a new consumer is added to
the group and sends a poll request.

If there is no issue with the number of partitions and the number of
consumers, it won't trigger a rebalance.
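
The client-side limits mentioned above map to consumer configs. A minimal sketch, assuming the standard keys session.timeout.ms, heartbeat.interval.ms, and max.poll.interval.ms; the class name LivenessConfig is made up and the values are illustrative rather than recommendations (300000 ms is the 5 minutes mentioned above).

```java
import java.util.Properties;

public class LivenessConfig {
    // Builds the consumer settings that decide when a member counts as dead.
    public static Properties build() {
        Properties props = new Properties();
        // No heartbeat within this window: the member is removed from the group.
        props.setProperty("session.timeout.ms", "45000");
        // How often the client sends heartbeats; kept well below the session timeout.
        props.setProperty("heartbeat.interval.ms", "3000");
        // Maximum gap between poll() calls before the member leaves the group.
        props.setProperty("max.poll.interval.ms", "300000");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build();
        long pollLimitMinutes =
                Long.parseLong(props.getProperty("max.poll.interval.ms")) / 60_000;
        System.out.println(pollLimitMinutes + " minutes between polls at most");
    }
}
```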

Terms I used may not be accurate😊

Regards,
Sunil.
