Posted to users@kafka.apache.org by Giselle Van Dongen <gi...@klarrio.com> on 2020/07/29 11:58:17 UTC

Partition assignment not well distributed over threads

We have a Kafka Streams (2.4) app consisting of 5 instances. It reads from a Kafka topic with 20 partitions (5 brokers). 

We notice that the partition assignment does not always distribute the load evenly over the different threads. We see this at startup as well as after a failed thread recovers.

1. At startup, some instances get a significantly lower load and sometimes even no load. It seems like instances that come up slightly later get no partitions assigned (because of sticky assignment?). 

2. When one thread (container) dies and comes back, it often receives few or no partitions to work on. We assume this has to do with the sticky assignment. Is there any way we can make this distribution more equal?

I was also wondering whether Kafka Streams takes into account colocation of Kafka brokers with stream processing threads when assigning partitions. Do the partitions on a broker get assigned to the streams thread that is colocated with it on the same machine?

Re: Partition assignment not well distributed over threads

Posted by Giselle Van Dongen <gi...@klarrio.com>.
Hey Sophie,

This was indeed the issue. An environment variable was being passed through incorrectly.
Thank you for the tip that made me check this.

Giselle



Re: Partition assignment not well distributed over threads

Posted by Sophie Blee-Goldman <so...@confluent.io>.
Hey Giselle,

How many stream threads is each instance configured with? If the total
number of threads across all instances exceeds the total number of tasks,
then some threads won't get any assigned tasks. There's a known bug where
tasks might not get evenly distributed over all instances in this scenario,
as Streams would only attempt to balance the tasks over the threads. See
KAFKA-9173 <https://issues.apache.org/jira/browse/KAFKA-9173>. Luckily, this
should be fixed in 2.6, which is just about to be released.
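
To make the threads-versus-tasks arithmetic concrete, here is a minimal
sketch. It is toy arithmetic only, not the Streams assignor. It assumes one
task per input partition (which holds for a topology with a single
sub-topology reading one topic), and the per-instance thread count of 8 is a
hypothetical value, since the actual setting is not stated in this thread.

```java
// Toy arithmetic only: this does not reproduce the Streams assignor.
final class IdleThreadEstimate {

    /** Number of threads that cannot receive a task when threads outnumber tasks. */
    static int idleThreads(int tasks, int instances, int threadsPerInstance) {
        int totalThreads = instances * threadsPerInstance;
        return Math.max(0, totalThreads - tasks);
    }

    public static void main(String[] args) {
        int tasks = 20;             // assumption: one task per input partition
        int instances = 5;          // from the original post
        int threadsPerInstance = 8; // hypothetical value, not from the thread
        // 5 * 8 = 40 threads for 20 tasks, so 20 threads stay idle.
        System.out.println(idleThreads(tasks, instances, threadsPerInstance));
    }
}
```

With these numbers half of the threads have nothing to do, which is the
situation in which the uneven per-instance distribution can show up.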

Instances that joined later, or restarted, would be more likely to have
these threads with no assigned tasks due to the stickiness optimization, as
you guessed.

If the problem you've run into is due to running more stream threads than
tasks, I would recommend just decreasing the number of threads per instance
to get a balanced assignment. This won't hurt performance in any way since
those extra threads would have just been sitting idle anyway. Or better yet,
upgrade to 2.6.
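
As a rough illustration of the advice above, the following sketch picks a
per-instance thread count so that the total number of threads does not
exceed the number of tasks, and puts it under the Streams config key
num.stream.threads. The class and method names are made up for this example,
and it again assumes one task per input partition. A real application would
pass these properties, along with the required settings, to KafkaStreams.

```java
import java.util.Properties;

// Hypothetical helper for illustration; not part of Kafka Streams.
final class ThreadCap {

    /** Floor of tasks / instances, but never less than one thread per instance. */
    static int maxThreadsPerInstance(int tasks, int instances) {
        if (tasks <= 0 || instances <= 0) {
            throw new IllegalArgumentException("tasks and instances must be positive");
        }
        return Math.max(1, tasks / instances);
    }

    /** Builds a Properties object holding only the stream thread count. */
    static Properties streamsProps(int tasks, int instances) {
        Properties props = new Properties();
        // "num.stream.threads" is the Kafka Streams setting for threads per instance.
        props.setProperty("num.stream.threads",
                Integer.toString(maxThreadsPerInstance(tasks, instances)));
        return props;
    }

    public static void main(String[] args) {
        // 20 tasks over 5 instances allows at most 4 threads per instance.
        System.out.println(streamsProps(20, 5).getProperty("num.stream.threads"));
    }
}
```

If the thread count is injected through an environment variable, it is worth
logging the value the application actually ends up with at startup.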

Regarding the colocation question: no, the assignment doesn't take that into
account at the moment. Typically Streams applications won't be running on
the same machine as the broker. Clearly it has been difficult enough to
optimize for two things at the same time, stickiness and balance, without
introducing a third :)
