Posted to users@kafka.apache.org by Steven Schlansker <ss...@opentable.com> on 2018/01/09 00:47:03 UTC

Re: [External] Topic/Partition Assignment in streams application cluster

For what it's worth, we run 32 partitions per topic and have also observed
unbalanced assignments, where a large number of topic A's partitions land on
worker 1 and a large number of topic B's partitions land on worker 2, leading
to uneven load.  Nothing super bad for us yet, but the effect is noticeable
even in non-degenerate configurations.
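To quantify how skewed an assignment actually is, each instance can report its assigned partitions (e.g. via KafkaStreams#localThreadsMetadata()) and the results can be tallied in one place. Below is a minimal sketch of that tally; the class and method names (AssignmentSkew, partitionsPerInstance) are hypothetical, and plain String pairs stand in for the real thread/task metadata objects:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: given (instance, topic-partition) pairs collected
// from each Streams instance, count how many partitions each instance holds
// so skew across workers is easy to spot.
class AssignmentSkew {
    static Map<String, Integer> partitionsPerInstance(List<String[]> assignments) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] a : assignments) {
            // a[0] = instance name, a[1] = topic-partition like "A-0"
            counts.merge(a[0], 1, Integer::sum);
        }
        return counts;
    }
}
```

A ratio of max to min count per instance then gives a single skew number to watch over time.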

> On Jan 6, 2018, at 10:16 AM, Matthias J. Sax <ma...@confluent.io> wrote:
> 
> The assignment strategy cannot be configured in Kafka Streams atm.
> 
> How many partitions do you have in your topics? If I read in-between the
> lines, it seems that all topics have just one partition? For this case,
> it will be hard to scale out, as Kafka Streams scales via the number of
> partitions... The maximum number of partitions of your input topics is a
> limiting factor for horizontal scale out.
> 
> https://docs.confluent.io/current/streams/architecture.html
> 
> -Matthias
> 
> On 1/6/18 4:38 AM, Karthik Selva wrote:
>> Hi,
>> 
>> I am using Kafka Streams for one of our applications. The application has
>> several types of topics, such as the initial ingestion topics and the live
>> stream topic, all sharing the same state in a continuous fashion.
>> 
>> My problem is with the assignment of these topics/partitions: I am
>> observing that all the ingestion topics are assigned to instance A of the
>> cluster (A, B, C), the live stream topics are assigned to instance B, and
>> the other types of topics are assigned to instance C. Due to this, the load
>> is not distributed across the 3 instances and always falls on one instance
>> in the cluster.
>> 
>> Is there any way for me to give weights to the topics, or to always
>> distribute an equal number of partitions (of each topic) to each instance?
>> Also, how will I handle this scenario when I need to scale the cluster
>> up/down?
>> 
>
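As a concrete illustration of the scaling limit Matthias describes: Kafka Streams creates one task per input partition (per sub-topology), so the largest partition count among the input topics caps how many instances can do useful work. A rough sketch of that rule, with hypothetical names (MaxParallelism, maxTasks):

```java
import java.util.Map;

// Sketch of the Streams scaling rule: tasks are created per input partition,
// so the maximum useful parallelism for a sub-topology is the largest
// partition count among its input topics. With single-partition topics,
// only one instance can ever be busy per sub-topology.
class MaxParallelism {
    static int maxTasks(Map<String, Integer> partitionsPerInputTopic) {
        return partitionsPerInputTopic.values().stream()
                .max(Integer::compare)
                .orElse(0);
    }
}
```

This is why repartitioning the input topics to more partitions (and keying the data so it spreads across them) is the usual prerequisite before adding instances helps.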