Posted to users@kafka.apache.org by Karthik Selva <ka...@julysystems.com> on 2018/01/06 12:38:34 UTC

Topic/Partition Assignment in streams application cluster

Hi,

I am using Kafka Streams for one of our applications. The application has
several types of topics, such as the initial ingestion topics and the live
stream topic, all sharing the same state in a continuous fashion.

My problem is with the assignment of these topics/partitions: I am
observing that all the ingestion topics are assigned to instance A of the
cluster (A, B, C), the live stream topics are assigned to instance B, and
the other types of topics are assigned to instance C. Because of this, the
load is not distributed across the 3 instances and always goes to just one
of the instances in the cluster.

Is there any way for me to give weights to the topics, or to always
distribute an equal number of partitions (of each topic) to each
instance? Also, how should I handle this scenario when I need to scale
the cluster up or down?


Re: Topic/Partition Assignment in streams application cluster

Posted by "Matthias J. Sax" <ma...@confluent.io>.
The assignment strategy cannot be configured in Kafka Streams at the moment.

How many partitions do you have in your topics? If I read between the
lines, it seems that all topics have just one partition? In that case,
it will be hard to scale out, as Kafka Streams scales via the number of
partitions. The maximum number of partitions of your input topics is a
limiting factor for horizontal scale-out.
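
To make the limit concrete, here is a rough sketch of the arithmetic. It
assumes that each sub-topology gets one task per partition of its
largest input topic and that a thread without a task sits idle. The
function names, topic names, and partition counts are hypothetical and
only illustrate the point; this is not Kafka Streams code.

```python
def task_count(sub_topologies):
    """Total tasks: for each sub-topology, the max partition count of its input topics."""
    return sum(max(topics.values()) for topics in sub_topologies)


def idle_threads(sub_topologies, instances, threads_per_instance=1):
    """Threads left with no task once every task has been handed out."""
    total_threads = instances * threads_per_instance
    return max(0, total_threads - task_count(sub_topologies))


# Hypothetical layout: one sub-topology reading two single-partition topics.
single_partition = [{"ingestion": 1, "live-stream": 1}]
print(task_count(single_partition))       # 1
print(idle_threads(single_partition, 3))  # 2

# Same layout with 6 partitions per topic: all 3 instances can get work.
six_partitions = [{"ingestion": 6, "live-stream": 6}]
print(idle_threads(six_partitions, 3))    # 0
```

With single-partition topics, adding instances beyond the task count
does not add parallelism in this model.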

https://docs.confluent.io/current/streams/architecture.html

-Matthias



Re: [External] Topic/Partition Assignment in streams application cluster

Posted by Steven Schlansker <ss...@opentable.com>.
For what it's worth, we run 32 partitions per topic and have also observed
imbalanced assignment, where a large number of A partitions are assigned to
worker 1 and a large number of B partitions are assigned to worker 2, leading
to imbalanced load. Nothing super bad for us yet, but the effect is noticeable
even in non-degenerate configurations.
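
A toy simulation of the effect, for illustration only: it is not the
Kafka Streams assignor, and the two strategies, the worker names, and
the per-partition weights (topic A ten times heavier than topic B) are
all made-up assumptions. It shows how two workers can hold the same
number of partitions and still carry very different load.

```python
def contiguous(tasks, workers):
    """Give each worker one contiguous slice of the task list."""
    size = -(-len(tasks) // len(workers))  # ceiling division
    return {w: tasks[i * size:(i + 1) * size] for i, w in enumerate(workers)}


def interleaved(tasks, workers):
    """Deal tasks out one at a time, so each topic is spread over all workers."""
    assignment = {w: [] for w in workers}
    for i, task in enumerate(tasks):
        assignment[workers[i % len(workers)]].append(task)
    return assignment


def load(assignment, weight):
    """Sum the hypothetical per-partition weight of every task on each worker."""
    return {w: sum(weight[topic] for topic, _ in ts) for w, ts in assignment.items()}


# 32 partitions per topic, as in the setup described above.
tasks = [(topic, p) for topic in ("A", "B") for p in range(32)]
workers = ["worker-1", "worker-2"]
weight = {"A": 10, "B": 1}  # assumed relative traffic per partition

print(load(contiguous(tasks, workers), weight))   # {'worker-1': 320, 'worker-2': 32}
print(load(interleaved(tasks, workers), weight))  # {'worker-1': 176, 'worker-2': 176}
```

Both strategies give each worker 32 partitions; only the second spreads
each topic evenly, which is the behaviour asked about in the original
question.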

> On Jan 6, 2018, at 10:16 AM, Matthias J. Sax <ma...@confluent.io> wrote:
> 
> The assignment strategy cannot be configured in Kafka Streams atm.
> 
> How many partitions do you have in your topics? If I read in-between the
> lines, it seems that all topics have just one partition? For this case,
> it will be hard to scale out, as Kafka Streams scales via the number of
> partitions... The maximum number of partitions of your input topics is a
> limiting factor for horizontal scale out.
> 
> https://docs.confluent.io/current/streams/architecture.html
> 
> -Matthias
> 

