Posted to users@kafka.apache.org by girija arumugam <gi...@gmail.com> on 2020/12/01 10:52:04 UTC
Regarding Fairness in choosing topic and partition for an event
Team,
We have a use case in which a user in an organisation can send data at
different rates for one event type: sending mail. These events are published
to Kafka and consumed by our application for processing. We need to
maintain the order of the events produced by each user in an org.
In our case, the growth factors are:
1. Organisations - from 1 to a million.
2. Users per org - from 1 to a million.
*Topic Design :*
1. Fixed number of topics with a fixed number of partitions: 10 topics
with 10 partitions
2. Single topic with N partitions
With either design, we are not able to provide fairness to all the
events/messages.
For example, say there are two orgs, o1 and o2, each with 2 users: u1 and u2
belong to o1, and u3 and u4 belong to o2. I have a single topic with 5
partitions.
Assume the following partition assignment, produce rate, and state for
each user:
- o1,u1 -> 0th partition -> produces data at a rate of 1000
messages/sec -> hyperactive
- o1,u2 -> 1st partition -> produces data at a rate of 10 messages/sec
-> normal
- o2,u3 -> 0th partition -> produces data at a rate of 5 messages/min
-> normal
- o2,u4 -> 2nd partition -> produces data at a rate of 50 messages/sec
-> normal
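For illustration, here is a minimal sketch of the kind of key-based partition logic assumed in this example. The function name `choose_partition`, the "org:user" key format, and the use of CRC32 are assumptions made for the sketch. They are not our production code, and CRC32 is not the hash Kafka's default partitioner uses, so the partition numbers it produces will not match the assignment listed above. The point is only that a fixed key always maps to the same partition, which keeps one user's events in order, and that two unrelated users can land on the same partition.

```python
import zlib

NUM_PARTITIONS = 5  # single topic with 5 partitions, as in the example


def choose_partition(org_id: str, user_id: str,
                     num_partitions: int = NUM_PARTITIONS) -> int:
    """Map an (org, user) pair to a partition using a stable hash.

    CRC32 is a stand-in chosen for this sketch; it is not the hash
    used by Kafka's own default partitioner.
    """
    key = f"{org_id}:{user_id}".encode("utf-8")
    return zlib.crc32(key) % num_partitions


if __name__ == "__main__":
    # The same key always yields the same partition, so per-user order holds.
    for org_id, user_id in [("o1", "u1"), ("o1", "u2"), ("o2", "u3"), ("o2", "u4")]:
        print(org_id, user_id, "->", choose_partition(org_id, user_id))
```

With far more (org, user) keys than partitions, collisions such as o1,u1 and o2,u3 sharing a partition are unavoidable under any scheme of this kind.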
Here, o1,u1 and o2,u3 are producing data to the same partition at
different rates, and o1,u1 uses the 0th partition far more than o2,u3.
In this case, I want to be fair to both o1,u1 and o2,u3. So, when a
message arrives from o2,u3, I need to give that message a higher
priority.
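To put numbers on the imbalance, the short calculation below uses only the rates from the example, converted to messages per minute. The names `rates_per_min` and `partition_share` are made up for this sketch.

```python
# Produce rates on the 0th partition, normalised to messages per minute.
rates_per_min = {
    ("o1", "u1"): 1000 * 60,  # 1000 messages/sec
    ("o2", "u3"): 5,          # 5 messages/min
}


def partition_share(rates):
    """Return each producer's fraction of the partition's total traffic."""
    total = sum(rates.values())
    return {key: rate / total for key, rate in rates.items()}


shares = partition_share(rates_per_min)
ratio = rates_per_min[("o1", "u1")] // rates_per_min[("o2", "u3")]

for key, share in shares.items():
    print(key, f"{share:.4%}")
print("o1,u1 messages per o2,u3 message:", ratio)
```

In this example, o1,u1 writes 12000 messages to the 0th partition for every one written by o2,u3, so a consumer reading the partition strictly in offset order spends almost all of its time on o1,u1.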
I would like to design the topics and partitions in such a way that every
message produced is treated equally, and the messages from any one user in
an org are kept in order.
*Questions:*
1. How should we design the topic(s) for an event to give fairness in terms
of orgs as well as users?
2. We know that, in the end, each event will reside on one of the
partitions in the topic. How should we choose a partition for an event
created by a user in an org? How can we be fair in choosing a partition
for a user?
3. *How can we ensure that one or more hyperactive users from an org do
not affect the others?*
Could anyone please guide me towards a design?
I have referred to the following:
1. https://www.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster/
2. https://www.confluent.io/blog/put-several-event-types-kafka-topic/
Regards,
Girija A.
Re: Regarding Fairness in choosing topic and partition for an event
Posted by Gowtham S <go...@gmail.com>.
Team, we face the same challenges at our end. Could anyone please guide us?
With regards,
Gowtham S.