Posted to users@kafka.apache.org by girija arumugam <gi...@gmail.com> on 2020/12/01 10:52:04 UTC

Regarding Fairness in choosing topic and partition for an event

Team,
We have a use case in which a user in an organisation can send data at
different rates for one event type: sending mail. These events are published
to Kafka and consumed by our application for processing. We need to preserve
the order of the events produced by each user within an org.

In our case, the growth factors are:

   1. Organisations - 1 to a million.
   2. Each org can have 1 to a million users.

*Topic Design :*

   1. Fixed number of topics with a fixed number of partitions, e.g. 10 topics
   with 10 partitions
   2. Single topic with N partitions

In these designs, we are not able to provide fairness to all the
events/messages.

For example:

     Say there are two orgs, o1 and o2, each with 2 users: u1 and u2
belong to o1, and u3 and u4 belong to o2. I have a single topic with 5
partitions.

    Assume the following partition assignment, produce rate, and state for
each user:

   - o1,u1 -> 0th partition -> produce data at the rate of 1000
   messages/sec -> hyperactive
   - o1,u2 -> 1st partition -> produce data at the rate of 10 messages/sec
   -> normal
   - o2,u3 -> 0th partition -> produce data at the rate of 5 messages/min
   -> normal
   - o2,u4 -> 2nd partition -> produce data at the rate of 50 messages/sec
   -> normal

Here, o1,u1 and o2,u3 produce to the same partition at different rates, and
I want to be fair to both. o1,u1 uses the 0th partition far more than o2,u3,
so when a message from o2,u3 arrives, I need to give it a higher priority.
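
A minimal sketch of the collision described above, in Python. The partition
assignments and rates are copied from the example; the names partition_for,
load_per_partition and share_of_partition are hypothetical, and CRC32 is only
a stand-in for a stable key hash (Kafka's default partitioner hashes the key
bytes with murmur2). It illustrates the problem, not a solution.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 5  # single topic with 5 partitions, as in the example


def partition_for(org: str, user: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Hypothetical key-based partitioner: stable hash of "org:user" modulo
    the partition count. The same (org, user) always lands on the same
    partition, which is what preserves per-user ordering."""
    key = f"{org}:{user}".encode("utf-8")
    return zlib.crc32(key) % num_partitions


# (org, user) -> (partition, produce rate in messages/sec), taken from the
# example above. These are the assumed assignments, not the output of
# partition_for.
EXAMPLE = {
    ("o1", "u1"): (0, 1000.0),   # hyperactive
    ("o1", "u2"): (1, 10.0),
    ("o2", "u3"): (0, 5 / 60),   # 5 messages/min
    ("o2", "u4"): (2, 50.0),
}


def load_per_partition(assignments):
    """Sum the produce rates of all users that share each partition."""
    load = defaultdict(float)
    for partition, rate in assignments.values():
        load[partition] += rate
    return dict(load)


def share_of_partition(assignments, key):
    """Fraction of a partition's traffic that belongs to one (org, user)."""
    partition, rate = assignments[key]
    return rate / load_per_partition(assignments)[partition]


if __name__ == "__main__":
    print(load_per_partition(EXAMPLE))
    print(share_of_partition(EXAMPLE, ("o2", "u3")))
```

With these numbers, almost all traffic on the 0th partition belongs to o1,u1,
so a consumer reading that partition in offset order reaches each message from
o2,u3 only after the o1,u1 messages queued ahead of it.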

I would like to design the topic and its partitions so that every message
produced is treated equally, while the messages from any one user in an org
stay in order.

*Questions:*

   1. How should we design a topic for an event so that it is fair to orgs
   as well as users?
   2. In the end, each event resides on one partition of the topic. How
   should we choose a partition for an event created by a user in an org,
   and how can that choice be made fair to each user?
   3. *How can we ensure that one or more hyperactive users in an org do not
   affect the others?*


Could anyone please guide me towards a design?

I have gone through the following references:

   1.
   https://www.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster/
   2. https://www.confluent.io/blog/put-several-event-types-kafka-topic/


Regards,
Girija A.

Re: Regarding Fairness in choosing topic and partition for an event

Posted by Gowtham S <go...@gmail.com>.
Team, we face the same challenges on our end. Could anyone please guide us?


With regards,
Gowtham S.

