Posted to dev@kafka.apache.org by Maria Pilar <pi...@gmail.com> on 2018/01/29 16:58:17 UTC

Choose the number of partitions/topics

Hi everyone

I have designed an integration between two systems through our Kafka
streaming API, and the requirements are not clear enough to choose the
right number of partitions and topics.

This is the use case:

My producer will send 28 different types of events, so I have decided to
create 28 topics.

The maximum size of one message will be 4,096 bytes and the total volume
will be 2,469.888 MB/day.

The retention will be 2 days.

By default I'm thinking of one partition per topic, which according to
Confluent's recommendation can handle 10 MB/second.
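
As a rough sanity check of the figures above, here is a minimal sketch of the arithmetic. It assumes that 1 MB is 1,000,000 bytes, that the daily volume is the total across all topics, and that every message has the maximum size; the 10 MB/second figure is the one quoted above, not a measurement. The result is an average only, so peaks may be much higher.

```python
# Back-of-the-envelope load estimate from the figures in this message.
# Assumptions: 1 MB = 1,000,000 bytes, every message is the maximum size,
# and the daily volume is the total across all 28 topics.
MAX_MESSAGE_BYTES = 4096
DAILY_VOLUME_MB = 2469.888
RETENTION_DAYS = 2
SINGLE_PARTITION_MB_PER_S = 10.0  # figure quoted in the message, not measured

SECONDS_PER_DAY = 24 * 60 * 60

# Average rates over a whole day (traffic peaks are not modelled).
avg_mb_per_s = DAILY_VOLUME_MB / SECONDS_PER_DAY
messages_per_day = DAILY_VOLUME_MB * 1_000_000 / MAX_MESSAGE_BYTES
avg_messages_per_s = messages_per_day / SECONDS_PER_DAY

# Data kept on disk per replica, before compression and index overhead.
retained_mb = DAILY_VOLUME_MB * RETENTION_DAYS

# Two days expressed in milliseconds, the unit used by the topic
# configuration retention.ms.
retention_ms = RETENTION_DAYS * SECONDS_PER_DAY * 1000

# How many times the average load fits into the quoted single-partition rate.
headroom = SINGLE_PARTITION_MB_PER_S / avg_mb_per_s

print(f"average throughput : {avg_mb_per_s:.4f} MB/s")
print(f"average message rate: {avg_messages_per_s:.2f} messages/s")
print(f"retained data      : {retained_mb:.1f} MB per replica")
print(f"retention.ms       : {retention_ms}")
print(f"headroom           : about {headroom:.0f}x")
```

Under these assumptions the average load is far below the quoted single-partition rate, so raw throughput alone would not seem to force more than one partition per topic.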

However, the requirement for the consumer is minimal latency (under 3
seconds), so I am thinking of creating more leader partitions per topic to
parallelize and achieve the throughput.

Do you know the best practice or formula for defining it properly?

Thanks

Re: Choose the number of partitions/topics

Posted by "Chicolo, Robert (rchicolo@student.cccs.edu)" <rc...@student.cccs.edu>.
So it goes beyond the throughput that Kafka can support. You have to decide what degree of parallelism your application can support. If processing one message depends on the processing of another message, that limits the degree to which you can process in parallel. How far the stream can be parallelized depends on how long processing each message takes and on the desired response times.
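
A minimal sketch of how these considerations might be turned into a number. It combines a commonly cited rule of thumb (measure the per-partition produce rate p and consume rate c, then use at least max(t/p, t/c) partitions for a target throughput t) with a processing-time estimate. The function names, the 0.5 utilization target, and the sample figures are hypothetical and chosen only for illustration.

```python
import math


def partitions_for_throughput(target_mb_s, producer_mb_s, consumer_mb_s):
    """Rule of thumb: at least max(t/p, t/c) partitions, never fewer than 1."""
    needed = max(target_mb_s / producer_mb_s, target_mb_s / consumer_mb_s)
    return max(1, math.ceil(needed))


def consumers_for_processing(messages_per_s, seconds_per_message,
                             max_utilization=0.5):
    """Consumers needed so that each one stays below max_utilization.

    max_utilization is a hypothetical headroom factor; lower values leave
    more spare capacity for bursts and therefore help latency.
    """
    busy = messages_per_s * seconds_per_message  # average consumers kept busy
    return max(1, math.ceil(busy / max_utilization))


def suggested_partitions(target_mb_s, producer_mb_s, consumer_mb_s,
                         messages_per_s, seconds_per_message):
    """Take the larger of the throughput-based and processing-based counts.

    Within one consumer group a partition is read by at most one consumer,
    so the partition count caps the usable consumer parallelism.
    """
    return max(
        partitions_for_throughput(target_mb_s, producer_mb_s, consumer_mb_s),
        consumers_for_processing(messages_per_s, seconds_per_message),
    )


# Hypothetical example: 0.03 MB/s target, 10 MB/s per partition in both
# directions, 7 messages/s, and 0.25 s of processing per message.
print(suggested_partitions(0.03, 10.0, 10.0, 7, 0.25))
```

Because Kafka preserves ordering only within a partition, messages that depend on each other would need to share a key so that they land on the same partition, which is the limit on parallelism described above.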
