Posted to users@kafka.apache.org by Craig Pastro <si...@gmail.com> on 2019/11/29 03:10:23 UTC

More partitions => less throughput?

Hello there,

I was wondering if anyone here could help me with some insight into a
conundrum that I am facing.

Basically, the story is that I am running three Kafka brokers via Docker on
a single VM with log.flush.interval.messages = 1 and min.insync.replicas =
2. Then I create two topics, both with replication factor = 3, but one with
one partition and the other with 64.
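(For concreteness, creating the two topics roughly as described might look like the sketch below, using the Java AdminClient; the topic names and the bootstrap address are placeholders rather than the exact ones used in the benchmark repo linked further down.)

import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateBenchmarkTopics {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Replicate both topics to all three brokers; min.insync.replicas = 2
            // mirrors the broker-level setting described above.
            Map<String, String> topicConfig = Map.of("min.insync.replicas", "2");

            NewTopic onePartition = new NewTopic("bench-1p", 1, (short) 3).configs(topicConfig);
            NewTopic manyPartitions = new NewTopic("bench-64p", 64, (short) 3).configs(topicConfig);

            admin.createTopics(List.of(onePartition, manyPartitions)).all().get();
        }
    }
}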

Then I try to run a benchmark using these topics and what I find is as
follows:

1 partition, 1381.02 records/sec,  685.87 ms average latency
64 partitions, 601.00 records/sec, 1298.18 ms average latency

This is the opposite of what I expected. In neither case am I anywhere
close to the IOPS that the disk can handle. So what I would like to know is
whether there is an obvious reason for the slowdown with more partitions
that I am missing.

If it is helpful the docker-compose file and the code to do the
benchmarking can be found at https://github.com/siyopao/kafka-benchmark.
(Any comments or advice on how to make the code better are greatly
appreciated!) The benchmarking code is inspired by and very similar to what
the bin/kafka-producer-perf-test.sh script does.
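(For anyone who does not want to click through, the core of such a producer benchmark is roughly the following sketch; the topic name, record count, and payload size are arbitrary placeholders, and the repo above remains the authoritative version.)

import java.util.Properties;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class SimpleProducerBench {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        props.put(ProducerConfig.ACKS_CONFIG, "all"); // wait for min.insync.replicas acks

        String topic = "bench-64p";      // or "bench-1p"
        int numRecords = 100_000;
        byte[] payload = new byte[1024]; // 1 KiB messages

        AtomicLong totalLatencyMs = new AtomicLong();
        long start = System.currentTimeMillis();

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < numRecords; i++) {
                long sendTime = System.currentTimeMillis();
                producer.send(new ProducerRecord<>(topic, payload), (metadata, exception) -> {
                    // Accumulate per-record latency; errors are simply skipped in this sketch.
                    if (exception == null) {
                        totalLatencyMs.addAndGet(System.currentTimeMillis() - sendTime);
                    }
                });
            }
            producer.flush(); // block until all in-flight records have completed
        }

        double elapsedSec = (System.currentTimeMillis() - start) / 1000.0;
        System.out.printf("%.2f records/sec, %.2f ms average latency%n",
                numRecords / elapsedSec, totalLatencyMs.get() / (double) numRecords);
    }
}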

Thank you!

Best wishes,
Craig

Re: More partitions => less throughput?

Posted by Craig Pastro <si...@gmail.com>.
Dear Tom, Peter and Eric,

Thank you very much for your answers!

I think I need to play around with more configurations. I honestly had not
considered 64 partitions on a single box to be very large. Thank you!

I had thought that with log.flush.interval.messages = 1 I could send a lot
of records to the brokers and have the throughput determined by the IOPS of
the disk, but that was not the case at all. It does not seem to be bound by
CPU, memory, or network either, so I am wondering what is throttling the
throughput... In any case, I'll create some new environments and play
around a bit more.
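(One way to narrow down what is throttling things might be to dump a few of the standard Java producer metrics after a run, roughly as in the sketch below; the metric names come from the producer's "producer-metrics" group.)

import java.util.Set;
import org.apache.kafka.clients.producer.Producer;

public class ProducerMetricsDump {
    // Metrics that hint at where produce latency accumulates:
    //   record-queue-time-avg   : average time records wait in the accumulator before sending
    //   request-latency-avg     : average time a produce request spends on the wire and in the broker
    //   batch-size-avg          : average bytes per partition batch actually sent
    //   records-per-request-avg : average record count per produce request
    private static final Set<String> INTERESTING = Set.of(
            "record-queue-time-avg", "request-latency-avg",
            "batch-size-avg", "records-per-request-avg");

    public static void dump(Producer<?, ?> producer) {
        producer.metrics().forEach((name, metric) -> {
            if ("producer-metrics".equals(name.group()) && INTERESTING.contains(name.name())) {
                System.out.printf("%-25s %s%n", name.name(), metric.metricValue());
            }
        });
    }
}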

Thank you!

Best wishes,
Craig






RE: More partitions => less throughput?

Posted by Eric Owhadi <er...@esgyn.com>.
What is happening, imho, is that when you have multiple partitions, each one gets only about 1/64th of the data compared to the single-partition case, so every time it is a partition's turn to be sent, there is very little to send. You end up with a chattier workload, where each push to the broker carries too few messages, whereas in the single-partition case the same work is done with fewer requests that each carry a much higher message count.
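(If that is the cause, one way to test the theory is to let the producer batch more aggressively, along the lines of the sketch below; the values are purely illustrative, not tuned recommendations.)

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class BatchingProps {
    // Trade a few milliseconds of latency for fuller per-partition batches,
    // so each request to a broker carries more records.
    public static Properties batchingProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.LINGER_MS_CONFIG, "10");           // wait up to 10 ms to fill a batch
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, "65536");       // up to 64 KiB per partition batch
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, "67108864"); // 64 MiB total accumulator memory
        return props;
    }
}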
Eric


Re: More partitions => less throughput?

Posted by Peter Bukowinski <pm...@gmail.com>.
Testing multiple broker VMs or containers on a single host won’t give you accurate performance numbers unless that is how you will be deploying Kafka in production. (Don’t do this.) All your Kafka networking is being handled by a single host, so instead of being spread out between machines to increase total possible throughput, the brokers are competing with each other.

Given that this is the test environment you settled on, you should tune the number of partitions taking into account the number of producers and consumers, as well as the average message size. If you have only one producer, then a single consumer should be sufficient to read the data in real time. If you have multiple producers, you may need to scale up the consumer count and use consumer groups.
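(A consumer group amounts to starting several consumer instances with the same group id, as in the minimal sketch below; the group id and topic name are placeholders. Each instance gets a share of the partitions, and the shares are rebalanced as instances join or leave.)

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class GroupConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "bench-readers"); // same group id on every instance
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("bench-64p"));
            while (true) {
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<byte[], byte[]> record : records) {
                    // process the record ...
                }
            }
        }
    }
}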

-- Peter


Re: More partitions => less throughput?

Posted by Tom Brown <to...@gmail.com>.
I think the number of partitions needs to be tuned to the size of the
cluster; 64 partitions on what is essentially a single box seems high. Do
you know what hardware you will be deploying on in production? Can you run
your benchmark on that instead of a VM?

—Tom
