Posted to users@kafka.apache.org by Elben Shira <el...@gmail.com> on 2012/03/09 19:32:38 UTC

ec2 performance

Hey guys,

We're trying to deploy Kafka 0.7 on EC2. According to an earlier thread [1],
the poster was getting 20,000 messages/sec on both EBS and local disk, at a
message size of 1000. Our messages are 2K-6K, at a rate of 5,000 messages/sec
and growing, so we ran some tests to see how Kafka handles this. My setup is
one m1.large server running ZooKeeper and the Kafka server, and another
m1.large server running the perf tests.

For the producer test, I ran:

bin/kafka-producer-perf-test.sh --async --batch-size 200 --brokerinfo
zk.connect=[REDACTED] --compression-codec 0 --message-size 3000 --messages
5000000 --topic elben-perf-test-2 --vary-message-size

And the results: https://gist.github.com/dc5e9cce497807d578d9

There are some weird results, like these two consecutive lines:

INFO thread 8: 495000 messages sent 14124.2938 nMsg/sec 19.9273 MBs/sec
INFO thread 8: 500000 messages sent 21459.2275 nMsg/sec 30.8321 MBs/sec

Any ideas what's happening here? Are the perf tests miscalculating the
running average? Either way, I think the correct conclusion is that it
produced 7496644565 bytes in 369 seconds, or roughly 20 MB/s.
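
The arithmetic behind that estimate can be sketched as follows. This is only
a check of the figures quoted above; it assumes the 369-second run sent the
full 5,000,000 messages requested with --messages, and it reports both
decimal megabytes (10^6 bytes) and binary megabytes (2^20 bytes) because the
message does not say which one "MB" means.

```python
# Figures quoted in the message above.
TOTAL_BYTES = 7_496_644_565
ELAPSED_SECS = 369
TOTAL_MESSAGES = 5_000_000  # assumption: taken from the --messages flag


def throughput(total_bytes, secs):
    """Return (decimal MB/s, binary MiB/s) for a byte count and duration."""
    bytes_per_sec = total_bytes / secs
    return bytes_per_sec / 1e6, bytes_per_sec / 2**20


mb_per_sec, mib_per_sec = throughput(TOTAL_BYTES, ELAPSED_SECS)
avg_message_bytes = TOTAL_BYTES / TOTAL_MESSAGES

print(f"{mb_per_sec:.2f} MB/s (10^6 bytes)")
print(f"{mib_per_sec:.2f} MiB/s (2^20 bytes)")
print(f"{avg_message_bytes:.1f} bytes per message on average")
```

Under those assumptions the average message comes out near 1,500 bytes
rather than 3,000, which would be consistent with --vary-message-size
producing sizes up to the --message-size value.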

Running the producer with --compression-codec 1 (gzip), I get:

bin/producer-perf-test.sh --async --batch-size 200 --brokerinfo zk.connect=
kafka1.i.massrel.com --compression-codec 2 --message-size 3000 --messages
1000000 --topic elben-perf-test-3 --vary-message-size

INFO Total Num Messages: 1000000 bytes: 1500536347 in 126.447 secs
(kafka.tools.ProducerPerformance$)
INFO Messages/sec: 7908.4518 (kafka.tools.ProducerPerformance$)
INFO MB/sec: 11.3172 (kafka.tools.ProducerPerformance$)
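
As a cross-check, the summary lines above can be recomputed from the totals
they report. This is a sketch using only the numbers in that output; the
interpretation of "MB" as 2^20 bytes is an inference from the arithmetic,
not something the output states.

```python
# Totals from the ProducerPerformance summary above.
messages = 1_000_000
total_bytes = 1_500_536_347
elapsed_secs = 126.447

msgs_per_sec = messages / elapsed_secs
mib_per_sec = total_bytes / elapsed_secs / 2**20  # binary megabytes
mb_per_sec = total_bytes / elapsed_secs / 1e6     # decimal megabytes

print(f"Messages/sec: {msgs_per_sec:.4f}")
print(f"MiB/sec (2^20 bytes): {mib_per_sec:.4f}")
print(f"MB/sec (10^6 bytes): {mb_per_sec:.4f}")
```

The binary figure is the one that reproduces the reported 11.3172, so rates
from this tool are best compared against other numbers in the same unit.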

For the consumer test, I ran:

bin/kafka-consumer-perf-test.sh --props config/consumer.properties --topic
elben-perf-test-2 --threads 10

With these results: https://gist.github.com/654093bd70571d21fb34

Again, there are weird things here: why are the other threads consuming 0
MB/s while only thread 7 is doing 6.9 MB/s? Is anyone else getting similar
results? We need to consume at least 10 MB/s. I suppose it would be best to
use partitions and multiple consumers if we're seeing only 6 MB/s on a
dumb consumer with 10 threads.

Any suggestions or ideas? I've had lots of fun with Kafka and hope to be
able to use it!

Elben

[1]
http://mail-archives.apache.org/mod_mbox/incubator-kafka-users/201202.mbox/%3CCADWPM3jzgMZmc57HYb55PX=GeAt6d6wzbvowvrMEM4Dw3ttu2g@mail.gmail.com%3E

Re: ec2 performance

Posted by Jun Rao <ju...@gmail.com>.
For the consumer part, currently, a partition can only be consumed by one
consumer in a group. So, if there are more consumers in a group than the
total number of partitions, some consumers will never get any data.
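
A minimal sketch of why that leaves threads idle. The round-robin assignment
and the names here (assign_partitions, thread-N, p0) are hypothetical and
only illustrate the one-consumer-per-partition rule; this is not Kafka's
actual rebalancing code, and it does not predict which thread ends up with
the data.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer (illustrative round-robin)."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment


threads = [f"thread-{i}" for i in range(10)]

# One partition, ten consumer threads: only one thread can own it.
single = assign_partitions(["p0"], threads)
idle = [t for t, parts in single.items() if not parts]
print(f"{len(idle)} of {len(threads)} threads receive no data")

# Ten partitions, ten threads: every thread owns one partition.
spread = assign_partitions([f"p{i}" for i in range(10)], threads)
busy = [t for t, parts in spread.items() if parts]
print(f"{len(busy)} of {len(threads)} threads receive data")
```

With a single partition, nine of the ten threads have nothing to read, which
would match a perf run where one thread reports all of the throughput.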

Thanks,

Jun
