Posted to users@kafka.apache.org by Zack Payton <zp...@gmail.com> on 2014/06/26 22:29:32 UTC

Scalability question?

Hi there,

There have been some internal debates here about how far we can scale
Kafka. Ideally, we'd be able to make it scale to 90 billion events a day.
I've seen somewhere that LinkedIn scaled it up to 40 billion events a day.
Has anyone seen a hard plateau in terms of scalability? Does anyone have
any advice for tweaking configs to achieve ultra-high performance?

Thanks,
Z

Re: Scalability question?

Posted by Jay Kreps <ja...@gmail.com>.
I think currently we do a little over 200 billion events per day at
LinkedIn, though we are not actually the largest Kafka user any more.

On the whole, scaling the volume of messages is actually not that hard in
Kafka. Data is partitioned, and partitions don't really communicate with
each other, so adding more machines adds more capacity; there really
aren't a ton of gotchas.
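
To make that concrete, here is a minimal sketch of a partitioned write
path using the Java producer client (the broker addresses, topic name,
and key scheme below are placeholders, not anything from this thread):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PartitionedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder brokers; any subset of the cluster works for bootstrap.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092");
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        // The default partitioner hashes the key, so records with the same
        // key always land on the same partition. Each partition lives on one
        // broker (plus replicas), and no cross-partition coordination happens
        // on the write path, which is why capacity grows as you add
        // partitions and brokers.
        for (int i = 0; i < 1000; i++) {
            producer.send(new ProducerRecord<>("events", "key-" + i, "payload-" + i));
        }
        producer.close();
    }
}

Since each record touches exactly one partition leader, adding brokers
and spreading partitions over them raises aggregate throughput roughly
linearly.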

The operations section of the wiki has some tips on performance tuning. I
recommend running the performance test commands described in the post linked
below on your own gear to get a feel for how much hardware you need. (For
perspective, that post hit 2 million writes per second on three machines,
and 90 billion events a day averages out to only about a million messages
per second.)
http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
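
Kafka also ships performance test scripts in bin/ (e.g.,
kafka-producer-perf-test.sh). If you want a quick sanity check of your own,
a rough throughput probe might look like the sketch below; the broker
address, topic name, record count, and 100-byte payload are my assumptions,
not figures from this thread:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ThroughputProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");  // placeholder broker
        props.put("acks", "1");  // leader-only ack: faster, less durable
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");

        int numRecords = 1_000_000;
        byte[] payload = new byte[100];  // 100-byte records, as in the benchmark post

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            long start = System.nanoTime();
            for (int i = 0; i < numRecords; i++) {
                producer.send(new ProducerRecord<>("perf-test", payload));
            }
            producer.flush();  // block until all in-flight sends complete
            double secs = (System.nanoTime() - start) / 1e9;
            System.out.printf("%,d records in %.1fs (%.0f records/sec)%n",
                numRecords, secs, numRecords / secs);
        }
    }
}

Run it against topics with different partition counts and replication
factors to see how the numbers move on your hardware.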

-Jay

