Posted to users@kafka.apache.org by Todd S <to...@borked.ca> on 2015/04/08 19:50:24 UTC

Re: Post on running Kafka at LinkedIn

Sorry to go back this far in time; I just noticed that the list had
bounced my earlier email as spam, so I'll try again with better
formatting...

A few questions; hopefully you (and everyone) don't mind. Feel free to
ignore any or all of them. I am trying to learn what I can from people
running at considerably larger scale than we are, so that we can avoid
the same pains (hopefully).

* Are all 1100 brokers hardware?
* Is there any hardware or OS tuning you've found beneficial?
* How do you manage deploying config updates? In particular, how do
you manage the broker restarts to pick up changes?
* Why 60 clusters? What segmentation of purpose (aside from the 2
layers detailed in this doc) do you have?
* Do you tune the clusters for different workloads/data types?
* What challenges have you faced running that many clusters and nodes,
vs. when you were smaller?
* How do you keep topic names consistent (non-conflicting) between
clusters?
* How do you manage partitioning and balancing (and rebalancing when a
topic/partition starts growing very quickly)?
* Have you enabled your users/customers to monitor their data flow (and
if so, how), or do they just trust you to let them know if there are
issues?

Thanks very much, sorry for the question dump!

On Mon, Mar 23, 2015 at 9:42 AM, Todd Palino <tp...@gmail.com> wrote:
> Emmanuel, if it helps, here's a little more detail on the hardware spec we
> are using at the moment:
>
> 12 CPU (HT enabled)
> 64 GB RAM
> 16 x 1TB SAS drives (2 are used as a RAID-1 set for the OS, 14 are a
> RAID-10 set just for the Kafka log segments)
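>
> For illustration only -- this is a sketch, with the device names, array
> name, and filesystem as assumptions, not our exact build -- a layout
> like that could be put together along these lines:
>
>   # 14-disk RAID-10 array dedicated to the Kafka log segments
>   mdadm --create /dev/md0 --level=10 --raid-devices=14 /dev/sd[c-p]
>   mkfs.ext4 /dev/md0
>   mount /dev/md0 /kafka-logs
>
>   # server.properties then points at the dedicated array
>   log.dirs=/kafka-logs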
>
> We don't colocate any other applications with Kafka except for a couple
> monitoring agents. Zookeeper runs on completely separate nodes.
>
> I suggest starting with the basics - watch the CPU, memory, and
> disk IO usage on the brokers as you are testing. You're likely going to
> find that one of these three is the constraint. Disk IO in particular can
> drive a significant increase in produce latency once utilization climbs
> past even 10-15%.
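>
> A quick way to keep an eye on the disk side is extended iostat output:
>
>   # per-device stats every 5 seconds; watch the %util and await columns
>   iostat -x 5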
>
> -Todd
>
>
> On Fri, Mar 20, 2015 at 3:41 PM, Emmanuel <el...@msn.com> wrote:
>
>> This is why I'm confused: I'm trying to benchmark and I see numbers
>> that seem pretty low to me. 8000 events/sec on 2 brokers with 3 CPUs each
>> and 5 partitions should be way faster than this, and I don't know where to
>> start debugging.
>> The kafka-consumer-perf-test script gives me ridiculously low numbers
>> (1000 events/sec/thread)
>>
>> So what could be causing this?
>> From: jbringhurst@linkedin.com.INVALID
>> To: users@kafka.apache.org
>> Subject: Re: Post on running Kafka at LinkedIn
>> Date: Fri, 20 Mar 2015 22:16:29 +0000
>>
>> Keep in mind that these brokers aren't really stressed too much at any
>> given time -- we need to stay ahead of the capacity curve.
>> Your message throughput will really just depend on what hardware you're
>> using. However, in the past, we've benchmarked at 400,000 to more than
>> 800,000 messages / broker / sec, depending on configuration (
>> https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
>> ).
>>
>> -Jon
>> On Mar 20, 2015, at 3:03 PM, Emmanuel <el...@msn.com> wrote:
>>
>> 800B messages / day = 9.26M messages / sec over 1100 brokers
>> = ~8400 messages / broker / sec
>> Do I get this right?
>> Trying to benchmark my own test cluster and that's what I see with 2
>> brokers...Just wondering if my numbers are good or bad...
>>
>>
>> Subject: Re: Post on running Kafka at LinkedIn
>> From: clark@kafka.guru
>> Date: Fri, 20 Mar 2015 14:27:58 -0700
>> To: users@kafka.apache.org
>>
>> Yep! We are growing :)
>>
>> -Clark
>>
>> Sent from my iPhone
>>
>> On Mar 20, 2015, at 2:14 PM, James Cheng <jc...@tivo.com> wrote:
>>
>> Amazing growth numbers.
>>
>> At the meetup on 1/27, Clark Haskins presented their Kafka usage at the
>> time. It was:
>>
>> Bytes in: 120 TB
>> Messages In: 585 billion
>> Bytes out: 540 TB
>> Total brokers: 704
>>
>> In Todd's post, the current numbers:
>>
>> Bytes in: 175 TB (45% growth)
>> Messages In: 800 billion (36% growth)
>> Bytes out: 650 TB (20% growth)
>> Total brokers: 1100 (56% growth)
>>
>> That much growth in just 2 months? Wowzers.
>>
>> -James
>>
>> On Mar 20, 2015, at 11:30 AM, James Cheng <jc...@tivo.com> wrote:
>>
>> For those who missed it:
>>
>> The Kafka Audit tool was also presented at the 1/27 Kafka meetup:
>> http://www.meetup.com/http-kafka-apache-org/events/219626780/
>>
>> Recorded video is here, starting around the 40 minute mark:
>> http://www.ustream.tv/recorded/58109076
>>
>> Slides are here:
>> http://www.ustream.tv/recorded/58109076
>>
>> -James
>>
>> On Mar 20, 2015, at 9:47 AM, Todd Palino <tp...@gmail.com> wrote:
>>
>> For those who are interested in detail on how we've got Kafka set up at
>> LinkedIn, I have just published a new post to our Engineering blog titled
>> "Running Kafka at Scale"
>>
>>   https://engineering.linkedin.com/kafka/running-kafka-scale
>>
>> It's a general overview of our current Kafka install, tiered architecture,
>> audit, and the libraries we use for producers and consumers. You'll also be
>> seeing more posts from the SRE team here in the coming weeks on deeper
>> looks into both Kafka and Samza.
>>
>> Additionally, I'll be giving a talk at ApacheCon next month on running
>> tiered Kafka architectures. If you're in Austin for that, please come by
>> and check it out.
>>
>> -Todd
>>
>>
>>
>>

Re: Post on running Kafka at LinkedIn

Posted by Todd Palino <tp...@gmail.com>.
Good questions. Here are the answers...

- Yes, all brokers we run are hardware. We do not use virtual systems for
Kafka or Zookeeper.

- There are a number of things we have done. I covered a lot of them last
year at ApacheCon (
http://www.slideshare.net/ToddPalino/enterprise-kafka-kafka-as-a-service).
Some of the disk tuning is changing, but we haven't settled on the final
config yet.
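
As a rough starting point, the knobs most often adjusted for Kafka hosts
look something like the following (illustrative values, not our final
config):

  # /etc/sysctl.conf
  vm.swappiness=1              # keep the broker out of swap
  vm.dirty_background_ratio=5  # start background page flushes early
  vm.dirty_ratio=60            # allow a deep dirty-page buffer for bursts

  # /etc/security/limits.conf - brokers hold a file handle per log segment
  kafka  soft  nofile  100000
  kafka  hard  nofile  100000

  # mount the log volume with noatime to avoid access-time writes
  /dev/md0  /kafka-logs  ext4  noatime  0 0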

- We have an internal configuration system and deployment system at
LinkedIn that work together. We change configs for everything in a central
repository, and when we do deployments they are shipped out and installed.
If we make a change, we either need to wait for the next time we deploy
(current cadence is every 3 weeks), or do a manual deployment.
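
For the restarts themselves, the key is to let controlled shutdown move
leadership off a broker and then wait for replication to catch up before
touching the next one. A minimal sketch (the hostnames and service command
are placeholders):

  for broker in kafka01 kafka02 kafka03; do
    # with controlled.shutdown.enable=true the broker hands off leadership
    ssh "$broker" 'service kafka restart'
    # block until no partitions are under-replicated
    until [ -z "$(bin/kafka-topics.sh --describe --zookeeper zk:2181 \
        --under-replicated-partitions)" ]; do
      sleep 10
    done
  done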

- The number of clusters comes from having 3 major data types (tracking,
metrics, and queuing), each of which exists in multiple datacenters and
each of which is aggregated. We also have clusters for logging in all
datacenters, and we have custom clusters for some other use cases that
don't mesh well with the rest (either for security or usage-pattern
reasons). In addition, we currently have 7 test clusters for development
work and release testing.

- Most of the broker tuning is the same everywhere, except for retention
times, which may vary. We generally focus on tuning the partition counts.
For example, one of our sets of clusters for outbound traffic from Hadoop
is tuned so that the number of partitions for each topic equals the number
of brokers, to keep the balance as close to perfect as possible.
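
As a concrete (hypothetical) example of that rule of thumb, on a 20-broker
cluster those topics would be created with 20 partitions:

  # one partition per broker keeps the load spread as even as possible
  bin/kafka-topics.sh --create --zookeeper zk:2181 \
    --topic hadoop-outbound-example --partitions 20 --replication-factor 2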

- The largest challenge is around how many instances of mirror maker we
run, and keeping track of all of the data flows. Our overall management is
actually much simpler now, because even a couple years ago we did not have
all the configuration and deployment automation that we have now.
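
Each of those mirror maker instances is just the stock tool pointed at a
source and a target cluster; a typical invocation looks roughly like this
(the config file names and topic pattern here are made up):

  bin/kafka-run-class.sh kafka.tools.MirrorMaker \
    --consumer.config source-cluster.properties \
    --producer.config target-cluster.properties \
    --whitelist 'PageViewEvent.*' \
    --num.streams 4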

- We have fixed naming schemes for metrics and logs that are enforced by
the clients. Queuing topics are pretty much a free-for-all, but we don't
care if someone names something the same as another cluster's topic, as
queuing does not interact with other clusters. The tracking pipelines are
generally controlled by an internal data model committee that reviews and
approves schemas. We're moving towards more centralized management of topic
configurations to enable the developers to spin up services with less
friction.

- Repartitioning is all handled manually. We monitor the partition sizes
on disk, periodically review them, and expand topics as needed. We're
working on automation for it, but 350k partitions, each with a
per-partition metric and alert, make parts of the monitoring system a
little cranky :)
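
The expansion itself is a single command once we decide a topic needs it;
something along these lines (the topic name and counts are examples):

  # check on-disk size per partition, then grow the hot topic
  du -sh /kafka-logs/PageViewEvent-*
  bin/kafka-topics.sh --alter --zookeeper zk:2181 \
    --topic PageViewEvent --partitions 32

Note that adding partitions only affects new data; existing data stays
where it is unless you move it with kafka-reassign-partitions.sh.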

- Our monitoring systems are all open, so anyone can review the current
status. We have also developed a web console that shows overviews of the
various clusters and mirror makers, and provides tools like offset checking
(without needing the CLI tools). Our client libraries also expose a
standard set of metrics, so when they are instantiated inside an
application, they are automatically put into the monitoring system for the
customers.
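
For anyone without a console like that, the equivalent offset check in the
stock 0.8.x tooling is roughly (the group name is an example):

  # per-partition offsets and lag for a consumer group
  bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
    --zkconnect zk:2181 --group my-consumer-group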

-Todd


