Posted to users@kafka.apache.org by Joe Crobak <jo...@gmail.com> on 2014/09/29 22:45:12 UTC

AWS EC2 deployment best practices

We're planning a deploy to AWS EC2, and I was hoping to get some advice on
best practices. I've seen the Loggly presentation [1], which has some good
recommendations on instance types and EBS setup. Aside from that, there
seem to be several options in terms of multi-Availability Zone (AZ)
deployment. The ones we're considering are:

1) Treat each AZ as a separate data center. Producers write to the Kafka
cluster in the same AZ. For consumption, there are two options:
1a) Designate one cluster the "master" cluster and use MirrorMaker (a rough
invocation sketch follows the pros/cons below). This was discussed here [2],
where some gotchas related to offset management were raised.
1b) Build consumers that consume from both clusters (e.g. two Camus jobs, one
for each cluster).

Pros:
* If there's a network partition between AZs (or extra latency), the
consumer(s) will catch up once the issue is resolved.
* If an AZ goes offline, only unprocessed data in that AZ is unavailable until
the AZ comes back online. The other AZ is unaffected. (Consumer failover seems
more complicated in 1a.)
Cons:
* Duplicate infrastructure, and either more moving parts (1a) or more
complicated consumers (1b).
* It's unclear how this scales if one wants to add a second region to the
mix.
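
For 1a, the MirrorMaker invocation would look roughly like the sketch below
(the property file names and the whitelist are placeholders; you'd run it
alongside the "master" cluster):

# Rough sketch: consume from the other AZ's cluster, produce into the local
# "master" cluster. File names and the whitelist pattern are placeholders.
bin/kafka-run-class.sh kafka.tools.MirrorMaker \
  --consumer.config remote-az-consumer.properties \
  --producer.config master-az-producer.properties \
  --num.streams 2 \
  --whitelist '.*'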

2) Treat all AZs as a single data center. In this case, there's no guarantee
that a producer is writing to a broker in the same AZ.

Pros:
* Simplified setup: all data is in one place.
Cons:
* Harder to design for availability. What if the leader of a partition is in a
different AZ than the producer and there's a network partition between the
AZs? If latency is high or throughput is low between AZs, write throughput
suffers with `request.required.acks=-1` (see the producer config sketch
below).
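
For reference, a minimal producer config for that setup might look like the
sketch below (property names are from the 0.8 producer; broker hostnames are
made up):

# Hypothetical producer.properties for the old (0.8) Scala producer.
# Broker hostnames are placeholders spanning two AZs.
metadata.broker.list=kafka-az1a-1:9092,kafka-az1b-1:9092
# -1 waits on all in-sync replicas, so a slow cross-AZ follower slows every request
request.required.acks=-1
# a sync producer makes that latency visible to the caller
producer.type=sync
request.timeout.ms=10000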


Some other considerations:
* ZooKeeper deployment: the best practice seems to be a 3-node ensemble spread
across 3 AZs (a config sketch follows below), but option 1a/b would let us run
a separate ensemble per AZ.
* EBS / provisioned IOPS: the Loggly presentation predates Kafka 0.8
replication. Are folks using ephemeral storage instead of EBS now? Provisioned
IOPS can get expensive pretty quickly.
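
For the 3-node ensemble, the zoo.cfg would be roughly the sketch below (one
node per AZ; hostnames are made up):

# zoo.cfg sketch: one ZooKeeper node per AZ. Hostnames are placeholders.
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=zk-az1a.example.internal:2888:3888
server.2=zk-az1b.example.internal:2888:3888
server.3=zk-az1c.example.internal:2888:3888

Each node also needs a matching myid file (1, 2, or 3) in dataDir, and the
ensemble stays writable as long as two of the three AZs can reach each other.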

Any suggestions/experience along these lines (or others!) would be greatly
appreciated. If there's good feedback, I'd be happy to put together a wiki
page with the details.

Thanks,
Joe

[1] http://search-hadoop.com/m/4TaT4BQRJy
[2] http://search-hadoop.com/m/4TaT49l0Gh/AWS+availability+zone/v=plain

Re: AWS EC2 deployment best practices

Posted by James Cheng <jc...@tivo.com>.
I'm also interested in hearing more about deploying Kafka in AWS.

I was also considering options like your 1a and 2. I ran some calculations, and one interesting thing that came up was the bandwidth cost between AZs.

In 1a, if you can have your producers and consumers in the same AZ as the "master", then you won't have to pay any bandwidth costs for your producers/consumers. You will have to pay bandwidth costs for the MirrorMaker traffic between clusters in different AZs.

In 2, if your producers and consumers are writing/reading to different AZs, then you are paying bandwidth costs between AZs for both producers and consumers. In my cost calculation for a modest size cluster, my bandwidth costs were roughly the same as my (EC2 instance + EBS) costs.

An idea for #2 is to deploy your producers and consumers so that they are always in the AZ that contains the partitions they want to read/write. Or, said another way, move your partitions to the brokers in the same AZs as your producers/consumers (a rough sketch of the partition-placement side is below). I think it's doable, but it means you'd have to write a Kafka client library that is aware of your AZs, and also manage the cluster's partition assignments in sync with your producer/consumer deployments.
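
The placement side could probably be driven by the stock reassignment tool;
something like the sketch below, where the topic, partition, and broker ids
are placeholders for brokers known to sit in the target AZ:

# reassign.json -- pin partition 0 of "events" to brokers 1 and 2, which are
# assumed to be in the producer's AZ. All names and ids are placeholders.
{"version":1,
 "partitions":[{"topic":"events","partition":0,"replicas":[1,2]}]}

bin/kafka-reassign-partitions.sh \
  --zookeeper zk-az1a.example.internal:2181 \
  --reassignment-json-file reassign.json \
  --execute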

With ephemeral disks, I imagine that Kafka would become network bound. In case you find it useful, I ran some network performance tests against different EC2 instances. I only went as far as c3.4xlarge.

https://docs.google.com/spreadsheets/d/1QF-4EO3PQ_YOLbvf6HKpqBTNQ8fyYeRuDMrlDYlK0yQ/pubchart?oid=1634430904&format=interactive

-James

Re: AWS EC2 deployment best practices

Posted by Philip O'Toole <ph...@yahoo.com.INVALID>.
OK, yeah, speaking from experience, I would be comfortable using ephemeral storage if it's replicated across AZs. More and more EC2 instances have local SSDs, so you'll get great IO. Of course, you'd better monitor your instances, and if an instance terminates, you're vulnerable if a second instance is lost. That might argue for 3 copies (a topic-creation sketch is below).
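
Concretely, that's just a replication factor of 3 at topic-creation time;
something like the sketch below (topic name, partition count, and ZooKeeper
host are placeholders):

# Three copies of every partition; with brokers spread across AZs this
# survives the loss of a full AZ. Names and counts are placeholders.
bin/kafka-topics.sh --create \
  --zookeeper zk-az1a.example.internal:2181 \
  --topic events --partitions 6 --replication-factor 3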

As you correctly pointed out in your original e-mail, the Loggly setup predated 0.8, so there was no replication to worry about. We ran 3-broker clusters and put each broker of a cluster in a different AZ. This did mean that during an AZ failure certain brokers would be unavailable (though the messages were still on disk, ready for processing when the AZ came back online), but it also meant that there were always some Kafka brokers running somewhere that were reachable, and incoming traffic could be sent there. The producers we wrote took care of dealing with this. In other words, the pipeline kept moving data.


Of course, in a healthy pipeline, each message was written to ES within a matter of seconds, and we had replication there (as outlined in the accompanying talk). It all worked very well.


Philip

 
-----------------------------------------
http://www.philipotoole.com 


Re: AWS EC2 deployment best practices

Posted by Joe Crobak <jo...@gmail.com>.
I didn't know about KAFKA-1215, thanks. I'm not sure it would fully address
my concern about a producer writing to a partition leader in a different AZ,
though.

To answer your question, yes, I was thinking ephemeral storage with
replication. With a reservation, it's pretty easy to get e.g. two i2.xlarge
instances for an amortized cost below that of a single m2.2xlarge with the
same amount of EBS storage and provisioned IOPS.

Re: AWS EC2 deployment best practices

Posted by Philip O'Toole <ph...@yahoo.com.INVALID>.
If only Kafka had rack awareness... you could run one cluster and set up the replicas in different AZs. (A manual replica-placement workaround is sketched below.)


https://issues.apache.org/jira/browse/KAFKA-1215
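
In the meantime you can approximate it by hand-placing replicas at
topic-creation time; a rough sketch (the topic name is a placeholder, and
broker ids 1/3/5 are assumed to be in one AZ and 2/4/6 in another):

# Manually spread each partition's replicas across AZs. Each comma-separated
# group is one partition's replica list (colon-separated broker ids).
bin/kafka-topics.sh --create \
  --zookeeper zk-az1a.example.internal:2181 \
  --topic events \
  --replica-assignment 1:2,3:4,5:6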

As for your question about ephemeral versus EBS, I presume you are proposing to use ephemeral *with* replicas, right?


Philip

 
 
-----------------------------------------
http://www.philipotoole.com 

