Posted to users@kafka.apache.org by Dillian Murphey <cr...@gmail.com> on 2015/01/14 20:42:41 UTC

kafka cluster on aws

I can't seem to find much information to help me (being green to Kafka) with
setting up a cluster on AWS. Does anyone have any sources?

The question I have off the bat is, what methods have already been explored
to generate a unique broker id? If I spin up a new server, do I just need
to maintain my own broker-id list somewhere so I don't re-use an already
allocated broker id?

Also, I read an article about a broker going down and requiring a new
broker be spun up with the same id. Is this also something I need to
implement?

I want to set up a Kafka auto-scaling group on AWS, so I can add brokers at
will or based on load. It doesn't seem too complicated, or maybe I'm too
green to see it, but I don't want to re-invent everything myself.

I know Loggly uses AWS/Kafka, so I'm hunting for more details on that too.

Thanks for any help

Re: kafka cluster on aws

Posted by Dillian Murphey <cr...@gmail.com>.

Trying to understand the docs. Can I just use the Docker image and run the
minotaur command from there? I don't understand the bastion SSH stuff. Do I
need that? I just want a quick start for right now. Also, I'm not sure where
I get the ENVIRONMENT.key.

Any extra help is greatly appreciated. You can email directly. Thanks!

Re: kafka cluster on aws

Posted by Dillian Murphey <cr...@gmail.com>.

Thanks for the comments.  Hey Joe, I'm looking at your project now. I'm
going to give it a try.

Re: kafka cluster on aws

Posted by Joe Stein <jo...@stealth.ly>.

We have an open source framework you can use to spin up Kafka clusters (any
version, or even any build you want) and ZooKeeper with CloudFormation on
AWS: https://github.com/stealthly/minotaur

It is very handy: you basically specify your instance types, counts,
versions of code, etc. and hit <enter>. See
https://github.com/stealthly/minotaur/tree/master/labs/kafka for details, e.g.:

./minotaur.py lab deploy kafka -e bdoss-dev -d testing -r us-east-1 -z us-east-1a \
    -k http://example.com/kafka.tar.gz -n 3 -i m1.small

There is some setup for the bastion host (
https://github.com/stealthly/minotaur/tree/master/infrastructure/aws/bastion)
and supervisor (https://github.com/stealthly/minotaur/tree/master/supervisor)
and after that it is really nice and easy.

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/

Re: kafka cluster on aws

Posted by Joseph Lawson <jl...@roomkey.com>.

We have a separate daemon process that assigns EIPs (Elastic IPs) to servers
when they start up in an autoscaling group, based on an autoscaling message.
So for a cluster of 3 we have 3 EIPs. We then inject the EIPs into the Kafka
startup script, which checks whether the server holds one of the EIPs and, if
so, assigns itself the index of that IP as its broker id. So in the list:
10.0.0.1 10.0.0.2 10.0.0.3

10.0.0.1 is broker 0, 10.0.0.2 is broker 1 and 10.0.0.3 is broker 2. All this
is injected via CloudFormation. We also have a mod value, so if we want to
spin up more brokers in the same group we use mod 1, 2, and so on, and get
broker ids of mod * 3 + index, which determines which broker is which in the
group. (The EIPs are different, as it is a different CloudFormation stack.)
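
As a rough illustration of the scheme described above, here is a minimal
Python sketch of how a startup script might derive its broker id. It is not
the actual script: the function name broker_id_for, the second stack's
10.0.1.x addresses, and the use of the list length in place of the literal 3
are all assumptions made for the example.

```python
from typing import Optional, Sequence


def broker_id_for(my_ip: str, stack_eips: Sequence[str], mod: int = 0) -> Optional[int]:
    """Return this host's broker id, or None if it holds none of the EIPs.

    The id is the position of the host's EIP in the injected list, offset by
    mod * len(stack_eips) so that ids from different stacks do not collide.
    """
    eips = list(stack_eips)
    if my_ip not in eips:
        return None  # this host has not been given one of the stack's EIPs
    return mod * len(eips) + eips.index(my_ip)


# EIP list from the message above (first stack, mod 0).
first_stack = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
# Hypothetical EIP list for a second stack joining the same cluster (mod 1).
second_stack = ["10.0.1.1", "10.0.1.2", "10.0.1.3"]

print(broker_id_for("10.0.0.2", first_stack))          # 1
print(broker_id_for("10.0.1.3", second_stack, mod=1))  # 5
```

Because the id depends only on which EIP a server holds, a replacement
server that picks up a failed broker's EIP would come back with the same
broker id.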

For redundancy, make sure you run at least two brokers that hold full
replicas of all partitions. We run a replication factor of 3 with three
instances, so if any one goes down, the other two bring it back in sync once
a fresh server spins up in the autoscaling group.
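
To see why a replication factor of 3 on three brokers tolerates the loss of
any single broker, here is a small Python sketch. The round-robin placement
is a deliberate simplification and not Kafka's real assignment logic;
assign_replicas and surviving_replicas are hypothetical helper names.

```python
from typing import Dict, List


def assign_replicas(num_partitions: int, brokers: List[int],
                    replication_factor: int) -> Dict[int, List[int]]:
    """Simplified placement: partition p gets consecutive brokers starting at p."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    n = len(brokers)
    return {
        p: [brokers[(p + r) % n] for r in range(replication_factor)]
        for p in range(num_partitions)
    }


def surviving_replicas(assignment: Dict[int, List[int]],
                       failed_broker: int) -> Dict[int, List[int]]:
    """Drop the failed broker from every partition's replica list."""
    return {
        p: [b for b in replicas if b != failed_broker]
        for p, replicas in assignment.items()
    }


# Three brokers, replication factor 3: every broker holds every partition.
assignment = assign_replicas(num_partitions=6, brokers=[0, 1, 2], replication_factor=3)
after_failure = surviving_replicas(assignment, failed_broker=1)

# Each partition still has two live replicas to re-sync the replacement from.
print(min(len(replicas) for replicas in after_failure.values()))  # 2
```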
