Posted to users@kafka.apache.org by Sa Li <sa...@gmail.com> on 2014/09/30 05:31:57 UTC

multi-node and multi-broker kafka cluster setup

Hi,
I am fairly new to Kafka. I plan to build a cluster with multiple nodes and multiple brokers on each node. I can find tutorials for setting up a multi-broker cluster on a single node, for example
http://www.michael-noll.com/blog/2013/03/13/running-a-multi-broker-apache-kafka-cluster-on-a-single-node/
I can also find instructions for multi-node setups, but with a single broker on each node. I have not seen any documentation that shows how to set up a multi-node cluster with multiple brokers on each node. Some documents point out that we should install Kafka on each node, which makes sense, and that all the brokers on each node should connect to the same ZooKeeper. I am confused, since I thought I could set up a ZooKeeper ensemble separately and have all the brokers connect to it, so that the ZooKeeper servers don't have to be the machines hosting Kafka, but some tutorials say I should install ZooKeeper on each Kafka node.

Here is my plan:
- I have three nodes: kfserver1, kfserver2, kfserver3.
- kfserver1 and kfserver2 are configured as the ZooKeeper ensemble, which I have done.
  zk.connect=kfserver1:2181,kfserver2:2181
- broker1, broker2, broker3 are on kfserver1,
  broker4, broker5, broker6 are on kfserver2,
  broker7, broker8, broker9 are on kfserver3.

When configuring, I put the ZooKeeper dataDir in a local directory on each node rather than in a single directory for the whole ensemble. Is that correct? So far I could not get the above scheme working. Has anyone ever set up a multi-node, multi-broker Kafka cluster?

thanks

Alec



Re: multi-node and multi-broker kafka cluster setup

Posted by Sa Li <sa...@gmail.com>.
Just to clarify, I am using a 3-server ZooKeeper ensemble with myid 1, 2,
and 3. But in the server.properties of each broker on each Kafka node, I
set zk.connect to localhost, which means the broker info is stored in the
local zkServer. I know it is a bit weird, as opposed to having the broker
info assigned automatically by the zkServer leader.
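
For comparison, here is a minimal sketch of building a connect string that
lists every ensemble member instead of only localhost. The host names
assume the three ZooKeeper servers run on kfserver1, kfserver2, and
kfserver3, and zk_connect_string is a hypothetical helper, not part of
Kafka or ZooKeeper. Whichever member a broker connects to, the ensemble
replicates the data, so listing all members mainly lets the client fail
over when one server is down.

```python
def zk_connect_string(hosts, port=2181, chroot=""):
    """Join ensemble members into a comma-separated host:port list.

    hosts  -- ZooKeeper host names (assumed names, see above)
    port   -- client port shared by all members (2181 as in the original post)
    chroot -- optional path suffix such as "/kafka"
    """
    if not hosts:
        raise ValueError("at least one ZooKeeper host is required")
    members = ",".join(f"{host}:{port}" for host in hosts)
    return members + chroot


# Assumed host names for the 3-server ensemble.
ensemble = ["kfserver1", "kfserver2", "kfserver3"]
print("zk.connect=" + zk_connect_string(ensemble))
```

The printed line can be pasted into each broker's server.properties so
that every broker uses the same value.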

On Thu, Oct 2, 2014 at 2:25 PM, Sa Li <sa...@gmail.com> wrote:

> Daniel, thanks for reply
>
> It is still the learn curve to me to setup the cluster, we finally want to
> make connection between kafka cluster and storm cluster. As you mentioned,
> seems 1 single broker per node is more efficient, is it good to handle
> multiple topics? For my case, say I can build the 3-node kafka cluster, and
> three brokers, and certainly that will limit the replica number, as far as
> I understand, broker number should greater or equal to replica number.
>
> For the zk Server, my understanding after play around is: I should run zk
> Server server for each kafka node, I could zk.connect to single zk server
> in kafka server.properties, and all the broker info will store in that
> zkserver, But I may think it might be better to store each individual
> broker info in local zkServer, then when zkCli,sh, we can see things under
> /brokers/ids.
>
> Is that good solution? I am using such architecture now.
>
> thanks



-- 

Alec Li

Re: multi-node and multi-broker kafka cluster setup

Posted by Sa Li <sa...@gmail.com>.
Daniel, thanks for the reply.

Setting up the cluster is still a learning curve for me; we eventually
want to connect the Kafka cluster to a Storm cluster. As you mentioned, a
single broker per node seems more efficient. Is that good for handling
multiple topics? In my case, say I build a 3-node Kafka cluster with three
brokers; that will certainly limit the number of replicas since, as far as
I understand, the number of brokers should be greater than or equal to the
number of replicas.
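
The constraint can be sketched as a small check. Kafka places each
replica of a partition on a different broker, so the replication factor
cannot exceed the number of brokers. The function name below is
hypothetical and only illustrates the arithmetic.

```python
def replication_fits(broker_count, replication_factor):
    """Return True if every replica can be placed on a distinct broker."""
    if broker_count < 1 or replication_factor < 1:
        raise ValueError("broker_count and replication_factor must be >= 1")
    return replication_factor <= broker_count


# Three brokers (one per node): up to 3 replicas per partition fit.
for factor in (1, 2, 3, 4):
    print(f"brokers=3 replication_factor={factor} fits={replication_fits(3, factor)}")
```

With three brokers, a replication factor of 4 is rejected, while 1
through 3 are fine.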

For the zk server, my understanding after playing around is this: I should
run a zk server on each Kafka node. I could point zk.connect at a single
zk server in Kafka's server.properties, and all the broker info would be
stored in that zk server. But I think it might be better to store each
broker's info in its local zk server; then, when we run zkCli.sh, we can
see things under /brokers/ids.

Is that a good solution? I am using this architecture now.

thanks

On Tue, Sep 30, 2014 at 1:02 PM, Daniel Compton <de...@danielcompton.net>
wrote:

> Hi Sa
>
> While it's possible to run multiple brokers on a single machine, I would
> be interested to hear why you would want to. Kafka is very efficient and
> can use all of the system resources under load. Running multiple brokers
> would increase zookeeper load, force resource sharing between the Kafka
> processes, and require more admin overhead.
>
> Additionally, you almost certainly want to run three Zookeepers. Two
> Zookeepers gives you no more reliability than one because ZK voting is
> based on a majority vote. If neither ZK can reach a majority on its own
> then it will fail. More info at
> http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A7
>
> Daniel.



-- 

Alec Li

Re: multi-node and multi-broker kafka cluster setup

Posted by Daniel Compton <de...@danielcompton.net>.
Hi Sa

While it's possible to run multiple brokers on a single machine, I would be interested to hear why you would want to. Kafka is very efficient and can use all of the system resources under load. Running multiple brokers would increase zookeeper load, force resource sharing between the Kafka processes, and require more admin overhead. 

Additionally, you almost certainly want to run three ZooKeepers. Two ZooKeepers give you no more reliability than one because ZK voting is based on a majority vote: with two servers the majority is both of them, so if either one goes down the remaining one cannot reach a majority on its own and the ensemble fails. More info at http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A7
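
To illustrate the majority rule, here is a small sketch of the arithmetic. It is not ZooKeeper code, and the function names are made up for this example: a quorum is more than half of the ensemble, and the number of servers that can fail is whatever is left over.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be >= 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for size in (1, 2, 3, 5):
    print(f"servers={size} quorum={quorum_size(size)} "
          f"tolerated_failures={tolerated_failures(size)}")
```

Both one and two servers tolerate zero failures, three servers tolerate one, and five tolerate two, which is why odd ensemble sizes are the usual choice.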

Daniel.

> On 1/10/2014, at 4:35 am, Guozhang Wang <wa...@gmail.com> wrote:
> 
> Hello,
> 
> In general it is not required to have the kafka brokers installed on the
> same nodes of the zk servers, and each node can host multiple kafka
> brokers: you just need to make sure they do not share the same port and the
> same data dir.
> 
> Guozhang

Re: multi-node and multi-broker kafka cluster setup

Posted by Guozhang Wang <wa...@gmail.com>.
Hello,

In general it is not required to have the Kafka brokers installed on the
same nodes as the zk servers, and each node can host multiple Kafka
brokers: you just need to make sure they do not share the same port or the
same data dir.
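
As a rough sketch of what that means per broker, the script below
generates settings for several brokers on one host, each with its own
port and data directory. The key names follow Kafka 0.8-style
server.properties (broker.id, port, log.dirs, zookeeper.connect); older
releases spell some of these differently, for example zk.connect as in
the original post. The base port, the log directory root, the ZooKeeper
host names, and the helper names are all assumptions for illustration.

```python
def plan_node(first_broker_id, brokers_per_node, zk_connect,
              base_port=9092, log_root="/tmp/kafka-logs"):
    """Return one settings dict per broker on a single host."""
    if brokers_per_node < 1:
        raise ValueError("brokers_per_node must be >= 1")
    plans = []
    for offset in range(brokers_per_node):
        broker_id = first_broker_id + offset
        plans.append({
            "broker.id": broker_id,                 # unique across the cluster
            "port": base_port + offset,             # unique on this host
            "log.dirs": f"{log_root}-{broker_id}",  # unique on this host
            "zookeeper.connect": zk_connect,        # same for every broker
        })
    return plans


def render_properties(plan):
    """Render one settings dict as key=value lines."""
    return "".join(f"{key}={value}\n" for key, value in plan.items())


# Assumed ensemble address; brokers 4-6 are the ones planned for kfserver2.
zk = "kfserver1:2181,kfserver2:2181,kfserver3:2181"
for plan in plan_node(first_broker_id=4, brokers_per_node=3, zk_connect=zk):
    print(render_properties(plan))
```

Each rendered block would go into its own properties file, and each
broker process is started with its own file.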

Guozhang

On Mon, Sep 29, 2014 at 8:31 PM, Sa Li <sa...@gmail.com> wrote:

> Hi,
> I am kinda newbie to kafka, I plan to build a cluster with multiple nodes,
> and multiple brokers on each node, I can find tutorials for set multiple
> brokers cluster in single node, say
>
> http://www.michael-noll.com/blog/2013/03/13/running-a-multi-broker-apache-kafka-cluster-on-a-single-node/
> Also I can find some instructions for multiple node setup, but with single
> broker on each node. I have not seen any documents to teach me how to setup
> multiple nodes cluster and multiple brokers in each node. I notice some
> documents points out: we should install kafka on each node which makes
> sense, and all the brokers in each node should connect to same zookeeper. I
> am confused since I thought I could setup a zookeeper ensemble cluster
> separately, and all the brokers connecting to this zookeeper cluster and
> this zk cluster doesn’t have to be the server hosting the kafka, but some
> tutorial says I should install zookeeper on each kafka node.
>
> Here is my plan:
> - I have three nodes: kfServer1, kfserver2, kfserver3,
> - kfserver1 and kfserver2 are configured as the zookeeper ensemble, which
> i have done.
>   zk.connect=kfserver1:2181,kfserver2:2181
> - broker1, broker2, broker3 are in kfserver1,
>   broker4, broker5, broker6 are on kfserver2,
>   broker7, broker8, broker9 are on kfserver3.
>
> When I am configuring, the zk DataDir is in local directory of each node,
> instead located at the zk ensemble directory, is that correct? So far, I
> couldnot make above scheme working, anyone have ever made multi-node and
> multi-broker kafka cluster setup?
>
> thanks
>
> Alec
>
>
>


-- 
-- Guozhang