Posted to users@kafka.apache.org by abhijeet kadam <ab...@gmail.com> on 2014/03/05 21:06:10 UTC
Kafka Producer load distribution
Hi, I am new to Kafka and am using Kafka 0.8 to build a distributed queuing
system on an Amazon Web Services cluster.
I have 4 machines: Z1, B1, B2 and B3. One Zookeeper instance is running on Z1,
and 3 brokers are running on B1, B2 and B3 respectively.
I am running 3 producers on the 3 broker machines (B1, B2, B3), one on each
machine. Similarly, 3 consumers run on the 3 broker machines, one on each machine.
I created a topic, let's say 'test', with 12 partitions (test-0, test-1, ...,
test-11), 4 partitions on each broker machine:
B1 - test-0,test-1,test-2,test-3
B2 - test-4,test-5,test-6,test-7
B3 - test-8,test-9,test-10,test-11
Zookeeper assigned the broker on each machine as the leader for the partitions
present on the same machine.
Partition - leader
test-0 - B1
test-1 - B1
test-2 - B1
test-3 - B1
test-4 - B2
test-5 - B2
test-6 - B2
test-7 - B2
test-8 - B3
test-9 - B3
test-10 - B3
test-11 - B3
All 3 producers are producing messages to this topic 'test' and all 3
consumers are trying to consume from the same topic 'test'.
What I am trying to achieve is that whenever a producer sends a message to
this topic, it uses the broker on the same machine as the producer, and
therefore the partitions on that machine.
Producer 1 ---> B1 ----> (test-0,test-1,test-2,test-3) -----> consumer 1
Producer 2 ---> B2 ----> (test-4,test-5,test-6,test-7) -----> consumer 2
Producer 3 ---> B3 ----> (test-8,test-9,test-10,test-11) -----> consumer 3
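
As an illustration of the routing described above, here is a minimal sketch in plain Python. It does not use any Kafka API: the LEADERS table is simply copied from the partition list in this post, and the helper names (local_partitions, make_local_chooser) are hypothetical. In a real 0.8 producer this logic would presumably sit in a custom partitioner, which is an assumption and not something confirmed in this thread.

```python
from itertools import cycle

# Partition -> leader table copied from the post (test-0 .. test-11).
LEADERS = {p: "B1" for p in range(0, 4)}
LEADERS.update({p: "B2" for p in range(4, 8)})
LEADERS.update({p: "B3" for p in range(8, 12)})


def local_partitions(leaders, local_broker):
    """Return the sorted partition ids whose leader is local_broker."""
    return sorted(p for p, leader in leaders.items() if leader == local_broker)


def make_local_chooser(leaders, local_broker):
    """Return a function that cycles through the local partitions.

    Falls back to all partitions if the local broker leads none of them,
    so the producer can still send after a leadership change.
    """
    candidates = local_partitions(leaders, local_broker) or sorted(leaders)
    rotation = cycle(candidates)
    return lambda: next(rotation)


# Producer 2 runs on B2, so it should only pick test-4 .. test-7.
choose = make_local_chooser(LEADERS, "B2")
picks = [choose() for _ in range(6)]
print(picks)  # [4, 5, 6, 7, 4, 5]
```

Note that the table would have to be refreshed whenever leadership moves, since the leader of a partition is not guaranteed to stay on the same machine.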
I am assuming this will reduce inter-machine message transfer and
improve performance.
My questions are:
1) Does it really help performance when a message is produced and
consumed on the same machine in a distributed environment?
2) I read that a producer can fetch metadata from a broker about the
leader-partition mapping for a topic. That would help it pick the leader
on the same machine as the producer. How can a producer fetch this
metadata? I could not find any implementation.
Thanks in advance,
Abhijeet
Re: Kafka Producer load distribution
Posted by Joel Koshy <jj...@gmail.com>.
> I am assuming this will reduce inter-machine message transfer and
> improve performance.
>
> My questions are:
>
> 1) Does it really help performance when a message is produced and
> consumed on the same machine in a distributed environment?
I doubt that it helps a whole lot - especially if you let the producer
batch messages in a single request (default).
> 2) I read that a producer can fetch metadata from a broker about the
> leader-partition mapping for a topic. That would help it pick the leader
> on the same machine as the producer. How can a producer fetch this
> metadata? I could not find any implementation.
You can use the SyncProducer or SimpleConsumer classes, which provide a
send(<request>) API that accepts a topic metadata request and
returns a topic metadata response.
--
Joel
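
Once a topic metadata response is in hand, the remaining step is to turn it into a leader-to-partitions lookup. The sketch below shows only that step, in plain Python. PartitionInfo is a hypothetical stand-in for one partition entry of the response, not a Kafka class, and the sample data is made up for illustration. The None leader models a partition that has no leader at the moment of the request, which is an assumption about how such a case would be represented.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for one partition entry of a metadata response.
PartitionInfo = namedtuple("PartitionInfo", ["partition_id", "leader_host"])


def partitions_by_leader(partition_infos):
    """Group partition ids by leader host, skipping leaderless partitions."""
    grouped = defaultdict(list)
    for info in partition_infos:
        if info.leader_host is None:
            continue  # no leader known right now; cannot route to it
        grouped[info.leader_host].append(info.partition_id)
    return {host: sorted(ids) for host, ids in grouped.items()}


# Made-up sample entries for topic 'test'.
response = [
    PartitionInfo(0, "B1"),
    PartitionInfo(5, "B2"),
    PartitionInfo(4, "B2"),
    PartitionInfo(8, None),
]
by_leader = partitions_by_leader(response)
print(by_leader.get("B2", []))  # [4, 5]
```

A producer on B2 would then look up its own host in this map and fall back to the full partition list if the host is missing.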