Posted to users@kafka.apache.org by Otto Mok <Ot...@acuityads.com> on 2014/03/25 22:55:56 UTC

Separate broker replication traffic from producer/consumer traffic

Hi all,

Is there any way to configure the brokers such that producers & consumers are talking via IP1, while the brokers are replicating between themselves using IP2?

I see there are broker settings for host.name and advertised.host.name, but it doesn't look like these settings do what I'm looking for.

Any help or insights will be appreciated.

Thanks.

Otto out!


RE: Separate broker replication traffic from producer/consumer traffic

Posted by Otto Mok <Ot...@acuityads.com>.
Jay,

Thanks for your responses.
What are the hardware specs of your Kafka servers?  10G or bonded NICs?


Joris,

Thank you for your detailed example.

I was thinking along those lines before, and was hoping for something cleaner.

Glad to know that it actually works!
That's a whole lot of host file entries to maintain.  =)
I'll try that for my environment as well.

Thanks all!

Otto out!

Re: Separate broker replication traffic from producer/consumer traffic

Posted by Joris VanRemoortere <jv...@tagged.com> on March 26, 2014 at 12:53 PM (time zone not stated).
Hi Otto,

We've separated our traffic for a few reasons:
1. We wanted to protect our producer bandwidth to maintain a low latency
pipeline
2. We expected consumers to sometimes pick up from an older offset and clog
the pipe, causing latency for other services
3. When an out of sync replica comes back online, we don't want it to
impact producers / consumers for the in-sync replicas that are feeding data
for replication.

This is our set-up (hopefully this helps you design something similar)
*Brokers:*

   - 3 NICs: Broker[XX][eth 0, 1, 2]
   - Ensure that your brokers respond over the same network interface
   that the request came in on.

*Producers:*

   - 1 NIC
   - Host file maps Broker[XX] -> Broker[XX][eth 1]

*Consumers:*

   - 1 NIC
   - Host file maps Broker[XX] -> Broker[XX][eth 2]

*Zookeeper:*

   - 1 NIC
   - No Modifications

*What this accomplishes:*

   - [Replication] Broker A to Broker B and response = Broker[A][eth 0] ->
   Broker[B][eth 0] -> Broker[A][eth 0]
   - [Publish] Producer A to Broker B and response = Producer[A][eth 0] ->
   Broker[B][eth 1] -> Producer[A][eth 0]
   - [Consume] Consumer A to Broker B and response = Consumer[A][eth 0] ->
   Broker[B][eth 2] -> Consumer[A][eth 0]
   - By default (without a host file entry) our hosts talk to each other
   over eth 0
   - All services still talk to a normal ZooKeeper tier.
   - ZooKeeper publishes the hostname of the broker to talk to.
   - Hostname translation allows us to remap that hostname to the IP
   associated with a different network interface.
   - This works well when your different services are on different physical /
   virtual machines. You need to get fancier with packet-rewriting services if
   you are hosting multiple services on the same host.

Hope this helps!

Joris
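
The hosts-file remapping described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: the broker names and every IP address are hypothetical (taken from the reserved documentation ranges), and the helper `hosts_entries` is made up for this example. Brokers get no entries, so they keep resolving each other to eth0 as in the default case above.

```python
from typing import Dict, List

# Hypothetical addresses from the reserved documentation ranges; the thread
# gives no real IPs, so every value here is an assumption for illustration.
BROKER_INTERFACES: Dict[str, Dict[str, str]] = {
    "broker01": {"eth0": "192.0.2.11", "eth1": "198.51.100.11", "eth2": "203.0.113.11"},
    "broker02": {"eth0": "192.0.2.12", "eth1": "198.51.100.12", "eth2": "203.0.113.12"},
}

# Which broker interface each client role is pinned to, per the layout above:
# producers reach brokers on eth1, consumers on eth2.
ROLE_INTERFACE: Dict[str, str] = {"producer": "eth1", "consumer": "eth2"}


def hosts_entries(role: str) -> List[str]:
    """Return hosts-file lines mapping each broker hostname to one interface IP."""
    iface = ROLE_INTERFACE[role]
    return [
        f"{nics[iface]}\t{name}"
        for name, nics in sorted(BROKER_INTERFACES.items())
    ]


if __name__ == "__main__":
    for role in ROLE_INTERFACE:
        print(f"# hosts entries for {role} machines")
        print("\n".join(hosts_entries(role)))
```

Generating the entries from one table is one way to keep the per-role host files consistent as brokers are added; the generated lines would still have to be distributed to each producer and consumer machine.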


Re: Separate broker replication traffic from producer/consumer traffic

Posted by Jay Kreps <ja...@gmail.com> on Wed, Mar 26, 2014 at 9:10 AM (time zone not stated).
Hey Otto,

Yeah, this isn't something we've really thought about. Presumably the
implementation would be that the server accepts connections on two
interfaces. That is pretty easy. However, the harder part is that I think
this would require updating the metadata to advertise a different ip/host
to other brokers versus to producers (right now there is just one for
both). Or maybe there would be another way to do it?

-Jay
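
To make the metadata idea in this message concrete, here is a purely hypothetical sketch of a broker record that advertises a different host depending on who is asking. It is not Kafka's API or its metadata format; the class names, the audience labels "broker" and "client", and the hostnames are all assumptions invented for this example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int


@dataclass
class BrokerMetadata:
    """Hypothetical record: one advertised endpoint per audience."""
    broker_id: int
    endpoints: Dict[str, Endpoint]  # keyed by audience, e.g. "broker" or "client"


def advertised_endpoint(meta: BrokerMetadata, audience: str) -> Endpoint:
    """Pick the endpoint to hand out, based on who requested the metadata."""
    return meta.endpoints[audience]


# Hypothetical hostnames: one resolves to the replication NIC, one to the client NIC.
broker1 = BrokerMetadata(
    broker_id=1,
    endpoints={
        "broker": Endpoint("broker1-repl.example.internal", 9092),
        "client": Endpoint("broker1.example.internal", 9092),
    },
)

if __name__ == "__main__":
    print(advertised_endpoint(broker1, "broker"))
    print(advertised_endpoint(broker1, "client"))
```

The point of the sketch is only that the single advertised host mentioned above would become a lookup keyed by audience, which is the part described as harder than accepting connections on two interfaces.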


RE: Separate broker replication traffic from producer/consumer traffic

Posted by Otto Mok <Ot...@acuityads.com> on Wed, Mar 26, 2014 at 6:44 AM (time zone not stated).
Hi Jay,

We're pushing a lot of data from the producers (n) and have many consumers (3n) reading it.

We're configured to have a replication factor of 3, so replication traffic is about (2n).

Currently all traffic is on a single NIC, so that's about (6n) total.

Having the replication traffic on a different IP/NIC would reduce the bandwidth usage by 33%, down to (4n).
Or 50% more capacity for producers to push before hitting the NIC's cap (1 Gbps).

We're not quite at the cap yet, but would like to see if we can make use of the second NIC to give us more room in the primary NIC.

Thanks.

Otto out!
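
The arithmetic in this message can be checked with a short sketch. It follows the message's own simplified accounting, which sums produce, consume, and replication traffic against a single NIC cap and does not distinguish inbound from outbound. The function name `nic_load` and its parameters are made up for this example.

```python
def nic_load(n: float, consumer_factor: int = 3, replication_factor: int = 3,
             separate_replication: bool = False) -> float:
    """Traffic on the primary NIC, in multiples of the producer rate n."""
    produce = n
    consume = consumer_factor * n
    # Each message is copied to (replication_factor - 1) follower replicas.
    replicate = (replication_factor - 1) * n
    return produce + consume + (0.0 if separate_replication else replicate)


NIC_CAP_GBPS = 1.0  # the 1 Gbps cap mentioned in the message

shared = nic_load(1.0)                            # all traffic on one NIC
split = nic_load(1.0, separate_replication=True)  # replication moved to a second NIC

reduction = 1 - split / shared  # fraction of primary-NIC traffic removed
headroom = shared / split - 1   # extra producer rate before hitting the cap

if __name__ == "__main__":
    print(f"shared NIC load: {shared:.0f}n, split NIC load: {split:.0f}n")
    print(f"bandwidth reduction: {reduction:.0%}")
    print(f"extra producer capacity: {headroom:.0%}")
    print(f"max producer rate, shared: {NIC_CAP_GBPS / shared:.3f} Gbps")
    print(f"max producer rate, split:  {NIC_CAP_GBPS / split:.3f} Gbps")
```

Under these assumptions the primary NIC carries 6n before the split and 4n after, which matches the 33% reduction and the 50% extra producer capacity stated above.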

Re: Separate broker replication traffic from producer/consumer traffic

Posted by Jay Kreps <ja...@gmail.com> on March 25, 2014 at 6:22 PM (time zone not stated).
No, not at the moment. Are you seeing a problem that this would resolve?

-Jay

