Posted to users@kafka.apache.org by Christian Carollo <cc...@gmail.com> on 2012/07/16 23:32:50 UTC

Multiple Mirror Architecture Example

We are looking to implement a multi-mirror queue architecture and are wondering whether there are any known documents or examples. In general, we are looking to set up two Kafka clusters that replicate or mirror one another.

Christian

Re: Multiple Mirror Architecture Example

Posted by Joel Koshy <jj...@gmail.com>.
Unfortunately, no - the mirroring mechanism needs ZK to function. I think
some people have used plain rsync to maintain mirrors, but if there is a
strong case for this then we can think about implementing a less
full-fledged mirroring component that uses SimpleConsumer and talks to the
brokers directly without using ZK.

Joel


On Thu, Jul 26, 2012 at 11:49 AM, Christian Carollo <cc...@gmail.com> wrote:

> This configuration can be done without ZK, correct?

Re: Multiple Mirror Architecture Example

Posted by Christian Carollo <cc...@gmail.com>.
This configuration can be done without ZK, correct?

On Jul 26, 2012, at 11:43 AM, Joel Koshy <jj...@gmail.com> wrote:

> I see - yes that should be fine. i.e., as long as the topics are distinct
> then you can mirror both ways and the topic "owned" by each machine should
> be read-only on the other machine.
> 
> Thanks,
> 
> Joel


Re: Multiple Mirror Architecture Example

Posted by Joel Koshy <jj...@gmail.com>.
I see - yes that should be fine. i.e., as long as the topics are distinct
then you can mirror both ways and the topic "owned" by each machine should
be read-only on the other machine.

Thanks,

Joel

On Thu, Jul 26, 2012 at 11:38 AM, Christian Carollo <cc...@gmail.com> wrote:

> Hi Joel,
>
> Thanks for getting back to me.  I actually missed your response.
>
> What I am really trying to do is…
>
> Have two machines, each running a single broker (not clustered). One topic
> is homed on machine A and mirrored to machine B, and another topic is homed
> on machine B and mirrored to machine A.
>
> When machine A goes down, machine B can take over as owner of the topic
> that machine A previously owned.
>
> Is that clearer?
>
> Thanks
> Christian

Re: Multiple Mirror Architecture Example

Posted by Christian Carollo <cc...@gmail.com>.
Hi Joel,

Thanks for getting back to me.  I actually missed your response. 

What I am really trying to do is…

Have two machines, each running a single broker (not clustered). One topic is homed on machine A
and mirrored to machine B, and another topic is homed on machine B and mirrored to machine A.

When machine A goes down, machine B can take over as owner of the topic that machine A previously owned.

Is that clearer?

Thanks
Christian
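
Editor's sketch of the ownership bookkeeping described above, combined with the read-only rule discussed elsewhere in this thread. It is a toy model only: the class and method names are hypothetical, nothing here is a Kafka API, and the thread does not say how a failure would be detected or how producers would be redirected. Messages not yet mirrored when a machine fails would presumably be missing on the survivor.

```python
class TopicOwnership:
    """Toy model: each topic has one owning machine; other machines hold read-only mirrors."""

    def __init__(self, owners):
        # owners maps topic name -> name of the machine that accepts writes for it
        self.owners = dict(owners)

    def can_write(self, machine, topic):
        # Only the owner accepts writes; the mirrored copy elsewhere is read-only.
        return self.owners[topic] == machine

    def fail_over(self, failed, survivor):
        """Reassign every topic owned by `failed` to `survivor`; return the moved topics."""
        moved = sorted(t for t, m in self.owners.items() if m == failed)
        for topic in moved:
            self.owners[topic] = survivor
        return moved


ownership = TopicOwnership({"topicA": "machineA", "topicB": "machineB"})
moved = ownership.fail_over("machineA", "machineB")
```

After the fail-over call, machineB is the writer for both topics in this model.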

On Jul 18, 2012, at 10:21 AM, Joel Koshy <jj...@gmail.com> wrote:

> Missed this thread - can you elaborate a bit more on your use case? I'm
> assuming you are referring to the inter-cluster (mirroring) feature (and
> not replication). It isn't really possible for two clusters to mirror each
> other - as the mirroring would never end, at least the way it is
> implemented right now. If you want the aggregate data from all your
> data-centers to be available in each data-center there are some topologies
> that may work for you.
> 
> Joel


Re: Multiple Mirror Architecture Example

Posted by Jun Rao <ju...@gmail.com>.
Both lists should support wildcard.

Thanks,

Jun

On Thu, Jul 26, 2012 at 10:09 AM, Riju Kallivalappil <
riju.kallivalappil@corp.247customer.com> wrote:

> The problem I have with no wildcard or regex support in whitelist/blacklist
> is the following (I'm fine with not having the ability to specify both
> whitelist and blacklist at the same time):
>
> Let's say I have producers in DC1 writing to topic A and the ones in DC2
> writing to topic B. I can have a mirroring setup for this specific
> scenario. Now, let's say we have a whole set of new producers who want to
> write to topic C in DC1. This will require us to manually change the
> whitelist configuration in the mirroring setup. Whereas, if we had wildcard
> support, we could have done something like the following:
>
> Have all producers in DC1 write to topic names prefixed with "dc1:" and all
> producers in DC2 write to topics with "dc2:" prefix. If whitelist/blacklist
> supports wildcards we could have just specified something like "dc1:*" for
> whitelist and then not worry about producers creating new topics as long as
> they stick to the naming convention.

Re: Multiple Mirror Architecture Example

Posted by Riju Kallivalappil <ri...@corp.247customer.com>.
The problem I have with no wildcard or regex support in whitelist/blacklist
is the following (I'm fine with not having the ability to specify both
whitelist and blacklist at the same time):

Let's say I have producers in DC1 writing to topic A and the ones in DC2
writing to topic B. I can have a mirroring setup for this specific
scenario. Now, let's say we have a whole set of new producers who want to
write to topic C in DC1. This will require us to manually change the
whitelist configuration in the mirroring setup. Whereas, if we had wildcard
support, we could have done something like the following:

Have all producers in DC1 write to topic names prefixed with "dc1:" and all
producers in DC2 write to topics with "dc2:" prefix. If whitelist/blacklist
supports wildcards we could have just specified something like "dc1:*" for
whitelist and then not worry about producers creating new topics as long as
they stick to the naming convention.
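
Editor's sketch of the naming-convention idea above. It uses Python's standard fnmatch module to apply a glob-style pattern such as "dc1:*" to topic names, and it enforces the one-list-at-a-time restriction mentioned at the top of this message. The function name select_topics and the topic names are hypothetical; the pattern syntax actually accepted by the mirroring tool may differ, so treat this as an illustration of the filtering logic rather than of Kafka's configuration.

```python
import fnmatch


def select_topics(topics, whitelist=None, blacklist=None):
    """Return the topics a mirror would copy under one glob-style list.

    Exactly one of `whitelist` or `blacklist` may be given.
    """
    if whitelist is not None and blacklist is not None:
        raise ValueError("specify either a whitelist or a blacklist, not both")
    if whitelist is not None:
        # Keep topics matching at least one whitelist pattern.
        return [t for t in topics
                if any(fnmatch.fnmatchcase(t, p) for p in whitelist)]
    if blacklist is not None:
        # Keep topics matching none of the blacklist patterns.
        return [t for t in topics
                if not any(fnmatch.fnmatchcase(t, p) for p in blacklist)]
    return list(topics)


topics = ["dc1:clicks", "dc1:orders", "dc2:logs"]
mirrored_to_dc2 = select_topics(topics, whitelist=["dc1:*"])
```

With a prefix pattern like this, a new topic such as "dc1:signups" would be picked up without editing the list, which is the point of the convention.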

On Thu, Jul 26, 2012 at 9:58 AM, Jun Rao <ju...@gmail.com> wrote:

> Currently, you can't specify both whitelist/blacklist in an embedded
> consumer, but you can specify one of them. Is that not enough for you? And
> you can always run multiple instances of mirror maker, if necessary.
>
> Thanks,
>
> Jun

Re: Multiple Mirror Architecture Example

Posted by Jun Rao <ju...@gmail.com>.
Currently, you can't specify both whitelist/blacklist in an embedded
consumer, but you can specify one of them. Is that not enough for you? And
you can always run multiple instances of mirror maker, if necessary.

Thanks,

Jun

On Thu, Jul 26, 2012 at 9:39 AM, Riju Kallivalappil <
riju.kallivalappil@corp.247customer.com> wrote:

> Thanks Jun.
>
> On a related note, is it possible to specify wildcards for
> whitelist/blacklist in the mirroring setup? The mirroring wiki [1] says
> that it is not supported, but I thought I'd double-check just in case.
>
> Are there any technical reasons why this cannot be supported? It'll go a
> long way in helping with two-way mirroring setups. Otherwise, one will be
> able to work only with a fixed set of topics.
>
> [1]  https://cwiki.apache.org/confluence/display/KAFKA/Kafka+mirroring

Re: Multiple Mirror Architecture Example

Posted by Riju Kallivalappil <ri...@corp.247customer.com>.
Thanks Jun.

On a related note, is it possible to specify wildcards for
whitelist/blacklist in the mirroring setup? The mirroring wiki [1] says
that it is not supported, but I thought I'd double-check just in case.

Are there any technical reasons why this cannot be supported? It'll go a
long way in helping with two-way mirroring setups. Otherwise, one will be
able to work only with a fixed set of topics.

[1]  https://cwiki.apache.org/confluence/display/KAFKA/Kafka+mirroring

On Thu, Jul 26, 2012 at 8:27 AM, Jun Rao <ju...@gmail.com> wrote:

> Riju,
>
> This would be fine since data is initially generated in one DC.
>
> Thanks,
>
> Jun

Re: Multiple Mirror Architecture Example

Posted by Jun Rao <ju...@gmail.com>.
Riju,

This would be fine since data is initially generated in one DC.

Thanks,

Jun

On Wed, Jul 25, 2012 at 5:54 PM, Riju Kallivalappil <
riju.kallivalappil@corp.247customer.com> wrote:

> > It isn't really possible for two clusters to mirror each
> > other - as the mirroring would never end
>
> Are there ways to work around this? For example, let us say that I have two
> clusters (cluster1 in DC1 and cluster2 in DC2). And let's say that all
> producers in DC1 write to topic A in cluster1 and all producers in DC2
> write to topic B in cluster2. Isn't it possible to have a mirroring setup
> where cluster1 mirrors topic B from cluster2 and cluster2 mirrors topic A
> from cluster1? If this is possible, then we can have consumers in both DCs
> consume from topics A and B, and pretty much have two-way mirroring.
>

Re: Multiple Mirror Architecture Example

Posted by Riju Kallivalappil <ri...@corp.247customer.com>.
> It isn't really possible for two clusters to mirror each
> other - as the mirroring would never end

Are there ways to work around this? For example, let us say that I have two
clusters (cluster1 in DC1 and cluster2 in DC2). And let's say that all
producers in DC1 write to topic A in cluster1 and all producers in DC2
write to topic B in cluster2. Isn't it possible to have a mirroring setup
where cluster1 mirrors topic B from cluster2 and cluster2 mirrors topic A
from cluster1? If this is possible, then we can have consumers in both DCs
consume from topics A and B, and pretty much have two-way mirroring.
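
Editor's sketch of why the quoted concern arises and why the per-topic split above avoids it. This is a toy in-memory simulation, not Kafka code: Cluster, mirror, and run are hypothetical names, and a "cluster" is just a list of messages per topic. Each mirror remembers how far it has read in its source and copies anything newer to the target.

```python
from collections import defaultdict


class Cluster:
    """Toy stand-in for a cluster: an append-only message list per topic."""

    def __init__(self, name):
        self.name = name
        self.logs = defaultdict(list)

    def produce(self, topic, message):
        self.logs[topic].append(message)


def mirror(source, target, topics, offsets):
    """Copy not-yet-mirrored messages for `topics` from source to target.

    `offsets` records how far this mirror has read in each source topic.
    Returns the number of messages copied in this pass.
    """
    copied = 0
    for topic in topics:
        log = source.logs[topic]
        start = offsets.get(topic, 0)
        for message in log[start:]:
            target.produce(topic, message)
            copied += 1
        offsets[topic] = len(log)
    return copied


def run(topics_1_to_2, topics_2_to_1, passes=5):
    """Produce one message per cluster, then mirror both ways for several passes."""
    c1, c2 = Cluster("cluster1"), Cluster("cluster2")
    c1.produce("A", "a0")  # DC1 producers write only to topic A
    c2.produce("B", "b0")  # DC2 producers write only to topic B
    off_12, off_21 = {}, {}
    copied_per_pass = []
    for _ in range(passes):
        n = mirror(c1, c2, topics_1_to_2, off_12)
        n += mirror(c2, c1, topics_2_to_1, off_21)
        copied_per_pass.append(n)
    return copied_per_pass


same_topics_both_ways = run(["A", "B"], ["A", "B"])
disjoint_topics = run(["A"], ["B"])
```

In this model, mirroring the same topics in both directions keeps copying messages on every pass, because each side sees the other's copies as new data. With topic A flowing only from cluster1 to cluster2 and topic B only the other way, the copying stops after the first pass.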

Re: Multiple Mirror Architecture Example

Posted by Joel Koshy <jj...@gmail.com>.
Missed this thread - can you elaborate a bit more on your use case? I'm
assuming you are referring to the inter-cluster (mirroring) feature (and
not replication). It isn't really possible for two clusters to mirror each
other - as the mirroring would never end, at least the way it is
implemented right now. If you want the aggregate data from all your
data-centers to be available in each data-center there are some topologies
that may work for you.

Joel

On Mon, Jul 16, 2012 at 2:32 PM, Christian Carollo <cc...@gmail.com> wrote:

> We are looking to implement a multi-mirror queue architecture and are
> wondering whether there are any known documents or examples. In general, we are
> looking to set up two Kafka clusters that replicate or mirror one another.
>
> Christian