Posted to users@kafka.apache.org by Jack Foy <jf...@hiya.com> on 2017/03/06 22:50:57 UTC

MirrorMaker and producers

Hey, all. Is there any general guidance around using mirrored topics
in the context of a cluster migration?

We're moving operations from one data center to another, and we want
to stream mirrored data from the old cluster to the new, migrate
consumers, then migrate producers.

Our basic question is whether it's safe for us to commingle mirrored
and directly-produced data in the same topic, even serially. In other
words, is the following procedure safe? Why or why not?

- Data is produced to topic T on cluster A
- Topic T is mirrored to cluster B
- Consumers run against T on cluster B
- Producers gradually migrate from A to B

We've found the following, which seems to suggest no, but doesn't
address the point directly:
http://events.linuxfoundation.org/sites/events/files/slides/Kafka%20At%20Scale.pdf

-- 
Jack Foy <jf...@hiya.com>

Re: MirrorMaker and producers

Posted by Todd Palino <tp...@gmail.com>.
For this type of use case, there’s no problem with mirroring and producing
into the same topic. Kafka can handle it just fine, and as long as your
consumers are OK with the intermingled data (for example, knowing that
records for a given key may not arrive in time order while both paths are
active), it will work properly.
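
To illustrate the ordering caveat, here is a minimal sketch in plain Python
(no Kafka client involved). It assumes that mirrored and directly-produced
records for the same key land in the same partition on cluster B, and that
mirroring adds a fixed lag. The names Record, arrival_order, out_of_order_keys
and mirror_lag are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str
    created_at: int  # tick at which the producer created the record
    source: str      # "mirrored" (produced to A, copied to B) or "direct" (produced to B)


def arrival_order(records, mirror_lag):
    """Return records in the order they would be appended on cluster B.

    Assumption: mirrored records arrive mirror_lag ticks after creation,
    while direct records arrive immediately.
    """
    def arrival(r):
        return r.created_at + (mirror_lag if r.source == "mirrored" else 0)
    return sorted(records, key=arrival)


def out_of_order_keys(log):
    """Return the keys whose records are not in creation-time order in the log."""
    newest = {}
    bad = set()
    for r in log:
        if r.key in newest and r.created_at < newest[r.key]:
            bad.add(r.key)
        newest[r.key] = max(newest.get(r.key, r.created_at), r.created_at)
    return bad


records = [
    Record("user-1", 10, "mirrored"),  # produced to A just before the producer moved
    Record("user-1", 12, "direct"),    # same key, produced to B after the move
    Record("user-2", 11, "mirrored"),
]

print(out_of_order_keys(arrival_order(records, mirror_lag=5)))
print(out_of_order_keys(arrival_order(records, mirror_lag=0)))
```

With a lag of 5 ticks the older mirrored record for user-1 is appended after
the newer direct one, so a consumer reading that key sees it out of time
order. With no lag the log stays in creation order. Real mirroring lag varies,
so treat this only as a way to reason about what consumers need to tolerate
during the migration window.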

I’ve often railed against producing into clusters and topics that you are
mirroring into, but this is largely because those clusters are designed to
be aggregates of other clusters. If you produce directly to the aggregate
cluster, it no longer matches the other aggregate clusters. But the
migration use case is different from that.

-Todd


-- 
*Todd Palino*
Staff Site Reliability Engineer
Data Infrastructure Streaming



linkedin.com/in/toddpalino