Posted to users@kafka.apache.org by UMESH CHAUDHARY <um...@gmail.com> on 2016/08/25 08:36:37 UTC

Question regarding functionality of MirrorMaker

Hey Folks,
I was trying to understand the behavior of MirrorMaker, but it looks like I
am missing something. Here are the steps I performed:

1) I configured MM on the source Kafka cluster.
2) Created a topic and pushed some data into it using the console producer.
3) My understanding is that MM would start mirroring the data (which is
already in the topic) based on "offsetCommitIntervalMs", and that it would
then be in the destination cluster.

https://github.com/apache/kafka/blob/0.9.0/core/src/main/scala/kafka/tools/MirrorMaker.scala#L503

4) But when I list the topics on the destination, I can't see the topic which
I recently created on the source.
5) I tried to check the offset of "mirrormaker_group" for that topic (on the
source cluster) using kafka.admin.ConsumerGroupCommand, and I see the offsets
for that topic as "unknown".
6) But when I start a console consumer for that topic on the source or the
destination (auto creation of topics is true), I see that all data is being
mirrored via MM, and kafka.admin.ConsumerGroupCommand reports the right
offsets this time.

Is this expected behavior of MM or did I mess up with some configuration?
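
For anyone repeating step 5, a minimal sketch of the offset check is given here. It assumes the 0.9-era tool, whose --new-consumer and --bootstrap-server options apply to groups using the new consumer. The install path is the one used elsewhere in this thread, and the broker address is a hypothetical placeholder. Offsets shown as "unknown" would be consistent with a group that has not committed any offsets for that topic yet.

```shell
# Assumed install path (same as elsewhere in this thread) and a
# hypothetical source-cluster broker address; adjust both.
KAFKA_HOME="${KAFKA_HOME:-/opt/kafka}"
SOURCE_BOOTSTRAP="${SOURCE_BOOTSTRAP:-source-broker1:9092}"

# Describe a consumer group on the source cluster.
# Defined as a function so nothing runs until it is called.
describe_group() {
  "$KAFKA_HOME/bin/kafka-run-class.sh" kafka.admin.ConsumerGroupCommand \
    --new-consumer --bootstrap-server "$SOURCE_BOOTSTRAP" \
    --describe --group "$1"
}

# Usage (needs a reachable cluster):
#   describe_group mirrormaker_group
```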

Regards,
Umesh

Re: Question regarding functionality of MirrorMaker

Posted by UMESH CHAUDHARY <um...@gmail.com>.
Hello cs,
Apologies for the delayed response.
I found one topic in my Kafka environment which had no leader and no replicas.
That was pretty weird, and I am not sure what caused it.

Because of this topic, MirrorMaker was hanging and printing messages like
"No leader found for topic ..". Because of this hang, MM was not able to
replicate the other topics.

When I deleted that zombie topic, MM worked as expected.
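
A rough sketch of how such a leaderless topic could be spotted, assuming the stock kafka-topics.sh tool and its --unavailable-partitions option. The ZooKeeper address is a hypothetical placeholder, and the helper that extracts topic names assumes the usual "Topic: <name>" layout of the describe output, which may differ between versions.

```shell
# Assumed install path (from this thread) and a hypothetical ZooKeeper address.
KAFKA_HOME="${KAFKA_HOME:-/opt/kafka}"
ZK="${ZK:-source-zk1:2181}"

# Print only the partitions whose leader is not available.
list_leaderless() {
  "$KAFKA_HOME/bin/kafka-topics.sh" --describe --zookeeper "$ZK" \
    --unavailable-partitions
}

# Reduce describe output (read from stdin) to a unique list of topic names.
# Assumes each partition line contains "Topic: <name>".
leaderless_topics() {
  sed -n 's/.*Topic:[[:space:]]*\([^[:space:]]*\).*/\1/p' | sort -u
}

# Usage (needs a reachable cluster):
#   list_leaderless | leaderless_topics
```

Any topic printed by the pipeline has at least one partition without a leader and is worth inspecting before restarting MirrorMaker.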

Regards,
Umesh Chaudhary

On Fri, 26 Aug 2016 at 12:40 cs user <ac...@gmail.com> wrote:

> Hi Umesh,
>
> I haven't had that problem; it seems to work fine for me. The only issue I
> found, which kind of makes sense, is that it doesn't mirror existing topics
> immediately, only when messages are first sent to the topic after mirror
> maker connects. It doesn't start from the first offset available, only the
> current one.
>
> However, once you start sending messages it seems to subscribe to them fine
> and they get created on the mirror maker cluster. The same goes for new
> topics which are created on the source cluster; they seem to come over fine.
>
> The only thing I can think of is that you have disabled auto topic creation
> on the mirror maker cluster, so that mirror maker is unable to create them
> automatically? But then it wouldn't be able to create the existing topics
> either, so that doesn't make sense.
>
> Are there any error messages in your mirror maker logs or on the mirror
> maker cluster which point to what the issue might be?
>
> Other than the bootstrap servers, my producer settings look as follows:
>
> producer.type=async
> compression.codec=0
> serializer.class=kafka.serializer.DefaultEncoder
> max.message.size=10000000
> queue.time=1000
> queue.enqueueTimeout.ms=-1
>
>
>
> Cheers!

Re: Question regarding functionality of MirrorMaker

Posted by cs user <ac...@gmail.com>.
Hi Umesh,

I haven't had that problem; it seems to work fine for me. The only issue I
found, which kind of makes sense, is that it doesn't mirror existing topics
immediately, only when messages are first sent to the topic after mirror
maker connects. It doesn't start from the first offset available, only the
current one.
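
Starting from "only the current one" is consistent with the consumer's auto.offset.reset setting, which for the new consumer defaults to latest when the group has no committed offsets. A minimal sketch of changing that, assuming the --new.consumer mode used in this thread; the file location below is a throwaway placeholder, not the real consumer config path.

```shell
# Placeholder file; point CONSUMER_CONFIG at the real consumer properties
# file (e.g. the one passed via --consumer.config) when using this.
CONSUMER_CONFIG="${CONSUMER_CONFIG:-$(mktemp)}"

# New consumer values: earliest | latest (default) | none.
# The old ZooKeeper-based consumer uses smallest | largest instead.
echo "auto.offset.reset=earliest" >> "$CONSUMER_CONFIG"
```

This only takes effect for a group without committed offsets; a group that has already committed resumes from its committed position.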

However, once you start sending messages it seems to subscribe to them fine
and they get created on the mirror maker cluster. The same goes for new topics
which are created on the source cluster; they seem to come over fine.

The only thing I can think of is that you have disabled auto topic creation on
the mirror maker cluster, so that mirror maker is unable to create them
automatically? But then it wouldn't be able to create the existing topics
either, so that doesn't make sense.

Are there any error messages in your mirror maker logs or on the mirror
maker cluster which point to what the issue might be?

Other than the bootstrap servers, my producer settings look as follows:

producer.type=async
compression.codec=0
serializer.class=kafka.serializer.DefaultEncoder
max.message.size=10000000
queue.time=1000
queue.enqueueTimeout.ms=-1



Cheers!




On Fri, Aug 26, 2016 at 6:08 AM, UMESH CHAUDHARY <um...@gmail.com>
wrote:

> Hello Mate,
> Thanks for your detailed response and it surely helps.
>
> A whitelist is a required config for MM from 0.9.0 onwards. And you are
> correct that --new.consumer requires bootstrap.servers rather than the
> ZooKeeper config.
>
> However, did you notice that MM picks up only the topics which are present
> at the time of its startup and mirrors their data? When you add new topics
> after its startup, it doesn't pick them up automatically.
>
> Regards,
> Umesh Chaudhary

Re: Question regarding functionality of MirrorMaker

Posted by UMESH CHAUDHARY <um...@gmail.com>.
Hello Mate,
Thanks for your detailed response and it surely helps.

A whitelist is a required config for MM from 0.9.0 onwards. And you are
correct that --new.consumer requires bootstrap.servers rather than the
ZooKeeper config.

However, did you notice that MM picks up only the topics which are present at
the time of its startup and mirrors their data? When you add new topics
after its startup, it doesn't pick them up automatically.

Regards,
Umesh Chaudhary

On Thu, 25 Aug 2016 at 19:23 cs user <ac...@gmail.com> wrote:

> Hi Umesh,
>
> I am new to Kafka as well, and to configuring MirrorMaker. I got mine
> working in the following way.
>
> I run the mirror maker instance on the mirror cluster, i.e. where you want
> to mirror the topics to, although I'm not sure it matters.
>
> I use the following options when starting my service (systemd file):
>
> KAFKA_RUN="/opt/kafka/bin/kafka-run-class.sh"
> KAFKA_ARGS="kafka.tools.MirrorMaker"
> KAFKA_CONFIG="--new.consumer --offset.commit.interval.ms=5000
> --consumer.config /opt/kafka/config/consumer-mirror1.properties
> --producer.config /opt/kafka/config/producer.properties --whitelist=\".*\""
>
> Without the --new.consumer parameter, the --consumer.config and
> --producer.config files need to contain the zookeeper config for relevant
> clusters. When using the --new.consumer switch this is no longer needed (as
> I understand it).
>
> The consumer config points at my source cluster, the producer config points
> locally, to my mirror cluster. I think it's also important to configure the
> whitelist to tell it which topics you want to mirror; in my case I mirror
> all of them with a wildcard.
>
> There is not much config in the consumer.config and producer.config files
> apart from the bootstrap.servers list, pointing at the relevant cluster. I
> have 3 brokers in my mirror cluster and each one of them runs the same
> mirror maker service, so one will take over if another one fails.
>
> I hope someone will correct me if I am wrong about anything, and hopefully
> this will help!
>
> Cheers!

Re: Question regarding functionality of MirrorMaker

Posted by cs user <ac...@gmail.com>.
Hi Umesh,

I am new to Kafka as well, and to configuring MirrorMaker. I got mine
working in the following way.

I run the mirror maker instance on the mirror cluster, i.e. where you want
to mirror the topics to, although I'm not sure it matters.

I use the following options when starting my service (systemd file):

KAFKA_RUN="/opt/kafka/bin/kafka-run-class.sh"
KAFKA_ARGS="kafka.tools.MirrorMaker"
KAFKA_CONFIG="--new.consumer --offset.commit.interval.ms=5000
--consumer.config /opt/kafka/config/consumer-mirror1.properties
--producer.config /opt/kafka/config/producer.properties --whitelist=\".*\""

Without the --new.consumer parameter, the --consumer.config and
--producer.config files need to contain the zookeeper config for relevant
clusters. When using the --new.consumer switch this is no longer needed (as
I understand it).

The consumer config points at my source cluster, the producer config points
locally, to my mirror cluster. I think it's also important to configure the
whitelist to tell it which topics you want to mirror; in my case I mirror
all of them with a wildcard.

There is not much config in the consumer.config and producer.config files
apart from the bootstrap.servers list, pointing at the relevant cluster. I
have 3 brokers in my mirror cluster and each one of them runs the same
mirror maker service, so one will take over if another one fails.
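
As a rough illustration of the two files described above, the sketch below writes a minimal consumer and producer config. The broker host names and the output directory are hypothetical placeholders; the group name is the one mentioned in the original question, and the file names follow the paths in KAFKA_CONFIG.

```shell
# Hypothetical output directory; in the setup above the files live in
# /opt/kafka/config.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Consumer side: reads from the SOURCE cluster (host names are placeholders).
cat > "$CONF_DIR/consumer-mirror1.properties" <<'EOF'
bootstrap.servers=source-broker1:9092,source-broker2:9092
group.id=mirrormaker_group
EOF

# Producer side: writes to the local MIRROR cluster (host names are placeholders).
cat > "$CONF_DIR/producer.properties" <<'EOF'
bootstrap.servers=mirror-broker1:9092,mirror-broker2:9092
EOF
```

Running every MirrorMaker instance with the same group.id is what lets the remaining instances take over the partitions of one that fails.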

I hope someone will correct me if I am wrong about anything, and hopefully
this will help!

Cheers!







On Thu, Aug 25, 2016 at 9:36 AM, UMESH CHAUDHARY <um...@gmail.com>
wrote:

> Hey Folks,
> I was trying to understand the behavior of MirrorMaker, but it looks like I
> am missing something. Here are the steps I performed:
>
> 1) I configured MM on the source Kafka cluster.
> 2) Created a topic and pushed some data into it using the console producer.
> 3) My understanding is that MM would start mirroring the data (which is
> already in the topic) based on "offsetCommitIntervalMs", and that it would
> then be in the destination cluster.
>
> https://github.com/apache/kafka/blob/0.9.0/core/src/main/scala/kafka/tools/MirrorMaker.scala#L503
>
> 4) But when I list the topics on the destination, I can't see the topic
> which I recently created on the source.
> 5) I tried to check the offset of "mirrormaker_group" for that topic (on the
> source cluster) using kafka.admin.ConsumerGroupCommand, and I see the
> offsets for that topic as "unknown".
> 6) But when I start a console consumer for that topic on the source or the
> destination (auto creation of topics is true), I see that all data is being
> mirrored via MM, and kafka.admin.ConsumerGroupCommand reports the right
> offsets this time.
>
> Is this expected behavior of MM or did I mess up with some configuration?
>
> Regards,
> Umesh
>