Posted to users@kafka.apache.org by "Carey, Lynda" <Ly...@Six3Systems.com> on 2011/12/12 20:57:47 UTC

Kafka Mirroring Questions

Hi!  I've been tasked with setting up a Kafka cluster and mirroring.  I've set up a small cluster of just two machines.  I am running ZooKeeper, two brokers, 1 producer and 3 consumers.  Each consumer is in its own group and all are reading from the same topic.  With this setup, I'm using the simple console producer, which reads from the command line and writes to Kafka.  The consumers read the messages fine.

Now, I need to set up a mirroring cluster.  I've read the wiki document about Kafka mirroring and it leaves me a little confused.  I understand that the mirror cluster uses an embedded consumer to read from the source cluster and writes the messages to Kafka on the mirrored cluster.  What's not clear to me is what else needs to be established on the mirrored cluster.  Do I only need one embedded consumer and producer on the mirror to get all the messages (regardless of topic/broker/etc.)?  Or do I need a pair of embedded consumer and producer for each topic?  What is reading the messages on the mirrored cluster?  Do I need to deploy the same consumers there as on the source cluster?  Is there any other documentation on this?

Any information you can give me would be awesome.  I'm just not getting it from the documentation alone.

Thanks
_______________________________
Lynda Carey, Software Engineer
Six3 Systems, Enterprise Systems Division

This e-mail message is for the sole use of the intended recipient(s) and may contain Six3 Systems Private or Six3 Systems Proprietary information. Any unauthorized review, use, disclosure, or distribution is prohibited. If you are not an intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.


Re: Kafka Mirroring Questions

Posted by Tim Lossen <ti...@lossen.de>.
lynda,

another possibility is to simply rsync the segment files
periodically to another machine, and fire up a "standby"
kafka there if the main one fails.

we use this setup on 0.6 in production, and it works quite
well (but you can lose a few messages, of course).
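
A minimal sketch of the periodic rsync approach described above, not the actual production setup. The log directory, standby host name, and interval are hypothetical placeholders, and it assumes rsync is installed and the standby machine is reachable over SSH. Only rsync's -a (archive) flag is used.

```python
import subprocess
import time

# Hypothetical values for illustration only; substitute the broker's
# log directory and the standby machine's address.
LOG_DIR = "/tmp/kafka-logs/"
STANDBY = "standby-host:/tmp/kafka-logs/"
INTERVAL_SECONDS = 60


def build_rsync_command(source, destination):
    """Return the rsync argv that copies segment files to the standby."""
    # -a (archive) recurses into the topic directories and preserves
    # timestamps and permissions.
    return ["rsync", "-a", source, destination]


def sync_forever(source, destination, interval):
    """Copy the log directory to the standby at a fixed interval."""
    while True:
        # check=False: a failed pass is simply retried on the next one.
        subprocess.run(build_rsync_command(source, destination), check=False)
        time.sleep(interval)
```

Calling sync_forever(LOG_DIR, STANDBY, INTERVAL_SECONDS) starts the loop. Anything written after the last completed pass is missing on the standby, which is where the "few messages" can be lost.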

cheers
tim


--
http://tim.lossen.de




Re: Kafka Mirroring Questions

Posted by Neha Narkhede <ne...@gmail.com>.
>> However, when I try to submit three properties files, I get a usage
>> error (see below).

The wiki doesn't clearly state this, but it is written for Kafka v0.7 (to
be released soon). The setup instructions do not work with 0.6, because in
that version the mirror didn't use a producer to write the data; it merely
used internal Kafka APIs to write to the local broker.

We will update the wiki to reflect that. For now, would it work for you to
try this out on trunk? Another option is using RC7 from branch
kafka/branches/0.7.

Thanks,
Neha

RE: Kafka Mirroring Questions

Posted by "Carey, Lynda" <Ly...@Six3Systems.com>.
So, I've read all the documents that were suggested and now I'm trying to follow the instructions for "How to set up a Mirror" on this site: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+mirroring.  The instructions state that you provide the broker with the 3 properties files.  However, when I try to submit three properties files, I get a usage error (see below).  

[kafka %]  JMX_PORT=8888 bin/kafka-server-start.sh config/mirror-server.properties config/mirror-consumer.properties config/mirror-producer.properties

USAGE: java [options] KafkaServer server.properties [consumer.properties


I am using kafka-0.6, downloaded from http://sna-projects.com/kafka/downloads.php


_______________________________
Lynda Carey, Software Engineer
Six3 Systems, Enterprise Systems Division
301-206-6000 (Office)
410-262-4942(Cell)

Re: Kafka Mirroring Questions

Posted by Jay Kreps <ja...@gmail.com>.
Hi Olivier,

No, the mirrored cluster is essentially a completely different cluster. It
may have a different number of partitions or servers, and there is no
correspondence between offsets.
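
A toy model of why the offsets do not line up, offered as a sketch rather than a description of Kafka's internals. ToyCluster, its round-robin partitioning, and the use of list positions as offsets are all assumptions made for illustration; nothing here calls a real Kafka API.

```python
class ToyCluster:
    """Toy cluster: each partition is an append-only list of messages."""

    def __init__(self, num_partitions):
        self.logs = [[] for _ in range(num_partitions)]
        self.next_partition = 0

    def append(self, message):
        """Append round-robin and return (partition, offset)."""
        partition = self.next_partition
        self.next_partition = (partition + 1) % len(self.logs)
        log = self.logs[partition]
        offset = len(log)  # position of the message within this partition
        log.append(message)
        return partition, offset


# The source and the mirror are sized independently.
source = ToyCluster(num_partitions=2)
mirror = ToyCluster(num_partitions=3)

messages = ["m0", "m1", "m2", "m3", "m4", "m5"]
source_pos = {m: source.append(m) for m in messages}

# The mirror reads each source partition and re-produces every message,
# so the mirror assigns its own partition and offset to each one.
mirror_pos = {}
for log in source.logs:
    for m in log:
        mirror_pos[m] = mirror.append(m)

for m in messages:
    print(m, "source", source_pos[m], "mirror", mirror_pos[m])
```

Every message reaches the mirror, but a consumer that saved a (partition, offset) pair from the source cannot reuse it against the mirror.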

-Jay

Re: Kafka Mirroring Questions

Posted by Olivier Pomel <ol...@datadoghq.com>.
One more question: will the offsets for individual messages in the
master and mirror always be the same? In other words, would a failover
be completely transparent to consumers that may persist state linked
to specific offsets?
Thanks,
Olivier.

Re: Kafka Mirroring Questions

Posted by Jun Rao <ju...@gmail.com>.
You can find more information in the patch attached to this JIRA:
https://issues.apache.org/jira/browse/KAFKA-199

Thanks,

Jun
