Posted to users@kafka.apache.org by Murtaza Doctor <mu...@richrelevance.com> on 2012/08/01 00:01:45 UTC

Deployment Options for Kafka

Hello Folks,

We are trying to work out the Kafka deployment topology and need some feedback from the community:

Kafka version: 0.7 (under test), with ZooKeeper maintaining offsets.

At present, we have multiple live data centers (DCs), just like LinkedIn, and a single backend data center. All the live colos are identical in capacity and can serve the same functionality. We want to stream events from each of these live DCs to the backend DC. As I see it, we have three options:

  1.  Have Kafka brokers in each of the live DCs, and have Kafka brokers in the backend DC in a mirrored setup as described at https://cwiki.apache.org/confluence/display/KAFKA/Kafka+mirroring (the page Jay has put together). This probably mimics what LinkedIn has; please correct me if I am wrong.
  2.  Have a set of Kafka brokers in our backend DC and just have the producers send their messages to those brokers asynchronously. This option has a lot of gaping holes, specifically if the link goes down: in that event we will lose messages, particularly in the absence of message replication, which will be supported in 0.8. [Note: we are going to production with 0.7.]
  3.  Have brokers in each of the live data centers and then just have consumers that read from each of those brokers, one after another or in parallel, from their respective offsets. No mirroring setup; the consumers are responsible for reading the messages.
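
As a rough illustration of option 3, the sketch below models a backend reader that pulls from every live DC in parallel and keeps a separate offset per DC. It is not Kafka code: the in-memory lists stand in for the brokers in each live DC, and the DC names, `read_from`, and `poll_all` are all hypothetical names made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the brokers in each live DC:
# one append-only list of messages per DC.
LIVE_DC_LOGS = {
    "dc-east": ["e0", "e1", "e2"],
    "dc-west": ["w0", "w1"],
}


def read_from(dc, log, offset, max_messages=100):
    """Return (dc, batch, next_offset) for one DC, starting at its stored offset."""
    batch = log[offset:offset + max_messages]
    return dc, batch, offset + len(batch)


def poll_all(logs, offsets):
    """Read every live DC in parallel; each DC advances its own offset."""
    with ThreadPoolExecutor(max_workers=len(logs)) as pool:
        futures = [
            pool.submit(read_from, dc, log, offsets.get(dc, 0))
            for dc, log in logs.items()
        ]
        results = [f.result() for f in futures]

    received = {}
    for dc, batch, next_offset in results:
        received[dc] = batch
        # Advance the offset only once the batch has been received.
        offsets[dc] = next_offset
    return received


offsets = {}
first = poll_all(LIVE_DC_LOGS, offsets)   # reads everything available so far
LIVE_DC_LOGS["dc-west"].append("w2")      # a new event arrives in one DC
second = poll_all(LIVE_DC_LOGS, offsets)  # only the new event is returned
```

The point of the sketch is that the backend has to track one offset per live DC, and a slow or unreachable DC only delays its own stream. In a real deployment the offsets would live in ZooKeeper rather than in a local dict.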

We would love to hear feedback on each of these deployment strategies and their pros and cons, and about any other mechanisms to support this that I have missed.

Thanks,
murtaza

Re: Deployment Options for Kafka

Posted by Joel Koshy <jj...@gmail.com>.
Just updated the old wiki with this note.

Thanks,

Joel

On Wed, Aug 1, 2012 at 10:52 AM, Jun Rao <ju...@gmail.com> wrote:

> The biggest change is wildcard support in the consumer. In 0.7.1, mirror maker
> replaces the embedded consumer.
>
> Thanks,
>
> Jun

Re: Deployment Options for Kafka

Posted by Jun Rao <ju...@gmail.com>.
The biggest change is wildcard support in the consumer. In 0.7.1, mirror maker
replaces the embedded consumer.
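
To give a feel for what wildcard support means for mirroring, here is a minimal sketch of whitelist-style topic selection. It is not Kafka's implementation. It assumes a topic is selected when its whole name matches a regular expression, and the topic names and the `select_topics` helper are hypothetical.

```python
import re


def select_topics(all_topics, whitelist):
    """Return, in sorted order, the topics whose full name matches the whitelist regex."""
    pattern = re.compile(whitelist)
    return sorted(t for t in all_topics if pattern.fullmatch(t))


# Hypothetical topic names on a source cluster.
topics = ["clicks", "clicks-mobile", "orders", "internal-metrics"]

# Select every topic starting with "clicks", plus the "orders" topic.
mirrored = select_topics(topics, r"clicks.*|orders")
```

With a pattern like this, new topics that match are picked up without listing each one by name.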

Thanks,

Jun

On Wed, Aug 1, 2012 at 10:49 AM, Murtaza Doctor
<mu...@richrelevance.com> wrote:

> Thanks Jun.
>
> One quick follow-up.
>
> How different are the 0.7.1 capabilities from those of the earlier 0.7
> version, since that is the one we are on?
>
> Do the following instructions still hold true for 0.7.1?
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+mirroring
>
>
> Thanks,
> murtaza

Re: Deployment Options for Kafka

Posted by Murtaza Doctor <mu...@richrelevance.com>.
Thanks Jun.

One quick follow-up.

How different are the 0.7.1 capabilities from those of the earlier 0.7
version, since that is the one we are on?

Do the following instructions still hold true for 0.7.1?
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+mirroring


Thanks,
murtaza

On 8/1/12 7:55 AM, "Jun Rao" <ju...@gmail.com> wrote:

>At LinkedIn, we are using the mirror maker tool in 0.7.1 for mirroring
>across DC.
>


Re: Deployment Options for Kafka

Posted by Jun Rao <ju...@gmail.com>.
Murtaza,

At LinkedIn, we are using the mirror maker tool in 0.7.1 for mirroring
across DC.

Thanks,

Jun
