Posted to dev@kafka.apache.org by Janos Mucza <Ja...@epam.com> on 2014/07/07 16:54:50 UTC

Kafka 0.8.x failover with multiple data centers

Dear Kafka Users,

I would like to use Kafka 0.8.x in a multi-cluster environment so that when my primary cluster fails, producers and consumers could switch to the secondary cluster. Clusters would be hosted in different data centers.

One possibility would be mirroring topics (similar to the Kafka 0.7.x mirror maker). The issue with this is consumer offset management, since a mirrored message will probably have a different offset than the source message.

Running a single Kafka cluster with nodes in both data centers raises the question of how to ensure that a message was persisted by at least one broker in each data center. Even when an ACK from all in-sync replicas is requested, the producer can't be sure which brokers persisted a message, because the set of in-sync replicas might change dynamically.

Could you please share your experience with running Kafka 0.8.x cluster(s) across multiple data centers?

Thank you very much.

Best regards,
Janos

Re: Kafka 0.8.x failover with multiple data centers

Posted by Guozhang Wang <wa...@gmail.com>.
Hello Janos,

The approach we took at LinkedIn is the first option, i.e. using separate
clusters in different DCs and mirroring data asynchronously. For the offset
inconsistency issue, our applications usually issue an offset request with
the timestamp of when the primary DC went down, conservatively start from an
older offset, and dedup messages at the application level.

Guozhang
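
The recovery approach described in the reply above can be sketched roughly
as follows. This is only an illustration, not LinkedIn's code and not the
Kafka client API: the in-memory MIRROR_LOG, the per-message IDs, the
safety margin, and both function names are hypothetical, and it assumes
each message carries an application-level unique ID that can be used for
dedup.

```python
from bisect import bisect_left

# Hypothetical stand-in for one partition on the secondary (mirrored) cluster.
# Each record is (offset, append_time_ms, message_id, payload).
MIRROR_LOG = [
    (0, 1000, "m1", "a"),
    (1, 2000, "m2", "b"),
    (2, 3000, "m3", "c"),
    (3, 4000, "m4", "d"),
    (4, 5000, "m5", "e"),
]


def conservative_start_offset(log, failover_time_ms, safety_margin_ms):
    """Return the first offset appended at or after (failover time - margin).

    Subtracting a margin deliberately rewinds further than strictly needed,
    trading duplicate reads for a lower risk of skipping messages.
    """
    target = failover_time_ms - safety_margin_ms
    times = [record[1] for record in log]
    index = bisect_left(times, target)
    if index == len(log):
        # Nothing was appended at or after the target time: start at log end.
        return log[-1][0] + 1 if log else 0
    return log[index][0]


def resume(log, start_offset, already_processed_ids):
    """Replay from start_offset, skipping message IDs already handled."""
    seen = set(already_processed_ids)
    delivered = []
    for offset, _append_time, message_id, payload in log:
        if offset < start_offset or message_id in seen:
            continue
        seen.add(message_id)
        delivered.append((message_id, payload))
    return delivered


if __name__ == "__main__":
    # Primary DC went down at t=4000 ms; rewind an extra 1500 ms to be safe.
    start = conservative_start_offset(MIRROR_LOG, 4000, 1500)
    # The application had already processed m1..m3 from the primary cluster.
    print(start, resume(MIRROR_LOG, start, {"m1", "m2", "m3"}))
```

The larger the safety margin, the more duplicates the application has to
filter out, but the smaller the chance that a message mirrored shortly
before the failure is skipped.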




