Posted to users@kafka.apache.org by ade brogan <ad...@gmail.com> on 2018/06/30 16:55:46 UTC

Kafka Streams - Shared Ktable State Store

Hi

We are using the Kafka Streams API and have come across an issue that we
need a short-term fix for.

We host two independent Kafka clusters in two different data centres, and
the clusters are mirrored using MirrorMaker. The problem this brings is
that we cannot use KTables backed by RocksDB: when traffic flips from one
data centre to the other, the local state will not be populated with the
previous state from the other data centre.

This would not be an issue if we had one logical Kafka cluster. We are
moving to that setup, but in the short term we need a solution.

The Kafka docs say that you can plug in a custom database for KTables, so
we are exploring the possibility of using a shared database (e.g. Mongo or
Oracle) to store the state. This database would be visible to both Kafka
clusters and act as a single source of truth.
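For reference, the hook for this in the Streams API is materializing the
KTable with a custom KeyValueBytesStoreSupplier instead of the default
RocksDB one. A minimal sketch (MongoKeyValueStoreSupplier and the topic
name "orders" are hypothetical; you would have to write the supplier and
the DB-backed store yourself):

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueBytesStoreSupplier;

public class SharedStoreSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Hypothetical supplier you would implement yourself. The interface
        // (org.apache.kafka.streams.state.KeyValueBytesStoreSupplier) only
        // requires three methods:
        //   String name();
        //   KeyValueStore<Bytes, byte[]> get();  // your DB-backed store
        //   String metricsScope();
        KeyValueBytesStoreSupplier sharedDbSupplier =
                new MongoKeyValueStoreSupplier("orders-store");

        // Materialize the KTable into the custom store instead of RocksDB.
        KTable<String, String> orders = builder.table(
                "orders",
                Materialized.<String, String>as(sharedDbSupplier)
                        .withKeySerde(Serdes.String())
                        .withValueSerde(Serdes.String()));
    }
}
```

One caveat to keep in mind: Streams assumes it owns its state stores, so a
store shared between two independently running applications sits outside
the framework's fault-tolerance guarantees.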

I guess my question is: does anybody see any reason why this would not work?

Thanks
Ade

Re: Kafka Streams - Shared Ktable State Store

Posted by Guozhang Wang <wa...@gmail.com>.
Hello,

Thanks for sharing your use scenario; this seems to be a common multi-DC
deployment question, where you want a smooth process upon Kafka cluster
failover. I'd like to ask where your Streams application is sitting: is it
in one of the two data centres, or in a different one? And when the traffic
flips (e.g. the original Kafka broker cluster is no longer available), does
the application need to migrate as well?

Assuming the answer is "no", then I'm not sure why your local state cannot
be populated from the other data centre: as long as all topics, including
the changelog topics, are replicated across the two clusters, you can still
bootstrap your local state from the changelog topics of the other cluster,
right?
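Concretely, this just means the MirrorMaker whitelist has to cover the
Streams internal changelog topics as well as the input topics. A sketch
(topic and file names below are examples; changelog topics follow the
pattern <application.id>-<store-name>-changelog):

bin/kafka-mirror-maker.sh \
  --consumer.config source-cluster-consumer.properties \
  --producer.config target-cluster-producer.properties \
  --whitelist 'orders|my-streams-app-.*-changelog'

On failover, the application in the other data centre would then rebuild
its RocksDB state by replaying the mirrored changelog topics.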


Guozhang


On Sat, Jun 30, 2018 at 9:55 AM, ade brogan <ad...@gmail.com> wrote:

> [quoted message trimmed]



-- 
-- Guozhang