Posted to users@kafka.apache.org by Luke Kysow <lu...@hootsuite.com> on 2015/07/24 02:47:50 UTC

Disaster Recovery

Hello all, I would very much appreciate your thoughts and experiences on
backup, restore, and disaster recovery.

In the confluent.io docs (
http://docs.confluent.io/1.0/kafka/post-deployment.html under Backup &
Restoration) it says the best way to back up your cluster is to set up a
mirror.

1. Given a mirror (call it B), if all brokers in the main cluster (A) go
down, how would I bring A back up from the mirror? Would I

   - bring up a new A cluster and set it to mirror from B, without allowing
   producers to write to it and without connecting it to the A ZooKeeper (I
   assume connecting it to the A ZooKeeper while it is mirroring from B
   would cause issues)
   - once it is in sync, stop the mirroring, then allow producers to write
   to it again and have B mirror from A again?
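
For the "in sync" check in the second step, one option is to look at the lag
of the mirroring consumer on cluster B: the gap between each partition's log
end offset and the offset the mirror has consumed so far. A minimal sketch,
assuming both offset maps have already been fetched by some other means (the
function names and input dictionaries are hypothetical, not part of any Kafka
API):

```python
from typing import Dict, Tuple

TopicPartition = Tuple[str, int]  # (topic name, partition number)


def mirror_lag(
    log_end_offsets: Dict[TopicPartition, int],
    mirrored_offsets: Dict[TopicPartition, int],
) -> Dict[TopicPartition, int]:
    """Return per-partition lag of the mirror consumer on the source cluster.

    A partition with no recorded mirror offset is treated as if nothing
    had been consumed yet (offset 0), which is an assumption of this sketch.
    """
    lag = {}
    for tp, end in log_end_offsets.items():
        consumed = mirrored_offsets.get(tp, 0)
        # Clamp at zero in case the two maps were sampled at different times.
        lag[tp] = max(end - consumed, 0)
    return lag


def is_caught_up(lag: Dict[TopicPartition, int], tolerance: int = 0) -> bool:
    """True when every partition's lag is within the given tolerance."""
    return all(v <= tolerance for v in lag.values())
```

Mirroring would only be stopped once is_caught_up(...) holds with producers
to B paused; otherwise new messages keep arriving and the lag never settles.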


2. Is it possible to rebuild a cluster using EBS Snapshots (we're in AWS)?

   - Assume all Kafka brokers go down
   - Spin up a new broker and attach its latest snapshot
   - Spin up new follower brokers and wait for them to replicate

What happens to ZooKeeper during this period? Also, I assume we can't bring
up all three brokers from snapshots, because their data would be out of sync
with each other and I'm not sure whether Kafka can handle that situation.
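
To make the out-of-sync concern concrete: snapshots taken at slightly
different times would leave each broker's copy of a partition ending at a
different offset. A small sketch, with made-up offsets and hypothetical names,
that reports how far each restored replica trails the most complete one:

```python
from typing import Dict


def replica_gaps(log_end_offsets: Dict[int, int]) -> Dict[int, int]:
    """Map broker id -> number of messages it trails the most complete replica."""
    if not log_end_offsets:
        return {}
    newest = max(log_end_offsets.values())
    return {broker: newest - end for broker, end in log_end_offsets.items()}


# Made-up log end offsets for one partition, as restored from three snapshots.
restored = {1: 1000, 2: 940, 3: 985}
gaps = replica_gaps(restored)
```

This only measures the divergence between the restored copies. It says
nothing about which broker Kafka would elect as leader or how the logs would
be reconciled, which is the open question here.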

3. What else are you guys doing for disaster recovery?

Thanks in advance for any help.


-- 
*Luke Kysow*
Software Engineer | Hootsuite <https://www.hootsuite.com>
We are hiring in a *big* way! Apply now <http://hootsuite.com/careers>