Posted to user@cassandra.apache.org by Oleg Dulin <ol...@gmail.com> on 2013/09/18 22:41:12 UTC

Need help configuring WAN replication over slow WAN

Here is a problem:

My customer has a 45 Mbit/s connection to their off-site DR data 
center, and they have about 500 GB of data. That connection is shared. 
Needless to say, this is not an optimal configuration.

Replicating all of that over this link would take about a week.
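
As a rough sanity check on that estimate, here is a minimal sketch of the arithmetic. It assumes decimal units (500 GB = 500 x 10^9 bytes, 45 Mbit/s = 45 x 10^6 bits/s), ignores protocol overhead and compression, and treats the share of the link available to replication as a guess. The function name and the fractions are purely illustrative.

```python
def transfer_days(data_gb: float, link_mbit: float, usable_fraction: float = 1.0) -> float:
    """Days needed to push data_gb gigabytes over a link_mbit Mbit/s link.

    usable_fraction is the assumed share of the link that replication
    actually gets (1.0 = the whole pipe, which a shared link will not give).
    """
    bits = data_gb * 1e9 * 8                              # payload size in bits
    seconds = bits / (link_mbit * 1e6 * usable_fraction)  # time at the usable rate
    return seconds / 86400                                # seconds per day


if __name__ == "__main__":
    # Assumed shares of the shared 45 Mbit/s link; not measured values.
    for fraction in (1.0, 0.5, 0.15):
        days = transfer_days(500, 45, fraction)
        print(f"{fraction:.0%} of link: {days:.1f} days")
```

At the full line rate this works out to roughly one day. A week corresponds to replication getting only about 15% of the shared link.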

My primary cluster is 4 nodes, RF=2. DR cluster is also 4 nodes, RF=2.

I need a way to set up the primary cluster, populate all the 
data, then transfer it to the DR cluster.



One suggestion is:

1) Set up the primary cluster, plus configure a Mac Mini as a backup 
data center but on the same network
2) Populate the data
3) Physically take the Mac Mini to the DR data center, transfer its data 
to one of the nodes, and then run nodetool cleanup to move the data 
around among the nodes.

Now… this doesn't strike me as optimal. I feel like I'll need to run 
repair on the new cluster, which defeats the purpose -- it'll just hog 
the 45 Mbit/s pipe…

Somehow I need a way to load all the data into the primary cluster, then 
ship it over to the backup in a more timely fashion…

Any suggestions are greatly appreciated.

Also, I need a way to know whether the replication is up to date.

-- 
Regards,
Oleg Dulin
http://www.olegdulin.com