Posted to user@cassandra.apache.org by David M <da...@gmail.com> on 2014/09/10 02:49:33 UTC

cassandra on own distributed network

Hi everyone

I am having trouble finding use cases, examples, documentation, books, etc.
for deploying Cassandra where the nodes of a single multi-DC cluster sit on
your own network at points around the world.
In my example, a Cassandra DC equates to a building.

Of interest to me is how installations are interconnecting their DCs
(circuit bandwidth, latency requirements) for optimal
replication/gossip/etc. and any lessons learned they can share.

I know there isn't going to be a single config that applies to every
deployment/usage pattern/etc but surely there are at least loose rules of
thumb that will get me going (or maybe alternative deployments).

The interesting posts/blogs/books/etc. seem to reference Cassandra in the
cloud (e.g. specifying AWS instance types), leaving out
descriptions/usage/requirements at the network layer.
If anyone knows of any information on this topic that I've missed, I'd
appreciate your sharing it.


Thanks,
David

Re: cassandra on own distributed network

Posted by James Briggs <ja...@yahoo.com>.
What you're describing depends on the load (data size) and latency.

Doing a bootstrap or backup would require a fair amount of bandwidth if
you want it done quickly with a lot of data. Also, latency would
be very high going over some kind of office VPN. But
there's no reason you can't do what you're describing.
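
To put rough numbers on the bandwidth point, here is a small Python sketch.
The function name, the 70% link-efficiency default, and the 500 GB sample
size are illustrative assumptions, not measurements from a real cluster. It
only does the arithmetic for moving a given amount of data over a given
circuit.

```python
def transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate hours needed to move data_gb (decimal GB) over a link_mbps circuit.

    efficiency is an assumed fraction of the nominal link rate that is
    actually usable for streaming (protocol overhead, other traffic).
    """
    if data_gb < 0 or link_mbps <= 0 or not (0 < efficiency <= 1):
        raise ValueError("need data_gb >= 0, link_mbps > 0, 0 < efficiency <= 1")
    total_bits = data_gb * 1e9 * 8
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return total_bits / usable_bits_per_second / 3600


if __name__ == "__main__":
    # Hypothetical case: streaming 500 GB to a new node over various circuits.
    for mbps in (10, 100, 1000):
        print(f"{mbps:>5} Mbit/s: {transfer_hours(500, mbps):.1f} hours")
```

Real bootstrap and repair streaming will also be limited by disk and
compaction, so treat the result as a lower bound on elapsed time.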

You could set up a test cluster and see what the actual latency is.
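
Before building a full test cluster, a quick first look at inter-site
latency could be something like the sketch below. It times TCP connects from
one building to a node in another. The host name is hypothetical, and port
7000 is assumed to be the inter-node storage port (the usual default);
adjust both for your setup. TCP connect time is only a rough proxy for
round-trip latency, not a Cassandra-level measurement.

```python
import socket
import statistics
import time
from typing import List


def tcp_connect_latency_ms(host: str, port: int, samples: int = 5,
                           timeout: float = 2.0) -> List[float]:
    """Return the time in milliseconds taken by each of `samples` TCP connects."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection raises OSError if the host is unreachable
        # or the connect does not finish within `timeout` seconds.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.perf_counter() - start
        timings.append(elapsed * 1000.0)
    return timings


if __name__ == "__main__":
    # Hypothetical remote node; replace with a real host in another building.
    try:
        results = tcp_connect_latency_ms("cassandra-node.building-b.example", 7000)
        print(f"median connect time: {statistics.median(results):.1f} ms")
    except OSError as exc:
        print(f"could not connect: {exc}")
```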



Most people use 4 nodes per POP with NetworkTopologyStrategy (NTS) for a multi-DC setup with RF=3.
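
As a minimal sketch of what that looks like, the Python helper below builds
the CQL statement for an NTS keyspace with RF=3 in each DC. The keyspace
name and the DC names (one per building) are hypothetical; the DC names in a
real statement have to match whatever your snitch reports for each node.

```python
from typing import Mapping


def create_keyspace_cql(keyspace: str, dc_replication: Mapping[str, int]) -> str:
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    dc_replication maps each data center name to its replication factor.
    """
    if not dc_replication:
        raise ValueError("at least one data center is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc_name, rf in dc_replication.items():
        if rf < 1:
            raise ValueError(f"replication factor for {dc_name} must be >= 1")
        options.append(f"'{dc_name}': {rf}")
    return f"CREATE KEYSPACE {keyspace} WITH replication = {{{', '.join(options)}}};"


if __name__ == "__main__":
    # Hypothetical layout: two buildings, each treated as one Cassandra DC.
    print(create_keyspace_cql("app_data", {"building_a": 3, "building_b": 3}))
```

The printed statement can be pasted into cqlsh. With 4 nodes and RF=3 per
DC, each DC holds three copies of every row, so LOCAL_QUORUM reads and
writes can still succeed with one node down in that DC.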
 

Thanks, James Briggs
--
Cassandra/MySQL DBA. Available in San Jose area or remote.


