
Posted to user@cassandra.apache.org by Cassa L <lc...@gmail.com> on 2011/10/11 07:09:02 UTC

Multi DC setup

I am trying to understand multi-DC setup for Cassandra. As I understand it, in
this setup the replicas exist in the same cluster ring, but the nodes are
physically distributed across DCs. Is this correct?
I have two different cluster rings in two DCs and want to replicate data
bidirectionally. Both have the same keyspace. They take data traffic from
different sources, but we want to make sure the data exists in both rings.
What would be the way to achieve this?

Thanks,
L.

Re: Multi DC setup

Posted by Brandon Williams <dr...@gmail.com>.

On Tue, Oct 11, 2011 at 2:36 AM, Peter Schuller
<pe...@infidyne.com> wrote:
> Google/check wiki/read docs about NetworkTopologyStrategy and
> PropertyFileSnitch. I don't have a good link to multi-dc off hand
> (anyone got a good link to suggest that goes through this?).

http://www.datastax.com/docs/0.8/cluster_architecture/replication is
pretty good imo.

-Brandon

Re: Multi DC setup

Posted by Eric Tamme <et...@gmail.com>.

>> We already have two separate rings. The idea of bidirectional sync is that if
>> one ring is down, we can still send traffic to the other ring. When the
>> original cluster comes back, it will pick up the data from the available
>> cluster. I'm not sure if it makes sense to have separate rings or to combine
>> these two rings into one.
I am not sure you fully understand how Cassandra is supposed to work - 
you do not need two rings to have two complete sets of data that you can 
"hot cutover" between.

> Cassandra doesn't have support for synchronizing data between two
> different rings. The multi-DC support in Cassandra amounts to having a
> single ring containing all nodes from all data centers. Cassandra is
> told (by configuring the snitch, such as through a property file)
> which nodes are in which data center. Using the
> NetworkTopologyStrategy, you then make sure to distribute replicas
> across DCs as you see fit.
Using NTS you can configure a single ring into multiple "logical 
rings".  This is effectively what the property file snitch does in 
conjunction with NTS.
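
As a rough illustration of the "logical rings" idea, here is a minimal Python sketch of how replicas could be picked per data center by walking one shared ring clockwise. It is a simplified model, not Cassandra's actual implementation: it ignores racks, and the node names, DC names, and toy tokens (0-99) are all made up for the example.

```python
from bisect import bisect_left

# Hypothetical ring, sorted by token: (token, node name, data center).
RING = sorted([
    (0, "a1", "DC1"),
    (25, "b1", "DC2"),
    (50, "a2", "DC1"),
    (75, "b2", "DC2"),
])


def replicas_for(key_token, ring, replicas_per_dc):
    """Walk the ring clockwise from the key's position and take nodes
    until every data center has its configured number of replicas.
    Simplification: rack awareness is ignored entirely."""
    tokens = [token for token, _, _ in ring]
    # The first node whose token is >= the key token owns the key (wrapping).
    start = bisect_left(tokens, key_token) % len(ring)
    remaining = dict(replicas_per_dc)
    chosen = []
    for step in range(len(ring)):
        _, node, dc = ring[(start + step) % len(ring)]
        if remaining.get(dc, 0) > 0:
            chosen.append(node)
            remaining[dc] -= 1
    return chosen


if __name__ == "__main__":
    # One replica in each DC, as in a two-DC setup with one copy per DC.
    print(replicas_for(10, RING, {"DC1": 1, "DC2": 1}))
```

Each key ends up with one node from each DC, even though all four nodes sit in a single ring.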

I gave a presentation on the NTS internals, and replicating data across 
geographically distributed data centers. You can find the slides here 
http://files.meetup.com/1794037/NTS_presentation.pdf

Also Edward Capriolio's book "Cassandra High Performance Cookbook" has some 
recipes for using NTS.

I currently have 4 nodes in two data centers, and I use NTS with the property 
file snitch to write 1 copy of the data to each DC (on one node per DC) so that 
in the event of a total DC failure we can still get to the data.  If you set 
the write consistency level to ONE, the first write is "local" and the copy to 
the other DC is asynchronous, so you get fast writes with distribution.

-Eric

Re: Multi DC setup

Posted by Peter Schuller <pe...@infidyne.com>.

> We already have two separate rings. The idea of bidirectional sync is that if
> one ring is down, we can still send traffic to the other ring. When the
> original cluster comes back, it will pick up the data from the available
> cluster. I'm not sure if it makes sense to have separate rings or to combine
> these two rings into one.

Cassandra doesn't have support for synchronizing data between two
different rings. The multi-DC support in Cassandra amounts to having a
single ring containing all nodes from all data centers. Cassandra is
told (by configuring the snitch, such as through a property file)
which nodes are in which data center. Using the
NetworkTopologyStrategy, you then make sure to distribute replicas
across DCs as you see fit.
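
To make the snitch part concrete, here is a small Python sketch that renders a node-to-data-center mapping in the `ip=DC:RACK` form that PropertyFileSnitch reads from its topology properties file. The IP addresses, data center names, and rack names are hypothetical, and the helper itself is only an illustration, not part of Cassandra.

```python
# Hypothetical node addresses and DC/rack names, for illustration only.
NODES = {
    "10.0.1.1": ("DC1", "RAC1"),
    "10.0.1.2": ("DC1", "RAC1"),
    "10.0.2.1": ("DC2", "RAC1"),
    "10.0.2.2": ("DC2", "RAC1"),
}


def topology_properties(nodes, default=("DC1", "RAC1")):
    """Render one 'ip=DC:RACK' line per node, plus a 'default' entry
    for nodes that are not listed explicitly."""
    lines = [f"{ip}={dc}:{rack}" for ip, (dc, rack) in sorted(nodes.items())]
    lines.append(f"default={default[0]}:{default[1]}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(topology_properties(NODES), end="")
```

The same mapping has to be present on every node so that they all agree on which node lives in which data center.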

Cassandra will then prefer local nodes for read and write operations,
and you can use e.g. the LOCAL_QUORUM consistency level to get quorum-like
consistency within a DC.
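
For the arithmetic behind that, a quorum is a majority of the replicas: floor(RF / 2) + 1. A minimal Python sketch, assuming a hypothetical keyspace with three replicas in each of two data centers:

```python
# Hypothetical per-DC replication factors, for illustration only.
REPLICAS_PER_DC = {"DC1": 3, "DC2": 3}


def quorum(replication_factor):
    """Majority of the given number of replicas."""
    return replication_factor // 2 + 1


def local_quorum(dc, replicas_per_dc=REPLICAS_PER_DC):
    """Replicas that must respond for LOCAL_QUORUM in one data center."""
    return quorum(replicas_per_dc[dc])


def cluster_quorum(replicas_per_dc=REPLICAS_PER_DC):
    """Replicas that must respond for QUORUM across all data centers."""
    return quorum(sum(replicas_per_dc.values()))


if __name__ == "__main__":
    print("LOCAL_QUORUM in DC1:", local_quorum("DC1"))
    print("QUORUM across the cluster:", cluster_quorum())
```

With these numbers LOCAL_QUORUM needs 2 responses, all from the local DC, while QUORUM needs 4 and therefore always has to wait for at least one replica in the remote DC.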

Google/check wiki/read docs about NetworkTopologyStrategy and
PropertyFileSnitch. I don't have a good link to multi-dc off hand
(anyone got a good link to suggest that goes through this?).

-- 
/ Peter Schuller (@scode on twitter)

Re: Multi DC setup

Posted by Cassa L <lc...@gmail.com>.

We already have two separate rings. The idea of bidirectional sync is that if
one ring is down, we can still send traffic to the other ring. When the
original cluster comes back, it will pick up the data from the available
cluster. I'm not sure if it makes sense to have separate rings or to combine
these two rings into one.



On Mon, Oct 10, 2011 at 10:17 PM, Milind Parikh <mi...@gmail.com>wrote:

> Why have two rings? Cassandra manages the replication for you. One ring
> with physical nodes in two DCs might be a better option. Of course, depending
> on the inter-DC failure characteristics, you might need to endure split-brain
> for a while.
>
> /***********************
> sent from my android...please pardon occasional typos as I respond @ the
> speed of thought
> ************************/

Re: Multi DC setup

Posted by Milind Parikh <mi...@gmail.com>.

Why have two rings? Cassandra manages the replication for you. One ring
with physical nodes in two DCs might be a better option. Of course, depending
on the inter-DC failure characteristics, you might need to endure split-brain
for a while.

/***********************
sent from my android...please pardon occasional typos as I respond @ the
speed of thought
************************/
