Posted to user@cassandra.apache.org by Ran Tavory <ra...@gmail.com> on 2010/04/15 17:16:26 UTC

RackAware and replication strategy

I'm reading the following on this page:
http://wiki.apache.org/cassandra/ArchitectureInternals

> AbstractReplicationStrategy controls what nodes get secondary, tertiary,
> etc. replicas of each key range. Primary replica is always determined by the
> token ring (in TokenMetadata) but you can do a lot of variation with the
> others. RackUnaware just puts replicas on the next N-1 nodes in the ring.
> RackAware puts the first non-primary replica in the next node in the ring in
> ANOTHER data center than the primary; then the remaining replicas in the
> same as the primary.


So I just want to make sure I got this right and that the documentation is
up to date.
I have two data centers and use the rack-aware strategy.

When replication factor is 2: is it always the case that the primary replica
goes to one DC and the second replica to the other DC?
When replication factor is 3: is it first replica in DC1, second in DC2 and
third in DC1?
When replication factor is 4: is it first replica in DC1, second in DC2,
third in DC1, fourth in DC1, and so on?
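
To make the questions above concrete, here is a toy model of the rule exactly as the wiki paragraph words it. It is only a sketch of that description, not Cassandra's actual code: the ring layout (8 hosts alternating between DC1 and DC2), the node names, and the functions rack_unaware and rack_aware are all made up for illustration.

```python
# Toy ring: (node name, data center), listed in token order.
# The alternating DC1/DC2 layout is an assumption for this example.
RING = [
    ("n0", "DC1"), ("n1", "DC2"), ("n2", "DC1"), ("n3", "DC2"),
    ("n4", "DC1"), ("n5", "DC2"), ("n6", "DC1"), ("n7", "DC2"),
]


def rack_unaware(ring, primary_idx, n):
    """Primary plus the next n-1 nodes walking around the ring."""
    return [ring[(primary_idx + i) % len(ring)] for i in range(min(n, len(ring)))]


def rack_aware(ring, primary_idx, n):
    """Wiki wording: first non-primary replica on the next node in ANOTHER
    data center, remaining replicas on the next nodes in the primary's DC."""
    primary = ring[primary_idx]
    replicas = [primary]
    # Every other node, in ring order starting right after the primary.
    walk = [ring[(primary_idx + i) % len(ring)] for i in range(1, len(ring))]
    if n >= 2:
        other = next((node for node in walk if node[1] != primary[1]), None)
        if other is not None:
            replicas.append(other)
    for node in walk:
        if len(replicas) >= n:
            break
        if node[1] == primary[1] and node not in replicas:
            replicas.append(node)
    return replicas


for rf in (2, 3, 4):
    print(rf, [dc for _, dc in rack_aware(RING, 0, rf)])
# 2 ['DC1', 'DC2']
# 3 ['DC1', 'DC2', 'DC1']
# 4 ['DC1', 'DC2', 'DC1', 'DC1']
```

Under this reading, with the primary in DC1, the second DC never holds more than one replica no matter how large the replication factor is.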

If I have 4 hosts in each DC, which replication factors make sense?
N=1 - When I don't care about losing data, cool.
N=2 - When I want to make sure each DC has a copy; useful for fast local
access and allows recovery if only one host is down.
N=3 - If I want to make sure each DC has a copy, plus recovery can be made
faster in certain cases, and it is more resilient to two hosts being down.
N=4 - Like N=3 but even more resilient, etc.

Say I want to have two replicas in each DC. Can this be done?

Re: RackAware and replication strategy

Posted by Benjamin Black <b...@b3k.us>.
Have a look at locator/DatacenterShardStrategy.java.
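
For illustration only, the general idea of asking for a fixed number of replicas per data center can be sketched as below. This is not the code in DatacenterShardStrategy.java and makes no claim about how that class works internally; the function place_per_dc, the quota dictionary, and the 8-host alternating ring are all hypothetical.

```python
# Toy ring: (node name, data center), listed in token order.
# Layout and names are assumptions for this example.
RING = [
    ("n0", "DC1"), ("n1", "DC2"), ("n2", "DC1"), ("n3", "DC2"),
    ("n4", "DC1"), ("n5", "DC2"), ("n6", "DC1"), ("n7", "DC2"),
]


def place_per_dc(ring, primary_idx, quota):
    """Walk the ring from the primary and take a node whenever its data
    center still has an unfilled replica quota."""
    remaining = dict(quota)  # copy so the caller's dict is not modified
    replicas = []
    for i in range(len(ring)):
        node = ring[(primary_idx + i) % len(ring)]
        dc = node[1]
        if remaining.get(dc, 0) > 0:
            replicas.append(node)
            remaining[dc] -= 1
        if not any(remaining.values()):
            break  # every data center has its quota
    return replicas


# Two replicas in each data center, primary on n3 (DC2).
print(place_per_dc(RING, 3, {"DC1": 2, "DC2": 2}))
# [('n3', 'DC2'), ('n4', 'DC1'), ('n5', 'DC2'), ('n6', 'DC1')]
```

The point of the sketch is only that the replica count becomes a per-data-center setting rather than a single N for the whole ring.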
