Posted to user@cassandra.apache.org by "Walsh, Stephen" <St...@Aspect.com> on 2015/11/18 17:22:30 UTC

Replication of data over 2 Datacentre's, when one node fails we get replica issues

Hey all,

We're testing Cassandra failover across 2 Datacentres.

There are 3 nodes on each.
All CFs have a replication factor of 2 in both Datacentres (DC1:2, DC2:2).

When one Datacentre goes down, all queries go to the other.
This works fine for LOCAL_QUORUM queries, as 2 replicas of the data exist in the surviving Datacentre.

However, in the scenario where both Datacentres are up and one node goes down, all queries to that Datacentre fail at LOCAL_QUORUM.
This is because the node that failed held replica data, and only 1 node with that data remains. So LOCAL_QUORUM, which requires 2 nodes with replica data, fails.

Is there a way to avoid sending these queries to the incomplete Datacentre?
What's the best way to handle this?


Regards
Stephen Walsh
This email (including any attachments) is proprietary to Aspect Software, Inc. and may contain information that is confidential. If you have received this message in error, please do not read, copy or forward this message. Please notify the sender immediately, delete it from your system and destroy any copies. You may not further disclose or distribute this email or its attachments.

Re: Replication of data over 2 Datacentre's, when one node fails we get replica issues

Posted by Anuj Wadehra <an...@yahoo.co.in>.

Hi Walsh,

My comments:

1. Keeping RF at 2 and CL at LOCAL_QUORUM does not give you any fault tolerance. A quorum of 2 replicas is 2, so you can't afford a single node failure with RF=2. I would suggest raising RF to 3 so that you can tolerate a single node failure.

Your query failed because RF=2, not because the query went to the other DC.
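
To make the arithmetic behind point 1 concrete, here is a minimal sketch. It assumes the usual quorum rule (floor(RF/2) + 1) and a deliberately simplified 3-node ring with one token range per node and replicas placed on consecutive nodes (no vnodes). The node names and helper functions are hypothetical, for illustration only.

```python
def quorum(rf: int) -> int:
    """Replicas that must respond for a (LOCAL_)QUORUM request."""
    return rf // 2 + 1


def tolerable_failures(rf: int) -> int:
    """Replicas that can be down while a quorum is still reachable."""
    return rf - quorum(rf)


# Hypothetical 3-node datacentre: range i is replicated on node i and the
# next (rf - 1) nodes around the ring. A simplification of real placement.
NODES = ["n1", "n2", "n3"]


def replicas(range_index: int, rf: int) -> list:
    return [NODES[(range_index + i) % len(NODES)] for i in range(rf)]


def quorum_available(range_index: int, rf: int, down: set) -> bool:
    alive = [n for n in replicas(range_index, rf) if n not in down]
    return len(alive) >= quorum(rf)


if __name__ == "__main__":
    for rf in (2, 3):
        ok = [quorum_available(r, rf, {"n1"}) for r in range(len(NODES))]
        print(f"RF={rf}: quorum={quorum(rf)}, "
              f"tolerates {tolerable_failures(rf)} down, "
              f"ranges readable with n1 down: {ok}")
```

Under these assumptions, RF=2 with one node down leaves only the range that has no replica on the failed node readable at LOCAL_QUORUM, while RF=3 keeps every range readable.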

2. If you fire a query at LOCAL_QUORUM, it will only read data from the local DC (the DC where it was fired). It will never go to the other DC unless there is a digest mismatch in the local DC and read_repair_chance > 0.

To completely prevent queries from going to the other DC (even in case of a digest mismatch), you can set read_repair_chance=0 and instead set dclocal_read_repair_chance>0 to make sure blocking read repair happens in the local DC.
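
As a rough sketch of how the two suggestions above might be expressed in CQL, the helper below only builds the statement strings; it does not connect to a cluster. The keyspace and table names and the 0.1 chance value are placeholders, not values from this thread, and the table options assume a Cassandra version that still supports read_repair_chance and dclocal_read_repair_chance.

```python
def alter_keyspace_rf(keyspace: str, rf_per_dc: dict) -> str:
    """Build an ALTER KEYSPACE statement with a per-DC replication factor."""
    dc_part = ", ".join(f"'{dc}': {rf}" for dc, rf in rf_per_dc.items())
    return (f"ALTER KEYSPACE {keyspace} WITH replication = "
            f"{{'class': 'NetworkTopologyStrategy', {dc_part}}};")


def local_read_repair_only(keyspace: str, table: str,
                           dclocal_chance: float = 0.1) -> str:
    """Build an ALTER TABLE statement that turns off cross-DC read repair
    and keeps a (placeholder) chance of DC-local read repair."""
    return (f"ALTER TABLE {keyspace}.{table} WITH read_repair_chance = 0.0 "
            f"AND dclocal_read_repair_chance = {dclocal_chance};")


if __name__ == "__main__":
    # "my_ks" and "my_cf" are hypothetical names.
    print(alter_keyspace_rf("my_ks", {"DC1": 3, "DC2": 3}))
    print(local_read_repair_only("my_ks", "my_cf"))
```

Note that after raising the replication factor, a repair (nodetool repair) is needed on the affected nodes so the new replicas actually receive the existing data.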

Thanks

Anuj
