Posted to user@cassandra.apache.org by Sergey Tryuber <st...@gmail.com> on 2012/09/09 11:09:17 UTC

Replication factor 2, consistency and failover

Hi

We have to use Cassandra with RF=2 (don't ask why...). There are two
datacenters (RF=2 in each datacenter), and we use Astyanax as the client
library. In general we want to achieve strong consistency. Read performance
is important for us, so we perform writes with LOCAL_QUORUM and reads with
ONE. If one server goes down, we automatically switch to Writes.ONE and
Reads.ONE, but only for the replica set that contains the failed node (we
modified Astyanax to do this). When the server comes back, we switch back to
Writes.LOCAL_QUORUM and Reads.ONE, but, of course, we see some
inconsistencies during the switch and for some time afterwards (while
hinted handoff catches up).
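
Roughly, the per-key switching looks like the sketch below. The class and
interface names are made up just for this mail (the real change lives deeper
inside our Astyanax fork); only the Astyanax ConsistencyLevel constants are
real Astyanax types:

    import java.nio.ByteBuffer;

    import com.netflix.astyanax.model.ConsistencyLevel;

    // Placeholder for whatever view of node health the client keeps (we drive
    // it from the connection pool / failure detector in our fork).
    interface ReplicaHealth {
        boolean anyLocalReplicaDown(ByteBuffer rowKey);
    }

    public class FailoverPolicy {
        private final ReplicaHealth health;

        public FailoverPolicy(ReplicaHealth health) {
            this.health = health;
        }

        // With RF=2 per DC, LOCAL_QUORUM needs both local replicas, so while one
        // of the two replicas for this key is down we degrade writes to ONE, and
        // only for that key range.
        public ConsistencyLevel writeLevel(ByteBuffer rowKey) {
            return health.anyLocalReplicaDown(rowKey)
                    ? ConsistencyLevel.CL_ONE
                    : ConsistencyLevel.CL_LOCAL_QUORUM;
        }

        // Reads stay at ONE in both modes; strong consistency only holds while
        // writes are still going through at LOCAL_QUORUM.
        public ConsistencyLevel readLevel(ByteBuffer rowKey) {
            return ConsistencyLevel.CL_ONE;
        }
    }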

Basically I don't have any questions; I just want to share our "ugly"
failover algorithm, hear your criticism, and maybe get advice on how to
improve it. Unfortunately we can't change the replication factor, and most
of the time we have to read with consistency level ONE (because we have
strict requirements on read performance).

Thank you!

Re: Replication factor 2, consistency and failover

Posted by Sergey Tryuber <st...@gmail.com>.
Aaron, thank you! Your message was exactly what we wanted to see: that we
hadn't missed anything critical. We'll share our Astyanax patch in the
future.

Re: Replication factor 2, consistency and failover

Posted by aaron morton <aa...@thelastpickle.com>.
> In general we want to achieve strong consistency. 
You need to have R + W > N

> LOCAL_QUORUM and reads with ONE.
Gives you 2 + 1 > 2 when you use it. When you drop back to ONE / ONE you no longer have strong consistency.
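
To spell out the numbers: with RF 2 the local DC has N = 2 replicas. A LOCAL_QUORUM write waits for floor(2/2) + 1 = 2 of them and a ONE read touches 1, so W + R = 2 + 1 = 3 > 2 and every read overlaps the latest write. At ONE / ONE you get 1 + 1 = 2, which is not greater than 2, so a read can land on the replica that missed the write. Note that with RF 2 a local quorum is effectively ALL of the local replicas, which is why the quorum write cannot be kept up while one of the two is down.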

> maybe get advice on how to improve it.
Sounds like you know how to improve it :)

Things you could play with:

* hinted_handoff_throttle_delay_in_ms in the YAML, to reduce the time it takes for HH to deliver the hints.
* increase the read_repair_chance for the CFs. This will increase the chance of RR repairing an inconsistency behind the scenes, so the next read is consistent. It will also increase the IO load on the system. (Rough examples of both are below.)
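
For instance, something along these lines (the values and the keyspace/CF names here are only illustrative, and the cli syntax should be checked against the Cassandra version you run). In cassandra.yaml, a smaller per-hint delay lets the hints drain faster once the node is back, at the cost of more load while they replay:

    hinted_handoff_throttle_delay_in_ms: 1

And from cassandra-cli, making RR run on every read for the hot column family:

    use MyKeyspace;
    update column family MyCF with read_repair_chance = 1.0;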

With the RF 2 restriction you are probably doing the best you can. You are giving up consistency for availability and partition tolerance. The best thing to do is to get peeps to agree that "we will accept reduced consistency for high availability" rather than saying "in general we want to achieve strong consistency".

Hope that helps. 

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
