Posted to user@cassandra.apache.org by onmstester onmstester <on...@zoho.com> on 2018/06/10 05:28:56 UTC
data consistency without using nodetool repair
I'm using RF=2 (I know it should be at least 3, but I'm short of resources) with WCL=ONE and RCL=ONE in a cluster of 10 nodes, in an insert-only scenario.
The problem: I don't want to use nodetool repair because it would put a huge load on my cluster for a long time, but I also need data consistency
and fault tolerance such that,
if one of my nodes fails:
1. there would be no data loss, not even a single record
2. writes and reads would continue with no problem
I know that the current config won't satisfy No. 1, so I changed the write consistency level to ALL. To satisfy No. 2, I'm catching exceptions of
"1 replicas needed but 2 required", writing those records again with WCL=ONE, and putting them somewhere to be rewritten later with WCL=2.
Is there anything wrong with this workaround? Is there any better solution? (Strong requirements: I can't change the RF, and the system should tolerate 1 node failure with no data loss and no read/write failure.)
Re: data consistency without using nodetool repair
Posted by Jeff Jirsa <jj...@gmail.com>.
Suppose you have two replicas, A and B, and you write at ONE.
A acknowledges the write, B is running but drops the write, and the write still succeeds.
A fails 30 seconds later. The data is lost, because it had no chance to be copied to the other host via hints, repair, or read repair.
10 days is a big window, but repair is a smaller problem than running RF=2 and writing at ONE.
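
The A/B scenario above can be traced with a small toy model. This is only a sketch: the replicas are plain Python dicts rather than real Cassandra nodes, and `write`, `read_any`, and the `dropped` parameter are hypothetical names made up for the illustration.

```python
# Toy model only: replicas are plain dicts, not real Cassandra nodes.
def write(replicas, key, value, required_acks, dropped=()):
    """Apply a write to every live replica that does not drop it.

    Returns True when at least `required_acks` replicas stored the value.
    """
    acks = 0
    for name, store in replicas.items():
        if store is None or name in dropped:  # node is down, or it dropped the write
            continue
        store[key] = value
        acks += 1
    return acks >= required_acks


def read_any(replicas, key):
    """Return the value from any live replica that has it, else None."""
    for store in replicas.values():
        if store is not None and key in store:
            return store[key]
    return None


# RF=2, write at ONE: B drops the write, yet the client sees success.
replicas = {"A": {}, "B": {}}
ok = write(replicas, "k1", "v1", required_acks=1, dropped={"B"})
replicas["A"] = None  # A fails before hints, repair, or read repair can run
lost = read_any(replicas, "k1") is None

# Same situation with two acknowledgements required: the client is told
# the write failed, so it knows the record is not safely replicated.
replicas2 = {"A": {}, "B": {}}
rejected = not write(replicas2, "k1", "v1", required_acks=2, dropped={"B"})

print(ok, lost, rejected)
```

In the first case the client is told the write succeeded even though only A ever held it; in the second case the client at least learns that the record is not on both replicas.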
--
Jeff Jirsa
> On Jun 9, 2018, at 10:52 PM, onmstester onmstester <on...@zoho.com> wrote:
>
> Thanks Jeff,
> If I run repair every 10 days, would there still be a chance of losing data by losing one node (data inserted right after the last repair)?
Re: data consistency without using nodetool repair
Posted by onmstester onmstester <on...@zoho.com>.
Thanks Jeff,
If I run repair every 10 days, would there still be a chance of losing data by losing one node (data inserted right after the last repair)?
Re: data consistency without using nodetool repair
Posted by Jeff Jirsa <jj...@gmail.com>.
> On Jun 9, 2018, at 10:28 PM, onmstester onmstester <on...@zoho.com> wrote:
>
>
> I'm using RF=2 (I know it should be at least 3, but I'm short of resources) with WCL=ONE and RCL=ONE in a cluster of 10 nodes, in an insert-only scenario.
> The problem: I don't want to use nodetool repair because it would put a huge load on my cluster for a long time, but I also need data consistency
> and fault tolerance such that,
> if one of my nodes fails:
> 1. there would be no data loss, not even a single record
This requires writing at a consistency level higher than ONE (more than one replica must acknowledge each write).
> 2. write/read of data would be continued with no problem
>
This requires more replicas in the ring than the number of replicas required for reads and writes.
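
The arithmetic behind that statement can be written out as a short sketch. The helper names `required_replicas` and `tolerated_failures` are made up for this illustration; the only rule assumed is that QUORUM means floor(RF / 2) + 1 replicas and ALL means every replica.

```python
def required_replicas(level, rf):
    """Number of replicas that must respond for a given consistency level."""
    levels = {"ONE": 1, "TWO": 2, "QUORUM": rf // 2 + 1, "ALL": rf}
    return levels[level]


def tolerated_failures(level, rf):
    """Replicas of a row that can be down while the request still succeeds."""
    return rf - required_replicas(level, rf)


# Print a small table: RF, level, replicas required, failures tolerated.
for rf in (2, 3):
    for level in ("ONE", "QUORUM", "ALL"):
        print(rf, level, required_replicas(level, rf), tolerated_failures(level, rf))
```

With RF=2, any level that needs more than one replica (TWO, QUORUM, or ALL) tolerates zero replica failures, so requirements 1 and 2 cannot both hold. With RF=3, QUORUM needs two replicas and still tolerates one failure.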
> I know that the current config won't satisfy No. 1, so I changed the write consistency level to ALL. To satisfy No. 2, I'm catching exceptions of
> "1 replicas needed but 2 required", writing those records again with WCL=ONE, and putting them somewhere to be rewritten later with WCL=2.
> Is there anything wrong with this workaround? Is there any better solution? (Strong requirements: I can't change the RF, and the system should tolerate 1 node failure with no data loss and no read/write failure.)
It sorta works, but it’s dirty and has a lot of edge cases. Depending on the budget and the value of the data, maybe it’s OK, but I wouldn’t trust it to be good enough for critical use cases.
The transient replication work that Ariel is doing (https://issues.apache.org/jira/browse/CASSANDRA-14404) will likely benefit you here in 4.0 (but it will require you to run repair).
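
For reference, the workaround discussed in this thread (write at ALL, fall back to ONE when a replica is down, and queue the record for a later rewrite) might look roughly like the sketch below. Everything here is hypothetical: `ToyCluster` and `UnavailableError` are in-memory stand-ins written for this illustration, not the real driver API, and a real client would need a durable queue rather than a Python list.

```python
class UnavailableError(Exception):
    """Hypothetical stand-in for the driver's 'not enough replicas' error."""


class ToyCluster:
    """Hypothetical in-memory stand-in for one token range with RF=2."""

    def __init__(self):
        self.nodes = {"A": {}, "B": {}}
        self.down = set()

    def write(self, key, value, required_acks):
        live = [n for n in self.nodes if n not in self.down]
        if len(live) < required_acks:
            raise UnavailableError(
                f"{required_acks} required but only {len(live)} alive")
        for n in live:
            self.nodes[n][key] = value


def write_with_fallback(cluster, key, value, pending):
    """Try ALL (2 acks with RF=2); on failure write at ONE and queue the record."""
    try:
        cluster.write(key, value, required_acks=2)
        return "ALL"
    except UnavailableError:
        cluster.write(key, value, required_acks=1)
        pending.append((key, value))  # remember it for a later rewrite
        return "ONE"


def replay(cluster, pending):
    """Retry queued records at ALL; return the ones that still cannot be written."""
    still_pending = []
    for key, value in pending:
        try:
            cluster.write(key, value, required_acks=2)
        except UnavailableError:
            still_pending.append((key, value))
    return still_pending


cluster = ToyCluster()
pending = []
first = write_with_fallback(cluster, "k1", "v1", pending)   # both replicas up
cluster.down.add("B")
second = write_with_fallback(cluster, "k2", "v2", pending)  # B down: falls back
cluster.down.discard("B")
pending = replay(cluster, pending)                          # B back: rewrite at ALL
print(first, second, pending)
```

One edge case is visible even in this toy: between the fallback write and the replay, the record exists on a single replica plus the client-side queue, so losing both that replica and the queue loses the record.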