Posted to user@cassandra.apache.org by Laxmikant Upadhyay <la...@gmail.com> on 2020/01/16 08:34:44 UTC

How to assure data consistency in switch over to standby dc

We have two DCs in an active/standby model. If we want to switch to the
standby DC at any given point, how can we make sure that its data is
consistent with the active site? Note that repair runs at its scheduled time.


I am thinking of the approaches below:

1. Run a repair before switching (this ensures consistency for the most
part, but the repair itself may take a long time to complete).

2. Monitor the dropped message MBean: if no messages have been dropped since
the last successful repair, it is safe to switch without running a repair.

3. Monitor the hints backlog (files in the hints directory): if there is no
backlog, it is safe to switch without running a repair.
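
A rough sketch of the hints-backlog check from approach 3, in Python. It
assumes Cassandra 3.0 or later, where pending hints are stored as "*.hints"
files in the directory configured as hints_directory; the helper names are
made up for illustration. It would have to run on every node of the active
DC, since hints are kept on the coordinator that accepted the write. An empty
directory is also only a necessary signal, not a sufficient one, because
hints are no longer written once a replica has been down longer than the
hint window.

```python
from pathlib import Path


def pending_hint_files(hints_dir):
    """Return the hint files currently on disk, sorted by name.

    Assumption: pending hints are the files ending in ".hints";
    the accompanying checksum files are ignored.
    """
    root = Path(hints_dir)
    if not root.is_dir():
        # A missing directory must not be mistaken for "no backlog".
        raise FileNotFoundError(f"hints directory not found: {root}")
    return sorted(p for p in root.glob("*.hints") if p.is_file())


def hints_backlog_is_empty(hints_dir):
    """True when no hint files are waiting to be replayed."""
    return not pending_hint_files(hints_dir)
```

For example, hints_backlog_is_empty("/var/lib/cassandra/hints") could be one
of the pre-switch checks; that path is only a common default and may differ
in your installation.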



I am interested to know how other people solve this issue and achieve a
fast switch-over while ensuring consistency.

-- 

regards,
Laxmikant Upadhyay

Re: How to assure data consistency in switch over to standby dc

Posted by Jean Carlo <je...@gmail.com>.

Hello Laxmikant,

Your application has to deal with eventual consistency if you are using
Cassandra. Make sure that

R + W > RF

and run repairs periodically. This is the best way to stay as consistent
and coherent as possible.
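
To make the rule concrete, here is a small Python sketch. R and W are the
number of replicas that must answer a read and acknowledge a write, and RF is
the replication factor. The function names are only for illustration, and
RF = 3 is an example value.

```python
def quorum(rf):
    """Size of a quorum for a given replication factor."""
    return rf // 2 + 1


def read_sees_latest_write(r, w, rf):
    """True when every read set must overlap every write set (R + W > RF)."""
    return r + w > rf


RF = 3
Q = quorum(RF)  # 2 replicas out of 3

quorum_both = read_sees_latest_write(Q, Q, RF)    # 2 + 2 > 3
one_both = read_sees_latest_write(1, 1, RF)       # 1 + 1 > 3 does not hold
write_all_read_one = read_sees_latest_write(1, RF, RF)  # 1 + 3 > 3
```

With these numbers, quorum reads plus quorum writes satisfy the rule, reading
and writing at ONE does not, and writing at ALL lets you read at ONE.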

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay



Re: How to assure data consistency in switch over to standby dc

Posted by Oleksandr Shulgin <ol...@zalando.de>.

On Thu, Jan 16, 2020 at 3:18 PM Laxmikant Upadhyay <la...@gmail.com>
wrote:

>
> You are right, that will solve the problem, but unfortunately I won't be
> able to meet my SLA with EACH_QUORUM writes. I am using LOCAL_QUORUM for
> both reads and writes.
> Is there any other way?
>

Is your read SLO more sensitive than your write SLO? Maybe you can spend a
bit more time on the read path by using a non-local read CL while keeping the
write CL local?
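
As a back-of-the-envelope check of how many replicas such a non-local read
would have to contact, here is a short Python sketch. It assumes two DCs with
RF = 3 each (6 replicas in total) and counts only the acknowledgements that a
LOCAL_QUORUM write is guaranteed to have received; the names are
illustrative.

```python
def quorum(n):
    """Size of a quorum among n replicas."""
    return n // 2 + 1


def min_read_replicas(total_rf, guaranteed_write_acks):
    """Smallest R for which R + W > RF holds across all replicas."""
    return total_rf - guaranteed_write_acks + 1


RF_PER_DC = 3               # assumption for this example
TOTAL_RF = 2 * RF_PER_DC    # two DCs, as in this thread

w_local_quorum = quorum(RF_PER_DC)       # 2 acks, all in the active DC
r_needed = min_read_replicas(TOTAL_RF, w_local_quorum)
r_plain_quorum = quorum(TOTAL_RF)        # QUORUM over all 6 replicas

quorum_read_is_enough = r_plain_quorum >= r_needed
```

Under these assumptions r_needed is 5 while a plain QUORUM read contacts only
4 replicas, so the read CL has to be chosen so that it always includes a
quorum of the DC that took the writes.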

--
Alex

Re: How to assure data consistency in switch over to standby dc

Posted by Laxmikant Upadhyay <la...@gmail.com>.

Hi Alex,

You are right, that will solve the problem, but unfortunately I won't be
able to meet my SLA with EACH_QUORUM writes. I am using LOCAL_QUORUM for
both reads and writes.
Is there any other way?



Re: How to assure data consistency in switch over to standby dc

Posted by Oleksandr Shulgin <ol...@zalando.de>.

On Thu, Jan 16, 2020 at 1:04 PM Laxmikant Upadhyay <la...@gmail.com>
wrote:

> Hi,
> What I mean by active/standby model is that even though data is
> replicated (asynchronously) to the standby DC, clients only access
> data from the active DC (let's say using LOCAL_QUORUM).
>
>> So you can "switch" your clients without any issues, since your writes
>> are replicated to all DCs.
> --> That is not true, because there is a chance of dropped mutations.
> (Hints and read repair may help to some extent, but data consistency is
> not guaranteed unless you run anti-entropy repair.)
>

What are the consistency levels used by your application(s)?

E.g. for strong consistency across multiple DCs you could use EACH_QUORUM
for the write requests and LOCAL_QUORUM for reads, with a replication
factor >= 3 per DC.
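
To illustrate why this combination works and LOCAL_QUORUM on both sides does
not, here is a small brute-force model in Python. It is not driver code: the
node names, the two DCs and RF = 3 per DC are assumptions for the example. It
enumerates the smallest sets of replicas that are guaranteed to have
acknowledged a write, and checks them against every possible LOCAL_QUORUM
read set.

```python
from itertools import combinations, product

# Hypothetical topology: two DCs with three replicas each.
DCS = {
    "dc1": ["a1", "a2", "a3"],
    "dc2": ["b1", "b2", "b3"],
}


def quorum(n):
    """Size of a quorum among n replicas."""
    return n // 2 + 1


def local_quorum_sets(dc):
    """Every minimal replica set that satisfies LOCAL_QUORUM in one DC."""
    nodes = DCS[dc]
    return [set(c) for c in combinations(nodes, quorum(len(nodes)))]


def each_quorum_sets():
    """Every minimal replica set that satisfies EACH_QUORUM (a quorum per DC)."""
    per_dc = [local_quorum_sets(dc) for dc in DCS]
    return [set().union(*combo) for combo in product(*per_dc)]


def always_overlap(write_sets, read_sets):
    """True when every read set shares a replica with every write set."""
    return all(w & r for w in write_sets for r in read_sets)


# EACH_QUORUM writes, LOCAL_QUORUM reads in the standby DC.
each_then_local = always_overlap(each_quorum_sets(), local_quorum_sets("dc2"))

# LOCAL_QUORUM writes in dc1, LOCAL_QUORUM reads in dc2 after the switch.
local_then_remote = always_overlap(local_quorum_sets("dc1"), local_quorum_sets("dc2"))
```

In this model each_then_local is True and local_then_remote is False: a write
acknowledged only by a quorum in dc1 may not have reached any replica that a
dc2 read contacts.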

--
Alex

Re: How to assure data consistency in switch over to standby dc

Posted by Laxmikant Upadhyay <la...@gmail.com>.

Hi,
What I mean by active/standby model is that even though data is
replicated (asynchronously) to the standby DC, clients only access
data from the active DC (let's say using LOCAL_QUORUM).

> So you can "switch" your clients without any issues, since your writes
> are replicated to all DCs.
--> That is not true, because there is a chance of dropped mutations.
(Hints and read repair may help to some extent, but data consistency is
not guaranteed unless you run anti-entropy repair.)




Re: How to assure data consistency in switch over to standby dc

Posted by Ahmed Eljami <ah...@gmail.com>.

Hello,

What do you mean by active/standby model? Cassandra is designed to be
active/active across DCs.
So you can "switch" your clients without any issues, since your writes
are replicated to all DCs.

Unless by active/standby you mean that the keyspace is not replicated
to the second DC?


-- 
Regards,

Ahmed ELJAMI