Posted to user@cassandra.apache.org by Anuj Wadehra <an...@yahoo.co.in> on 2016/04/26 04:26:10 UTC

Inconsistent Reads after Restoring Snapshot

Hi,
We run Cassandra 2.0.14 with RF=3 and read/write at QUORUM. We don't use incremental backups. As per the documentation at https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_backup_snapshot_restore_t.html , if I need to restore a snapshot on a SINGLE node in a cluster, I would run repair at the end. But while the repair is running, reads may be inconsistent.

Consider following scenario:
10 AM Daily Snapshot taken of node A and moved to backup location
11 AM A record is inserted such that node A and B insert the record but there is a mutation drop on node C.
1 PM Node A crashes and data is restored from latest 10 AM snapshot. Now, only Node B has the record.

Now, my question is:
Until the repair is completed on node A, a read at QUORUM may return an inconsistent result depending on the nodes from which the data is read. If data is read from node A and node C, nothing is returned; if data is read from node A and node B, the record is returned. This is a vital point which is not highlighted anywhere.
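
The scenario can be checked with a small simulation. This is a minimal sketch in Python, not Cassandra code: the replica contents, the timestamp 1100, and the helper quorum_read are hypothetical names chosen for illustration, and it assumes that a QUORUM read with RF=3 reconciles exactly the two replicas it contacts.

```python
from itertools import combinations

# Hypothetical replica state after node A is restored from the 10 AM snapshot.
# Each replica maps a key to (write_timestamp, value); a missing key means
# the replica does not hold the record.
replicas = {
    "A": {},                                       # snapshot predates the 11 AM write
    "B": {"record": (1100, "inserted at 11 AM")},  # only B still has the record
    "C": {},                                       # mutation was dropped
}

def quorum_read(key, contacted):
    """Return the newest value among the contacted replicas, or None."""
    cells = [replicas[name][key] for name in contacted if key in replicas[name]]
    if not cells:
        return None
    # Pick the cell with the highest write timestamp (last write wins).
    return max(cells, key=lambda cell: cell[0])[1]

# Try every pair of replicas that could form a quorum (2 out of 3).
results = {
    pair: quorum_read("record", pair)
    for pair in combinations(sorted(replicas), 2)
}

for pair, value in results.items():
    print(pair, "->", value)
```

Under these assumptions the pair (A, C) returns nothing, while (A, B) and (B, C) return the record, which matches the concern described above.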

Please confirm my understanding. If my understanding is right, how can I make sure that my reads are not inconsistent while a node is being repaired after restoring a snapshot?
I think auto-bootstrapping the node without joining the ring until the repair is completed is an alternative option. But snapshots save a lot of streaming compared to bootstrap.
Will incremental backups guarantee that 
Thanks
Anuj

Sent from Yahoo Mail on Android

Re: Inconsistent Reads after Restoring Snapshot

Posted by Romain Hardouin <ro...@yahoo.fr>.
You can do the restore on the new node A (don't forget to set the token(s) in cassandra.yaml), start the node with -Dcassandra.join_ring=false, and then run a repair on it. Have a look at https://issues.apache.org/jira/browse/CASSANDRA-6961
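
The order of operations suggested here can be written down as a small checklist. This is only a sketch in Python that lists the steps and executes nothing: restore_plan is a hypothetical helper, the final "nodetool join" step is an assumption not stated in this message, and whether repair works on a node that has not joined the ring depends on the Cassandra version (see the JIRA ticket above).

```python
def restore_plan(join_ring_flag="-Dcassandra.join_ring=false"):
    """Return the ordered steps as (description, command) pairs.

    Nothing is executed; commands are plain strings for review.
    A command of None means the step is manual.
    """
    return [
        ("Restore the snapshot files and set the token(s) in cassandra.yaml", None),
        ("Start the node without joining the ring", f"cassandra {join_ring_flag}"),
        ("Repair the node while it is still out of the ring", "nodetool repair"),
        # Assumption: the node is joined to the ring only after repair finishes.
        ("Join the ring once the repair has finished", "nodetool join"),
    ]

plan = restore_plan()
for number, (description, command) in enumerate(plan, start=1):
    print(f"{number}. {description}" + (f": {command}" if command else ""))
```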
Best,
Romain 


Re: Inconsistent Reads after Restoring Snapshot

Posted by Carlos Alonso <in...@mrcalonso.com>.
If I understand correctly, your reads will always return the value from B,
as that cell will have the highest timestamp.

Consistent reads (CL >= QUORUM) will request data from all replica nodes
and compare the results in the coordinator, returning the value if they all
agree or the one with the highest timestamp (last write wins) if they
don't, and running a read repair (a repair of that particular record) if
the nodes did not agree.
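
The reconciliation described above can be sketched as follows. This is an illustrative Python model, not the coordinator's actual code: reconcile and the shape of its input are hypothetical, and which replicas answer a given read is passed in by the caller rather than assumed.

```python
def reconcile(responses):
    """Last-write-wins reconciliation over the replicas that answered.

    responses maps a replica name to (write_timestamp, value), or to None
    when that replica has no data for the key.
    Returns (winning_value, names_of_replicas_needing_read_repair).
    """
    present = [cell for cell in responses.values() if cell is not None]
    if not present:
        # No contacted replica has the record: nothing to return or repair.
        return None, []
    # The cell with the highest timestamp wins.
    winner = max(present, key=lambda cell: cell[0])
    # Any replica whose answer differs from the winner gets a read repair.
    stale = sorted(name for name, cell in responses.items() if cell != winner)
    return winner[1], stale

# All three replicas answer: B's value wins, A and C are repaired.
all_three = reconcile({"A": None, "B": (1100, "record"), "C": None})

# Only A and C answer: the read sees no record at all.
a_and_c = reconcile({"A": None, "C": None})

print(all_three)
print(a_and_c)
```

With responses from A, B and C, the value from B wins and A and C are marked for read repair. If only A and C answer, there is nothing to return or repair, so the outcome depends on which replicas the read actually reaches.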

Hope this helps.

Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>

On 29 April 2016 at 04:33, Anuj Wadehra <an...@yahoo.co.in> wrote:

> Sean,
>
> I meant that commit log archival was never part of the "restoring snapshot"
> DataStax documentation. How is commit log archival related to my concern?
> Please elaborate.
>
> Thanks
> Anuj

RE: Inconsistent Reads after Restoring Snapshot

Posted by Anuj Wadehra <an...@yahoo.co.in>.
Sean,
I meant that commit log archival was never part of the "restoring snapshot" DataStax documentation. How is commit log archival related to my concern? Please elaborate.
Thanks
Anuj

On Thu, 28 Apr, 2016 at 9:24 PM, SEAN_R_DURITY@homedepot.com <SE...@homedepot.com> wrote:
https://docs.datastax.com/en/cassandra/2.0/cassandra/configuration/configLogArchive_t.html

Sean Durity
 
  
 

RE: Inconsistent Reads after Restoring Snapshot

Posted by SE...@homedepot.com.
https://docs.datastax.com/en/cassandra/2.0/cassandra/configuration/configLogArchive_t.html

Sean Durity

From: Anuj Wadehra [mailto:anujw_2003@yahoo.co.in]
Sent: Wednesday, April 27, 2016 10:44 PM
To: user@cassandra.apache.org
Subject: RE: Inconsistent Reads after Restoring Snapshot

No. We are not saving them. I have never read that in the DataStax documentation.

Thanks
Anuj
________________________________

The information in this Internet Email is confidential and may be legally privileged. It is intended solely for the addressee. Access to this Email by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. When addressed to our clients any opinions or advice contained in this Email are subject to the terms and conditions expressed in any applicable governing The Home Depot terms of business or client engagement letter. The Home Depot disclaims all responsibility and liability for the accuracy and content of this attachment and for any damages or losses arising from any inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other items of a destructive nature, which may be contained in this attachment and shall not be liable for direct, indirect, consequential or special damages in connection with this e-mail message or its attachment.

RE: Inconsistent Reads after Restoring Snapshot

Posted by Anuj Wadehra <an...@yahoo.co.in>.
No. We are not saving them. I have never read that in the DataStax documentation.
Thanks
Anuj

On Thu, 28 Apr, 2016 at 12:45 AM, SEAN_R_DURITY@homedepot.com <SE...@homedepot.com> wrote:
What about the commitlogs? Are you saving those off anywhere in between the snapshot and the crash?

Sean Durity
 
  
 

RE: Inconsistent Reads after Restoring Snapshot

Posted by SE...@homedepot.com.
What about the commitlogs? Are you saving those off anywhere in between the snapshot and the crash?


Sean Durity
