Posted to user@cassandra.apache.org by Jason Turner <Ja...@viewpoint.com> on 2015/10/12 17:52:26 UTC

Snapshots - Backup/Restore requirements

We have recently created production-ready Cassandra clusters and are now trying to design and implement a robust backup/restore process.

We host our Cassandra VMs in AWS and understand the snapshot process, but we are unclear on the best way to get snapshots off the server, and on whether every node needs to be snapshotted, with both its snapshots and its tokens backed up.

Can anyone share their DR setups and maybe an overview of how you would recover if you lost your entire cluster?

Thanks!
Jason Turner
HostedOps Engineer | Hosted Operations | 503.416.5080 (d) | 541.281.8084 (c)
Located in the Worldwide Headquarters | Portland, Oregon


Re: Snapshots - Backup/Restore requirements

Posted by Robert Coli <rc...@eventbrite.com>.
On Mon, Oct 12, 2015 at 3:28 PM, Jeff Ferland <jb...@tubularlabs.com> wrote:

> Yeah, that’s got plenty of promise looking at it the 2nd time around. What
> turned me off from using it was the combination of not having a single-shot
> backup mode and appearing to only run in inotify mode which would just
> chuck in every compaction.
>

Glad to hear of your contribution!

However, I believe the "single shot" backup mode you are referring to is
the associated tool "tablepunch" which I'm pretty sure has been contributed
back to upstream?

=Rob

Re: Snapshots - Backup/Restore requirements

Posted by Jeff Ferland <jb...@tubularlabs.com>.
Yeah, that’s got plenty of promise looking at it the 2nd time around. What turned me off from using it was the combination of not having a single-shot backup mode and appearing to only run in inotify mode which would just chuck in every compaction.

While reading through its source code to come up with an actual answer to your question, it looks like I can put in a very small patch that makes a single-shot pass of backups without the inotify loop, and then run it again in parallel as a separate task with an include regex of backups/(.*?!-tmp).
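
To make the single-shot idea concrete, here is a minimal Python sketch of one pass over a data directory with an include regex. It is not tablesnap's code or its command-line interface: the function name single_shot and the pattern are assumptions for illustration, and the pattern is one reading of the intent above (files under backups/ whose names do not contain "-tmp").

```python
import os
import re

# Assumed include pattern: any file under a backups/ directory whose
# remaining path does not contain "-tmp" (in-progress SSTable files).
INCLUDE = re.compile(r"backups/(?!.*-tmp).+$")


def single_shot(root, include=INCLUDE):
    """Walk root exactly once and return the sorted relative paths
    that match the include pattern. No inotify loop is involved."""
    matches = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            rel = rel.replace(os.sep, "/")  # normalise separators for the regex
            if include.search(rel):
                matches.append(rel)
    return sorted(matches)
```

The returned list would then be handed to whatever does the upload; a second pass with a different pattern could cover the snapshots directories.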

So, yes, I suggest using tablesnap. I’ll put the time I would have spent enhancing my own code into testing it, putting up a minor diff for a single-shot flag, and writing some documentation and examples for the snapshot and backup directories.

-Jeff 



Re: Snapshots - Backup/Restore requirements

Posted by Robert Coli <rc...@eventbrite.com>.
On Mon, Oct 12, 2015 at 9:41 AM, Jeff Ferland <jb...@tubularlabs.com> wrote:

> I have a semi-hacky Python script I’ve written up. It needs refining for
> public use, but I’ll put it in Github later today and send you a link as I
> work on it. It uses boto to do concurrent multi-part uploads to S3 with
> retry and resume recording function if it gets interrupted while uploading
> that super huge file.
>

Or you could use tablesnap which has this basic design and has existed and
been maintained and extended with additional tools over the last few years?

https://github.com/JeremyGrosser/tablesnap

=Rob

Re: Snapshots - Backup/Restore requirements

Posted by Jeff Ferland <jb...@tubularlabs.com>.
I have a semi-hacky Python script I’ve written up. It needs refining for public use, but I’ll put it on GitHub later today and send you a link as I work on it. It uses boto to do concurrent multi-part uploads to S3, with retries and a record of completed parts so that it can resume if it gets interrupted while uploading that super huge file.
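
For readers who want the shape of that approach, here is a minimal sketch of concurrent part uploads with retry and a resume record. It is not the script described above. The names plan_parts, upload_file, send_part and state_path are hypothetical, and send_part stands in for the real S3 call (which would go through boto) so that the sketch stays self-contained.

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def plan_parts(size, part_size):
    """Return (part_number, offset, length) tuples covering size bytes."""
    return [(i + 1, off, min(part_size, size - off))
            for i, off in enumerate(range(0, size, part_size))]


def upload_file(path, send_part, state_path, part_size=8 * 1024 * 1024,
                workers=4, retries=3):
    """Upload path in parts and return the sorted list of finished parts.

    send_part(part_number, data) is a hypothetical stand-in for the S3
    call. Finished part numbers are written to state_path so that an
    interrupted run can skip them next time.
    """
    done = set()
    if os.path.exists(state_path):
        with open(state_path) as f:
            done = set(json.load(f))

    def upload_one(part):
        number, offset, length = part
        with open(path, "rb") as f:
            f.seek(offset)
            data = f.read(length)
        for attempt in range(retries):
            try:
                send_part(number, data)
                return number
            except OSError:
                if attempt == retries - 1:
                    raise  # out of retries, surface the error

    todo = [p for p in plan_parts(os.path.getsize(path), part_size)
            if p[0] not in done]
    first_error = None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(upload_one, p) for p in todo]
        for fut in as_completed(futures):
            try:
                done.add(fut.result())
            except OSError as exc:
                first_error = first_error or exc
                continue
            # Record progress after every finished part.
            with open(state_path, "w") as f:
                json.dump(sorted(done), f)
    if first_error:
        raise first_error
    return sorted(done)
```

One thing to keep in mind when wiring this to S3: multipart uploads require every part except the last to be at least 5 MiB, which is why the default part size here is larger than that.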

There are various other released open source tools. Netflix Priam only supports 2.0.x series.
Also, https://github.com/tbarbugli/cassandra_snapshotter is similar in goal to what my scripting does.

https://github.com/JeremyGrosser/tablesnap will use inotify to watch your backup directories and do uploads as well.

Usually I deal with recovery by using -Dcassandra.replace_address to stream from the live replicas. In case of a full disaster, I’d restore directly to disk from the last snapshot. In case of a disaster with a cluster size change, I’d use sstableloader.
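
The three cases above can be written down as a small decision helper. This is only a sketch of the reasoning, not a tool: the function name recovery_plan and its two flags are hypothetical, and the note about tokens is an assumption drawn from how SSTable ownership works rather than something spelled out in this thread.

```python
def recovery_plan(live_replicas, topology_changed):
    """Pick a recovery path for a lost node or a lost cluster.

    live_replicas: True if other nodes still hold copies of the data.
    topology_changed: True if the rebuilt cluster differs in size from
    the one the snapshots were taken on.
    """
    if live_replicas:
        # A dead node in an otherwise healthy cluster.
        return ("start a replacement node with -Dcassandra.replace_address "
                "and stream from the live replicas")
    if topology_changed:
        # Token ownership no longer lines up with the snapshot files.
        return "load the snapshot SSTables with sstableloader"
    # Assumption: each rebuilt node is given the same tokens as the node
    # whose snapshot it receives, otherwise it will not own the data.
    return "copy the last snapshot's SSTables back into the data directories"
```

For example, recovery_plan(False, True) describes the full-disaster case where the new cluster has a different number of nodes.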

-Jeff
