Posted to user@cassandra.apache.org by Akshit Jain <ak...@iiitd.ac.in> on 2017/11/22 13:25:36 UTC

Backup and Restore in Cassandra

What is the correct process to back up and restore in Cassandra?
Should we do the backup node by node, i.e. first take a schema backup from
all the nodes and then everything else?
On restore, should the schema be restored on one node or on all the nodes
again? Restoring it everywhere will give an AlreadyExistsException, so
what's the correct process followed in production?

Re: Backup and Restore in Cassandra

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hi,

Backup / restore can be done in several ways.

The general process is to take a snapshot on all the nodes, at roughly the
same time, for the backup, and to set up a new environment based on those
snapshots for the restore.

Usually a Cassandra backup is made by running 'nodetool snapshot' on each
node, then moving all the snapshots off the node to a safe place.
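
For example, here is a minimal sketch of the per-node backup (the keyspace
name, tag, paths and backup destination are all assumptions, adjust them to
your setup):

    # Dump the schema once (any node will do, they should all agree)
    cqlsh -e "DESCRIBE KEYSPACE my_keyspace" > my_keyspace-schema.cql

    # Take a snapshot of the keyspace on this node, with a tag
    nodetool snapshot -t backup-2017-11-22 my_keyspace

    # Snapshots land under each table's directory, e.g.
    # /var/lib/cassandra/data/my_keyspace/<table>/snapshots/backup-2017-11-22/
    tar czf /tmp/node1-backup-2017-11-22.tar.gz \
        /var/lib/cassandra/data/my_keyspace/*/snapshots/backup-2017-11-22

    # Move the archive off the node
    scp /tmp/node1-backup-2017-11-22.tar.gz backup-host:/backups/node1/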

Restoring is different depending on whether one node was lost or the entire
cluster went down.

*If one node went down*, I would probably not restore from the backup, but
rather have another node replace the failed one. This way there is no data
gap: all the data should make its way to the new node through the streaming
process (the standard node replacement in Cassandra).
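
A sketch of the standard replacement (the dead node's IP address below is
an assumption):

    # On the new, empty node, before the first start, add the dead node's
    # address to the JVM options (e.g. in cassandra-env.sh)
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.12"

    # Then start Cassandra: the node takes over the dead node's tokens
    # and streams its data from the remaining replicas
    sudo service cassandra start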

*If the data on all the nodes is wrong* due to a user action (typically a
bad delete, a big "PEBKAC" of some kind), then it's about stopping all the
nodes, cleaning '/data/ks/table/*', putting the latest correct snapshot
taken from each node back in place, and restarting all the nodes. This
implies a downtime and some data loss, of course.
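
On each node, that could look like this (keyspace, table and tag names are
assumptions; real table directories carry a UUID suffix):

    sudo service cassandra stop

    # Remove the live data files, keeping the snapshots/ directory
    cd /var/lib/cassandra/data/my_keyspace/my_table-<table_id>/
    find . -maxdepth 1 -type f -delete

    # Put the last correct snapshot back in place
    cp snapshots/backup-2017-11-22/* .
    chown -R cassandra:cassandra .

    sudo service cassandra start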

*Also, if for some reason the old nodes are not accessible* (a major
hardware outage), it is good to have the schema (from one node, as all the
nodes should have the same schema) and the snapshots from the old cluster
stored outside of the cluster. If the new nodes use the same tokens as the
old ones, distribute the SSTables by copying the data from each old node to
the new node that owns the same tokens. If that's not straightforward (e.g.
when using vnodes), then using 'sstableloader' is possible, but probably
slower, inducing a longer downtime.
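
A sketch of the sstableloader route (hosts and paths are assumptions). Note
that the schema has to exist in the target cluster before loading, which is
where the schema dump above comes in handy:

    # Recreate the schema on the new cluster first
    cqlsh 10.0.1.10 -f my_keyspace-schema.cql

    # Stream the snapshot's SSTables into the new cluster; the data is
    # redistributed according to the new cluster's token ownership.
    # The last two path components must be <keyspace>/<table>.
    sstableloader -d 10.0.1.10,10.0.1.11 \
        /backups/restore/my_keyspace/my_table/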


Some tools and documentation can help with this topic:

Datastax doc:
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsBackupRestore.html
Tablesnap / Cassback are old tools that were meant to make backup / restore
easier. I never used them, so I'll let you judge these tools.


Finally, I had to work on backup / restore recently on AWS, using EBS.
The option we took is to regularly snapshot the whole '/data' volume using
AWS snapshots, so we have a copy of each volume following our backup
policy. Snapshots on AWS are incremental, which allows relatively cheap
frequent backups, even though AWS snapshots are probably a bit expensive
overall.
With this technique, the restore is really straightforward. We create a new
node (or use the existing one that is to be restored), create a new EBS
volume from the snapshot taken of the old node, and attach this new volume
to the newly created node. When the node starts, it has all the data,
*including the system keyspaces*. So the node knows about the tokens it
owns (even when using vnodes), about the schema, about the other nodes'
token ranges, etc. We tested this solution quite successfully with
Cassandra 2.1, using the same DC / cluster name in the 2 clusters but
distinct seeds (actually, in our test the 2 clusters were on 2 distinct
networks and not able to talk to each other anyway).
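
With the AWS CLI, that looks roughly like this (all IDs, the zone and the
device name below are placeholders):

    # Backup: snapshot the EBS volume holding a node's '/data'
    aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
        --description "cassandra data, node1, 2017-11-22"

    # Restore: create a fresh volume from that snapshot in the new node's
    # availability zone, then attach it to the new instance
    aws ec2 create-volume --snapshot-id snap-0123456789abcdef0 \
        --availability-zone eu-west-1a
    aws ec2 attach-volume --volume-id vol-0fedcba9876543210 \
        --instance-id i-0123456789abcdef0 --device /dev/xvdf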

Maybe this last solution can be applied outside of AWS too: just snapshot
the whole '/data' folder, then put it back on the same node after wiping
out the previous content, or directly on another node, and then just start
Cassandra.
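
For example (paths assumed, and Cassandra stopped on both sides while
copying):

    # On the old node (or from a filesystem-level snapshot of it)
    tar czf /tmp/node1-data.tar.gz -C /var/lib/cassandra data

    # On the target node, wipe the previous content and unpack
    rm -rf /var/lib/cassandra/data/*
    tar xzf /tmp/node1-data.tar.gz -C /var/lib/cassandra
    chown -R cassandra:cassandra /var/lib/cassandra/data

    sudo service cassandra start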

I never read about the technique described above, so even though our tests
were successful, I encourage you to act carefully if going down this path.
Maybe you should consider performing tests as well, copying the entire
Cassandra '/data' folder to a new cluster configured like the first one,
just using distinct seeds, and starting the nodes like this. If this works,
it would solve all the schema and token range ownership considerations, as
the token information for each node is shipped alongside the data, in
'/data/system/'.

Good luck with this topic,

C*heers
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

