Posted to user@couchdb.apache.org by Simon Schwichtenberg <SS...@dspace.de> on 2021/05/07 09:27:15 UTC
Data backup in CouchDB cluster
Hi,
I wonder how you'd do backups of your data in a CouchDB cluster. The documentation does not mention backups of clusters explicitly (https://docs.couchdb.org/en/latest/maintenance/backups.html#database-backups).
When you have a cluster of three nodes and the nodes are set to n=3 and q=2 (see https://docs.couchdb.org/en/latest/cluster/sharding.html), I'd expect every node in the cluster to hold all the data, so you could copy the .couch files from any of the three nodes. When you have 6 nodes with n=3 and q=2, this approach no longer works because every node holds just a single shard. Please correct me if I am wrong.
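As a rough check of that reasoning, here is a small Python sketch of the arithmetic. It assumes the q*n shard copies are spread evenly over the nodes and that no node holds two copies of the same shard range; the function names are made up for illustration and are not part of CouchDB.

```python
def copies_per_node(nodes: int, q: int, n: int) -> float:
    """Average number of shard copies per node, assuming even placement."""
    return q * n / nodes


def node_holds_full_copy(nodes: int, q: int, n: int) -> bool:
    """A node has all the data only if it holds a copy of each of the q ranges."""
    return copies_per_node(nodes, q, n) >= q


if __name__ == "__main__":
    for nodes in (3, 6):
        print(nodes, copies_per_node(nodes, q=2, n=3),
              node_holds_full_copy(nodes, q=2, n=3))
```

Under these assumptions, three nodes hold two shard copies each (the whole database), while six nodes hold one copy each, so no single node's .couch files cover everything.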
What is the best practice for backing up a cluster?
This message is a follow-up from here: https://github.com/cloudant/couchbackup/issues/349
Thanks,
Simon
Re: Data backup in CouchDB cluster
Posted by Jan Lehnardt <ja...@apache.org>.
Hi Simon,
there are multiple aspects to backup: data safety and
time to recovery (TTR).
If your only goal is to have a separate copy of your
data, backing up only one node of your database does
the trick.
However, if you want a short TTR, so your cluster is
complete again as soon as possible, you want a backup of
each of your nodes, so you can replay your backup at
any time. Each node stores the data slightly differently
on a physical level, which means restoring data from
node A’s backup to node B is not trivial. Logically,
though, this means you have three backups when you
might only want one.
If TTR is not a concern, you can just back up a single
node to make sure you can restore that if needed, and rely
on CouchDB's internal backfill when a node is replaced after
a failure without data being restored from a backup. This
will take longer than restoring a node from its own backup.
Best
Jan
—
Professional Support for Apache CouchDB:
https://neighbourhood.ie/couchdb-support/
24/7 Observation for your CouchDB Instances:
https://opservatory.app
Re: Data backup in CouchDB cluster
Posted by Andrea Brancatelli <ab...@schema31.it.INVALID>.
I don't think so, but never tried it.
---
Andrea Brancatelli
On 2021-05-07 12:38, Willem van der Westhuizen wrote:
> This looks very interesting. Is it possible to do incremental backups with this approach, starting only from a given sequence number in the db?
>
> Willem
Re: Data backup in CouchDB cluster
Posted by Willem van der Westhuizen <wi...@kwantu.net>.
This looks very interesting. Is it possible to do incremental backups
with this approach, starting only from a given sequence number in the db?
Willem
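Separately from the couchdb-dump tool, CouchDB's HTTP API exposes a _changes feed that accepts a since parameter, which is one possible building block for an incremental backup. The following is only a sketch of how such a request URL could be built in Python; the host, database name, and sequence value are placeholders, and the helper name is made up for illustration.

```python
from urllib.parse import quote, urlencode


def changes_url(base_url: str, db: str, since: str = "0") -> str:
    """Build a _changes URL that lists changes after the given sequence."""
    # In a cluster, sequence values are opaque strings; pass them back
    # unchanged rather than trying to interpret them.
    query = urlencode({"since": since, "include_docs": "true"})
    return f"{base_url.rstrip('/')}/{quote(db, safe='')}/_changes?{query}"


if __name__ == "__main__":
    # Placeholder values, not taken from this thread.
    print(changes_url("http://localhost:5984", "mydb", since="42-abc"))
```

The response to such a request carries a last_seq field, which a backup script could store and pass as since on its next run.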
Re: Data backup in CouchDB cluster
Posted by Andrea Brancatelli <ab...@schema31.it.INVALID>.
We've been using this for a few years:
https://github.com/danielebailo/couchdb-dump
It has almost always worked flawlessly. It's fast, and, to me, it's better
than backing up the .couch files for various reasons:
* you can restore data on a cluster with a different N/Q layout
* you can restore data on a different machine with a different
cluster name / different IP / different whatever ... .couch files
include references to vm.args parameters.
* when you restore the DB you get a clean db without the tombstones.
* you can back up the db without having local access to the machine,
going through the standard HTTP port.
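To illustrate the general idea of a backup over HTTP, here is a minimal Python sketch. It is not the couchdb-dump code. It assumes a dump taken from GET /{db}/_all_docs?include_docs=true and reshapes it into a body for POST /{db}/_bulk_docs; the helper name and the sample data are made up for illustration.

```python
import json


def to_bulk_docs(all_docs_response: dict) -> dict:
    """Turn an _all_docs?include_docs=true response into a _bulk_docs body."""
    docs = []
    for row in all_docs_response.get("rows", []):
        doc = row.get("doc")
        if doc is None:
            continue  # row carried no document body
        # Drop _rev so the target database creates the document fresh.
        docs.append({k: v for k, v in doc.items() if k != "_rev"})
    return {"docs": docs}


if __name__ == "__main__":
    # Made-up sample shaped like an _all_docs response.
    sample = {
        "total_rows": 1,
        "offset": 0,
        "rows": [
            {"id": "a", "key": "a", "value": {"rev": "1-x"},
             "doc": {"_id": "a", "_rev": "1-x", "name": "example"}},
        ],
    }
    print(json.dumps(to_bulk_docs(sample)))
```

Because the dump is plain JSON documents, it does not depend on how the source cluster laid out its shards, which fits the first two points above.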
Hope it helps you.
---
Andrea Brancatelli