Posted to user@couchdb.apache.org by max <ma...@gmail.com> on 2017/10/23 19:55:01 UTC

CouchDB 2.1 backup and restore

Hi,

I'm upgrading from CouchDB 1.6 and wondering how to back up my databases. In
1.6 I just saved each *.couch file, and when something went wrong with
a specific database I just had to roll back to an older version. I was
looking at my new 2.1 CouchDB, set up as a single node, and realized
each database was split into shards in /shards. I then thought about saving
each piece of each database from these several shard folders, but realized
that _dbs.couch was probably needed as well.

Thus my question is simple: how do I back up a specific database at time T so
that I can roll back ONLY this database at T + N (without modifying other
databases)?

Some tools seem to do the job, such as
https://developer.ibm.com/clouddataservices/2016/03/22/simple-couchdb-and-cloudant-backup/
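
As a rough illustration of what an HTTP-level, per-database backup can look
like, here is a minimal sketch. It is not how the tool linked above works
internally and it is not an official procedure. It assumes a node reachable
at a hypothetical base_url, uses the standard _all_docs and _bulk_docs
endpoints, restores into a new, empty database so that no other database is
touched, and ignores attachments, deleted documents and conflicts.

```python
import json
import urllib.request


def fetch_all_docs(base_url, db):
    """Read every document of one database through the HTTP API."""
    url = f"{base_url}/{db}/_all_docs?include_docs=true"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def to_bulk_payload(all_docs_response):
    """Turn an _all_docs response into a _bulk_docs request body.

    new_edits=False asks CouchDB to store the documents with the
    revisions recorded in the backup instead of generating new ones.
    """
    docs = [row["doc"] for row in all_docs_response["rows"] if row.get("doc")]
    return {"docs": docs, "new_edits": False}


def restore(base_url, db, payload):
    """POST the saved documents into an (assumed empty) target database."""
    req = urllib.request.Request(
        f"{base_url}/{db}/_bulk_docs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

A backup at time T would then be the saved output of
to_bulk_payload(fetch_all_docs(...)), and a rollback would be a restore()
of that payload into a freshly created database. The database names and
any credentials are up to you.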

Last question : In simple words what is the CouchDB 2.x cluster? It does
not seem to allow fail-over neither load-balancing does it? What is the
purpose to split a single database into 3 servers whereas the client part
will only speak with one server through 1 IP? Maybe CPU saving and
scalability?

Thanks for any hints :)

Max

Re: CouchDB 2.1 backup and restore

Posted by Joan Touzet <wo...@apache.org>.

In a 3-node cluster with only 1 node operational, CouchDB can't guarantee
that at least 2 copies of each document are written or read for each operation,
which is the guarantee that we make on typical operations. This will result in,
e.g., 202 status code responses instead of 201 when PUTting a document.
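
For illustration, a client can tell the two cases apart by looking at the
status code of the PUT response. The sketch below is only an example: the
function names, base_url, db and doc_id are hypothetical, and
authentication is left out.

```python
import json
import urllib.request


def describe_write_status(status):
    """Map the status code of a document PUT to a human-readable meaning."""
    if status == 201:
        return "created: write quorum met"
    if status == 202:
        return "accepted: stored, but write quorum not met"
    return f"unexpected status {status}"


def put_document(base_url, db, doc_id, doc):
    """PUT one document and report whether the write quorum was met."""
    req = urllib.request.Request(
        f"{base_url}/{db}/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    # urlopen raises HTTPError for 4xx/5xx; 201 and 202 both return normally.
    with urllib.request.urlopen(req) as resp:
        return describe_write_status(resp.status)
```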


You can read more about this in our documentation here: 


http://docs.couchdb.org/en/2.1.0/cluster/theory.html 


If you're interested in the development process for CouchDB, you can sign up 
for our Development mailing list, or browse activity on it here: 


https://lists.apache.org/list.html?dev@couchdb.apache.org 


-Joan 


Re: CouchDB 2.1 backup and restore

Posted by max <ma...@gmail.com>.

Thank you for your great answer.
When you say "degraded" when only one node is available, what kind of
degradation are you talking about? You just said each node has its own copy of
the data, so this node should be able to do the job at 100% except under high
load, shouldn't it?

About backup & restore, do you know where I can find documentation?

Last question, what are the 2 releases you're talking about? :D

Your explanation was really clear, thank you!

Max.

Re: CouchDB 2.1 backup and restore

Posted by Joan Touzet <wo...@apache.org>.

Hi Max,

With 2 releases being worked on right now, the team is kind of busy, but I'll
answer one of the questions you asked:

> Last question : In simple words what is the CouchDB 2.x cluster? It does
> not seem to allow fail-over neither load-balancing does it? What is the
> purpose to split a single database into 3 servers whereas the client part
> will only speak with one server through 1 IP? Maybe CPU saving and
> scalability?

Distributed systems are less of a fail-over approach like you might be
used to in classic "High Availability" design, and more of a 
redundant/replica approach. Increasingly, these designs are replacing
the old cold/warm standby designs.

With the default settings, those 3 servers each hold a copy of your data.
If any node fails, the cluster will still be able to provide its core
guarantee (r=w=2). With two nodes failed, all operations will be degraded,
but once the failed nodes are replaced, the cluster will self-heal via
so-called "internal replication" and everything will go back to normal.
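
The arithmetic behind this can be sketched as follows. This is a
simplified model, not CouchDB's code: it assumes the quorum is a simple
majority of the n copies (which gives r=w=2 for n=3) and one copy per
node; the function names are made up for the example.

```python
def default_quorum(n):
    """Simple majority of n copies, e.g. 2 when n is 3."""
    return n // 2 + 1


def quorum_met(n, copies_available, quorum=None):
    """True if enough copies are reachable to satisfy the quorum."""
    q = default_quorum(n) if quorum is None else quorum
    return copies_available >= q


if __name__ == "__main__":
    n = 3  # three nodes, each holding one copy
    for failed in range(n):
        state = "guarantee held" if quorum_met(n, n - failed) else "degraded"
        print(f"{failed} node(s) failed: {state}")
```

With one node down, two copies are still reachable and the quorum holds.
With two nodes down, only one copy is left and operations are degraded.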

It also provides load balancing, as long as you place a load balancer
in front of it. It is extremely common to have haproxy or nginx in front
of CouchDB 2.x. If you are worried about a single point of failure, you
can have multiple load balancers in front of the CouchDB cluster, and use
DNS (or similar) to guarantee availability of the cluster.
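
To make the idea concrete, here is a small sketch of the round-robin
selection that a load balancer performs in front of the nodes. It is a toy
model for illustration only, not a replacement for haproxy or nginx; the
class name and node names are hypothetical.

```python
from itertools import cycle


class RoundRobin:
    """Hand out nodes in turn, skipping the ones marked as down."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._healthy = set(self._nodes)
        self._cycle = cycle(self._nodes)

    def mark_down(self, node):
        self._healthy.discard(node)

    def mark_up(self, node):
        if node in self._nodes:
            self._healthy.add(node)

    def next_node(self):
        if not self._healthy:
            raise RuntimeError("no healthy node available")
        # At most one full pass over the ring is needed to find a healthy node.
        for _ in range(len(self._nodes)):
            node = next(self._cycle)
            if node in self._healthy:
                return node
        raise RuntimeError("no healthy node available")
```

A real load balancer adds health checks, timeouts and connection handling
on top of this selection step.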

And yes, this architecture allows you to scale horizontally without
taking the cluster down, which is a key feature.

Perhaps it is time for the reference hardware architecture to make its
way into the docs... ;)

-Joan



Re: CouchDB 2.1 backup and restore

Posted by max <ma...@gmail.com>.

Hi,

A small bump for my backup & restore question: "how do I back up a specific
database at time T so that I can roll back ONLY this database at T + N
(without modifying other databases)?"

Thanks :)
