Posted to user@couchdb.apache.org by Ladislav Thon <la...@gmail.com> on 2012/06/27 18:38:23 UTC

Replicating with two CouchDBs that share the same URL

Hi,

we're using CouchDB (version 1.1.1 currently, but planning to upgrade to
1.2.0) because of its multi-master replication. The replication topology is
a simple star -- single central server and a number of clients that
replicate both from and to the central server. Writes are (almost) always
done on the clients.

Now for high availability, the central server isn't actually a single
machine, but two machines (and therefore two couches) whose IP addresses
are mapped to the same domain name (DNS round robin). These two couches
also replicate with each other. The clients don't know about this, they
always replicate from and to https://central.couch:6984/database.

This might not be the best architecture for HA and we would be able to
change it, but I'd still love to get an answer to this question: is CouchDB
able to cope with this? How does it know that it replicates with the same
couch it replicated with before (so that it only has to replay changes) and
how does it recognize that it replicates with a different couch than before
(and has to copy the whole database)?

I know that it was already proposed several times to add a UUID to the CouchDB
server/database, which would solve this issue, and I also know that it's
very easy to end up with duplicates, which renders universally
unique identifiers ... not so *unique* (i.e. useless).

---

Also, I have a question about replication monitoring. Are there some best
practices for monitoring whether the replication is working? I can of
course read the corresponding document in the _replicator database and look
at the _replication_state field, but this will only tell me that the
replication is *running* -- and I want to know that it's actually
*working*. For
now, we are using a pretty naive approach: 1. Every 10 minutes, write a
document with current date and time to the central couch. 2. Periodically
check on all clients (we have them under control) that the document isn't
too old. Is there a better approach?
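
For concreteness, a minimal Python sketch of that naive heartbeat approach; the
document id, the 10-minute threshold, and plain unauthenticated HTTP access are
placeholder assumptions, not anything prescribed by CouchDB:

import time
import requests

CENTRAL = "https://central.couch:6984/database"  # the round-robin name from above
LOCAL = "http://localhost:5984/database"         # the client's own couch
HEARTBEAT = "replication-heartbeat"              # hypothetical doc id
MAX_AGE = 10 * 60                                # seconds; tune to the write interval

def write_heartbeat():
    # Step 1: run against the central couch every 10 minutes.
    url = CENTRAL + "/" + HEARTBEAT
    doc = {"updated_at": int(time.time())}
    current = requests.get(url)
    if current.status_code == 200:
        doc["_rev"] = current.json()["_rev"]  # reuse the rev so the PUT isn't rejected
    requests.put(url, json=doc).raise_for_status()

def heartbeat_is_fresh():
    # Step 2: run on each client -- has the heartbeat replicated down recently?
    resp = requests.get(LOCAL + "/" + HEARTBEAT)
    if resp.status_code != 200:
        return False
    return time.time() - resp.json()["updated_at"] < MAX_AGE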

Thanks for your opinions!

LT

Re: Replicating with two CouchDBs that share the same URL

Posted by Robert Newson <rn...@apache.org>.
The replicator searches for a checkpoint document on source and target when it starts. This document identifies the update_seq that the previous replication had reached. The checkpoint document's id is derived from the source hostname:port and target hostname:port (plus some other properties).
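
Very roughly, the idea looks like this (a conceptual Python sketch only; the
real replicator hashes additional properties and the exact inputs vary between
versions):

import hashlib

def checkpoint_doc_id(source, target):
    # The id depends only on how the endpoints are addressed
    # (hostname:port and database), not on which machine answers.
    raw = (source + "|" + target).encode("utf-8")
    return "_local/" + hashlib.md5(raw).hexdigest()

# A client replicating with "the central couch" computes the same id no
# matter which of the two machines DNS round robin hands it:
print(checkpoint_doc_id("https://central.couch:6984/database",
                        "http://localhost:5984/database"))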

B.


On 11 Aug 2012, at 14:38, Ladislav Thon wrote:

> Friendly ping? :-)
> 
> LT
> 
> 2012/6/27 Ladislav Thon <la...@gmail.com>
> 
>> Hi,
>> 
>> we're using CouchDB (version 1.1.1 currently, but planning to upgrade to
>> 1.2.0) because of its multi-master replication. The replication topology is
>> a simple star -- single central server and a number of clients that
>> replicate both from and to the central server. Writes are (almost) always
>> done on the clients.
>> 
>> Now for high availability, the central server isn't actually a single
>> machine, but two machines (and therefore two couches) whose IP addresses
>> are mapped to the same domain name (DNS round robin). These two couches
>> also replicate with each other. The clients don't know about this, they
>> always replicate from and to https://central.couch:6984/database.
>> 
>> This might not be the best architecture for HA and we would be able to
>> change it, but I'd still love to get an answer to this question: is CouchDB
>> able to cope with this? How does it know that it replicates with the same
>> couch it replicated with before (so that it only has to replay changes) and
>> how does it recognize that it replicates with a different couch than before
>> (and has to copy the whole database)?
>> 
>> I know that it was already proposed several times to add a UUID to
>> CouchDB server/database, which would solve this issue, and I also know that
>> it's very easy to end up with duplicates, which renders universally
>> unique identifiers ... not so *unique* (i.e. useless).
>> 
>> ---
>> 
>> Also, I have a question about replication monitoring. Are there some best
>> practices for monitoring whether the replication is working? I can of
>> course read the corresponding document in the _replicator database and look
>> at the _replication_state field, but this will only tell me that the
>> replication is *running* -- and I want to know that it's actually
>> *working*. For now, we are using a pretty naive approach: 1. Every 10 minutes,
>> write a document with current date and time to the central couch. 2.
>> Periodically check on all clients (we have them under control) that the
>> document isn't too old. Is there a better approach?
>> 
>> Thanks for your opinions!
>> 
>> LT
>> 


Re: Replicating with two CouchDBs that share the same URL

Posted by Jim Klo <ji...@sri.com>.
See inline:

Sent from my iPad

On Aug 11, 2012, at 6:38 AM, "Ladislav Thon" <la...@gmail.com> wrote:

> Friendly ping? :-)
> 
> LT
> 
> 2012/6/27 Ladislav Thon <la...@gmail.com>
> 
>> Hi,
>> 
>> we're using CouchDB (version 1.1.1 currently, but planning to upgrade to
>> 1.2.0) because of its multi-master replication. The replication topology is
>> a simple star -- single central server and a number of clients that
>> replicate both from and to the central server. Writes are (almost) always
>> done on the clients.
>> 
>> Now for high availability, the central server isn't actually a single
>> machine, but two machines (and therefore two couches) whose IP addresses
>> are mapped to the same domain name (DNS round robin). These two couches
>> also replicate with each other. The clients don't know about this, they
>> always replicate from and to https://central.couch:6984/database.
>> 

So your edges are essentially masters and your central servers are really just slaves, in a manner of speaking? It seems to me that using RRDNS for your central server would be potentially bad: replication uses the changes feed and the last local sequence (which may not be in the same doc_id order across servers), stored in a local 'watermark' doc whose doc_id is a computed hash of the replication doc. RRDNS doesn't guarantee you're always talking to the same server, so your clients are most likely missing docs.

Server 1: seq:doc_id
1:id1 - 2:id2 - 3:id3 - 4:id4 - 5:id5 - 6:id6

Server 2: seq:doc_id
1:id5 - 2:id6 - 3:id4 - 4:id2 - 5:id1 - 6:id3

Consider the possible changes feeds above, which assume all docs exist on both servers but in a different order. If the first replication hits server 1 and only gets through seq 3, it pulls id1-id3. Because of RRDNS the next replication may hit server 2, but since the URL is the same it resumes from its checkpoint at seq 4, which replicates the same docs again (possibly with conflicts) and masks id4-id6 from ever being replicated!
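
Here's a tiny self-contained simulation of that scenario, with the two feeds
and the single shared checkpoint modeled as plain Python data (nothing
CouchDB-specific in it):

server1 = {1: "id1", 2: "id2", 3: "id3", 4: "id4", 5: "id5", 6: "id6"}
server2 = {1: "id5", 2: "id6", 3: "id4", 4: "id2", 5: "id1", 6: "id3"}

replicated = set()
checkpoint = 0  # one checkpoint, keyed by the shared URL, not by server identity

def replicate(feed, since, upto):
    # Pull changes with seq in (since, upto] and return the new checkpoint.
    for seq in range(since + 1, upto + 1):
        replicated.add(feed[seq])
    return upto

# First run happens to hit server 1 and only gets through seq 3 ...
checkpoint = replicate(server1, checkpoint, 3)
# ... the next run is routed to server 2 by RRDNS but resumes from seq 4.
checkpoint = replicate(server2, checkpoint, 6)

print(sorted(replicated))                          # ['id1', 'id2', 'id3']
print(sorted(set(server2.values()) - replicated))  # ['id4', 'id5', 'id6'] -- masked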

>> This might not be the best architecture for HA and we would be able to
>> change it, but I'd still love to get an answer to this question: is CouchDB
>> able to cope with this?

It can't because of RRDNS. 

>> How does it know that it replicates with the same
>> couch it replicated with before (so that it only has to replay changes) and
>> how does it recognize that it replicates with a different couch than before
>> (and has to copy the whole database)?
>> 

It doesn't know it's replicating with a different DB AFAIK.


>> I know that it was already proposed several times to add a UUID to
>> CouchDB server/database, which would solve this issue, and I also know that
>> it's very easy to end up with duplicates, which renders universally
>> unique identifiers ... not so *unique* (i.e. useless).
>> 

I don't know the status of this, but I've not seen replication against multiple servers behind the same domain name work right.

I don't know how many servers you have in total, but just have every server replicate with the others in the cluster. I don't know how well this ultimately scales, but we've not needed RRDNS to make it work.

Also, I'm assuming any app you use is pinned to a specific server or uses session affinity; otherwise your app will have inconsistent behavior too.

>> ---
>> 
>> Also, I have a question about replication monitoring. Are there some best
>> practices for monitoring whether the replication is working? I can of
>> course read the corresponding document in the _replicator database and look
>> at the _replication_state field, but this will only tell me that the
>> replication is *running* -- and I want to know that it's actually
>> *working*. For now, we are using a pretty naive approach: 1. Every 10 minutes,
>> write a document with current date and time to the central couch. 2.
>> Periodically check on all clients (we have them under control) that the
>> document isn't too old. Is there a better approach?
>> 

Your approach to checking consistency is probably about the simplest one could do. I do think your RRDNS setup could cause rare but real issues. I'd suggest just having each client (edge) replicate with all central servers using unique server names, and having the central servers replicate with each other (though that shouldn't be necessary if the clients are replicating to both).
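
For example, a rough sketch of the _replicator documents one client could
create, assuming hypothetical unique hostnames central-1.couch and
central-2.couch for the two central machines (credentials and error handling
omitted):

import requests

LOCAL = "http://localhost:5984"
CENTRALS = ["https://central-1.couch:6984", "https://central-2.couch:6984"]
DB = "database"

for i, central in enumerate(CENTRALS, start=1):
    pairs = {
        "pull": (central + "/" + DB, LOCAL + "/" + DB),
        "push": (LOCAL + "/" + DB, central + "/" + DB),
    }
    for direction, (source, target) in pairs.items():
        doc = {"source": source, "target": target, "continuous": True}
        # Each doc addresses one machine by its own name, so each
        # (source, target) pair keeps its own checkpoints.
        doc_id = direction + "-central-" + str(i)
        requests.put(LOCAL + "/_replicator/" + doc_id, json=doc).raise_for_status()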

>> Thanks for your opinions!
>> 
>> LT
>> 

Re: Replicating with two CouchDBs that share the same URL

Posted by Ladislav Thon <la...@gmail.com>.
Friendly ping? :-)

LT

2012/6/27 Ladislav Thon <la...@gmail.com>

> Hi,
>
> we're using CouchDB (version 1.1.1 currently, but planning to upgrade to
> 1.2.0) because of its multi-master replication. The replication topology is
> a simple star -- single central server and a number of clients that
> replicate both from and to the central server. Writes are (almost) always
> done on the clients.
>
> Now for high availability, the central server isn't actually a single
> machine, but two machines (and therefore two couches) whose IP addresses
> are mapped to the same domain name (DNS round robin). These two couches
> also replicate with each other. The clients don't know about this, they
> always replicate from and to https://central.couch:6984/database.
>
> This might not be the best architecture for HA and we would be able to
> change it, but I'd still love to get an answer to this question: is CouchDB
> able to cope with this? How does it know that it replicates with the same
> couch it replicated with before (so that it only has to replay changes) and
> how does it recognize that it replicates with a different couch than before
> (and has to copy the whole database)?
>
> I know that it was already proposed several times to add a UUID to
> CouchDB server/database, which would solve this issue, and I also know that
> it's very easy to end up with duplicates, which renders universally
> unique identifiers ... not so *unique* (i.e. useless).
>
> ---
>
> Also, I have a question about replication monitoring. Are there some best
> practices for monitoring whether the replication is working? I can of
> course read the corresponding document in the _replicator database and look
> at the _replication_state field, but this will only tell me that the
> replication is *running* -- and I want to know that it's actually
> *working*. For now, we are using a pretty naive approach: 1. Every 10 minutes,
> write a document with current date and time to the central couch. 2.
> Periodically check on all clients (we have them under control) that the
> document isn't too old. Is there a better approach?
>
> Thanks for your opinions!
>
> LT
>