Posted to dev@couchdb.apache.org by Miles Fidelman <mf...@meetinghouse.net> on 2010/04/21 04:29:27 UTC

documentation of replication protocol?

Hi Folks,

I've been looking, but can't seem to find any good documentation of the 
inter-node protocol used for replication.

I've been thinking of playing with a multi-cast alternative to the 
current pair-wise replication model - but, of course, that's hard to do 
without visibility into the format of the messages exchanged during 
replication.

Miles Fidelman

-- 
In theory, there is no difference between theory and practice.
In<fnord>  practice, there is.   .... Yogi Berra



Re: documentation of replication protocol?

Posted by Adam Kocoloski <ko...@apache.org>.
On Apr 21, 2010, at 8:33 AM, Miles Fidelman wrote:

> J Chris Anderson wrote:
>> On Apr 20, 2010, at 7:29 PM, Miles Fidelman wrote:
>>   
>>> I've been looking, but can't seem to find any good documentation of the inter-node protocol used for replication.
>>>     
>> As far as I know, the best source for documentation is the code, right now.
>>   
> <snip>
>> This is the hard coding (in Ruby) I had to add, to use the CouchDB replicator to pull from the Booth server:
>> 
>> http://github.com/jchris/booth/commit/2deff74e03838a6e7ef95b725c4342a08239a2b8#commitcomment-68685
>> 
>>   
> aaarrrgggh......
> 
> I don't suppose anyone out there has scribbled down anything resembling a sequence diagram or flowchart or list of bullet points or something that summarizes the steps that happen, and the code that gets run when POST /_replicate is invoked, or an ASN.1-like summary of the messages that get exchanged between two couch instances during replication?

Miles, at a high level it looks like this:

http://dl.dropbox.com/u/237885/nyc-nosql-replicator.png

There are some added details regarding efficient replication of document attachments, and of course that slide doesn't describe the request and response formats for those 4 resources.  But that's the sequence of steps that the replicator loops through.

Adam

> 
> right now, replication reminds me of the old Sidney Harris Cartoon, "then a miracle occurs" (http://www.sciencecartoonsplus.com/pages/gallery.php)
> 
> -- 
> In theory, there is no difference between theory and practice.
> In<fnord>  practice, there is.   .... Yogi Berra
> 
> 


Re: documentation

Posted by Miles Fidelman <mf...@meetinghouse.net>.
Adam Kocoloski wrote:
> Simon Metson reminded me that I wrote down something like this for him a few months back.  Here it is.  It describes the replication workflow using inline document attachments, rather than the more efficient multipart requests which are supported in 0.11.  Hope it helps.  Regards,
>
>    
A good starting point.  Thanks!  Probably worth putting somewhere on the 
wiki for future reference.

And...

> In terms of broader architectural overviews, you may find Ricky Ho's set of articles useful:
>
> http://horicky.blogspot.com/2008/10/couchdb-implementation.html
>
>    
Exactly what I was looking for.  Thanks again!  (And now that I know 
what to look for, I found the link to it on the couch wiki).

Miles



>    
>> So, the sequence of calls depends on whether you're pulling updates from this remote server or pushing updates to it.  Let's consider the two cases separately:
>>
>> ## Pull Replication (remote source, local target)
>>
>> ### HEAD /db
>> Respond with a 200 status code and you're good.
>>
>> ### GET /db/_local/<rep id>
>> The replicator checkpoints its progress in these _local documents.  You can respond with a 404 if you like, otherwise the response should be JSON that looks very much like a replication response, e.g. the one described here:
>>
>> http://books.couchdb.org/relax/reference/replication#Replication%20in%20Detail
>>
>> Basically, if the _local doc exists on both the source and target DBs, and the two documents agree on the value of "source_last_seq", the replicator will resume from that update sequence on the source.
>>
>> ### GET /db/_changes?style=all_docs&heartbeat=10000&since=N[&feed=continuous]
>>
>> This is the hard part.  The replicator makes this request on a separate connection to your server, asking for a list of changes since N (the source_last_seq from the previous step).  If the replication is meant to be permanent, the feed=continuous parameter will be supplied.  The best reference for the response format is definitely the O'Reilly book:
>>
>> http://books.couchdb.org/relax/reference/change-notifications
>>
>> ### GET /db/docid?revs=true&latest=true&open_revs=["1-23420432",...]
>>
>> You'll see one of these for each updated document if the update does not already exist on the target. I believe the response is a JSON Array:
>>
>> [{"ok":{"_id":"docid","_rev":"1-23420432", ..rest of doc}}, {"missing":"some-bad-rev"}]
>>
>> The "missing" case is very rare and is usually the result of somebody racing the replicator.
>>
>> ### GET /db/docid/attachment?rev=1-234923042
>>
>> Attachments are downloaded separately during pull replication.  The correct response is the binary data.
>>
>> ### PUT /db/_local/<rep id>
>>
>> Periodically the replicator will try to save an updated _local doc with the new replication history. The response is {"ok":true, "rev":NewRevId}
>>
>> That's it for pull replication.
>>
>> ## Push Replication (local source, remote target)
>>
>> The _local doc calls are still there, but now we have two new POSTs:
>>
>> POST /db/_missing_revs -d '{"docid1":["1-24323423"], "docid2":["2-23434534"]}'
>>
>> This is the replicator asking the target if these document revisions are already saved there.  The response is a list of the ones that are missing:
>>
>> {"missing_revs":{"docid2":["2-23434534"]}}
>>
>> POST /db/_bulk_docs -d '{"new_edits":false, "docs":[... array of documents ...]}'
>>
>> This one is exactly like the regular _bulk_docs call.  The new_edits:false parameter tells the target not to reject conflicting updates, but instead to save all of them, as conflict revisions if necessary. Currently attachments are inlined, although in 0.11 we'll be doing special multipart PUTs for documents with attachments instead of using _bulk_docs (so we don't need to Base64 encode them). Best,
>>
>> Adam
>>      
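
For future reference, the pull-side requests in the write-up quoted above can be sketched in Python. This is only an illustration, not the replicator's actual code: the function names are made up, and falling back to sequence 0 when the checkpoints are missing or disagree is an assumption (the write-up only says what happens when they agree).

```python
import json
from urllib.parse import quote, urlencode


def changes_path(db, since, continuous=False):
    """Build the _changes request path listed in the write-up."""
    params = [("style", "all_docs"), ("heartbeat", "10000"), ("since", str(since))]
    if continuous:
        # Only supplied when the replication is meant to be permanent.
        params.append(("feed", "continuous"))
    return "/%s/_changes?%s" % (db, urlencode(params))


def open_revs_path(db, doc_id, revs):
    """Build the per-document fetch; open_revs is a JSON array in the query string."""
    params = [
        ("revs", "true"),
        ("latest", "true"),
        ("open_revs", json.dumps(revs, separators=(",", ":"))),
    ]
    return "/%s/%s?%s" % (db, quote(doc_id, safe=""), urlencode(params))


def start_seq(source_checkpoint, target_checkpoint):
    """Pick the sequence to resume from, given the two _local docs (or None for a 404).

    Assumption: a missing or mismatched checkpoint means starting over from 0.
    """
    if source_checkpoint is None or target_checkpoint is None:
        return 0
    a = source_checkpoint.get("source_last_seq")
    b = target_checkpoint.get("source_last_seq")
    return a if a is not None and a == b else 0
```

A server that wants to act as a pull source would need to answer paths of this shape, plus HEAD /db and the _local document GET and PUT.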


-- 
In theory, there is no difference between theory and practice.
In<fnord>  practice, there is.   .... Yogi Berra



Re: documentation of replication protocol?

Posted by Adam Kocoloski <ko...@apache.org>.
On Apr 21, 2010, at 8:33 AM, Miles Fidelman wrote:

> J Chris Anderson wrote:
>> On Apr 20, 2010, at 7:29 PM, Miles Fidelman wrote:
>>   
>>> I've been looking, but can't seem to find any good documentation of the inter-node protocol used for replication.
>>>     
>> As far as I know, the best source for documentation is the code, right now.
>>   
> <snip>
>> This is the hard coding (in Ruby) I had to add, to use the CouchDB replicator to pull from the Booth server:
>> 
>> http://github.com/jchris/booth/commit/2deff74e03838a6e7ef95b725c4342a08239a2b8#commitcomment-68685
>> 
>>   
> aaarrrgggh......
> 
> I don't suppose anyone out there has scribbled down anything resembling a sequence diagram or flowchart or list of bullet points or something that summarizes the steps that happen, and the code that gets run when POST /_replicate is invoked, or an ASN.1-like summary of the messages that get exchanged between two couch instances during replication?
> 
> right now, replication reminds me of the old Sidney Harris Cartoon, "then a miracle occurs" (http://www.sciencecartoonsplus.com/pages/gallery.php)
> 
> -- 
> In theory, there is no difference between theory and practice.
> In<fnord>  practice, there is.   .... Yogi Berra

Hi Miles,

Simon Metson reminded me that I wrote down something like this for him a few months back.  Here it is.  It describes the replication workflow using inline document attachments, rather than the more efficient multipart requests which are supported in 0.11.  Hope it helps.  Regards,

Adam

On 8 Dec 2009, at 01:42, Adam Kocoloski wrote:

> So, the sequence of calls depends on whether you're pulling updates from this remote server or pushing updates to it.  Let's consider the two cases separately:
> 
> ## Pull Replication (remote source, local target)
> 
> ### HEAD /db
> Respond with a 200 status code and you're good.
> 
> ### GET /db/_local/<rep id>
> The replicator checkpoints its progress in these _local documents.  You can respond with a 404 if you like, otherwise the response should be JSON that looks very much like a replication response, e.g. the one described here:
> 
> http://books.couchdb.org/relax/reference/replication#Replication%20in%20Detail
> 
> Basically, if the _local doc exists on both the source and target DBs, and the two documents agree on the value of "source_last_seq", the replicator will resume from that update sequence on the source.
> 
> ### GET /db/_changes?style=all_docs&heartbeat=10000&since=N[&feed=continuous]
> 
> This is the hard part.  The replicator makes this request on a separate connection to your server, asking for a list of changes since N (the source_last_seq from the previous step).  If the replication is meant to be permanent, the feed=continuous parameter will be supplied.  The best reference for the response format is definitely the O'Reilly book:
> 
> http://books.couchdb.org/relax/reference/change-notifications
> 
> ### GET /db/docid?revs=true&latest=true&open_revs=["1-23420432",...]
> 
> You'll see one of these for each updated document if the update does not already exist on the target. I believe the response is a JSON Array:
> 
> [{"ok":{"_id":"docid","_rev":"1-23420432", ..rest of doc}}, {"missing":"some-bad-rev"}]
> 
> The "missing" case is very rare and is usually the result of somebody racing the replicator.
> 
> ### GET /db/docid/attachment?rev=1-234923042
> 
> Attachments are downloaded separately during pull replication.  The correct response is the binary data.
> 
> ### PUT /db/_local/<rep id>
> 
> Periodically the replicator will try to save an updated _local doc with the new replication history. The response is {"ok":true, "rev":NewRevId}
> 
> That's it for pull replication.
> 
> ## Push Replication (local source, remote target)
> 
> The _local doc calls are still there, but now we have two new POSTs:
> 
> POST /db/_missing_revs -d '{"docid1":["1-24323423"], "docid2":["2-23434534"]}'
> 
> This is the replicator asking the target if these document revisions are already saved there.  The response is a list of the ones that are missing:
> 
> {"missing_revs":{"docid2":["2-23434534"]}}
> 
> POST /db/_bulk_docs -d '{"new_edits":false, "docs":[... array of documents ...]}'
> 
> This one is exactly like the regular _bulk_docs call.  The new_edits:false parameter tells the target not to reject conflicting updates, but instead to save all of them, as conflict revisions if necessary. Currently attachments are inlined, although in 0.11 we'll be doing special multipart PUTs for documents with attachments instead of using _bulk_docs (so we don't need to Base64 encode them). Best,
> 
> Adam
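
The push half of the write-up above boils down to two request bodies and one filtering step. A minimal Python sketch, assuming hypothetical helper names and documents that carry their own "_id" and "_rev" fields (attachments and the HTTP transport are left out):

```python
import json


def missing_revs_body(candidates):
    """Serialize {doc_id: [rev, ...]} as the body of POST /db/_missing_revs."""
    return json.dumps(candidates)


def docs_to_push(docs, missing_revs_response):
    """Keep only the documents whose revision the target reported as missing."""
    missing = missing_revs_response.get("missing_revs", {})
    return [d for d in docs if d["_rev"] in missing.get(d["_id"], [])]


def bulk_docs_body(docs):
    """Serialize the body of POST /db/_bulk_docs with new_edits set to false."""
    return json.dumps({"new_edits": False, "docs": docs})
```

With the example from the write-up, where only docid2 comes back in "missing_revs", docs_to_push drops docid1 and the _bulk_docs body carries docid2 alone.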


Re: documentation of replication protocol?

Posted by Miles Fidelman <mf...@meetinghouse.net>.
J Chris Anderson wrote:
> On Apr 20, 2010, at 7:29 PM, Miles Fidelman wrote:
>    
>> I've been looking, but can't seem to find any good documentation of the inter-node protocol used for replication.
>>      
> As far as I know, the best source for documentation is the code, right now.
>    
<snip>
> This is the hard coding (in Ruby) I had to add, to use the CouchDB replicator to pull from the Booth server:
>
> http://github.com/jchris/booth/commit/2deff74e03838a6e7ef95b725c4342a08239a2b8#commitcomment-68685
>
>    
aaarrrgggh......

I don't suppose anyone out there has scribbled down anything resembling 
a sequence diagram or flowchart or list of bullet points or something 
that summarizes the steps that happen, and the code that gets run when 
POST /_replicate is invoked, or an ASN.1-like summary of the messages 
that get exchanged between two couch instances during replication?

right now, replication reminds me of the old Sidney Harris Cartoon, 
"then a miracle occurs" 
(http://www.sciencecartoonsplus.com/pages/gallery.php)

-- 
In theory, there is no difference between theory and practice.
In<fnord>  practice, there is.   .... Yogi Berra



Re: documentation of replication protocol?

Posted by Adam Kocoloski <ko...@apache.org>.
On Apr 21, 2010, at 12:05 AM, J Chris Anderson wrote:

> 
> On Apr 20, 2010, at 7:29 PM, Miles Fidelman wrote:
> 
>> Hi Folks,
>> 
>> I've been looking, but can't seem to find any good documentation of the inter-node protocol used for replication.
>> 
>> I've been thinking of playing with a multi-cast alternative to the current pair-wise replication model - but, of course, that's hard to do without visibility into the format of the messages exchanged during replication.
>> 
>> Miles Fidelman
> 
> As far as I know, the best source for documentation is the code, right now.
> 
> My reservation about the replication protocol is that it is more brittle than JSON (it requires some exact string matches in the source). With an event-based JSON parser, we could accept any valid JSON instead of hard coding the output of replicators.
> 
> One thing that strikes me is that if we had a browser-based test for the replicator protocol, we could clean this up substantially. This test suite would be a great contribution from anyone out there wanting to learn the replicator really well, but you might need to collaborate with someone to help get the tests to pass, in places.
> 
> This is the hard coding (in Ruby) I had to add, to use the CouchDB replicator to pull from the Booth server:
> 
> http://github.com/jchris/booth/commit/2deff74e03838a6e7ef95b725c4342a08239a2b8#commitcomment-68685
> 
> This is fine if we're just trying to replicate between CouchDB instances, but a challenge for people building interoperable replicators. 
> 
> Chris

Hi Chris, I need a little clarification here.  Was the hack on line 57 the specific placement of newlines, the ordering of fields in the JSON Object, or something else?

The CouchDB replicator does use a regular expression to split the _changes feed into individual events.  If you're talking about the need for newlines in between events, yes, that was a silly oversight on our part, and a simple bugfix.

The requirement for "last_seq" to appear after "results" in the object is also a simple thing to fix.  There's no good reason for the replication protocol to be more brittle than JSON.

Adam
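
To illustrate what "no more brittle than JSON" could look like on the consuming side, here is a small Python sketch that relies on a real JSON parser instead of a regular expression. It is not the CouchDB replicator's code (which is Erlang); the function names and the sample payloads in the usage note are made up for illustration.

```python
import json


def parse_changes(body):
    """Parse a complete _changes response; key order and whitespace do not matter."""
    obj = json.loads(body)
    return obj["results"], obj["last_seq"]


def iter_events(buffer):
    """Yield each JSON value found in buffer, whether or not newlines separate them."""
    decoder = json.JSONDecoder()
    pos, end = 0, len(buffer)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and buffer[pos].isspace():
            pos += 1
        if pos >= end:
            break
        obj, pos = decoder.raw_decode(buffer, pos)
        yield obj
```

For example, parse_changes('{"last_seq":5,"results":[]}') and parse_changes('{"results":[],"last_seq":5}') give the same answer, and iter_events handles '{"seq":1}{"seq":2}' the same way as the newline-separated form.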

Re: documentation of replication protocol?

Posted by J Chris Anderson <jc...@gmail.com>.
On Apr 20, 2010, at 7:29 PM, Miles Fidelman wrote:

> Hi Folks,
> 
> I've been looking, but can't seem to find any good documentation of the inter-node protocol used for replication.
> 
> I've been thinking of playing with a multi-cast alternative to the current pair-wise replication model - but, of course, that's hard to do without visibility into the format of the messages exchanged during replication.
> 
> Miles Fidelman

As far as I know, the best source for documentation is the code, right now.

My reservation about the replication protocol is that it is more brittle than JSON (it requires some exact string matches in the source). With an event-based JSON parser, we could accept any valid JSON instead of hard coding the output of replicators.

One thing that strikes me is that if we had a browser-based test for the replicator protocol, we could clean this up substantially. This test suite would be a great contribution from anyone out there wanting to learn the replicator really well, but you might need to collaborate with someone to help get the tests to pass, in places.

This is the hard coding (in Ruby) I had to add, to use the CouchDB replicator to pull from the Booth server:

http://github.com/jchris/booth/commit/2deff74e03838a6e7ef95b725c4342a08239a2b8#commitcomment-68685

This is fine if we're just trying to replicate between CouchDB instances, but a challenge for people building interoperable replicators. 

Chris



