Posted to commits@couchdb.apache.org by kx...@apache.org on 2013/07/24 14:24:46 UTC
[09/50] [abbrv] git commit: updated
refs/heads/1781-reorganize-and-improve-docs to fa11c25
Add replication protocol definition.
COUCHDB-1824
Project: http://git-wip-us.apache.org/repos/asf/couchdb/repo
Commit: http://git-wip-us.apache.org/repos/asf/couchdb/commit/ae9ceacc
Tree: http://git-wip-us.apache.org/repos/asf/couchdb/tree/ae9ceacc
Diff: http://git-wip-us.apache.org/repos/asf/couchdb/diff/ae9ceacc
Branch: refs/heads/1781-reorganize-and-improve-docs
Commit: ae9ceacc8fe3aa0fdd9b6ff2a201e4cd76487953
Parents: 4ecafeb
Author: Alexander Shorin <kx...@apache.org>
Authored: Tue Jul 23 22:41:28 2013 +0400
Committer: Alexander Shorin <kx...@apache.org>
Committed: Wed Jul 24 10:48:37 2013 +0400
----------------------------------------------------------------------
share/doc/build/Makefile.am | 3 +
share/doc/src/replications/index.rst | 1 +
share/doc/src/replications/protocol.rst | 201 +++++++++++++++++++++++++++
3 files changed, 205 insertions(+)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/couchdb/blob/ae9ceacc/share/doc/build/Makefile.am
----------------------------------------------------------------------
diff --git a/share/doc/build/Makefile.am b/share/doc/build/Makefile.am
index 1580f85..cb60fba 100644
--- a/share/doc/build/Makefile.am
+++ b/share/doc/build/Makefile.am
@@ -70,6 +70,7 @@ html_files = \
html/_sources/config/proxying.txt \
html/_sources/replications/index.txt \
html/_sources/replications/intro.txt \
+ html/_sources/replications/protocol.txt \
html/_sources/replications/replicator.txt \
html/_sources/changelog.txt \
html/_sources/changes.txt \
@@ -127,6 +128,7 @@ html_files = \
html/config/proxying.html \
html/replications/index.html \
html/replications/intro.html \
+ html/replications/protocol.html \
html/replications/replicator.html \
html/changelog.html \
html/changes.html \
@@ -182,6 +184,7 @@ src_files = \
../src/config/proxying.rst \
../src/replications/index.rst \
../src/replications/intro.rst \
+ ../src/replications/protocol.rst \
../src/replications/replicator.rst \
../src/changelog.rst \
../src/changes.rst \
http://git-wip-us.apache.org/repos/asf/couchdb/blob/ae9ceacc/share/doc/src/replications/index.rst
----------------------------------------------------------------------
diff --git a/share/doc/src/replications/index.rst b/share/doc/src/replications/index.rst
index b626bc6..940a29c 100644
--- a/share/doc/src/replications/index.rst
+++ b/share/doc/src/replications/index.rst
@@ -33,3 +33,4 @@ destination database.
intro
replicator
+ protocol
http://git-wip-us.apache.org/repos/asf/couchdb/blob/ae9ceacc/share/doc/src/replications/protocol.rst
----------------------------------------------------------------------
diff --git a/share/doc/src/replications/protocol.rst b/share/doc/src/replications/protocol.rst
new file mode 100644
index 0000000..bf478f7
--- /dev/null
+++ b/share/doc/src/replications/protocol.rst
@@ -0,0 +1,201 @@
+.. Licensed under the Apache License, Version 2.0 (the "License"); you may not
+.. use this file except in compliance with the License. You may obtain a copy of
+.. the License at
+..
+.. http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+.. WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+.. License for the specific language governing permissions and limitations under
+.. the License.
+
+.. _replication/protocol:
+
+============================
+CouchDB Replication Protocol
+============================
+
+The **CouchDB Replication protocol** is a protocol for synchronizing
+documents between 2 peers over HTTP/1.1.
+
+Language
+--------
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+document are to be interpreted as described in :rfc:`2119`.
+
+
+Goals
+-----
+
+The CouchDB Replication protocol synchronizes documents between two
+peers over HTTP/1.1.
+
+In theory, the CouchDB protocol can be used between any products that
+implement it. The reference implementation, written in Erlang_, is
+provided by the couch_replicator_ module available in Apache CouchDB.
+
+
+The CouchDB_ replication protocol uses the `CouchDB REST API
+<http://wiki.apache.org/couchdb/Reference>`_ and so is based on HTTP and
+the Apache CouchDB MVCC data model. The primary goal of this
+specification is to describe the CouchDB replication algorithm.
+
+
+Definitions
+-----------
+
+ID
+ An identifier (could be a UUID) as described in :rfc:`4122`
+
+Sequence
+ An ID provided by the changes feed. It can be numeric, but not
+ necessarily.
+
+Revision
+ (to define)
+
+Document
+ A document is a JSON entity with a unique ID and revision.
+
+Database
+ A collection of documents with a unique URI
+
+URI
+ A URI is defined by :rfc:`2396`. It can be a URL as defined
+ in :rfc:`1738`.
+
+Source
+ Database from which the Documents are replicated
+
+Target
+ Database to which the Documents are replicated
+
+Checkpoint
+ Last source sequence ID
+
+
+Algorithm
+---------
+
+1. Get unique identifiers for the Source and Target based on their URIs if
+ no replication task ID is available.
+
+2. Save this identifier in a special Document named `_local/<uniqueid>`
+ on the Target database. This document isn't replicated. It will
+ collect the last Source sequence ID, the Checkpoint, from the
+ previous replication process.
+
+3. Get the Source changes feed by calling the `/<source>/_changes` URL,
+ passing the Checkpoint in the `since` parameter. The
+ changes feed only returns a list of current revisions.
+
+
+.. note::
+
+ This step can be done continuously using the `feed=longpoll` or
+ `feed=continuous` parameters. The feed then keeps delivering
+ changes as they occur.
+
+
+4. Collect a group of Document/Revision ID pairs from the **changes
+ feed** and send them to the Target database on the
+ `/<target>/_revs_diff` URL. The result will contain the list of
+ revisions **NOT** in the Target.
+
+5. GET each revision from the Source database by calling the URL
+ `/<source>/<docid>?revs=true&open_revs=<revision>`. This
+ will get the document with its parent revisions. Also don't forget to
+ get attachments that aren't already stored at the Target. As an
+ optimisation you can use the HTTP multipart API to get them all.
+
+6. Collect a group of revisions fetched in the previous step and store them
+ on the Target database using the `Bulk Docs
+ <http://wiki.apache.org/couchdb/HTTP_Document_API#Bulk_Docs>`_ API
+ with the `new_edits: false` JSON property to preserve their revision
+ IDs.
+
+7. After the group of revisions is stored on the Target, save
+ the new Checkpoint on the Source.
+
+
+.. note::
+
+ - Even if some revisions have been ignored, the sequence should be
+ taken into consideration for the Checkpoint.
+
+ - To compare the ordering of non-numeric sequences, you will have to keep an
+ ordered list of the sequence IDs as they appear in the _changes
+ feed and compare their indices.
+
+Filter replication
+------------------
+
+The replication can be filtered by passing the `filter` parameter to the
+changes feed with a function name. This will call the function on each
+change. If the function returns true, the document will be added to the
+feed.
+
+
+Optimisations
+-------------
+
+- The system should run the steps in parallel to reduce latency.
+
+- The number of revisions passed to steps 3 and 6 should be large
+ enough to reduce both bandwidth use and latency.
+
+
+API Reference
+-------------
+
+- :ref:`api/db.head` -- Check Database existence
+- :ref:`api/db/ensure_full_commit` -- Ensure that all changes are stored on disk
+- :ref:`api/local/doc.get` -- Read the last Checkpoint
+- :ref:`api/local/doc.put` -- Save a new Checkpoint
+
+Push Only
+~~~~~~~~~
+
+- :ref:`api/db.put` -- Create Target if it does not exist and the option was provided
+- :ref:`api/db/revs_diff.post` -- Locate Revisions that are not known to the
+ Target
+- :ref:`api/db/bulk_docs.post` -- Upload Revisions to the Target
+- :ref:`api/doc.put`?new_edits=false -- Upload a single Document with
+ attachments to the Target
+
+Pull Only
+~~~~~~~~~
+
+- :ref:`api/db/changes.get` -- Locate changes on Source since the last pull.
+ The request uses the following query parameters:
+
+ - ``style=all_docs``
+ - ``feed=feed``, where feed is :ref:`normal <changes/normal>` or
+ :ref:`longpoll <changes/longpoll>`
+ - ``limit=limit``
+ - ``heartbeat=heartbeat``
+
+- :ref:`api/doc.get` -- Retrieve a single Document from Source with attachments.
+ The request uses the following query parameters:
+
+ - ``open_revs=revid`` - where ``revid`` is the actual Document Revision at the
+ moment of the pull request
+ - ``revs=true``
+ - ``atts_since=lastrev``
+
+Reference
+---------
+
+* `TouchDB iOS wiki <https://github.com/couchbaselabs/TouchDB-iOS/wiki/Replication-Algorithm>`_
+* `CouchDB documentation
+ <http://wiki.apache.org/couchdb/Replication>`_
+* CouchDB `change notifications`_
+
+.. _CouchDB: http://couchdb.apache.org
+.. _Erlang: http://erlang.org
+.. _couch_replicator: https://github.com/apache/couchdb/tree/master/src/couch_replicator
+.. _change notifications: http://guide.couchdb.org/draft/notifications.html
+
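
The Algorithm section of protocol.rst above can be illustrated with a small
in-memory sketch. It is not the couch_replicator implementation and makes no
HTTP calls: `MemoryDB`, `replicate`, `batch_size` and the `source_last_seq`
field name are hypothetical names chosen for this example, sequences are
assumed to be plain integers, and the Checkpoint is kept only in the target's
`_local/<uniqueid>` document as described in step 2.

```python
"""In-memory sketch of the replication loop; no HTTP involved."""


class MemoryDB:
    """Hypothetical stand-in for a database; not a CouchDB client."""

    def __init__(self):
        self.revisions = {}   # (doc_id, rev) -> document body
        self.log = []         # (doc_id, rev) in arrival order; seq = index + 1
        self.local = {}       # non-replicated documents such as checkpoints

    def put(self, doc_id, rev, body):
        # Stores a revision as-is, like _bulk_docs with new_edits=false.
        if (doc_id, rev) not in self.revisions:
            self.revisions[(doc_id, rev)] = body
            self.log.append((doc_id, rev))

    def changes(self, since=0):
        # Like GET /<db>/_changes?since=<since>: (seq, doc_id, rev) tuples.
        return [(seq, doc_id, rev)
                for seq, (doc_id, rev) in enumerate(self.log, start=1)
                if seq > since]

    def revs_diff(self, pairs):
        # Like POST /<db>/_revs_diff: the pairs this database does not have.
        return [pair for pair in pairs if pair not in self.revisions]


def replicate(source, target, replication_id, batch_size=2):
    """Copy missing revisions from source to target; return how many."""
    checkpoint_id = "_local/" + replication_id
    # Steps 1-2: read the Checkpoint left by the previous run, if any.
    since = target.local.get(checkpoint_id, {}).get("source_last_seq", 0)
    changes = source.changes(since)                      # step 3
    copied = 0
    for start in range(0, len(changes), batch_size):
        batch = changes[start:start + batch_size]
        pairs = [(doc_id, rev) for _, doc_id, rev in batch]
        missing = target.revs_diff(pairs)                # step 4
        for doc_id, rev in missing:                      # steps 5-6
            target.put(doc_id, rev, source.revisions[(doc_id, rev)])
            copied += 1
        # Step 7: advance the Checkpoint even if nothing was missing.
        target.local[checkpoint_id] = {"source_last_seq": batch[-1][0]}
    return copied


source, target = MemoryDB(), MemoryDB()
source.put("a", "1-x", {"n": 1})
source.put("b", "1-y", {"n": 2})
source.put("a", "2-z", {"n": 3})
first = replicate(source, target, "example")
second = replicate(source, target, "example")
```

Running `replicate` a second time with the same replication ID copies nothing,
because the stored Checkpoint makes the changes request start after the last
processed sequence.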