Posted to commits@couchdb.apache.org by Apache Wiki <wi...@apache.org> on 2011/11/27 18:54:29 UTC

[Couchdb Wiki] Update of "Replication" by FilipeManana

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The "Replication" page has been changed by FilipeManana:
http://wiki.apache.org/couchdb/Replication?action=diff&rev1=36&rev2=37

Comment:
Updated to mention features coming in the 1.2.0 release.

  
  http://docs.couchbase.org/couchdb-release-1.1/index.html#couchb-release-1.1-replicatordb  (verified by the author) 
  
+ 
+ == New features introduced in CouchDB 1.2.0 ==
+ 
+ The 1.2.0 release (not yet released at the time of this writing, November 27 2011) ships with a new replicator implementation. Besides offering better performance, more resilience and better logging/reporting, it adds new configuration parameters. Each parameter is described by a comment in the ''default.ini'' configuration file. The parameters are:
+ 
+ * '''worker_processes''' - The number of processes the replicator uses (per replication) to transfer documents from the source to the target database. Higher values can imply better throughput (due to more parallelism of network and disk IO) at the expense of more memory and possibly more CPU. Default value is 4.
+ 
+ * '''worker_batch_size''' - Workers process batches with the size defined by this parameter (the size corresponds to the number of ''_changes'' feed rows). Larger batch sizes can offer better performance, while lower values imply that checkpointing is done more frequently. Default value is 500.
+ 
+ * '''http_connections''' - The maximum number of HTTP connections per replication. For push replications, the effective number of HTTP connections used is min(worker_processes + 1, http_connections). For pull replications, the effective number of connections used corresponds to this parameter's value. Default value is 20.
+ 
+ * '''connection_timeout''' - The maximum period of inactivity for a connection in milliseconds. If a connection is idle for this period of time, its current request will be retried. Default value is 30000 milliseconds (30 seconds).
+ 
+ * '''retries_per_request''' - The maximum number of retries per request. Before each retry, the replicator waits for a short period of time. This period starts at 0.25 seconds (before the first retry is attempted), doubles with each consecutive retry attempt, and never goes beyond 5 minutes. Default value is 10.
+ 
+ * '''socket_options''' - A list of options to pass to the connection sockets. The available options can be found in the [[http://www.erlang.org/doc/man/inet.html#setopts-2|documentation for the Erlang function setopts/2 of the inet module]]. Default value is ''[{keepalive, true}, {nodelay, false}]''.
+ 
+ * '''verify_ssl_certificates''' - Whether the replicator should validate peer SSL certificates. Default value is ''false''.
+ 
+ * '''ssl_certificate_max_depth''' - The maximum allowed depth for peer SSL certificates. This option only has effect if the option '''verify_ssl_certificates''' is enabled. Default value is 3.
+ 
+ * '''cert_file''', '''key_file''', '''password''' - These options allow the replicator to authenticate to the other peer with an SSL certificate. The first is a path to a certificate in the PEM format, the second is a path to a file containing the PEM encoded private key, and the third is the password needed to access the key file if that file is password protected. By default these options are disabled.
+ 
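The rule given for '''http_connections''' can be written out as a small calculation. The following is only a sketch of the arithmetic stated in the list above; the function name and its `push` flag are hypothetical and are not part of CouchDB.

```python
def effective_http_connections(worker_processes=4, http_connections=20, push=True):
    """Return the number of HTTP connections a replication would use.

    Hypothetical helper: it only restates the rule from the parameter
    description. Defaults mirror the documented default values.
    """
    if push:
        # Push replication: min(worker_processes + 1, http_connections).
        return min(worker_processes + 1, http_connections)
    # Pull replication: the parameter's value is used as is.
    return http_connections


print(effective_http_connections())            # push, default settings
print(effective_http_connections(push=False))  # pull, default settings
```

With the default values, a push replication is limited by the number of workers, so raising '''http_connections''' alone would not change the number of connections it uses.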
+ 
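To get a feeling for how long a request may keep being retried, the waiting periods described for '''retries_per_request''' can be listed. This is a sketch of one reading of that description (0.25 seconds before the first retry, doubling afterwards, capped at 5 minutes); the function name is hypothetical and the replicator's actual implementation may differ in details.

```python
def retry_waits(retries_per_request=10, first_wait=0.25, max_wait=300.0):
    """List the assumed waiting period (in seconds) before each retry."""
    waits = []
    wait = first_wait
    for _ in range(retries_per_request):
        # The period never goes beyond max_wait (5 minutes).
        waits.append(min(wait, max_wait))
        # The period doubles between consecutive retry attempts.
        wait *= 2
    return waits


schedule = retry_waits()
print(schedule)
print("total seconds spent waiting:", sum(schedule))
```

Under these assumptions the cap only comes into play when '''retries_per_request''' is raised above its default, since ten doublings starting from 0.25 seconds stay below 5 minutes.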
+ All these options, except for the ones related to peer authentication with SSL certificates, can also be set per replication by simply including them in the replication object/document. Example:
+ 
+ {{{
+ POST /_replicate HTTP/1.1
+ 
+ {
+     "source": "example-database",
+     "target": "http://example.org/example-database",
+     "connection_timeout": 60000,
+     "retries_per_request": 20,
+     "http_connections": 30
+ }
+ }}}
+ 
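The same request can be prepared from a script. The sketch below uses only the Python standard library and builds the request without sending it. It assumes a CouchDB server at `http://localhost:5984` and assumes that "options related to peer authentication with SSL certificates" means '''cert_file''', '''key_file''' and '''password'''; the helper name and the `Content-Type` header are additions made for this illustration.

```python
import json
import urllib.request

# Assumption: these are the options that cannot be set per replication.
SERVER_ONLY_OPTIONS = {"cert_file", "key_file", "password"}


def build_replication_request(server_url, source, target, **options):
    """Build (but do not send) a POST /_replicate request.

    Hypothetical helper: extra keyword arguments are copied into the
    replication object, as in the example above.
    """
    rejected = SERVER_ONLY_OPTIONS.intersection(options)
    if rejected:
        raise ValueError("not settable per replication: %s" % sorted(rejected))
    body = {"source": source, "target": target}
    body.update(options)
    return urllib.request.Request(
        server_url.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_replication_request(
    "http://localhost:5984",  # assumed server address
    "example-database",
    "http://example.org/example-database",
    connection_timeout=60000,
    retries_per_request=20,
    http_connections=30,
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually start the replication: urllib.request.urlopen(request)
```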
+ When a replication is started, CouchDB logs the values of its parameters. Example:
+ 
+ {{{
+ [info] [<0.152.0>] Replication `"1447443f5d0837538c771c3af68518eb+create_target"` is using:
+ 	4 worker processes
+ 	a worker batch size of 500
+ 	30 HTTP connections
+ 	a connection timeout of 60000 milliseconds
+ 	20 retries per request
+ 	socket options are: [{keepalive,true},{nodelay,false}]
+ 	source start sequence 9243679
+ [info] [<0.128.0>] starting new replication `1447443f5d0837538c771c3af68518eb+create_target` at <0.152.0> (`my_database` -> `http://www.server.com:5984/my_database_copy/`)
+ }}}
+ 
+ As for monitoring progress, the active tasks API was enhanced to report additional information for replication tasks. Example:
+ 
+ {{{
+ $ curl http://localhost:5984/_active_tasks 
+ [ 
+     { 
+         "pid": "<0.1303.0>", 
+         "replication_id": "e42a443f5d08375c8c7a1c3af60518fb+create_target",
+         "checkpointed_source_seq": 17333, 
+         "continuous": false, 
+         "doc_write_failures": 0, 
+         "docs_read": 17833, 
+         "docs_written": 17833, 
+         "missing_revisions_found": 17833, 
+         "progress": 3, 
+         "revisions_checked": 17833, 
+         "source": "http://fdmanana.iriscouch.com/test_db/", 
+         "source_seq": 551202, 
+         "started_on": 1316229471, 
+         "target": "test_db", 
+         "type": "replication", 
+         "updated_on": 1316230082 
+     } 
+ ]
+ }}}
+
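The fields reported for a replication task can be combined into a short summary. The sketch below works on a copy of some of the values from the example above, so it runs without a server. The way it derives a percentage (checkpointed sequence over source sequence, rounded down) is an assumption that happens to agree with the `progress` value in the example; it also assumes integer sequence numbers, and the function name is hypothetical.

```python
# Subset of the fields from the example response above.
task = {
    "type": "replication",
    "checkpointed_source_seq": 17333,
    "source_seq": 551202,
    "docs_written": 17833,
    "doc_write_failures": 0,
    "started_on": 1316229471,
    "updated_on": 1316230082,
}


def summarize_replication(task):
    """Derive a few figures from one replication entry of _active_tasks."""
    elapsed = task["updated_on"] - task["started_on"]  # seconds
    # Assumed formula: share of the source sequence already checkpointed.
    percent = task["checkpointed_source_seq"] * 100 // task["source_seq"]
    rate = task["docs_written"] / elapsed if elapsed > 0 else 0.0
    return {
        "elapsed_seconds": elapsed,
        "percent_checkpointed": percent,
        "docs_per_second": rate,
        "write_failures": task["doc_write_failures"],
    }


print(summarize_replication(task))
```

With a live server, the list returned by `/_active_tasks` could be filtered on `"type": "replication"` and each entry passed to the same function.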