Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2020/08/17 22:10:29 UTC

[GitHub] [couchdb-documentation] rnewson commented on a change in pull request #581: [RFC] Replicator Implementation for CouchDB 4.x

rnewson commented on a change in pull request #581:
URL: https://github.com/apache/couchdb-documentation/pull/581#discussion_r471793702



##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or

Review comment:
       `created from` to `defined in`?

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.

Review comment:
       does this sentence add anything?

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a
+scheduler component would run up to `max_jobs` number of jobs at a time on each
+node. The new design proposes using `couch_jobs`, as described in the
+[Background Jobs
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/007-background-jobs.md),
+to have a central, FDB-based queue of replication jobs. `couch_jobs`
+application would manage job scheduling and coordination. The new design also
+proposes using heterogeneous node types as defined in the [Node Types
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md)
+such that replication jobs will be created only on `api_frontend` nodes and run
+only on `replication` nodes.
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC
+2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+`_replicator` databases : A database that is either named `_replicator` or ends
+with the `/_replicator` suffix.
+
+`transient` replications : Replication jobs created by `POST`-ing to the
+`/_replicate` endpoint.
+
+`persistent` replications : Replication jobs created from a document in a
+`_replicator` database.
+
+`continuous` replications : Jobs created with the `"continuous": true`
+parameter. When this job reaches the end of the changes feed it will continue
+waiting for new changes in a loop until the user removes the job.
+
+`normal` replications : Replication jobs which are not `continuous`. If the
+`"continuous":true` parameter is not specified, by default, replication jobs
+will be `normal`.
+
+`api_frontend node` : Database node which has the `api_frontend` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can be only be created on these nodes.
+
+`replication node` : Database node which has the `replication` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can only be run on these nodes.
+
+`filtered` replications: Replications with a user-defined filter on the source
+endpoint to filter its changes feed.
+
+`replication_id` : An ID defined by replication jobs which is a hash of the
+ source and target endpoint URLs, some of the options, and for filtered
+ replications, the contents of the filter from the source endpoint. Replication
+ IDs will change, for example, if the filter contents on the source endpoint
+ changes. Computing this value may require a network round-trip to the source
+ endpoint.
+
+`job_id` : A replication job ID derived from the database and document IDs for
+persistent replications, and from source, target endpoint, user name and some
+options for transient replications. Computing a `job_id`, unlike a
+`replication_id`, doesn't require making any network requests. A filtered
+replication with a given `job_id` during its lifetime may change its
+`replication_id` multiple times when filter contents changes on the source.
+
+`max_jobs` : Configuration parameter which specifies up to how many replication
+jobs to run on each `replication` node.
+
+`max_churn` : Configuration parameter which specifies a limit of how many new
+jobs to spawn during each rescheduling interval.
+
+`min_backoff_penalty` : Configuration parameter specifying the minimum (the
+base) penalty applied to jobs which crash repeatedly.
+
+`max_backoff_penalty` : Configuration parameter specifying the maximum penalty
+applied to jobs which crash repeatedly.
+
+---
+
+# Detailed Description
+
+Replication job creation and scheduling works roughly as follows:
+
+ 1) `Persistent` and `transient` jobs both start by creating or updating a
+ `couch_jobs` record in a separate replication key-space on `api_frontend`
+ nodes. Persistent jobs are driven by an EPI callback mechanism which notifies

Review comment:
       meaning couch_epi? I'm not sure why that would be involved in anything that doesn't require pluggability.
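
For context on the mechanism under discussion: the RFC appears to mean a document-update callback registered with the storage layer, in the couch_epi plugin style. A minimal Erlang sketch of such a callback follows; the module name, the callback name and arity, and the `couch_jobs:add/4` call are illustrative assumptions rather than the actual plugin API.

```
-module(couch_replicator_doc_plugin_sketch).
-export([after_doc_write/3]).

%% Invoked after a document in a `_replicator` db is written; creates or
%% updates the corresponding couch_jobs record, which starts out `pending`.
after_doc_write(DbName, DocId, DocBody) ->
    ok = couch_jobs:add(undefined, replication, job_id(DbName, DocId), #{
        <<"db_name">> => DbName,
        <<"doc_id">> => DocId,
        <<"rep_doc">> => DocBody
    }).

%% The job_id derives from the database and document IDs only, so no
%% network requests are needed to compute it.
job_id(DbName, DocId) ->
    couch_util:to_hex(crypto:hash(sha256, [DbName, <<"|">>, DocId])).
```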

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a
+scheduler component would run up to `max_jobs` number of jobs at a time on each
+node. The new design proposes using `couch_jobs`, as described in the
+[Background Jobs
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/007-background-jobs.md),
+to have a central, FDB-based queue of replication jobs. `couch_jobs`
+application would manage job scheduling and coordination. The new design also
+proposes using heterogeneous node types as defined in the [Node Types
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md)
+such that replication jobs will be created only on `api_frontend` nodes and run
+only on `replication` nodes.
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC
+2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+`_replicator` databases : A database that is either named `_replicator` or ends
+with the `/_replicator` suffix.
+
+`transient` replications : Replication jobs created by `POST`-ing to the
+`/_replicate` endpoint.
+
+`persistent` replications : Replication jobs created from a document in a
+`_replicator` database.
+
+`continuous` replications : Jobs created with the `"continuous": true`
+parameter. When this job reaches the end of the changes feed it will continue
+waiting for new changes in a loop until the user removes the job.
+
+`normal` replications : Replication jobs which are not `continuous`. If the
+`"continuous":true` parameter is not specified, by default, replication jobs
+will be `normal`.
+
+`api_frontend node` : Database node which has the `api_frontend` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can be only be created on these nodes.
+
+`replication node` : Database node which has the `replication` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can only be run on these nodes.
+
+`filtered` replications: Replications with a user-defined filter on the source
+endpoint to filter its changes feed.
+
+`replication_id` : An ID defined by replication jobs which is a hash of the
+ source and target endpoint URLs, some of the options, and for filtered
+ replications, the contents of the filter from the source endpoint. Replication
+ IDs will change, for example, if the filter contents on the source endpoint
+ changes. Computing this value may require a network round-trip to the source
+ endpoint.
+
+`job_id` : A replication job ID derived from the database and document IDs for
+persistent replications, and from source, target endpoint, user name and some
+options for transient replications. Computing a `job_id`, unlike a
+`replication_id`, doesn't require making any network requests. A filtered
+replication with a given `job_id` during its lifetime may change its
+`replication_id` multiple times when filter contents changes on the source.
+
+`max_jobs` : Configuration parameter which specifies up to how many replication
+jobs to run on each `replication` node.
+
+`max_churn` : Configuration parameter which specifies a limit of how many new
+jobs to spawn during each rescheduling interval.
+
+`min_backoff_penalty` : Configuration parameter specifying the minimum (the
+base) penalty applied to jobs which crash repeatedly.
+
+`max_backoff_penalty` : Configuration parameter specifying the maximum penalty
+applied to jobs which crash repeatedly.
+
+---
+
+# Detailed Description
+
+Replication job creation and scheduling works roughly as follows:
+
+ 1) `Persistent` and `transient` jobs both start by creating or updating a
+ `couch_jobs` record in a separate replication key-space on `api_frontend`
+ nodes. Persistent jobs are driven by an EPI callback mechanism which notifies
+ `couch_replicator` application when documents in `_replicator` DBs are
+ updated, or when `_replicator` DBs are created and deleted. Transient jobs are
+ created from the `_replicate` HTTP handler directly. Newly created jobs are in
+ a `pending` state.
+
+ 2) Each `replication` node spawns some acceptor processes which wait in
+ `couch_jobs:accept/2` call for jobs. It will accept only jobs which are
+ scheduled to run at a time less or equal to the current time.

Review comment:
       this is the first mention of wallclock time being associated with the persisted state of a replication job. Where was it defined? And what's it for?
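
For what it's worth, a sketch of the acceptor side makes the scheduled-time notion concrete. The `#{max_sched_time => Now}` option follows the scheduled-time concept in the Background Jobs RFC; the exact option name and return shapes are assumptions here.

```
-module(rep_acceptor_sketch).
-export([accept_loop/0]).

%% Accept only jobs whose scheduled run time is =< now; jobs that were
%% penalized with a backoff carry a future scheduled time and are skipped.
accept_loop() ->
    Now = erlang:system_time(second),
    case couch_jobs:accept(replication, #{max_sched_time => Now}) of
        {ok, Job, JobData} ->
            %% hand the accepted job to the monitoring gen_server
            %% (hypothetical API)
            rep_monitor_sketch:start_job(Job, JobData);
        {error, not_found} ->
            accept_loop()
    end.
```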

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a
+scheduler component would run up to `max_jobs` number of jobs at a time on each
+node. The new design proposes using `couch_jobs`, as described in the
+[Background Jobs
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/007-background-jobs.md),
+to have a central, FDB-based queue of replication jobs. `couch_jobs`
+application would manage job scheduling and coordination. The new design also

Review comment:
       `would` to `will`

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a
+scheduler component would run up to `max_jobs` number of jobs at a time on each
+node. The new design proposes using `couch_jobs`, as described in the
+[Background Jobs
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/007-background-jobs.md),
+to have a central, FDB-based queue of replication jobs. `couch_jobs`
+application would manage job scheduling and coordination. The new design also
+proposes using heterogeneous node types as defined in the [Node Types
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md)
+such that replication jobs will be created only on `api_frontend` nodes and run
+only on `replication` nodes.
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC
+2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+`_replicator` databases : A database that is either named `_replicator` or ends
+with the `/_replicator` suffix.
+
+`transient` replications : Replication jobs created by `POST`-ing to the
+`/_replicate` endpoint.
+
+`persistent` replications : Replication jobs created from a document in a
+`_replicator` database.
+
+`continuous` replications : Jobs created with the `"continuous": true`
+parameter. When this job reaches the end of the changes feed it will continue
+waiting for new changes in a loop until the user removes the job.

Review comment:
       the current replicator suspends and resumes jobs as it sees fit; we should not contradict that here. the continuous flag declares the user's _intention_ for the replication to happen continuously, and the replicator scheduler takes it from there.

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a

Review comment:
       missing comma after `3.x`.

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a
+scheduler component would run up to `max_jobs` number of jobs at a time on each
+node. The new design proposes using `couch_jobs`, as described in the
+[Background Jobs
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/007-background-jobs.md),
+to have a central, FDB-based queue of replication jobs. `couch_jobs`
+application would manage job scheduling and coordination. The new design also
+proposes using heterogeneous node types as defined in the [Node Types
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md)
+such that replication jobs will be created only on `api_frontend` nodes and run
+only on `replication` nodes.
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC
+2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+`_replicator` databases : A database that is either named `_replicator` or ends
+with the `/_replicator` suffix.
+
+`transient` replications : Replication jobs created by `POST`-ing to the
+`/_replicate` endpoint.
+
+`persistent` replications : Replication jobs created from a document in a
+`_replicator` database.
+
+`continuous` replications : Jobs created with the `"continuous": true`
+parameter. When this job reaches the end of the changes feed it will continue
+waiting for new changes in a loop until the user removes the job.
+
+`normal` replications : Replication jobs which are not `continuous`. If the
+`"continuous":true` parameter is not specified, by default, replication jobs
+will be `normal`.
+
+`api_frontend node` : Database node which has the `api_frontend` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can be only be created on these nodes.
+
+`replication node` : Database node which has the `replication` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can only be run on these nodes.
+
+`filtered` replications: Replications with a user-defined filter on the source
+endpoint to filter its changes feed.
+
+`replication_id` : An ID defined by replication jobs which is a hash of the
+ source and target endpoint URLs, some of the options, and for filtered
+ replications, the contents of the filter from the source endpoint. Replication
+ IDs will change, for example, if the filter contents on the source endpoint
+ changes. Computing this value may require a network round-trip to the source
+ endpoint.

Review comment:
       specify that the latter is to fetch the filter function?
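
A sketch of the computation being described, with the filter fetch (the network round-trip in question) made explicit. The hashed fields and helper names are illustrative, loosely following couch_replicator_ids from CouchDB <= 3.x.

```
-module(rep_id_sketch).
-export([replication_id/1]).

%% Hash the endpoints, selected options and, for filtered replications,
%% the filter code fetched from the source endpoint.
replication_id(#{source := Source, target := Target, options := Options} = Rep) ->
    Term = {Source, Target, relevant(Options), fetch_filter(Rep)},
    couch_util:to_hex(couch_hash:md5_hash(term_to_binary(Term))).

%% Which options participate in the hash is illustrative here.
relevant(Options) ->
    maps:with([continuous, create_target], Options).

%% Hypothetical HTTP fetch of the filter function body from the source;
%% this is what may change the replication_id over a job's lifetime.
fetch_filter(#{filter := FilterName, source := Source}) ->
    rep_httpc_sketch:get_filter(Source, FilterName);
fetch_filter(#{}) ->
    null.
```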

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a
+scheduler component would run up to `max_jobs` number of jobs at a time on each
+node. The new design proposes using `couch_jobs`, as described in the
+[Background Jobs
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/007-background-jobs.md),
+to have a central, FDB-based queue of replication jobs. `couch_jobs`
+application would manage job scheduling and coordination. The new design also
+proposes using heterogeneous node types as defined in the [Node Types
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md)
+such that replication jobs will be created only on `api_frontend` nodes and run
+only on `replication` nodes.
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC
+2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+`_replicator` databases : A database that is either named `_replicator` or ends
+with the `/_replicator` suffix.
+
+`transient` replications : Replication jobs created by `POST`-ing to the
+`/_replicate` endpoint.
+
+`persistent` replications : Replication jobs created from a document in a
+`_replicator` database.
+
+`continuous` replications : Jobs created with the `"continuous": true`
+parameter. When this job reaches the end of the changes feed it will continue
+waiting for new changes in a loop until the user removes the job.
+
+`normal` replications : Replication jobs which are not `continuous`. If the

Review comment:
       this seems like a new definition to me, and `normal` is a very subjective term. I've seen the term `one-shot replication` used for this case and suggest it is a better fit, given its established history.

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a
+scheduler component would run up to `max_jobs` number of jobs at a time on each
+node. The new design proposes using `couch_jobs`, as described in the
+[Background Jobs
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/007-background-jobs.md),
+to have a central, FDB-based queue of replication jobs. `couch_jobs`
+application would manage job scheduling and coordination. The new design also
+proposes using heterogeneous node types as defined in the [Node Types
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md)
+such that replication jobs will be created only on `api_frontend` nodes and run
+only on `replication` nodes.
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC
+2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+`_replicator` databases : A database that is either named `_replicator` or ends
+with the `/_replicator` suffix.
+
+`transient` replications : Replication jobs created by `POST`-ing to the
+`/_replicate` endpoint.
+
+`persistent` replications : Replication jobs created from a document in a
+`_replicator` database.
+
+`continuous` replications : Jobs created with the `"continuous": true`
+parameter. When this job reaches the end of the changes feed it will continue
+waiting for new changes in a loop until the user removes the job.
+
+`normal` replications : Replication jobs which are not `continuous`. If the
+`"continuous":true` parameter is not specified, by default, replication jobs
+will be `normal`.
+
+`api_frontend node` : Database node which has the `api_frontend` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can be only be created on these nodes.

Review comment:
       `on` to `through`? the jobs aren't created on those nodes at all; the api_frontend node is simply processing the http requests. In the case of the _replicate endpoint, that handler is presumably directing a replication node to start the replication. In the case of the _replicator endpoint, the handler is just writing a document, which the replication scheduler happens to react to after the fact.
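
To illustrate the distinction the comment draws, a sketch of the transient path: the `_replicate` handler running on an `api_frontend` node only writes a `couch_jobs` record, and a `replication` node accepts and runs it later. All names below are hypothetical.

```
-module(rep_replicate_handler_sketch).
-export([handle_replicate_req/1]).

%% The frontend handler merely enqueues; it does not run the replication.
handle_replicate_req(#{} = RepParams) ->
    JobId = transient_job_id(RepParams),
    ok = couch_jobs:add(undefined, replication, JobId, RepParams),
    {accepted, JobId}.

%% Per the RFC's terminology, transient job IDs derive from source,
%% target, user name and some options; no network requests involved.
transient_job_id(#{source := S, target := T, user := U}) ->
    couch_util:to_hex(crypto:hash(sha256, term_to_binary({S, T, U}))).
```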

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a
+scheduler component would run up to `max_jobs` number of jobs at a time on each
+node. The new design proposes using `couch_jobs`, as described in the
+[Background Jobs
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/007-background-jobs.md),
+to have a central, FDB-based queue of replication jobs. `couch_jobs`
+application would manage job scheduling and coordination. The new design also
+proposes using heterogeneous node types as defined in the [Node Types
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md)
+such that replication jobs will be created only on `api_frontend` nodes and run
+only on `replication` nodes.
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC
+2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+`_replicator` databases : A database that is either named `_replicator` or ends
+with the `/_replicator` suffix.
+
+`transient` replications : Replication jobs created by `POST`-ing to the
+`/_replicate` endpoint.
+
+`persistent` replications : Replication jobs created from a document in a
+`_replicator` database.
+
+`continuous` replications : Jobs created with the `"continuous": true`
+parameter. When this job reaches the end of the changes feed it will continue
+waiting for new changes in a loop until the user removes the job.
+
+`normal` replications : Replication jobs which are not `continuous`. If the
+`"continuous":true` parameter is not specified, by default, replication jobs
+will be `normal`.
+
+`api_frontend node` : Database node which has the `api_frontend` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can be only be created on these nodes.
+
+`replication node` : Database node which has the `replication` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can only be run on these nodes.
+
+`filtered` replications: Replications with a user-defined filter on the source
+endpoint to filter its changes feed.
+
+`replication_id` : An ID defined by replication jobs which is a hash of the
+ source and target endpoint URLs, some of the options, and for filtered
+ replications, the contents of the filter from the source endpoint. Replication
+ IDs will change, for example, if the filter contents on the source endpoint
+ changes. Computing this value may require a network round-trip to the source
+ endpoint.
+
+`job_id` : A replication job ID derived from the database and document IDs for
+persistent replications, and from source, target endpoint, user name and some
+options for transient replications. Computing a `job_id`, unlike a
+`replication_id`, doesn't require making any network requests. A filtered
+replication with a given `job_id` during its lifetime may change its
+`replication_id` multiple times when filter contents changes on the source.
+
+`max_jobs` : Configuration parameter which specifies up to how many replication
+jobs to run on each `replication` node.
+
+`max_churn` : Configuration parameter which specifies a limit of how many new
+jobs to spawn during each rescheduling interval.
+
+`min_backoff_penalty` : Configuration parameter specifying the minimum (the
+base) penalty applied to jobs which crash repeatedly.
+
+`max_backoff_penalty` : Configuration parameter specifying the maximum penalty
+applied to jobs which crash repeatedly.
+
+---
+
+# Detailed Description
+
+Replication job creation and scheduling works roughly as follows:
+
+ 1) `Persistent` and `transient` jobs both start by creating or updating a
+ `couch_jobs` record in a separate replication key-space on `api_frontend`
+ nodes. Persistent jobs are driven by an EPI callback mechanism which notifies
+ `couch_replicator` application when documents in `_replicator` DBs are
+ updated, or when `_replicator` DBs are created and deleted. Transient jobs are
+ created from the `_replicate` HTTP handler directly. Newly created jobs are in
+ a `pending` state.
+
+ 2) Each `replication` node spawns some acceptor processes which wait in
+ `couch_jobs:accept/2` call for jobs. It will accept only jobs which are
+ scheduled to run at a time less or equal to the current time.
+
+ 3) After a job is accepted, its state is updated as `running`, and then, a
+ gen_server process monitoring these replication jobs will spawn another
+ acceptor. That happens until the `max_jobs` limit is reached.
+
+ 4) The same monitoring gen_server will periodically check if there are any
+ pending jobs in the queue, and if there are, spawn up to some `max_churn`
+ number of new acceptors. These acceptors may start new jobs, and if they do,

Review comment:
       `, and if they do,` to `and, if they do,`

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a
+scheduler component would run up to `max_jobs` number of jobs at a time on each
+node. The new design proposes using `couch_jobs`, as described in the
+[Background Jobs
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/007-background-jobs.md),
+to have a central, FDB-based queue of replication jobs. `couch_jobs`
+application would manage job scheduling and coordination. The new design also
+proposes using heterogeneous node types as defined in the [Node Types
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md)
+such that replication jobs will be created only on `api_frontend` nodes and run
+only on `replication` nodes.
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC
+2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+`_replicator` databases : A database that is either named `_replicator` or ends
+with the `/_replicator` suffix.
+
+`transient` replications : Replication jobs created by `POST`-ing to the
+`/_replicate` endpoint.
+
+`persistent` replications : Replication jobs created from a document in a
+`_replicator` database.
+
+`continuous` replications : Jobs created with the `"continuous": true`
+parameter. When this job reaches the end of the changes feed it will continue
+waiting for new changes in a loop until the user removes the job.
+
+`normal` replications : Replication jobs which are not `continuous`. If the
+`"continuous":true` parameter is not specified, by default, replication jobs
+will be `normal`.
+
+`api_frontend node` : Database node which has the `api_frontend` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can be only be created on these nodes.
+
+`replication node` : Database node which has the `replication` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can only be run on these nodes.
+
+`filtered` replications: Replications with a user-defined filter on the source
+endpoint to filter its changes feed.
+
+`replication_id` : An ID defined by replication jobs which is a hash of the
+ source and target endpoint URLs, some of the options, and for filtered
+ replications, the contents of the filter from the source endpoint. Replication
+ IDs will change, for example, if the filter contents on the source endpoint
+ changes. Computing this value may require a network round-trip to the source
+ endpoint.
+
+`job_id` : A replication job ID derived from the database and document IDs for
+persistent replications, and from source, target endpoint, user name and some
+options for transient replications. Computing a `job_id`, unlike a
+`replication_id`, doesn't require making any network requests. A filtered
+replication with a given `job_id` during its lifetime may change its
+`replication_id` multiple times when filter contents changes on the source.
+
+`max_jobs` : Configuration parameter which specifies up to how many replication
+jobs to run on each `replication` node.
+
+`max_churn` : Configuration parameter which specifies a limit of how many new
+jobs to spawn during each rescheduling interval.
+
+`min_backoff_penalty` : Configuration parameter specifying the minimum (the
+base) penalty applied to jobs which crash repeatedly.
+
+`max_backoff_penalty` : Configuration parameter specifying the maximum penalty
+applied to jobs which crash repeatedly.
+
+---
+
+# Detailed Description
+
+Replication job creation and scheduling works roughly as follows:
+
+ 1) `Persistent` and `transient` jobs both start by creating or updating a
+ `couch_jobs` record in a separate replication key-space on `api_frontend`
+ nodes. Persistent jobs are driven by an EPI callback mechanism which notifies
+ `couch_replicator` application when documents in `_replicator` DBs are
+ updated, or when `_replicator` DBs are created and deleted. Transient jobs are
+ created from the `_replicate` HTTP handler directly. Newly created jobs are in
+ a `pending` state.
+
+ 2) Each `replication` node spawns some acceptor processes which wait in
+ `couch_jobs:accept/2` call for jobs. It will accept only jobs which are
+ scheduled to run at a time less or equal to the current time.
+
+ 3) After a job is accepted, its state is updated as `running`, and then, a
+ gen_server process monitoring these replication jobs will spawn another
+ acceptor. That happens until the `max_jobs` limit is reached.
+
+ 4) The same monitoring gen_server will periodically check if there are any
+ pending jobs in the queue, and if there are, spawn up to some `max_churn`

Review comment:
       `, and if there are,` to `and, if there are,`

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a
+scheduler component would run up to `max_jobs` number of jobs at a time on each
+node. The new design proposes using `couch_jobs`, as described in the
+[Background Jobs
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/007-background-jobs.md),
+to have a central, FDB-based queue of replication jobs. `couch_jobs`
+application would manage job scheduling and coordination. The new design also
+proposes using heterogeneous node types as defined in the [Node Types
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md)
+such that replication jobs will be created only on `api_frontend` nodes and run
+only on `replication` nodes.
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC
+2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+`_replicator` databases : A database that is either named `_replicator` or ends
+with the `/_replicator` suffix.
+
+`transient` replications : Replication jobs created by `POST`-ing to the
+`/_replicate` endpoint.
+
+`persistent` replications : Replication jobs created from a document in a
+`_replicator` database.
+
+`continuous` replications : Jobs created with the `"continuous": true`
+parameter. When this job reaches the end of the changes feed it will continue
+waiting for new changes in a loop until the user removes the job.
+
+`normal` replications : Replication jobs which are not `continuous`. If the
+`"continuous":true` parameter is not specified, by default, replication jobs
+will be `normal`.
+
+`api_frontend node` : Database node which has the `api_frontend` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can be only be created on these nodes.
+
+`replication node` : Database node which has the `replication` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can only be run on these nodes.
+
+`filtered` replications: Replications with a user-defined filter on the source
+endpoint to filter its changes feed.
+
+`replication_id` : An ID defined by replication jobs which is a hash of the
+ source and target endpoint URLs, some of the options, and for filtered
+ replications, the contents of the filter from the source endpoint. Replication
+ IDs will change, for example, if the filter contents on the source endpoint
+ changes. Computing this value may require a network round-trip to the source
+ endpoint.
+
+`job_id` : A replication job ID derived from the database and document IDs for
+persistent replications, and from source, target endpoint, user name and some
+options for transient replications. Computing a `job_id`, unlike a
+`replication_id`, doesn't require making any network requests. A filtered
+replication with a given `job_id` during its lifetime may change its
+`replication_id` multiple times when filter contents changes on the source.
+
+`max_jobs` : Configuration parameter which specifies up to how many replication
+jobs to run on each `replication` node.
+
+`max_churn` : Configuration parameter which specifies a limit of how many new
+jobs to spawn during each rescheduling interval.
+
+`min_backoff_penalty` : Configuration parameter specifying the minimum (the
+base) penalty applied to jobs which crash repeatedly.
+
+`max_backoff_penalty` : Configuration parameter specifying the maximum penalty
+applied to jobs which crash repeatedly.
+
+---
+
+# Detailed Description
+
+Replication job creation and scheduling works roughly as follows:
+
+ 1) `Persistent` and `transient` jobs both start by creating or updating a
+ `couch_jobs` record in a separate replication key-space on `api_frontend`
+ nodes. Persistent jobs are driven by an EPI callback mechanism which notifies
+ `couch_replicator` application when documents in `_replicator` DBs are
+ updated, or when `_replicator` DBs are created and deleted. Transient jobs are
+ created from the `_replicate` HTTP handler directly. Newly created jobs are in
+ a `pending` state.
+
+ 2) Each `replication` node spawns some acceptor processes which wait in
+ `couch_jobs:accept/2` call for jobs. It will accept only jobs which are
+ scheduled to run at a time less or equal to the current time.
+
+ 3) After a job is accepted, its state is updated as `running`, and then, a

Review comment:
       `as` to `to`.
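
Steps 2 and 3 in the quoted text describe the monitor replacing each acceptor that turns into a running job, up to `max_jobs`. A compact sketch of that accounting, assuming a map-based gen_server state and the RFC's config names:

```
-module(rep_monitor_sketch).
-export([maybe_spawn_acceptor/1]).

%% Called whenever an acceptor reports that it accepted a job and became
%% a running worker; keep spawning acceptors until running jobs plus
%% waiting acceptors reach max_jobs.
maybe_spawn_acceptor(#{running := Running, acceptors := Acceptors} = St) ->
    MaxJobs = config:get_integer("replicator", "max_jobs", 500),
    case Running + Acceptors < MaxJobs of
        true ->
            spawn_link(fun rep_acceptor_sketch:accept_loop/0),
            St#{acceptors := Acceptors + 1};
        false ->
            St
    end.
```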

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a
+scheduler component would run up to `max_jobs` number of jobs at a time on each
+node. The new design proposes using `couch_jobs`, as described in the
+[Background Jobs
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/007-background-jobs.md),
+to have a central, FDB-based queue of replication jobs. `couch_jobs`
+application would manage job scheduling and coordination. The new design also
+proposes using heterogeneous node types as defined in the [Node Types
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md)
+such that replication jobs will be created only on `api_frontend` nodes and run
+only on `replication` nodes.
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC
+2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+`_replicator` databases : A database that is either named `_replicator` or ends
+with the `/_replicator` suffix.
+
+`transient` replications : Replication jobs created by `POST`-ing to the
+`/_replicate` endpoint.
+
+`persistent` replications : Replication jobs created from a document in a
+`_replicator` database.
+
+`continuous` replications : Jobs created with the `"continuous": true`
+parameter. When this job reaches the end of the changes feed it will continue
+waiting for new changes in a loop until the user removes the job.
+
+`normal` replications : Replication jobs which are not `continuous`. If the
+`"continuous":true` parameter is not specified, by default, replication jobs
+will be `normal`.
+
+`api_frontend node` : Database node which has the `api_frontend` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can be only be created on these nodes.
+
+`replication node` : Database node which has the `replication` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can only be run on these nodes.
+
+`filtered` replications: Replications with a user-defined filter on the source
+endpoint to filter its changes feed.
+
+`replication_id` : An ID defined by replication jobs which is a hash of the
+ source and target endpoint URLs, some of the options, and for filtered
+ replications, the contents of the filter from the source endpoint. Replication
+ IDs will change, for example, if the filter contents on the source endpoint
+ changes. Computing this value may require a network round-trip to the source
+ endpoint.
+
+`job_id` : A replication job ID derived from the database and document IDs for
+persistent replications, and from source, target endpoint, user name and some
+options for transient replications. Computing a `job_id`, unlike a
+`replication_id`, doesn't require making any network requests. A filtered
+replication with a given `job_id` during its lifetime may change its
+`replication_id` multiple times when filter contents changes on the source.
+
+`max_jobs` : Configuration parameter which specifies up to how many replication
+jobs to run on each `replication` node.
+
+`max_churn` : Configuration parameter which specifies a limit of how many new
+jobs to spawn during each rescheduling interval.
+
+`min_backoff_penalty` : Configuration parameter specifying the minimum (the
+base) penalty applied to jobs which crash repeatedly.
+
+`max_backoff_penalty` : Configuration parameter specifying the maximum penalty
+applied to jobs which crash repeatedly.
+
+---
+
+# Detailed Description
+
+Replication job creation and scheduling works roughly as follows:
+
+ 1) `Persistent` and `transient` jobs both start by creating or updating a
+ `couch_jobs` record in a separate replication key-space on `api_frontend`
+ nodes. Persistent jobs are driven by an EPI callback mechanism which notifies
+ `couch_replicator` application when documents in `_replicator` DBs are
+ updated, or when `_replicator` DBs are created and deleted. Transient jobs are
+ created from the `_replicate` HTTP handler directly. Newly created jobs are in
+ a `pending` state.
+
+ 2) Each `replication` node spawns some acceptor processes which wait in
+ `couch_jobs:accept/2` call for jobs. It will accept only jobs which are
+ scheduled to run at a time less or equal to the current time.
+
+ 3) After a job is accepted, its state is updated as `running`, and then, a
+ gen_server process monitoring these replication jobs will spawn another
+ acceptor. That happens until the `max_jobs` limit is reached.
+
+ 4) The same monitoring gen_server will periodically check if there are any
+ pending jobs in the queue, and if there are, spawn up to some `max_churn`
+ number of new acceptors. These acceptors may start new jobs, and if they do,
+ for each one of them, the oldest running job will be stopped and re-enqueued
+ as `pending`. This in large follows the logic from the replication scheduler
+ in CouchDB <= 3.x except that is uses `couch_jobs` as the central queuing and
+ scheduling mechanism.
+
+ 5) After the job is marked as `running`, it computes its `replication_id`,
+ initializes an internal replication state record from job's data object, and
+ starts replicating. Underneath this level the logic is identical to what's
+ already happening in CouchDB <= 3.x and so it is not described further in this
+ document.
+
+ 6) As jobs run, they periodically checkpoint, and when they do that, they also
+ recompute their `replication_id`. In the case of filtered replications the
+ `replication_id` may change, and if so, that job is stopped and re-enqueued as
+ `pending`. Also, during checkpointing the job's data value is updated with
+ stats such that the job stays active and doesn't get re-enqueued by the
+ `couch_jobs` activity monitor.
+
+ 7) If the job crashes, it will reschedule itself in `gen_server:terminate/2`
+ via `couch_jobs:resubmit/3` call to run again at some future time, defined
+ roughly as `now + max(min_backoff_penalty * 2^consecutive_errors,
+ max_backoff_penalty)`. If a job starts and successfully runs for some
+ predefined period of time without crashing, it is considered to be `"healed"`
+ and its `consecutive_errors` count is reset to 0.
+
+ 8) If the node where replication job runs crashes, or the job is manually
+ killed via `exit(Pid, kill)`, `couch_jobs` activity monitor will automatically
+ re-enqueue the job as `pending`.
+
+## Replicator Job States
+
+### Description
+
+The set of replication job states is defined as:
+
+ * `pending` : A job is marked as `pending` in these cases:
+    - As soon as a job is created from an `api_frontend` node
+    - When it stopped to let other replication jobs run
+    - When a filtered replication's `replication_id` changes
+
+ * `running` : Set when a job is accepted by the `couch_jobs:accept/2`
+   call. This generally means the job is actually running on a node,
+   however, in cases when a node crashes, the job may show as
+   `running` on that node until `couch_jobs` activity monitor
+   re-enqueues the job, and it starts running on another node.
+
+ * `crashing` : The job was running, but then crashed with an intermittent
+   error. Job's data has an error count which is incremented, and then a
+   backoff penalty is computed and the job is rescheduled to try again at some
+   point in the future.
+
+ * `completed` : Normal replications which have completed
+
+ * `failed` : This can happen when:
+    - A replication job could not be parsed from a replication document. For
+      example, if the user has not specified a `"source"` field.
+    - A transient replication job crashes. Transient jobs don't get rescheduled
+      to run again after they crash.
+    - There already is another persistent replication job running or pending
+      with the same `replication_id`.
+
+### State Differences From CouchDB <= 3.x
+
+The set of states is slightly different than the ones from before. There are
+now fewer states as some of them have been combined together:
+
+ * `initializing` was combined with `pending`
+
+ * `error` was combined with `crashing`
+
+### Mapping Between couch_jobs States and Replication States
+
+`couch_jobs` application has its own set of state definitions and they map to
+replicator states like so:
+
+ | Replicator States| `couch_jobs` States
+ | ---              | :--
+ | pending          | pending
+ | running          | running
+ | crashing         | pending
+ | completed        | finished
+ | failed           | finished
+
+### State Transition Diagram
+
+Jobs start in the `pending` state, after either a `_replicator` db doc
+update, or a POST to the `/_replicate` endpoint. Continuous jobs, will
+normally toggle between `pending` and `running` states. Normal jobs
+may toggle between `pending` and running a few times and then end up
+in `completed`.
+
+```
+_replicator doc       +-------+
+POST /_replicate ---->+pending|
+                      +-------+
+                          ^
+                          |
+                          |
+                          v
+                      +---+---+      +--------+
+            +---------+running+<---->|crashing|
+            |         +---+---+      +--------+
+            |             ^
+            |             |
+            v             v
+        +------+     +---------+
+        |failed|     |completed|
+        +------+     +---------+
+```
+
+
+## Replication ID Collisions
+
+Multiple replication jobs may specify replications which map to the same
+`replication_id`. To handle these collisions there is an FDB subspace `(...,
+LayerPrefix, ?REPLICATION_IDS, replication_id) -> job_id` to keep track of
+them. After the `replication_id` is computed, each replication job checks if
+there is already another job pending or running with the same `replication_id`.
+If the other job is transient, then the current job will reschedule itself as
+`crashing`. If the other job is persistent, the current job will fail
+permanently as `failed`.
+
+## Replication Parameter Validation
+
+`_replicator` documents in CouchDB <= 3.x were parsed and validated in a
+two-step process:
+
+  1) In a validate-doc-update (VDU) javascript function from a programmatically
+  inserted _design document. This validation happened when the document was
+  updated, and performed some rough checks on field names and value types. If
+  this validation failed, the document update operation was rejected.
+
+  2) Inside replicator's Erlang code when it was translated to an internal
+ record used by the replication application. This validation was more thorough
+ but didn't have very friendly error messages. If validation failed here, the
+ job would be marked as `failed`.
+
+For CouchDB 4.x the proposal is to use only the Erlang parser. It would be
+called from the `before_doc_update` callback. This is a callback which runs
+before every document update. If validation fails there it would reject the
+document update operation. This should reduce code duplication and also provide
+better feedback to the users directly when they update the `_replicator`
+documents.
+
+## Transient Job Behavior
+
+In CouchDB <= 3.x transient replication jobs ran in memory on a particular node
+in the cluster. If the node where the replication job ran crashes, the job
+would simply disappear without a trace. It was up to the user to periodically
+monitor the job status and re-create the job. In the current design,
+`transient` jobs are persisted to FDB as `couch_jobs` records, and so would
+survive dbcore node restarts. Also after transient jobs complete or failed,

Review comment:
       what's a "dbcore node"?
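
A side note on step 7 as quoted: taking `now + max(min_backoff_penalty * 2^consecutive_errors, max_backoff_penalty)` literally would never produce a penalty below the maximum, so presumably `max_backoff_penalty` is meant as a cap. The sketch below applies it with `min/2`; units (seconds) and defaults are assumptions.

```
-module(rep_backoff_sketch).
-export([reschedule_time/1]).

%% Exponential backoff for a job with N consecutive errors, capped at
%% max_backoff_penalty; returns an absolute time suitable for passing
%% to couch_jobs:resubmit/3 from gen_server:terminate/2.
reschedule_time(ConsecutiveErrors) ->
    Min = config:get_integer("replicator", "min_backoff_penalty", 32),
    Max = config:get_integer("replicator", "max_backoff_penalty", 86400),
    Penalty = min(Min * (1 bsl ConsecutiveErrors), Max),
    erlang:system_time(second) + Penalty.
```

A crashing job would then re-enqueue itself with something like `couch_jobs:resubmit(undefined, Job, rep_backoff_sketch:reschedule_time(N))`.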

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a
+scheduler component would run up to `max_jobs` number of jobs at a time on each
+node. The new design proposes using `couch_jobs`, as described in the
+[Background Jobs
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/007-background-jobs.md),
+to have a central, FDB-based queue of replication jobs. `couch_jobs`
+application would manage job scheduling and coordination. The new design also
+proposes using heterogeneous node types as defined in the [Node Types
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md)
+such that replication jobs will be created only on `api_frontend` nodes and run
+only on `replication` nodes.
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC
+2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+`_replicator` databases : A database that is either named `_replicator` or ends
+with the `/_replicator` suffix.
+
+`transient` replications : Replication jobs created by `POST`-ing to the
+`/_replicate` endpoint.
+
+`persistent` replications : Replication jobs created from a document in a
+`_replicator` database.
+
+`continuous` replications : Jobs created with the `"continuous": true`
+parameter. When such a job reaches the end of the changes feed, it continues
+waiting for new changes in a loop until the user removes the job.
+
+`normal` replications : Replication jobs which are not `continuous`. If the
+`"continuous":true` parameter is not specified, by default, replication jobs
+will be `normal`.
+
+`api_frontend node` : Database node which has the `api_frontend` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can only be created on these nodes.
+
+`replication node` : Database node which has the `replication` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can only be run on these nodes.
+
+`filtered` replications: Replications with a user-defined filter on the source
+endpoint to filter its changes feed.
+
+`replication_id` : An ID defined by replication jobs which is a hash of the
+ source and target endpoint URLs, some of the options, and for filtered
+ replications, the contents of the filter from the source endpoint. Replication
+ IDs will change, for example, if the filter contents on the source endpoint
+ change. Computing this value may require a network round-trip to the source
+ endpoint.

Review comment:
       the definition of what's included in the hash is "anything that _could_ affect the result of the replication" btw.

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a
+scheduler component would run up to `max_jobs` number of jobs at a time on each
+node. The new design proposes using `couch_jobs`, as described in the
+[Background Jobs
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/007-background-jobs.md),
+to have a central, FDB-based queue of replication jobs. `couch_jobs`
+application would manage job scheduling and coordination. The new design also
+proposes using heterogeneous node types as defined in the [Node Types
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md)
+such that replication jobs will be created only on `api_frontend` nodes and run
+only on `replication` nodes.
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC
+2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+`_replicator` databases : A database that is either named `_replicator` or ends
+with the `/_replicator` suffix.
+
+`transient` replications : Replication jobs created by `POST`-ing to the
+`/_replicate` endpoint.
+
+`persistent` replications : Replication jobs created from a document in a
+`_replicator` database.
+
+`continuous` replications : Jobs created with the `"continuous": true`
+parameter. When such a job reaches the end of the changes feed, it continues
+waiting for new changes in a loop until the user removes the job.
+
+`normal` replications : Replication jobs which are not `continuous`. If the
+`"continuous":true` parameter is not specified, by default, replication jobs
+will be `normal`.
+
+`api_frontend node` : Database node which has the `api_frontend` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can only be created on these nodes.
+
+`replication node` : Database node which has the `replication` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can only be run on these nodes.
+
+`filtered` replications: Replications with a user-defined filter on the source
+endpoint to filter its changes feed.
+
+`replication_id` : An ID defined by replication jobs which is a hash of the
+ source and target endpoint URLs, some of the options, and for filtered
+ replications, the contents of the filter from the source endpoint. Replication
+ IDs will change, for example, if the filter contents on the source endpoint
+ change. Computing this value may require a network round-trip to the source
+ endpoint.
+
+`job_id` : A replication job ID derived from the database and document IDs for
+persistent replications, and from the source and target endpoints, the user
+name, and some options for transient replications. Computing a `job_id`,
+unlike a `replication_id`, doesn't require making any network requests. A
+filtered replication with a given `job_id` may change its `replication_id`
+multiple times during its lifetime as filter contents change on the source
+(see the sketch at the end of this section).
+
+`max_jobs` : Configuration parameter which specifies up to how many replication
+jobs to run on each `replication` node.
+
+`max_churn` : Configuration parameter which limits how many new jobs are
+spawned during each rescheduling interval.
+
+`min_backoff_penalty` : Configuration parameter specifying the minimum (the
+base) penalty applied to jobs which crash repeatedly.
+
+`max_backoff_penalty` : Configuration parameter specifying the maximum penalty
+applied to jobs which crash repeatedly.
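+
+To make the `replication_id` / `job_id` distinction above concrete, here is a
+sketch of how the two could be derived. The field selection, the
+`fetch_filter_code/2` helper, and `relevant_options/0` are illustrative
+assumptions; the actual rule is to hash anything that could affect the
+result of the replication:
+
+```erlang
+%% Sketch: replication_id may need a network round-trip to fetch the
+%% filter code; job_id is computed from local values only.
+replication_id(Source, Target, Options) ->
+    FilterCode = fetch_filter_code(Source, Options),
+    Term = {Source, Target, maps:with(relevant_options(), Options), FilterCode},
+    couch_util:to_hex(couch_hash:md5_hash(term_to_binary(Term))).
+
+job_id({doc, DbName, DocId}) ->
+    %% Persistent: derived from database and document IDs.
+    couch_util:to_hex(couch_hash:md5_hash(term_to_binary({DbName, DocId})));
+job_id({transient, Source, Target, UserName, Options}) ->
+    %% Transient: derived from endpoints, user name and some options.
+    Term = {Source, Target, UserName, maps:with(relevant_options(), Options)},
+    couch_util:to_hex(couch_hash:md5_hash(term_to_binary(Term))).
+```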
+
+---
+
+# Detailed Description
+
+Replication job creation and scheduling works roughly as follows:
+
+ 1) `Persistent` and `transient` jobs both start by creating or updating a
+ `couch_jobs` record in a separate replication key-space on `api_frontend`
+ nodes. Persistent jobs are driven by an EPI callback mechanism which notifies
+ the `couch_replicator` application when documents in `_replicator` DBs are
+ updated, or when `_replicator` DBs are created and deleted. Transient jobs are
+ created from the `_replicate` HTTP handler directly. Newly created jobs are in
+ a `pending` state.
+
+ 2) Each `replication` node spawns a number of acceptor processes which wait
+ in a `couch_jobs:accept/2` call for jobs. They accept only jobs which are
+ scheduled to run at a time less than or equal to the current time.
+
+ 3) After a job is accepted, its state is updated to `running`, and then a
+ gen_server process monitoring these replication jobs spawns another
+ acceptor. That happens until the `max_jobs` limit is reached.
+
+ 4) The same monitoring gen_server will periodically check if there are any
+ pending jobs in the queue, and if there are, spawn up to some `max_churn`
+ number of new acceptors. These acceptors may start new jobs, and if they do,
+ for each one of them, the oldest running job will be stopped and re-enqueued
+ as `pending`. This largely follows the logic of the replication scheduler
+ in CouchDB <= 3.x, except that it uses `couch_jobs` as the central queuing
+ and scheduling mechanism.
+
+ 5) After the job is marked as `running`, it computes its `replication_id`,
+ initializes an internal replication state record from the job's data object, and
+ starts replicating. Underneath this level the logic is identical to what's
+ already happening in CouchDB <= 3.x and so it is not described further in this
+ document.
+
+ 6) As jobs run, they periodically checkpoint, and when they do that, they also
+ recompute their `replication_id`. In the case of filtered replications the
+ `replication_id` may change, and if so, that job is stopped and re-enqueued as
+ `pending`. Also, during checkpointing the job's data value is updated with
+ stats such that the job stays active and doesn't get re-enqueued by the
+ `couch_jobs` activity monitor.
+
+ 7) If the job crashes, it will reschedule itself in `gen_server:terminate/2`

Review comment:
       gen_server:terminate is only for cleanup and only called in rare circumstances, this doesn't seem a good way to manage rescheduling.

##########
File path: rfcs/016-fdb-replicator.md
##########
@@ -0,0 +1,386 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Replicator Implementation On FDB'
+labels: rfc, discussion
+assignees: 'vatamane@apache.org'
+
+---
+
+# Introduction
+
+This document describes the design of the replicator application for CouchDB
+4.x. The replicator will rely on `couch_jobs` for centralized scheduling and
+monitoring of replication jobs.
+
+## Abstract
+
+CouchDB replicator is the CouchDB application which runs replication jobs.
+Replication jobs can be created from documents in `_replicator` databases, or
+by `POST`-ing requests to the HTTP `/_replicate` endpoint. Previously, in
+CouchDB <= 3.x replication jobs were mapped to individual cluster nodes and a
+scheduler component would run up to `max_jobs` number of jobs at a time on each
+node. The new design proposes using `couch_jobs`, as described in the
+[Background Jobs
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/007-background-jobs.md),
+to have a central, FDB-based queue of replication jobs. `couch_jobs`
+application would manage job scheduling and coordination. The new design also
+proposes using heterogeneous node types as defined in the [Node Types
+RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md)
+such that replication jobs will be created only on `api_frontend` nodes and run
+only on `replication` nodes.
+
+## Requirements Language
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC
+2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+`_replicator` databases : A database that is either named `_replicator` or ends
+with the `/_replicator` suffix.
+
+`transient` replications : Replication jobs created by `POST`-ing to the
+`/_replicate` endpoint.
+
+`persistent` replications : Replication jobs created from a document in a
+`_replicator` database.
+
+`continuous` replications : Jobs created with the `"continuous": true`
+parameter. When such a job reaches the end of the changes feed, it continues
+waiting for new changes in a loop until the user removes the job.
+
+`normal` replications : Replication jobs which are not `continuous`. If the
+`"continuous":true` parameter is not specified, by default, replication jobs
+will be `normal`.
+
+`api_frontend node` : Database node which has the `api_frontend` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can only be created on these nodes.
+
+`replication node` : Database node which has the `replication` type set to
+`true` as described in
+[RFC](https://github.com/apache/couchdb-documentation/blob/master/rfcs/013-node-types.md).
+Replication jobs can only be run on these nodes.
+
+`filtered` replications: Replications with a user-defined filter on the source
+endpoint to filter its changes feed.
+
+`replication_id` : An ID defined by replication jobs which is a hash of the
+ source and target endpoint URLs, some of the options, and for filtered
+ replications, the contents of the filter from the source endpoint. Replication
+ IDs will change, for example, if the filter contents on the source endpoint
+ change. Computing this value may require a network round-trip to the source
+ endpoint.
+
+`job_id` : A replication job ID derived from the database and document IDs for
+persistent replications, and from the source and target endpoints, the user
+name, and some options for transient replications. Computing a `job_id`,
+unlike a `replication_id`, doesn't require making any network requests. A
+filtered replication with a given `job_id` may change its `replication_id`
+multiple times during its lifetime as filter contents change on the source.
+
+`max_jobs` : Configuration parameter which specifies up to how many replication
+jobs to run on each `replication` node.
+
+`max_churn` : Configuration parameter which limits how many new jobs are
+spawned during each rescheduling interval.
+
+`min_backoff_penalty` : Configuration parameter specifying the minimum (the
+base) penalty applied to jobs which crash repeatedly.
+
+`max_backoff_penalty` : Configuration parameter specifying the maximum penalty
+applied to jobs which crash repeatedly.
+
+---
+
+# Detailed Description
+
+Replication job creation and scheduling works roughly as follows:
+
+ 1) `Persistent` and `transient` jobs both start by creating or updating a
+ `couch_jobs` record in a separate replication key-space on `api_frontend`
+ nodes. Persistent jobs are driven by an EPI callback mechanism which notifies
+ the `couch_replicator` application when documents in `_replicator` DBs are
+ updated, or when `_replicator` DBs are created and deleted. Transient jobs are
+ created from the `_replicate` HTTP handler directly. Newly created jobs are in
+ a `pending` state.
+
+ 2) Each `replication` node spawns a number of acceptor processes which wait
+ in a `couch_jobs:accept/2` call for jobs. They accept only jobs which are
+ scheduled to run at a time less than or equal to the current time (see the
+ acceptor sketch after this list).
+
+ 3) After a job is accepted, its state is updated to `running`, and then a
+ gen_server process monitoring these replication jobs spawns another
+ acceptor. That happens until the `max_jobs` limit is reached.
+
+ 4) The same monitoring gen_server will periodically check if there are any
+ pending jobs in the queue, and if there are, spawn up to some `max_churn`
+ number of new acceptors. These acceptors may start new jobs, and if they do,
+ for each one of them, the oldest running job will be stopped and re-enqueued
+ as `pending`. This largely follows the logic of the replication scheduler
+ in CouchDB <= 3.x, except that it uses `couch_jobs` as the central queuing
+ and scheduling mechanism.
+
+ 5) After the job is marked as `running`, it computes its `replication_id`,
+ initializes an internal replication state record from the job's data object, and
+ starts replicating. Underneath this level the logic is identical to what's
+ already happening in CouchDB <= 3.x and so it is not described further in this
+ document.
+
+ 6) As jobs run, they periodically checkpoint, and when they do that, they also
+ recompute their `replication_id`. In the case of filtered replications the
+ `replication_id` may change, and if so, that job is stopped and re-enqueued as
+ `pending`. Also, during checkpointing the job's data value is updated with
+ stats such that the job stays active and doesn't get re-enqueued by the
+ `couch_jobs` activity monitor.
+
+ 7) If the job crashes, it will reschedule itself in `gen_server:terminate/2`
+ via a `couch_jobs:resubmit/3` call to run again at some future time, defined
+ roughly as `now + min(min_backoff_penalty * 2^consecutive_errors,
+ max_backoff_penalty)`, which caps the penalty at `max_backoff_penalty`. If a
+ job starts and successfully runs for some
+ predefined period of time without crashing, it is considered to be `"healed"`
+ and its `consecutive_errors` count is reset to 0.
+
+ 8) If the node where a replication job runs crashes, or the job is manually
+ killed via `exit(Pid, kill)`, the `couch_jobs` activity monitor will
+ automatically re-enqueue the job as `pending`.
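+
+The acceptor logic in steps 2-4 might look roughly as follows. This is a
+sketch only: the `?REP_JOB_TYPE` macro, the `max_sched_time` option name, the
+monitor protocol, and `run_replication_job/2` are assumptions based on the
+Background Jobs RFC, not a final API:
+
+```erlang
+%% Sketch: accept only jobs scheduled at or before the current time,
+%% notify the monitoring gen_server, then run the replication.
+acceptor_loop(Monitor) ->
+    Now = erlang:system_time(second),
+    case couch_jobs:accept(?REP_JOB_TYPE, #{max_sched_time => Now}) of
+        {ok, Job, JobData} ->
+            gen_server:cast(Monitor, {accepted, self(), Job}),
+            run_replication_job(Job, JobData);
+        {error, not_found} ->
+            %% Timed out without a matching pending job; try again.
+            acceptor_loop(Monitor)
+    end.
+```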
+
+## Replicator Job States
+
+### Description
+
+The set of replication job states is defined as:
+
+ * `pending` : A job is marked as `pending` in these cases:
+    - As soon as a job is created from an `api_frontend` node
+    - When it is stopped to let other replication jobs run
+    - When a filtered replication's `replication_id` changes
+
+ * `running` : Set when a job is accepted by the `couch_jobs:accept/2`
+   call. This generally means the job is actually running on a node;
+   however, when a node crashes, the job may show as `running` on that
+   node until the `couch_jobs` activity monitor re-enqueues the job and
+   it starts running on another node.
+
+ * `crashing` : The job was running, but then crashed with an intermittent
+   error. The job's data has an error count which is incremented, a backoff
+   penalty is computed, and the job is rescheduled to try again at some point
+   in the future (see the backoff sketch after this list).
+
+ * `completed` : Normal replications which have completed.
+
+ * `failed` : This can happen when:
+    - A replication job could not be parsed from a replication document. For
+      example, if the user has not specified a `"source"` field.
+    - A transient replication job crashes. Transient jobs don't get rescheduled
+      to run again after they crash.
+    - There already is another persistent replication job running or pending
+      with the same `replication_id`.
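+
+A sketch of the backoff computation used when a job passes through the
+`crashing` state (the function name and units are illustrative; `1 bsl N`
+is `2^N`):
+
+```erlang
+%% Sketch: next run time for a repeatedly crashing job, with the
+%% exponential penalty capped at max_backoff_penalty.
+reschedule_time(ConsecutiveErrors, MinPenaltySec, MaxPenaltySec) ->
+    Penalty = min(MinPenaltySec * (1 bsl ConsecutiveErrors), MaxPenaltySec),
+    erlang:system_time(second) + Penalty.
+```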
+
+### State Differences From CouchDB <= 3.x
+
+The set of states is slightly different from the one in CouchDB <= 3.x. There
+are now fewer states, as some of them have been combined:
+
+ * `initializing` was combined with `pending`
+
+ * `error` was combined with `crashing`
+
+### Mapping Between couch_jobs States and Replication States
+
+The `couch_jobs` application has its own set of state definitions, which map
+to replicator states as follows:
+
+ | Replicator States | `couch_jobs` States |
+ | ----------------- | ------------------- |
+ | pending           | pending             |
+ | running           | running             |
+ | crashing          | pending             |
+ | completed         | finished            |
+ | failed            | finished            |
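+
+In code, the replicator-visible state could be recovered from the `couch_jobs`
+state plus a state field kept in the job's data. This is a sketch; the
+`<<"state">>` field name in the job data is an assumption:
+
+```erlang
+%% Sketch: map a couch_jobs state and job data to a replicator state.
+rep_state(pending, #{<<"state">> := <<"crashing">>}) -> crashing;
+rep_state(pending, _Data) -> pending;
+rep_state(running, _Data) -> running;
+rep_state(finished, #{<<"state">> := <<"failed">>}) -> failed;
+rep_state(finished, _Data) -> completed.
+```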
+
+### State Transition Diagram
+
+Jobs start in the `pending` state after either a `_replicator` db doc
+update or a POST to the `/_replicate` endpoint. Continuous jobs will
+normally toggle between the `pending` and `running` states. Normal jobs
+may toggle between `pending` and `running` a few times and then end up
+in `completed`.
+
+```
+_replicator doc       +-------+
+POST /_replicate ---->+pending|
+                      +-------+
+                          ^
+                          |
+                          |
+                          v
+                      +---+---+      +--------+
+            +---------+running+<---->|crashing|
+            |         +---+---+      +--------+
+            |             ^
+            |             |
+            v             v
+        +------+     +---------+
+        |failed|     |completed|
+        +------+     +---------+
+```
+
+
+## Replication ID Collisions
+
+Multiple replication jobs may specify replications which map to the same
+`replication_id`. To handle these collisions, there is an FDB subspace `(...,
+LayerPrefix, ?REPLICATION_IDS, replication_id) -> job_id` that keeps track of
+them. After the `replication_id` is computed, each replication job checks
+whether another job with the same `replication_id` is already pending or
+running.
+If the other job is transient, then the current job will reschedule itself as
+`crashing`. If the other job is persistent, the current job will fail
+permanently as `failed`.
+
+## Replication Parameter Validation
+
+`_replicator` documents in CouchDB <= 3.x were parsed and validated in a
+two-step process:
+
+  1) In a validate-doc-update (VDU) JavaScript function from a programmatically
+  inserted _design document. This validation happened when the document was
+  updated, and performed some rough checks on field names and value types. If
+  this validation failed, the document update operation was rejected.
+
+  2) Inside the replicator's Erlang code, when the document was translated to
+  an internal record used by the replication application. This validation was
+  more thorough, but didn't produce very friendly error messages. If
+  validation failed here, the job would be marked as `failed`.
+
+For CouchDB 4.x the proposal is to use only the Erlang parser. It would be
+called from the `before_doc_update` callback, which runs before every document
+update. If validation fails there, the document update operation is rejected.
+This should reduce code duplication and also give users better feedback
+directly when they update `_replicator` documents.

Review comment:
       I agree on this. the auto injection of vdu's was cute but never really solid.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org