Posted to dev@couchdb.apache.org by Joan Touzet <wo...@apache.org> on 2017/07/16 21:14:09 UTC

[PROPOSAL] Include replication scheduler in 2.1

Hi all,

*Per the CouchDB bylaws, this is a concrete proposal that will default
to lazy consensus in 72 hours (2017.07.19 ~21:00).*

As we approach release candidates for v2.1 (see next email), we have one
major decision left to make: whether or not to include the scheduling
replicator in 2.1.

Arguments for inclusion:

* New feature allowing CouchDB to manage more replication jobs at
  the same time by switching between them / starting / stopping.
  From the documentation:
  * Handles failing jobs more gracefully (exponential backoff).
  * Includes a new pair of API endpoints: _scheduler/jobs and
    _scheduler/docs with enhanced information and an updated state
    machine for replication jobs.
  * Shared connection pool improves network resource usage and
    performance, especially with large numbers of connections to
    the same source/target.
  * Improved request rate limit handling.
  * Improved recovery from long but temporary network failures.
  * Better handling of filtered replications.
* Feature includes its own tests, which all pass.
* Feature is fully documented.
* Separating the scheduling replicator commits from the ~190 commits
  that have landed since it was merged (all bugfixes and minor
  improvements) is labour intensive for the release team, and possibly
  error prone.
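
To illustrate the "exponential backoff" item in the list above, here is a
minimal Python sketch of the general technique. It is not the scheduling
replicator's actual code, and the function name and the base, factor and
cap values are all assumptions chosen for illustration:

```python
def backoff_delays(base=5.0, factor=2.0, cap=3600.0, attempts=8):
    """Yield the wait in seconds before each successive retry.

    Hypothetical helper: the delay grows by `factor` after every
    failure and is clamped to `cap` so it never grows without bound.
    """
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


if __name__ == "__main__":
    # A job that keeps failing waits longer and longer between retries.
    for attempt, wait in enumerate(backoff_delays(), start=1):
        print(f"retry {attempt}: wait {wait:.0f}s")
```

The point of the cap is that a job failing for a long time is still
retried at a bounded interval instead of being delayed indefinitely.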

Arguments against inclusion:

* It has been ~9 months since the 2.0 release. Many bugs have been
  found and fixed.
* A new release without the scheduling replicator would provide
  risk mitigation for users who need those bug fixes but are risk-
  averse to new features.
* Scheduling replicator has not seen much real-world testing. Bugs
  may surface in a 2.1 release that could destabilise existing
  installs being upgraded.

I've thought a lot about this issue, and would like to propose that we
release 2.1 *with* the scheduling replicator included. My reasoning is
that the benefits outweigh the potential downsides. If necessary, we
can release a 2.1.1 in the following weeks with urgent bug fixes to
the scheduling replicator.

Another alternative would be to ~simultaneously release a 2.1 from just
before the scheduling replicator landed (~190 commits ago), then a 2.2
from the HEAD of the master branch with all the subsequent fixes. 2.1
would be missing these more recent fixes, but it would again avoid the
massive cherry-picking operation necessary to port all of them to the
2.1 branch without the scheduling replicator. I'm -0 on this because
of the confusion it might create with release announcements, but
wouldn't block if that was the desired path forward.

Developers, please make your voices heard! :)

-Joan

Re: [PROPOSAL] Include replication scheduler in 2.1

Posted by Robert Samuel Newson <rn...@apache.org>.

+1.

Re: [PROPOSAL] Include replication scheduler in 2.1

Posted by Benjamin Bastian <bb...@apache.org>.

+1

On Wed, Jul 19, 2017 at 11:04 AM, Jan Lehnardt <ma...@jan.io> wrote:

> +1
>
> Cheers
> Jan
> --
>
> > On 17. Jul 2017, at 20:35, Paul Davis <pa...@gmail.com> wrote:
> >
> > +1 (assuming that's +1 in favor of releasing with scheduling replicator)
> >
> >> On Sun, Jul 16, 2017 at 7:54 PM, Nick Vatamaniuc <va...@gmail.com> wrote:
> >> +1

Re: [PROPOSAL] Include replication scheduler in 2.1

Posted by Jan Lehnardt <ma...@jan.io>.

+1

Cheers
Jan
--

> On 17. Jul 2017, at 20:35, Paul Davis <pa...@gmail.com> wrote:
> 
> +1 (assuming that's +1 in favor of releasing with scheduling replicator)
> 
>> On Sun, Jul 16, 2017 at 7:54 PM, Nick Vatamaniuc <va...@gmail.com> wrote:
>> +1

Re: [PROPOSAL] Include replication scheduler in 2.1

Posted by Paul Davis <pa...@gmail.com>.

+1 (assuming that's +1 in favor of releasing with scheduling replicator)

On Sun, Jul 16, 2017 at 7:54 PM, Nick Vatamaniuc <va...@gmail.com> wrote:
> +1

Re: [PROPOSAL] Include replication scheduler in 2.1

Posted by Nick Vatamaniuc <va...@gmail.com>.

+1
