Posted to dev@couchdb.apache.org by Joan Touzet <wo...@apache.org> on 2017/07/16 21:14:09 UTC
[PROPOSAL] Include replication scheduler in 2.1
Hi all,
*Per the CouchDB bylaws, this is a concrete proposal that will default
to lazy consensus in 72 hours (2017.07.19 ~21:00).*
As we approach release candidates for v2.1 (see next email), we have one
major decision left to make: whether or not to include the scheduling
replicator in 2.1.
Arguments for inclusion:
* New feature allowing CouchDB to manage more replication jobs at
  the same time by switching between them / starting / stopping.
  From the documentation:
    * Handles failing jobs more gracefully (exponential backoff).
    * Includes a new pair of API endpoints: _scheduler/jobs and
      _scheduler/docs with enhanced information and an updated state
      machine for replication jobs.
    * Shared connection pool improves network resource usage and
      performance, especially with large numbers of connections to
      the same source/target.
    * Improved request rate limit handling.
    * Improved recovery from long but temporary network failures.
    * Better handling of filtered replications.
* Feature includes its own tests, which all pass.
* Feature is fully documented.
* Cherry-picking out the scheduling replicator commits from the
  ~190 commits that have landed since (all bugfixes and minor
  improvements) is labour-intensive for the release team, and
  possibly error-prone.
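For readers unfamiliar with the term, the "exponential backoff" mentioned
above can be sketched roughly as follows. This is only an illustration of
the general technique, not the scheduler's actual code: the function name
and the base, factor, and cap values are assumptions chosen for the
example, not CouchDB settings.

```python
def backoff_delays(base_seconds=30.0, factor=2.0, max_seconds=3600.0, attempts=8):
    """Return the wait (in seconds) before each successive retry.

    All defaults are hypothetical values for illustration only.
    The delay grows by `factor` after every failure and is capped
    at `max_seconds` so a long outage does not push retries out forever.
    """
    delays = []
    delay = base_seconds
    for _ in range(attempts):
        delays.append(min(delay, max_seconds))
        delay *= factor
    return delays


if __name__ == "__main__":
    # Print the wait before each of the first five retries.
    for attempt, wait in enumerate(backoff_delays(attempts=5), start=1):
        print(f"retry {attempt}: wait {wait:.0f}s")
```

The point of the cap is that a repeatedly failing job keeps being retried,
just less and less often, instead of consuming a slot that a healthy job
could use.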
Arguments against inclusion:
* It has been ~9 months since the 2.0 release. Many bugs have been
found and fixed.
* A new release without the scheduling replicator would provide
risk mitigation for users who need those bug fixes but are risk-
averse to new features.
* Scheduling replicator has not seen much real-world testing. Bugs
may surface in a 2.1 release that could destabilise existing
installs being upgraded.
I've thought a lot about this issue, and would like to propose that we
release 2.1 *with* the scheduling replicator included. My reasoning is
that the benefits outweigh the potential downsides. If necessary, we
can release a 2.1.1 in the following weeks with urgent bug fixes to
the scheduling replicator.
Another alternative would be to ~simultaneously release a 2.1 from just
before the scheduling replicator landed (~190 commits ago), then a 2.2
from the HEAD of the master branch with all the subsequent fixes. 2.1
would be missing these more recent fixes, but it would again avoid the
massive cherry-picking operation necessary to port all of them to the
2.1 branch without the scheduling replicator. I'm -0 on this because
of the confusion it might create with release announcements, but
wouldn't block if that was the desired path forward.
Developers, please make your voices heard! :)
-Joan
Re: [PROPOSAL] Include replication scheduler in 2.1
Posted by Robert Samuel Newson <rn...@apache.org>.
+1.
Re: [PROPOSAL] Include replication scheduler in 2.1
Posted by Benjamin Bastian <bb...@apache.org>.
+1
On Wed, Jul 19, 2017 at 11:04 AM, Jan Lehnardt <ma...@jan.io> wrote:
> +1
>
> Cheers
> Jan
> --
>
> > On 17. Jul 2017, at 20:35, Paul Davis <pa...@gmail.com>
> wrote:
> >
> > +1 (assuming that's +1 in favor of releasing with scheduling replicator)
> >
> >> On Sun, Jul 16, 2017 at 7:54 PM, Nick Vatamaniuc <va...@gmail.com>
> wrote:
> >> +1
Re: [PROPOSAL] Include replication scheduler in 2.1
Posted by Jan Lehnardt <ma...@jan.io>.
+1
Cheers
Jan
--
> On 17. Jul 2017, at 20:35, Paul Davis <pa...@gmail.com> wrote:
>
> +1 (assuming that's +1 in favor of releasing with scheduling replicator)
>
>> On Sun, Jul 16, 2017 at 7:54 PM, Nick Vatamaniuc <va...@gmail.com> wrote:
>> +1
Re: [PROPOSAL] Include replication scheduler in 2.1
Posted by Paul Davis <pa...@gmail.com>.
+1 (assuming that's +1 in favor of releasing with scheduling replicator)
On Sun, Jul 16, 2017 at 7:54 PM, Nick Vatamaniuc <va...@gmail.com> wrote:
> +1
Re: [PROPOSAL] Include replication scheduler in 2.1
Posted by Nick Vatamaniuc <va...@gmail.com>.
+1