Posted to dev@couchdb.apache.org by ermouth <er...@gmail.com> on 2019/09/06 22:25:44 UTC

CouchDB data processing endpoints

I’d like to raise the question of deprecating QS functions again, from a
somewhat different angle.

The term ‘couchapp’ has an unfortunate history: many people tried what
they thought were couchapps, most expected too much, and were bitterly
disappointed. I have had a better experience with couchapps and their
derivatives, but the technologies and tooling involved are very far from
mainstream concepts.

Anyway, we played a lot with QS and discovered that _list, _update and,
later, JS _rewrite have another application, much more significant
than rendering HTML in a cumbersome manner.

QS functions, being distributed along with the main data flow, allow
comprehensive in-place data pre- and post-processing, a feature that is
unique across the DB landscape.
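
For illustration, in-place pre-processing with an _update function could look roughly like the sketch below. It relies on the documented update-function contract (the function receives the stored document, or null, plus the request object, and returns the document to save together with a response). The function name stampAndNormalize and the "email" and "updated_at" fields are hypothetical, chosen only for this example.

```javascript
// Hypothetical update function. In a design document it would be stored
// as a string under "updates"; here it is a plain function so it can be
// exercised outside CouchDB.
function stampAndNormalize(doc, req) {
  // The request body arrives as a string.
  var input = JSON.parse(req.body || "{}");
  if (!doc) {
    // No stored document yet: create one, preferring the id from the URL
    // and falling back to the UUID that CouchDB supplies with the request.
    doc = { _id: req.id || req.uuid };
  }
  // Pre-processing step (assumed field): trim and lower-case "email".
  if (typeof input.email === "string") {
    doc.email = input.email.trim().toLowerCase();
  }
  // Stamp the document with the server-side time of the update.
  doc.updated_at = new Date().toISOString();
  // Return the document to save and the response body for the client.
  return [doc, JSON.stringify({ ok: true, id: doc._id })];
}
```

Because the function ships inside a design document, it replicates with the data, which is the deployment property argued for above.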

So it would probably be better not to deprecate or remove those endpoints
in Couch 3, but just to ditch the ‘couchapp’ term in favor of something
like ‘data processing extensions’.

It seems that QS functions in their current state require very little
maintenance and have nearly zero bugs. So why spend time amputating working
parts? Wouldn’t it be better to turn them into a clearly presented advantage?

I mean changes to the docs, removing ‘couchapps’ in favor of ‘processing
extensions’, maybe introducing a less expensive requestObj, etc. Clarifying
language:"erlang" usage might also be valuable. In any case, creative work,
not destructive.

ermouth

Re: CouchDB data processing endpoints

Posted by "Johs. E" <jo...@b2w.com>.
Hi Robert,

As long as design documents have meaningful processing endpoints, they will remain the extremely useful babel fish of CouchDB.

Does your "for some time to come" mean that you support an LTS for 3.0?
(ref Jan 19 Aug 2019)

Johs


> On 8 Sep 2019, at 16:24, ermouth <er...@gmail.com> wrote:
> 
> Hi.
> 
>> the majority view seems to be that these can all be done better externally
> 
> As Johs mentioned, it looks like the couchapp community here is a sort of
> minority. I’m sorry to say it, but telling minorities what is, well, the
> right way to use endpoints is an attitude widely considered outdated at
> best. That holds even if the majority think they know how to do better, and
> even if some of them experienced nausea once or twice long ago.
> 
> Minorities are better embraced than repelled or expelled.
> 
>> there doesn't even need to be a performance impact
> 
> First, this is not always true, and I even know in which particular cases,
> because I’ve tested. Have you?
> 
> Secondly, as I have pointed out several times, slight performance
> differences are not always important: for large meshes, deployment is a much
> more vital concern, and from a deployment point of view there is no
> reasonably lightweight substitute.
> 
> Best regards,
> ermouth


Re: CouchDB data processing endpoints

Posted by ermouth <er...@gmail.com>.
Hi.

> the majority view seems to be that these can all be done better externally

As Johs mentioned, it looks like the couchapp community here is a sort of
minority. I’m sorry to say it, but telling minorities what is, well, the
right way to use endpoints is an attitude widely considered outdated at
best. That holds even if the majority think they know how to do better, and
even if some of them experienced nausea once or twice long ago.

Minorities are better embraced than repelled or expelled.

> there doesn't even need to be a performance impact

First, this is not always true, and I even know in which particular cases,
because I’ve tested. Have you?

Secondly, as I have pointed out several times, slight performance
differences are not always important: for large meshes, deployment is a much
more vital concern, and from a deployment point of view there is no
reasonably lightweight substitute.

Best regards,
ermouth


Sun, 8 Sep 2019 at 11:33, Robert Samuel Newson <rn...@apache.org>:

> Hi,
>
> My rule of thumb here is whether any particular 'data processing endpoint'
> can be done better (or at all) within the database server than otherwise,
> as opposed to the original CouchDB position of adding such things to the
> database to enable application hosting (of a limited form, as we've all
> noted ad nauseam). For _show, _list, _rewrite and _update, the majority
> view seems to be that these can all be done better externally. With Joan's
> note on co-locating couchdb, node.js and a proxy like nginx or haproxy,
> there doesn't even need to be a performance impact.
>
> I think Joan might have been referring to "validate_doc_update" not
> "_update" with the "going nowhere" comment, as I think we all agree that
> validate_doc_update is an important part of the core database (though it
> might be enhanced to not require a javascript evaluation in most
> circumstances) or at least something like it that allows users to enforce
> arbitrary constraints on the form of any given update.
>
> In summary, I still think we're deprecating several of the 'data
> processing endpoints' in 3.0 and removing them (or not re-implementing them
> as the case may be) in 4.0. CouchDB 3.0 will be around for some time to
> come.
>
> B.

Re: CouchDB data processing endpoints

Posted by Robert Samuel Newson <rn...@apache.org>.
Hi,

My rule of thumb here is whether any particular 'data processing endpoint' can be done better (or at all) within the database server than otherwise, as opposed to the original CouchDB position of adding such things to the database to enable application hosting (of a limited form, as we've all noted ad nauseam). For _show, _list, _rewrite and _update, the majority view seems to be that these can all be done better externally. With Joan's note on co-locating couchdb, node.js and a proxy like nginx or haproxy, there doesn't even need to be a performance impact.

I think Joan might have been referring to "validate_doc_update" not "_update" with the "going nowhere" comment, as I think we all agree that validate_doc_update is an important part of the core database (though it might be enhanced to not require a javascript evaluation in most circumstances) or at least something like it that allows users to enforce arbitrary constraints on the form of any given update.
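
As a reminder of what validate_doc_update does, here is a minimal sketch of such a function. The four-argument signature and the thrown { forbidden: ... } object follow the documented CouchDB contract; the function name validateDocUpdate and the rule that every document needs an immutable string "type" field are assumptions made only for this example.

```javascript
// Hypothetical validation function. In a design document it would be
// stored as a string under "validate_doc_update".
function validateDocUpdate(newDoc, oldDoc, userCtx, secObj) {
  // Let deletions through without checking the body.
  if (newDoc._deleted) {
    return;
  }
  // Assumed constraint: every document carries a string "type" field.
  if (typeof newDoc.type !== "string") {
    throw({ forbidden: "Documents must have a string 'type' field" });
  }
  // Assumed constraint: the type may not change once it has been set.
  if (oldDoc && oldDoc.type !== newDoc.type) {
    throw({ forbidden: "The 'type' field cannot be changed" });
  }
}
```

An update that violates a rule is rejected with the given message; returning normally accepts the update.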

In summary, I still think we're deprecating several of the 'data processing endpoints' in 3.0 and removing them (or not re-implementing them as the case may be) in 4.0. CouchDB 3.0 will be around for some time to come.

B.

> On 7 Sep 2019, at 04:13, Johs Ensby <jo...@b2w.com> wrote:
> 
> Hi Joan,
> 
>> On 7 Sep 2019, at 00:59, Joan Touzet <wo...@apache.org> wrote:
>> More accurately, the current plan is they won't be re-implemented for 4.0, since the existing implementations won't work in 4.0 against FoundationDB.
> 
> About the discussions on dropping the functions that make design documents so useful to many of us:
> Thanks again for clarifying.
> 
> This provides predictability for a group of users that might otherwise feel like a weak minority.
> Together with Jan's LTS commitment in the August report below, this predictability is highly appreciated. 
> 
>> On 19 Aug 2019, at 11:51, Jan Lehnardt <ja...@apache.org> wrote:
>> 
>> 3.0
>> will include the best version of the current, mostly Erlang-based project,
>> with many new features contributed by various project partners (but notably
>> IBM). This will be the LTS version for people who won’t be able to migrate to
>> the newer technology foundation. There are a number of technical limitations
>> that we are happy to adopt as a project going forward, but that might be deal-
>> breakers for some users. As such, we’ll serve those users best with an excellent
>> edition of the original technology stack. LTS-timelines are TBD.
>> 
> 
> 
> Johs
> 
> PS
>> On 7 Sep 2019, at 00:59, Joan Touzet <wo...@apache.org> wrote:
>> We've already dropped the 'couchapp' term in the documentation, over a year ago. There is a single reference to them in the docs that states these functions should not be used for new designs:
> As for the discussion about CouchDB as a catch-all platform (node.js, haproxy, and nginx etc) – it is lost and buried.
> Thanks to @ermouth for narrowing this down to "processing endpoints" and "data pre/post processing", useful terms in redirecting the discussion towards the usefulness of design documents that can sync with the data.


Re: CouchDB data processing endpoints

Posted by Johs Ensby <jo...@b2w.com>.
Hi Joan,

> On 7 Sep 2019, at 00:59, Joan Touzet <wo...@apache.org> wrote:
> More accurately, the current plan is they won't be re-implemented for 4.0, since the existing implementations won't work in 4.0 against FoundationDB.

About the discussions on dropping the functions that make design documents so useful to many of us:
Thanks again for clarifying.

This provides predictability for a group of users that might otherwise feel like a weak minority.
Together with Jan's LTS commitment in the August report below, this predictability is highly appreciated. 

> On 19 Aug 2019, at 11:51, Jan Lehnardt <ja...@apache.org> wrote:
> 
>  3.0
> will include the best version of the current, mostly Erlang-based project,
> with many new features contributed by various project partners (but notably
> IBM). This will be the LTS version for people who won’t be able to migrate to
> the newer technology foundation. There are a number of technical limitations
> that we are happy to adopt as a project going forward, but that might be deal-
> breakers for some users. As such, we’ll serve those users best with an excellent
> edition of the original technology stack. LTS-timelines are TBD.
> 


Johs

PS
> On 7 Sep 2019, at 00:59, Joan Touzet <wo...@apache.org> wrote:
> We've already dropped the 'couchapp' term in the documentation, over a year ago. There is a single reference to them in the docs that states these functions should not be used for new designs:
As for the discussion about CouchDB as a catch-all platform (node.js, haproxy, and nginx etc) – it is lost and buried.
Thanks to @ermouth for narrowing this down to "processing endpoints" and "data pre/post processing", useful terms in redirecting the discussion towards the usefulness of design documents that can sync with the data.

Re: CouchDB data processing endpoints

Posted by Joan Touzet <wo...@apache.org>.
On 2019-09-06 6:25 p.m., ermouth wrote:
> I’d like to raise the question of deprecating QS functions again, from a
> somewhat different angle.

What is "QS functions"? Which specific deprecations in 3.0 are you 
asking about? Is it just _list, _update and _rewrites? Of these, _update 
is going nowhere, and _list and _rewrites don't go away until 4.0.

More accurately, the current plan is they won't be re-implemented for 
4.0, since the existing implementations won't work in 4.0 against 
FoundationDB.

> The term ‘couchapp’ has an unfortunate history: many people tried what
> they thought were couchapps, most expected too much, and were bitterly
> disappointed. I have had a better experience with couchapps and their
> derivatives, but the technologies and tooling involved are very far from
> mainstream concepts.
> 
> Anyway, we played a lot with QS and discovered that _list, _update and,
> later, JS _rewrite have another application, much more significant
> than rendering HTML in a cumbersome manner.
> 
> QS functions, being distributed along with the main data flow, allow
> comprehensive in-place data pre- and post-processing, a feature that is
> unique across the DB landscape.

We've had this discussion before. I don't see you saying anything new, 
and I'm not going to repeat myself, either.

> So it would probably be better not to deprecate or remove those endpoints
> in Couch 3, but just to ditch the ‘couchapp’ term in favor of something
> like ‘data processing extensions’.

We've already dropped the 'couchapp' term in the documentation, over a 
year ago. There is a single reference to them in the docs that states 
these functions should not be used for new designs:

http://docs.couchdb.org/en/stable/ddocs/index.html?highlight=couchapp

The deprecation table of the 4 endpoints proposed for deprecation in 4.0 
includes the suggested replacements for these functions in 4.0. Yes, 
this isn't all done 100% in CouchDB, but even a Raspberry Pi can run an 
haproxy, an nginx or a small Node.js server alongside CouchDB with 
almost no CPU impact - in fact, less CPU and RAM impact than using 
_list, _show or _rewrite in CouchDB today.
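
To make the external alternative concrete, here is a rough sketch of the kind of post-processing a small Node.js server could apply to a view response in place of a _list function. It assumes the standard view result shape ({ rows: [{ id, key, value }] }); the function name viewToCsv and the choice of CSV output are hypothetical, and fetching the JSON from CouchDB over HTTP is left out.

```javascript
// Hypothetical helper: turn a parsed CouchDB view response into CSV text.
function viewToCsv(viewResult) {
  // Quote one cell, doubling any embedded double quotes.
  function quote(v) {
    if (v === undefined) {
      return '""'; // e.g. reduce rows carry no "id"
    }
    var s = typeof v === "string" ? v : JSON.stringify(v);
    return '"' + s.replace(/"/g, '""') + '"';
  }
  var lines = ["id,key,value"];
  viewResult.rows.forEach(function (row) {
    lines.push([row.id, row.key, row.value].map(quote).join(","));
  });
  return lines.join("\n");
}
```

The same transformation would run unchanged behind haproxy or nginx; what it does not do is travel with the database the way a design document does.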

That's not to say that this stuff - server-side mass updates of 
documents, to take one often requested example - might not end up being 
implemented *better* than these functions, with a brand new API, thanks 
to what FoundationDB provides. (But that functionality may not land in 
time for 4.0, either.)

> It seems that QS functions in their current state require very little
> maintenance and have nearly zero bugs. So why spend time amputating working
> parts? Wouldn’t it be better to turn them into a clearly presented advantage?

See above - the current implementations won't work in 4.0 and would need 
a complete rewrite. This isn't a "keep the code around" situation, it's 
a full re-engineering effort. Our individual experiences have led us to 
different places on this functionality, and we've heard from everyone 
active in this community at least *twice* about their viewpoints.

Speaking as fairly as possible, I don't practically see this 
disagreement being resolved in an easy fashion at this point.

-Joan

> I mean changes to the docs, removing ‘couchapps’ in favor of ‘processing
> extensions’, maybe introducing a less expensive requestObj, etc. Clarifying
> language:"erlang" usage might also be valuable. In any case, creative work,
> not destructive.
> 
> ermouth