Posted to dev@couchdb.apache.org by Robert Newson <rn...@apache.org> on 2016/03/20 17:19:08 UTC

Candidates for built-in filter functions?

Hi,

As part of a new effort to improve replicator performance, I'm planning to add new built-in filter functions. These run in the Erlang VM, saving the couchjs round trip.

The first candidate is one that skips deleted documents, as it's quite common to replicate with such a filter to leave tombstones behind.

This thread is for gathering more suggestions, so please help me out here. I'd like to reach the level we have for reduce functions, where the built-ins cover a good deal of the useful / functional cases.

One filter I'm considering would allow filtering by the value of a named attribute. Something like "include this doc if doc.type equals 'purchase order'". Both the name and required value would be query parameters. 
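
To make the idea concrete, here is a rough sketch of the predicate in Python. It is not CouchDB code (the real thing would live in Erlang), and the names make_attribute_filter, name and value are made up for illustration; they stand in for the two query parameters.

```python
def make_attribute_filter(name, value):
    """Build a predicate from the two (hypothetical) query parameters."""
    def accept(doc):
        # A doc without the named attribute never matches, so it is skipped.
        return name in doc and doc[name] == value
    return accept


docs = [
    {"_id": "a", "type": "purchase order"},
    {"_id": "b", "type": "invoice"},
    {"_id": "c"},
]

accept = make_attribute_filter("type", "purchase order")
selected_ids = [doc["_id"] for doc in docs if accept(doc)]
print(selected_ids)
```

Only plain equality is shown here; anything richer (ranges, multiple fields) would need more parameters.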

B. 

Re: Candidates for built-in filter functions?

Posted by Nolan Lawson <no...@nolanlawson.com>.
I second the comment about Mango. However, if that's not feasible, then I'd
say:

- deleted docs
- docs with _conflicts
- design/non-design docs
- docs with/without attachments

Just some ideas! Although really it would be nice if it were
parameterizable (then we could do types, keywords, etc.), but at that point
we might as well use Mango.
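
For what it's worth, the four cases above boil down to very small predicates. A Python sketch follows; the function names and the BUILTIN_FILTERS table are invented for illustration, and it assumes the usual CouchDB document fields (_deleted, _conflicts, _attachments, and the _design/ id prefix) are present on the doc handed to the filter.

```python
def is_deleted(doc):
    # Tombstones carry "_deleted": true.
    return doc.get("_deleted") is True


def has_conflicts(doc):
    # "_conflicts" is only present when conflicts were requested for the doc.
    return bool(doc.get("_conflicts"))


def is_design_doc(doc):
    return doc.get("_id", "").startswith("_design/")


def has_attachments(doc):
    return bool(doc.get("_attachments"))


# Hypothetical lookup table: filter name -> predicate deciding whether to keep a doc.
BUILTIN_FILTERS = {
    "skip_deleted": lambda doc: not is_deleted(doc),
    "only_conflicted": has_conflicts,
    "only_design": is_design_doc,
    "skip_design": lambda doc: not is_design_doc(doc),
    "only_with_attachments": has_attachments,
}

docs = [
    {"_id": "_design/app", "views": {}},
    {"_id": "order-1", "_attachments": {"scan.png": {}}},
    {"_id": "order-2", "_deleted": True},
]
kept = [d["_id"] for d in docs if BUILTIN_FILTERS["skip_deleted"](d)]
print(kept)
```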

- Nolan

On Tue, Mar 22, 2016 at 8:28 AM, Garren Smith <ga...@apache.org> wrote:

> I can't comment from a performance perspective. But from an API
> perspective, having the mango selectors for the filters would be really
> nice. I think it would be rich enough and quite elegant.



-- 
Nolan Lawson
nolanlawson.com
github.com/nolanlawson

Re: Candidates for built-in filter functions?

Posted by Garren Smith <ga...@apache.org>.
I can't comment from a performance perspective. But from an API
perspective, having the mango selectors for the filters would be really
nice. I think it would be rich enough and quite elegant.

On Sun, Mar 20, 2016 at 8:40 PM, Robert Samuel Newson <rn...@apache.org>
wrote:

> That idea came up explicitly this week and it has obvious merit. I don't
> know enough about mango selectors to know if it's rich "enough" but it
> would be simple to add and whatever it did cover would run much faster than
> today's JS approach.

Re: Candidates for built-in filter functions?

Posted by Robert Samuel Newson <rn...@apache.org>.
That idea came up explicitly this week and it has obvious merit. I don't know enough about mango selectors to know if it's rich "enough" but it would be simple to add and whatever it did cover would run much faster than today's JS approach.

> On 20 Mar 2016, at 17:03, Adam Kocoloski <ko...@apache.org> wrote:
> 
> Hi Bob, instead of trying to anticipate all popular options what about enabling Mango selectors as filters? I’d hope that over time the performance of a selector is comparable to a builtin.
> 
> Adam


Re: Candidates for built-in filter functions?

Posted by Adam Kocoloski <ko...@apache.org>.
Hi Bob, instead of trying to anticipate all the popular options, what about enabling Mango selectors as filters? I’d hope that over time the performance of a selector would become comparable to a built-in.
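
As a rough illustration of what a selector-as-filter would do, here is a minimal matcher sketched in Python. It is not Mango's implementation: it handles only top-level fields, implicit equality, and a handful of operators ($eq, $gt, $gte, $lt, $lte, $exists), and it assumes the compared values are of comparable types. The function name matches is made up.

```python
import operator

# Subset of Mango comparison operators supported by this sketch.
_COMPARATORS = {
    "$eq": operator.eq,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}
_MISSING = object()  # sentinel for "field not present in doc"


def matches(selector, doc):
    """Return True if doc satisfies every field condition in selector."""
    for field, condition in selector.items():
        value = doc.get(field, _MISSING)
        # A bare value is shorthand for {"$eq": value}.
        if not isinstance(condition, dict):
            condition = {"$eq": condition}
        for op, operand in condition.items():
            if op == "$exists":
                if (value is not _MISSING) != operand:
                    return False
            elif op in _COMPARATORS:
                # A missing field fails every comparison.
                if value is _MISSING or not _COMPARATORS[op](value, operand):
                    return False
            else:
                raise ValueError(f"unsupported operator: {op}")
    return True


selector = {"type": "purchase order", "total": {"$gt": 100}}
print(matches(selector, {"type": "purchase order", "total": 250}))
print(matches(selector, {"type": "purchase order", "total": 50}))
```

Bob's two candidates fall out as special cases: {"_deleted": {"$exists": false}} for skipping tombstones, and {"type": "purchase order"} for the named-attribute filter.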

Adam

> On Mar 20, 2016, at 12:34 PM, Alexander Shorin <kx...@gmail.com> wrote:
> 
> That's the way to have a security issue by giving arbitrary user to
> run any regexp on server side. For instance:
> http://www.regular-expressions.info/catastrophic.html
> 
> --
> ,,,^..^,,,


Re: Candidates for built-in filter functions?

Posted by Alexander Shorin <kx...@gmail.com>.
On Sun, Mar 20, 2016 at 7:30 PM, Constantin Teodorescu
<br...@gmail.com> wrote:
> It would be nice also:  _design/* or even something like  _id match regexp
> ...
> And the same for doc.type match regexp

That's a way to create a security issue: it lets arbitrary users run
any regexp on the server side. For instance:
http://www.regular-expressions.info/catastrophic.html
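
A small Python sketch of the effect, using Python's re module rather than anything in CouchDB. The pattern with nested quantifiers and the helper name time_failed_match are chosen for illustration; the point is that each few extra characters of input multiply the time spent on a match that is doomed to fail.

```python
import re
import time

# Nested quantifiers: the classic shape that triggers catastrophic backtracking.
PATTERN = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Match against n 'a's followed by 'b', which can never succeed."""
    subject = "a" * n + "b"
    start = time.perf_counter()
    result = PATTERN.match(subject)
    elapsed = time.perf_counter() - start
    return result, elapsed


for n in (10, 14, 18):
    result, elapsed = time_failed_match(n)
    print(f"n={n} matched={result is not None} elapsed={elapsed:.4f}s")
```

The input lengths are kept small on purpose; a subject only slightly longer can keep a backtracking engine busy for a very long time.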

--
,,,^..^,,,

Re: Candidates for built-in filter functions?

Posted by Constantin Teodorescu <br...@gmail.com>.
On Sun, Mar 20, 2016 at 6:19 PM, Robert Newson <rn...@apache.org> wrote:

> As part of a new effort to improve replicator performance I'm planning to
> add new built-in filter functions. These run in the Erlang vm; saving the
> couchjs round trip.
> The first candidate is one that skips deleted documents as it's quite
> common to replicate with such a filter to remove deleted tombstones.
> This thread is for gathering more suggestions, so please help me out here.
> I'd like to reach the level we have for reduce functions which cover a good
> deal of the useful / functional cases.
> One filter I'm considering would allow filtering by the value of a named
> attribute. Something like "include this doc if doc.type equals 'purchase
> order'". Both the name and required value would be query parameters.
>

It would also be nice to have: _design/*, or even something like an _id
regexp match ...
And the same for doc.type: a regexp match.

Teo