Posted to user@couchdb.apache.org by Adam Wolff <aw...@gmail.com> on 2009/03/10 01:31:29 UTC

helper functions in reduce

Hi again,

So we've got a nifty little reduce function that makes fairly heavy
use of helper functions. My view looks kinda like this:

by_id : {
    map: function(doc) {
        // An object literal cannot use doc.type as a key, so build the value first.
        var value = {};
        value[doc.type] = doc;
        emit(doc._id, value);
    },
    reduce : function(keys, vals, rereduce){

        function helper1(args){...};
        function helper2(args){...};
        function helper3(args){...};

        var result = [];
        for (var i = 0; i < vals.length; i++) {
            var val = vals[i];
            if ( helper1(vals) ) result = helper2(val, result);
            else result = helper3(val, result);
        }

        return result;
    }
}


but it bothers me a little that those three functions are defined every time
that reduce is run. I haven't measured the overhead, but my general
experience with JavaScript is that function creation has measurable
overhead. Could CouchDB make the view object itself available when reduce is
run, so I can stick other functions on its slots? Like:

    helper1 : function(args){...},
    helper2 : function(args){...},
    helper3 : function(args){...},

    reduce : function(keys, vals, rereduce){

        var result = [];
        for (var i = 0; i < vals.length; i++) {
            var val = vals[i];
            if ( view.helper1(vals) ) result = view.helper2(val, result);
            else result = view.helper3(val, result);
        }

        return result;
    }

Am I the only person who's worried about this?

Thanks,
A

Re: helper functions in reduce

Posted by Chris Anderson <jc...@apache.org>.
On Wed, Mar 11, 2009 at 6:59 AM, Zachary Zolton
<za...@gmail.com> wrote:
> I was a bit misled by the title; this sounds like caching the
> parsed/ready-to-go JavaScript view functions.
>
> However, I am interested in hearing how we can introduce helper code,
> and reusable libraries, into our JavaScript views.
>

Check out the CouchApp script, which has macros for including helper
code automatically in view functions when saving the design documents.
This is really the best way to do it, since the functions stored in
the design doc cannot depend on definitions elsewhere in the code. If
there is a set of helpers you'd like to see as sort of a standard
library for MapReduce, they'd make great candidates for inclusion in
the CouchApp vendor package.

Scan the README for mention of the !code macro:
http://github.com/jchris/couchapp/blob/master/README.md
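
As a rough illustration of what a save-time macro like this does, here is a minimal sketch. The `expandCodeMacros` function, the `files` map, the helper file name, and the exact `// !code <path>` comment form are all assumptions made for this example, not CouchApp's actual implementation; the README is the place to check the real syntax.

```javascript
// Hypothetical macro expander: replaces each "// !code <path>" comment line
// with the text of the named file before the view function is saved.
function expandCodeMacros(source, files) {
  return source.replace(/^[ \t]*\/\/ ?!code (\S+)[ \t]*$/gm, function (line, path) {
    if (!Object.prototype.hasOwnProperty.call(files, path)) {
      throw new Error('missing helper file: ' + path);
    }
    return files[path];
  });
}

// Assumed inputs: one helper file and a map function that refers to it.
var files = { 'lib/helpers.js': 'function double(x) { return 2 * x; }' };
var mapSource = [
  'function(doc) {',
  '  // !code lib/helpers.js',
  '  emit(double(doc.n), null);',
  '}'
].join('\n');

// The expanded text carries its helper with it, so it no longer depends on
// anything defined outside the design doc.
var expanded = expandCodeMacros(mapSource, files);
```

The point of doing this at save time is that the stored function string is self-contained by the time the view server sees it.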

-- 
Chris Anderson
http://jchris.mfdz.com

Re: helper functions in reduce

Posted by Zachary Zolton <za...@gmail.com>.
I was a bit misled by the title; this sounds like caching the
parsed/ready-to-go JavaScript view functions.

However, I am interested in hearing how we can introduce helper code,
and reusable libraries, into our JavaScript views.

Perhaps one could take care of both in one go? If this doesn't get
implemented soon, I'll take a shot at it. Dang work responsibilities
getting in the way...! ;^)

—Zach

On Wed, Mar 11, 2009 at 2:29 AM, Jan Lehnardt <ja...@apache.org> wrote:
>
> On 11 Mar 2009, at 02:29, Chris Anderson wrote:
>
>> On Tue, Mar 10, 2009 at 6:18 PM, kowsik <ko...@gmail.com> wrote:
>>>
>>> And the source code as defined in the view is parsed and executed
>>> __each__ time reduce is called on the changed docs. I think the intent
>>> was that the view server has no clue when _design views actually
>>> change. So couchdb passes in the actual string corresponding to the
>>> view function each time it wants couchjs to reduce a bunch of
>>> key/values. I suppose you could precompile these functions in the view
>>> server and have it return "handles" to couchdb that can be used at a
>>> later stage.
>>
>> We've discussed a little on IRC how to optimize this. The plan so
>> far is to maintain, in CouchDB's ets tables of couchjs processes, a
>> list of functions known by each process (where functions are named by
>> their md5 hash). When a function is passed to a query server, it'll be
>> passed along with its name, the first time. If a server already knows
>> about the function, Couch will just pass the name in subsequent
>> requests. If they somehow get out of sync CouchDB can just kill that
>> server and start fresh.
>>
>> This also means that if you have lots of dbs with the same views in
>> them, you get nice reuse of couchjs processes.
>>
>> Anyone feel like implementing? ;)
>
> *cough* …maybe dev@.
>
> Cheers
> Jan

Re: helper functions in reduce

Posted by Jan Lehnardt <ja...@apache.org>.
On 11 Mar 2009, at 02:29, Chris Anderson wrote:

> On Tue, Mar 10, 2009 at 6:18 PM, kowsik <ko...@gmail.com> wrote:
>> And the source code as defined in the view is parsed and executed
>> __each__ time reduce is called on the changed docs. I think the intent
>> was that the view server has no clue when _design views actually
>> change. So couchdb passes in the actual string corresponding to the
>> view function each time it wants couchjs to reduce a bunch of
>> key/values. I suppose you could precompile these functions in the view
>> server and have it return "handles" to couchdb that can be used at a
>> later stage.
>
> We've discussed a little on IRC how to optimize this. The plan so
> far is to maintain, in CouchDB's ets tables of couchjs processes, a
> list of functions known by each process (where functions are named by
> their md5 hash). When a function is passed to a query server, it'll be
> passed along with its name, the first time. If a server already knows
> about the function, Couch will just pass the name in subsequent
> requests. If they somehow get out of sync CouchDB can just kill that
> server and start fresh.
>
> This also means that if you have lots of dbs with the same views in
> them, you get nice reuse of couchjs processes.
>
> Anyone feel like implementing? ;)

*cough* …maybe dev@.

Cheers
Jan

Re: helper functions in reduce

Posted by Chris Anderson <jc...@apache.org>.
On Tue, Mar 10, 2009 at 6:18 PM, kowsik <ko...@gmail.com> wrote:
> And the source code as defined in the view is parsed and executed
> __each__ time reduce is called on the changed docs. I think the intent
> was that the view server has no clue when _design views actually
> change. So couchdb passes in the actual string corresponding to the
> view function each time it wants couchjs to reduce a bunch of
> key/values. I suppose you could precompile these functions in the view
> server and have it return "handles" to couchdb that can be used at a
> later stage.

We've discussed a little on IRC how to optimize this. The plan so
far is to maintain, in CouchDB's ets tables of couchjs processes, a
list of functions known by each process (where functions are named by
their md5 hash). When a function is passed to a query server, it'll be
passed along with its name, the first time. If a server already knows
about the function, Couch will just pass the name in subsequent
requests. If they somehow get out of sync CouchDB can just kill that
server and start fresh.

This also means that if you have lots of dbs with the same views in
them, you get nice reuse of couchjs processes.
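
To make the idea concrete, here is a minimal sketch of the query-server half of that plan. Nothing here is existing CouchDB code: `functionName`, `makeFunctionCache`, `lookup`, and `compiles` are hypothetical names, and it assumes a runtime with Node's `crypto` module for the md5 hash.

```javascript
var crypto = require('crypto');

// Name a function by the md5 hash of its source text.
function functionName(source) {
  return crypto.createHash('md5').update(source).digest('hex');
}

// Hypothetical per-process cache of compiled functions, keyed by name.
function makeFunctionCache() {
  var known = Object.create(null);
  var compileCount = 0;
  return {
    // The source is only needed the first time; later calls pass just the name.
    lookup: function (name, source) {
      if (!(name in known)) {
        if (source === undefined) {
          // Out of sync: the caller believes we know a function that we do not.
          throw new Error('unknown function: ' + name);
        }
        known[name] = new Function('return (' + source + ');')();
        compileCount += 1;
      }
      return known[name];
    },
    compiles: function () { return compileCount; }
  };
}

var cache = makeFunctionCache();
var source = 'function (keys, vals, rereduce) { return vals.length; }';
var name = functionName(source);
var first = cache.lookup(name, source); // compiled on first sight
var second = cache.lookup(name);        // reused by name only
```

Because the name depends only on the source text, identical views in different databases map to the same cache entry, which is where the process reuse would come from. The thrown error stands in for the "kill that server and start fresh" case.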

Anyone feel like implementing? ;)

Chris


-- 
Chris Anderson
http://jchris.mfdz.com

Re: helper functions in reduce

Posted by Paul Davis <pa...@gmail.com>.
On Tue, Mar 10, 2009 at 9:18 PM, kowsik <ko...@gmail.com> wrote:
> And the source code as defined in the view is parsed and executed
> __each__ time reduce is called on the changed docs. I think the intent
> was that the view server has no clue when _design views actually
> change. So couchdb passes in the actual string corresponding to the
> view function each time it wants couchjs to reduce a bunch of
> key/values. I suppose you could precompile these functions in the view
> server and have it return "handles" to couchdb that can be used at a
> later stage.
>

For future reference, map functions are compiled once per view update.
Reduce functions are compiled multiple times.

As jchris says, there's definitely room for improvement.

> But I don't think anyone has benchmarked the view server to see where
> it spends the most time, which we'd want to know before implementing
> these optimizations.
>
> K.
>
> On Tue, Mar 10, 2009 at 5:10 PM, Jens Alfke <je...@mooseyard.com> wrote:
>>
>> On Mar 9, 2009, at 5:31 PM, Adam Wolff wrote:
>>
>>> but it bothers me a little that those three functions are defined every
>>> time that reduce is run.
>>
>> They're not. They're defined when the source code is parsed, not when it's
>> run. The only difference it makes to have them inside the reduce function is
>> that they can access local variables of that function. There shouldn't be
>> any difference in speed.
>>
>> —Jens
>

Re: helper functions in reduce

Posted by kowsik <ko...@gmail.com>.
And the source code as defined in the view is parsed and executed
__each__ time reduce is called on the changed docs. I think the intent
was that the view server has no clue when _design views actually
change. So couchdb passes in the actual string corresponding to the
view function each time it wants couchjs to reduce a bunch of
key/values. I suppose you could precompile these functions in the view
server and have it return "handles" to couchdb that can be used at a
later stage.
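
A simplified sketch of that behavior, to show where the repeated work comes from. This is not the actual main.js code: `handleReduceLine` is a hypothetical name, and the message shape `["reduce", [sources], [[key, value], ...]]` is an assumption made for the example.

```javascript
// Counts how often a source string gets compiled.
var compileCount = 0;

// Handle one reduce request that arrives as a single line of JSON text.
function handleReduceLine(line) {
  var message = JSON.parse(line);
  var sources = message[1];
  var rows = message[2];
  var keys = rows.map(function (row) { return row[0]; });
  var vals = rows.map(function (row) { return row[1]; });
  var results = sources.map(function (source) {
    // The source string is compiled again on every request, because the
    // server keeps nothing between calls.
    var fn = new Function('return (' + source + ');')();
    compileCount += 1;
    return fn(keys, vals, false);
  });
  return JSON.stringify([true, results]);
}

var line = JSON.stringify([
  'reduce',
  ['function (keys, vals, rereduce) { return vals.length; }'],
  [['a', 1], ['b', 2]]
]);
var reply1 = handleReduceLine(line);
var reply2 = handleReduceLine(line);
// compileCount is now 2: the same source was compiled once per request.
```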

But I don't think anyone has benchmarked the view server to see where
it spends the most time, which we'd want to know before implementing
these optimizations.

K.

On Tue, Mar 10, 2009 at 5:10 PM, Jens Alfke <je...@mooseyard.com> wrote:
>
> On Mar 9, 2009, at 5:31 PM, Adam Wolff wrote:
>
>> but it bothers me a little that those three functions are defined every
>> time that reduce is run.
>
> They're not. They're defined when the source code is parsed, not when it's
> run. The only difference it makes to have them inside the reduce function is
> that they can access local variables of that function. There shouldn't be
> any difference in speed.
>
> —Jens

Re: helper functions in reduce

Posted by Jens Alfke <je...@mooseyard.com>.
On Mar 9, 2009, at 5:31 PM, Adam Wolff wrote:

> but it bothers me a little that those three functions are defined
> every time that reduce is run.

They're not. They're defined when the source code is parsed, not when
it's run. The only difference it makes to have them inside the reduce
function is that they can access local variables of that function.
There shouldn't be any difference in speed.
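
To illustrate the scoping difference, a minimal sketch. All names here (`isLargeOuter`, `reduceWithInner`, `reduceWithOuter`, `threshold`) are hypothetical; the real helpers are elided in the original post.

```javascript
// Helper defined outside: it has to be handed everything it needs.
function isLargeOuter(val, threshold) {
  return val > threshold;
}

function reduceWithInner(keys, vals, rereduce) {
  var threshold = 10; // local variable of the reduce function

  // Helper defined inside: it can read `threshold` directly.
  function isLarge(val) {
    return val > threshold;
  }

  return vals.filter(isLarge).length;
}

function reduceWithOuter(keys, vals, rereduce) {
  var threshold = 10;
  return vals.filter(function (val) {
    return isLargeOuter(val, threshold);
  }).length;
}
```

Both variants give the same answer for the same input; the only thing that changes is how the helper gets at `threshold`.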

—Jens

Re: helper functions in reduce

Posted by kowsik <ko...@gmail.com>.
Another thing to consider: if you look at main.js there's a lot of
toJSON in there, which for a lot of documents takes up a lot of time
(I'm mostly extrapolating on this by looking at the code and the
overall behavior). This might be an ideal candidate to move into
couchjs.c, especially with the native JSON support that FF3 now has.
From what I've read, eval is pretty fast (and is used for each line
read from couchdb to the view server).

https://bugzilla.mozilla.org/show_bug.cgi?id=387522
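
For comparison, a small sketch of the two decoding routes plus native encoding. It assumes an engine that ships the `JSON` object, and the line content is made up for illustration.

```javascript
// One line as it might arrive from CouchDB (shape assumed for this example).
var line = '["map_doc", {"_id": "a1", "type": "post", "tags": ["x", "y"]}]';

// eval-based decoding: wrap in parentheses so the text parses as an expression.
var viaEval = eval('(' + line + ')');

// Native decoding.
var viaNative = JSON.parse(line);

// Native encoding, which could stand in for a script-level toJSON on plain data.
var encoded = JSON.stringify(viaNative);
```

For plain data the two decoders produce the same structure, so the choice would come down to speed and safety rather than results.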

K.

On Mon, Mar 9, 2009 at 7:02 PM, Chris Anderson <jc...@apache.org> wrote:
> On Mon, Mar 9, 2009 at 5:31 PM, Adam Wolff <aw...@gmail.com> wrote:
>> Am I the only person who's worried about this?
>>
>
> I've definitely wondered about it. I'd be curious to see some measurements.
>
> This seems like the sort of thing the JS optimizer could clean up
> really easily. So far I haven't worried too much about view function
> computation time, as it seems to be roughly matched with index
> maintenance time. That is, I've never felt that JS view function
> execution was the bottleneck.
>
> The first thing I'd optimize in your example function is all the extra
> calls to helper1() but I'm guessing it's just an example.
>
> Chris
>
> --
> Chris Anderson
> http://jchris.mfdz.com
>

Re: helper functions in reduce

Posted by Adam Wolff <aw...@gmail.com>.
Thanks for the reply, Chris. That was just a typo with "helper1"; it should've
been:

        for each val in vals
            if ( view.helper1(val) ) result = view.helper2(val, result);

I'll do some measurements when we get closer to production and report back
to the list.

A

On Mon, Mar 9, 2009 at 7:02 PM, Chris Anderson <jc...@apache.org> wrote:

> On Mon, Mar 9, 2009 at 5:31 PM, Adam Wolff <aw...@gmail.com> wrote:
> > Am I the only person who's worried about this?
> >
>
> I've definitely wondered about it. I'd be curious to see some measurements.
>
> This seems like the sort of thing the JS optimizer could clean up
> really easily. So far I haven't worried too much about view function
> computation time, as it seems to be roughly matched with index
> maintenance time. That is, I've never felt that JS view function
> execution was the bottleneck.
>
> The first thing I'd optimize in your example function is all the extra
> calls to helper1() but I'm guessing it's just an example.
>
> Chris
>
> --
> Chris Anderson
> http://jchris.mfdz.com
>

Re: helper functions in reduce

Posted by Chris Anderson <jc...@apache.org>.
On Mon, Mar 9, 2009 at 5:31 PM, Adam Wolff <aw...@gmail.com> wrote:
> Am I the only person who's worried about this?
>

I've definitely wondered about it. I'd be curious to see some measurements.

This seems like the sort of thing the JS optimizer could clean up
really easily. So far I haven't worried too much about view function
computation time, as it seems to be roughly matched with index
maintenance time. That is, I've never felt that JS view function
execution was the bottleneck.
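
For anyone who wants numbers, a rough micro-benchmark sketch along these lines might be a starting point. The helpers are hypothetical stand-ins (the real ones are elided in the original post), it assumes a runtime with `console.log` and `Date.now`, and timings from a loop like this say little about a real view build.

```javascript
// Hypothetical helper declared once, outside the reduce function.
function isEvenOuter(n) { return n % 2 === 0; }

function reduceOuter(keys, vals, rereduce) {
  var count = 0;
  for (var i = 0; i < vals.length; i++) {
    if (isEvenOuter(vals[i])) count += 1;
  }
  return count;
}

function reduceInner(keys, vals, rereduce) {
  // Same helper, but declared inside the reduce function.
  function isEven(n) { return n % 2 === 0; }
  var count = 0;
  for (var i = 0; i < vals.length; i++) {
    if (isEven(vals[i])) count += 1;
  }
  return count;
}

// Time many calls of one reduce variant, in milliseconds.
function timeIt(reduceFn, vals, runs) {
  var start = Date.now();
  for (var i = 0; i < runs; i++) reduceFn(null, vals, false);
  return Date.now() - start;
}

var vals = [];
for (var i = 0; i < 100; i++) vals.push(i);

console.log('outer helper: ' + timeIt(reduceOuter, vals, 100000) + ' ms');
console.log('inner helper: ' + timeIt(reduceInner, vals, 100000) + ' ms');
```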

The first thing I'd optimize in your example function is all the extra
calls to helper1() but I'm guessing it's just an example.

Chris

-- 
Chris Anderson
http://jchris.mfdz.com