Posted to user@couchdb.apache.org by Manokaran K <ma...@gmail.com> on 2010/05/11 10:04:48 UTC

view generation speed

Hi,

I am testing an app with approximately 400k docs (exam results). There are
only two views in the design doc:

map:
function(doc) {
      if(doc.dtype === 'result'){
            var subject_total, scr, scores = doc.scores;
            var total = 0, len = scores.length; //scores.length is 6 to 12
            for(var i=0; i<len; i++){
                  scr = scores[i];
                  subject_total = scr.th + (scr.pr ? scr.pr : 0);
                  total += subject_total;
                  emit([scr.code, subject_total], 1);
            }
            emit(['to', total], 1);
      }
}

reduce:
function(keys, values, rereduce){
      return sum(values);
}

I use the above map-reduce to calculate the frequencies for each subject
score as well as the aggregate score.
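
To make the shape of the output concrete, here is a minimal sketch that runs the same map logic outside CouchDB against a few made-up documents and tallies the emitted keys, roughly what a grouped query on the view would return. The sample documents and the names runMap and frequencies are hypothetical, for illustration only; inside CouchDB the view server supplies emit and does the grouping.

```javascript
// Hypothetical sample documents, shaped like the 'result' docs described above.
const sampleDocs = [
  { dtype: 'result', scores: [{ code: 'ma', th: 60, pr: 20 }, { code: 'en', th: 70 }] },
  { dtype: 'result', scores: [{ code: 'ma', th: 50, pr: 30 }, { code: 'en', th: 65 }] },
  { dtype: 'subject_def', name: 'Maths' } // skipped by the map logic
];

// Stand-in for the view server: collects the rows the map logic emits.
function runMap(docs) {
  const rows = [];
  const emit = (key, value) => rows.push({ key, value });
  for (const doc of docs) {
    if (doc.dtype === 'result') {
      let total = 0;
      for (const scr of doc.scores) {
        const subjectTotal = scr.th + (scr.pr ? scr.pr : 0);
        total += subjectTotal;
        emit([scr.code, subjectTotal], 1);
      }
      emit(['to', total], 1);
    }
  }
  return rows;
}

// Stand-in for a grouped reduce: sums the values of rows that share a key.
function frequencies(rows) {
  const counts = {};
  for (const { key, value } of rows) {
    const k = JSON.stringify(key);
    counts[k] = (counts[k] || 0) + value;
  }
  return counts;
}

const freq = frequencies(runMap(sampleDocs));
console.log(freq);
```

Each key is a [subject code, score] pair (or ['to', aggregate]), and the reduced value is the number of students with that score.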

My problem is that it takes a very long time to generate the view. After two
hours (on a dual-core 2.2GHz machine with 3GB RAM) only 300K docs have been
indexed! Is that on par, or is something wrong?

Some time back JChris mentioned using the Erlang 'sum' call to speed
things up, in response to another mail, but I cannot locate that now. Can
someone point me to it?

I have one other view in the design_doc:

map:
function(doc) {
      if(doc.dtype === 'subject_def'){   //only about 20-25 docs of this type
            var val = {};
            for(var prop in doc){
                  if(prop !== '_id' && prop !== '_rev' && prop !== 'dtype'){
                        val[prop] = doc[prop];
                  }
            }
            emit(doc._id, val);
      }
}

No reduce for this.


Thanks,
mano



-- 
Lord, give us the wisdom to utter words that are gentle and tender, for
tomorrow we may have to eat them.
   -Sen. Morris Udall

Re: view generation speed

Posted by Sebastian Cohnen <se...@googlemail.com>.
On 11.05.2010, at 11:37, Ronen Narkis wrote:

> Why aren't the faster functions always used? Is there a reason to prefer
> sum over _sum?

The only reason I can think of is that you want more control over what gets summed up.
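
As an illustration of "more control", here is a minimal sketch of a custom JavaScript reduce that sums only some of the emitted values, which _sum cannot do. The value shape ({marks, absent}) and the name selectiveSum are hypothetical and not from this thread. The sum helper is defined locally only so the sketch runs standalone; inside CouchDB the view server provides sum.

```javascript
// CouchDB's JavaScript view server provides sum(); defined here so the sketch runs on its own.
function sum(values) {
  return values.reduce((acc, v) => acc + v, 0);
}

// Hypothetical custom reduce: adds up `marks`, skipping rows flagged as absent.
function selectiveSum(keys, values, rereduce) {
  if (rereduce) {
    // values are partial totals returned by earlier calls to this function
    return sum(values);
  }
  // values are the objects emitted by the (hypothetical) map function
  return sum(values.filter(v => !v.absent).map(v => v.marks));
}

// Simulate two first-pass reductions followed by a rereduce over their results.
const partialA = selectiveSum(null, [{ marks: 40 }, { marks: 0, absent: true }], false);
const partialB = selectiveSum(null, [{ marks: 55 }, { marks: 35 }], false);
const combined = selectiveSum(null, [partialA, partialB], true);
console.log(combined);
```

The price of this flexibility is that the function runs in the JavaScript view server, so it gives up the speed advantage of the built-in.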

Re: view generation speed

Posted by Ronen Narkis <na...@gmail.com>.
Why aren't the faster functions always used? Is there a reason to prefer
sum over _sum?

Ronen

Re: view generation speed

Posted by Sebastian Cohnen <se...@googlemail.com>.
Great!

Currently there are only three built-in reduce functions (which I documented here: http://wiki.apache.org/couchdb/Built-In_Reduce_Functions ).
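
For readers without the wiki page at hand: to my knowledge the three built-ins at the time were _sum, _count and _stats. The sketch below gives rough JavaScript equivalents of what each computes over a list of numeric values, for the first-pass (non-rereduce) case only. The names builtinSketch and demoValues are made up for illustration, and this is not how CouchDB implements them (the built-ins run natively in Erlang).

```javascript
// Rough JavaScript equivalents of the built-ins, first-pass (non-rereduce) case only.
const builtinSketch = {
  // _sum: adds up the emitted values
  _sum: values => values.reduce((a, v) => a + v, 0),
  // _count: number of rows, regardless of their values
  _count: values => values.length,
  // _stats: summary statistics over numeric values
  _stats: values => ({
    sum: values.reduce((a, v) => a + v, 0),
    count: values.length,
    min: Math.min(...values),
    max: Math.max(...values),
    sumsqr: values.reduce((a, v) => a + v * v, 0)
  })
};

const demoValues = [3, 5, 8];
console.log(builtinSketch._sum(demoValues));
console.log(builtinSketch._count(demoValues));
console.log(builtinSketch._stats(demoValues));
```

In a design document each of these is selected by putting its name, as a plain string, in the view's reduce field.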

If raw performance is important for you, you could also consider using native Erlang views (instead of JavaScript ones).


Re: view generation speed

Posted by Manokaran K <ma...@gmail.com>.
On Tue, May 11, 2010 at 1:40 PM, Sebastian Cohnen <
sebastiancohnen@googlemail.com> wrote:

> _sum
>
> instead of
>
> > function(keys, values, rereduce){
> >      return sum(values);
> > }
>
> I think the wiki is missing some information on built-in reduce functions.
> I'll add this today.
>
>
Thanks a ton :-)

The improvement was dramatic. What took close to 3 hours took just 20 mins
using _sum!!

Waiting to see what other tricks are available.

regds
mano

Re: view generation speed

Posted by Sebastian Cohnen <se...@googlemail.com>.
_sum

instead of

> function(keys, values, rereduce){
>      return sum(values);
> }

I think the wiki is missing some information on built-in reduce functions. I'll add this today.
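
To spell the change out, here is a minimal sketch of how the view might look in a design document once the JavaScript reduce is replaced by the built-in. The _id and the view name score_frequencies are made up for illustration; the map function is the one from the original post, stored as a string, and the reduce is just the string "_sum".

```javascript
// Hypothetical design document; the _id and view name are assumptions for illustration.
const designDoc = {
  _id: '_design/results',
  language: 'javascript',
  views: {
    score_frequencies: {
      // The map function is stored as source text; it is not executed here.
      map: function (doc) {
        if (doc.dtype === 'result') {
          var subject_total, scr, scores = doc.scores;
          var total = 0, len = scores.length;
          for (var i = 0; i < len; i++) {
            scr = scores[i];
            subject_total = scr.th + (scr.pr ? scr.pr : 0);
            total += subject_total;
            emit([scr.code, subject_total], 1);
          }
          emit(['to', total], 1);
        }
      }.toString(),
      // Built-in reduce, referenced by name instead of a JavaScript function.
      reduce: '_sum'
    }
  }
};

console.log(JSON.stringify(designDoc, null, 2));
```

Because the reduce is referenced by name, the reduction runs inside CouchDB itself and the values no longer have to be passed to the JavaScript view server, which is presumably where the speedup reported in this thread comes from.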


