Posted to user@couchdb.apache.org by Adam Wolff <aw...@gmail.com> on 2009/03/26 23:36:10 UTC

couchdb view performance

Hi everyone,
We are running an alpha version of our software against a couchdb instance
with a handful of documents, and we're seeing response times from our views
of ~500ms. This is measured both within our application, and hitting the
view directly using firebug+firefox.
The view I'm talking about matches about 5 documents and returns about 9K of
data. I'm running:
Apache CouchDB 0.8.1-incubating (LogLevel=info)
Erlang (BEAM) emulator version 5.6.5 [source] [async-threads:0] [hipe]
[kernel-poll:false]

This is all running on my MacBook Pro 2.33GHz Core 2 Duo with 3GB of RAM.

By logging, I can see that my reduce function is running every time I access
the view. The response time is about the same whether I've committed a new
version of one of the documents in the view or not. This surprised me, since
I thought that view results were cached. I've also tried logging the amount
of time actually spent *in* my reduce function, but that appears to be
negligible.

I am seeing some very fast responses from couchdb for straight resource
access -- on the order of 10ms. But all of my views are relatively slow -- even
ones that don't have a reduce step.

So, I'm wondering if I have a bad version, or bad config, or if this is
expected performance. I'm sure things are running faster in trunk, but I
want to get a feel for what kind of performance I can expect from a view
with a fairly complicated reduce step.

Thanks in advance for any advice,
A

Re: couchdb view performance

Posted by Paul Davis <pa...@gmail.com>.
On Fri, Mar 27, 2009 at 2:50 AM, Adam Wolff <aw...@gmail.com> wrote:
> Hi paul,
> My system is being uncooperative - now I'm measuring 100-200ms response
> times, but I'm
> still pretty sure that couch is lagging.
> Anyway, I thought that maybe adding group=true when I queried the view with
> a key made a difference,
> but I wasn't sure. Should that make a difference? What does it mean to query
> with a key and without
> group=true?

You can only query a reduce by key if you have group=true; otherwise
the default behavior is to produce a single row of output. Using
group=true does make a difference, as it needs to make a pass through
the JS VM per key (IIRC). This is what I mentioned about adding
optimizations later.
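
For illustration, the two query forms discussed above could be built as
follows. This is only a sketch: the database, design document and view names
are hypothetical, and the view URL layout differs between CouchDB releases,
so the path below is an assumption. The one convention relied on is that view
keys are passed as JSON-encoded values.

```python
import json
from urllib.parse import urlencode


def view_url(base, key=None, group=False):
    """Build a view query URL; the key is sent as a JSON-encoded value."""
    params = {}
    if key is not None:
        params["key"] = json.dumps(key)  # "adam" becomes "\"adam\""
    if group:
        params["group"] = "true"
    return base + ("?" + urlencode(params) if params else "")


# Hypothetical names; adjust the path to match your CouchDB version.
BASE = "http://localhost:5984/mydb/_design/app/_view/by_user"

print(view_url(BASE))                          # whole view, single reduced row
print(view_url(BASE, key="adam", group=True))  # reduced value for one key
```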

> I've attached my map/reduce. Thanks so much for all the help.

It looks like you're running up against the "too much data" problem, in
that the values you're returning from your reduce are growing too
fast, but it's hard to tell without seeing the data.

> Adam

Re: couchdb view performance

Posted by Adam Wolff <aw...@gmail.com>.
Hi Paul,
My system is being uncooperative - now I'm measuring 100-200ms response
times, but I'm still pretty sure that couch is lagging.

Anyway, I thought that maybe adding group=true when I queried the view with
a key made a difference, but I wasn't sure. Should that make a difference?
What does it mean to query with a key and without group=true?

I've attached my map/reduce. Thanks so much for all the help.
Adam


Re: couchdb view performance

Posted by Paul Davis <pa...@gmail.com>.
On Thu, Mar 26, 2009 at 6:36 PM, Adam Wolff <aw...@gmail.com> wrote:
> Hi everyone,
> We are running an alpha version of our software against a couchdb instance
> with a handful of documents, and we're seeing response times from our views
> of ~500ms. This is measured both within our application, and hitting the
> view directly using firebug+firefox.
> The view I'm talking about matches about 5 documents and returns about 9K of
> data. I'm running:
> Apache CouchDB 0.8.1-incubating (LogLevel=info)
> Erlang (BEAM) emulator version 5.6.5 [source] [async-threads:0] [hipe]
> [kernel-poll:false]
>
> This is all running on my MacBook Pro 2.33GHz Core 2 Duo with 3GB of RAM.
>

You'll definitely want to upgrade to trunk, or to 0.9, which is just now
out for pre-release testing at [1]. 500 ms is way, way slow. Trunk
should help, but there's probably something else going on as well.

> By logging, I can see that my reduce function is running every time I access
> the view. The response time is about the same whether I've committed a new
> version of one of the documents in the view or not. This surprised me, since
> I thought that view results were cached. I've also tried logging the amount
> of time actually spent *in* my reduce function, but that appears to be
> negligible.
>

The reduce function is currently run once per final reduce operation,
generally speaking. If I'm not mistaken, this means that it runs once per
key when group=true and just once when group=false.
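
To make that concrete, here is a toy model of the behavior described above,
written in Python rather than as a real view function. It is an assumption-laden
sketch: it ignores stored partial reductions and rereduce entirely, and
`run_reduce` and `counting_sum` are hypothetical names used only to count how
often the reduce function gets called.

```python
from itertools import groupby


def run_reduce(rows, reduce_fn, group=False):
    """rows: (key, value) pairs sorted by key, as a map function would emit."""
    if not group:
        # group=false: one final reduce over everything, one output row.
        keys = [k for k, _ in rows]
        values = [v for _, v in rows]
        return [(None, reduce_fn(keys, values))]
    out = []
    # group=true: one final reduce (and one output row) per distinct key.
    for key, grp in groupby(rows, key=lambda r: r[0]):
        grp = list(grp)
        out.append((key, reduce_fn([k for k, _ in grp], [v for _, v in grp])))
    return out


calls = []


def counting_sum(keys, values):
    calls.append(len(values))  # record each invocation
    return sum(values)


rows = sorted([("a", 1), ("b", 2), ("a", 3), ("c", 4)])
ungrouped = run_reduce(rows, counting_sum)            # 1 call
grouped = run_reduce(rows, counting_sum, group=True)  # 3 calls, one per key
print(ungrouped, grouped, len(calls))
```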

> I am seeing some very fast responses from couchdb, for straight resource
> access -- on order 10ms. But all of my views are relatively slow -- even
> ones that don't have a reduce step.
>
> So, I'm wondering if I have a bad version, or bad config, or if this is
> expected performance. I'm sure things are running faster in trunk, but I
> want to get a feel for what kind of performance I can expect from a view
> with a fairly complicated reduce step.
>

When you say fairly complicated, what do you mean? There is an output
size constraint for reductions. I.e., reduce functions should return
data that grows slower than log(# keys reduced), because that data is
stored in the internal btree nodes.
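
A rough way to see the difference, sketched in Python instead of as an actual
view function: compare the serialized size of a reduce that returns a count
with one that just collects its inputs. The function names and the sample
values are made up for illustration; only the growth pattern matters.

```python
import json


def reduce_count(values):
    """Well behaved: the output stays tiny no matter how many values come in."""
    return len(values)


def reduce_collect(values):
    """Problematic: the output grows linearly with the number of values."""
    return list(values)


def output_size(reduce_fn, n):
    """Size in characters of the JSON-encoded reduce output for n inputs."""
    values = [{"id": i} for i in range(n)]
    return len(json.dumps(reduce_fn(values)))


for n in (10, 100, 1000):
    print(n, output_size(reduce_count, n), output_size(reduce_collect, n))
```

A reduce shaped like the second one ends up storing ever larger values in the
index as more keys are reduced, which is the case the constraint warns about.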

> Thanks in advance for any advice,
> A
>

Also, the mechanics of reduce calculations have been on the back
burner for a while in terms of keeping those partial reductions around.
I'm not 100% familiar with the entire code path, but I know that
there's definitely room for improvement. The speed optimizations
are being pushed back in favor of pulling in the big features.

If nothing looks obvious, you can try pasting your M/R functions to
see if anyone spots something that looks slow.

HTH,
Paul Davis