Posted to user@couchdb.apache.org by Daniel Truemper <tr...@googlemail.com> on 2010/03/02 18:45:40 UTC

Re: Hitting the reduce overflow boundary

Moving this to user@c.a.o?!

> I'm using this view at work:
> http://friendpaste.com/3CtGyg5IczkxNV3rWRdalB
> 
> Unfortunately, after adding a new document today, it stopped working
> due to the reduce overflow error checking. In this case, I think the
> document I added had 20 currencies to be inserted into the interest
> value, whereas the former largest doc had only 9.
> 
> For now, I've just disabled the reduce_limit, but it would be nice if
> the heuristic could be tuned (as the comment in default.ini says). If
> there's some better way of writing my view, that would be fine as
> well, of course.
I think your view could use some improvement, but it is kind of hard to guess what you are trying to do. Maybe you could provide a little more detail about it!?

Best
Daniel

Re: Hitting the reduce overflow boundary

Posted by J Chris Anderson <jc...@gmail.com>.
On Mar 5, 2010, at 8:23 AM, Dirkjan Ochtman wrote:

> On Fri, Mar 5, 2010 at 12:17, Dirkjan Ochtman <dj...@gmail.com> wrote:
>> I would really like to have someone from the dev team speak up on this
>> one, since I'd kind of like to re-enable the reduce_limit option, but
>> I don't think this view should be classified as overflowing.
> 
> Happily, I found Adam on IRC, and he explained this to me:
> 
> 17:05 <+kocolosk> djc: so the current reduce_limit calculation is in main.js
> 17:06 <+kocolosk> the JSONified reduction needs to be less than 200 bytes, and
>                  it needs to be less than half of the size of the input map
>                  values
> 17:06 <+kocolosk> you could try tweaking those to see which condition you're
>                  failing
> 
> The way I see it, the reduce phase should work such that the result
> from a collection of documents is smaller than, or not much larger
> than, the largest single object in the input set. This way, you still
> prevent the unbounded growth the check is meant to guard against. Such
> a rule should also work on slightly larger inputs, because that would
> just mean a larger constant, not exponential growth.
> 
> So I see two problems with the current rule:
> 
> - it has a fixed limit at 200 bytes, which isn't very reasonable,
> because a larger size doesn't necessarily mean there's unbounded
> growth going on
> - it assumes that all the values in the input map have roughly equal
> size, which isn't really a requirement
> 
> Am I crazy, or would a scheme like the one I proposed above be an improvement?

Definitely. A patch to make the reduce_overflow_threshold configurable (with a default of 200 bytes) would be a major improvement, and not hard to do.
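
Something along these lines in the config, say (just a sketch: the
[query_server_config] section and reduce_limit exist today, the
threshold option is only a placeholder name until someone writes the
patch):

    [query_server_config]
    reduce_limit = true
    ; hypothetical new knob for the absolute-size part of the heuristic:
    ; reduce_overflow_threshold = 200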

Chris

> 
> Cheers,
> 
> Dirkjan


Re: Hitting the reduce overflow boundary

Posted by Dirkjan Ochtman <dj...@gmail.com>.
On Fri, Mar 5, 2010 at 12:17, Dirkjan Ochtman <dj...@gmail.com> wrote:
> I would really like to have someone from the dev team speak up on this
> one, since I'd kind of like to re-enable the reduce_limit option, but
> I don't think this view should be classified as overflowing.

Happily, I found Adam on IRC, and he explained this to me:

17:05 <+kocolosk> djc: so the current reduce_limit calculation is in main.js
17:06 <+kocolosk> the JSONified reduction needs to be less than 200 bytes, and
                  it needs to be less than half of the size of the input map
                  values
17:06 <+kocolosk> you could try tweaking those to see which condition you're
                  failing
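
In other words, the check boils down to something like this (my
paraphrase of the heuristic as Adam described it, using a throwaway
helper; not the exact main.js source):

    // reduceJson: JSON.stringify() of the reduce output
    // inputJson:  JSON.stringify() of the map values fed to this reduce call
    function overflows(reduceJson, inputJson) {
        // only flagged when the output is both "large" in absolute terms
        // and more than half the size of its input
        return reduceJson.length > 200 &&
               reduceJson.length * 2 > inputJson.length;
    }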

The way I see it, the reduce phase should work such that the result
from a collection of documents is smaller than, or not much larger
than, the largest single object in the input set. This way, you still
prevent the unbounded growth the check is meant to guard against. Such
a rule should also work on slightly larger inputs, because that would
just mean a larger constant, not exponential growth.
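
A check along those lines might look like this (just a sketch; the
slack factor of 2 is picked out of thin air):

    // values: the individual JSON-encoded map values passed to reduce
    // reduceJson: the JSON-encoded reduce output
    function overflows(reduceJson, values) {
        var largest = 0;
        for (var i = 0; i < values.length; i++) {
            largest = Math.max(largest, values[i].length);
        }
        // allow the output to be somewhat bigger than the largest single
        // input value, but not arbitrarily so
        return reduceJson.length > 2 * largest;
    }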

So I see two problems with the current rule:

- it has a fixed limit at 200 bytes, which isn't very reasonable,
because a larger size doesn't necessarily mean there's unbounded
growth going on
- it assumes that all the values in the input map have roughly equal
size, which isn't really a requirement

Am I crazy, or would a scheme like the one I proposed above be an improvement?

Cheers,

Dirkjan

Re: Hitting the reduce overflow boundary

Posted by Dirkjan Ochtman <dj...@gmail.com>.
On Tue, Mar 2, 2010 at 18:45, Daniel Truemper <tr...@googlemail.com> wrote:
> Moving this to user@c.a.o?!

Not sure where it makes most sense, since I consider this a bug that
merits some discussion.

> I think your view could use some improvement, but it is kind of hard to guess what you are trying to do. Maybe you could provide a little more detail about it!?

I have transfer documents that contain an arbitrary number of
currencies in which some amount got transferred. In my view, I'd like
to classify documents into one of two buckets and aggregate all
transfers in that bucket into a single transfer object that sums the
transferred amount per currency.
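
Roughly, the view looks something like this (reconstructed from memory,
and the field names here are guesses; the real thing is in the
friendpaste link from the first mail):

    // map: one of two buckets as the key, a {currency: amount} object as the value
    function(doc) {
        if (doc.type === "transfer") {
            var bucket = doc.interest ? "interest" : "other"; // guessed classification
            emit(bucket, doc.amounts); // e.g. {"EUR": 12.50, "USD": 3.00}
        }
    }

    // reduce: sum the amounts per currency across all values in the group;
    // the same code works for rereduce since the values keep the same shape
    function(keys, values, rereduce) {
        var totals = {};
        for (var i = 0; i < values.length; i++) {
            for (var cur in values[i]) {
                totals[cur] = (totals[cur] || 0) + values[i][cur];
            }
        }
        return totals;
    }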

I would really like to have someone from the dev team speak up on this
one, since I'd kind of like to re-enable the reduce_limit option, but
I don't think this view should be classified as overflowing.

Cheers,

Dirkjan
