Posted to user@couchdb.apache.org by Matteo Caprari <ma...@gmail.com> on 2009/12/09 18:55:47 UTC

Help on grouping data by date

Hello.

I'm learning CouchDB and I'd appreciate it if someone could have a look and
comment on the code and description that follow.

Given a set of documents like {date:"2009-10-11", amount:10500}, I'm
using the view below
and group_level={1,2,3} to group by year, month or day and return the
average amount for each group.

I don't think it's possible to write a rereduce-friendly averaging
function; instead, I return an object
that holds the total amount and the number of rows for each group. The
client has to calculate the average on its own.

_view/by_date?group_level=1 returns one row for each year, with the
total and count for that year
_view/by_date?group_level=2 returns one row for each month, with the
total and count for that month
_view/by_date?group_level=3 returns one row for each day

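// map.js
function(doc) {
	// Skip documents without a date so split() cannot throw.
	if (doc.date) {
		// Key is [year, month, day]; group_level truncates it.
		emit(doc.date.split('-'), doc.amount);
	}
}

// reduce.js
function(keys, values, rereduce) {
	var result = {tot:0, count:0};
	if (rereduce) {
		// values holds {tot, count} objects from earlier reduce calls.
		for (var idx = 0; idx < values.length; idx++) {
			result.tot += values[idx].tot;
			result.count += values[idx].count;
		}
	}
	else {
		// values holds the raw amounts emitted by the map function.
		result.tot = sum(values);
		result.count = values.length;
	}
	return result;
}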
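
To make the grouping and the client-side averaging concrete, here is a rough sketch that can be run outside CouchDB (for example with Node). It is only an approximation of what the view server does: queryView and withAverage are hypothetical helper names, the sample documents are made up, and the built-in sum() is written out by hand.

```javascript
// Map step: split "YYYY-MM-DD" into [year, month, day] and emit the amount.
function map(doc, emit) {
  emit(doc.date.split('-'), doc.amount);
}

// Reduce step: accumulate a total and a row count, for raw values or partials.
function reduce(keys, values, rereduce) {
  var result = { tot: 0, count: 0 };
  for (var i = 0; i < values.length; i++) {
    if (rereduce) {
      result.tot += values[i].tot;
      result.count += values[i].count;
    } else {
      result.tot += values[i];
      result.count += 1;
    }
  }
  return result;
}

// Hypothetical stand-in for a group_level query: truncate each key to
// `level` elements, then reduce the values that share a truncated key.
function queryView(docs, level) {
  var groups = {};
  docs.forEach(function (doc) {
    map(doc, function (key, value) {
      var groupKey = key.slice(0, level).join('-');
      (groups[groupKey] = groups[groupKey] || []).push(value);
    });
  });
  return Object.keys(groups).sort().map(function (groupKey) {
    return { key: groupKey.split('-'), value: reduce(null, groups[groupKey], false) };
  });
}

// Client-side step: derive the average from each row's total and count.
function withAverage(rows) {
  return rows.map(function (row) {
    return { key: row.key, average: row.value.tot / row.value.count };
  });
}

// Made-up sample documents in the shape described above.
var docs = [
  { date: '2009-10-11', amount: 10500 },
  { date: '2009-10-12', amount: 9500 },
  { date: '2009-11-01', amount: 4000 },
  { date: '2010-01-05', amount: 2000 }
];

var byYear = withAverage(queryView(docs, 1));
var byMonth = withAverage(queryView(docs, 2));
console.log(JSON.stringify(byYear));
console.log(JSON.stringify(byMonth));
```

With these four documents the year rows average 8000 (2009) and 2000 (2010), and the month rows average 10000, 4000 and 2000.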

Thanks :)

-- 
:Matteo Caprari
matteo.caprari@gmail.com

Re: Help on grouping data by date

Posted by Matteo Caprari <ma...@gmail.com>.
Ok.

Got it now, thanks :)

On Wed, Dec 9, 2009 at 8:02 PM, Paul Davis <pa...@gmail.com> wrote:
> On Wed, Dec 9, 2009 at 2:43 PM, Matteo Caprari <ma...@gmail.com> wrote:
>> Hi Paul.
>>
>> This is interesting. I assumed that the rereduce may be fed
>> output from the rereduce itself,
>> while you seem to imply that the output of the rereduce is always a
>> final value (i.e. one row in the view output).
>>
>> Am I misreading your mail?
>>
>> Thanks for the pointers.
>>
>
> You misread a bit. A rereduce does take multiple values as inputs. The
> note about just recalculating the average was that you can do it
> server side to avoid it client side, i.e. exactly like you're doing,
> but add in the final calculation so the client doesn't have to. The
> reduce.js test has an explicit example of doing this with the
> standard deviation, where it always calculates a 'partial' standard
> deviation.
>
> HTH,
> Paul Davis
>




Re: Help on grouping data by date

Posted by Paul Davis <pa...@gmail.com>.
On Wed, Dec 9, 2009 at 2:43 PM, Matteo Caprari <ma...@gmail.com> wrote:
> Hi Paul.
>
> This is interesting. I assumed that the rereduce may be fed
> output from the rereduce itself,
> while you seem to imply that the output of the rereduce is always a
> final value (i.e. one row in the view output).
>
> Am I misreading your mail?
>
> Thanks for the pointers.
>

You misread a bit. A rereduce does take multiple values as inputs. The
note about just recalculating the average was that you can do it
server side to avoid it client side, i.e. exactly like you're doing,
but add in the final calculation so the client doesn't have to. The
reduce.js test has an explicit example of doing this with the
standard deviation, where it always calculates a 'partial' standard
deviation.
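
As a rough illustration of the 'partial' idea, the sketch below keeps a count, a sum and a sum of squares, and recomputes the average and standard deviation on every call. It is not the Futon test code, just a minimal version written for this thread; the field names (count, sum, sumsq, average, stddev) and the sample numbers are assumptions, and it computes the population standard deviation.

```javascript
// Reduce that always returns a combinable partial state plus derived stats.
function reduce(keys, values, rereduce) {
  var result = { count: 0, sum: 0, sumsq: 0 };
  for (var i = 0; i < values.length; i++) {
    if (rereduce) {
      // Inputs are partial states produced by earlier reduce/rereduce calls.
      result.count += values[i].count;
      result.sum += values[i].sum;
      result.sumsq += values[i].sumsq;
    } else {
      // Inputs are raw numbers emitted by the map function.
      result.count += 1;
      result.sum += values[i];
      result.sumsq += values[i] * values[i];
    }
  }
  // Derived values are recomputed from the partial state each time, so it
  // does not matter whether this output is final or fed into another rereduce.
  result.average = result.sum / result.count;
  result.stddev = Math.sqrt(
    result.sumsq / result.count - result.average * result.average
  );
  return result;
}

// Three first-pass reductions over made-up chunks of values.
var a = reduce(null, [2, 4], false);
var b = reduce(null, [4, 4], false);
var c = reduce(null, [5, 5, 7, 9], false);

// A rereduce whose output is itself an input to a second rereduce.
var ab = reduce(null, [a, b], true);
var all = reduce(null, [ab, c], true);

// Reducing everything in a single pass gives the same answer.
var direct = reduce(null, [2, 4, 4, 4, 5, 5, 7, 9], false);

console.log(all.average, all.stddev);
console.log(direct.average, direct.stddev);
```

Because only count, sum and sumsq are ever added together, the stale average and stddev carried by intermediate results are simply ignored and overwritten at the next level.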

HTH,
Paul Davis

Re: Help on grouping data by date

Posted by Matteo Caprari <ma...@gmail.com>.
Hi Paul.

This is interesting. I assumed that the rereduce may be fed
output from the rereduce itself,
while you seem to imply that the output of the rereduce is always a
final value (i.e. one row in the view output).

Am I misreading your mail?

Thanks for the pointers.

On Wed, Dec 9, 2009 at 6:53 PM, Paul Davis <pa...@gmail.com> wrote:
> On Wed, Dec 9, 2009 at 12:55 PM, Matteo Caprari
> <ma...@gmail.com> wrote:
>> Hello.
>>
>> I'm learning CouchDB and I'd appreciate it if someone could have a look and
>> comment on the code and description that follow.
>>
>> Given a set of documents like {date:"2009-10-11", amount:10500}, I'm
>> using the view below
>> and group_level={1,2,3} to group by year, month or day and return the
>> average amount for each group.
>>
>> I don't think it's possible to write a rereduce-friendly averaging
>> function; instead, I return an object
>> that holds the total amount and the number of rows for each group. The
>> client has to calculate the average on its own.
>>
>> _view/by_date?group_level=1 returns one row for each year, with the
>> total and count for that year
>> _view/by_date?group_level=2 returns one row for each month, with the
>> total and count for that month
>> _view/by_date?group_level=3 returns one row for each day
>>
>> //map.js:
>> function(doc) {
>>        emit(doc.date.split('-'), doc.amount);
>> }
>>
>> // reduce.js
>> function(keys, values, rereduce) {
>>        if (rereduce) {
>>                var result = {tot:0, count:0};
>>                for (var idx in values) {
>>                        result.tot += values[idx].tot;
>>                        result.count += values[idx].count;
>>                }
>>                return result;
>>        }
>>        else {
>>                var result = {tot:sum(values), count:values.length};
>>                return result;
>>        }
>> }
>>
>> Thanks :)
>>
>> --
>> :Matteo Caprari
>> matteo.caprari@gmail.com
>>
>
> Matteo,
>
> That all looks pretty spot on, though there's nothing keeping you from
> adding a line in your rereduce code to just do result.average =
> result.tot / result.count before returning. You can see a similar
> calculation for standard deviation in the reduce.js Futon tests.
>
> There's also the view snippets page on the wiki [1].
>
> Paul Davis
>
> [1] http://wiki.apache.org/couchdb/View_Snippets
>




Re: Help on grouping data by date

Posted by Paul Davis <pa...@gmail.com>.
On Wed, Dec 9, 2009 at 12:55 PM, Matteo Caprari
<ma...@gmail.com> wrote:
> Hello.
>
> I'm learning CouchDB and I'd appreciate it if someone could have a look and
> comment on the code and description that follow.
>
> Given a set of documents like {date:"2009-10-11", amount:10500}, I'm
> using the view below
> and group_level={1,2,3} to group by year, month or day and return the
> average amount for each group.
>
> I don't think it's possible to write a rereduce-friendly averaging
> function; instead, I return an object
> that holds the total amount and the number of rows for each group. The
> client has to calculate the average on its own.
>
> _view/by_date?group_level=1 returns one row for each year, with the
> total and count for that year
> _view/by_date?group_level=2 returns one row for each month, with the
> total and count for that month
> _view/by_date?group_level=3 returns one row for each day
>
> //map.js:
> function(doc) {
>        emit(doc.date.split('-'), doc.amount);
> }
>
> // reduce.js
> function(keys, values, rereduce) {
>        if (rereduce) {
>                var result = {tot:0, count:0};
>                for (var idx in values) {
>                        result.tot += values[idx].tot;
>                        result.count += values[idx].count;
>                }
>                return result;
>        }
>        else {
>                var result = {tot:sum(values), count:values.length};
>                return result;
>        }
> }
>
> Thanks :)
>
> --
> :Matteo Caprari
> matteo.caprari@gmail.com
>

Matteo,

That all looks pretty spot on, though there's nothing keeping you from
adding a line in your rereduce code to just do result.average =
result.tot / result.count before returning. You can see a similar
calculation for standard deviation in the reduce.js Futon tests.
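
Something along these lines might work, as a sketch rather than tested view code. It sets the average in both branches, on the assumption that you want it present even on rows that never go through a rereduce; the sample amounts are made up, and sum() is written out so the snippet runs outside CouchDB.

```javascript
// Reduce that carries tot and count for combining, plus a derived average.
function reduce(keys, values, rereduce) {
  var result = { tot: 0, count: 0 };
  for (var i = 0; i < values.length; i++) {
    if (rereduce) {
      // Combine the partial totals and counts; ignore their old averages.
      result.tot += values[i].tot;
      result.count += values[i].count;
    } else {
      // Raw amounts emitted by the map function.
      result.tot += values[i];
      result.count += 1;
    }
  }
  // Final calculation done server side, on every call.
  result.average = result.tot / result.count;
  return result;
}

var first = reduce(null, [10500, 9500], false);   // average 10000
var second = reduce(null, [4000], false);         // average 4000
var combined = reduce(null, [first, second], true);

console.log(combined.tot, combined.count, combined.average);
```

Here combined.average is 24000 / 3 = 8000, whereas averaging the two partial averages would give 7000. That difference is the reason tot and count still have to travel with the result.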

There's also the view snippets page on the wiki [1].

Paul Davis

[1] http://wiki.apache.org/couchdb/View_Snippets