Posted to user@couchdb.apache.org by Mathias Leppich <ml...@muhqu.de> on 2011/02/02 08:47:18 UTC
Re: Reporting aggregated data using reduce function
Maybe related: a typical reduce function I use to sum on objects:
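function (keys, values, rereduce) {
  // Sum each property across all values. The output has the same shape
  // as the input, so the same code also handles the rereduce case.
  var sums = {};
  for (var i = 0; i < values.length; i++) {
    for (var k in values[i]) {
      sums[k] = (sums[k] || 0) + values[i][k];
    }
  }
  return sums;
}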
which sums the emitted values per property, as follows:
emit("somekey",{"A":1});
emit("somekey",{"B":2});
emit("somekey",{"A":1,"C":1});
emit("somekey",{"A":1,"B":2});
reduced output:
"somekey": {"A":3,"B":4,"C":1}
but, I guess the array approach is more efficient, as it uses less space by using indexes instead of keys...
- Mathias
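The object-summing reduce above can be exercised outside CouchDB with plain JavaScript. This is a minimal sketch: the name `sumObjects` and the way the values are split into two partial reduces are assumptions for illustration, not part of the original message. It checks that a single reduce pass and a reduce followed by a rereduce give the same totals.

```javascript
// Reduce function from the message above, written with an indexed loop.
// The name sumObjects is hypothetical; CouchDB reduce functions are anonymous.
function sumObjects(keys, values, rereduce) {
  var sums = {};
  for (var i = 0; i < values.length; i++) {
    for (var k in values[i]) {
      sums[k] = (sums[k] || 0) + values[i][k];
    }
  }
  return sums;
}

// The four values emitted for "somekey" in the example.
var emitted = [{A: 1}, {B: 2}, {A: 1, C: 1}, {A: 1, B: 2}];

// Single reduce pass over all emitted values.
var direct = sumObjects(null, emitted, false);

// Two partial reduces, then a rereduce over their outputs.
var left = sumObjects(null, emitted.slice(0, 2), false);
var right = sumObjects(null, emitted.slice(2), false);
var combined = sumObjects(null, [left, right], true);

console.log(JSON.stringify(direct));   // {"A":3,"B":4,"C":1}
console.log(JSON.stringify(combined)); // {"A":3,"B":4,"C":1}
```

Because the reduced object has the same shape as each emitted value, the function does not need to branch on `rereduce`.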
On 31.01.2011, at 12:07, John wrote:
> That's perfect, thanks Robert. Funny how something so simple can be so confusing if the concept is new to you.......
>
> For anyone who searches for how to do a reduce over an array in the future, here's the code:
>
> function(keys, values, rereduce){
>   var total = [0,0];
>   values.forEach(function(value){
>     total[0] += value[0];
>     total[1] += value[1];
>   });
>   return total;
> }
>
> Looks like I might get to add reporting to my successful use cases for Couchdb!
>
> John
>
>
> On 31 Jan 2011, at 09:01, Robert Newson wrote:
>
>> in 1.1, _sum will work for arrays of numbers too (rather than
>> concatenating them as above). for now, just loop over the array of
>> arrays and do the sum yourself.
>>
>> On Mon, Jan 31, 2011 at 1:02 AM, Keith Gable <zi...@gmail.com> wrote:
>>> It sounds like you need a new view for each piece of data.
>>>
>>> by_answered, by_busy, by_time_to_answer, etc.
>>>
>>> Then you'd query each view to get the reduction, and the reduce would be as
>>> simple as _sum.
>>>
>>>
>>>
>>> On Jan 30, 2011, at 5:55 PM, John <jo...@netdev.co.uk> wrote:
>>>
>>>> Hi
>>>>
>>>> I'm looking to extend our usage of couchdb by replacing our mysql
>>>> reporting db.
>>>> Whilst using couchdb successfully for a number of varied use cases I've
>>>> never had to do much with reduce so I'm unsure on how to use it to reduce an
>>>> array of values.
>>>>
>>>> Basically I want to be able to search a database using a composite key and
>>>> retrieving some aggregated information about number of calls, call status,
>>>> avg time to answer and avg duration
>>>>
>>>>
>>>> The following view shows how I'd like it to work:
>>>>
>>>> Key = <Application, Account, Subscription>
>>>> Value = <1, answered, busy, noreply, time to answer, duration>
>>>>
>>>> e.g.
>>>>
>>>> ["NTS", "NetDev", "MySub1"], [1,1,0,0,100,200]
>>>> ["NTS", "NetDev", "MySub1"], [1,1,0,0,150,400]
>>>> ["NTS", "NetDev", "MySub1"], [1,1,0,0,170,500]
>>>> ["NTS", "NetDev", "MySub1"], [1,0,1,0,0,0]
>>>> ["NTS", "NetDev", "MySub1"], [1,0,1,0,0,0]
>>>> ["NTS", "NetDev", "MySub1"], [1,0,0,1,0,0]
>>>> ["NTS", "NetDev", "MySub1"], [1,0,0,1,0,0]
>>>>
>>>> My Reduced output should look like this:
>>>>
>>>> [7,3,2,2,420,1100]
>>>> i.e. 7 calls in total, 3 answered, 2 busy, 2 no reply, the total time for
>>>> time to answer is 420 and the total time for call duration is 1100.
>>>>
>>>> I can then compute the two averages after getting the data back from couch
>>>> i.e. 420/no. of answered calls(3) and 1100/no. of answered calls(3)
>>>>
>>>> I thought that sum(values) would do this for me but it just upsets couch:
>>>>
>>>> Reduce output must shrink more rapidly: Current output:
>>>> '["001,11,11,11,11,11,11,11,11,11,11,101,11,11,11,11,11,11,11,11,11,11,11,101,11,11,11,11,11,11,11,11'...
>>>> (first 100 of 277 bytes)
>>>>
>>>> What should my reduce function look like?
>>>>
>>>> Thanks
>>>>
>>>> John
>>>
>
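As Robert notes in the quoted reply, summing arrays with a plain sum concatenates them instead of adding element by element. The sketch below shows why in plain JavaScript. The helper `naiveSum` is a hypothetical stand-in for any `+=`-based sum and is not CouchDB's actual source; the two rows are taken from the original question.

```javascript
// Hypothetical stand-in for a "+="-based sum helper (not CouchDB's source).
function naiveSum(values) {
  var rv = 0;
  for (var i = 0; i < values.length; i++) {
    // number + array: the array is converted to a string, so "+" concatenates.
    rv += values[i];
  }
  return rv;
}

var rows = [
  [1, 1, 0, 0, 100, 200],
  [1, 1, 0, 0, 150, 400]
];

var out = naiveSum(rows);
console.log(typeof out); // string
console.log(out);        // 01,1,0,0,100,2001,1,0,0,150,400
```

The result is a string that grows with every value added, which is consistent with the "Reduce output must shrink more rapidly" error in the original question.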
Re: Reporting aggregated data using reduce function
Posted by John <jo...@netdev.co.uk>.
That's definitely friendlier on the eye and probably less brittle than my array example. Old habits die hard, and I'm a telecoms guy who can't get used to all this extra memory...
In any case both are useful examples of doing something a bit more complex than the usual examples I've seen for Reduce.
I've had some cracking support from this list and developed some really useful queries which, again, I don't see in the Wiki or the Book. Turning the answers in this mailing list into a knowledge base would be an invaluable aid for people looking at the technology for the first time. I certainly don't mind returning the effort I've received from others here and contributing to that, but where should such examples go? In a section on the Wiki?
Making them easy to find on Google, and having each one show a common problem or pattern that is a bit more than trivial along with a real-world example, would benefit everyone and take some of the strain off this list.
Just a thought but please do reply, anyone, if you have ideas on this or think another approach is better.
John
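One way to make the array approach less brittle is to sum every position instead of hard-coding indexes. This is a minimal sketch under assumptions: the name `sumArrays` is hypothetical, all values are assumed to be arrays of numbers of the same width, and the rows are the first four from the original question. The averages are computed after the reduce, as the original question proposes.

```javascript
// Element-wise sum over arrays of equal width. The output has the same shape
// as each input value, so the same code serves reduce and rereduce.
// The name sumArrays is hypothetical.
function sumArrays(keys, values, rereduce) {
  var total = [];
  values.forEach(function (value) {
    for (var i = 0; i < value.length; i++) {
      total[i] = (total[i] || 0) + value[i];
    }
  });
  return total;
}

// Layout from the thread:
// [1, answered, busy, noreply, time to answer, duration]
var rows = [
  [1, 1, 0, 0, 100, 200],
  [1, 1, 0, 0, 150, 400],
  [1, 1, 0, 0, 170, 500],
  [1, 0, 1, 0, 0, 0]
];

var totals = sumArrays(null, rows, false);

// Averages are computed on the client, dividing by the answered-call count
// and guarding against division by zero.
var answered = totals[1];
var avgTimeToAnswer = answered ? totals[4] / answered : 0;
var avgDuration = answered ? totals[5] / answered : 0;

console.log(JSON.stringify(totals)); // [4,3,1,0,420,1100]
console.log(avgTimeToAnswer);        // 140
```

Keeping only sums and counts in the view and dividing afterwards means partial results can be combined safely, which would not hold if averages were stored directly.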