Posted to user@couchdb.apache.org by Gregory Tappero <co...@gmail.com> on 2010/03/06 16:26:39 UTC

Map reduce and weird output question

Hello everyone,

I have documents of the following EdoPing type:

{
   "_id": "22add509c1e7bc286832edc5bfe99ce5",
   "_rev": "1-49663ab8778f445e481143120d0d7086",
   "doc_type": "EdoPing",
   "em_uname": "student1",
   "em_gid": 1,
   "created_at": "2010-03-03T14:18:19Z",
   "em_ip": "92.154.70.148",
   "em_type": 0,
   "room_url": "z2fudcvcrfa3reaydatre",
   "room_users": [
       "tutorsbox"
   ]
}

I would like to count the unique em_uname values among em_type 0 documents for a given day.

For now I am using this map/reduce: http://friendpaste.com/5xUUQ26bbl9d5KRB8eojwe

Date.prototype.setRFC3339 = function(dString){
    var regexp =
/(\d\d\d\d)(-)?(\d\d)(-)?(\d\d)(T)?(\d\d)(:)?(\d\d)(:)?(\d\d)(\.\d+)?(Z|([+-])(\d\d)(:)?(\d\d))/;

    if (dString.toString().match(new RegExp(regexp))) {
        var d = dString.match(new RegExp(regexp));
        var offset = 0;

        this.setUTCDate(1);
        this.setUTCFullYear(parseInt(d[1],10));
        this.setUTCMonth(parseInt(d[3],10) - 1);
        this.setUTCDate(parseInt(d[5],10));
        this.setUTCHours(parseInt(d[7],10));
        this.setUTCMinutes(parseInt(d[9],10));
        this.setUTCSeconds(parseInt(d[11],10));
        if (d[12])
            this.setUTCMilliseconds(parseFloat(d[12]) * 1000);
        else
            this.setUTCMilliseconds(0);
        if (d[13] != 'Z') {
            offset = (d[15] * 60) + parseInt(d[17],10);
            offset *= ((d[14] == '-') ? -1 : 1);
            this.setTime(this.getTime() - offset * 60 * 1000);
        }
    } else {
        this.setTime(Date.parse(dString));
    }
    return this;
};

var seenKeys = new Array();

function(doc) {


    if (doc.doc_type=="EdoPing" && doc.em_type==0) {
        date = new Date().setRFC3339(doc.created_at);
        var key = doc.em_uname + String(doc.created_at).substring(0,10);
        if (seenKeys[key] ==  undefined  ) {
            seenKeys[key] = 1;
            emit([date.getFullYear(), parseInt(date.getMonth())+1,
date.getDate() ] , 1);
         }
    }
}


It works when the view is saved for the first time, but as soon as new
EdoPings get added it starts emitting rows it has already seen (same key),
which produces faulty counts.

Is it OK to have seenKeys outside of the function(doc)?
What other way could I use to get the same results?

Thanks,

Greg
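
A likely explanation for the duplicate rows, sketched under two assumptions: that CouchDB updates a view incrementally (only new or changed documents are passed to the map function), and that the JavaScript view server process, together with any global such as seenKeys, can be restarted between updates. The harness below is hypothetical; makeMap, runBatch, ping and the sample documents are illustrative names, not part of CouchDB, and the emitted key is simplified to the day string.

```javascript
// Hypothetical harness, not CouchDB itself: each call to makeMap() stands for
// a freshly started view server process with its own empty seenKeys.
function makeMap(emit) {
  var seenKeys = {};
  return function (doc) {
    if (doc.doc_type == "EdoPing" && doc.em_type == 0) {
      var day = String(doc.created_at).substring(0, 10);
      var key = doc.em_uname + day;
      if (seenKeys[key] === undefined) {
        seenKeys[key] = 1;
        emit(day, 1);
      }
    }
  };
}

// Run one indexing pass over a batch of documents and return the emitted rows.
function runBatch(docs) {
  var rows = [];
  var map = makeMap(function (key, value) { rows.push([key, value]); });
  docs.forEach(map);
  return rows;
}

// Build a minimal EdoPing document for the demonstration.
function ping(user, createdAt) {
  return { doc_type: "EdoPing", em_type: 0, em_uname: user, created_at: createdAt };
}

var firstBatch = [ping("student1", "2010-03-03T14:18:19Z"),
                  ping("student1", "2010-03-03T15:00:00Z")];
var laterBatch = [ping("student1", "2010-03-03T16:30:00Z")];

// Initial build: both documents pass through the same process, one row emitted.
var initialRows = runBatch(firstBatch);
// Later update: only the new document is mapped, with an empty seenKeys,
// so the same user and day is emitted a second time.
var updateRows = runBatch(laterBatch);
var indexRows = initialRows.concat(updateRows);

console.log(indexRows.length); // 2 rows for one user on one day
```

Under those assumptions, a map function should depend only on the document it is given, and de-duplication has to happen in the key design or at query time.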

Re: Map reduce and weird output question

Posted by J Chris Anderson <jc...@couch.io>.
On Mar 8, 2010, at 12:28 AM, Gregory Tappero wrote:

> Thanks,
> 
> I got the wanted result with
> http://friendpaste.com/6sYxT4cNJ9IjpWiW9qgCut
> 
> benoitc came to my rescue.
> 

This will be a problem with large databases. When the number of unique users is large, the group=false query would return a very large object with all the user names in it. Except it won't, because it will raise a reduce_overflow_error.

Your problem is interesting. You might learn from reading this paper:

http://labs.google.com/papers/sawzall.html

It gives a survey of the available algorithms which can work in constant space even over large databases.

Chris
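
The linked paste is not reproduced in this thread, so the following is only a guess at the kind of reduce being warned about: one whose value is the set of user names seen so far. uniqueUsers, countNames and the sample values are hypothetical, and the sketch assumes a map that emits ([year, month, day], doc.em_uname).

```javascript
// Assumed map output: key [year, month, day], value doc.em_uname.
// The reduce returns an object whose property names are the distinct users.
function uniqueUsers(keys, values, rereduce) {
  var seen = {};
  values.forEach(function (v) {
    if (rereduce) {
      // v is an object returned by an earlier call: merge its names.
      for (var name in v) {
        if (Object.prototype.hasOwnProperty.call(v, name)) {
          seen[name] = true;
        }
      }
    } else {
      // v is a user name emitted by the map function.
      seen[v] = true;
    }
  });
  return seen;
}

// Count the property names of a result object.
function countNames(obj) {
  var n = 0;
  for (var name in obj) {
    if (Object.prototype.hasOwnProperty.call(obj, name)) { n += 1; }
  }
  return n;
}

// keys is not used by this reduce, so null is passed in the demonstration.
var partA = uniqueUsers(null, ["bkante", "jfrancillette", "bkante"], false);
var partB = uniqueUsers(null, ["mlouviers", "bkante"], false);
var merged = uniqueUsers(null, [partA, partB], true);

console.log(countNames(merged)); // 3 distinct users
```

The returned object gains one property per distinct user, so its size grows with the data instead of staying small. That growth is the behaviour described above as leading to reduce_overflow_error.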



Re: Map reduce and weird output question

Posted by Gregory Tappero <co...@gmail.com>.
Thanks,

I got the wanted result with
http://friendpaste.com/6sYxT4cNJ9IjpWiW9qgCut

benoitc came to my rescue.

Greg







-- 
Greg Tappero
CTO co founder Edoboard
http://www.edoboard.com
+33 0645764425

Re: Map reduce and weird output question

Posted by Paweł Stawicki <pa...@gmail.com>.
Hmm... I'm just thinking out loud and don't know if it works, but maybe try
something like this:
If you can get the number of documents per day per username, first try to make
this number always one when the key is [date, username]:
Reduce:
if (keys.length == 2) {
  return 1;
} else if (keys.length == 1) { //date only, return number of usernames
  return values.length;
}

The risk is that some usernames will be counted twice, but maybe try it.

Best regards
--
Paweł Stawicki
http://pawelstawicki.blogspot.com
http://szczecin.jug.pl
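
A note on the keys argument, since the fragment above tests keys.length: on a first-pass call (rereduce false) CouchDB passes keys as an array of [key, docid] pairs, one per mapped row, so keys.length is the number of rows in that call and not the number of elements in the emitted key. On a rereduce call keys is null. The sketch below only illustrates that shape; inspect, sampleKeys and the document ids are made up.

```javascript
// Sample first-pass arguments for three mapped rows whose emitted key is
// [date, username]. The document ids are invented for illustration.
var sampleKeys = [
  [["2010-03-06", "bkante"], "doc-a"],
  [["2010-03-06", "bkante"], "doc-b"],
  [["2010-03-06", "mlouviers"], "doc-c"]
];
var sampleValues = [1, 1, 1];

// Hypothetical reduce-shaped function that reports what it was given.
function inspect(keys, values, rereduce) {
  return {
    rows: keys.length,              // number of rows in this call
    keyElements: keys[0][0].length  // elements in the first emitted key
  };
}

var info = inspect(sampleKeys, sampleValues, false);
console.log(info.rows, info.keyElements); // 3 2
```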




Re: Map reduce and weird output question

Posted by Gregory Tappero <co...@gmail.com>.
My key has 4 elements (year, month, day, username), so returning the number
of keys in the reduce does not seem to give me the output I am looking for,
unless I misunderstood something.

Thank you for helping,

Greg
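
One way to get the per-day figure without a custom reduce, offered as a sketch and not as what the thread settled on: keep the [year, month, day, username] key with _count, query with group=true so that each row is one distinct user on one day, and count the rows per day on the client. countUsersPerDay is a hypothetical helper; the rows are the group=true output posted in this thread.

```javascript
// Count how many grouped rows (distinct users) fall on each [year, month, day].
function countUsersPerDay(rows) {
  var perDay = {};
  var order = [];
  rows.forEach(function (row) {
    // Use the first three key elements as the day identifier.
    var day = JSON.stringify(row.key.slice(0, 3));
    if (!(day in perDay)) {
      perDay[day] = 0;
      order.push(day);
    }
    perDay[day] += 1;
  });
  return order.map(function (day) {
    return { key: JSON.parse(day), value: perDay[day] };
  });
}

var rows = [
  { key: [2010, 3, 3, "student1"], value: 5 },
  { key: [2010, 3, 4, "student1"], value: 18 },
  { key: [2010, 3, 5, "eong"], value: 77 },
  { key: [2010, 3, 6, "bkante"], value: 71 },
  { key: [2010, 3, 6, "jfrancillette"], value: 72 },
  { key: [2010, 3, 6, "mlouviers"], value: 12 },
  { key: [2010, 3, 7, "student1"], value: 4 }
];

var result = countUsersPerDay(rows);
console.log(JSON.stringify(result));
```

The client still receives one row per user per day, so this moves the work out of the reduce without shrinking it.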





Re: Map reduce and weird output question

Posted by Randall Leeds <ra...@gmail.com>.
I'm not an expert on this, but I think you need to create your own
reduce function and output the number of keys rather than the sum of
the values.

On Sun, Mar 7, 2010 at 15:15, Gregory Tappero <co...@gmail.com> wrote:
> Thank you Pawel,
>
> If I follow your approach, it gives me the count of docs on a given
> day for each username; what I would like is the count of unique
> usernames for a given day.
>
> function(doc) {
>
>    if (doc.doc_type=="EdoPing" && doc.em_type==0) {
>        date = new Date().setRFC3339(doc.created_at);
>        emit([date.getFullYear(), parseInt(date.getMonth())+1,
> date.getDate(), doc.em_uname] , 1);
>
>    }
> }
>
> Reduce:
>  _count
>
> =================
> I get:
>
> [2010, 3, 3, "student1"]         5
> [2010, 3, 4, "student1"]         18
> [2010, 3, 5, "eong"]             77
> [2010, 3, 6, "bkante"]           71
> [2010, 3, 6, "jfrancillette"]    72
> [2010, 3, 6, "mlouviers"]        12
> [2010, 3, 7, "student1"]         4
>
> I would like to extract the following
>
> [2010, 3, 3]       1
> [2010, 3, 4]       1
> [2010, 3, 5]       1
> [2010, 3, 6]       3
> [2010, 3, 7]       1
>
>
> If I query with group_level=3, it sums the values:
>
> {"key":[2010,3,3],"value":5},
> {"key":[2010,3,4],"value":18},
> {"key":[2010,3,5],"value":77},
> {"key":[2010,3,6],"value":155},
> {"key":[2010,3,7],"value":4}
>
> How can I count the unique usernames emitted per day?
>
>
>
>
> On Sun, Mar 7, 2010 at 10:02 PM, Paweł Stawicki <pa...@gmail.com> wrote:
>> Just emit all documents with em_type = 0 in map function, with [date,
>> em_uname] as key. Then count in reduce.
>>
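>> Map:
>> function(doc) {
>>  if (doc.em_type == 0) {
>>    // Day part of the timestamp, e.g. "2010-03-03" (assumed derivation).
>>    var date = String(doc.created_at).substring(0, 10);
>>    // If you only want to count, you can emit anything (e.g. 1) instead of
>>    // doc here.
>>    emit([date, doc.em_uname], doc);
>>  }
>> }
>>
>> Reduce:
>> function(keys, values, rereduce) {
>>  // If you emit 1 instead of doc, then the count of values equals the
>>  // sum of values on the first pass.
>>  if (!rereduce) {
>>    return values.length;  // count of values
>>  } else {
>>    return sum(values);    // sum of partial counts; sum() is a CouchDB view server built-in
>>  }
>> }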
>>
>> Then you can handle everything by grouping:
>> http://yourserver:5984/yourdb/_view/yourview?group_level=2
>> or group=true
>>
>> Regards
>> --
>> Paweł Stawicki
>> http://pawelstawicki.blogspot.com
>> http://szczecin.jug.pl
>>
>>
>>
>> On Sat, Mar 6, 2010 at 16:26, Gregory Tappero <co...@gmail.com> wrote:
>>
>>> Hello everyone,
>>>
>>> I have the following EdoPing 's type of documents
>>>
>>> {
>>>   "_id": "22add509c1e7bc286832edc5bfe99ce5",
>>>   "_rev": "1-49663ab8778f445e481143120d0d7086",
>>>   "doc_type": "EdoPing",
>>>   "em_uname": "student1",
>>>   "em_gid": 1,
>>>   "created_at": "2010-03-03T14:18:19Z",
>>>   "em_ip": "92.154.70.148",
>>>   "em_type": 0,
>>>   "room_url": "z2fudcvcrfa3reaydatre",
>>>   "room_users": [
>>>       "tutorsbox"
>>>   ]
>>> }
>>>
>>> i would like to count all unique em_uname of em_type 0 on a given day date.
>>>
>>> For now i used this map/reduce
>>> http://friendpaste.com/5xUUQ26bbl9d5KRB8eojwe
>>>
>>> Date.prototype.setRFC3339 = function(dString){
>>>    var regexp =
>>>
>>> /(\d\d\d\d)(-)?(\d\d)(-)?(\d\d)(T)?(\d\d)(:)?(\d\d)(:)?(\d\d)(\.\d+)?(Z|([+-])(\d\d)(:)?(\d\d))/;
>>>
>>>    if (dString.toString().match(new RegExp(regexp))) {
>>>        var d = dString.match(new RegExp(regexp));
>>>        var offset = 0;
>>>
>>>        this.setUTCDate(1);
>>>        this.setUTCFullYear(parseInt(d[1],10));
>>>        this.setUTCMonth(parseInt(d[3],10) - 1);
>>>        this.setUTCDate(parseInt(d[5],10));
>>>        this.setUTCHours(parseInt(d[7],10));
>>>        this.setUTCMinutes(parseInt(d[9],10));
>>>        this.setUTCSeconds(parseInt(d[11],10));
>>>        if (d[12])
>>>            this.setUTCMilliseconds(parseFloat(d[12]) * 1000);
>>>        else
>>>            this.setUTCMilliseconds(0);
>>>        if (d[13] != 'Z') {
>>>            offset = (d[15] * 60) + parseInt(d[17],10);
>>>            offset *= ((d[14] == '-') ? -1 : 1);
>>>            this.setTime(this.getTime() - offset * 60 * 1000);
>>>        }
>>>    } else {
>>>        this.setTime(Date.parse(dString));
>>>    }
>>>    return this;
>>> };
>>>
>>> var seenKeys = new Array();
>>>
>>> function(doc) {
>>>
>>>
>>>    if (doc.doc_type=="EdoPing" && doc.em_type==0) {
>>>        date = new Date().setRFC3339(doc.created_at);
>>>        var key = doc.em_uname + String(doc.created_at).substring(0,10);
>>>        if (seenKeys[key] ==  undefined  ) {
>>>            seenKeys[key] = 1;
>>>            emit([date.getFullYear(), parseInt(date.getMonth())+1,
>>> date.getDate() ] , 1);
>>>         }
>>>    }
>>> }
>>>
>>>
>>> It works when saved for this first time but as soon as new EdoPings
>>> get added it starts emitting rows it has already seen ! (same key)
>>> creating faulty count results.
>>>
>>> Is it ok to have seenKeys outside of the doc function() ?
>>> What other way could i use to get the same results ?
>>>
>>> Thanks,
>>>
>>> Greg
>>>
>>
>
>
>
> --
> Greg Tappero
> CTO co founder Edoboard
> http://www.edoboard.com
> +33 0645764425
>

Re: Map reduce and weird output question

Posted by Gregory Tappero <co...@gmail.com>.
Thank you Pawel,

If I follow your approach, I get the count of docs per day for each
username. What I would like is the count of unique usernames for a
given day.

function(doc) {
    if (doc.doc_type == "EdoPing" && doc.em_type == 0) {
        // setRFC3339 is the Date helper from my first message.
        var date = new Date().setRFC3339(doc.created_at);
        emit([date.getFullYear(), date.getMonth() + 1, date.getDate(), doc.em_uname], 1);
    }
}

Reduce:
 _count

=================
I get:

[2010, 3, 3, "student1"]         5
[2010, 3, 4, "student1"]         18
[2010, 3, 5, "eong"]             77
[2010, 3, 6, "bkante"]           71
[2010, 3, 6, "jfrancillette"]    72
[2010, 3, 6, "mlouviers"]        12
[2010, 3, 7, "student1"]         4

I would like to extract the following

[2010, 3, 3]       1
[2010, 3, 4]       1
[2010, 3, 5]       1
[2010, 3, 6]       3
[2010, 3, 7]       1


If I query with group_level=3, it sums the values:

{"key":[2010,3,3],"value":5},
{"key":[2010,3,4],"value":18},
{"key":[2010,3,5],"value":77},
{"key":[2010,3,6],"value":155},
{"key":[2010,3,7],"value":4}

How can I count the unique usernames emitted per day?
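
One possible way to get there without changing the view, sketched under the assumption that the client fetches the fully grouped rows (one row per [year, month, day, em_uname] key) and does the last step itself. Each grouped row stands for one distinct username on one day, so counting rows per day gives the number of unique usernames. The function name countUniquePerDay and the "year-month-day" string used as a lookup key are illustrative choices, not anything CouchDB provides.

```javascript
// Count how many distinct 4-element keys share the same [year, month, day].
// rows is the "rows" array of a fully grouped view response.
function countUniquePerDay(rows) {
  var counts = {};
  for (var i = 0; i < rows.length; i++) {
    var k = rows[i].key;
    var day = k[0] + "-" + k[1] + "-" + k[2];
    // The row value (number of pings) is ignored; only the row itself counts.
    counts[day] = (counts[day] || 0) + 1;
  }
  return counts;
}

// The grouped rows listed above.
var groupedRows = [
  {key: [2010, 3, 3, "student1"], value: 5},
  {key: [2010, 3, 4, "student1"], value: 18},
  {key: [2010, 3, 5, "eong"], value: 77},
  {key: [2010, 3, 6, "bkante"], value: 71},
  {key: [2010, 3, 6, "jfrancillette"], value: 72},
  {key: [2010, 3, 6, "mlouviers"], value: 12},
  {key: [2010, 3, 7, "student1"], value: 4}
];

var perDay = countUniquePerDay(groupedRows);
// perDay["2010-3-6"] is 3; every other day is 1.
```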







-- 
Greg Tappero
CTO co founder Edoboard
http://www.edoboard.com
+33 0645764425

Re: Map reduce and weird output question

Posted by Paweł Stawicki <pa...@gmail.com>.
Just emit all documents with em_type = 0 in map function, with [date,
em_uname] as key. Then count in reduce.

Map:
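function(doc) {
  if (doc.em_type === 0) {
    // If you only want to count, you can emit anything (e.g. 1) instead of doc here.
    // date is the day part of created_at, e.g. "2010-03-03".
    var date = String(doc.created_at).substring(0, 10);
    emit([date, doc.em_uname], doc);
  }
}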

Reduce:
function(keys, values, rereduce) {
  // If you emit 1 instead of doc, the count of values equals the sum of values.
  if (!rereduce) {
    // First pass: count the emitted values.
    return values.length;
  } else {
    // Rereduce: add up the counts from earlier passes.
    return sum(values);
  }
}

Then you can handle everything by grouping:
http://yourserver:5984/yourdb/_view/yourview?group_level=2
or group=true
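
To illustrate what grouping does with [date, em_uname] keys, here is a small local simulation, not CouchDB itself. It assumes the reduce has already produced one count per full key and shows how a lower group level truncates the key and adds up the counts that share the truncated prefix. The function name groupByLevel is made up, and the sample counts are the ones reported elsewhere in this thread.

```javascript
// Simulate group_level on already-counted rows: truncate each key to
// `level` elements and add up the values that share the truncated key.
function groupByLevel(rows, level) {
  var totals = {};
  var order = [];
  rows.forEach(function (row) {
    var prefix = JSON.stringify(row.key.slice(0, level));
    if (!Object.prototype.hasOwnProperty.call(totals, prefix)) {
      totals[prefix] = 0;
      order.push(prefix);
    }
    totals[prefix] += row.value;
  });
  return order.map(function (prefix) {
    return {key: JSON.parse(prefix), value: totals[prefix]};
  });
}

// Per-user counts for two days, keyed by [date, em_uname].
var sampleRows = [
  {key: ["2010-03-06", "bkante"], value: 71},
  {key: ["2010-03-06", "jfrancillette"], value: 72},
  {key: ["2010-03-06", "mlouviers"], value: 12},
  {key: ["2010-03-07", "student1"], value: 4}
];

var byUser = groupByLevel(sampleRows, 2); // one row per date and user
var byDate = groupByLevel(sampleRows, 1); // one row per date, counts added up
```

With level 1 the three rows for 2010-03-06 collapse into a single row whose value is the total number of documents for that day, not the number of distinct users.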

Regards
--
Paweł Stawicki
http://pawelstawicki.blogspot.com
http://szczecin.jug.pl


