Posted to user@couchdb.apache.org by Michael Fellinger <m....@gmail.com> on 2008/12/26 06:39:15 UTC

Calculating most popular tags

I have a lot of documents that look like:

{"type": "Jot", "author": "manveru", "tags": ["foo","bar","baz"],
"text": "foobar"}

Now, what I'd like to do is find out which tags are used most with a
quick query, but it seems that the only place this could ever happen
is in the reduce function... Given the previous discussion of how to
understand reduce, I'd like to throw my problem out there as well.
I have the following map/reduce right now:

http://p.ramaze.net/16546

It runs a lot faster (down from ~1.5s to 0.1s) than what I used to have:

http://p.ramaze.net/16547
queried by ?group=true

Yet their results are almost identical. The only difference is that the
former returns a single result row containing a list of all tags with
their frequencies, while the latter gives a typical group result: one
row per tag, with the tag as key and the frequency as value.
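
Since the two pastes are not reproduced here, the following is only a
rough sketch of what two such views might look like, not the actual code
behind the links. The names mapTags, reduceSum, reduceHash and
queryGrouped are made up for illustration, and the sample documents are
invented. Inside CouchDB, emit is supplied by the view server; here it is
passed in as a parameter so the sketch can run on its own under Node.

```javascript
// Invented sample documents, shaped like the Jot document above.
const docs = [
  { type: "Jot", author: "manveru", tags: ["foo", "bar", "baz"], text: "foobar" },
  { type: "Jot", author: "manveru", tags: ["foo", "bar"], text: "second" },
  { type: "Jot", author: "manveru", tags: ["foo"], text: "third" },
];

// Map: one row per tag, value 1. (In CouchDB, emit is a global.)
function mapTags(doc, emit) {
  if (doc.type === "Jot" && Array.isArray(doc.tags)) {
    doc.tags.forEach(function (tag) { emit(tag, 1); });
  }
}

// Variant A: plain sum, meant to be queried with ?group=true.
// Summing works the same for reduce and rereduce.
function reduceSum(keys, values, rereduce) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}

// Variant B: a single row holding a {tag: count} object.
function reduceHash(keys, values, rereduce) {
  const totals = {};
  if (rereduce) {
    // values are partial {tag: count} objects; merge them.
    values.forEach(function (partial) {
      Object.keys(partial).forEach(function (tag) {
        totals[tag] = (totals[tag] || 0) + partial[tag];
      });
    });
  } else {
    // keys are [key, docid] pairs; values are the emitted 1s.
    keys.forEach(function (pair, i) {
      totals[pair[0]] = (totals[pair[0]] || 0) + values[i];
    });
  }
  return totals;
}

// Tiny stand-in for the view engine: run the map over all docs.
const rows = [];
docs.forEach(function (doc, i) {
  mapTags(doc, function (key, value) {
    rows.push({ id: "doc" + i, key: key, value: value });
  });
});

// Simulates ?group=true: reduce each distinct key separately.
function queryGrouped(allRows) {
  const groups = {};
  allRows.forEach(function (r) {
    (groups[r.key] = groups[r.key] || []).push(r);
  });
  return Object.keys(groups).sort().map(function (key) {
    const g = groups[key];
    return {
      key: key,
      value: reduceSum(g.map(function (r) { return [r.key, r.id]; }),
                       g.map(function (r) { return r.value; }), false),
    };
  });
}

const grouped = queryGrouped(rows);
const single = reduceHash(rows.map(function (r) { return [r.key, r.id]; }),
                          rows.map(function (r) { return r.value; }), false);

// View rows come back ordered by key, so ranking by frequency
// is done on the client in this sketch.
const mostPopular = grouped.slice().sort(function (a, b) {
  return b.value - a.value;
});
console.log(grouped, single, mostPopular[0]);
```

In this sketch both variants produce the same counts and differ only in
shape. One thing to keep in mind with the single-row variant is that the
reduce value grows with the number of distinct tags, whereas the grouped
variant keeps each reduce value a plain number.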

My question now is: how would you do this differently?

I have a couple of other queries that I just can't seem to boil down
to something fast, but I'll get to that another time.

^ manveru