Posted to user@couchdb.apache.org by Brian Candler <B....@pobox.com> on 2009/06/02 11:10:13 UTC
Re: find all unique field names
On Thu, May 28, 2009 at 04:20:06PM -0500, Douglas Fils wrote:
> It's not too hard to generate a map function that emits an array of the
> field names in a particular record....
> (please note this is about as much JS as I have ever written) :)
> function(doc) {
> var i = 0;
> var keyNames = new Array();
> for (var key in doc) {
> keyNames[i] = key
> i++;
> }
> emit(null,keyNames);
> }
>
> However, once I pass that over to the reduce (assuming this is even the
> way to do it) I don't see an easy way to get the unique intersection of
> the various field names.
Try just emitting the field names, like this:
function(doc) {
  // Emit one row per top-level field name in the document.
  for (var key in doc) {
    emit(key, null);
  }
}
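
To see what this map function produces, here is a minimal sketch that runs the same loop outside CouchDB. The local emit() stand-in, the name mapFieldNames, and the sample document are assumptions made for illustration; inside CouchDB the view server supplies emit() itself.

```javascript
// Local stand-in for CouchDB's emit(): collects rows instead of indexing them.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Same body as the map function above, given a (hypothetical) name
// so that it can be called directly.
function mapFieldNames(doc) {
  for (var key in doc) {
    emit(key, null);
  }
}

// Hypothetical sample document.
mapFieldNames({ _id: "a1", _rev: "1-abc", name: "core 12", depth: 40 });

var emittedKeys = rows.map(function (row) { return row.key; });
console.log(emittedKeys); // [ '_id', '_rev', 'name', 'depth' ]
```

Because the loop walks every top-level property, _id and _rev show up as field names too; skip them inside the loop if they are not wanted.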
Then the following reduce function will build a map of {fieldname: count}:
function(ks, vs, co) {
  if (co) {
    // Rereduce: vs is a list of {fieldname: count} objects; merge them.
    var result = vs.shift();
    for (var i in vs) {
      for (var j in vs[i]) {
        result[j] = (result[j] || 0) + vs[i][j];
      }
    }
    return result;
  } else {
    // First pass: ks is a list of [key, docid] pairs; count each key.
    var result = {};
    for (var i in ks) {
      var key = ks[i];
      result[key[0]] = (result[key[0]] || 0) + 1;
    }
    return result;
  }
}
Then the client just asks for the reduce value, and looks at the distinct
keys.
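
As a rough sketch of how this fits together, the snippet below runs a {fieldname: count} reduce over two hypothetical batches of map rows, merges them in a rereduce step, and then reads the distinct field names off the result as a client would. The name reduceFieldCounts and the sample rows are assumptions, and the rereduce branch is restructured slightly so that it does not modify its input.

```javascript
// Builds {fieldname: count}. Arguments follow CouchDB's reduce convention:
// ks is a list of [key, docid] pairs (null on rereduce), vs is the list of
// values, and co is true when merging earlier reduce results.
function reduceFieldCounts(ks, vs, co) {
  var result = {};
  var i, j;
  if (co) {
    // Rereduce: each value is already a {fieldname: count} object.
    for (i = 0; i < vs.length; i++) {
      for (j in vs[i]) {
        result[j] = (result[j] || 0) + vs[i][j];
      }
    }
    return result;
  }
  // First pass: count how often each field name was emitted.
  for (i = 0; i < ks.length; i++) {
    var field = ks[i][0];
    result[field] = (result[field] || 0) + 1;
  }
  return result;
}

// Two hypothetical batches of map output, reduced separately.
var partA = reduceFieldCounts(
  [["_id", "a1"], ["name", "a1"], ["_id", "a2"]], [null, null, null], false);
var partB = reduceFieldCounts(
  [["depth", "a2"], ["name", "a2"]], [null, null], false);

// Merge the partial results, then list the distinct field names.
var total = reduceFieldCounts(null, [partA, partB], true);
var distinctFields = Object.keys(total).sort();
console.log(total);          // { _id: 2, name: 2, depth: 1 }
console.log(distinctFields); // [ '_id', 'depth', 'name' ]
```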
Alternatively, you can use a simple counter reduce function and a group=true
query.
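
A minimal sketch of that second approach follows. The counter reduce is the usual pattern; sum() is defined locally here so the snippet runs on its own (CouchDB's view server provides a sum() helper). The groupedReduce function only approximates what a group=true query returns, using plain string sorting rather than CouchDB's collation, and all names and sample rows are hypothetical. Against a real database the query would look something like GET /mydb/_design/fields/_view/names?group=true, where the database, design document, and view names are made up.

```javascript
// Stand-in for the sum() helper available to CouchDB reduce functions.
function sum(values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Simple counter reduce: count rows, or add up earlier counts on rereduce.
function reduceCount(keys, values, rereduce) {
  if (rereduce) {
    return sum(values);
  }
  return values.length;
}

// Local approximation of a group=true query: one reduced row per distinct key.
function groupedReduce(mapRows) {
  var groups = {};
  mapRows.forEach(function (row) {
    (groups[row.key] = groups[row.key] || []).push(row);
  });
  return Object.keys(groups).sort().map(function (key) {
    var group = groups[key];
    var keys = group.map(function (r) { return [r.key, r.id]; });
    var values = group.map(function (r) { return r.value; });
    return { key: key, value: reduceCount(keys, values, false) };
  });
}

// Hypothetical map output from two documents.
var grouped = groupedReduce([
  { id: "a1", key: "_id", value: null },
  { id: "a1", key: "name", value: null },
  { id: "a2", key: "_id", value: null },
  { id: "a2", key: "depth", value: null },
  { id: "a2", key: "name", value: null }
]);
console.log(grouped);
// [ { key: '_id', value: 2 }, { key: 'depth', value: 1 }, { key: 'name', value: 2 } ]
```

The distinct field names are then just the key of each returned row.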
The former approach is more efficient if the number of distinct values is
relatively small, since a single disk access will get all the keys. The
latter approach involves walking the btree index, but avoids the problems
with building a large reduce object if the number of distinct values is
large.
HTH,
Brian.