
Posted to user@couchdb.apache.org by Devon Weller <dw...@devonweller.com> on 2009/11/05 14:46:10 UTC

Sorting items by number of votes

Hi.  I'm new to CouchDB (and this list) and I am looking for some
help.  I am trying to do something very similar to
http://wiki.apache.org/couchdb/View_Snippets#aggregate_sum.



I have a database with 2 types of documents, resources and votes.  The  
documents in the database look something like this:

resource 1 {_id: 1, type:"resource", name:"resource 1", ...}
resource 2 {_id: 2, type:"resource", name:"resource 2", ...}
resource 3 {_id: 3, type:"resource", name:"resource 3", ...}

vote for resource 3 {type:"vote", resource_id: 3, user_id: foo}
vote for resource 2 {type:"vote", resource_id: 2, user_id: foo}
vote for resource 3 {type:"vote", resource_id: 3, user_id: bar}


I want to create a view that will give me the resources in order of  
the most votes along with the number of votes.  In other words, I want  
to write a view that gives me something like the following results:

{id:3, key:3, value:{doc:(document for resource 3), score: 2}},
{id:2, key:2, value:{doc:(document for resource 2), score: 1}},
{id:1, key:1, value:{doc:(document for resource 1), score: 0}},




Here is my map function:


function(doc) {
	if (doc.type == 'vote' && doc.resource_id) {
		emit(doc.resource_id, doc);
	} else if (doc.type == 'resource' && doc.name) {
		emit(doc._id, doc);
	}
}


Here is my reduce function:



function(keys, values, rereduce) {
	var score = 0;
	var output_doc = {};

	for (var i=0; i < values.length; i++) {
		if (values[i].type == 'vote') {
			++score;
		} else if (values[i].type == 'resource') {
			output_doc = values[i];
		}
	}

	return {doc:output_doc, score:score};
}



When I query this view in Futon I get the following error:

"Reduce output must shrink more rapidly"


Am I trying to do something that can't be done with CouchDB?  Or am I  
just missing something?

Thanks so much for any help.

- Devon

Re: Sorting items by number of votes

Posted by Chris Anderson <jc...@apache.org>.

On Thu, Nov 5, 2009 at 6:36 AM, Devon Weller <dw...@devonweller.com> wrote:
>
> Thanks Nathan.
>
> Using a list is an interesting idea.  Although, I suspect that method would
> become inefficient for things like "give me the 10 resources with the most
> votes" when there are 10,000 resources in the database.
>
> I think my solution will be to create a map reduce which just counts the
> votes by resource_id.  And then use that information to do a bulk request
> for the top 10 documents by ID.
>
>

Yes. To do this you'll need a group reduce, which shows, for each
resource in the db, the number of votes it has. Sorting by the number
of votes will have to happen in the client.

If you have too many unique resources to sort in the client, you can
copy the view into a secondary db (by looping over the rows and saving
them as docs from your application code) and sort by value there.
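
A minimal sketch of that approach, assuming the vote documents look like the ones in the original post. The names mapVotes, reduceVotes and topResources are made up for illustration, and the emit() stub only exists so the code runs outside CouchDB.

```javascript
// Stand-in for CouchDB's emit(), only so the sketch runs outside CouchDB.
var emitted = [];
function emit(key, value) { emitted.push({ key: key, value: value }); }

// Map: one row per vote, keyed by the resource it belongs to.
var mapVotes = function (doc) {
  if (doc.type === 'vote' && doc.resource_id) {
    emit(doc.resource_id, 1);
  }
};

// Reduce: values are 1s on the first pass and partial sums on rereduce,
// so plain summation is correct in both cases.
var reduceVotes = function (keys, values, rereduce) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
};

// Client side: rows in the shape returned by querying the view with
// ?group=true, i.e. [{key: <resource_id>, value: <count>}, ...].
function topResources(rows, n) {
  return rows
    .slice() // copy so the caller's array is not reordered
    .sort(function (a, b) { return b.value - a.value; })
    .slice(0, n);
}
```

Note that a resource with no votes emits no rows in this view, so it will not appear in the grouped result at all.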


>
>
> Regarding the example in the wiki:
>
> As a new user, I just followed an example from the wiki:
> http://wiki.apache.org/couchdb/View_Snippets#aggregate_sum
>
> Is that example an incorrect way of using a CouchDB view?  Should it be
> removed?
>
>
> - Devon
>
>
>
>
>
>
> On Nov 5, 2009, at 8:04 AM, Nathan Stott wrote:
>
>> Reduces are not for joins.  A lot of people try that when they first start
>> using couch.  I tried it.
>>
>> This is more appropriate for a list.  If you feed your view into a list,
>> then you can have the list do the processing that you want and create the
>> 'joined' objects and emit them as JSON (or however else you like)
>>
>
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: Sorting items by number of votes

Posted by Devon Weller <dw...@devonweller.com>.

That is true.  My app would need to detect the conflict, reload the  
document, add the vote to the array again and re-try the update.  I  
don't imagine enough collisions will occur in my app that this will be  
a problem.  And I believe it will make my views a lot easier to deal  
with.
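
The reload-and-retry loop described above could look roughly like this. FakeDb is a hypothetical in-memory stand-in for the database (real revisions are strings, and a real client would make HTTP requests); the only behaviour borrowed from CouchDB is that a write with a stale _rev is rejected with a 409 conflict.

```javascript
// Hypothetical in-memory stand-in for a CouchDB database. put() rejects a
// write whose _rev is stale, mimicking CouchDB's 409 Conflict response.
function FakeDb() { this.docs = {}; }

FakeDb.prototype.get = function (id) {
  // Return a copy so callers cannot modify the stored document directly.
  return JSON.parse(JSON.stringify(this.docs[id]));
};

FakeDb.prototype.put = function (doc) {
  var current = this.docs[doc._id];
  if (current && current._rev !== doc._rev) {
    var err = new Error('conflict');
    err.status = 409;
    throw err;
  }
  var stored = JSON.parse(JSON.stringify(doc));
  stored._rev = (current ? current._rev : 0) + 1; // simplified revision
  this.docs[doc._id] = stored;
  return stored._rev;
};

// Reload the resource, re-apply the vote, and retry on every conflict.
function addVote(db, resourceId, userId, maxAttempts) {
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    var doc = db.get(resourceId);
    doc.votes = doc.votes || [];
    if (doc.votes.indexOf(userId) === -1) {
      doc.votes.push(userId);
    }
    try {
      return db.put(doc);
    } catch (err) {
      if (err.status !== 409) { throw err; }
      // Someone else updated the document first; loop and try again.
    }
  }
  throw new Error('gave up after ' + maxAttempts + ' attempts');
}
```

Capping the number of attempts keeps a heavily contended resource from looping forever; what to do when the cap is reached is left to the application.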

- Devon

On Nov 5, 2009, at 1:57 PM, Nathan Stott wrote:

> If you put your votes in an array belonging to the resource document,
> you will run into problems with conflicts if voting occurs nearly
> simultaneously on the same resource.
>
> On Thu, Nov 5, 2009 at 1:51 PM, Devon Weller  
> <dw...@devonweller.com>wrote:
>
>>
>> Thanks Nathan, Daniel and others.  Those are some good suggestions  
>> and
>> workarounds.
>>
>> I think I may need to rethink my design and just put the votes in  
>> an array
>> belonging to the resource document.
>>
>>
>> As for the aggregate sum example in the wiki, I still think it  
>> might be
>> broken.  I'm starting another thread with details about that in a  
>> separate
>> email.
>>
>> - Devon
>>

Re: Sorting items by number of votes

Posted by Nathan Stott <nr...@gmail.com>.

If you put your votes in an array belonging to the resource document, you will
run into problems with conflicts if voting occurs nearly simultaneously on the
same resource.

On Thu, Nov 5, 2009 at 1:51 PM, Devon Weller <dw...@devonweller.com> wrote:

>
> Thanks Nathan, Daniel and others.  Those are some good suggestions and
> workarounds.
>
> I think I may need to rethink my design and just put the votes in an array
> belonging to the resource document.
>
>
> As for the aggregate sum example in the wiki, I still think it might be
> broken.  I'm starting another thread with details about that in a separate
> email.
>
> - Devon
>
>
>
> On Nov 5, 2009, at 12:53 PM, Nathan Stott wrote:
>
>  Right now I'm using lists to do 'joins' of my multi-doc-type views.
>>
>
> On Nov 5, 2009, at 11:43 AM, Daniel Truemper wrote:
>
>  You could however write another type of document (VoteCount) into your
>> database containing the resource and the number of votes. Then emitting as
>> key something like [ #votes, resource ] will give you an ordered view based
>> on the number of votes. You could trigger the view update from the client
>> each time a vote is made (i.e. add a vote document, call the view, update
>> the VoteCount document and call the new view to get the ordered votes). You
>> could also do this automatically on the CouchDB using update notifiers and
>> simple Bash/Python/Perl/whatever scripts...
>>
>

Re: Sorting items by number of votes

Posted by Devon Weller <dw...@devonweller.com>.

Thanks Nathan, Daniel and others.  Those are some good suggestions and  
workarounds.

I think I may need to rethink my design and just put the votes in an  
array belonging to the resource document.

As for the aggregate sum example in the wiki, I still think it might  
be broken.  I'm starting another thread with details about that in a  
separate email.

- Devon

On Nov 5, 2009, at 12:53 PM, Nathan Stott wrote:

> Right now I'm using lists to do 'joins' of my multi-doc-type views.

On Nov 5, 2009, at 11:43 AM, Daniel Truemper wrote:

> You could however write another type of document (VoteCount) into  
> your database containing the resource and the number of votes. Then  
> emitting as key something like [ #votes, resource ] will give you an  
> ordered view based on the number of votes. You could trigger the  
> view update from the client each time a vote is made (i.e. add a  
> vote document, call the view, update the VoteCount document and call  
> the new view to get the ordered votes). You could also do this  
> automatically on the CouchDB using update notifiers and simple Bash/ 
> Python/Perl/whatever scripts...

Re: Sorting items by number of votes

Posted by Nathan Stott <nr...@gmail.com>.

Lists are capable of building up arbitrary data structures, much like what a
client would have to do to sort the rows.  I don't know if there is some limit
on the memory that a list can use.  If there is, I have not heard of it.
Right now I'm using lists to do 'joins' of my multi-doc-type views.  I
haven't thrown sorting into the mix, but I've thought about giving it a
shot to see how it performs.
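
A sketch of what sorting inside a list function might look like, assuming the view is queried with grouping so that each row's value is a vote count. getRow() and send() are the helpers CouchDB gives to list functions; the versions defined here are stand-ins so the code runs outside CouchDB, and sortByValue is an invented name.

```javascript
// Stand-ins for the helpers CouchDB provides to list functions, so the
// sketch can run outside CouchDB. Assumed shapes, not CouchDB's own code.
var pendingRows = [];
var output = '';
function getRow() { return pendingRows.shift(); }
function send(chunk) { output += chunk; }

// List function: buffer every row, sort by value (vote count) descending,
// then send the result as one JSON array.
var sortByValue = function (head, req) {
  var rows = [];
  var row;
  while ((row = getRow())) {
    rows.push(row);
  }
  rows.sort(function (a, b) { return b.value - a.value; });
  send(JSON.stringify(rows));
};
```

Every row is held in memory until the sort finishes, so this carries the same cost as sorting in the client; it only moves the work to the server.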

On Thu, Nov 5, 2009 at 12:47 PM, Daniel Truemper <
daniel.truemper@googlemail.com> wrote:

>
>  Why would a list be any less efficient than sorting on the client side?
>>  You
>> could even use the view API to constrain the results sent to the list
>> where
>> appropriate.
>>
> I did not want to say that a list is less efficient than sorting on the
> client side. But (again AFAIK) the list function can format the view's
> output in the order the view returns the values. So in Devon's case where he
> wants to sort the view by value instead of key, a list will not make any
> difference with respect to the sorting...
>
> Maybe I am wrong and the list function can sort by value...???
>
> Daniel
>
>

Re: Sorting items by number of votes

Posted by Daniel Truemper <da...@googlemail.com>.

> Why would a list be any less efficient than sorting on the client  
> side?  You
> could even use the view API to constrain the results sent to the  
> list where
> appropriate.
I did not want to say that a list is less efficient than sorting on
the client side. But (again AFAIK) the list function can only format
the view's output in the order in which the view returns the rows. So
in Devon's case, where he wants to sort the view by value instead of
key, a list will not make any difference with respect to the sorting...

Maybe I am wrong and the list function can sort by value...???

Daniel

Re: Sorting items by number of votes

Posted by Nathan Stott <nr...@gmail.com>.

Why would a list be any less efficient than sorting on the client side?  You
could even use the view API to constrain the results sent to the list where
appropriate.

On Thu, Nov 5, 2009 at 11:43 AM, Daniel Truemper <tr...@googlemail.com> wrote:

> Hi,
>
>
>  Using a list is an interesting idea.  Although, I suspect that method
>> would become inefficient for things like "give me the 10 resources with the
>> most votes" when there are 10,000 resources in the database.
>>
> hm, I don't think that a list is the appropriate way to go here since a
> list is based on a view again (AFAIK).
>
>
>  I think my solution will be to create a map reduce which just counts the
>> votes by resource_id.  And then use that information to do a bulk request
>> for the top 10 documents by ID.
>>
>
>  Regarding the example in the wiki:
>>
>> As a new user, I just followed an example from the wiki:
>> http://wiki.apache.org/couchdb/View_Snippets#aggregate_sum
>>
>> Is that example an incorrect way of using a CouchDB view?  Should it be
>> removed?
>>
> No, it is not. If you step through it you will notice what happens. Here is
> the code again:
>
> // Map function
> function(doc) {
>  if (doc.Type == "customer") {
>    emit([doc._id, 0], doc);
>  } else if (doc.Type == "order") {
>    emit([doc.customer_id, 1], doc);
>  }
> }
>
> // Reduce function
> // Only produces meaningful output.customer_details if group_level >= 1
> function(keys, values, rereduce) {
>  var output = {};
>  if (rereduce) {
>    for (idx in values) {
>      if (values[idx].total !== undefined) {
>        output.total += values[idx].total;
>      } else if (values[idx].customer_details !== undefined) {
>        output.customer_details = values[idx].customer_details;
>      }
>    }
>  } else {
>    for (idx in values) {
>      if (values[idx].Type == "customer") output.customer_details = doc;
>      else if (values[idx].Type == "order") output.total += 1;
>    }
>  }
>  return output;
> }
>
>
>
> 1. The map function emits a compound key that is [doc._id, 0] for customers
> and [doc.customer_id, 1] for orders. So in the case of 2 customers with 3
> orders each the resulting tree looks like the following:
>
> key, value
>
> [ 1, 0 ], customer1
> [ 1, 1 ], customer1order
> [ 1, 1 ], customer1order
> [ 1, 1 ], customer1order
> [ 2, 0 ], customer2
> [ 2, 1 ], customer2order
> [ 2, 1 ], customer2order
> [ 2, 1 ], customer2order
>
> with customer1 having id 1 and customer2 having id 2.
>
>
>
>
>
> 2. In the reduce function are 2 different kinds of reduction. A small note:
> you should call the view with group_level=1 otherwise you won't get
> aggregated results!
>
> The basic operation of the reduce function is the second part:
>
>    for (idx in values) {
>      if (values[idx].Type == "customer") output.customer_details = doc;
>      else if (values[idx].Type == "order") output.total += 1;
>    }
>
> What happens is that it iterates through all key/value pairs where the key
> begins with [1]. So:
>
> [ 1, 0 ], customer1
> [ 1, 1 ], customer1order
> [ 1, 1 ], customer1order
> [ 1, 1 ], customer1order
>
> So for the first entry (a customer) the output object is filled with the
> customer doc. For all orders a counter inside the output object is
> increased. So in the end the following would be returned:
>
> {
>  customer_details = customer1,
>  total = 3
> }
>
> So: input is a list of 4 values, the output only contains 1.
>
>
>
>
>
> 3. The second case of the reduce function deals with the phase where there
> are so many orders that CouchDB internally stores mid-values of not all
> orders with customers in the same tree bucket.
>
> Example:
>
>           top
>        /         \
>      / \          / \
>     /   \        /   \
>  a     b     c     d
>
> a = [ 1, 0 ], customer1
> b = [ 1, 1 ], customer1order
> c = [ 1, 1 ], customer1order
> d = [ 1, 1 ], customer1order
>
> So if CouchDB internally stores mid-values of a+b and c+d you will have the
> two output objects:
>
> {
>  customer_details = customer1,
>  total = 1
> }
>
> {
>  total = 2
> }
>
> These two values are now used to rereduce:
>
>    for (idx in values) {
>      if (values[idx].total !== undefined) {
>        output.total += values[idx].total;
>      } else if (values[idx].customer_details !== undefined) {
>        output.customer_details = values[idx].customer_details;
>      }
>    }
>
> So in the end you again have the above value:
>
> {
>  customer_details = customer1,
>  total = 3
>
> }
>
>
>
>  Here is my reduce function:
>>>>
>>>> function(keys, values, rereduce) {
>>>>     var score = 0;
>>>>     var output_doc = {};
>>>>
>>>>     for (var i=0; i < values.length; i++) {
>>>>             if (values[i].type == 'vote') {
>>>>                     ++score;
>>>>             } else if (values[i].type == 'resource') {
>>>>                     output_doc = values[i];
>>>>             }
>>>>     }
>>>>
>>>>     return {doc:output_doc, score:score};
>>>> }
>>>>
>>>
> First I think you need to implement the rereduce phase, otherwise you will
> get wrong numbers with large amounts of data.
>
> From looking at your reduce function I seem to remember that the error
> message is based on some byte length difference between the incoming and
> outgoing value of the reduce function. So if the incoming values only
> contain one very large resource document and several smaller votes, the fact
> that you are returning the resource document might get in your way here. So
> I think it would be better if you would only store the document id and get
> that document in a separate call to the db.
>
>
>
> And a little side note: at the moment you cannot order the view based on
> the value! Ordering is only done by keys!
>
> You could however write another type of document (VoteCount) into your
> database containing the resource and the number of votes. Then emitting as
> key something like [ #votes, resource ] will give you an ordered view based
> on the number of votes. You could trigger the view update from the client
> each time a vote is made (i.e. add a vote document, call the view, update
> the VoteCount document and call the new view to get the ordered votes). You
> could also do this automatically on the CouchDB using update notifiers and
> simple Bash/Python/Perl/whatever scripts...
>
> HTH
> Daniel
>

Re: Sorting items by number of votes

Posted by Daniel Truemper <tr...@googlemail.com>.

Hi,

> Using a list is an interesting idea.  Although, I suspect that  
> method would become inefficient for things like "give me the 10  
> resources with the most votes" when there are 10,000 resources in  
> the database.
Hm, I don't think that a list is the appropriate way to go here, since
a list is itself based on a view (AFAIK).

> I think my solution will be to create a map reduce which just counts  
> the votes by resource_id.  And then use that information to do a  
> bulk request for the top 10 documents by ID.

> Regarding the example in the wiki:
>
> As a new user, I just followed an example from the wiki: http://wiki.apache.org/couchdb/View_Snippets#aggregate_sum
>
> Is that example an incorrect way of using a CouchDB view?  Should it  
> be removed?
No, it is not. If you step through it you will notice what happens.  
Here is the code again:

// Map function
function(doc) {
   if (doc.Type == "customer") {
     emit([doc._id, 0], doc);
   } else if (doc.Type == "order") {
     emit([doc.customer_id, 1], doc);
   }
}

// Reduce function
// Only produces meaningful output.customer_details if group_level >= 1
function(keys, values, rereduce) {
   var output = {};
   if (rereduce) {
     for (idx in values) {
       if (values[idx].total !== undefined) {
         output.total += values[idx].total;
       } else if (values[idx].customer_details !== undefined) {
         output.customer_details = values[idx].customer_details;
       }
     }
   } else {
     for (idx in values) {
       if (values[idx].Type == "customer") output.customer_details = doc;
       else if (values[idx].Type == "order") output.total += 1;
     }
   }
   return output;
}



1. The map function emits a compound key that is [doc._id, 0] for  
customers and [doc.customer_id, 1] for orders. So in the case of 2  
customers with 3 orders each the resulting tree looks like the  
following:

key, value

[ 1, 0 ], customer1
[ 1, 1 ], customer1order
[ 1, 1 ], customer1order
[ 1, 1 ], customer1order
[ 2, 0 ], customer2
[ 2, 1 ], customer2order
[ 2, 1 ], customer2order
[ 2, 1 ], customer2order

with customer1 having id 1 and customer2 having id 2.





2. The reduce function contains two different kinds of reduction. A
small note: you should call the view with group_level=1, otherwise you
won't get aggregated results!

The basic operation of the reduce function is the second part:

     for (idx in values) {
       if (values[idx].Type == "customer") output.customer_details = doc;
       else if (values[idx].Type == "order") output.total += 1;
     }

What happens is that it iterates through all key/value pairs where the  
key begins with [1]. So:

[ 1, 0 ], customer1
[ 1, 1 ], customer1order
[ 1, 1 ], customer1order
[ 1, 1 ], customer1order

So for the first entry (a customer) the output object is filled with  
the customer doc. For all orders a counter inside the output object is  
increased. So in the end the following would be returned:

{
   customer_details = customer1,
   total = 3
}

So: input is a list of 4 values, the output only contains 1.





3. The rereduce branch of the reduce function handles the case where
there are so many orders that CouchDB stores intermediate values
internally, so that a customer and all of its orders no longer sit in
the same tree bucket.

Example:

            top
         /         \
       / \          / \
      /   \        /   \
   a     b     c     d

a = [ 1, 0 ], customer1
b = [ 1, 1 ], customer1order
c = [ 1, 1 ], customer1order
d = [ 1, 1 ], customer1order

So if CouchDB internally stores mid-values of a+b and c+d you will  
have the two output objects:

{
   customer_details = customer1,
   total = 1
}

{
   total = 2
}

These two values are now used to rereduce:

     for (idx in values) {
       if (values[idx].total !== undefined) {
         output.total += values[idx].total;
       } else if (values[idx].customer_details !== undefined) {
         output.customer_details = values[idx].customer_details;
       }
     }

So in the end you again have the above value:

{
   customer_details = customer1,
   total = 3
}
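
The walk-through above can be replayed outside CouchDB. The sketch below is an adjusted variant rather than the wiki's code: as quoted, the first-pass branch refers to doc, which is not defined inside a reduce function, and output.total is never initialised, so the variant uses values[i] and starts total at 0. It also checks customer_details independently of total, so a partial result that carries both keeps the customer. These changes are assumptions about the intended behaviour, and reduceCustomer is an invented name.

```javascript
// Adjusted variant of the quoted reduce (not the wiki's code).
var reduceCustomer = function (keys, values, rereduce) {
  var output = { total: 0 };
  for (var i = 0; i < values.length; i++) {
    var v = values[i];
    if (rereduce) {
      // v is a partial result from an earlier reduce call.
      output.total += v.total;
      if (v.customer_details !== undefined) {
        output.customer_details = v.customer_details;
      }
    } else if (v.Type === 'customer') {
      output.customer_details = v;
    } else if (v.Type === 'order') {
      output.total += 1;
    }
  }
  return output;
};

// The four rows for customer 1 from the walk-through.
var a = { Type: 'customer', _id: 1 };
var b = { Type: 'order', customer_id: 1 };
var c = { Type: 'order', customer_id: 1 };
var d = { Type: 'order', customer_id: 1 };

// Intermediate values for a+b and c+d, then the rereduce step.
var left = reduceCustomer(null, [a, b], false);
var right = reduceCustomer(null, [c, d], false);
var combined = reduceCustomer(null, [left, right], true);

// Reducing all four rows at once should give the same result.
var direct = reduceCustomer(null, [a, b, c, d], false);
```

Both combined and direct end up with a total of 3 and customer 1's document, which is the property a rereduce branch has to preserve.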



>>> Here is my reduce function:
>>>
>>> function(keys, values, rereduce) {
>>>      var score = 0;
>>>      var output_doc = {};
>>>
>>>      for (var i=0; i < values.length; i++) {
>>>              if (values[i].type == 'vote') {
>>>                      ++score;
>>>              } else if (values[i].type == 'resource') {
>>>                      output_doc = values[i];
>>>              }
>>>      }
>>>
>>>      return {doc:output_doc, score:score};
>>> }

First I think you need to implement the rereduce phase, otherwise you  
will get wrong numbers with large amounts of data.

Looking at your reduce function: as far as I remember, the error
message is triggered by a byte-length comparison between the incoming
and outgoing values of the reduce function. So if the incoming values
contain one very large resource document and several smaller votes,
the fact that you are returning the whole resource document might be
what gets in your way here. I think it would be better to return only
the document id and fetch that document in a separate call to the db.
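
Applied to the reduce function from the original post, those two suggestions might look like the sketch below. It assumes the map function from the original post (whole documents as values) and returns only the resource id and the count; reduceScore and the sample documents are invented for illustration.

```javascript
// Reduce with a rereduce branch that returns only an id and a count.
var reduceScore = function (keys, values, rereduce) {
  var score = 0;
  var resourceId = null;
  for (var i = 0; i < values.length; i++) {
    var v = values[i];
    if (rereduce) {
      // v is an earlier {id, score} result; add up counts, keep any id.
      score += v.score;
      if (v.id !== null) { resourceId = v.id; }
    } else if (v.type === 'vote') {
      score += 1;
    } else if (v.type === 'resource') {
      resourceId = v._id;
    }
  }
  return { id: resourceId, score: score };
};

// A large resource document plus two small votes for it.
var bigResource = {
  _id: 3,
  type: 'resource',
  name: 'resource 3',
  body: new Array(1001).join('x') // 1000 characters of filler
};
var votes = [
  { type: 'vote', resource_id: 3, user_id: 'foo' },
  { type: 'vote', resource_id: 3, user_id: 'bar' }
];

var first = reduceScore(null, [bigResource].concat(votes), false);
// The serialized result stays tiny however large bigResource is,
// because only the id and the count are returned.
var outputSize = JSON.stringify(first).length;
```

The resource documents themselves would then be fetched by id in a second request.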



And a little side note: at the moment you cannot order the view based  
on the value! Ordering is only done by keys!

You could however write another type of document (VoteCount) into your  
database containing the resource and the number of votes. Then  
emitting as key something like [ #votes, resource ] will give you an  
ordered view based on the number of votes. You could trigger the view  
update from the client each time a vote is made (i.e. add a vote  
document, call the view, update the VoteCount document and call the  
new view to get the ordered votes). You could also do this  
automatically on the CouchDB using update notifiers and simple Bash/ 
Python/Perl/whatever scripts...
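
A sketch of the view over such VoteCount documents. The document shape ({type: "vote_count", resource_id, votes}) and the name mapVoteCount are assumptions made for illustration; the emit() stub only exists so the code runs outside CouchDB.

```javascript
// Stand-in for CouchDB's emit(), only so the sketch runs outside CouchDB.
var emitted = [];
function emit(key, value) { emitted.push({ key: key, value: value }); }

// Map over the hypothetical VoteCount documents. The vote count comes
// first in the key, so the view is ordered by number of votes.
var mapVoteCount = function (doc) {
  if (doc.type === 'vote_count' && typeof doc.votes === 'number') {
    emit([doc.votes, doc.resource_id], null);
  }
};
```

Querying such a view with descending=true&limit=10 would then return the ten keys with the highest vote counts, since views are ordered by key.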

HTH
Daniel

Re: Sorting items by number of votes

Posted by Devon Weller <dw...@devonweller.com>.

Thanks Nathan.

Using a list is an interesting idea.  Although, I suspect that method  
would become inefficient for things like "give me the 10 resources  
with the most votes" when there are 10,000 resources in the database.

I think my solution will be to create a map reduce which just counts  
the votes by resource_id.  And then use that information to do a bulk  
request for the top 10 documents by ID.
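
The bulk request in the second step could be built along these lines, assuming the ids of the top resources have already been picked out of the counting view. bulkFetchRequest and the database name passed to it are hypothetical; the sketch only constructs the request and leaves sending it to whatever HTTP client the application uses.

```javascript
// Build a request that fetches several documents in one round trip by
// POSTing a list of keys to _all_docs with include_docs=true.
// dbName is a hypothetical database name supplied by the caller.
function bulkFetchRequest(dbName, ids) {
  return {
    method: 'POST',
    path: '/' + encodeURIComponent(dbName) + '/_all_docs?include_docs=true',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ keys: ids })
  };
}

// Example: ids taken from the top rows of the vote-count view.
var topRows = [{ key: '3', value: 2 }, { key: '2', value: 1 }];
var request = bulkFetchRequest('resources', topRows.map(function (row) {
  return row.key;
}));
```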




Regarding the example in the wiki:

As a new user, I just followed an example from the wiki: http://wiki.apache.org/couchdb/View_Snippets#aggregate_sum

Is that example an incorrect way of using a CouchDB view?  Should it  
be removed?


- Devon






On Nov 5, 2009, at 8:04 AM, Nathan Stott wrote:

> Reduces are not for joins.  A lot of people try that when they first  
> start
> using couch.  I tried it.
>
> This is more appropriate for a list.  If you feed your view into a  
> list,
> then you can have the list do the processing that you want and  
> create the
> 'joined' objects and emit them as JSON (or however else you like)
>

Re: Sorting items by number of votes

Posted by Nathan Stott <nr...@gmail.com>.

Reduces are not for joins.  A lot of people try that when they first start
using couch.  I tried it.

This is more appropriate for a list.  If you feed your view into a list,
then you can have the list do the processing that you want and create the
'joined' objects and emit them as JSON (or however else you like)

