Posted to user@couchdb.apache.org by Brian Candler <B....@pobox.com> on 2009/06/02 10:51:47 UTC

Re: Finding pairs of documents

On Thu, May 28, 2009 at 07:36:30PM +0200, laccolithgrunt@gmx.de wrote:
> I'm quite new to CouchDB and map/reduce and played a little around
> with it. Currently, I am trying to find pairs of two documents which
> fulfill a special condition. E.g., consider the following documents in
> the store (_id and _rev are left out): [{price: 1},{price: 5},{price:
> 9},{price: 1000}]
> 
> Now I would like to find all pairs of documents, where (doc1.price + 4
> == doc2.price). How can I express this in couchdb? I expect the
> following result: [[{price: 1},{price: 5}], [{price: 5},{price:9}]]

Not easily. You really want view intersections, which CouchDB doesn't yet
have, so you need to intersect in the client.

If you emit(doc.price,null) and emit(doc.price+4,null) then you'll get your
index ordered by price. This makes it quite efficient to do a merge on two
streams. Of course, you can just emit(doc.price,null) and do the +4
calculation in the client as well.
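
As a rough sketch of that last variant (emit only doc.price and do the +4 in
the client): the document ids, the local emit() stand-in and the findPairs()
helper below are made up for illustration and are not part of CouchDB. In a
real setup only the map function would live in the design document, and the
rows would come back from the view already sorted by key.

```javascript
// Sample documents from the question; the _id values are invented.
const docs = [
  { _id: "a", price: 1 },
  { _id: "b", price: 5 },
  { _id: "c", price: 9 },
  { _id: "d", price: 1000 },
];

// Stand-in for the view server: collects emitted rows (illustration only).
const rows = [];
let currentId = null;
function emit(key, value) {
  rows.push({ id: currentId, key: key, value: value });
}

// The map function as it would go into a design document.
function map(doc) {
  if (typeof doc.price === "number") {
    emit(doc.price, null);
  }
}

for (const doc of docs) {
  currentId = doc._id;
  map(doc);
}
rows.sort((a, b) => a.key - b.key); // a real view returns rows sorted by key

// Hypothetical client-side merge: walk the sorted rows with two cursors.
// Cursor i is the "price" stream, cursor j the "price + offset" stream.
// Assumes offset > 0 and numeric keys.
function findPairs(sortedRows, offset) {
  const pairs = [];
  let j = 0;
  for (let i = 0; i < sortedRows.length; i++) {
    const target = sortedRows[i].key + offset;
    // Targets never decrease, so j only ever moves forward.
    while (j < sortedRows.length && sortedRows[j].key < target) {
      j++;
    }
    // Collect every row whose key equals the target (handles duplicates).
    for (let k = j; k < sortedRows.length && sortedRows[k].key === target; k++) {
      pairs.push([sortedRows[i].id, sortedRows[k].id]);
    }
  }
  return pairs;
}

const pairs = findPairs(rows, 4);
console.log(JSON.stringify(pairs));
```

Because both cursors only move forward over the sorted index, the work is
roughly proportional to the number of rows plus the number of pairs found.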

If the number of documents is so large that you can't read the entire index
into RAM on the client, then it's possible to "stream" the view: CouchDB
inserts a newline after each row of the view output, so you can read the
view one "line" at a time. Have a look at CouchRest for an example of how
this is done.
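
A minimal sketch of the line-at-a-time idea, not CouchRest's actual code. It
assumes the view response has a header line ending in "rows":[ , then one
JSON row per line (each followed by a comma except the last), then a closing
]} line. The function names and the sample chunks are made up, and the chunks
stand in for whatever your HTTP client hands you as the response body arrives.

```javascript
// Parse one line of a view response. Returns the row object, or null for
// the header and footer lines. Assumes each row sits on its own line.
function parseViewLine(line) {
  const text = line.trim().replace(/,$/, "");
  if (!text.startsWith("{") || text.endsWith("[")) {
    return null; // footer ("]}"), blank line, or header ('..."rows":[')
  }
  try {
    return JSON.parse(text);
  } catch (e) {
    return null; // not a complete row; skipped in this sketch
  }
}

// Turn an iterable of text chunks into a stream of rows, keeping only the
// current partial line in memory.
function* rowsFromChunks(chunks) {
  let buffer = "";
  for (const chunk of chunks) {
    buffer += chunk;
    let nl;
    while ((nl = buffer.indexOf("\n")) !== -1) {
      const row = parseViewLine(buffer.slice(0, nl));
      buffer = buffer.slice(nl + 1);
      if (row !== null) {
        yield row;
      }
    }
  }
  const last = parseViewLine(buffer);
  if (last !== null) {
    yield last;
  }
}

// Invented sample response, deliberately split in the middle of a row.
const sampleChunks = [
  '{"total_rows":2,"offset":0,"rows":[\r\n{"id":"a","ke',
  'y":1,"value":null},\r\n{"id":"b","key":5,"value":null}\r\n]}\n',
];

for (const row of rowsFromChunks(sampleChunks)) {
  console.log(row.id, row.key);
}
```

Since the rows arrive in key order, a consumer like this can feed a merge
without ever holding the whole index.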

A reduce function isn't the right way to go; it won't work well unless the
number of documents you have is tiny, but if that's true, you can just do it
in the client trivially anyway. Using a reduce function you'd have to
generate something like

   {price1=>[docid1, docid2, docid3],
    price2=>[docid4, docid5],
    price3=>[docid6, docid7]}

but unfortunately the reduce value would grow linearly with the number of
docids, which is extremely inefficient. And you'd still need to search it in
the client to look for pairs of (price, price+4) anyway.

HTH,

Brian.