Posted to user@couchdb.apache.org by Jeff Hinrichs - DM&T <je...@dundeemt.com> on 2009/01/11 20:11:19 UTC

view intersections?

I've been reading and googling trying to figure out the proper way to
do an intersection of views.

The database has documents with a tags attribute (a list), e.g.
['copper','blue','hot','long','twisted']

If I wanted to find all documents that have both the 'copper' and
'blue' tags, what is the preferred way?  I could index every element of
the tags list, then perform two requests, key='copper' and
key='blue', and have the client do the intersection.  Is there a way to
have couchdb do the lifting on this one?

Along the same lines, what about the union of tags: 'copper' or 'blue'?

Re: view intersections?

Posted by Chris Anderson <jc...@gmail.com>.
On Sun, Jan 11, 2009 at 3:33 PM, Jeff Hinrichs - DM&T
<du...@gmail.com> wrote:
> This does bring a question to mind, why is this implemented as a POST
> and not a GET.  Wouldn't REST dictate that since it doesn't update the
> data a GET would be better?
>

It was implemented as a POST because limits on URL length make GET
impractical for a large number of keys.

I would fully support a patch that adds GET support via
?keys=JSON_ARRAY, but we can't drop POST without limiting ourselves
severely.
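
To make the URL-length point concrete, here is a minimal sketch in Python. It only builds strings and sends nothing. The ?keys=JSON_ARRAY form is the one proposed above, not something this thread confirms as implemented; the view URL is borrowed from the curl example elsewhere in the thread, and the helper names and the generated "tag-NNNNN" keys are made up for illustration.

```python
import json
from urllib.parse import urlencode, parse_qs, urlsplit

# View URL reused from the curl example in this thread (illustrative only).
VIEW_URL = "http://localhost:5984/delasco-tests/_view/tags/by_family_tagvalue"


def keys_get_url(keys):
    """Build a GET URL carrying the key list as a JSON array in ?keys=.

    This is the form proposed in the thread, treated here as hypothetical.
    """
    return VIEW_URL + "?" + urlencode({"keys": json.dumps(keys)})


def keys_post_body(keys):
    """Build the JSON body that the POST form of the same query sends."""
    return json.dumps({"keys": keys}).encode("utf-8")


# Two keys fit comfortably in a URL; two thousand made-up keys do not.
few = keys_get_url(["short", "red"])
many = keys_get_url(["tag-%05d" % i for i in range(2000)])

print("GET URL length with 2 keys:   ", len(few))
print("GET URL length with 2000 keys:", len(many))
print("POST body for 2 keys:         ", keys_post_body(["short", "red"]))
```

The long URL runs to tens of thousands of characters, which is more than many servers and proxies are willing to accept, while the POST body has no comparable limit.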



-- 
Chris Anderson
http://jchris.mfdz.com

Re: view intersections?

Posted by Jeff Hinrichs - DM&T <du...@gmail.com>.

OK, now that I have reread the docs I understand things I missed
before.  CouchDB 0.9 can do a union query on a view.  From the Query
Options section of http://wiki.apache.org/couchdb/HTTP_view_API:

"A JSON structure of {"keys": ["key1", "key2", ...]} can be posted to
any user defined view or _all_docs to retrieve just the view rows
matching that set of keys. Rows are returned in the order of the keys
specified. Combining this feature with include_docs=true results in
the so-called multi-document-fetch feature. "

I've verified that

  curl -X POST --data '{"keys":["short","red"]}' \
    http://localhost:5984/delasco-tests/_view/tags/by_family_tagvalue

returns a union of key="short" or key="red" over the view.  This is
great because it means someone smart is already thinking about this.
Now if this thread can just convince those people to take the next
step and cover intersections of multiple keys on a view in the same
spirit!
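
For anyone scripting this rather than using curl, the following is a rough Python sketch of the same multi-key POST. It prepares the request but does not send it, so it runs without a server. The Content-Type header is my addition (the curl command above does not send one), the helper names are hypothetical, and the sample rows are hand-written stand-ins that assume the usual id/key/value shape of view rows. A document carrying both tags shows up once per matching key, so the sketch also collapses rows into unique ids.

```python
import json
from urllib.request import Request

# Same view URL as in the curl command above.
VIEW_URL = "http://localhost:5984/delasco-tests/_view/tags/by_family_tagvalue"


def build_keys_request(keys):
    """Prepare (but do not send) a POST with a {"keys": [...]} JSON body."""
    body = json.dumps({"keys": keys}).encode("utf-8")
    return Request(
        VIEW_URL,
        data=body,
        method="POST",
        # Assumption: not part of the original curl command.
        headers={"Content-Type": "application/json"},
    )


def union_ids(rows):
    """Collapse view rows into unique document ids, keeping first-seen order."""
    seen = set()
    ordered = []
    for row in rows:
        if row["id"] not in seen:
            seen.add(row["id"])
            ordered.append(row["id"])
    return ordered


# Hand-written rows standing in for a response; doc "a" has both tags.
sample_rows = [
    {"id": "a", "key": "short", "value": None},
    {"id": "b", "key": "short", "value": None},
    {"id": "a", "key": "red", "value": None},
]

req = build_keys_request(["short", "red"])
print(req.get_method(), req.full_url)
print(req.data)
print(union_ids(sample_rows))
```

Passing the prepared request to urllib.request.urlopen would actually send it; that step is left out here because it needs a running CouchDB.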

This does bring a question to mind: why is this implemented as a POST
and not a GET?  Wouldn't REST dictate that, since it doesn't update the
data, a GET would be better?

answering-some-of-my-own'ly,

Jeff

Re: view intersections?

Posted by Jeff Hinrichs - DM&T <du...@gmail.com>.
On Sun, Jan 11, 2009 at 2:17 PM, Dean Landolt <de...@deanlandolt.com> wrote:
> Two requests, then merge on the client. It's not really a pattern that fits
> the map/reduce paradigm well. I don't know the status of the fti
> integration, but once that goes down there should be more efficient ways to
> handle this.
>
True enough that it doesn't fit the map/reduce paradigm, but the
intersection would be performed after map/reduce.  key, startkey,
endkey and limit are not part of the map/reduce picture either; they
appear to be implemented separately and operate on the results of the
view that map/reduce created.  This would be an enhancement to querying
the view, not to generating the view itself.

Feel free to slap me down, as I'm speaking as someone who has not
looked at the source, is fairly new to couch and has limited
experience (I've got thick skin and a desire to learn :).
However, set intersections and, to a lesser degree, joins are a
common and useful idiom.  Views are sets, and couchdb already supports
limited set operations through simple sub-select operations.  The
reason, I'm guessing, is that it is a natural idiom and frees the
client from doing the work.  Intersections are a natural progression
of view querying, in my opinion.

trying-to-get-someone-else-to-do-my-work'ly,
-Jeff

Re: view intersections?

Posted by Dean Landolt <de...@deanlandolt.com>.
On Sun, Jan 11, 2009 at 2:11 PM, Jeff Hinrichs - DM&T <je...@dundeemt.com> wrote:

> I've been reading and googling trying to figure out the proper way to
> do an intersection of views.
>
> The database has documents with an attribute of tags (a list)
> ['copper','blue','hot','long','twisted']
>
>
> If I wanted to find all documents that had the tags of 'copper' and
> 'blue' what is the preferred way?  I could index all the elements of
> the tags list and then perform two requests key='copper' and
> key='blue' and have the client do the intersection.  Is there a way to
> have couchdb do the lifting on this one?
>
> Along the same line, what of the union of tags. 'copper' or 'blue'
>

Two requests, then merge on the client. It's not really a pattern that fits
the map/reduce paradigm well. I don't know the status of the FTI (full-text
indexing) integration, but once that lands there should be more efficient
ways to handle this.
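
A minimal sketch of the "two requests, then merge on the client" approach, written in Python with an in-memory stand-in for the view so it runs without a server. Real CouchDB map functions are JavaScript; map_tags below only mimics a view that emits one row per tag. The documents and every function name are invented for illustration.

```python
from collections import defaultdict

# Made-up documents; only doc1 carries both 'copper' and 'blue'.
DOCS = [
    {"_id": "doc1", "tags": ["copper", "blue", "hot", "long", "twisted"]},
    {"_id": "doc2", "tags": ["copper", "red"]},
    {"_id": "doc3", "tags": ["blue", "short"]},
]


def map_tags(doc):
    """Mimic a view map step: yield one row per element of doc['tags']."""
    for tag in doc.get("tags", []):
        yield {"id": doc["_id"], "key": tag, "value": None}


def build_index(docs):
    """Group emitted rows by key, standing in for the stored view."""
    index = defaultdict(list)
    for doc in docs:
        for row in map_tags(doc):
            index[row["key"]].append(row)
    return index


def ids_for_key(index, key):
    """One 'request': the ids of all rows whose key equals `key`."""
    return {row["id"] for row in index.get(key, [])}


def docs_with_all(index, keys):
    """Client-side intersection: ids that appear under every key."""
    id_sets = [ids_for_key(index, k) for k in keys]
    return set.intersection(*id_sets) if id_sets else set()


def docs_with_any(index, keys):
    """Client-side union: ids that appear under at least one key."""
    return set().union(*(ids_for_key(index, k) for k in keys))


index = build_index(DOCS)
print(sorted(docs_with_all(index, ["copper", "blue"])))
print(sorted(docs_with_any(index, ["copper", "blue"])))
```

Against a live database, ids_for_key would be replaced by one view request per tag; the set operations on the returned ids stay the same.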