Posted to user@couchdb.apache.org by Norman Barker <no...@gmail.com> on 2010/08/02 19:25:32 UTC

Re: querying multiple views

Multiview is available at

http://github.com/normanb/couchdb-multiview

I have moved this discussion to dev; any feedback is appreciated.

On Fri, Jul 23, 2010 at 10:44 AM, afshin afzali <a....@gmail.com> wrote:
> Hi Norman,
>
> As Chris said, it's a valuable patch.
>
> Cheers,
> -- afshin
>
> On Fri, Jul 23, 2010 at 3:57 AM, Norman Barker <no...@gmail.com> wrote:
>
>> ok, I have this working in practice, though there are still many
>> improvements that could be made. Sorry for the delay, 1.0
>> experimenting took some of my time last week (yay for 1.0!!).
>>
>> Since couchdb now has an eccn number and ssl components this will take
>> me a little longer to release this code, but it will go out for
>> comment shortly.
>>
>> I did sort the views from smallest to largest (by seeing the number of
>> hits in each view) and then I used the process ring example where you
>> send a message in a ring. Each node in the ring represents a view: if
>> there is an intersection it sends a message to the next node, and if
>> the message gets back to the start it is sent as a chunk to the user.
>> If the match fails then the ring stops (actually it sends an empty
>> message back to the main node).
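
The ring described above can be sketched in a few lines. This is only an illustration, written in Python rather than Erlang so it runs without a CouchDB source tree; threads and queues stand in for Erlang processes and mailboxes. The names `node`, `ring_intersection` and `STOP` are hypothetical, each view is reduced to a plain list of doc ids, and letting the main node feed candidates from the smallest view is an assumption, not a description of the multiview code.

```python
import queue
import threading

STOP = object()  # sentinel that tells every node in the ring to shut down


def node(doc_ids, inbox, next_inbox, main_inbox):
    """One ring node per view: pass a candidate on only if this view has it."""
    while True:
        msg = inbox.get()
        if msg is STOP:
            next_inbox.put(STOP)   # propagate shutdown around the ring
            return
        if msg in doc_ids:
            next_inbox.put(msg)    # intersection so far, hand to the next node
        else:
            main_inbox.put(None)   # match failed: empty message to the main node


def ring_intersection(views):
    """Yield doc ids present in every view (assumes at least one view)."""
    ordered = sorted(views, key=len)           # smallest view first
    driver, others = ordered[0], ordered[1:]
    main_inbox = queue.Queue()
    # The last node's "next" inbox is the main node, which closes the ring.
    inboxes = [queue.Queue() for _ in others] + [main_inbox]
    threads = [
        threading.Thread(
            target=node,
            args=(set(view), inboxes[i], inboxes[i + 1], main_inbox),
            daemon=True,
        )
        for i, view in enumerate(others)
    ]
    for t in threads:
        t.start()
    try:
        for candidate in dict.fromkeys(driver):   # skip duplicate ids
            inboxes[0].put(candidate)
            reply = main_inbox.get()              # candidate or empty message
            if reply is not None:
                yield reply                       # made it round the ring
    finally:
        inboxes[0].put(STOP)
        for t in threads:
            t.join()


views = [["a", "b", "c", "d"], ["b", "d"], ["d", "b", "x"]]
print(list(ring_intersection(views)))  # prints ['b', 'd']
```

Because the function is a generator, each id can be written to the client as soon as it has travelled the full ring, so the server never holds the whole result set.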
>>
>> Interesting problem, but I am sure this is just a start, handling
>> externals which return all their data in one blob (as they don't
>> stream) is also an issue.
>>
>> I will work on getting the code out.
>>
>> Norman
>>
>> On Sun, Jul 11, 2010 at 11:13 AM, afshin afzali <a....@gmail.com>
>> wrote:
>> > Thank You Very Much Guys.
>> >
>> > -- afshin
>> >
>> > On Sun, Jul 11, 2010 at 7:27 PM, J Chris Anderson <jc...@gmail.com> wrote:
>> >
>> >>
>> >> On Jul 11, 2010, at 7:27 AM, Norman Barker wrote:
>> >>
>> >> > Afshin
>> >> >
>> >> > I have got the all clear from my work to release this as a patch, I
>> >> > expect to be putting something up on github by the end of the week
>> >> > (internal paperwork permitting). I am going to implement it as an
>> >> > external handler so it can be used and reviewed and from there it will
>> >> > be under a do what you want with it license so it can go into couchdb
>> >> > if accepted.
>> >> >
>> >>
>> >> If you're implementing in Erlang, you probably don't need to go full hog
>> >> as an external.
>> >>
>> >> Because the configuration system is so modular, you are probably best
>> >> adding it as a new httpd_design_handler, or httpd_db_handler.
>> >>
>> >> It should be easy to create a new module and link it in via the
>> >> configuration file.
>> >>
>> >> I don't know if you plan to allow querying across databases (I'd suggest
>> >> restricting the queries to a single database if you want to stay within
>> >> CouchDB's security model, and have it more likely to be accepted as a
>> >> patch.)
>> >>
>> >> We should really move this discussion to dev@ -- a lot of the developers
>> >> only give a cursory glance at the user list, so you will get more valuable
>> >> feedback there.
>> >>
>> >> Thanks for taking the time to write and release the patch!
>> >>
>> >> Chris
>> >>
>> >> > Chris, thanks for the help with the reduce function and confirming the
>> >> > concept!
>> >> >
>> >> > Norman
>> >> >
>> >> > On Sun, Jul 11, 2010 at 8:02 AM, afshin afzali <a.afzali2003@gmail.com> wrote:
>> >> >> Hi Norman, Chris
>> >> >>
>> >> >> I just wanted to say this is the same problem we are currently
>> >> >> facing. We are implementing a Local Business Directory application on
>> >> >> couchdb. Our searches need to combine several keys to find the
>> >> >> right entries. Doing that on the server side ran into the paging
>> >> >> mechanism problem, so we have chosen to run Norman's algorithm on
>> >> >> the client side! I would appreciate any progress on this issue.
>> >> >>
>> >> >> BEST,
>> >> >> -- afshin
>> >> >>
>> >> >> On 7/8/10, J Chris Anderson <jc...@apache.org> wrote:
>> >> >>>
>> >> >>> On Jul 8, 2010, at 10:43 AM, Norman Barker wrote:
>> >> >>>
>> >> >>>> Hi,
>> >> >>>>
>> >> >>>> I have been thinking about how to query multiple views at one time.
>> >> >>>>
>> >> >>>> I have an erlang handler in couchdb that takes an http post containing
>> >> >>>> N view queries; each query contains a startkey and an endkey. I then
>> >> >>>> open up each view in parallel (using pmap) and accumulate the doc ids,
>> >> >>>> then I use the erlang sets module to get the unique values. All good
>> >> >>>> and looks pretty (and works), though it doesn't scale since I am
>> >> >>>> holding all the results on the server (potential memory overload!)
>> >> >>>> whereas I would like to stream the results to the client one by one.
>> >> >>>>
>> >> >>>> I am thinking of doing the following but have some questions;
>> >> >>>>
>> >> >>>> My first question is when I do
>> >> >>>>
>> >> >>>> couch_view:fold(View, FoldlFun, FoldAccInit,
>> >> >>>> couch_httpd_view:make_key_options(Args)),
>> >> >>>>
>> >> >>>> is there a way to call the _count reduce function in code to find the
>> >> >>>> number of rows in the slice between startkey and endkey?
>> >> >>>>
>> >> >>>> If so, I would like to order all the views in the posted query
>> >> >>>> document by the result of _count from smallest to largest.
>> >> >>>>
>> >> >>>> I would then fold over the smallest result view and pull each document
>> >> >>>> id (*) in turn.
>> >> >>>>
>> >> >>>> With each document id I would then call each of the other views in
>> >> >>>> turn with their startkey and endkey and in addition include
>> >> >>>> startkey_docid and endkey_docid with the docid in * above; again
>> >> >>>> calling _count I can test for inclusion. If the doc id is in all views
>> >> >>>> then I will immediately stream this to the client.
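
The plan above (count each slice, start from the smallest, test every candidate against the other views, stream matches) can be modelled as follows. This is a sketch under stated assumptions, in Python rather than Erlang: each view is a list of `(key, doc_id)` rows sorted by key, `count` stands in for the _count reduce, and `contains` is a hypothetical inclusion test done by scanning the slice rather than by a real startkey_docid/endkey_docid query.

```python
from bisect import bisect_left, bisect_right


def slice_bounds(rows, startkey, endkey):
    """Index range of the rows whose key lies in [startkey, endkey]."""
    keys = [key for key, _ in rows]
    return bisect_left(keys, startkey), bisect_right(keys, endkey)


def count(rows, startkey, endkey):
    """Number of rows in the slice (stand-in for the _count reduce)."""
    lo, hi = slice_bounds(rows, startkey, endkey)
    return hi - lo


def contains(rows, startkey, endkey, doc_id):
    """Hypothetical inclusion test: is doc_id anywhere in the slice?"""
    lo, hi = slice_bounds(rows, startkey, endkey)
    return any(d == doc_id for _, d in rows[lo:hi])


def stream_intersection(queries):
    """queries: list of (rows, startkey, endkey). Yields ids one at a time."""
    ordered = sorted(queries, key=lambda q: count(*q))   # smallest slice first
    (rows, startkey, endkey), rest = ordered[0], ordered[1:]
    lo, hi = slice_bounds(rows, startkey, endkey)
    seen = set()
    for _, doc_id in rows[lo:hi]:          # fold over the smallest slice
        if doc_id in seen:
            continue
        seen.add(doc_id)
        if all(contains(r, s, e, doc_id) for r, s, e in rest):
            yield doc_id                   # in every view: stream it now


by_city = [("berlin", "d1"), ("berlin", "d2"), ("paris", "d3")]
by_type = [("cafe", "d2"), ("cafe", "d3"), ("shop", "d1")]
queries = [(by_city, "berlin", "berlin"), (by_type, "cafe", "cafe")]
print(list(stream_intersection(queries)))  # prints ['d2']
```

Starting from the smallest slice bounds the number of inclusion tests by the size of the most selective view, and only the `seen` set of that slice is kept in memory.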
>> >> >>>>
>> >> >>>> Am I doing something stupid, is this optimal?
>> >> >>>>
>> >> >>>
>> >> >>> It sounds like you are on the right track. This could be a very valuable
>> >> >>> patch to CouchDB once you have it working.
>> >> >>>
>> >> >>>> Any help with the programmatic _count call would be great.
>> >> >>>>
>> >> >>>
>> >> >>> One hint: maybe the call to reduce_to_count will help.
>> >> >>>
>> >> >>> Here's an implementation of a reduce query in Erlang.
>> >> >>>
>> >> >>> http://github.com/jchris/hovercraft/blob/master/hovercraft.erl#L217
>> >> >>>
>> >> >>> Sorry I can't be more helpful. I've successfully bootstrapped this stuff in
>> >> >>> my head before, but it always takes a couple of hours of turning my brain
>> >> >>> into a step debugger.
>> >> >>>
>> >> >>> Good luck!
>> >> >>>
>> >> >>> Once you get deeper into the code you might have better luck getting
>> >> >>> responses on the dev@ list or maybe the #couchdb IRC channel on freenode.
>> >> >>>
>> >> >>> Chris
>> >> >>>
>> >> >>>
>> >> >>>> thanks,
>> >> >>>>
>> >> >>>> Norman
>> >> >>>
>> >> >>>
>> >> >>
>> >>
>> >>
>> >
>>
>