Posted to user@couchdb.apache.org by afters <af...@gmail.com> on 2011/05/22 20:12:50 UTC

DB that mirrors a view

Hi guys,

I'm taking a serious look at building a db that mirrors the contents of a
view (for running map on the results of map-reduce).

The hurdle I'm facing is tracking changes in the view:
It's possible to listen to the db's changes feed and deduce which view keys
should be updated for a particular seqnum, then query the view with those
keys. Unfortunately, the view only returns its latest results, which may
easily incorporate changes from later seqnums, making my mirroring db
inconsistent with the view.

Any thoughts? Has anyone else tried turning a view into a db?
It would be useful if I could somehow freeze the view's updates (I guess
I could do that by replicating its db only up to a certain point, but that
would mean a whole other db).

Thanks,
  a.

Re: DB that mirrors a view

Posted by afters <af...@gmail.com>.
Come to think of it, maybe a simple way to achieve this is with an
intermediate db:
This db would contain just the calculated map values, and would have a view
that does the reduce. This would mimic the original map-reduce operation
with one key difference: I would control this db's update rate, and thus
could fetch a correct view for each change in the original db.
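
One way to picture the intermediate db is sketched below. It is only an illustration under assumptions that are not in the thread: each emitted map row is stored as its own document, the document id is built from the source doc id and the row index, and the design document name `mirror` and view name `reduced` are made up. The reduce shown is the built-in `_sum`; the original view's reduce would go there instead.

```javascript
// Hypothetical helper: turn the rows emitted for one source doc into
// documents for the intermediate db (one document per emitted row).
function rowsToIntermediateDocs(sourceId, rows) {
  return rows.map(function (row, i) {
    return {
      _id: sourceId + ":" + i, // assumed id scheme, not from the thread
      source: sourceId,
      key: row.key,
      value: row.value
    };
  });
}

// Hypothetical design document for the intermediate db: the map simply
// re-emits the stored key/value, and the reduce does the aggregation.
var intermediateDdoc = {
  _id: "_design/mirror",
  views: {
    reduced: {
      map: "function (doc) { if (doc.source) { emit(doc.key, doc.value); } }",
      reduce: "_sum"
    }
  }
};

// Invented sample rows for illustration only.
var docs = rowsToIntermediateDocs("doc-42", [
  { key: ["2011", "05"], value: 3 },
  { key: ["2011", "06"], value: 1 }
]);
console.log(JSON.stringify(docs, null, 2));
```

Rows that a changed doc no longer emits would also have to be deleted from the intermediate db; that bookkeeping is left out of the sketch.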



Re: DB that mirrors a view

Posted by afters <af...@gmail.com>.
It would save some effort, but probably wouldn't help in fetching the
corresponding reduce values.

On 23 May 2011 09:44, Benoit Chesneau <bc...@gmail.com> wrote:

>
> Hi,
>
> Indeed, in trunk and 1.1 you can pass filter=_view to _changes, along
> with a view param view=dname/vname. That gives you only the changes for
> docs that the view's map function emits, so you can tell whether your
> view has been updated. Maybe that can help.
>
> - benoît
>

Re: DB that mirrors a view

Posted by Benoit Chesneau <bc...@gmail.com>.
On Mon, May 23, 2011 at 7:55 AM, afters <af...@gmail.com> wrote:
> Benoit, if you're reading this I'd love to hear more about your fork.
Hi,

Indeed, in trunk and 1.1 you can pass filter=_view to _changes, along
with a view param view=dname/vname. That gives you only the changes for
docs that the view's map function emits, so you can tell whether your
view has been updated. Maybe that can help.
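
For illustration, the request described above could be built like this. The host, the database name `mydb`, and the `since` value are placeholders; `filter=_view` and `view=dname/vname` are the parameters named in the message, and `viewChangesUrl` is a hypothetical helper.

```javascript
// Build a _changes URL that uses the built-in _view filter.
// baseUrl, db and since are placeholder values for this sketch.
function viewChangesUrl(baseUrl, db, designName, viewName, since) {
  var query = [
    "filter=_view",
    "view=" + encodeURIComponent(designName + "/" + viewName),
    "since=" + encodeURIComponent(String(since))
  ].join("&");
  return baseUrl + "/" + encodeURIComponent(db) + "/_changes?" + query;
}

var url = viewChangesUrl("http://localhost:5984", "mydb", "dname", "vname", 0);
console.log(url);
```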

- benoît

Re: DB that mirrors a view

Posted by afters <af...@gmail.com>.
Jason,

Thanks for the detailed answer! I've actually written code along the same
lines as yours (only with eval instead of 'new Function' - thanks for the
tip).

I'm actually interested in the reduce results, rather than the map results.
It's still possible to deduce the affected reduce-keys for each change, just
by using the map function: just clip the emitted map key according to the
desired group_level. Alas, it's not possible to deduce the query results -
as reduce depends also on previous changes.
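
The key clipping described above might look like the following sketch. It assumes that array keys are truncated to their first `group_level` elements, that non-array keys are grouped by exact value, and that level 0 means a single overall reduce row; `clipKey` and `affectedReduceKeys` are hypothetical names.

```javascript
// Clip an emitted map key to the key a grouped reduce query would report.
// Assumption: array keys are truncated to their first groupLevel elements,
// non-array keys are grouped by exact value, and level 0 means one
// overall reduce row (key null).
function clipKey(key, groupLevel) {
  if (groupLevel === 0) {
    return null;
  }
  if (Array.isArray(key)) {
    return key.slice(0, groupLevel);
  }
  return key;
}

// Collect the distinct reduce keys touched by a list of emitted map keys.
function affectedReduceKeys(emittedKeys, groupLevel) {
  var seen = {};
  var result = [];
  emittedKeys.forEach(function (key) {
    var clipped = clipKey(key, groupLevel);
    var id = JSON.stringify(clipped);
    if (!seen[id]) {
      seen[id] = true;
      result.push(clipped);
    }
  });
  return result;
}

// Invented sample keys for illustration only.
var keys = affectedReduceKeys(
  [["2011", "05", "22"], ["2011", "05", "23"], ["2011", "06", "01"]],
  2
);
console.log(JSON.stringify(keys));
```

The keys emitted by the previous revision of a doc are affected as well, so a complete tracker would need those too.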

If only I could prevent the view from being updated for a little while, I
could read the changes feed till I'm in line with the view's seqnum...

Benoit, if you're reading this I'd love to hear more about your fork.

 a.


Re: DB that mirrors a view

Posted by Jason Smith <jh...@iriscouch.com>.
I think Benoit has a fork of CouchDB to do this. Also I believe this
feature (a _changes filter that only shows docs that change a view) is
possibly coming in the future.

As a side-note, view queries support an ?update_seq=true parameter,
which will tell you the seqnum of the database which that view
represents.
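
As a rough sketch of how that parameter might be used: query the view with update_seq=true, then treat only the changes up to the reported seqnum as covered by the result. The response shape (`update_seq` next to `rows`) and integer seqnums are assumptions here, `splitByViewSeq` is a hypothetical helper, and the sample data is invented.

```javascript
// Split a batch of _changes rows into those already reflected in a view
// response and those that are newer than the view.
// Assumes integer seqnums and a view response carrying update_seq.
function splitByViewSeq(changes, viewResponse) {
  var viewSeq = viewResponse.update_seq;
  return {
    covered: changes.filter(function (c) { return c.seq <= viewSeq; }),
    pending: changes.filter(function (c) { return c.seq > viewSeq; })
  };
}

// Invented sample data for illustration only.
var viewResponse = { update_seq: 12, rows: [{ key: ["2011", "05"], value: 7 }] };
var changes = [
  { seq: 11, id: "a" },
  { seq: 12, id: "b" },
  { seq: 13, id: "c" }
];

var split = splitByViewSeq(changes, viewResponse);
console.log(split.covered.length + " covered, " + split.pending.length + " pending");
```

This does not freeze the view; it only tells the caller which seqnum the returned rows correspond to.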

Anyway, probably the best bang-for-buck with NodeJS is simply to
download the design document and run the map() functions yourself.
They are purely functional, and they have no (or perhaps well-known)
side-effects.

The basic idea is:

1. Fetch the design document as JSON
2. Extract the .views.whatever.map value as a string
3. Build a wrapper map() function that has a custom emit()
4. Run it for every _changes update

Below is my basic idea. I think I've seen it elsewhere, perhaps in
Benoit's fork. But this is with Node. It could use some optimization
too (don't re-define map() every function call).

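// get_ddoc() and changes are placeholders: fetch the design document and
// the _changes rows (with include_docs=true) however you like.
var ddoc = get_ddoc();
var couch_map = ddoc.views.someview.map; // map function source, as a string

// Custom emit() that records every key the map function emits.
var changed_keys = [];
function emit(key, val) {
  console.log("Emit! " + JSON.stringify(key));
  changed_keys.push(key);
}

// Wrap the map source so that it sees our emit() as a parameter.
var map = new Function("doc, emit", "var couch_map = " + couch_map +
  "; couch_map(doc);");

// Run the map function for every change.
changes.forEach(function(change) {
  map(change.doc, emit);
});

console.log("These keys were emitted: " + JSON.stringify(changed_keys));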
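
A self-contained variant of the snippet above, sketched with the optimization mentioned earlier (the map source is compiled once rather than on every call). The design document and change rows are invented sample data, and `compileMap` is a hypothetical helper name.

```javascript
// Compile a CouchDB map function (given as source text) once, and return
// a function that runs it against a doc and returns the emitted rows.
function compileMap(mapSource) {
  var collected = [];
  function emit(key, value) {
    collected.push({ key: key, value: value });
  }
  // The generated function receives emit as a parameter, so the map
  // source can call emit() as it would inside CouchDB.
  var mapFn = new Function("emit", "return (" + mapSource + ");")(emit);
  return function runMap(doc) {
    collected = [];
    mapFn(doc);
    return collected;
  };
}

// Invented sample design document and change rows.
var ddoc = {
  views: {
    someview: {
      map: "function (doc) { if (doc.type === 'post') { emit([doc.author, doc.date], 1); } }"
    }
  }
};
var changes = [
  { seq: 1, doc: { _id: "a", type: "post", author: "ann", date: "2011-05-22" } },
  { seq: 2, doc: { _id: "b", type: "comment" } }
];

var runMap = compileMap(ddoc.views.someview.map);
var changedKeys = [];
changes.forEach(function (change) {
  runMap(change.doc).forEach(function (row) {
    changedKeys.push(row.key);
  });
});
console.log("These keys were emitted: " + JSON.stringify(changedKeys));
```

Since this evaluates whatever source the design document contains, it should only be pointed at design documents you trust.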




-- 
Iris Couch