Posted to dev@couchdb.apache.org by Garren Smith <ga...@apache.org> on 2019/05/28 09:26:43 UTC

[DISCUSS] Views and view signatures

Hi Everyone,

Traditionally every design doc has a view signature (an MD5 checksum), which
is computed from the views defined in it along with the options and query
language.

As Paul mentions on the map index RFC [1], the reason that indexes were
grouped together was to save space on the id tree index and to guarantee
that specific indexes are always up to date at the same time. Currently it
is also more efficient to service multiple map/reduce functions at once
when sending docs to the JavaScript query engine.

With the move to FoundationDB, we don’t get the added space saving of
grouping view indexes together for the id tree index. And to guarantee that
all the views in a design doc are updated to the same point, we would have
to update multiple views in the same transaction, which would lead to larger
transactions and the possibility of exceeding the 10 MB transaction limit.

Because of that, I would like to separate each index in a design document
so that it has its own unique signature and is built on its own. That way
only a specific index’s key/values will be updated in
a transaction. It also would mean that when a design doc is updated with
new views or some of the views are changed, the unchanged views won’t need
to be rebuilt as their view signatures would remain the same.
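
To make the difference concrete, here is a minimal sketch (in Python, not
CouchDB's actual Erlang implementation) comparing one signature per design
doc with one signature per view. The helper names group_signature and
view_signatures, and the choice of hashing a canonical JSON encoding, are
assumptions made for illustration only.

```python
import hashlib
import json


def _digest(obj):
    # Hash a canonical JSON encoding so that key order does not matter.
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def group_signature(ddoc):
    # One signature for the whole design doc: any change to any view
    # changes the signature, so every view would be rebuilt.
    return _digest({
        "views": ddoc.get("views", {}),
        "options": ddoc.get("options", {}),
        "language": ddoc.get("language", "javascript"),
    })


def view_signatures(ddoc):
    # Hypothetical per-view scheme: each view is hashed on its own,
    # together with the design doc's options and language.
    options = ddoc.get("options", {})
    language = ddoc.get("language", "javascript")
    return {
        name: _digest({"view": view, "options": options, "language": language})
        for name, view in ddoc.get("views", {}).items()
    }


ddoc_v1 = {
    "language": "javascript",
    "views": {
        "by_name": {"map": "function(doc) { emit(doc.name, null); }"},
        "by_age": {"map": "function(doc) { emit(doc.age, null); }"},
    },
}

# Second revision: by_age is changed, by_city is added, by_name is untouched.
ddoc_v2 = {
    "language": "javascript",
    "views": {
        "by_name": {"map": "function(doc) { emit(doc.name, null); }"},
        "by_age": {"map": "function(doc) { emit(doc.age, doc.name); }"},
        "by_city": {"map": "function(doc) { emit(doc.city, null); }"},
    },
}

old_sigs = view_signatures(ddoc_v1)
new_sigs = view_signatures(ddoc_v2)
unchanged = [name for name, sig in new_sigs.items() if old_sigs.get(name) == sig]
```

In this sketch the group signature differs between the two revisions, while
the per-view signature of by_name stays the same, so only by_age and by_city
would need to be built.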

We could look at adding a view group, which contains a collection of views;
most likely the views in one design document would form a view group. So
instead of updating one view in the background job queue, all the indexes
in a view group would be built by that job. They would not be updated in
the same transaction, but the job would only be considered complete once
all the views in the view group were updated.
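
A rough sketch of that job, again in Python and purely illustrative: the
ViewIndex class, update_view and run_view_group_job are hypothetical names,
and each batch in update_view only stands in for a separate transaction
rather than using any real FoundationDB API.

```python
from dataclasses import dataclass, field


@dataclass
class ViewIndex:
    signature: str
    update_seq: int = 0                       # last change applied to this view
    rows: dict = field(default_factory=dict)  # doc id -> emitted rows


def update_view(view, map_fn, changes, batch_size=2):
    """Bring one view up to date; each batch models one transaction."""
    pending = [(seq, doc) for seq, doc in changes if seq > view.update_seq]
    transactions = 0
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        # Everything inside this loop body would be one transaction that
        # touches only this view's key/values.
        for _seq, doc in batch:
            view.rows[doc["_id"]] = list(map_fn(doc))
        view.update_seq = batch[-1][0]
        transactions += 1
    return transactions


def run_view_group_job(views, map_fns, changes):
    """Build every view in the group; done only when all have caught up."""
    target_seq = changes[-1][0] if changes else 0
    for name, view in views.items():
        update_view(view, map_fns[name], changes)
    return all(view.update_seq >= target_seq for view in views.values())


changes = [
    (1, {"_id": "a", "n": 1}),
    (2, {"_id": "b", "n": 2}),
    (3, {"_id": "c", "n": 3}),
]
map_fns = {
    "by_n": lambda doc: [(doc["n"], None)],
    "by_id": lambda doc: [(doc["_id"], None)],
}
views = {
    "by_n": ViewIndex("sig-1"),
    "by_id": ViewIndex("sig-2", update_seq=2),  # already partly built
}
complete = run_view_group_job(views, map_fns, changes)
```

Between batches the views in a group can sit at different update sequences,
which is the trade-off compared with updating the whole group in a single
transaction.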

Where I’m not 100% sure this would work is around the JavaScript query
engine. This change would mean that only one function would be processed
at a time, so it could be a lot less efficient. Ideally we would update
how the JavaScript query engine works so it can handle this case better.

The usage of view signatures is pretty new to me, so I would appreciate
feedback from people that know more about the history of view signatures
and design docs, along with people that use views and rely on indexes in a
design doc all being updated at once.

Cheers
Garren

[1]
https://github.com/apache/couchdb-documentation/pull/410#pullrequestreview-236827606

Re: [DISCUSS] Views and view signatures

Posted by Adam Kocoloski <ko...@apache.org>.
View signatures and design documents have been around a *long* time, as has the deployment pattern around building the new version of the index in the background to minimize production downtime. I don’t think anything substantial has changed on this front in several years.

I think it’s likely that more sophisticated users specifically rely on the behavior that all of the views in a view group are updated together, so that stale=ok requests to different views in a group will all show the state of a database at the same point in time. But I’m also quite eager to hear feedback.

I get the desire to avoid hitting the 10 MB limit and to maximize the ability to work with all existing design documents; on the other hand, I think we’d consider any view group emitting more than 10 MB of index data *for a single document* to be quite a bad design pattern.

All in all - a tricky topic. Thanks for bringing it up.

Adam
