Posted to user@couchdb.apache.org by Patrick Barnes <mr...@gmail.com> on 2010/08/26 15:40:26 UTC

Seamless view rebuilding?

I have a database serving documents through a number of intermediary 
application servers, to the users' web browsers.

There are two mechanisms by which documents are modified:
a) Piecemeal updates as a result of user actions (i.e. adding or
updating a record).
b) Bulk updates, typically from import scripts, that might modify tens
of thousands of documents at once.

The problem I'm having is that when a set of bulk updates goes through,
it can take a long time to rebuild the view indexes. Meanwhile, several
users' web requests will time out until rebuilding is complete.

Querying with stale=ok is a simple solution to the bulk problem, but the
application servers also expect to be able to update documents and
retrieve the changes immediately afterwards.

Is there a good way to avoid these large view update delays?

-Patrick Barnes
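
A minimal sketch of the stale=ok trade-off described in the post above,
assuming a CouchDB version that accepts the stale=ok view parameter. The
host, database, design document and view names are hypothetical, and
view_url is an illustrative helper, not part of any CouchDB client.

```python
from urllib.parse import urlencode


def view_url(base, db, ddoc, view, allow_stale=False, **params):
    """Build a view URL; allow_stale=True appends stale=ok (hypothetical helper)."""
    query = dict(params)
    if allow_stale:
        # stale=ok asks CouchDB to answer from the index as it is now,
        # without first indexing pending changes.
        query["stale"] = "ok"
    path = f"{base}/{db}/_design/{ddoc}/_view/{view}"
    return f"{path}?{urlencode(query)}" if query else path


# Read-only page: tolerate an out-of-date index instead of waiting.
browse_url = view_url("http://localhost:5984", "records", "app", "by_date",
                      allow_stale=True)

# Right after this client wrote a document: wait for an up-to-date index.
fresh_url = view_url("http://localhost:5984", "records", "app", "by_date")
```

The second URL is the case the post worries about: it will still block
while a large batch is being indexed.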

Re: Seamless view rebuilding?

Posted by Stephen Prater <st...@agrussell.com>.
I solved a somewhat similar problem (where I needed a new view index  
on every push, regardless of whether the view had actually changed) by  
adding a !stamp macro to couchapp which subs in a comment with the  
timestamp.

You could pull the new ddoc copy trick then - since it would  
effectively be a new view each time you did a couchapp push.

Of course, that rebuilds the ENTIRE view, which might not be okay.
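
A rough sketch of the timestamp idea above, not the actual !stamp macro
(which is the poster's own couchapp addition and is not shown in this
thread). It assumes that changing the map source text, even by a
comment, is enough to give the view a new index. The function name and
the comment format are hypothetical, and whether a leading comment is
accepted may depend on the CouchDB version.

```python
import time


def stamp_views(views, now=None):
    """Return a copy of `views` with a timestamp comment prepended to each map."""
    seconds = int(time.time() if now is None else now)
    stamp = f"// stamp: {seconds}\n"
    # Build new dicts so the caller's view definitions are left untouched.
    return {name: {**spec, "map": stamp + spec["map"]}
            for name, spec in views.items()}
```

As the post notes, every stamped push forces a full rebuild of the
views in that design document.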

On Aug 26, 2010, at 10:00 AM, J Chris Anderson wrote:

>
> On Aug 26, 2010, at 7:43 AM, Patrick Barnes wrote:
>
>> What about a staging db?
>>
>> If I had:
>> * Continuous replication from production->staging (for those  
>> piecemeal updates)
>> * Made major batch updates to the staging server
>> * Rebuilt the staging server views, then
>> * Replicated from staging->production
>>
>
> View files don't replicate, so this won't help for the batches.
>
> Best bet is probably to query the view periodically during the batch  
> import.
>
> Chris
>
>> Would the production db's views take as long to rebuild as the  
>> single-db model, or does it have some mechanism to optimise it?
>>
>> The couchdb server is also behind a proxy, so there might be some  
>> solution there?
>>
>> -PB


Re: Seamless view rebuilding?

Posted by J Chris Anderson <jc...@apache.org>.
On Aug 26, 2010, at 7:43 AM, Patrick Barnes wrote:

> What about a staging db?
> 
> If I had:
> * Continuous replication from production->staging (for those piecemeal updates)
> * Made major batch updates to the staging server
> * Rebuilt the staging server views, then
> * Replicated from staging->production
> 

View files don't replicate, so this won't help for the batches.

Best bet is probably to query the view periodically during the batch import.

Chris
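
One way the suggestion above could look, as a sketch only: upload the
import in smaller batches and query the view after each one, so the
index is updated in small steps instead of all at once at the end. It
assumes the standard _bulk_docs endpoint and that a view request with
limit=0 still triggers indexing. The batch size, URLs and function
names are hypothetical.

```python
import json
import urllib.request


def chunks(docs, size):
    """Yield consecutive slices of `docs` with at most `size` items each."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def http_json(method, url, body=None):
    """Send one JSON request with the standard library and decode the reply."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, method=method,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def import_with_warmup(db_url, view_path, docs, batch_size=1000, send=http_json):
    """Bulk-insert `docs` in batches, querying the view after every batch."""
    for batch in chunks(docs, batch_size):
        send("POST", f"{db_url}/_bulk_docs", {"docs": batch})
        # limit=0 returns no rows; the request is only there to make
        # CouchDB index the documents that were just written.
        send("GET", f"{db_url}/{view_path}?limit=0")
```

A call might look like import_with_warmup("http://localhost:5984/records",
"_design/app/_view/by_date", docs). Each user request may still wait for
the batch currently being indexed, so a smaller batch_size trades import
speed for shorter waits.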

> Would the production db's views take as long to rebuild as the single-db model, or does it have some mechanism to optimise it?
> 
> The couchdb server is also behind a proxy, so there might be some solution there?
> 
> -PB

Re: Seamless view rebuilding?

Posted by Patrick Barnes <mr...@gmail.com>.
What about a staging db?

If I had:
* Continuous replication from production->staging (for those piecemeal 
updates)
* Made major batch updates to the staging server
* Rebuilt the staging server views, then
* Replicated from staging->production

Would the production db's views take as long to rebuild as the single-db 
model, or does it have some mechanism to optimise it?

The couchdb server is also behind a proxy, so there might be some 
solution there?

-PB


On 26/08/2010 11:48 PM, Adam Kocoloski wrote:
> But he doesn't have a new view, just a very large batch of updates added to an existing view.

Re: Seamless view rebuilding?

Posted by Adam Kocoloski <ko...@apache.org>.
But he doesn't have a new view, just a very large batch of updates added to an existing view.

On Aug 26, 2010, at 9:46 AM, Robert Newson wrote:

> Create a new ddoc with your new view, query that view, waiting for it
> to build, and then copy your new ddoc over your old one. View indexes
> are named on disk after their digest specifically to allow this
> offline building feature. :)
> 
> B.

Re: Seamless view rebuilding?

Posted by Robert Newson <ro...@gmail.com>.
Create a new ddoc with your new view, query that view, waiting for it
to build, and then copy your new ddoc over your old one. View indexes
are named on disk after their digest specifically to allow this
offline building feature. :)

B.
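
The steps above can be sketched roughly as follows. This is an
illustration under assumptions, not a tested deployment script: it
assumes the staging design document does not exist yet, that the live
one does, and that the server supports the HTTP COPY method with a
Destination header. The "_staging" suffix and all function names are
hypothetical.

```python
import json
import urllib.request


def request(method, url, body=None, headers=None):
    """Send one JSON request with the standard library and decode the reply."""
    data = json.dumps(body).encode() if body is not None else None
    hdrs = {"Content-Type": "application/json", **(headers or {})}
    req = urllib.request.Request(url, data=data, method=method, headers=hdrs)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def swap_in_rebuilt_ddoc(db_url, live_name, views, send=request):
    """Build `views` under a staging ddoc, then copy it over the live ddoc."""
    staging = f"_design/{live_name}_staging"
    live = f"_design/{live_name}"

    # 1. Store the new view definitions under a separate design document.
    send("PUT", f"{db_url}/{staging}", {"language": "javascript", "views": views})

    # 2. Query one of its views; the request returns once the index is built.
    first_view = next(iter(views))
    send("GET", f"{db_url}/{staging}/_view/{first_view}?limit=0")

    # 3. Copy the staging ddoc over the live one. Overwriting an existing
    #    document requires its current revision in the Destination header.
    rev = send("GET", f"{db_url}/{live}")["_rev"]
    return send("COPY", f"{db_url}/{staging}",
                headers={"Destination": f"{live}?rev={rev}"})
```

Because the copied design document has the same view definitions, the
live name should pick up the index that was already built under the
staging name. This helps when the view definition changes; it does not
shorten indexing of a large batch of new documents in an existing view.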
