Posted to user@couchdb.apache.org by Joel Reed <jo...@visn.biz> on 2008/08/19 03:03:21 UTC

Replication Options for Offline Web App

I'm currently evaluating CouchDB for an offline web application that 
would involve a central document repository and multiple "leaf nodes" 
where new documents are created.

Does the roadmap for CouchDB include any capabilities that would allow 
me to replicate the database from the leaf nodes to the central server, 
while NOT replicating all documents in the server to the leaf nodes?

Maybe a better way to state my question would be: does the roadmap 
include any option to replicate a single document from one server to 
another? Perhaps for my purposes, I should just do this all manually by 
GETting from one DB and PUTting to the central repo... but it would be 
very nice to just say "all documents on the leaf nodes get replicated to 
the central repo."

jr



Re: Replication Options for Offline Web App

Posted by Sho Fukamachi <sh...@gmail.com>.
On 19/08/2008, at 1:31 PM, Chris Anderson wrote:

> From what I understand about the replication process, the client won't
> have any trouble receiving a subset of the full replication.

Perhaps it would be a good idea for Someone Who Knows to write out a  
brief summary of exactly what goes on with document lifecycle and  
replication. I also have an (incomplete, assumed, probably wrong)  
understanding of how I *think* it works - I've read the technical  
overview but there's still a lot of black box stuff going on.

For example, where is CouchDB storing the list of "documents updated  
since last replication"? Is that list generated by a push replication  
as well? What if you push-replicate to more than one remote server, is  
another list created for that server? How are servers identified - do  
they have IDs? etc etc .. Seems a few people would like to know more  
about the internals of replication.

http://wiki.apache.org/couchdb/ConfiguringDistributedSystems whetted my  
appetite but it stops just as it's getting good!


> However,
> it's worth looking out for what happens if you try to replicate again
> later (perhaps from a different view of the same database). Could
> _revs get out of sync?

I am also curious about that - does replication grab only the latest  
_rev, or all of them (if extant)? If the status is deleted, does it  
bother replicating it? What happens if the local DB is compacted and  
deleted docs expunged before replication - are they deleted/retained  
on the remote?

If someone knows of an existing document with the answers to these  
questions, please direct me to it, and forgive my delinquency.


> This feature has been on the roadmap for a while, so maybe Damien has
> some ideas for how it should be designed.

Well, in the Technical Overview at http://incubator.apache.org/couchdb/docs/overview.html
it says:

"Partial replicas can be created and maintained. Replication can be  
filtered by a javascript function, so that only particular documents  
or those meeting specific criteria are replicated. This can allow  
users to take subsets of a large shared database application offline  
for their own use, while maintaining normal interaction with the  
application and that subset of data."

I think that's why I thought it actually DID work, though I later  
realised that document is more of a list of "features CouchDB will have  
in the future" than a description of current functionality. Regardless,  
it has obviously been intended to work like that for some time, so I  
imagine a lot of that design work has already been done and it's just  
a matter of finding the time to finish it : )
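
The overview quoted above describes the filter as a JavaScript function. Purely as an illustration of the idea, here is the same thing sketched in Python: a predicate is applied to each document and only the matches would be sent. The function names and the "owner" field are hypothetical and are not part of CouchDB.

```python
def replicate_filtered(source_docs, should_replicate):
    """Return the documents a filtered replication would send.

    should_replicate is a caller-supplied predicate (hypothetical);
    documents for which it returns False stay on the source only.
    """
    return [doc for doc in source_docs if should_replicate(doc)]


def owned_by_alice(doc):
    """Example criterion: keep documents whose "owner" field is "alice"."""
    return doc.get("owner") == "alice"


docs = [
    {"_id": "a", "owner": "alice"},
    {"_id": "b", "owner": "bob"},
    {"_id": "c"},  # no owner field, so it is not replicated
]
subset = replicate_filtered(docs, owned_by_alice)
```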





Re: Replication Options for Offline Web App

Posted by Chris Anderson <jc...@grabb.it>.
On Mon, Aug 18, 2008 at 7:46 PM, Sho Fukamachi <sh...@gmail.com> wrote:
>
> I would think it possible, but at the least would require a new type of view
> - or perhaps the current view system (map only, of course) could have a
> special mode, triggered by a key maybe, allowing them to be replicated from
> (including delete info, etc ..)
>

Users could specify a view to calculate a doc-id mask, by which the
server would filter the replication.
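
A rough Python sketch of what such a doc-id mask could look like, assuming view rows carry an "id" field (as CouchDB view results do) and that the pending updates are a list of id/rev pairs. The update list format and both function names are hypothetical; this is not how the replicator is implemented.

```python
def doc_id_mask(view_rows):
    """Collect the ids of all documents emitted by a view."""
    return {row["id"] for row in view_rows}


def filter_updates(updates, mask):
    """Keep only the pending updates whose document id is in the mask."""
    return [u for u in updates if u["id"] in mask]


# Hypothetical view output and pending updates on the source database.
view_rows = [
    {"id": "doc1", "key": "alice", "value": None},
    {"id": "doc3", "key": "alice", "value": None},
]
updates = [
    {"id": "doc1", "rev": "r2"},
    {"id": "doc2", "rev": "r1"},
    {"id": "doc3", "rev": "r5"},
]
to_send = filter_updates(updates, doc_id_mask(view_rows))
```

One thing the sketch makes visible: a document that drops out of the view between two replications simply stops being sent, so the target keeps its last copy.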

From what I understand about the replication process, the client won't
have any trouble receiving a subset of the full replication. However,
it's worth looking out for what happens if you try to replicate again
later (perhaps from a different view of the same database). Could
_revs get out of sync?

This feature has been on the roadmap for a while, so maybe Damien has
some ideas for how it should be designed.


-- 
Chris Anderson
http://jchris.mfdz.com

Re: Replication Options for Offline Web App

Posted by Sho Fukamachi <sh...@gmail.com>.
On 19/08/2008, at 11:16 AM, Paul Davis wrote:

> Perhaps if you wanted to submit a patch for something like replicating
> from a view. :D I really have no idea if that's even feasible, but
> given the little I know of the internals it sounds possible.

+1 on that idea.

I had actually somehow believed that was already possible until I  
tried it a couple weeks ago and it didn't work!

I would think it possible, but at the least would require a new type  
of view - or perhaps the current view system (map only, of course)  
could have a special mode, triggered by a key maybe, allowing them to  
be replicated from (including delete info, etc ..)

Sho



Re: Replication Options for Offline Web App

Posted by Joel Reed <jo...@visn.biz>.
Paul Davis wrote:
> Replicating from your leaf nodes *to* the central node already works
> as you'd want. I.e., all documents in the leaf database are sent to the
> central db and things are updated as expected. (Verified this with
> minimal testing, so you might wait for confirmation)
>   
Great. I couldn't tell from the wiki page if doing a:

  POST /_replicate?source=$source_database&target=$target_database

would trigger a one-way or two-way replication. Sounds like you're 
saying it's one-way, which is exactly what I need.
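
For illustration, a minimal Python sketch of building that one-way (push) request. It assumes the JSON-body form of `_replicate` documented for later CouchDB releases (a "source" and a "target" field) rather than the query-string form quoted above. The server URL and database names are hypothetical, and the request is only constructed, not sent.

```python
import json
import urllib.request


def build_replication_request(server, source, target):
    """Build (but do not send) a POST to /_replicate.

    Assumption: a JSON body with "source" and "target" keys, as in later
    CouchDB releases. Replication runs one way, from source to target.
    """
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    return urllib.request.Request(
        url=server.rstrip("/") + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical leaf node pushing its "docs" database to a central server.
req = build_replication_request(
    "http://localhost:5984", "docs", "http://central.example.com:5984/docs"
)
# urllib.request.urlopen(req) would actually send it.
```

Swapping the source and target arguments would pull from the central server instead. Nothing flows in the opposite direction unless a second request is made.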

> The caveat is going in the other direction. To date, I don't know that
> there's anything to support replicating just a set of records. I think
> there was talk of this to support things like sharding etc at one
> point or another, but I don't know of anyone actively working on such
> a thing.
>   
Maybe I can figure out some manual process of GETting and PUTting, 
though I suppose this would render the builtin _id and _rev bits worthless.
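
A rough sketch of the manual route, in Python. It only prepares the document body; the GET and PUT themselves are left out. It assumes that `_rev` identifies a revision in the database the document was read from, so the source `_rev` is dropped and, if the document already exists on the target, replaced by the target's current revision. The function name, the sample document, and the revision strings are hypothetical.

```python
def prepare_for_put(doc, target_rev=None):
    """Return a copy of a document fetched from a leaf node, ready to PUT
    to the central database.

    The source _rev is removed because it refers to the leaf database's
    revision history. If the document already exists on the target, its
    current revision must be supplied so the PUT counts as an update.
    """
    copy = {k: v for k, v in doc.items() if k != "_rev"}
    if target_rev is not None:
        copy["_rev"] = target_rev
    return copy


leaf_doc = {"_id": "note-1", "_rev": "1-abc", "text": "created offline"}
new_doc = prepare_for_put(leaf_doc)           # first copy to central
updated = prepare_for_put(leaf_doc, "2-def")  # central already has it
```

The `_id` stays stable across servers this way, but the revision history does not travel with the document, which is the drawback mentioned above.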

> Perhaps if you wanted to submit a patch for something like replicating
> from a view. :D I really have no idea if that's even feasible, but
> given the little I know of the internals it sounds possible.
>   
Thanks for the feedback. I appreciate it!

Before I can submit any patches, I'll have to learn Erlang.

jr



Re: Replication Options for Offline Web App

Posted by Paul Davis <pa...@gmail.com>.
Replicating from your leaf nodes *to* the central node already works
as you'd want. I.e., all documents in the leaf database are sent to the
central db and things are updated as expected. (Verified this with
minimal testing, so you might wait for confirmation)

The caveat is going in the other direction. To date, I don't know that
there's anything to support replicating just a set of records. I think
there was talk of this to support things like sharding etc at one
point or another, but I don't know of anyone actively working on such
a thing.

Perhaps if you wanted to submit a patch for something like replicating
from a view. :D I really have no idea if that's even feasible, but
given the little I know of the internals it sounds possible.

Paul
