Posted to user@couchdb.apache.org by Ben Hall <be...@googlemail.com> on 2010/04/05 00:57:32 UTC

Replication Filters: When changing restrictions data becomes out of sync

Hi,

I have the following setup:

MainDB
FirstDB
SecondDB

First and Second will contain a subset of the data in MainDB. I
planned to use Replication Filters to populate the DB.  This is
working great, until I change a document in MainDB from being
restricted to FirstDB to being restricted to SecondDB.

When this happens, replication correctly applies it to SecondDB -
however it still exists in FirstDB. As such, my data is now
inconsistent.
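
For readers following along, the setup described above could be sketched roughly as follows. This is an illustration under assumptions, not the poster's actual configuration: the server URL, the lowercase database names, the design document name "app", the filter name "by_target" and the "target" field on documents are all hypothetical. The only CouchDB specifics relied on are the /_replicate endpoint and its "filter" and "query_params" fields. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical filter function, assumed to be stored under
# filters.by_target in _design/app on the source database.
FILTER_JS = "function(doc, req) { return doc.target === req.query.target; }"


def filtered_replication_request(server, source, target, filter_name, query_params=None):
    """Build (but do not send) a POST to /_replicate that uses a filter."""
    body = {"source": source, "target": target, "filter": filter_name}
    if query_params:
        body["query_params"] = query_params
    return urllib.request.Request(
        url=f"{server}/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example: replicate only documents whose (assumed) "target" field is "first".
req = filtered_replication_request(
    "http://localhost:5984", "maindb", "firstdb", "app/by_target", {"target": "first"}
)
```

Passing the request to urllib.request.urlopen would start the replication. When a document's "target" changes from "first" to "second", this filter simply stops matching it, so nothing is sent to firstdb and the old copy stays there.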

Is this correct? The only thing I can think is that I'm going to have
to manually delete the document from FirstDB - which is a little bit
annoying.

Is there a better way?

Thanks

Ben
http://twitter.com/Ben_Hall

Re: Replication Filters: When changing restrictions data becomes out of sync

Posted by Randall Leeds <ra...@gmail.com>.
On Tue, Apr 6, 2010 at 03:42, J Chris Anderson <jc...@gmail.com> wrote:
>
> Perhaps it's possible to split your documents up so that you don't need to revoke them later. Maybe someone on the list can think of a smart way to do revokes inside a firewall. The delete thing seems OK to me -- but you don't want that delete to replicate upstream to MainDB or SecondDB.

For this, you can use the _purge functionality, which does not create
a new rev that replicates but removes the document entirely.
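
A minimal sketch of what such a purge request might look like, assuming a server at http://localhost:5984 and a database named firstdb (both hypothetical). The body maps each document id to the list of revisions to purge; as far as I know these need to be the current leaf revisions. The request is only built here, not sent.

```python
import json
import urllib.request


def purge_request(server, db, doc_id, revs):
    """Build (but do not send) a POST to /{db}/_purge for one document."""
    body = {doc_id: list(revs)}  # {"docid": ["rev1", "rev2", ...]}
    return urllib.request.Request(
        url=f"{server}/{db}/_purge",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical document id and revision, for illustration only.
req = purge_request("http://localhost:5984", "firstdb", "doc-123", ["2-abc"])
```

Because a purge leaves no tombstone behind, there is nothing for a later replication out of firstdb to carry to the other databases.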

Re: Replication Filters: When changing restrictions data becomes out of sync

Posted by J Chris Anderson <jc...@gmail.com>.
On Apr 4, 2010, at 3:57 PM, Ben Hall wrote:

> Hi,
> 
> I have the following setup:
> 
> MainDB
> FirstDB
> SecondDB
> 
> First and Second will contain a subset of the data in MainDB. I
> planned to use Replication Filters to populate the DB.  This is
> working great, until I change a document in MainDB from being
> restricted to FirstDB to being restricted to SecondDB.
> 
> When this happens, replication correctly applies it to SecondDB -
> however it still exists in FirstDB. As such, my data is now
> inconsistent.
> 
> Is this correct? The only thing I can think is that I'm going to have
> to manually delete the document from FirstDB - which is a little bit
> annoying.
> 

Good Question.

Filtered replication is meant to control replication to endpoints that you don't control. In a fully distributed app, there can be no concept of revoking access to something that's already been shared.

In practice (with your dbs behind your firewall), you are best off sending a delete to FirstDB, while making sure that delete does not replicate from FirstDB to MainDB or SecondDB.
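
As a rough sketch of the delete itself (server URL, database name, document id and revision are all made up for the example): CouchDB deletes a document with an HTTP DELETE on the document URL, passing the current revision as the rev query parameter. The request is built but not sent.

```python
import urllib.request
from urllib.parse import quote, urlencode


def delete_request(server, db, doc_id, rev):
    """Build (but do not send) a DELETE for one document revision."""
    # Percent-encode the id so ids with spaces or slashes stay one path segment.
    url = f"{server}/{db}/{quote(doc_id, safe='')}?{urlencode({'rev': rev})}"
    return urllib.request.Request(url, method="DELETE")


# Hypothetical id and revision, for illustration only.
req = delete_request("http://localhost:5984", "firstdb", "doc 1", "2-abc")
```

A delete creates a new deleted revision, so it will travel with any replication that runs from firstdb to another database. One way to avoid that is simply never to replicate out of firstdb; another, as an assumption about your setup, is a filter on that replication that rejects documents with _deleted set.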

Perhaps it's possible to split your documents up so that you don't need to revoke them later. Maybe someone on the list can think of a smart way to do revokes inside a firewall. The delete thing seems OK to me -- but you don't want that delete to replicate upstream to MainDB or SecondDB.

Chris

> Is there a better way?
> 
> Thanks
> 
> Ben
> http://twitter.com/Ben_Hall


Re: Replication Filters: When changing restrictions data becomes out of sync

Posted by Randall Leeds <ra...@gmail.com>.
I apologize. I was not clear.

There is a feature of CouchDB called update_notifications that allows
an external process, supervised by Erlang, to be sent information
about document updates.

Unfortunately, this feature is not well documented, but you can see a
couple of examples [1][2].

It's not elegant, but it might work for you depending on how complex
your use case is. You can use whatever language and HTTP client
library you like to issue the appropriate commands.
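
A sketch of what such an external process could look like. It assumes, from my recollection of the 0.x/1.x series and not from this thread, that CouchDB writes one JSON object per line to the process's stdin in the form {"type": "updated", "db": "dbname"}. The database name maindb and both function names are hypothetical.

```python
import json
import sys


def updated_db(line):
    """Return the database name if the line reports an update, else None."""
    text = line.strip()
    if not text:
        return None
    try:
        note = json.loads(text)
    except ValueError:
        return None  # ignore anything that is not valid JSON
    if isinstance(note, dict) and note.get("type") == "updated":
        return note.get("db")
    return None


def watched_updates(stream, watched):
    """Yield the name of each watched database as updates are reported."""
    for line in stream:
        db = updated_db(line)
        if db is not None and db in watched:
            yield db


# Usage (blocks reading stdin, so it is left as a comment):
# for db in watched_updates(sys.stdin, {"maindb"}):
#     print(f"{db} changed; check for documents to revoke", file=sys.stderr)
```

The notification only names the database, not the document, so the process still has to work out what changed, for example by reading the changes feed from the last sequence it saw.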

You could also write a daemon that consumes the _changes feed.
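
The core of such a daemon might be sketched like this, assuming the continuous form of the feed, which sends one JSON object per line plus blank heartbeat lines. The server URL, database name and helper names are hypothetical, and the sketch only builds the URL and parses lines; opening the connection is left to whatever HTTP client you prefer.

```python
import json
from urllib.parse import urlencode


def changes_url(server, db, since=0):
    """URL of the continuous changes feed, starting after sequence `since`."""
    query = urlencode({"feed": "continuous", "since": since, "heartbeat": 10000})
    return f"{server}/{db}/_changes?{query}"


def parse_change(line):
    """Return (seq, doc_id, deleted) for a change row, or None otherwise."""
    if isinstance(line, bytes):
        line = line.decode("utf-8")
    text = line.strip()
    if not text:
        return None  # heartbeat newline
    row = json.loads(text)
    if "id" not in row:
        return None  # e.g. the closing {"last_seq": N} row
    return row["seq"], row["id"], bool(row.get("deleted", False))


url = changes_url("http://localhost:5984", "maindb", since=0)
```

For each change, the daemon would fetch the document from maindb, decide which target it now belongs to, and delete or purge it from the other one.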

You might load all the _design documents at the beginning. When a
design document is updated, compare its filter functions with the
previous version to see if they've changed. If they have, fetch all
the documents (one by one or in chunks), run each through both the old
filter and the new filter, and issue a DELETE on the target for every
document that the old filter passed but the new one rejects.
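
The comparison step can be sketched as below. Real replication filters are JavaScript functions stored in a design document; to keep the example self-contained, the old and new filters are stood in for by plain Python callables, and the "target" field on the documents is an assumption made for illustration.

```python
def docs_to_revoke(docs, old_filter, new_filter):
    """Ids of documents the old filter passed but the new filter rejects."""
    return [doc["_id"] for doc in docs if old_filter(doc) and not new_filter(doc)]


# Hypothetical stand-ins for the old and new filter functions.
def old_filter(doc):
    return doc.get("target") in ("first", "shared")


def new_filter(doc):
    return doc.get("target") == "first"


docs = [
    {"_id": "a", "target": "shared"},  # was replicated, no longer matches
    {"_id": "b", "target": "first"},   # still matches
    {"_id": "c", "target": "second"},  # never matched
]
stale = docs_to_revoke(docs, old_filter, new_filter)
```

The same helper covers the case from the original question, where the filter stays the same but a document changes: apply one filter to the previous and the current version of the document instead of two filters to one version.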

In short, there's no particularly efficient way to handle "correcting"
a changed replication filter. But you can work around it if you're
really determined.

[1] http://wiki.apache.org/couchdb/Regenerating_views_on_update?highlight=(update_notification)
[2] http://github.com/tilgovi/couchdb-lounge/raw/master/replicator/replication_notifier.py

Re: Replication Filters: When changing restrictions data becomes out of sync

Posted by Ben Hall <be...@googlemail.com>.
Hi Randall,

Thanks for the comment.  I looked at the _update test cases but
couldn't see anything which would fit.

http://svn.apache.org/viewvc/couchdb/trunk/share/www/script/test/update_documents.js?view=markup

How would I issue a DELETE command from javascript against the FirstDB?

Thanks

Ben

On Mon, Apr 5, 2010 at 1:27 AM, Randall Leeds <ra...@gmail.com> wrote:
> If you're looking for replication to delete documents that don't fit the
> filter from the target, you'll have to do that manually. It is never the
> place of replication to remove documents on the target.
>
> What you can do is set up an update handler on second db that deletes
> documents exclusively meant for seconddb from firstdb so in this way you can
> make it automatic. This doesn't cover the case where a document that used to
> replicate should be deleted everywhere except maindb.

Re: Replication Filters: When changing restrictions data becomes out of sync

Posted by Randall Leeds <ra...@gmail.com>.
If you're looking for replication to delete documents that don't fit the
filter from the target, you'll have to do that manually. It is never the
place of replication to remove documents on the target.

What you can do is set up an update handler on SecondDB that deletes
documents meant exclusively for SecondDB from FirstDB, which makes the
cleanup automatic. This doesn't cover the case where a document that used to
replicate should be deleted everywhere except MainDB.
