Posted to user@couchdb.apache.org by Alex P <ap...@kolosy.com> on 2009/11/13 21:22:11 UTC

potential conflict resolution strategy

this isn't so much about couch itself, but rather about using couch in a
webapp.

suppose i had an arbitrarily load-balanced front-end web server environment.
a user throughout the lifetime of their session may get bounced to any of N
different web servers, all talking to the same couch instance (for
argument's sake). if these web servers are doing any kind of local caching,
then there is the standard possibility of a conflict such as:

1. server 1 is asked for a copy of document foo, and stores revision
1-fedbca in its local cache
2. server 2 is asked to change document foo, foo gets revision of 2-abcdef,
and is stored in server 2's cache
3. server 1 is asked to modify document foo, retrieves it from local cache,
and attempts to save it with rev 1-fedbca, causing a conflict

assuming that this is a topology that we want to have for the time being,
what are the group's thoughts on this resolution strategy:

- in addition to storing the actual document, keep a record of which fields
are being modified (call this copy a)
- when a conflict is detected, retrieve couchdb's copy (call it copy b), and
apply the changed fields 'only' from copy a, to copy b.
- save copy b with the net difference.
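
The three steps above can be sketched roughly as follows. This is only a minimal illustration, assuming documents are plain JSON-style dicts; the helper names `changed_fields` and `apply_changes` are hypothetical and not part of couchdb or any client library.

```python
import copy

_MISSING = object()          # sentinel so a field set to None still counts as a change
_RESERVED = {"_id", "_rev"}  # identity/revision fields are never treated as user edits


def changed_fields(original, edited):
    """Compare the cached original with the edited copy (copy a).

    Returns (changes, removed): fields whose value was set or altered,
    and the names of fields that were deleted.
    """
    changes = {
        key: value
        for key, value in edited.items()
        if key not in _RESERVED and original.get(key, _MISSING) != value
    }
    removed = [
        key for key in original
        if key not in edited and key not in _RESERVED
    ]
    return changes, removed


def apply_changes(server_doc, changes, removed):
    """Apply only the recorded edits to couchdb's current copy (copy b).

    The result keeps copy b's _rev, so it can be saved without a conflict.
    """
    merged = copy.deepcopy(server_doc)
    merged.update(copy.deepcopy(changes))
    for key in removed:
        merged.pop(key, None)
    return merged
```

Fields that the other server changed and this server did not touch survive the merge; if both servers changed the same field, this sketch simply lets the later writer win.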

thoughts?

thanks,
alex.

Re: potential conflict resolution strategy

Posted by Nathan Stott <nr...@gmail.com>.
You could put nginx or another proxy server in front of couchdb and use http
expires tags for your caching

On Fri, Nov 13, 2009 at 5:32 PM, Alex P <ap...@kolosy.com> wrote:

> i want to avoid the round trip if possible, hence the local cache. i'm not
> worried about race conditions because i don't have multiple users accessing
> the same data, so the only possible conflict scenario is being bounced to a
> different server...
>
> On Fri, Nov 13, 2009 at 4:10 PM, Daniel Truemper <truemped@googlemail.com> wrote:
>
> > Hi,
> >
> > > suppose i had an arbitrarily load-balanced front-end web server environment.
> > > a user throughout the lifetime of their session may get bounced to any of N
> > > different web servers, all talking to the same couch instance (for
> > > argument's sake). if these web servers are doing any kind of local caching
> > If you want to cache the documents, why not use a Squid in front of the
> > CouchDB?
> >
> > But for the argument: you should probably at least do a HEAD request to get
> > the document's ETag, i.e. revision in order to avoid creating conflicts in
> > your local cache...
> >
> > Daniel
>

Re: potential conflict resolution strategy

Posted by Alex P <ap...@kolosy.com>.
i want to avoid the round trip if possible, hence the local cache. i'm not
worried about race conditions because i don't have multiple users accessing
the same data, so the only possible conflict scenario is being bounced to a
different server...

On Fri, Nov 13, 2009 at 4:10 PM, Daniel Truemper <tr...@googlemail.com> wrote:

> Hi,
>
> > suppose i had an arbitrarily load-balanced front-end web server environment.
> > a user throughout the lifetime of their session may get bounced to any of N
> > different web servers, all talking to the same couch instance (for
> > argument's sake). if these web servers are doing any kind of local caching
> If you want to cache the documents, why not use a Squid in front of the
> CouchDB?
>
> But for the argument: you should probably at least do a HEAD request to get
> the document's ETag, i.e. revision in order to avoid creating conflicts in
> your local cache...
>
> Daniel

Re: potential conflict resolution strategy

Posted by Daniel Truemper <tr...@googlemail.com>.
Hi,

> suppose i had an arbitrarily load-balanced front-end web server environment.
> a user throughout the lifetime of their session may get bounced to any of N
> different web servers, all talking to the same couch instance (for
> argument's sake). if these web servers are doing any kind of local caching
If you want to cache the documents, why not use a Squid in front of the CouchDB?

But for the argument: you should probably at least do a HEAD request to get the document's ETag, i.e. the revision, in order to avoid creating conflicts in your local cache...
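
A HEAD check along these lines might look like the sketch below. It assumes the ETag header carries the document revision in double quotes; `doc_url` is a hypothetical full document URL and the function names are made up for illustration. Only the Python standard library is used.

```python
import urllib.request


def rev_from_etag(etag):
    """Turn an ETag header value such as '"2-abcdef"' into the bare revision."""
    etag = etag.strip()
    if etag.startswith("W/"):  # tolerate a weak-validator prefix if one is present
        etag = etag[2:]
    return etag.strip('"')


def current_rev(doc_url, timeout=5.0):
    """Ask the server for the document's current revision without fetching the body."""
    request = urllib.request.Request(doc_url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        etag = response.headers.get("ETag")
    if etag is None:
        raise ValueError("response carried no ETag header")
    return rev_from_etag(etag)


def cache_is_fresh(cached_doc, server_rev):
    """True when the locally cached copy is still at the server's revision."""
    return cached_doc.get("_rev") == server_rev
```

This still costs one round trip per write, and another writer can slip in between the HEAD and the following save, so it narrows the window for a conflict without closing it.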

Daniel

Re: potential conflict resolution strategy

Posted by Daniel Truemper <tr...@googlemail.com>.
forgot...

> assuming that this is a topology that we want to have for the time being,
> what are the group's thoughts on this resolution strategy:
> 
> - in addition to storing the actual document, keep a record of which fields
> are being modified (call this copy a)
> - when a conflict is detected, retrieve couchdb's copy (call it copy b), and
> apply the changed fields 'only' from copy a, to copy b.
> - save copy b with the net difference.
I think you should really try to avoid doing this type of handling client side when you have multiple clients with multiple caches. Looks like the source of many race conditions...

Daniel

Re: potential conflict resolution strategy

Posted by Brian Candler <B....@pobox.com>.
On Fri, Nov 13, 2009 at 02:22:11PM -0600, Alex P wrote:
> 1. server 1 is asked for a copy of document foo, and stores revision
> 1-fedbca in its local cache
> 2. server 2 is asked to change document foo, foo gets revision of 2-abcdef,
> and is stored in server 2's cache
> 3. server 1 is asked to modify document foo, retrieves it from local cache,
> and attempts to save it with rev 1-fedbca, causing a conflict
> 
> assuming that this is a topology that we want to have for the time being,
> what are the group's thoughts on this resolution strategy:
> 
> - in addition to storing the actual document, keep a record of which fields
> are being modified (call this copy a)
> - when a conflict is detected, retrieve couchdb's copy (call it copy b), and
> apply the changed fields 'only' from copy a, to copy b.
> - save copy b with the net difference.
> 
> thoughts?

Sounds reasonable to me. To those saying "don't cache" or "do a HEAD request
first", I would point out that there is still a race window when conflicts
can be introduced into the database (*). Therefore IMO it's better to handle
conflicts properly in the first place, and then you gain the ability to do
"disconnected" caching for free.

Regards,

Brian.

(*) If the writers are writing to different backends which replicate to each
other, then you'll get conflicts. If the writers are writing to the same
database using "all_or_nothing":true then you'll get conflicts. If the
writers are writing to the same database using PUT then the second update
will be rejected entirely with a 409 error. See:
http://wiki.apache.org/couchdb/Replication_and_conflicts
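
For the single-database PUT case, where a stale write is rejected with a 409, the retry-and-merge loop could be sketched as below. To keep the example runnable without a server, `FakeDatabase` is a hypothetical in-memory stand-in that mimics only the revision check, and `Conflict` stands in for the 409 response; neither is a real couchdb API, and the fake revision strings are illustrative.

```python
class Conflict(Exception):
    """Stands in for an HTTP 409 returned when a PUT carries a stale _rev."""


class FakeDatabase:
    """Hypothetical in-memory stand-in for one database (not a real client)."""

    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        return dict(self._docs[doc_id])

    def put(self, doc):
        current = self._docs.get(doc["_id"])
        if current is not None and current["_rev"] != doc.get("_rev"):
            raise Conflict(doc["_id"])  # the second update is rejected entirely
        generation = 1 if current is None else int(current["_rev"].split("-")[0]) + 1
        stored = {**doc, "_rev": f"{generation}-fake"}
        self._docs[doc["_id"]] = stored
        return stored["_rev"]


def save_with_merge(db, cached_doc, changes, max_attempts=3):
    """Try to save the cached copy plus changes; on a conflict, re-read the
    current copy, reapply only the changed fields, and try again."""
    candidate = {**cached_doc, **changes}
    for _ in range(max_attempts):
        try:
            return db.put(candidate)
        except Conflict:
            latest = db.get(cached_doc["_id"])
            candidate = {**latest, **changes}
    raise Conflict(cached_doc["_id"])


if __name__ == "__main__":
    db = FakeDatabase()
    db.put({"_id": "foo", "title": "a", "count": 1})
    cached = db.get("foo")                  # server 1 caches revision 1
    db.put({**db.get("foo"), "count": 5})   # server 2 moves foo to revision 2
    print(save_with_merge(db, cached, {"title": "b"}))
    print(db.get("foo"))
```

The attempt limit is there because another writer may update the document again between the re-read and the retry.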