Posted to dev@couchdb.apache.org by Chris Anderson <jc...@apache.org> on 2009/03/25 15:29:18 UTC
Multi Document Rollback Strategies (was Re: [VOTE] Apache CouchDB 0.9.0 release)

Moving this to its own thread to avoid hijacking the VOTE thread.

On Wed, Mar 25, 2009 at 12:44 PM, Chris Anderson <jc...@apache.org> wrote:
> On Wed, Mar 25, 2009 at 12:18 PM, Tim Parkin <in...@timparkin.co.uk> wrote:
>> Chris Anderson wrote:
>>>> I'd be interested in knowing what happened to the community discussion
>>>> around the removal of the bulk_docs 'feature'? I've tried to raise this a
>>>> couple of times but had little reaction. Am I right in understanding this
>>>> lack of reaction as meaning there is going to be no discussion?
>>>
>>>
>>> We've been concentrating on bulk docs documentation. In my experience,
>>> most people who understand that the unit of consistency is the
>>> document start to see ways of solving problems that work in a
>>> key/value world. Some use-cases don't fit, but for the 80% case we're
>>> better off with the simpler consistency model.
>>>
>>
>> I think the real issue is one of misunderstanding.. I personally just
>> want a way to rollback changes in order to deal with user interface
>> issues. The document I posted discusses the issue and highlights the
>> fact that the problem is not about consistency, it's about providing a
>> way to rollback changes if part of them fails so that a user can apply a
>> change by clicking submit and get a success/fail response.
>>
>
> I think I understand the issue. I think there are two ways to approach
> a solution. One is to confine end-user updates to a single key. This
> approach is the classic model for key/value stores.
>
> If your domain requires that edits are saved in multiple documents,
> the complexity grows. If you can control replication, and ensure that
> each user has a node to themselves, then you can treat edits between
> replications as a transaction, and the application can roll back
> anything that has happened since the last outbound replication. It would
> require a library between the UI and storage if you want to make that
> simple for the user.
>
> If you are working in an environment where the application can't treat
> replication as a (soft) transaction boundary (hot-swap, or multiple
> concurrent users) then you'll need to break updates out into
> individual documents. A user can start an interactive transaction, and
> mark all updates with the transaction id. Then you can use a view to
> show only updates that are associated with a closed transaction.
>

This strategy of using a document per transaction to mark the
transaction boundaries, and applying the transaction id to all
documents created during a transaction, is worth exploring more. In
cases where you need multi-node transactions, and don't have
application control over replication, the model becomes more complex.

I can't really see how you can update existing documents in this mode.
To have multi-node transactions, you'd need to build into your
application the rule that all reads come from views, which would
accumulate the results of completed transactions and ignore
transactions that have not been marked as complete.
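
A rough sketch of that read path, in Python over in-memory dicts rather
than a real CouchDB view. The field names `type`, `txn_id`, and `status`
are assumptions made up for illustration; nothing in CouchDB prescribes
them.

```python
def visible_updates(docs):
    """Return only the update docs whose transaction doc is marked closed."""
    # Collect the ids of transaction documents that have been closed.
    closed = {
        d["_id"]
        for d in docs
        if d.get("type") == "transaction" and d.get("status") == "closed"
    }
    # Keep updates tagged with a closed transaction id; ignore the rest.
    return [
        d for d in docs
        if d.get("type") == "update" and d.get("txn_id") in closed
    ]


# Hypothetical documents as they might exist on one node.
docs = [
    {"_id": "txn-1", "type": "transaction", "status": "closed"},
    {"_id": "txn-2", "type": "transaction", "status": "open"},
    {"_id": "u1", "type": "update", "txn_id": "txn-1", "value": "A"},
    {"_id": "u2", "type": "update", "txn_id": "txn-2", "value": "B"},
]
```

Here `visible_updates(docs)` would return only `u1`, since `txn-2` has
not been closed yet.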

There is also no guarantee that all updates for a given transaction
are available on nodes which have the closed transaction document. In
that case, the closed transaction document could list the ids of all
the updates that occurred in the transaction, so that the application
can verify not only that the transaction is closed, but also that all
updates in the transaction are present.
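
A minimal sketch of that completeness check, again in Python over
in-memory dicts. The `update_ids` list on the transaction document, and
the `type` and `status` fields, are hypothetical names chosen for this
example.

```python
def complete_transactions(docs):
    """Return ids of closed transactions whose listed updates are all present."""
    present = {d["_id"] for d in docs}
    complete = []
    for d in docs:
        if d.get("type") != "transaction" or d.get("status") != "closed":
            continue
        # A closed transaction only counts once every update it lists
        # has actually arrived on this node.
        if all(uid in present for uid in d.get("update_ids", [])):
            complete.append(d["_id"])
    return complete


# Hypothetical node state: txn-2 is closed, but update u3 has not
# replicated here yet.
node_docs = [
    {"_id": "txn-1", "type": "transaction", "status": "closed",
     "update_ids": ["u1"]},
    {"_id": "txn-2", "type": "transaction", "status": "closed",
     "update_ids": ["u2", "u3"]},
    {"_id": "u1", "type": "update", "txn_id": "txn-1"},
    {"_id": "u2", "type": "update", "txn_id": "txn-2"},
]
```

On this node only `txn-1` would be treated as complete; `txn-2` stays
hidden until `u3` shows up.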

> In this explicit-transaction use case, non-committed updates may be
> replicated, and it is the responsibility of the application to read
> data through a view which only shows updates that belong to finished
> transactions.
>
> The upshot is that bulk_docs can't and shouldn't give you any powers
> that you don't have available with the individual document APIs, but
> that doesn't mean your application can't provide those sorts of
> interfaces.
>
>> ... snip ...
>>>
>>> I think the general understanding is that CouchDB is built with a
>>> certain minimalist simplicity in mind. We appreciate that some of our
>>> users have demands that exceed our out-of-the-box functionality. I
>>> think once we have a solid understanding of how to use CouchDB in a
>>> distributed manner, we'll be on steady footing for more ambitious
>>> consistency guarantees.
>>>
>>
>> Just to reiterate.. it's not about consistency, it's about showing users
>> a logical success/fail rather than saying to them.. "well a part of your
>> change worked, part of it didn't - what would you like to do now?".
>>
>> The document I posted (I'll write a blog post about it if it helps)
>> details the issue.
>>
>> Tim
>>
>
>
>
> --
> Chris Anderson
> http://jchrisa.net
> http://couch.io
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io