Posted to commits@couchdb.apache.org by Apache Wiki <wi...@apache.org> on 2012/12/23 17:19:45 UTC

[Couchdb Wiki] Update of "HTTP_Bulk_Document_API" by RobertNewson

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The "HTTP_Bulk_Document_API" page has been changed by RobertNewson:
http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API?action=diff&rev1=18&rev2=19

Comment:
clarified 'Transactional Semantics with Bulk Updates', mostly by removing obsolete preamble about ancient versions.

  
  
  === Transactional Semantics with Bulk Updates ===
- In previous releases of CouchDB, bulk updates were transactional - in particular, all requests in a bulk update failed if any request failed or was in conflict. There were a couple of problems with this approach:
  
-  * This doesn't actually work with replication. Replication doesn't provide the same transactional semantics, so downstream replicas won't see "all-or-nothing" transactional semantics. Instead, they will see documents in an inconsistent state until replication of all documents involved in the bulk update completes. With bidirectional replication it can get even worse, because you can get edit conflicts that must be fixed manually.
+ In short, there are none (by design). However, you can ask CouchDB to check that all the documents in your {{{_bulk_docs}}} request pass all your validation functions. If even one fails, none of the documents are written. You can select this mode by including {{{"all_or_nothing":true}}} in your request. With this mode, if all documents pass validation, then all documents will be updated, even if that introduces a conflict for the affected documents.
  
+ Bulk updates work independently of replication: the documents updated in a {{{_bulk_docs}}} request will not be replicated as a group, and will not necessarily even be replicated in the same order as they appeared in the request.
-  * If your database is partitioned (aka "sharded"), different documents within the transaction could live on different nodes in the cluster, and these kinds of transactional semantics don't work unless you use heavy, non-scalable approaches like two-phase commit.
- 
- With release 0.9 of CouchDB, bulk update semantics have been changed so that a CouchDB server behaves consistently in a single-node, replicated, and/or partitioned environment. Note that this change makes explicit the fact that CouchDB is not a relational store and does not guarantee relational consistency between documents. As a developer you need to be aware of these semantics and design your data model and your application with this in mind.
- 
- There are now two bulk update models supported:
- 
-  * '''non-atomic''' - This is the default behavior.  Some documents may successfully be saved and some may not.  The response will tell the application which documents were saved or not. In the case of a power failure, when the database restarts some may have been saved and some not.
- 
-  * '''all-or-nothing''' - To use this mode, include {{{"all_or_nothing":true}}} as part of the request.  In the case of a validation failure, none of the documents will be saved.  However, it does not do conflict checking, so all documents will be committed even if this creates conflicts.
- 
- {{{#!highlight javascript
- {
-   "all_or_nothing": true,
-   "docs": [
-     {"_id": "0", "_rev": "1-62657917", "integer": 10, "string": "10"},
-     {"_id": "1", "_rev": "2-1579510027", "integer": 2, "string": "2"},
-     {"_id": "2", "_rev": "2-3978456339", "integer": 3, "string": "3"}
-   ]
- }
- }}}
- In this case, all three documents will be saved, and the response will show success for all of them. However, if the document with id 0 had a conflict, both versions will be present in the database, with an arbitrary choice made as to which appears in views. You can check for this status using a GET with {{{?conflicts=true}}}.
- 
- If any update fails validation, all updates will fail.
- 
- All or nothing transactions should not be used to enforce referential integrity, as some or all updated documents might become losing conflicts during the update. The transaction should be used to make sure all information is captured in an atomic operation, but conflicts may need to be addressed later. Applications that rely on this functionality should be able to tolerate some documents missing or being in a conflicted state until conflict resolution can occur.
- 
- Bulk updates work independently of replication, meaning document revisions originally saved as part of an all or nothing transaction will be replicated individually, not as part of a bulk transaction. This means other replica instances may only have a subset of the transaction, and if an update is rejected by the remote node during replication (e.g. not authorized error) the remote node may never have the complete transaction.
- 
- Note that POSTing a single document with {{{"all_or_nothing":true}}} behaves completely differently from a regular PUT, since it will save conflicting versions rather than rejecting a conflict.
- 
- {{{
- $ DB="http://127.0.0.1:5984/tstconf"
- $ curl -X PUT "$DB"
- $ curl -X PUT -d '{"name":"fred"}' "$DB/person"
- $ curl -X POST -H 'Content-Type: application/json' -d '{"all_or_nothing":true,"docs":[{"_id":"person","_rev":"1-877727288","name":"jim"}]}' "$DB/_bulk_docs"
- $ curl -X POST -H 'Content-Type: application/json' -d '{"all_or_nothing":true,"docs":[{"_id":"person","_rev":"1-877727288","name":"trunky"}]}' "$DB/_bulk_docs"
- $ curl "$DB/person?conflicts=true"
- }}}
- Result:
- 
- {{{#!highlight javascript
- {"ok":true}
- {"ok":true,"id":"person","rev":"1-877727288"}
- [{"id":"person","rev":"2-3595405"}]
- [{"id":"person","rev":"2-2835283254"}]
- {"_id":"person","_rev":"2-3595405","name":"jim","_conflicts":["2-2835283254"]}
- }}}
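
As a rough illustration of the {{{"all_or_nothing":true}}} mode discussed in this section, the following Python sketch builds the same kind of {{{_bulk_docs}}} request as the curl commands above, without sending it. The helper name `build_bulk_docs_request` is hypothetical, and the database URL, document id, and revision are simply reused from the wiki example; treat it as a minimal sketch under those assumptions, not as an official client.

```python
import json
import urllib.request


def build_bulk_docs_request(db_url, docs, all_or_nothing=False):
    """Build (but do not send) a POST request for the _bulk_docs endpoint.

    Hypothetical helper for illustration only.
    """
    body = {"docs": docs}
    if all_or_nothing:
        # Ask CouchDB to write all documents only if every one of them
        # passes validation; conflicts are not checked in this mode.
        body["all_or_nothing"] = True
    return urllib.request.Request(
        db_url.rstrip("/") + "/_bulk_docs",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Database URL and document mirror the wiki's curl example (assumed values).
request = build_bulk_docs_request(
    "http://127.0.0.1:5984/tstconf",
    [{"_id": "person", "_rev": "1-877727288", "name": "jim"}],
    all_or_nothing=True,
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To actually submit the request, it could be passed to `urllib.request.urlopen(request)` against a running CouchDB instance. Because this mode can introduce conflicts, a follow-up GET with {{{?conflicts=true}}} on the affected documents is a reasonable check.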
  
  === Posting Existing Revisions ===