Posted to user@couchdb.apache.org by CGS <cg...@gmail.com> on 2011/11/09 16:13:58 UTC
_bulk_docs vs. single document insertion
Hi,
I wonder whether others have seen the following difference and whether it
still exists in CouchDB version 1.1.1. Maybe I am just not using the bulk
API correctly.
Insertion scenarios:
1. Using _bulk_docs (I removed the user:password@ part here because it is
not the source of the difference in behavior; setting all_or_nothing to
true or false changes nothing):
bash$ curl -s -X POST http://127.0.0.1:5984/db/_bulk_docs -d \
'{"all_or_nothing":true,"docs":[{"_id":"test1","test":"1"},{"_id":"test2","test":"2"},{"_id":"test3","test":"3"},{"_id":"test4","test":"4"},{"_id":"test5","test":"5"}]}' \
-H 'Content-Type: application/json' && curl -s -X GET \
http://127.0.0.1:5984/db/test1
Behavior:
a) the first time it is OK: it returns the list of document ids and revs,
plus the test1 document;
b) the second time it is not quite OK: it returns the same ids and revs,
plus the same test1 document (the step is discarded by bulk, and I get the
result of the first bulk operation instead of an incremented revision or a
reported conflict);
c) after deleting all those documents, sending the same command returns the
same ids and revs as the first time I sent that request (Futon reports no
documents, and I also tried compacting after deleting and before sending
the same command again), plus {"error":"not_found","reason":"deleted"}
(the normal error for the GET operation).
2. Using single document insertion:
bash$ curl -s -X POST http://127.0.0.1:5984/db -d \
'{"_id":"test1","test":"1"}' -H 'Content-Type: application/json'
Behavior:
a) after deleting and re-posting the same document, the revision increases
(as expected).
Now, comparing 1.c and 2.a, one can easily see that the behavior is
different. Is that normal, or am I missing something? If I am not missing
anything, this behavior considerably slows down my data insertion design
and could cause data loss at high insertion rates.
Cheers,
CGS
PS: Sorry if this post is a little bit messy as I tried to shorten it.
If anyone needs more details, let me know. Thank you for your time in
reading this post.
Re: _bulk_docs vs. single document insertion
Posted by CGS <cg...@gmail.com>.
So, it was my misunderstanding. all_or_nothing is not a real boolean (a
boolean has two values, while this has only one, if it exists): if it is
present, it creates conflicts in the db regardless of whether it is set to
true or false. Thanks, Jens! I will see how I can process the return list
(without including all_or_nothing) fast enough without killing my CPU.
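For anyone reading along, here is a minimal Python sketch of processing the return list when all_or_nothing is left out. It assumes each entry in the _bulk_docs response carries an "id" and either a "rev" (saved) or an "error" and "reason" (rejected, e.g. a conflict). The function name split_bulk_results and the sample body are hypothetical, and the rev value is a placeholder rather than real server output.

```python
import json


def split_bulk_results(results):
    """Partition a _bulk_docs response list into saved and failed documents."""
    saved = {}   # doc id -> newly assigned revision
    failed = {}  # doc id -> error name, e.g. "conflict"
    for row in results:
        if "error" in row:
            failed[row["id"]] = row["error"]
        else:
            saved[row["id"]] = row["rev"]
    return saved, failed


# Hypothetical response body; the rev is a placeholder, not real output.
sample_body = """[
  {"id": "test1", "rev": "1-placeholder"},
  {"id": "test2", "error": "conflict", "reason": "Document update conflict."}
]"""

saved, failed = split_bulk_results(json.loads(sample_body))
print("saved:", saved)
print("failed:", failed)
```

This is a single pass over the list, so the cost grows linearly with the batch size. The ids in "failed" are the ones that would need their current revision fetched before retrying.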
Cheers,
CGS
On 11/09/2011 09:37 PM, Jens Alfke wrote:
> On Nov 9, 2011, at 7:13 AM, CGS wrote:
>
> b) the second time not quite ok, returning the same id's and rev's + the
> same test1 document (step discarded by bulk, getting the first bulk
> operation result instead of incrementing the revision or reporting
> conflict);
>
> You’re using all-or-nothing mode, which never returns 409 Conflict errors but instead creates conflicting revisions in the db. Did you read about how it works?
>
> http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API#Transactional_Semantics_with_Bulk_Updates
>
> —Jens
>
Re: _bulk_docs vs. single document insertion
Posted by Jens Alfke <je...@couchbase.com>.
On Nov 9, 2011, at 7:13 AM, CGS wrote:
b) the second time not quite ok, returning the same id's and rev's + the
same test1 document (step discarded by bulk, getting the first bulk
operation result instead of incrementing the revision or reporting
conflict);
You’re using all-or-nothing mode, which never returns 409 Conflict errors but instead creates conflicting revisions in the db. Did you read about how it works?
http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API#Transactional_Semantics_with_Bulk_Updates
—Jens
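To see the conflicting revisions that all-or-nothing mode can leave behind, a document can be fetched with the conflicts=true query parameter, which adds a _conflicts array to the document when conflicts exist. The Python sketch below only builds that URL and reads the array from an already decoded document. The helper names and the sample document are hypothetical, and the rev strings are placeholders.

```python
from urllib.parse import quote, urlencode


def conflicts_url(base, db, doc_id):
    """Build the GET URL that asks CouchDB to include the _conflicts array."""
    query = urlencode({"conflicts": "true"})
    return f"{base}/{quote(db, safe='')}/{quote(doc_id, safe='')}?{query}"


def conflicting_revs(doc):
    """Return the conflicting revisions listed in a decoded document, if any."""
    return list(doc.get("_conflicts", []))


# Hypothetical decoded document; the rev strings are placeholders.
sample_doc = {
    "_id": "test1",
    "_rev": "1-placeholder-a",
    "test": "1",
    "_conflicts": ["1-placeholder-b"],
}

print(conflicts_url("http://127.0.0.1:5984", "db", "test1"))
print(conflicting_revs(sample_doc))
```

A document without conflicts has no _conflicts key, so conflicting_revs returns an empty list for it.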