Posted to user@couchdb.apache.org by CGS <cg...@gmail.com> on 2011/11/09 16:13:58 UTC

_bulk_docs vs. single document insertion

Hi,

I wonder whether others have seen the following difference and whether
it still exists in CouchDB version 1.1.1. Maybe I am simply not using
the bulk API correctly.

Insertion scenarios:

1. Using _bulk_docs (I removed the user:password@ part here because it
is not the source of the difference in behavior; setting all_or_nothing
to true or false changes nothing):

bash$ curl -s -X POST http://127.0.0.1:5984/db/_bulk_docs \
  -d '{"all_or_nothing":true,"docs":[{"_id":"test1","test":"1"},{"_id":"test2","test":"2"},{"_id":"test3","test":"3"},{"_id":"test4","test":"4"},{"_id":"test5","test":"5"}]}' \
  -H 'Content-Type: application/json' \
  && curl -s -X GET http://127.0.0.1:5984/db/test1

Behavior:
a) the first time is OK: it returns the list of document ids and revs,
plus the test1 document;
b) the second time is not quite OK: it returns the same ids and revs,
plus the same test1 document (the step is discarded by bulk, and I get
the result of the first bulk operation instead of an incremented
revision or a reported conflict);
c) after deleting all those documents, sending the same command returns
the same ids and revs as the first time I sent the request (Futon
reports no documents, and I also tried compacting after deleting and
before sending the command again), plus
{"error":"not_found","reason":"deleted"} (the normal error for the GET
operation).

2. Using single document insertion:

bash$ curl -s -X POST http://127.0.0.1:5984/db \
  -d '{"_id":"test1","test":"1"}' -H 'Content-Type: application/json'

Behavior:
a) after deleting and re-posting the same document, the revision
increases (as expected).

Now, comparing 1.c and 2.a, one can easily see that the behavior is
different. Is that normal, or am I missing something? If I am not
missing anything, this behavior considerably slows down my data
insertion design and could cause data loss at high data rates.

Cheers,
CGS

PS: Sorry if this post is a little bit messy as I tried to shorten it. 
If anyone needs more details, let me know. Thank you for your time in 
reading this post.


Re: _bulk_docs vs. single document insertion

Posted by CGS <cg...@gmail.com>.
So, it was my misunderstanding. all_or_nothing is not really a boolean
(a boolean has two values, while this one has only one: whether it is
present). If it is present, it creates conflicts in the db regardless
of whether it is set to true or false. Thanks, Jens! I will see how I
can process the returned list (without including all_or_nothing) fast
enough without killing my CPU.

Cheers,
CGS




On 11/09/2011 09:37 PM, Jens Alfke wrote:
> On Nov 9, 2011, at 7:13 AM, CGS wrote:
>
> b) the second time not quite ok, returning the same id's and rev's + the
> same test1 document (step discarded by bulk, getting the first bulk
> operation result instead of incrementing the revision or reporting
> conflict);
>
> You’re using all-or-nothing mode, which never returns 409 Conflict errors but instead creates conflicting revisions in the db. Did you read about how it works?
>
> http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API#Transactional_Semantics_with_Bulk_Updates
>
> —Jens
>
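
For the plan above (sending the bulk request without all_or_nothing and
processing the returned list), here is a minimal sketch in Python. It
assumes the usual shape of a _bulk_docs response in the default mode:
one row per document, carrying "rev" when the write was accepted and
"error"/"reason" when it was rejected. The function name
split_bulk_results and the sample rows (including the rev strings) are
made up for illustration, not taken from a real server.

```python
import json


def split_bulk_results(results):
    """Partition a _bulk_docs response into saved and rejected rows."""
    saved, rejected = [], []
    for row in results:
        # Rejected documents carry an "error" key instead of a "rev".
        if "error" in row:
            rejected.append(row)
        else:
            saved.append(row)
    return saved, rejected


# Hypothetical response body; the rev values are placeholders.
sample = json.loads("""[
  {"id": "test1", "rev": "1-abc"},
  {"id": "test2", "error": "conflict", "reason": "Document update conflict."},
  {"id": "test3", "rev": "1-def"}
]""")

saved, rejected = split_bulk_results(sample)
print([row["id"] for row in saved])
print([row["id"] for row in rejected])
```

This is a single pass over the list, so the cost grows linearly with
the batch size; the rejected rows can then be retried or logged
separately.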


Re: _bulk_docs vs. single document insertion

Posted by Jens Alfke <je...@couchbase.com>.
On Nov 9, 2011, at 7:13 AM, CGS wrote:

b) the second time not quite ok, returning the same id's and rev's + the
same test1 document (step discarded by bulk, getting the first bulk
operation result instead of incrementing the revision or reporting
conflict);

You’re using all-or-nothing mode, which never returns 409 Conflict errors but instead creates conflicting revisions in the db. Did you read about how it works?

http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API#Transactional_Semantics_with_Bulk_Updates

—Jens
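
As a rough illustration of the alternative to all-or-nothing mode: in
the default mode, an update to an existing document has to carry that
document's current _rev, otherwise the row comes back as a conflict.
The sketch below builds a _bulk_docs request body that attaches a _rev
wherever one is known. The helper name build_bulk_payload and the
known_revs mapping (for example, collected from an earlier response)
are assumptions for this example, and the rev value is a placeholder.

```python
import json


def build_bulk_payload(docs, known_revs):
    """Return a _bulk_docs request body, adding _rev where one is known."""
    out = []
    for doc in docs:
        doc = dict(doc)  # copy so the caller's dicts are left untouched
        rev = known_revs.get(doc.get("_id"))
        if rev is not None:
            doc["_rev"] = rev
        out.append(doc)
    # all_or_nothing is deliberately left out of the request body.
    return json.dumps({"docs": out})


docs = [{"_id": "test1", "test": "1"}, {"_id": "test2", "test": "2"}]
known_revs = {"test1": "1-abc"}  # hypothetical rev from an earlier write
print(build_bulk_payload(docs, known_revs))
```

The resulting string can be passed to curl with -d in place of the
inline JSON used earlier in the thread.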