Posted to user@couchdb.apache.org by Pau Freixes <pf...@gmail.com> on 2011/02/24 01:05:06 UTC

About document revisions and conflicts between bulk operations and concurrent operations against a single CouchDB instance

Hi list

First of all, my apologies for the long subject :) but it is a good
summary of my problems, or of the things I do not understand.

Currently I am running an event-driven application that updates
CouchDB documents one by one against a single CouchDB instance. So
far the performance has been acceptable at 10-15 document writes
per second ... but now we are thinking of scaling up and building a
faster software architecture with higher performance, going to a
sustained 100 writes per second or more.

After reading some articles and building some stress scripts, I have
been able to run the CouchDB instance with a CPU impact below 0.5 at
no more than 100 writes per second, all of them serialized as
non-bulk operations.

There is the possibility of grouping a set of documents into one bulk
operation, hoping for better document write performance at the cost of
a peak in CPU consumption, but only a peak, which afterwards yields
the CPU to other purposes.
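
A minimal sketch of such a grouped write: the Python below builds (but
does not send) a request for the _bulk_docs endpoint. The database URL,
the document contents and the helper name build_bulk_request are
assumptions made for illustration; all_or_nothing is the body flag as I
understand it from the CouchDB 1.x documentation.

```python
import json
import urllib.request


def build_bulk_request(db_url, docs, all_or_nothing=False):
    """Build (without sending) a POST to <db_url>/_bulk_docs."""
    body = {"docs": list(docs)}
    if all_or_nothing:
        # Only added when requested; the default is the non-atomic mode.
        body["all_or_nothing"] = True
    return urllib.request.Request(
        db_url.rstrip("/") + "/_bulk_docs",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical documents a and b; _rev is the revision we last saw.
docs = [
    {"_id": "a", "_rev": "1-aaa", "value": 1},
    {"_id": "b", "_rev": "1-bbb", "value": 2},
]
req = build_bulk_request("http://localhost:5984/mydb", docs)
# urllib.request.urlopen(req) would send it to a running instance.
```

One request then carries both documents, so the per-request overhead is
paid once instead of once per document.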

For example, from a set of documents (a, b, c, d, e) I can take (a, b)
and write them in a single bulk operation, and write (c, d, e) one by
one. The problem begins when I receive a request to change document b
while it is still inside a bulk operation: now I need to write the new
version of document b in a non-bulk operation, and it can be
overwritten later by the earlier bulk operation.

Step by step, the sequence would be this:

1. I send a bulk operation with all_or_nothing = true (a, b)
2. I receive a request to change document b while operation 1 has still not finished
3. I send a non-bulk operation to change document b with the known revision
4. I receive a response from operation 3 with the new revision of document b
5. I receive a response from operation 1 with the new revisions of documents a and b

In this scenario document b will end up with the wrong value, because
the older version from the bulk operation lands after the newer one.

I can propose a scenario with all_or_nothing = false:

1. I send a bulk operation with all_or_nothing = false (a, b)
2. I receive a request to change document b while operation 1 has still not finished
3. I send a non-bulk operation to change document b with the known revision
4. I receive a response from operation 3 with the new revision of document b
5. I receive a response from operation 1 with the new revision of a,
saying that the revision number of b has a conflict

In this scenario document b will have the right value.
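
To make step 5 concrete, here is a small sketch of how the application
could read the per-document results of a non-atomic bulk write. The
helper name split_bulk_response and the revision strings are made up
for illustration; the row shape (id plus rev on success, id plus error
and reason on failure) is what I understand _bulk_docs to return.

```python
def split_bulk_response(results):
    """Separate a _bulk_docs response into new revisions and conflicts."""
    saved, conflicts = {}, []
    for row in results:
        if row.get("error") == "conflict":
            conflicts.append(row["id"])
        elif "rev" in row:
            saved[row["id"]] = row["rev"]
        # Rows with other errors are ignored in this sketch.
    return saved, conflicts


# Illustrative response for the bulk write of (a, b): a was saved, b was
# rejected because the single-document write arrived first.
response = [
    {"id": "a", "rev": "2-aaa"},
    {"id": "b", "error": "conflict", "reason": "Document update conflict."},
]
saved, conflicts = split_bulk_response(response)
```

Here saved holds the new revision of a, and conflicts lists b, whose
bulk version can simply be dropped because the newer write already won.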

But ... there is always a but: this scenario can also fail.

1. I send a bulk operation with all_or_nothing = false (a, b)
2. I receive a request to change document b while operation 1 has still not finished
3. I send a non-bulk operation to change document b with the known revision
5. I receive a response from operation 1 with the new revisions of documents a and b
4. I receive a response from operation 3 saying that the revision number of b
has a conflict
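
One application-side way to handle this last ordering is to re-read
document b and reapply the change when the single write is rejected.
The sketch below uses a toy in-memory store with integer revisions, not
CouchDB itself; FakeStore, Conflict and update_with_retry are
hypothetical names, and whether reapplying the change on top of the
bulk version is correct depends on the application.

```python
class Conflict(Exception):
    """Stands in for an HTTP 409 response to a document write."""


class FakeStore:
    """Toy in-memory stand-in for one database; not CouchDB itself."""

    def __init__(self):
        self.docs = {}  # doc id -> (integer revision, value)

    def get(self, doc_id):
        return self.docs.get(doc_id, (0, None))

    def put(self, doc_id, rev, value):
        current_rev, _ = self.get(doc_id)
        if rev != current_rev:
            raise Conflict(doc_id)  # the writer's revision is stale
        self.docs[doc_id] = (current_rev + 1, value)
        return current_rev + 1


def update_with_retry(store, doc_id, rev, value, change, max_tries=3):
    """Write change(value); on conflict re-read the document and retry."""
    for _ in range(max_tries):
        try:
            return store.put(doc_id, rev, change(value))
        except Conflict:
            rev, value = store.get(doc_id)  # pick up the winning write
    raise Conflict(doc_id)


store = FakeStore()
store.put("b", 0, 10)   # b exists at revision 1 with value 10
store.put("b", 1, 20)   # the bulk write lands first: revision 2
# The single writer still holds revision 1 and value 10.
new_rev = update_with_retry(store, "b", 1, 10, lambda v: v + 1)
```

The first attempt fails on the stale revision, the second one succeeds
on top of the value written by the bulk operation.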

Are there any tools to avoid this situation? Or is it up to the
application to decide how and what to do in these cases?

-- 
--pau