Posted to dev@couchdb.apache.org by Gilbert B Garza <gi...@gmail.com> on 2008/07/25 17:39:49 UTC
Bulk insert conflicts
I just recently filed a bug for couchdb's error reporting on conflicts with
bulk insert:
https://issues.apache.org/jira/browse/COUCHDB-97
At the moment, the error CouchDB returns is
{"error":"conflict","reason":"Update conflict"}. When PUTing a single
document, this isn't that bad. When bulk inserting thousands of documents,
it isn't helpful in the least, because it doesn't say which documents
conflicted.
I would like to propose a solution to this. The current success message for
bulk docs looks like this:
{
"ok":true,
"new_revs": [
{"id": "0", "rev": "3682408536"},
{"id": "1", "rev": "3206753266"},
{"id": "2", "rev": "426742535"}
]
}
If there are any conflicts, I suggest that it look something like
this:
{
"ok":true,
"conflicts":true,
"new_revs": [
{"id": "1", "rev": "12345"},
{"id": "3", "rev": "23456"},
{"id": "4", "rev": "34567"}
],
"conflict_revs": [
{"id": "0", "attempted_rev": "1001", "current_rev": "1002"},
{"id": "2", "attempted_rev": "5000", "current_rev": "5002"}
]
}
This way, not only do you know which documents had conflicts, but the entire
bulk operation does not have to fail.
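As a rough illustration of how a client might consume the response proposed above, here is a minimal Python sketch. It assumes the proposed shape were adopted as written; the sample body is copied from the proposal, and `split_bulk_result` is a hypothetical helper name, not part of any CouchDB client.

```python
import json

# Sample body in the shape proposed above. This is a proposal,
# not a format CouchDB is known to return.
PROPOSED_BODY = """
{
  "ok": true,
  "conflicts": true,
  "new_revs": [
    {"id": "1", "rev": "12345"},
    {"id": "3", "rev": "23456"},
    {"id": "4", "rev": "34567"}
  ],
  "conflict_revs": [
    {"id": "0", "attempted_rev": "1001", "current_rev": "1002"},
    {"id": "2", "attempted_rev": "5000", "current_rev": "5002"}
  ]
}
"""


def split_bulk_result(body):
    """Return (saved, conflicted) dicts keyed by document id.

    saved maps id -> new revision; conflicted maps id -> the
    revision the server currently holds.
    """
    result = json.loads(body)
    saved = {row["id"]: row["rev"] for row in result.get("new_revs", [])}
    conflicted = {
        row["id"]: row["current_rev"]
        for row in result.get("conflict_revs", [])
    }
    return saved, conflicted


saved, conflicted = split_bulk_result(PROPOSED_BODY)
```

With this split, a client could record the new revisions for the saved documents and refetch only the conflicted ones before retrying.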
What do you all think?
-Gilbert
Re: Bulk insert conflicts
Posted by Gilbert B Garza <gi...@gmail.com>.
On Fri, Jul 25, 2008 at 11:16 AM, Michael Hendricks <mi...@ndrix.org>
wrote:
> The current bulk_docs behavior of failing the entire operation when one
> part fails is a very useful feature though. It allows for limited
> transactions. I use the bulk_docs feature to maintain the database in a
> consistent state when I need to delete one document and create another
> one. If either operation fails, I know that the entire bulk_docs
> request will fail and the database will still be in a consistent state.
>
> Perhaps using something like this as the body of the 412 response:
>
> {
> "ok":false,
> "conflict_revs": [
> {"id": "0", "attempted_rev": "1001", "current_rev": "1002"},
> {"id": "2", "attempted_rev": "5000", "current_rev": "5002"}
> ]
> }
>
> I removed "conflicts":true since that can be inferred from the presence
> of conflict_revs.
>
> --
> Michael
That's true, I didn't think about transactions. That's much more useful than
completing the bulk operation with the conflicted docs left out.
I like your solution much better than mine, Michael. +1
Re: Bulk insert conflicts
Posted by Michael Hendricks <mi...@ndrix.org>.
On Fri, Jul 25, 2008 at 10:39:49AM -0500, Gilbert B Garza wrote:
> If there are any conflicts, I suggest that it look something like
> this:
>
> {
> "ok":true,
> "conflicts":true,
> "new_revs": [
> {"id": "1", "rev": "12345"},
> {"id": "3", "rev": "23456"},
> {"id": "4", "rev": "34567"}
> ],
> "conflict_revs": [
> {"id": "0", "attempted_rev": "1001", "current_rev": "1002"},
> {"id": "2", "attempted_rev": "5000", "current_rev": "5002"}
> ]
> }
>
> This way, not only do you know which documents had conflicts, but the entire
> bulk operation does not have to fail.
I like the idea of indicating which documents caused the conflict. I
can see how that would make it easier to resolve the conflicts and try
again.
The current bulk_docs behavior of failing the entire operation when one
part fails is a very useful feature though. It allows for limited
transactions. I use the bulk_docs feature to maintain the database in a
consistent state when I need to delete one document and create another
one. If either operation fails, I know that the entire bulk_docs
request will fail and the database will still be in a consistent state.
Perhaps using something like this as the body of the 412 response:
{
"ok":false,
"conflict_revs": [
{"id": "0", "attempted_rev": "1001", "current_rev": "1002"},
{"id": "2", "attempted_rev": "5000", "current_rev": "5002"}
]
}
I removed "conflicts":true since that can be inferred from the presence
of conflict_revs.
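To make the all-or-nothing variant concrete, here is a minimal Python sketch of a client reading such a 412 body. It assumes the response shape suggested above; `ids_to_refetch` is a hypothetical helper name, and how the status code and body are obtained is left to whatever HTTP client is in use.

```python
import json


def ids_to_refetch(status, body):
    """Return the ids that caused the whole bulk request to be rejected.

    Assumes a 412 response carries a "conflict_revs" list as suggested
    above. Any other status yields an empty list.
    """
    if status != 412:
        return []  # The request was not rejected for conflicts.
    payload = json.loads(body)
    return [row["id"] for row in payload.get("conflict_revs", [])]


# Example body copied from the suggestion above.
example_body = """
{
  "ok": false,
  "conflict_revs": [
    {"id": "0", "attempted_rev": "1001", "current_rev": "1002"},
    {"id": "2", "attempted_rev": "5000", "current_rev": "5002"}
  ]
}
"""
stale_ids = ids_to_refetch(412, example_body)
```

Because nothing was written, the client can refetch just these documents, resolve the differences, and resubmit the whole batch.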
--
Michael
Re: Bulk insert conflicts
Posted by Chris Anderson <jc...@grabb.it>.
On Fri, Jul 25, 2008 at 11:39 AM, Gilbert B Garza
<gi...@gmail.com> wrote:
> This way, not only do you know which documents had conflicts, but the entire
> bulk operation does not have to fail.
>
> What do you all think?
It used to behave more like this, but was changed so that you can use
bulk insert to get transactions. I wouldn't mind an option
conflicts=ok or something, to trigger the functionality you describe.
It can get tedious having to check for each id you are about to
insert, before creating a bulk post.
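A minimal Python sketch of how a client might opt in to such a flag. The `conflicts=ok` query parameter is only the idea floated here and is not known to exist in CouchDB; `bulk_docs_url` is a hypothetical helper, and the host and database names in the usage lines are placeholders.

```python
from urllib.parse import urlencode


def bulk_docs_url(base, db, allow_conflicts=False):
    """Build the bulk-insert URL, optionally adding the suggested flag.

    "conflicts=ok" is a hypothetical parameter taken from this thread.
    """
    url = f"{base}/{db}/_bulk_docs"
    if allow_conflicts:
        # Opt in to partial success instead of all-or-nothing.
        url += "?" + urlencode({"conflicts": "ok"})
    return url


strict_url = bulk_docs_url("http://localhost:5984", "mydb")
lenient_url = bulk_docs_url("http://localhost:5984", "mydb", allow_conflicts=True)
```

Keeping the flag off by default would preserve the transactional behavior for callers who rely on it.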
--
Chris Anderson
http://jchris.mfdz.com