Posted to dev@couchdb.apache.org by Gilbert B Garza <gi...@gmail.com> on 2008/07/25 17:39:49 UTC

Bulk insert conflicts

I just recently filed a bug for couchdb's error reporting on conflicts with
bulk insert:

https://issues.apache.org/jira/browse/COUCHDB-97

For the moment, the error CouchDB returns is
{"error":"conflict","reason":"Update conflict"}. When PUTing a single
document, this isn't that bad. When bulk inserting thousands of documents,
it isn't helpful in the least, because it doesn't say which documents
conflicted.

I would like to propose a solution to this. The current success message for
bulk docs looks like this:

{
  "ok":true,
  "new_revs": [
    {"id": "0", "rev": "3682408536"},
    {"id": "1", "rev": "3206753266"},
    {"id": "2", "rev": "426742535"}
  ]
}

If there are any conflicts, I suggest that it would look something like
this:

{
  "ok":true,
  "conflicts":true,
  "new_revs": [
    {"id": "1", "rev": "12345"},
    {"id": "3", "rev": "23456"},
    {"id": "4", "rev": "34567"}
  ],
  "conflict_revs": [
    {"id": "0", "attempted_rev": "1001", "current_rev": "1002"},
    {"id": "2", "attempted_rev": "5000", "current_rev": "5002"}
  ]
}

This way, not only do you know which documents had conflicts, but the entire
bulk operation does not have to fail.
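
As a rough illustration of how a client might consume that proposed shape, here is a minimal Python sketch. The field names ("new_revs", "conflict_revs", "attempted_rev", "current_rev") are only the ones suggested in this message, not an existing CouchDB API, and split_bulk_result is a made-up helper name.

```python
import json


def split_bulk_result(body):
    """Split a proposed-format bulk response into saved and conflicted docs.

    Assumes the hypothetical response shape proposed in this thread.
    """
    result = json.loads(body)
    # id -> new rev for every document that was written
    saved = {row["id"]: row["rev"] for row in result.get("new_revs", [])}
    # id -> (rev the client sent, rev currently in the database)
    conflicted = {
        row["id"]: (row["attempted_rev"], row["current_rev"])
        for row in result.get("conflict_revs", [])
    }
    return saved, conflicted


example_body = """
{
  "ok": true,
  "conflicts": true,
  "new_revs": [
    {"id": "1", "rev": "12345"},
    {"id": "3", "rev": "23456"}
  ],
  "conflict_revs": [
    {"id": "0", "attempted_rev": "1001", "current_rev": "1002"}
  ]
}
"""

saved, conflicted = split_bulk_result(example_body)
```

With a response like this, the client only has to revisit the ids in conflicted rather than the whole batch.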

What do you all think?

-Gilbert

Re: Bulk insert conflicts

Posted by Gilbert B Garza <gi...@gmail.com>.
On Fri, Jul 25, 2008 at 11:16 AM, Michael Hendricks <mi...@ndrix.org>
wrote:

> The current bulk_docs behavior of failing the entire operation when one
> part fails is a very useful feature though.  It allows for limited
> transactions.  I use the bulk_docs feature to maintain the database in a
> consistent state when I need to delete one document and create another
> one.  If either operation fails, I know that the entire bulk_docs
> request will fail and the database will still be in a consistent state.
>
> Perhaps using something like this as the body of the 412 response:
>
>    {
>    "ok":false,
>     "conflict_revs": [
>        {"id": "0", "attempted_rev": "1001", "current_rev": "1002"},
>        {"id": "2", "attempted_rev": "5000", "current_rev": "5002"}
>    ]
>    }
>
> I removed "conflicts":true since that can be inferred from the presence
> of conflict_revs.
>
> --
> Michael


That's true, I didn't think about transactions. That's much more useful than
carrying on with the bulk operation while skipping the conflicted docs.

I like your solution much better than mine, Michael.  +1

Re: Bulk insert conflicts

Posted by Michael Hendricks <mi...@ndrix.org>.
On Fri, Jul 25, 2008 at 10:39:49AM -0500, Gilbert B Garza wrote:
> If there are any conflicts, I suggest that it would look something like
> this:
> 
> {
>   "ok":true,
>   "conflicts":true,
>   "new_revs": [
>     {"id": "1", "rev": "12345"},
>     {"id": "3", "rev": "23456"},
>     {"id": "4", "rev": "34567"}
>   ],
>   "conflict_revs": [
>     {"id": "0", "attempted_rev": "1001", "current_rev": "1002"},
>     {"id": "2", "attempted_rev": "5000", "current_rev": "5002"}
>   ]
> }
> 
> This way, not only do you know which documents had conflicts, but the entire
> bulk operation does not have to fail.

I like the idea of indicating which documents caused the conflict.  I
can see how that would make it easier to resolve the conflicts and try
again.

The current bulk_docs behavior of failing the entire operation when one
part fails is a very useful feature though.  It allows for limited
transactions.  I use the bulk_docs feature to maintain the database in a
consistent state when I need to delete one document and create another
one.  If either operation fails, I know that the entire bulk_docs
request will fail and the database will still be in a consistent state.

Perhaps using something like this as the body of the 412 response:

    {
    "ok":false,
    "conflict_revs": [
        {"id": "0", "attempted_rev": "1001", "current_rev": "1002"},
        {"id": "2", "attempted_rev": "5000", "current_rev": "5002"}
    ]
    }

I removed "conflicts":true since that can be inferred from the presence
of conflict_revs.
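
To make the resolve-and-retry flow concrete, here is a small Python sketch of what a client could do with such a 412 body. It assumes the response shape suggested above, which is only a proposal, and rebase_conflicted is a hypothetical helper. It simply adopts the database's current rev for each conflicted document; a real client would probably want to fetch and merge the current contents first.

```python
import json


def rebase_conflicted(docs, status, body):
    """Return a copy of docs ready to resubmit, or None if no retry is needed.

    Assumes the hypothetical 412 body with a "conflict_revs" list.
    """
    if status != 412:
        return None
    # id -> rev currently stored in the database
    current = {
        row["id"]: row["current_rev"]
        for row in json.loads(body).get("conflict_revs", [])
    }
    # Copy every doc; only conflicted ones get their _rev replaced.
    return [
        {**doc, "_rev": current[doc["_id"]]} if doc.get("_id") in current else dict(doc)
        for doc in docs
    ]


docs = [{"_id": "0", "_rev": "1001"}, {"_id": "1", "_rev": "7"}]
body = '{"ok": false, "conflict_revs": [{"id": "0", "attempted_rev": "1001", "current_rev": "1002"}]}'
retry_docs = rebase_conflicted(docs, 412, body)
```

Because the whole request failed, the entire batch in retry_docs is resubmitted, which keeps the all-or-nothing behavior.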

-- 
Michael

Re: Bulk insert conflicts

Posted by Chris Anderson <jc...@grabb.it>.
On Fri, Jul 25, 2008 at 11:39 AM, Gilbert B Garza
<gi...@gmail.com> wrote:
> This way, not only do you know which documents had conflicts, but the entire
> bulk operation does not have to fail.
>
> What do you all think?

It used to behave more like this, but was changed so that you can use
bulk insert to get transactions. I wouldn't mind an option
conflicts=ok or something, to trigger the functionality you describe.
It can get tedious having to check for each id you are about to
insert, before creating a bulk post.
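
For reference, the pre-check I mean looks roughly like the Python sketch below. stale_docs is a made-up helper, and current_revs stands for an id-to-rev mapping the client has already fetched from the database somehow; how it is fetched is left out here.

```python
def stale_docs(docs, current_revs):
    """Return the docs whose _rev does not match what the database holds.

    current_revs maps doc id -> current rev; ids absent from it are
    treated as not yet existing in the database.
    """
    stale = []
    for doc in docs:
        known = current_revs.get(doc["_id"])
        # A brand-new doc has no _rev and no known rev, so both are None.
        if known != doc.get("_rev"):
            stale.append(doc)
    return stale


docs = [{"_id": "a"}, {"_id": "b", "_rev": "1"}, {"_id": "c", "_rev": "2"}]
current_revs = {"b": "1", "c": "3"}
would_conflict = stale_docs(docs, current_revs)
```

Even with this check, another writer can still update a document between the lookup and the bulk post, so it narrows the window rather than removing conflicts.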

-- 
Chris Anderson
http://jchris.mfdz.com