Posted to notifications@couchdb.apache.org by "Merlin Rabens (JIRA)" <ji...@apache.org> on 2016/02/02 17:14:39 UTC

[jira] [Commented] (COUCHDB-2735) Duplicate document _ids created under high edit load

    [ https://issues.apache.org/jira/browse/COUCHDB-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15128466#comment-15128466 ] 

Merlin Rabens commented on COUCHDB-2735:
----------------------------------------

Hi,

We have been using CouchDB 1.6.1 for a year now and have recently run into duplicate document ID issues. We *don't* have any validate_doc_update functions in our design documents, yet there were plenty of documents with the same document ID but different revision numbers. We have two systems that access our CouchDB nodes heavily via LightCouch.

Any ideas?

At first it seemed that this issue describes exactly our problem, but it doesn't, since we don't have any validate_doc_update functions.

Thanks in advance.

Kind regards,
Merlin

> Duplicate document _ids created under high edit load
> ----------------------------------------------------
>
>                 Key: COUCHDB-2735
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-2735
>             Project: CouchDB
>          Issue Type: Bug
>      Security Level: public(Regular issues) 
>          Components: Database Core
>            Reporter: James Dingwall
>            Assignee: Adam Kocoloski
>             Fix For: 1.7.0, 1.6.2
>
>
> Our database was created under CouchDB 1.2.1 and has been upgraded through 1.3.1 to 1.6.1.  We have been running 1.6.1 since last September.
> We are finding that making a large number of edits to existing documents is causing duplicated document _ids to be created in the _all_docs view:
> # curl -X GET http://127.0.0.1:5984/a2/_all_docs?key=\"vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd\"
> {"total_rows":11670,"offset":10577,"rows":[
> {"id":"vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd","key":"vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd","value":{"rev":"49-c2aa999386dbf20e3a88b72cccb678e0"}},
> {"id":"vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd","key":"vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd","value":{"rev":"14-984492669d302229de0fff2e1c0e4696"}}
> ]}
> Compacting the database will resolve this.
> # curl -X POST http://admin:password@127.0.0.1:5984/a2/_compact -H "Content-type: application/json" -d '{}'
> # curl -X GET http://127.0.0.1:5984/a2/_all_docs?key=\"vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd\"
> {"total_rows":11656,"offset":10564,"rows":[
> {"id":"vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd","key":"vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd","value":{"rev":"49-c2aa999386dbf20e3a88b72cccb678e0"}}
> ]}
> The document is not in conflict at its starting revision and no databases have this database as a target which would cause the problematic document to be written to via replications. i.e. curl -X GET 'http://127.0.0.1:5984/a000prodmaster/vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd?conflicts=true&deleted_conflicts=true' just returns the document.
> Our edit process consists of a number of view functions and update handlers which are connected by Python code to add extra document fields.  We expect that many documents will come up in multiple views, so document update conflicts are anticipated and handled in the Python code.  Some of the edits are return([modified_doc, response]); others are return([null, modified_doc]), which are collected and submitted as bulk saves (all_or_nothing=false).
> When a document _id is duplicated it appears that views are calculated using the older revision while modifications are written to the newer revision.
> I am experiencing this regularly while testing an upgrade for a database containing ~12000 documents and which will trigger ~26000 edits.  This upgrade test is on a separate machine also running CouchDB 1.6.1 and Erlang 18, but the same was observed with 17.5.
> This issue appears similar to COUCHDB-968 but we have never run the versions that this affected.
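
For anyone wanting to check a database for the symptom described above, here is a minimal Python sketch that scans a parsed _all_docs response for ids that occur in more than one row. The function name find_duplicate_ids is made up for illustration and is not part of CouchDB or any client library. The sample body is the response quoted in the report.

```python
import json
from collections import defaultdict


def find_duplicate_ids(all_docs_response):
    """Return {doc_id: [rev, ...]} for every id listed in more than one row.

    Hypothetical helper: expects the parsed JSON body of a GET on
    /<db>/_all_docs, i.e. a dict with a "rows" list.
    """
    revs_by_id = defaultdict(list)
    for row in all_docs_response.get("rows", []):
        revs_by_id[row["id"]].append(row["value"]["rev"])
    # A healthy _all_docs listing has exactly one row per id.
    return {doc_id: revs for doc_id, revs in revs_by_id.items() if len(revs) > 1}


# Response body copied from the report (before compaction).
before_compaction = json.loads("""
{"total_rows":11670,"offset":10577,"rows":[
{"id":"vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd","key":"vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd","value":{"rev":"49-c2aa999386dbf20e3a88b72cccb678e0"}},
{"id":"vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd","key":"vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd","value":{"rev":"14-984492669d302229de0fff2e1c0e4696"}}
]}
""")

# Response body copied from the report (after compaction).
after_compaction = json.loads("""
{"total_rows":11656,"offset":10564,"rows":[
{"id":"vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd","key":"vm-84082a94-0f1c-4eff-9216-7ac1e52ce9cd","value":{"rev":"49-c2aa999386dbf20e3a88b72cccb678e0"}}
]}
""")

duplicates = find_duplicate_ids(before_compaction)
for doc_id, revs in duplicates.items():
    print(doc_id, revs)
```

To scan a whole database, the same function could be fed the full body of GET /<db>/_all_docs (without the key parameter). An empty result only means that no id is listed twice in that listing; it says nothing about the cause.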


