Posted to user@couchdb.apache.org by "Sanjuan, Hector" <he...@here.com> on 2014/08/25 14:16:54 UTC

Deleted documents being replaced by previous revisions

Hi,

We are running a CouchDB 1.5.0 setup with master-master replication.

I am observing that sometimes, a document has multiple revisions stored,
and when deleting the most current one, a previous one replaces it
and becomes available.

The old revision numbers that become available are non-consecutive (e.g.
rev 1234 would be replaced by 742). Querying the revs returns a list of
non-consecutive revisions, for which a full document exists even after
compaction.

As I understand it, old revision records are kept around for
replication, and their contents are subject to removal on compaction. I'd
assume that writing a document 1000 times and then issuing a DELETE would
mark it as deleted and report this on subsequent GETs.

Has anyone come across anything similar? I have searched around without
much luck.

Is this maybe related to replication conflicts where the conflict is
resolved but the conflicting revisions are left behind?

As of now, getting the documents truly deleted means issuing DELETE
a few times until every leftover revision is gone. Of course this only
shows up randomly here and there, and in small tests couchdb deletes
and works as expected.

Thanks,

Hector

Re: Deleted documents being replaced by previous revisions

Posted by Robert Samuel Newson <rn...@apache.org>.
3 nodes is the recommended minimum, yeah. :)

Again, it cannot be done. It’s only in your specific case that you know there are just two servers, but CouchDB is more flexible than that. It’s essential that any and all replicas converge to the same state when they replicate, no matter how much time passes between being connected.

B.

On 25 Aug 2014, at 17:20, Sanjuan, Hector <he...@here.com> wrote:

> Hi,
> 
> CouchDB would not drop information during a replica's downtime, because its revisions would not yet be losing ones: there would be no conflict to solve. Only when a conflict is identified between 2 parties would the losing revision become obsolete, like any other "old" revision. It is just as if the user had manually deleted it to resolve the conflict.
> 
> Right now CouchDB consistently shows the winning revision based on a deterministic algorithm, and then lets the user manually figure out what to do just in case it got it wrong. This would just give CouchDB full power to assume that, based on that algorithm, the winning revision is the correct one to all effects and the others can be discarded. Of course this would never be a default, just an option, and it would only affect documents in conflict.
> 
> The usual view of conflicts is 2 users writing the same resource in different ways at the same time. My use case is different: 1 user writes the same resource to 2 masters within a rather brief period due to a failover from master1 to master2. This happens because master1 is too slow or bloated to answer quickly, so the client tries master2, assuming master1 is down.
> 
> Thanks for the help, looking forward to CouchDB 2.0... A 2-node cluster is good, but ideally I'd like a 3-node cluster :)
> 
> Hector


RE: Deleted documents being replaced by previous revisions

Posted by "Sanjuan, Hector" <he...@here.com>.
Hi,

CouchDB would not drop information during a replica's downtime, because its revisions would not yet be losing ones: there would be no conflict to solve. Only when a conflict is identified between 2 parties would the losing revision become obsolete, like any other "old" revision. It is just as if the user had manually deleted it to resolve the conflict.

Right now CouchDB consistently shows the winning revision based on a deterministic algorithm, and then lets the user manually figure out what to do just in case it got it wrong. This would just give CouchDB full power to assume that, based on that algorithm, the winning revision is the correct one to all effects and the others can be discarded. Of course this would never be a default, just an option, and it would only affect documents in conflict.

The usual view of conflicts is 2 users writing the same resource in different ways at the same time. My use case is different: 1 user writes the same resource to 2 masters within a rather brief period due to a failover from master1 to master2. This happens because master1 is too slow or bloated to answer quickly, so the client tries master2, assuming master1 is down.

Thanks for the help, looking forward to CouchDB 2.0... A 2-node cluster is good, but ideally I'd like a 3-node cluster :)

Hector
________________________________________
From: Robert Samuel Newson <rn...@apache.org>
Sent: Monday, August 25, 2014 17:35
To: user@couchdb.apache.org
Subject: Re: Deleted documents being replaced by previous revisions

Hi,

If you do GET /db/doc?conflicts=true you will get the list of all leaf revisions.

CouchDB cannot gain a switch to 'not preserve losing revisions' as there is no way to know when you no longer require that information. In your situation, imagine that the link between your two masters is down for a period of time. If CouchDB dropped that information before the link came back, your replicas would never converge on the same state; it would be a disaster. CouchDB preserves this information forever (the fact that you deleted a leaf revision is kept forever too). It’s the database’s core strength, enabling "offline replication" as we call it.

Yes, you can create a view to list conflicted documents, but it might be smarter to always fetch documents with ?conflicts=true. That way you can write the new (overwriting) update as well as the deletes of the other revisions. CouchDB is designed to prevent you from losing data even in the event of concurrent updates to the same document at disconnected locations, so you have to do a little bit of work to subvert it.

In CouchDB 2.0 you will be able to stand up a proper 2 node cluster that will behave much more like a single server than your current setup.

B.



Re: Deleted documents being replaced by previous revisions

Posted by Robert Samuel Newson <rn...@apache.org>.
Hi,

If you do GET /db/doc?conflicts=true you will get the list of all leaf revisions.

CouchDB cannot gain a switch to 'not preserve losing revisions' as there is no way to know when you no longer require that information. In your situation, imagine that the link between your two masters is down for a period of time. If CouchDB dropped that information before the link came back, your replicas would never converge on the same state; it would be a disaster. CouchDB preserves this information forever (the fact that you deleted a leaf revision is kept forever too). It’s the database’s core strength, enabling "offline replication" as we call it.

Yes, you can create a view to list conflicted documents, but it might be smarter to always fetch documents with ?conflicts=true. That way you can write the new (overwriting) update as well as the deletes of the other revisions. CouchDB is designed to prevent you from losing data even in the event of concurrent updates to the same document at disconnected locations, so you have to do a little bit of work to subvert it.
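
A minimal sketch of that fetch-then-write pattern, in Python, might look like the following. It only builds the request body; the assumption is that the body is then sent with POST /db/_bulk_docs. The helper name build_resolution and the sample document are made up for illustration and are not from this thread.

```python
import json


def build_resolution(doc, new_fields):
    """Build a _bulk_docs body: one update of the winning revision plus
    one deletion stub per losing leaf listed in doc["_conflicts"]."""
    losers = doc.get("_conflicts", [])
    # Copy the winner without _conflicts (only meaningful in responses),
    # then apply the caller's edit on top of it.
    updated = {k: v for k, v in doc.items() if k != "_conflicts"}
    updated.update(new_fields)
    tombstones = [
        {"_id": doc["_id"], "_rev": rev, "_deleted": True} for rev in losers
    ]
    return {"docs": [updated] + tombstones}


# Hypothetical response from GET /db/foo?conflicts=true
fetched = {
    "_id": "foo",
    "_rev": "1234-aaa",
    "_conflicts": ["742-bbb"],
    "value": 1,
}
body = build_resolution(fetched, {"value": 2})
print(json.dumps(body, indent=2))
```

Sending the update and the deletions in one request keeps the window in which the document is still conflicted short, although each entry in the body is still accepted or rejected on its own.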

In CouchDB 2.0 you will be able to stand up a proper 2 node cluster that will behave much more like a single server than your current setup.

B.

On 25 Aug 2014, at 14:29, Sanjuan, Hector <he...@here.com> wrote:

> Yeah, it is acceptable that the losing version simply vanishes. It's something that mostly happens as a result of a failover and not often at all.
> 
> All non-winning leaves should have _conflicts=true, right? I guess I can just loop through a conflicts view and remove them, but it would be nice if CouchDB could simply not preserve losing revisions, based on a configuration option.
> 
> Thanks for the quick help,
> 
> H


RE: Deleted documents being replaced by previous revisions

Posted by "Sanjuan, Hector" <he...@here.com>.
Yeah, it is acceptable that the losing version simply vanishes. It's something that mostly happens as a result of a failover and not often at all.

All non-winning leaves should have _conflicts=true, right? I guess I can just loop through a conflicts view and remove them, but it would be nice if CouchDB could simply not preserve losing revisions, based on a configuration option.
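
A conflicts view of the kind mentioned here could be sketched as the design document below, written as a Python dict for upload. The design document name "_design/conflicts" and the view name "all" are made-up names for illustration. Note that _conflicts, where present, is a list of losing revision strings rather than a true/false flag.

```python
import json

# Hypothetical design document; the map function itself is JavaScript,
# which is what CouchDB views use by default.
design_doc = {
    "_id": "_design/conflicts",
    "views": {
        "all": {
            "map": (
                "function (doc) {"
                " if (doc._conflicts) { emit(doc._id, doc._conflicts); }"
                " }"
            )
        }
    },
}

print(json.dumps(design_doc, indent=2))
```

Under these assumed names the view would be queried with GET /db/_design/conflicts/_view/all, and each row would carry the document ID and its losing revisions.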

Thanks for the quick help,

H
________________________________________
From: Robert Samuel Newson <rn...@apache.org>
Sent: Monday, August 25, 2014 15:05
To: user@couchdb.apache.org
Subject: Re: Deleted documents being replaced by previous revisions

Hi,

CouchDB does not resolve conflicts; it preserves them until you resolve them (by deleting them, as you’ve been doing). Reducing revs_limit will not help since that only controls the depth of the revision tree and not its breadth.

If you are updating the same document at two different sites, and then replicating them, you will introduce conflicts. This is something you need to account for in your application. If user A updates document Foo on site 1 and user B updates document Foo on site 2 then, after replication, both sites will present the same update, either user A’s or user B’s, and the other becomes a losing revision (preserved but hidden). Is it acceptable in your application for one of these user writes to effectively vanish? Or should something be done to document Foo to reconcile the fact it was edited differently by two different users concurrently?

B.




Re: Deleted documents being replaced by previous revisions

Posted by Robert Samuel Newson <rn...@apache.org>.
Hi,

CouchDB does not resolve conflicts; it preserves them until you resolve them (by deleting them, as you’ve been doing). Reducing revs_limit will not help since that only controls the depth of the revision tree and not its breadth.
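
The depth-versus-breadth point can be shown with a toy model. This is a simplified sketch, not CouchDB's actual data structure: each branch is assumed to be a plain list of revisions from oldest to leaf, and the revision IDs are invented.

```python
def stem(branches, revs_limit):
    """Keep only the newest `revs_limit` revisions of each branch."""
    return [path[-revs_limit:] for path in branches]


branches = [
    ["1-a", "2-b", "3-c", "4-d"],  # branch edited on site 1
    ["1-a", "2-b", "3-x"],         # conflicting branch from site 2
]

stemmed = stem(branches, revs_limit=2)
print(stemmed)       # each branch is shorter
print(len(stemmed))  # but there are still two leaves, so still a conflict
```

However small the limit, every branch keeps its leaf, so the number of conflicting leaves does not change.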

If you are updating the same document at two different sites, and then replicating them, you will introduce conflicts. This is something you need to account for in your application. If user A updates document Foo on site 1 and user B updates document Foo on site 2 then, after replication, both sites will present the same update, either user A’s or user B’s, and the other becomes a losing revision (preserved but hidden). Is it acceptable in your application for one of these user writes to effectively vanish? Or should something be done to document Foo to reconcile the fact it was edited differently by two different users concurrently?

B.


On 25 Aug 2014, at 13:47, Sanjuan, Hector <he...@here.com> wrote:

> Is there any sensible workaround to avoid leaving any leaf behind? Whatever comes out of the CouchDB conflict resolution is fine. The content of previous/conflicted revisions is not really important and not something I would like to go back to.
> 
> Both masters receive writes independently. I am tempted to reduce _revs_limit, but it sounds like that would be a bad idea if my masters lose connectivity to each other for some time (they sit in different DCs).
> 
> H
> 


RE: Deleted documents being replaced by previous revisions

Posted by "Sanjuan, Hector" <he...@here.com>.
Is there any sensible workaround to avoid leaving any leaf behind? Whatever comes out of the CouchDB conflict resolution is fine. The content of previous/conflicted revisions is not really important and not something I would like to go back to.

Both masters receive writes independently. I am tempted to reduce _revs_limit, but it sounds like that would be a bad idea if my masters lose connectivity to each other for some time (they sit in different DCs).

H

________________________________________
From: Robert Samuel Newson <rn...@apache.org>
Sent: Monday, August 25, 2014 14:26
To: user@couchdb.apache.org
Subject: Re: Deleted documents being replaced by previous revisions

Hi,

What’s happening here is your document is conflicted. That is, there are multiple 'latest' revisions to choose from. In this situation, CouchDB chooses one of them to present to you when you do GET /dbname/docid. When you then delete that revision, you are promoting one of the others.

The common way to introduce conflicts is to edit the same document at multiple locations and then replicate, which would appear to be your setup. Are you allowing writes to both masters?

It is only non-latest (we say "non-leaf") revisions that are removed by compaction. CouchDB preserves all of the latest revisions (as we do not know which edit or edits you want to keep), so the revs limit of 1000 that you mention is in fact unrelated to your issue.

B.



On 25 Aug 2014, at 13:16, Sanjuan, Hector <he...@here.com> wrote:

> Hi,
>
> we are running a couchdb 1.5.0 setup with master-master replication.
>
> I am observing that sometimes, a document has multiple revisions stored,
> and when deleting the most current one, a previous one replaces it
> and becomes available.
>
> The old revision numbers that are available are non-consecutive (e.g.
> rev 1234 would be replaced by 742). Querying the revs would come back
> with a list of non-consecutive revisions, for which a full document
> exists even after compaction.
>
> As I understand it, old revision records are kept around for
> replication and their contents are subject to removal on compaction. I'd
> assume writing a document 1000 times and then issuing a DELETE would
> mark it as deleted and inform of this on subsequent GETs.
>
> Has anyone come across anything similar? I have searched around without
> much luck.
>
> Is this maybe related to replication conflicts where the conflict is
> resolved but the conflicting revisions left behind?
>
> As of now, getting the documents truly deleted means issuing DELETE
> a few times until every leftover revision is gone. Of course this only
> shows up randomly here and there, and in small tests couchdb deletes
> and works as expected.
>
> Thanks,
>
> Hector


Re: Deleted documents being replaced by previous revisions

Posted by Robert Samuel Newson <rn...@apache.org>.
Hi,

What’s happening here is your document is conflicted. That is, there are multiple 'latest' revisions to choose from. In this situation, CouchDB chooses one of them to present to you when you do GET /dbname/docid. When you then delete that revision, you are promoting one of the others.

The common way to introduce conflicts is to edit the same document at multiple locations and then replicate, which would appear to be your setup. Are you allowing writes to both masters?

It is only non-latest (we say "non-leaf") revisions that are removed by compaction. CouchDB preserves all of the latest revisions (as we do not know which edit or edits you want to keep), so the revs limit of 1000 that you mention is in fact unrelated to your issue.
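
To make the "promotion" concrete, here is a toy in-memory model of a conflicted document. It is a simplified sketch, not CouchDB's actual code: the Leaf class and the winner and delete functions are invented for illustration, and the winner rule assumed here is "ignore deleted leaves, then take the highest generation, then the highest revision hash".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leaf:
    rev: str              # "<generation>-<hash>", e.g. "1234-aaa"
    deleted: bool = False


def generation(rev):
    """Numeric prefix of a revision id."""
    return int(rev.split("-", 1)[0])


def winner(leaves):
    """Pick the leaf a plain GET would return, or None if all are deleted."""
    live = [leaf for leaf in leaves if not leaf.deleted]
    if not live:
        return None
    return max(live, key=lambda leaf: (generation(leaf.rev), leaf.rev.split("-", 1)[1]))


def delete(leaves, rev):
    """Deleting a leaf appends a deletion stub to that branch only."""
    stub = Leaf(f"{generation(rev) + 1}-tombstone", deleted=True)
    return [stub if leaf.rev == rev else leaf for leaf in leaves]


# Two masters edited the same document independently, so there are two leaves.
leaves = [Leaf("1234-aaa"), Leaf("742-bbb")]
first = winner(leaves).rev             # the long branch wins

leaves = delete(leaves, "1234-aaa")    # DELETE the revision that GET showed
second = winner(leaves).rev            # the other branch is now promoted

leaves = delete(leaves, "742-bbb")     # a second DELETE removes the last live leaf
gone = winner(leaves)                  # None: the document now reads as deleted
```

This matches the reported behaviour: each DELETE only closes one branch, so the document keeps reappearing until every live leaf has been deleted.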

B.



On 25 Aug 2014, at 13:16, Sanjuan, Hector <he...@here.com> wrote:

> Hi,
> 
> we are running a couchdb 1.5.0 setup with master-master replication.
> 
> I am observing that sometimes, a document has multiple revisions stored,
> and when deleting the most current one, a previous one replaces it
> and becomes available.
> 
> The old revision numbers that are available are non-consecutive (e.g.
> rev 1234 would be replaced by 742). Querying the revs would come back
> with a list of non-consecutive revisions, for which a full document
> exists even after compaction.
> 
> As I understand it, old revision records are kept around for
> replication and their contents are subject to removal on compaction. I'd
> assume writing a document 1000 times and then issuing a DELETE would
> mark it as deleted and inform of this on subsequent GETs.
> 
> Has anyone come across anything similar? I have searched around without
> much luck.
> 
> Is this maybe related to replication conflicts where the conflict is
> resolved but the conflicting revisions left behind?
> 
> As of now, getting the documents truly deleted means issuing DELETE
> a few times until every leftover revision is gone. Of course this only
> shows up randomly here and there, and in small tests couchdb deletes
> and works as expected.
> 
> Thanks,
> 
> Hector