Posted to dev@couchdb.apache.org by Jim Klo <ji...@sri.com> on 2013/05/17 19:03:34 UTC

Deleted and Replacement documents and VDU behavior

So Bob suggested that I raise this topic over here… Attaching the relevant thread from user@ here.

I suppose this is a feature request: optionally pass the document "stub" to the VDU as the oldDoc, if it exists, when the design doc contains a flag to enable this behavior.

I have a couple of use cases:
	1. DMCA takedown: I want to add additional data to the deleted marker so that the VDU can inspect the deleted "stub" doc for the additional data and determine whether to permit the reinstatement of the doc.
	2. Preventing reuse of an _id: I want docs to be immutable except for a one-time delete.

In response to Bob's last reply on user@: how is this behavior any different from a document update?  My workaround is that I don't actually delete documents; I just update them into a "tombstone" document that removes the contents and adds the extra details I need for the VDU to work.  The major difference is that I must encode logic into my views to skip the tombstone documents, which adds unneeded complexity to potentially every view.  My case is not so bad in that all my documents are 'typed', so my tombstone documents change the type such that they are excluded from views; however, I had to carefully check to ensure that was the case.
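
As a rough illustration of the workaround described above (a sketch, not the actual code in use), a validate_doc_update function and a map function might look like the following. The field names `type` and `takedown_reason` and the value "tombstone" are assumptions made up for this example, and `emit` is stubbed so the snippet runs under plain Node rather than inside CouchDB.

```javascript
// Sketch only: the "type" and "takedown_reason" fields are hypothetical.

// Stand-in for CouchDB's emit() so the map function can run outside the view server.
var emitted = [];
function emit(key, value) {
  emitted.push([key, value]);
}

// VDU: documents are immutable except for a one-time update into a tombstone.
function validate_doc_update(newDoc, oldDoc, userCtx) {
  if (!oldDoc) {
    return; // no previous document visible to the VDU: allow the create
  }
  if (oldDoc.type === "tombstone") {
    throw({ forbidden: "tombstones cannot be changed or replaced" });
  }
  if (newDoc._deleted) {
    // A real delete would make oldDoc null on the next create, so refuse it.
    throw({ forbidden: "use a tombstone update instead of deleting" });
  }
  if (newDoc.type !== "tombstone") {
    throw({ forbidden: "documents are immutable" });
  }
  if (!newDoc.takedown_reason) {
    throw({ forbidden: "tombstone must record a takedown_reason" });
  }
}

// Map function (named here only so it can be called directly):
// every view has to skip tombstones explicitly.
function map(doc) {
  if (doc.type === "tombstone") {
    return;
  }
  emit(doc._id, null);
}
```

The extra `type` check in the map function is the per-view cost mentioned above: it has to be repeated, or otherwise guaranteed, in every view.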

Thoughts, explanation why this is bad, etc?

Jim Klo
Senior Software Engineer
Center for Software Engineering
SRI International
t.	@nsomnac

On May 17, 2013, at 9:35 AM, Robert Newson <rn...@apache.org>
 wrote:

> Hm, I dislike that, but you could raise it as a topic in the dev@
> mailing list. I think couchdb's core behavior should be predictable,
> it shouldn't change based on where you are. Consider that this would
> break eventual consistency (a validate_doc_update would have a
> different result based on that flag's value).
> 
> B.
> 
> On 17 May 2013 17:29, Jim Klo <ji...@sri.com> wrote:
>> If there were a way to enable this sort of feature via a flag, like
>> local_seq that would be a good compromise IMO. :)  I don't know what the
>> ramifications of this would be on performance, but would be a 'nice to
>> have'.
>> 
>> Jim Klo
>> Senior Software Engineer
>> Center for Software Engineering
>> SRI International
>> t. @nsomnac
>> 
>> On May 17, 2013, at 8:39 AM, Robert Newson <rn...@apache.org>
>> wrote:
>> 
>> Aha, ok, that makes more sense. oldDoc will be null in that case to
>> match the behavior when there was never a document there, but it's
>> definitely a debatable nuance. I'm in favor of the existing behavior
>> but I do see your point.
>> 
>> B.
>> 
>> On 17 May 2013 16:31, Jim Klo <ji...@sri.com> wrote:
>> 
>> No, I think I incorrectly described the condition where this happens.
>> 
>> If I first delete a doc with extra info like you illustrated, and then
>> re-insert the doc as new, the VDU does not get the existing delete "stub" in
>> my experience. If this has changed in 1.3, I'd welcome it.
>> 
>> It would be useful if the VDU got the existing "deleted" document in certain
>> use cases, such as when a document was removed for a DMCA violation - I don't
>> want it to reappear by mistake. I'd like to have the right logic in my VDU to
>> check the notes in the existing deleted stub before permitting the insert.
>> There are ways around this, which I use instead, but I think that if there's
>> a stub that could be handed to the VDU, it should be.
>> 
>> 
>> - JK
>> 
>> Sent from my iPhone
>> 
>> On May 17, 2013, at 7:41 AM, "Robert Newson" <rn...@apache.org> wrote:
>> 
>> VDU does receive the 'stub', which is always a document. The term
>> 'stub' can mislead people into thinking a deleted document is not an
>> actual document (it is).
>> 
>> Here I insist that deleted documents have a reason;
>> 
>> ➜  ~  curl localhost:5984/db1/_design/foo -XPUT -d
>> '{"validate_doc_update":"function(newDoc) { if(newDoc._deleted &&
>> !newDoc.reason) { throw({forbidden:\"must have a reason\"});  }  }"}'
>> {"ok":true,"id":"_design/foo","rev":"1-ab8a8ecd8cf3de35ed7541facfb75029"}
>> 
>> An empty doc;
>> 
>> ➜  ~  curl localhost:5984/db1/bar -XPUT -d {}
>> {"ok":true,"id":"bar","rev":"1-967a00dff5e02add41819138abb3284d"}
>> 
>> I try to delete with the DELETE method, which sends just _id, _rev, _deleted.
>> 
>> ➜  ~  curl 'localhost:5984/db1/bar?rev=1-967a00dff5e02add41819138abb3284d'
>> -XDELETE
>> {"error":"forbidden","reason":"must have a reason"}
>> 
>> Now I delete with a PUT and a reason;
>> 
>> ➜ curl 'localhost:5984/db1/bar?rev=1-967a00dff5e02add41819138abb3284d'
>> -XPUT -d '{"reason":"because I said so","_deleted":true}'
>> {"ok":true,"id":"bar","rev":"2-6e10b3cc9ea15f6a9d81aa72aaa6e098"}
>> 
>> And it's really deleted;
>> 
>> ➜  ~  curl localhost:5984/db1/bar
>> {"error":"not_found","reason":"deleted"}
>> 
>> And my reason is recorded;
>> 
>> ➜  ~ curl 'localhost:5984/db1/bar?rev=2-6e10b3cc9ea15f6a9d81aa72aaa6e098'
>> {"_id":"bar","_rev":"2-6e10b3cc9ea15f6a9d81aa72aaa6e098","reason":"because
>> I said so","_deleted":true}
>> 
>> B.
>> 
>> On 17 May 2013 14:52, Jim Klo <ji...@sri.com> wrote:
>> 
>> It's a great tip, my only complaint about it is that the deleted stub
>> doesn't get handed to the VDU function, unless that's changed in 1.3
>> 
>> - Jim
>> 
>> 
>> On May 17, 2013, at 12:04 AM, "Dave Cottlehuber" <dc...@jsonified.com> wrote:
>> 
>> On 17 May 2013 01:32, Randall Leeds <ra...@gmail.com> wrote:
>> 
>> Actually, it's even easier than this. It is acceptable to put a body in the
>> DELETE. You can store whatever fields you want accessible in your deletion
>> stubs.
>> 
>> 
>> **WIN** best tip of the month!
>> 
>> 


Re: Deleted and Replacement documents and VDU behavior

Posted by Randall Leeds <ra...@gmail.com>.
On Sat, May 18, 2013 at 1:57 PM, Jim Klo <ji...@sri.com> wrote:
>
>
> Sent from my iPhone
>
> On May 18, 2013, at 1:39 PM, "Randall Leeds" <ra...@gmail.com> wrote:
>
>> On Sat, May 18, 2013 at 8:04 AM, Jim Klo <ji...@sri.com> wrote:
>>>
>>>
>>> Sent from my iPad
>>>
>>> On May 18, 2013, at 1:56 AM, "Randall Leeds" <ra...@gmail.com> wrote:
>>>
>>>> On Fri, May 17, 2013 at 12:12 PM, Randall Leeds <ra...@gmail.com> wrote:
>>>>> On Fri, May 17, 2013 at 11:57 AM, Robert Newson <rn...@apache.org> wrote:
>>>>>> oldDoc is null in this case. That matches the case that the doc is
>>>>>> brand new and is surely deliberate? I asked him to post this here
>>>>>> because I do understand the benefits of it being otherwise and wanted
>>>>>> to see this conversation.
>>>>>>
>>>>>> My position is that deleting a document should free that id for any
>>>>>> future use, which is exactly what Jim does not want.
>>>>>>
>>>>>> I'd like to hear from folks that might have a memory of when this
>>>>>> particular semantic was decided. I think it could arguably have gone
>>>>>> the other way.
>>>>>
>>>>> I know we have a clause for revivification for an update without a rev
>>>>> to a deleted doc.
>>>>>
>>>>> This proposed alternative behavior is attractive to me and if my
>>>>> armchair spelunking is correct, it's actually pretty trivial. It seems
>>>>> to me like we could even make a minor breaking change for 1.4 where
>>>>> the old doc is always passed to VDU handlers, even if it's a
>>>>> tombstone. Migration would mean updating VDU handlers to consider
>>>>> oldDoc._deleted. I think many are probably using VDUs for validating
>>>>> the new doc anyway, and would ignore the second parameter.
>>>>>
>>>>> The default semantics could stay the same, but if we just passed the
>>>>> tombstone to VDU handlers it would be customizable in exactly the way
>>>>> Jim wants. Sounds exactly like the sort of thing VDU is for.
>>>>
>>>> I just realized that it's not clear which revision should be provided
>>>> when attempting to revive a deleted doc, since there may have been
>>>> several revision histories which ended in deletes and there is no
>>>> previous rev specified.
>>>
>>> I think in this case I'd expect only the deleted 'tombstone' doc to be handed to the VDU. I'm not sure what rev that falls under.
>>
>> I'm saying that due to replication conflicts it may be that there are
>> multiple choices.
>
> Thanks Randall,
>
> Educate me please. How does this differ from how it works now on an update where the rev is provided? Assuming the doc has been deleted, the rev of the tombstone is the only thing I would think you'd have to be concerned with.  How would a conflict occur? Are you thinking that the conflict arises when replicating from multiple sources? I'd think you'd have this race condition either way in a M-M scenario.

I don't believe it's necessary to specify a rev when creating a
document that has previously been deleted. It "comes back to life"
without the client needing to specify which revision was the deleted
one.

In a typical update scenario, the previous revision is provided with
the request and if it doesn't match what is stored then a conflict is
thrown.

However, with M-M replication scenarios, it's possible to end up with
conflicts. If a document has more than one live, conflicting revision,
and then they are all deleted, it's unclear to me which revision
should be sent as the "old doc" to the VDU if a create request for
that ID occurs later.

One possibility would be to take the "winning" rev. Usually, a
document with a branching revision tree returns a winning rev as the
response to a GET, but deleted leaf revisions are ignored. The same
precedence could work though, for determining which old doc to pass to
the VDU, by including deleted revisions when there is no *non*-deleted
leaf revision.
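
A minimal sketch of that precedence idea, assuming each leaf revision is described by a hypothetical `{ rev, deleted }` object: live leaves sort first, and ties are broken by the higher generation number and then by the hash part of the rev compared as a string. The function names and the leaf shape are made up for illustration; this is not CouchDB's implementation.

```javascript
// Hypothetical leaf shape: { rev: "2-6e10b3...", deleted: true }.

// Split "N-hash" into its generation number and hash string.
function parseRev(rev) {
  var dash = rev.indexOf("-");
  return { pos: parseInt(rev.slice(0, dash), 10), hash: rev.slice(dash + 1) };
}

// Order leaves: live before deleted, then higher generation, then higher hash.
function compareLeaves(a, b) {
  var aDeleted = !!a.deleted;
  var bDeleted = !!b.deleted;
  if (aDeleted !== bDeleted) {
    return aDeleted ? 1 : -1;
  }
  var ra = parseRev(a.rev);
  var rb = parseRev(b.rev);
  if (ra.pos !== rb.pos) {
    return rb.pos - ra.pos;
  }
  if (ra.hash === rb.hash) {
    return 0;
  }
  return ra.hash < rb.hash ? 1 : -1;
}

// Choose which leaf would be handed to the VDU as the old doc.
// Returns null when the id has no revisions at all.
function pickOldDocLeaf(leaves) {
  if (leaves.length === 0) {
    return null;
  }
  return leaves.slice().sort(compareLeaves)[0];
}
```

With only deleted leaves present, the same ordering simply falls through to the deleted ones, which is the case under discussion here.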

Re: Deleted and Replacement documents and VDU behavior

Posted by Jim Klo <ji...@sri.com>.

Sent from my iPhone

On May 18, 2013, at 1:39 PM, "Randall Leeds" <ra...@gmail.com> wrote:

> On Sat, May 18, 2013 at 8:04 AM, Jim Klo <ji...@sri.com> wrote:
>> 
>> 
>> Sent from my iPad
>> 
>> On May 18, 2013, at 1:56 AM, "Randall Leeds" <ra...@gmail.com> wrote:
>> 
>>> On Fri, May 17, 2013 at 12:12 PM, Randall Leeds <ra...@gmail.com> wrote:
>>>> On Fri, May 17, 2013 at 11:57 AM, Robert Newson <rn...@apache.org> wrote:
>>>>> oldDoc is null in this case. That matches the case that the doc is
>>>>> brand new and is surely deliberate? I asked him to post this here
>>>>> because I do understand the benefits of it being otherwise and wanted
>>>>> to see this conversation.
>>>>> 
>>>>> My position is that deleting a document should free that id for any
>>>>> future use, which is exactly what Jim does not want.
>>>>> 
>>>>> I'd like to hear from folks that might have a memory of when this
>>>>> particular semantic was decided. I think it could arguably have gone
>>>>> the other way.
>>>> 
>>>> I know we have a clause for revivification for an update without a rev
>>>> to a deleted doc.
>>>> 
>>>> This proposed alternative behavior is attractive to me and if my
>>>> armchair spelunking is correct, it's actually pretty trivial. It seems
>>>> to me like we could even make a minor breaking change for 1.4 where
>>>> the old doc is always passed to VDU handlers, even if it's a
>>>> tombstone. Migration would mean updating VDU handlers to consider
>>>> oldDoc._deleted. I think many are probably using VDUs for validating
>>>> the new doc anyway, and would ignore the second parameter.
>>>> 
>>>> The default semantics could stay the same, but if we just passed the
>>>> tombstone to VDU handlers it would be customizable in exactly the way
>>>> Jim wants. Sounds exactly like the sort of thing VDU is for.
>>> 
>>> I just realized that it's not clear which revision should be provided
>>> when attempting to revive a deleted doc, since there may have been
>>> several revision histories which ended in deletes and there is no
>>> previous rev specified.
>> 
>> I think in this case I'd expect only the deleted 'tombstone' doc to be handed to the VDU. I'm not sure what rev that falls under.
> 
> I'm saying that due to replication conflicts it may be that there are
> multiple choices.

Thanks Randall,

Educate me please. How does this differ from how it works now on an update where the rev is provided? Assuming the doc has been deleted, the rev of the tombstone is the only thing I would think you'd have to be concerned with.  How would a conflict occur? Are you thinking that the conflict arises when replicating from multiple sources? I'd think you'd have this race condition either way in a M-M scenario.

Jim 

Re: Deleted and Replacement documents and VDU behavior

Posted by Randall Leeds <ra...@gmail.com>.
On Sat, May 18, 2013 at 8:04 AM, Jim Klo <ji...@sri.com> wrote:
>
>
> Sent from my iPad
>
> On May 18, 2013, at 1:56 AM, "Randall Leeds" <ra...@gmail.com> wrote:
>
>> On Fri, May 17, 2013 at 12:12 PM, Randall Leeds <ra...@gmail.com> wrote:
>>> On Fri, May 17, 2013 at 11:57 AM, Robert Newson <rn...@apache.org> wrote:
>>>> oldDoc is null in this case. That matches the case that the doc is
>>>> brand new and is surely deliberate? I asked him to post this here
>>>> because I do understand the benefits of it being otherwise and wanted
>>>> to see this conversation.
>>>>
>>>> My position is that deleting a document should free that id for any
>>>> future use, which is exactly what Jim does not want.
>>>>
>>>> I'd like to hear from folks that might have a memory of when this
>>>> particular semantic was decided. I think it could arguably have gone
>>>> the other way.
>>>
>>> I know we have a clause for revivification for an update without a rev
>>> to a deleted doc.
>>>
>>> This proposed alternative behavior is attractive to me and if my
>>> armchair spelunking is correct, it's actually pretty trivial. It seems
>>> to me like we could even make a minor breaking change for 1.4 where
>>> the old doc is always passed to VDU handlers, even if it's a
>>> tombstone. Migration would mean updating VDU handlers to consider
>>> oldDoc._deleted. I think many are probably using VDUs for validating
>>> the new doc anyway, and would ignore the second parameter.
>>>
>>> The default semantics could stay the same, but if we just passed the
>>> tombstone to VDU handlers it would be customizable in exactly the way
>>> Jim wants. Sounds exactly like the sort of thing VDU is for.
>>
>> I just realized that it's not clear which revision should be provided
>> when attempting to revive a deleted doc, since there may have been
>> several revision histories which ended in deletes and there is no
>> previous rev specified.
>
> I think in this case I'd expect only the deleted 'tombstone' doc to be handed to the VDU. I'm not sure what rev that falls under.

I'm saying that due to replication conflicts it may be that there are
multiple choices.

Re: Deleted and Replacement documents and VDU behavior

Posted by Jim Klo <ji...@sri.com>.

Sent from my iPad

On May 18, 2013, at 1:56 AM, "Randall Leeds" <ra...@gmail.com> wrote:

> On Fri, May 17, 2013 at 12:12 PM, Randall Leeds <ra...@gmail.com> wrote:
>> On Fri, May 17, 2013 at 11:57 AM, Robert Newson <rn...@apache.org> wrote:
>>> oldDoc is null in this case. That matches the case that the doc is
>>> brand new and is surely deliberate? I asked him to post this here
>>> because I do understand the benefits of it being otherwise and wanted
>>> to see this conversation.
>>> 
>>> My position is that deleting a document should free that id for any
>>> future use, which is exactly what Jim does not want.
>>> 
>>> I'd like to hear from folks that might have a memory of when this
>>> particular semantic was decided. I think it could arguably have gone
>>> the other way.
>> 
>> I know we have a clause for revivification for an update without a rev
>> to a deleted doc.
>> 
>> This proposed alternative behavior is attractive to me and if my
>> armchair spelunking is correct, it's actually pretty trivial. It seems
>> to me like we could even make a minor breaking change for 1.4 where
>> the old doc is always passed to VDU handlers, even if it's a
>> tombstone. Migration would mean updating VDU handlers to consider
>> oldDoc._deleted. I think many are probably using VDUs for validating
>> the new doc anyway, and would ignore the second parameter.
>> 
>> The default semantics could stay the same, but if we just passed the
>> tombstone to VDU handlers it would be customizable in exactly the way
>> Jim wants. Sounds exactly like the sort of thing VDU is for.
> 
> I just realized that it's not clear which revision should be provided
> when attempting to revive a deleted doc, since there may have been
> several revision histories which ended in deletes and there is no
> previous rev specified.

I think in this case I'd expect only the deleted 'tombstone' doc to be handed to the VDU. I'm not sure what rev that falls under.

Re: Deleted and Replacement documents and VDU behavior

Posted by Randall Leeds <ra...@gmail.com>.
On Fri, May 17, 2013 at 12:12 PM, Randall Leeds <ra...@gmail.com> wrote:
> On Fri, May 17, 2013 at 11:57 AM, Robert Newson <rn...@apache.org> wrote:
>> oldDoc is null in this case. That matches the case that the doc is
>> brand new and is surely deliberate? I asked him to post this here
>> because I do understand the benefits of it being otherwise and wanted
>> to see this conversation.
>>
>> My position is that deleting a document should free that id for any
>> future use, which is exactly what Jim does not want.
>>
>> I'd like to hear from folks that might have a memory of when this
>> particular semantic was decided. I think it could arguably have gone
>> the other way.
>
> I know we have a clause for revivification for an update without a rev
> to a deleted doc.
>
> This proposed alternative behavior is attractive to me and if my
> armchair spelunking is correct, it's actually pretty trivial. It seems
> to me like we could even make a minor breaking change for 1.4 where
> the old doc is always passed to VDU handlers, even if it's a
> tombstone. Migration would mean updating VDU handlers to consider
> oldDoc._deleted. I think many are probably using VDUs for validating
> the new doc anyway, and would ignore the second parameter.
>
> The default semantics could stay the same, but if we just passed the
> tombstone to VDU handlers it would be customizable in exactly the way
> Jim wants. Sounds exactly like the sort of thing VDU is for.

I just realized that it's not clear which revision should be provided
when attempting to revive a deleted doc, since there may have been
several revision histories which ended in deletes and there is no
previous rev specified.

Re: Deleted and Replacement documents and VDU behavior

Posted by Randall Leeds <ra...@gmail.com>.
On Fri, May 17, 2013 at 11:57 AM, Robert Newson <rn...@apache.org> wrote:
> oldDoc is null in this case. That matches the case that the doc is
> brand new and is surely deliberate? I asked him to post this here
> because I do understand the benefits of it being otherwise and wanted
> to see this conversation.
>
> My position is that deleting a document should free that id for any
> future use, which is exactly what Jim does not want.
>
> I'd like to hear from folks that might have a memory of when this
> particular semantic was decided. I think it could arguably have gone
> the other way.

I know we have a clause for revivification for an update without a rev
to a deleted doc.

This proposed alternative behavior is attractive to me and if my
armchair spelunking is correct, it's actually pretty trivial. It seems
to me like we could even make a minor breaking change for 1.4 where
the old doc is always passed to VDU handlers, even if it's a
tombstone. Migration would mean updating VDU handlers to consider
oldDoc._deleted. I think many are probably using VDUs for validating
the new doc anyway, and would ignore the second parameter.

The default semantics could stay the same, but if we just passed the
tombstone to VDU handlers it would be customizable in exactly the way
Jim wants. Sounds exactly like the sort of thing VDU is for.
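
To make the migration concrete, here is a sketch of what a VDU handler might look like if the tombstone were passed as oldDoc. It assumes the proposed behavior, not what CouchDB does at the time of this thread (where oldDoc is null in this case). The `reason` field follows the deletion example from the user@ thread; the `allow_reuse` field is a hypothetical marker invented for this sketch.

```javascript
// Assumes the PROPOSED behavior: oldDoc may be a tombstone (oldDoc._deleted is true).
// "allow_reuse" is a made-up field; "reason" mirrors the user@ example.
function validate_doc_update(newDoc, oldDoc, userCtx) {
  // Deletions must record why the document was removed.
  if (newDoc._deleted && !newDoc.reason) {
    throw({ forbidden: "must have a reason" });
  }

  // A create on top of a tombstone: consult the tombstone before allowing it.
  if (oldDoc && oldDoc._deleted && !newDoc._deleted) {
    if (!oldDoc.allow_reuse) {
      throw({ forbidden: "this _id was deleted: " + oldDoc.reason });
    }
  }
}
```

A handler that only validates newDoc and ignores the second parameter would keep working unchanged, which is why the change could be a fairly small break.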

Re: Deleted and Replacement documents and VDU behavior

Posted by Robert Newson <rn...@apache.org>.
oldDoc is null in this case. That matches the case that the doc is
brand new and is surely deliberate? I asked him to post this here
because I do understand the benefits of it being otherwise and wanted
to see this conversation.

My position is that deleting a document should free that id for any
future use, which is exactly what Jim does not want.

I'd like to hear from folks that might have a memory of when this
particular semantic was decided. I think it could arguably have gone
the other way.



On 17 May 2013 19:52, Randall Leeds <ra...@gmail.com> wrote:
> On Fri, May 17, 2013 at 10:03 AM, Jim Klo <ji...@sri.com> wrote:
>> So Bob suggested that I raise this topic over here… Attaching the relevant
>> thread from user@ here.
>>
>> I suppose this is a feature request: optionally pass the document "stub" to
>> the VDU as the oldDoc, if it exists, when the design doc contains a flag to
>> enable this behavior.
>>
>> I have a couple of use cases:
>> 1. DMCA takedown: I want to add additional data to the deleted marker so that
>> the VDU can inspect the deleted "stub" doc for the additional data and
>> determine whether to permit the reinstatement of the doc.
>> 2. Preventing reuse of an _id: I want docs to be immutable except for a
>> one-time delete.
>>
>> In response to Bob's last reply on user@: how is this behavior any different
>> from a document update?  My workaround is that I don't actually delete
>> documents; I just update them into a "tombstone" document that removes the
>> contents and adds the extra details I need for the VDU to work.  The major
>> difference is that I must encode logic into my views to skip the tombstone
>> documents, which adds unneeded complexity to potentially every view.  My case
>> is not so bad in that all my documents are 'typed', so my tombstone documents
>> change the type such that they are excluded from views; however, I had to
>> carefully check to ensure that was the case.
>>
>> Thoughts, explanation why this is bad, etc?
>
> Are we not passing the old doc (when deleted) to the VDU currently? It
> seems like perhaps we should.
>
> The only other piece that would need to change if you use regular
> DELETEs with a tombstone body would be to ensure the tombstone doesn't
> disappear. I'm not sure currently whether or not the tombstones
> survive compaction, for instance. I suspect they do.

Re: Deleted and Replacement documents and VDU behavior

Posted by Randall Leeds <ra...@gmail.com>.
On Fri, May 17, 2013 at 10:03 AM, Jim Klo <ji...@sri.com> wrote:
> So Bob suggested that I raise this topic over here… Attaching the relevant
> thread from user@ here.
>
> I suppose this is a feature request: optionally pass the document "stub" to
> the VDU as the oldDoc, if it exists, when the design doc contains a flag to
> enable this behavior.
>
> I have a couple of use cases:
> 1. DMCA takedown: I want to add additional data to the deleted marker so that
> the VDU can inspect the deleted "stub" doc for the additional data and
> determine whether to permit the reinstatement of the doc.
> 2. Preventing reuse of an _id: I want docs to be immutable except for a
> one-time delete.
>
> In response to Bob's last reply on user@: how is this behavior any different
> from a document update?  My workaround is that I don't actually delete
> documents; I just update them into a "tombstone" document that removes the
> contents and adds the extra details I need for the VDU to work.  The major
> difference is that I must encode logic into my views to skip the tombstone
> documents, which adds unneeded complexity to potentially every view.  My case
> is not so bad in that all my documents are 'typed', so my tombstone documents
> change the type such that they are excluded from views; however, I had to
> carefully check to ensure that was the case.
>
> Thoughts, explanation why this is bad, etc?

Are we not passing the old doc (when deleted) to the VDU currently? It
seems like perhaps we should.

The only other piece that would need to change if you use regular
DELETEs with a tombstone body would be to ensure the tombstone doesn't
disappear. I'm not sure currently whether or not the tombstones
survive compaction, for instance. I suspect they do.