You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@couchdb.apache.org by Michael Hendricks <mi...@ndrix.org> on 2008/07/10 05:34:53 UTC

deleting a deleted document

When using the bulk delete feature, I expected that deleting a document
which has already been deleted would respond with a 404 code.  However,
it responds with a 201 indicating that there is no problem.  Is this
expected behavior?  Here are the relevant HTTP requests and responses.
I'm running the current trunk version of CouchDB.


Delete a newly created document:

---- Request ----
POST http://127.0.0.1:5984/net-couchdb-4854-21551/_bulk_docs
Accept: application/json

{"docs":[{"_deleted":true,"_id":"1fffb1e8a4e3d9e69e14f6e83b408ef4","_rev":"1180153770"}]}

---- Response ----
HTTP/1.1 201 Created
Cache-Control: must-revalidate
Date: Thu, 10 Jul 2008 03:23:49 GMT
Server: CouchDB/0.9.0a-incubating (Erlang OTP/R12B)
Content-Length: 84
Content-Type: application/json
Client-Date: Thu, 10 Jul 2008 03:23:49 GMT
Client-Peer: 127.0.0.1:5984
Client-Response-Num: 1

{"ok":true,"new_revs":[{"id":"1fffb1e8a4e3d9e69e14f6e83b408ef4","rev":"161286410"}]}




Trying to delete the same document a second time:

---- Request ----
POST http://127.0.0.1:5984/net-couchdb-4854-21551/_bulk_docs
Accept: application/json

{"docs":[{"_deleted":true,"_id":"1fffb1e8a4e3d9e69e14f6e83b408ef4","_rev":"161286410"}]}

---- Response ----
HTTP/1.1 201 Created
Cache-Control: must-revalidate
Date: Thu, 10 Jul 2008 03:23:49 GMT
Server: CouchDB/0.9.0a-incubating (Erlang OTP/R12B)
Content-Length: 84
Content-Type: application/json
Client-Date: Thu, 10 Jul 2008 03:23:49 GMT
Client-Peer: 127.0.0.1:5984
Client-Response-Num: 1

{"ok":true,"new_revs":[{"id":"1fffb1e8a4e3d9e69e14f6e83b408ef4","rev":"890346482"}]}

-- 
Michael

[PATCH] _bulk_docs: Make deleting a deleted document a conflict

Posted by Michael Hendricks <mi...@ndrix.org>.
When using the REST API, if one tries to delete a document which
has already been deleted, he receives a 404 error.  However, if
one tries the same thing with the _bulk_docs API, the deletion
is accepted without error.  This patch causes _bulk_docs to generate
a 412 error if one attempts to delete a document which has already
been deleted.

Signed-off-by: Michael Hendricks <mi...@ndrix.org>
---
 share/www/script/couch_tests.js |   10 ++++++++++
 src/couchdb/couch_db.erl        |    4 +++-
 2 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/share/www/script/couch_tests.js b/share/www/script/couch_tests.js
index 6a1e72c..7c8af78 100644
--- a/share/www/script/couch_tests.js
+++ b/share/www/script/couch_tests.js
@@ -223,6 +223,16 @@ var tests = {
     T(result.new_revs.length == 5);
     for (i = 0; i < 5; i++) {
       T(db.open(docs[i]._id) == null);
+      docs[i]._deleted = true;
+    }
+
+    // Deleting the docs again fails
+    try {
+      db.bulkSave(docs);
+      T("no double delete conflict" && false); // we shouldn't hit here
+    } catch (e) {
+      T(e.error == "conflict");
+      T(e.reason == "Update conflict");
     }
   },
 
diff --git a/src/couchdb/couch_db.erl b/src/couchdb/couch_db.erl
index 05b584a..f8330ed 100644
--- a/src/couchdb/couch_db.erl
+++ b/src/couchdb/couch_db.erl
@@ -238,6 +238,8 @@ prepare_doc_for_new_edit(Db, #doc{id=Id,revs=[NewRev|PrevRevs]}=Doc, OldFullDocI
     case PrevRevs of
     [PrevRev|_] ->
         case dict:find(PrevRev, LeafRevsDict) of
+        {ok, {true, Sp, DiskRevs}} ->
+            throw(conflict);
         {ok, {Deleted, Sp, DiskRevs}} ->
             case couch_doc:has_stubs(Doc) of
             true ->
@@ -1035,4 +1037,4 @@ start_copy_compact_int(#db{name=Name,filepath=Filepath}=Db, CopyLocal) ->
     Db#db.update_pid ! {compact_done, CompactFile}.
     
     
-    
\ No newline at end of file
+    
-- 
1.5.6


Re: deleting a deleted document

Posted by Michael Hendricks <mi...@ndrix.org>.
On Fri, Jul 11, 2008 at 02:44:07PM -0400, Damien Katz wrote:
> There is actually a use case for re-deleting documents.
>
> CouchDB deletions replicate around, just like regular documents. However, I 
> realized i's often not be enough to record that the document was deleted, 
> you might also want there to be meta information stored in the deletion 
> record, like who deleted it, and when.

Very interesting.  Perhaps that explains why I was able to create a
document that started its life deleted.  For instance invoking
_bulk_docs with

    { "docs":[{ "id":"12345", "_deleted":true }] }

> One thing I don't understand about your request though, I'm not sure how 
> you'd delete documents without actually knowing they where already deleted. 
> Since it is required to have the rev to delete, wouldn't you also have 
> loaded the document information to get the rev?

Excellent point.  The test cases I had written in Javascript and Perl
both retained the rev from the previous delete and therefore could
delete without conflict.  When I made another test using fork() to more
closely model what I'm working towards, the rev in the child (which
deleted the document first) and the rev in the parent (which is trying
to delete it second) don't match which causes conflict.

Thank you.

-- 
Michael

Re: deleting a deleted document

Posted by Damien Katz <da...@apache.org>.
There is actually a use case for re-deleting documents.

CouchDB deletions replicate around, just like regular documents.  
However, I realized i's often not be enough to record that the  
document was deleted, you might also want there to be meta information  
stored in the deletion record, like who deleted it, and when.

I think I'm the only one who knows this, but you can put information  
any information into deleted documents that you can have in normal  
documents. Right now itsnt not very useful, and really though, you'd  
only want meta information in these documents, information about the  
deletion (who,when,why), not application data. You can use this meta  
information for security purposes (i.e. is this replicated delete by  
this user allowed) or for purge information, such that documents  
deleted X months/days/hours ago are now purged from the database. (no  
purge functionality exists just yet, but you see my point)

Anyway, the fact that this meta information is useful on a deletion  
also means you might want to "re-delete" something that's already  
deleted, to change the meta information associated with the deleted  
document.

One thing I don't understand about your request though, I'm not sure  
how you'd delete documents without actually knowing they where already  
deleted. Since it is required to have the rev to delete, wouldn't you  
also have loaded the document information to get the rev?

-Damien

On Jul 11, 2008, at 2:23 PM, Michael Hendricks wrote:

> On Fri, Jul 11, 2008 at 05:01:05PM +0100, Noah Slater wrote:
>> On Fri, Jul 11, 2008 at 09:20:41AM -0600, Michael Hendricks wrote:
>>> My other concern is that the CouchDB REST API doesn't accept double
>>> deletions.  It generates an error.  If CouchDB is going to give an  
>>> error
>>> in one place, I think it should be consistent and do it elsewhere  
>>> too.
>>
>> Consistency is good.
>
> The use case I'm working with is something like this.  I have a  
> document
> that I want to delete.  If the document exists in the database
> everything is fine and the delete operation proceeds normally.  If the
> document has already been deleted from the database, my own  
> knowledge of
> the database's state was obviously wrong (or I wouldn't have tried to
> delete this document).  In some cases I don't care and in some cases I
> do.
>
> If CouchDB silently deletes documents that have already been  
> deleted, I
> have no way of finding that out because the return value from  
> _bulk_docs
> gives no hint.
>
> If, for example, I run "rm file-does-not-exist" at a bash shell it
> generates a warning and helpfully sets $? to 1.  If I don't care that
> the file was missing, I just ignore the return value.  If I do care, I
> can take appropriate action.  I hesitate to use an example from  
> RDBMSes
> (since we probably wouldn't be here if we cared much for them), but  
> when
> I perform a DELETE query I receive a count of the rows affected (at
> least in PostgreSQL, MySQL and SQLite).  If the count matters to me, I
> can use it.  If not, I can ignore it.
>
> Perhaps _bulk_docs could return something like:
>
>    {
>        "ok" : true,
>        "new_revs" : [
>            { "id" : "123edce", "rev" : "12345" },
>            { "id" : "ca32f29", "rev" : "55632", "already_deleted" :  
> true },
>        ],
>    }
>
> which would let consumers take action if they care.
>
> We can't assume that people will just use DELETE if they want this  
> kind
> of notice because that doesn't scale well to deleting many records or
> performing these deletions as part of an atomic operation.
>
> -- 
> Michael


Re: deleting a deleted document

Posted by Michael Hendricks <mi...@ndrix.org>.
On Fri, Jul 11, 2008 at 05:01:05PM +0100, Noah Slater wrote:
> On Fri, Jul 11, 2008 at 09:20:41AM -0600, Michael Hendricks wrote:
> > My other concern is that the CouchDB REST API doesn't accept double
> > deletions.  It generates an error.  If CouchDB is going to give an error
> > in one place, I think it should be consistent and do it elsewhere too.
> 
> Consistency is good.

The use case I'm working with is something like this.  I have a document
that I want to delete.  If the document exists in the database
everything is fine and the delete operation proceeds normally.  If the
document has already been deleted from the database, my own knowledge of
the database's state was obviously wrong (or I wouldn't have tried to
delete this document).  In some cases I don't care and in some cases I
do.

If CouchDB silently deletes documents that have already been deleted, I
have no way of finding that out because the return value from _bulk_docs
gives no hint.

If, for example, I run "rm file-does-not-exist" at a bash shell it
generates a warning and helpfully sets $? to 1.  If I don't care that
the file was missing, I just ignore the return value.  If I do care, I
can take appropriate action.  I hesitate to use an example from RDBMSes
(since we probably wouldn't be here if we cared much for them), but when
I perform a DELETE query I receive a count of the rows affected (at
least in PostgreSQL, MySQL and SQLite).  If the count matters to me, I
can use it.  If not, I can ignore it.

Perhaps _bulk_docs could return something like:

    {
        "ok" : true,
        "new_revs" : [
            { "id" : "123edce", "rev" : "12345" },
            { "id" : "ca32f29", "rev" : "55632", "already_deleted" : true },
        ],
    }

which would let consumers take action if they care.

We can't assume that people will just use DELETE if they want this kind
of notice because that doesn't scale well to deleting many records or
performing these deletions as part of an atomic operation.

-- 
Michael

Re: deleting a deleted document

Posted by Troy Kruthoff <tk...@blit.com>.
On Jul 11, 2008, at 9:01 AM, Noah Slater wrote:

> On Fri, Jul 11, 2008 at 09:20:41AM -0600, Michael Hendricks wrote:
>> Most database interface modules that I've worked with generate  
>> exceptions in
>> such cases (even if the underlying database doesn't) as a way to  
>> help the
>> programmer find his mistakes.
>
> That is certainly not my experience.

Nor mine, it really is a concurrency issue, and concurrent deletes  
should be acceptable

>
>
>> My other concern is that the CouchDB REST API doesn't accept double
>> deletions.  It generates an error.  If CouchDB is going to give an  
>> error
>> in one place, I think it should be consistent and do it elsewhere  
>> too.
>
> Consistency is good.

Agreed, but remember the REST API is just an HTTP interface to  
couchdb, and the norm with HTTP is to return an error if a url is not  
found.  Thus, when using the DELETE verb to a document url there  
should be an error, if the url points to a missing resource.  However,  
when using the bulk_docs uri, you are using the verb POST to send a  
payload of multiple transactions, which inherently falls outside of my  
understanding of REST, which is all about a single URI pointing to a  
single resource and using the HTTP verbs.  So we should think of  
consistency in the sense of how do we make the bulk_docs API look and  
smell like taking several calls to the regular REST API and  
implementing them as a single transaction.

In a previous post, I mentioned that I would eventually like to see  
bulk_docs being able to process get, post, put, and delete verbs in a  
single payload.  With that in mind, I also think bulk_docs should not  
be aggressive with HTTP error codes, and should instead include an  
HTTP status code with each row of the response.  It would then be up  
to the app to parse the response and handle the status codes, and  
would give couchdb a true multi/bulk-resource wrapper around the REST  
API.

>
>
>> I can see some value in making this a configurable behavior if others
>> don't want an error under this circumstance.  Of course, I probably
>> won't send a patch for that since it's not my itch ;-)
>
> -1

Your absolutely right.

>
>
> Configuration parameter proliferation is a Bad Thing.
>
> -- 
> Noah Slater, http://people.apache.org/~nslater/


Re: deleting a deleted document

Posted by Noah Slater <ns...@apache.org>.
On Fri, Jul 11, 2008 at 09:20:41AM -0600, Michael Hendricks wrote:
> Most database interface modules that I've worked with generate exceptions in
> such cases (even if the underlying database doesn't) as a way to help the
> programmer find his mistakes.

That is certainly not my experience.

> My other concern is that the CouchDB REST API doesn't accept double
> deletions.  It generates an error.  If CouchDB is going to give an error
> in one place, I think it should be consistent and do it elsewhere too.

Consistency is good.

> I can see some value in making this a configurable behavior if others
> don't want an error under this circumstance.  Of course, I probably
> won't send a patch for that since it's not my itch ;-)

-1

Configuration parameter proliferation is a Bad Thing.

-- 
Noah Slater, http://people.apache.org/~nslater/

Re: deleting a deleted document

Posted by Michael Hendricks <mi...@ndrix.org>.
On Thu, Jul 10, 2008 at 04:56:27PM -0700, Troy Kruthoff wrote:
> I would think it would be expected behavior, as the end result is the same 
> regardless if the document existed.  Maybe some edge cases would need to 
> know if the document was already deleted. I could be skewed because all of 
> our current apps use a RDMS as a backend where deleting a non-existent 
> record is no-harm-no-foul.

In my experience, trying to delete something that has already been
deleted is a sign of a logic error in my application.  Most database
interface modules that I've worked with generate exceptions in such
cases (even if the underlying database doesn't) as a way to help the
programmer find his mistakes. If I knew that something was deleted, I
probably wouldn't be trying to delete it again.

My other concern is that the CouchDB REST API doesn't accept double
deletions.  It generates an error.  If CouchDB is going to give an error
in one place, I think it should be consistent and do it elsewhere too.

I can see some value in making this a configurable behavior if others
don't want an error under this circumstance.  Of course, I probably
won't send a patch for that since it's not my itch ;-)

-- 
Michael

Re: deleting a deleted document

Posted by Troy Kruthoff <tk...@blit.com>.
I would think it would be expected behavior, as the end result is the  
same regardless if the document existed.  Maybe some edge cases would  
need to know if the document was already deleted. I could be skewed  
because all of our current apps use a RDMS as a backend where deleting  
a non-existent record is no-harm-no-foul.

Another thought would be to still return success but to annotate in  
the response that the document was already deleted, or to be able to  
set an error-level as a param to bulk_docs, such as the regular  
delete:true + delete_or_error:true --something along those lines.

-- troy



On Jul 10, 2008, at 4:35 PM, Michael Hendricks wrote:

> On Wed, Jul 09, 2008 at 09:34:53PM -0600, Michael Hendricks wrote:
>> When using the bulk delete feature, I expected that deleting a  
>> document
>> which has already been deleted would respond with a 404 code.   
>> However,
>> it responds with a 201 indicating that there is no problem.  Is this
>> expected behavior?  Here are the relevant HTTP requests and  
>> responses.
>> I'm running the current trunk version of CouchDB.
>
> Thanks for not answering, it prompted me to send a patch instead :-)
> I sent it to the dev list.  It's only my second attempt at writing
> Erlang, but It Works For Me™
>
> -- 
> Michael


Re: deleting a deleted document

Posted by Michael Hendricks <mi...@ndrix.org>.
On Wed, Jul 09, 2008 at 09:34:53PM -0600, Michael Hendricks wrote:
> When using the bulk delete feature, I expected that deleting a document
> which has already been deleted would respond with a 404 code.  However,
> it responds with a 201 indicating that there is no problem.  Is this
> expected behavior?  Here are the relevant HTTP requests and responses.
> I'm running the current trunk version of CouchDB.

Thanks for not answering, it prompted me to send a patch instead :-)
I sent it to the dev list.  It's only my second attempt at writing
Erlang, but It Works For Me™

-- 
Michael