You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Brian Denny <br...@briandenny.net> on 2003/04/20 20:59:45 UTC

Issue #1075 analysis

I'm working on Issue #1075 (update receiving delete for missing
directory), and I'd like it if somebody who understands the update code
better than I could sanity-check my analysis.

Here's the repro recipe from IZ:
  svnadmin create repo
  svn mkdir -mx file://`pwd`/repo/foo
  svn rm -mx file://`pwd`/repo/foo
  svn co -r1 file://`pwd`/repo wc
  rm -rf wc/foo
  svn up wc
  ../svn/subversion/libsvn_wc/lock.c:422: (apr_err=155005)
  svn: Working copy not locked
  svn: directory not locked (wc/foo)


Studying the code, this is what i think is going on:

In svn_wc_crawl_revisions, missing directories are reported as such by
calling reporter->delete_path -- essentially removing them from the
"source tree" that the update editor will look at to generate deltas.   
This works great if the update wants to *restore* the missing directory
-- the editor will see that it's there in the target and not in the
source, and generate a delta to restore it.  But if the update wants to
*delete* the missing directory (as in the case of this bug), no delta is
generated because there is no difference between the source and target
trees.

By the time we get around to editor->close_edit, the entry for "foo"
should have already been removed from the parent dir's 'entries' file,
but since no delta was generated to delete "foo", the entry is still
there.  The "Working copy not locked" error happens in
recursively_tweak_entries, when we try to svn_wc_adm_retrieve for the
missing directory.  

So, we have to get rid of the parent dir's entry for "foo".
Off the top of my head, i have two ideas:

(1) I *think* that the editor could "do the right thing" by 
    always generating a delta when the source directory is missing -- 
    i.e., if the directory is present in the target, do an 
    add (restore), and if it's absent in the target, do a delete.  But 
    that would also require some way of signaling a missing path to the 
    reporter as distinct from a deleted path (not to mention that i have 
    no idea how the missing item should be represented in the transaction 
    "source tree").  
(2) Do everything as we are doing it, but in recursively_tweak_entries,
    check for a missing directory, and just remove the parent dir's
    entry for it (i.e., assume that any directory which is still missing 
    at the end of an update.

Idea #2 seems a *lot* simpler -- but is it too hacky?
I'm open to other ideas.  I just barely understand the update code now,
so there may be some clean, simple solution that i'm missing.

-brian

p.s., Here's a one-off patch for idea #2:

Index: subversion/libsvn_wc/adm_ops.c
===================================================================
--- subversion/libsvn_wc/adm_ops.c      (revision 5680)
+++ subversion/libsvn_wc/adm_ops.c      (working copy)
@@ -105,12 +105,28 @@
       else if (current_entry->kind == svn_node_dir)
         {
           svn_wc_adm_access_t *child_access;
-          const char *child_path
-            = svn_path_join (svn_wc_adm_access_path (dirpath), name, subpool);
-          SVN_ERR (svn_wc_adm_retrieve (&child_access, dirpath, child_path,
-                                        subpool));
-          SVN_ERR (recursively_tweak_entries
-                   (child_access, child_url, new_rev, subpool));
+          const char *child_path, *path;
+          apr_hash_t *dirents;
+          svn_boolean_t missing;
+
+          path = svn_wc_adm_access_path (dirpath);
+          svn_io_get_dirents(&dirents, path, subpool);
+          missing = (apr_hash_get (dirents, name, APR_HASH_KEY_STRING)
+                     == NULL);
+
+          if (missing)
+            {
+              svn_wc__entry_remove (entries, name);
+            }
+          else
+            {
+              child_path = svn_path_join (svn_wc_adm_access_path (dirpath),
+                                          name, subpool);
+              SVN_ERR (svn_wc_adm_retrieve (&child_access, dirpath, child_path,
+                                            subpool));
+              SVN_ERR (recursively_tweak_entries
+                       (child_access, child_url, new_rev, subpool));
+            }
         }
     }



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Issue #1075 analysis

Posted by Brian Denny <br...@briandenny.net>.
On Sun, Apr 20, 2003 at 01:59:45PM -0700, Brian Denny wrote:
> 
> (2) Do everything as we are doing it, but in recursively_tweak_entries,
>     check for a missing directory, and just remove the parent dir's
>     entry for it (i.e., assume that any directory which is still missing 
>     at the end of an update.
> 
[snip]
> p.s., Here's a one-off patch for idea #2:
[snip]


Heh.  This idea (or at least this patch) doesn't fix the *other*
reproduction recipe in IZ (which i didn't quote in my e-mail).  I'll
have to investigate further.  (Comments on what I wrote would still be
useful.)

-brian


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Issue #1075 analysis

Posted by Brian Denny <br...@briandenny.net>.
[replying to self, again]  :)

On Mon, Apr 21, 2003 at 10:01:18AM -0700, Brian Denny wrote:
> [sussman:]  
> > Perhaps schedule-delete directories shouldn't be described as missing
> > at all -- only "deleted" ones.  That would make the server send a
> 
> i'm confused again.  what do you mean by "schedule-delete" and
> "deleted", respectively?

now i get it.  :)  "schedule-delete" means you did 'svn rm DIR' on it;
'deleted' means you did 'svn up -rNUM DIR', where DIR is gone in rev
NUM, so that the dir is gone, but since you haven't updated the parent
dir yet, it still has an entry for DIR, marked as 'deleted'.

y'know, i think schedule-delete directories *already* aren't described 
as missing.  hmm.

-brian, 
who seems to be good at answering his own questions today,
several hours after posting them to the list...


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Issue #1075 analysis

Posted by Brian Denny <br...@briandenny.net>.
sussman@collab.net wrote:
> 
> I hope that's not what he's saying.  Philip seems to be describing
> what you already described in your first mail:  when the client sends
> the update report, it describes the schedule-delete directory as
> "missing", so it's removed from the repos txn.  And thus the server
> never sends a tree delta at all, which leads to problems.

i'm confused by the terminology here.  the dir in question has been 
'rm -rf'ed, but has not (in the working copy) been scheduled for
deletion.  so when you say 'schedule-delete', do you mean that it is 
due to be deleted by the update, or were you confused about the use case?
 
> Perhaps schedule-delete directories shouldn't be described as missing
> at all -- only "deleted" ones.  That would make the server send a

i'm confused again.  what do you mean by "schedule-delete" and
"deleted", respectively?

> deletion command, which the client would then carry out, even in the
> presence of a schedule-delete: it could be considered a case of
> "merging" tree changes from the server which don't conflict.

you probably understand this better than i do, but just to make sure
we're on the same page:

in the present use case, directory "foo" is missing in the working copy.

whatever we do, it has to do the right thing in the case where "foo" is 
(1) present in the target revision
(2) absent in the target revision

but AFAICT, at the level at which we're calling reporter->delete_path, 
we don't know which of those cases we're dealing with (we only know the 
state of the working copy, not the state of the target revision in the 
repos.)

what we need is for the server to send a deletion command in case (2), 
which would seem to require that we *not* call reporter->delete_path; 
however, case (1) would seem to require that we *do* call 
reporter->delete_path in case (c).


assuming that we're on the same page so far, i'm probably just slow
catching on -- could you spell out your idea a bit more for me?

thanks,
brian


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Issue #1075 analysis

Posted by Brian Denny <br...@briandenny.net>.
I wrote:
 
> (1) I *think* that the editor could "do the right thing" by 
>     always generating a delta when the source directory is missing -- 
>     i.e., if the directory is present in the target, do an 
>     add (restore), and if it's absent in the target, do a delete.  But 
>     that would also require some way of signaling a missing path to the 
>     reporter as distinct from a deleted path (not to mention that i have 
>     no idea how the missing item should be represented in the transaction 
>     "source tree").  

and later, Ben Collins-Sussman wrote:
> 
> Perhaps schedule-delete directories shouldn't be described as missing
> at all -- only "deleted" ones.  That would make the server send a
> deletion command, which the client would then carry out, even in the
> presence of a schedule-delete: it could be considered a case of
> "merging" tree changes from the server which don't conflict.

[light bulb goes on in brian's head]

oh, your suggestion is in the context of the idea i posted above, yeah?

so the server will always give out the delta if the source directory is
missing, and your suggestion is that the way we tell it's missing is that 
if it weren't missing we wouldn't have called reporter->delete_path (?)

interesting.  

if i'm understanding you correctly:

         Presence/Absence of dir in txn trees
State    Source tree         Target tree         Action      
1.       absent/missing      present             add
2.       missing             absent              (current: nothing; 
                                                  proposed: delete)
3.       present             absent              delete
4.       present             present             [not relevant here]

Cases which result in state 1:
- wc dir missing; exists in target rev
- wc dir DNE; exists in target rev

Cases which result in state 2:
- wc dir missing; deleted in target rev
- [sussman suggests that this state should *not* result from a wc dir that
   is scheduled for delete] 
- [note that for this to work, the update editor must still be able to 
   differentiate a source dir that is absent because we called
   reporter->delete_path (or something else) from one that is simply
   nonexistent; at least, there must be *something* for the editor to
   iterate over.  i haven't looked at the FS/txn code enough to know 
   whether there's a mechanism in place to make this distinction for us.]

Cases which result in state 3:
- wc dir exists; deleted in target rev
- wc dir scheduled for deletion (per sussman's suggestion); deleted in
  target rev.


... is this what you had in mind?  further comments?

-brian


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Issue #1075 analysis

Posted by Ben Collins-Sussman <su...@collab.net>.
Brian Denny <br...@briandenny.net> writes:

> On Sun, Apr 20, 2003 at 10:44:02PM +0100, Philip Martin wrote:
> > Brian Denny <br...@briandenny.net> writes:
> > 
> > > (1) I *think* that the editor could "do the right thing" by 
> > >     always generating a delta when the source directory is missing -- 
> > >     i.e., if the directory is present in the target, do an 
> > >     add (restore), and if it's absent in the target, do a delete.  But 
> > >     that would also require some way of signaling a missing path to the 
> > >     reporter as distinct from a deleted path (not to mention that i have 
> > >     no idea how the missing item should be represented in the transaction 
> > >     "source tree").  
> > 
> > 'svn update' will replace missing directories, even if nothing has
> > changed in the repository.  Can that mechanism be used to cause the
> > delete to get sent?
> 
> i don't understand what you have in mind -- are you saying that we
> should restore the missing directory first, so that when we do the real
> update, we get the correct delta?

I hope that's not what he's saying.  Philip seems to be describing
what you already described in your first mail:  when the client sends
the update report, it describes the schedule-delete directory as
"missing", so it's removed from the repos txn.  And thus the server
never sends a tree delta at all, which leads to problems.

Perhaps schedule-delete directories shouldn't be described as missing
at all -- only "deleted" ones.  That would make the server send a
deletion command, which the client would then carry out, even in the
presence of a schedule-delete: it could be considered a case of
"merging" tree changes from the server which don't conflict.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Issue #1075 analysis

Posted by Philip Martin <ph...@codematters.co.uk>.
Brian Denny <br...@briandenny.net> writes:

> On Sun, Apr 20, 2003 at 10:44:02PM +0100, Philip Martin wrote:
> > Brian Denny <br...@briandenny.net> writes:
> > 
> > > (1) I *think* that the editor could "do the right thing" by 
> > >     always generating a delta when the source directory is missing -- 
> > >     i.e., if the directory is present in the target, do an 
> > >     add (restore), and if it's absent in the target, do a delete.  But 
> > >     that would also require some way of signaling a missing path to the 
> > >     reporter as distinct from a deleted path (not to mention that i have 
> > >     no idea how the missing item should be represented in the transaction 
> > >     "source tree").  
> > 
> > 'svn update' will replace missing directories, even if nothing has
> > changed in the repository.  Can that mechanism be used to cause the
> > delete to get sent?
> 
> i don't understand what you have in mind -- are you saying that we
> should restore the missing directory first, so that when we do the real
> update, we get the correct delta?

If I 'rm -rf' a directory that is part of a working copy directory,
and then do 'svn up', the missing directory gets replaced.  It would
appear that the client is describing the missing directory to the
repository, and the repository is sending an update to reinstate it.
That appears to be something like what you have dscribed in (1) above,
and so I would have thought it was possible to get the repository to
send a delete, in the same way it sends an update.  (Note, I haven't
examined the code, so I'm not claiming this will definitely work, just
that I think it might work.)

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Issue #1075 analysis

Posted by Brian Denny <br...@briandenny.net>.
On Sun, Apr 20, 2003 at 10:44:02PM +0100, Philip Martin wrote:
> Brian Denny <br...@briandenny.net> writes:
> 
> > (1) I *think* that the editor could "do the right thing" by 
> >     always generating a delta when the source directory is missing -- 
> >     i.e., if the directory is present in the target, do an 
> >     add (restore), and if it's absent in the target, do a delete.  But 
> >     that would also require some way of signaling a missing path to the 
> >     reporter as distinct from a deleted path (not to mention that i have 
> >     no idea how the missing item should be represented in the transaction 
> >     "source tree").  
> 
> 'svn update' will replace missing directories, even if nothing has
> changed in the repository.  Can that mechanism be used to cause the
> delete to get sent?

i don't understand what you have in mind -- are you saying that we
should restore the missing directory first, so that when we do the real
update, we get the correct delta?


-brian
 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Issue #1075 analysis

Posted by Philip Martin <ph...@codematters.co.uk>.
Brian Denny <br...@briandenny.net> writes:

> (1) I *think* that the editor could "do the right thing" by 
>     always generating a delta when the source directory is missing -- 
>     i.e., if the directory is present in the target, do an 
>     add (restore), and if it's absent in the target, do a delete.  But 
>     that would also require some way of signaling a missing path to the 
>     reporter as distinct from a deleted path (not to mention that i have 
>     no idea how the missing item should be represented in the transaction 
>     "source tree").  

'svn update' will replace missing directories, even if nothing has
changed in the repository.  Can that mechanism be used to cause the
delete to get sent?

> (2) Do everything as we are doing it, but in recursively_tweak_entries,
>     check for a missing directory, and just remove the parent dir's
>     entry for it (i.e., assume that any directory which is still missing 
>     at the end of an update.
> 
> Idea #2 seems a *lot* simpler -- but is it too hacky?

It looks like a bit of a timebomb--removing an entry that has not been
deleted would be a serious mistake.

> I'm open to other ideas.  I just barely understand the update code now,
> so there may be some clean, simple solution that i'm missing.
> 
> -brian
> 
> p.s., Here's a one-off patch for idea #2:
> 
> Index: subversion/libsvn_wc/adm_ops.c
> ===================================================================
> --- subversion/libsvn_wc/adm_ops.c      (revision 5680)
> +++ subversion/libsvn_wc/adm_ops.c      (working copy)
> @@ -105,12 +105,28 @@
>        else if (current_entry->kind == svn_node_dir)
>          {
>            svn_wc_adm_access_t *child_access;
> -          const char *child_path
> -            = svn_path_join (svn_wc_adm_access_path (dirpath), name, subpool);
> -          SVN_ERR (svn_wc_adm_retrieve (&child_access, dirpath, child_path,
> -                                        subpool));
> -          SVN_ERR (recursively_tweak_entries
> -                   (child_access, child_url, new_rev, subpool));
> +          const char *child_path, *path;
> +          apr_hash_t *dirents;
> +          svn_boolean_t missing;
> +
> +          path = svn_wc_adm_access_path (dirpath);
> +          svn_io_get_dirents(&dirents, path, subpool);
> +          missing = (apr_hash_get (dirents, name, APR_HASH_KEY_STRING)
> +                     == NULL);
> +

If this is the best approach then don't implement it like that, use
svn_wc__adm_missing instead.

> +          if (missing)
> +            {
> +              svn_wc__entry_remove (entries, name);

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org