You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Paul Burba <pt...@gmail.com> on 2010/08/20 18:04:45 UTC

RFC: How should Subversion handle OS-deleted paths?

I think we can all agree that when a user deletes part of their WC via
the OS they have made a mistake of some sort.  But which mistake
exactly?  The obvious answer is that they really intended 'svn del
dirX/foo.c'.  But possibly they intended something more akin to 'svn
up --set-depth empty dirX'.  Or maybe it was just a true mistake, and
nothing was intended; they expect foo.c to be there!

I originally came to this question when dealing with merge tracking
and paths missing because they were deleted outside of Subversion, see
http://svn.haxx.se/dev/archive-2010-08/0156.shtml.

Somewhat tardily I realized the *first* question to be answered is not
how merge tracking should deal with such paths, but how should
Subversion in general deal with them.

I see a few basic approaches:

1) Consistent Approach -- Restore the Missing Paths (unless there is
really a compelling reason not too, but the default approach is to
restore)

WCNG's single DB allows us to simply restore these missing paths as we
encounter them (the notable exception being 'svn st' of course, which
just reports the missing state).

Exhibit A of this behavior: In 1.6.x and 1.7 in multiple DB mode, svn
revert skips missing directories.  In single-DB mode they are
restored.

2) Consistent Approach: Error out

If we detect missing paths, simply throw an error asking that the user
restore them before proceeding.  Obviously for svn st/up/revert, the
current behavior is fine, but for svn ci/sw/merge we could follow this
route.

3) Case-by-Case Approach:

Maybe there is *no* consistent approach, and sometimes we will want to
restore, but other times we'll do something else.  I'll leave what
"something else" is as an open question for the moment.


Any ideas as to the best approach?  I suspect that some of the wcng
folks have given some thought to this.

Paul

Re: RFC: How should Subversion handle OS-deleted paths?

Posted by Paul Burba <pt...@gmail.com>.
On Tue, Aug 31, 2010 at 8:27 PM, Paul Burba <pt...@gmail.com> wrote:
> On Thu, Aug 26, 2010 at 8:07 AM, Julian Foad <ju...@wandisco.com> wrote:
>> On Wed, 2010-08-25, Paul Burba wrote:
>>> On Tue, Aug 24, 2010 at 7:25 AM, Julian Foad <ju...@wandisco.com> wrote:
>> [...]
>>> Agreed, we simply can't be sure what the user's intention was...I'm
>>> beginning to be swayed to the idea that restoring the missing subtrees
>>> may not be the ideal approach...
>>
>> [...]
>>> > The pattern that's emerging from my thoughts is: throw an error if
>>> > logically we need to access the missing working version of the node.  If
>>> > we don't need to access it, just let it be.  Never "restore" it unless
>>> > the user specifically requests so and says what kind of restoration is
>>> > required.
>>> >
>>> > For these reasons I think "merge" should also throw an error if it needs
>>> > to merge changes into a missing node.  (If instead it needs to delete
>>> > it, then it has the options I mentioned for "svn delete".)
>>> >
>>> > But I suspect part of the reason why "restore" seems an attractive
>>> > option is because Subversion isn't very friendly when "merge" stops with
>>> > an error.  We don't provide any way to resume the same merge,
>>>
>>> Quite right about the unfriendliness.  Resuming the merge is really a
>>> hit-or-miss proposition, depending on how much of the merge was done
>>> successfully prior to encountering the missing subtree.  If it
>>> encountered early, before the application of any diffs, then things
>>> are ok, but after that things get ugly fast, particularly if the merge
>>> target originally had any local mods.
>>>
>>> It's worth noting again there there are *two* error-out approaches:
>>>
>>> i) Throw an error when a subtree the editor needs is found to be
>>> missing.  This is what you are talking about here.
>>>
>>> ii) Identify any missing subtrees at the *start* of the merge and
>>> throw an error before any editor drives are done.  Basically we
>>> disallow merges with missing subtrees due to OS-level deletes.
>>
>> You're right, I hadn't thought of erroring out before starting the
>> merge, and that is a much, much better option than erroring out in the
>> middle.
>>
>>> > and we
>>> > don't make it particularly easy to roll back the merge (although that's
>>> > getting better now that "revert" is, I hope, going to remove nodes that
>>> > it created by copying), and we don't have any way at all to roll back a
>>> > merge performed in a non-clean WC.  So we're trying to avoid erroring
>>> > out.
>>> >
>>> > Long term, those difficult problems are are the problems we should be
>>> > looking to solve.  Short term, I suppose it's useful to avoid erroring
>>> > out as much as we can.
>>>
>>> Equally bad is the case with trunk right now, where we simply ignore
>>> missing directories, even if the merge would affect them - see
>>> http://subversion.tigris.org/issues/show_bug.cgi?id=2915#desc4 for an
>>> example.
>>>
>>> > If that's the important issue, and we recognize
>>> > it as such, then I could support the idea of "merge" doing this
>>> > "restore" while other commands don't.
>>>
>>> Given what you've said here and thinking anew about merge and missing
>>> subtrees, I think the best approach might simply be what we currently
>>> do with files: Skip the missing subtree and record non-inheritable
>>> mergeinfo so the missing subtree is handled correctly (i.e. the
>>> mergeinfo reflects the fact that the merge never touched the missing
>>> subtree).
>>>
>>> This has a few things going for it:
>>>
>>> A) It's consistent with 1.5-1.6's treatment of missing files.
>>>
>>> B) As Daniel hinted at in
>>> http://svn.haxx.se/dev/archive-2010-08/0189.shtml, it's consistent
>>> with how merge tracking treats every other type of "missing" subtree
>>> (e.g. shallow WCs, switches, paths missing due to authz restrictions).
>>>
>>> C) It makes no assumptions about what the user intended, it just deals
>>> with the fact that the subtrees are missing.
>>>
>>> D) It allows the merge to complete, no errors mid-merge.
>>>
>>> E) It allows the missing subtrees to be restored via update (either
>>> before or after the merge is committed) and for the merge to be
>>> repeated.  The repeat merge will notice that the merge never touched
>>> the subtrees and will drive the editor such that only those subtrees
>>> have the merge repeated.
>>>
>>> Any thoughts to this approach?
>>
>> Another plus:
>>
>>  F) It means the merge algorithm has one less case that can choke it,
>> which is better than relying on a check to have been done beforehand.
>> It makes the subroutines easier to re-use (callable by GUIs etc.).  And
>> it could be extremely difficult to make sure that such an external check
>> exactly matches the merge code in all the corner cases.
>>
>> The only minus I can think of: In many ways we would serve our users
>> better by simply not allowing them to get into complex situations and
>> instead just disallowing such a merge.
>>
>> But the many advantages seem to outweigh that, and your suggestion
>> sounds close to perfect.
>>
>> If you can do that without much additional complexity, +1.
>
> I really thought that would be the case, as most of the logic is
> already in place to handle other types of missing subtrees.  But after
> hacking on this entirely too long and finding new bugs at every turn,
> I had some serious second thoughts, mainly along the lines of, "Is all
> this complexity worth it?"
>
> I think "no" and have come to see the wisdom of something you said earlier:
>
>  "You're right, I hadn't thought of erroring out before starting the
>   merge, and that is a much, much better option than erroring out in the
>   middle."
>
> This is relatively simple to do as the attached patch demonstrates.
> If there are no objections I will go in this direction instead.
>
> Note that there are a few test failures I still need to fix:
>
> FAIL:  merge_tests.py 16: merge into missing must not break working copy
> FAIL:  merge_tests.py 104: skipped files get correct mergeinfo set
> FAIL:  merge_tree_conflict_tests.py 21: tree conflicts: tree missing, leaf del
> FAIL:  merge_tree_conflict_tests.py 20: tree conflicts: tree missing, leaf edit
> FAIL:  merge_authz_tests.py 1: skipped paths get overriding mergeinfo
>
> These all fail because they expect the old behavior of OS-deleted
> paths to be skipped and OS-deleted directories to create
> tree-conflicts.  With the patch in place, if we try to merge to a
> target which is missing subtrees due to OS-level deletions, then we
> get an error like this:
>
> ### Oops, move rather than svn move:
>
> C:\SVN\src-branch-1.6.x-WCNG>move
> subversion\tests\cmdline\merge_authz_tests.py
> subversion\tests\cmdline\merge_auth_tests.py
>        1 file(s) moved.
>
> ### Let's merge:
>
> C:\SVN\src-branch-1.6.x-WCNG>svn merge ^^/subversion/trunk -c879766
> ..\..\..\subversion\svn\util.c:902: (apr_err=195016)
> ..\..\..\subversion\libsvn_client\merge.c:10548: (apr_err=195016)
> ..\..\..\subversion\libsvn_client\merge.c:10504: (apr_err=195016)
> ..\..\..\subversion\libsvn_client\merge.c:10504: (apr_err=195016)
> ..\..\..\subversion\libsvn_client\merge.c:10476: (apr_err=195016)
> ..\..\..\subversion\libsvn_client\merge.c:8611: (apr_err=195016)
> ..\..\..\subversion\libsvn_client\merge.c:8096: (apr_err=195016)
> ..\..\..\subversion\libsvn_client\merge.c:5851: (apr_err=195016)
> svn: Merge tracking not allowed with missing subtrees; try restoring
> these items first:
> C:\SVN\src-branch-1.6.x-WCNG\subversion\tests\cmdline\merge_authz_tests.py
>
> ### Oh, I see where I screwed up:
>
> C:\SVN\src-branch-1.6.x-WCNG>svn st
> ?       subversion\tests\cmdline\merge_auth_tests.py
> !       subversion\tests\cmdline\merge_authz_tests.py
>
> Note: In the interest of sanity, if there are many OS-deleted subtrees
> in a merge target, the error lists only the *roots* of the missing
> subtrees.
>
> [[[
> Fix issue #2915 'Handle mergeinfo for subtrees missing due to removal by
> non-svn command'.
>
> With this change, if you attempt a merge-tracking aware merge to a WC
> which is missing subtrees due to an OS-level deletion, then an error is
> raised before any editor drives begin.  The error message describes the
> root of each missing path.
>
> * subversion/libsvn_client/merge.c
>
>  (get_mergeinfo_walk_baton): Add some new members for tracking sub-
>   directories' dirents.
>
>  (get_mergeinfo_walk_cb):
>
>  (record_missing_subtree_roots): New function.
>
>  (record_missing_subtree_roots): Use new function to flag missing subtree
>   roots.
>
>  (get_mergeinfo_paths): Raise a SVN_ERR_CLIENT_NOT_READY_TO_MERGE error if
>   any unexpectedly missing subtrees are found.
>
> * subversion/tests/cmdline/merge_tests.py
>
>  (merge_with_os_deleted_subtrees): New test,
>
>  (test_list): Add merge_with_os_deleted_subtrees.
>
> ]]]

Committed this change with a few minor code changes and the failing tests fixed:

http://svn.apache.org/viewvc?view=revision&revision=992042

Paul

Re: RFC: How should Subversion handle OS-deleted paths?

Posted by Paul Burba <pt...@gmail.com>.
On Thu, Aug 26, 2010 at 8:07 AM, Julian Foad <ju...@wandisco.com> wrote:
> On Wed, 2010-08-25, Paul Burba wrote:
>> On Tue, Aug 24, 2010 at 7:25 AM, Julian Foad <ju...@wandisco.com> wrote:
> [...]
>> Agreed, we simply can't be sure what the user's intention was...I'm
>> beginning to be swayed to the idea that restoring the missing subtrees
>> may not be the ideal approach...
>
> [...]
>> > The pattern that's emerging from my thoughts is: throw an error if
>> > logically we need to access the missing working version of the node.  If
>> > we don't need to access it, just let it be.  Never "restore" it unless
>> > the user specifically requests so and says what kind of restoration is
>> > required.
>> >
>> > For these reasons I think "merge" should also throw an error if it needs
>> > to merge changes into a missing node.  (If instead it needs to delete
>> > it, then it has the options I mentioned for "svn delete".)
>> >
>> > But I suspect part of the reason why "restore" seems an attractive
>> > option is because Subversion isn't very friendly when "merge" stops with
>> > an error.  We don't provide any way to resume the same merge,
>>
>> Quite right about the unfriendliness.  Resuming the merge is really a
>> hit-or-miss proposition, depending on how much of the merge was done
>> successfully prior to encountering the missing subtree.  If it
>> encountered early, before the application of any diffs, then things
>> are ok, but after that things get ugly fast, particularly if the merge
>> target originally had any local mods.
>>
>> It's worth noting again there there are *two* error-out approaches:
>>
>> i) Throw an error when a subtree the editor needs is found to be
>> missing.  This is what you are talking about here.
>>
>> ii) Identify any missing subtrees at the *start* of the merge and
>> throw an error before any editor drives are done.  Basically we
>> disallow merges with missing subtrees due to OS-level deletes.
>
> You're right, I hadn't thought of erroring out before starting the
> merge, and that is a much, much better option than erroring out in the
> middle.
>
>> > and we
>> > don't make it particularly easy to roll back the merge (although that's
>> > getting better now that "revert" is, I hope, going to remove nodes that
>> > it created by copying), and we don't have any way at all to roll back a
>> > merge performed in a non-clean WC.  So we're trying to avoid erroring
>> > out.
>> >
>> > Long term, those difficult problems are are the problems we should be
>> > looking to solve.  Short term, I suppose it's useful to avoid erroring
>> > out as much as we can.
>>
>> Equally bad is the case with trunk right now, where we simply ignore
>> missing directories, even if the merge would affect them - see
>> http://subversion.tigris.org/issues/show_bug.cgi?id=2915#desc4 for an
>> example.
>>
>> > If that's the important issue, and we recognize
>> > it as such, then I could support the idea of "merge" doing this
>> > "restore" while other commands don't.
>>
>> Given what you've said here and thinking anew about merge and missing
>> subtrees, I think the best approach might simply be what we currently
>> do with files: Skip the missing subtree and record non-inheritable
>> mergeinfo so the missing subtree is handled correctly (i.e. the
>> mergeinfo reflects the fact that the merge never touched the missing
>> subtree).
>>
>> This has a few things going for it:
>>
>> A) It's consistent with 1.5-1.6's treatment of missing files.
>>
>> B) As Daniel hinted at in
>> http://svn.haxx.se/dev/archive-2010-08/0189.shtml, it's consistent
>> with how merge tracking treats every other type of "missing" subtree
>> (e.g. shallow WCs, switches, paths missing due to authz restrictions).
>>
>> C) It makes no assumptions about what the user intended, it just deals
>> with the fact that the subtrees are missing.
>>
>> D) It allows the merge to complete, no errors mid-merge.
>>
>> E) It allows the missing subtrees to be restored via update (either
>> before or after the merge is committed) and for the merge to be
>> repeated.  The repeat merge will notice that the merge never touched
>> the subtrees and will drive the editor such that only those subtrees
>> have the merge repeated.
>>
>> Any thoughts to this approach?
>
> Another plus:
>
>  F) It means the merge algorithm has one less case that can choke it,
> which is better than relying on a check to have been done beforehand.
> It makes the subroutines easier to re-use (callable by GUIs etc.).  And
> it could be extremely difficult to make sure that such an external check
> exactly matches the merge code in all the corner cases.
>
> The only minus I can think of: In many ways we would serve our users
> better by simply not allowing them to get into complex situations and
> instead just disallowing such a merge.
>
> But the many advantages seem to outweigh that, and your suggestion
> sounds close to perfect.
>
> If you can do that without much additional complexity, +1.

I really thought that would be the case, as most of the logic is
already in place to handle other types of missing subtrees.  But after
hacking on this entirely too long and finding new bugs at every turn,
I had some serious second thoughts, mainly along the lines of, "Is all
this complexity worth it?"

I think "no" and have come to see the wisdom of something you said earlier:

  "You're right, I hadn't thought of erroring out before starting the
   merge, and that is a much, much better option than erroring out in the
   middle."

This is relatively simple to do as the attached patch demonstrates.
If there are no objections I will go in this direction instead.

Note that there are a few test failures I still need to fix:

FAIL:  merge_tests.py 16: merge into missing must not break working copy
FAIL:  merge_tests.py 104: skipped files get correct mergeinfo set
FAIL:  merge_tree_conflict_tests.py 21: tree conflicts: tree missing, leaf del
FAIL:  merge_tree_conflict_tests.py 20: tree conflicts: tree missing, leaf edit
FAIL:  merge_authz_tests.py 1: skipped paths get overriding mergeinfo

These all fail because they expect the old behavior of OS-deleted
paths to be skipped and OS-deleted directories to create
tree-conflicts.  With the patch in place, if we try to merge to a
target which is missing subtrees due to OS-level deletions, then we
get an error like this:

### Oops, move rather than svn move:

C:\SVN\src-branch-1.6.x-WCNG>move
subversion\tests\cmdline\merge_authz_tests.py
subversion\tests\cmdline\merge_auth_tests.py
        1 file(s) moved.

### Let's merge:

C:\SVN\src-branch-1.6.x-WCNG>svn merge ^^/subversion/trunk -c879766
..\..\..\subversion\svn\util.c:902: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:10548: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:10504: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:10504: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:10476: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:8611: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:8096: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:5851: (apr_err=195016)
svn: Merge tracking not allowed with missing subtrees; try restoring
these items first:
C:\SVN\src-branch-1.6.x-WCNG\subversion\tests\cmdline\merge_authz_tests.py

### Oh, I see where I screwed up:

C:\SVN\src-branch-1.6.x-WCNG>svn st
?       subversion\tests\cmdline\merge_auth_tests.py
!       subversion\tests\cmdline\merge_authz_tests.py

Note: In the interest of sanity, if there are many OS-deleted subtrees
in a merge target, the error lists only the *roots* of the missing
subtrees.

[[[
Fix issue #2915 'Handle mergeinfo for subtrees missing due to removal by
non-svn command'.

With this change, if you attempt a merge-tracking aware merge to a WC
which is missing subtrees due to an OS-level deletion, then an error is
raised before any editor drives begin.  The error message describes the
root of each missing path.

* subversion/libsvn_client/merge.c

  (get_mergeinfo_walk_baton): Add some new members for tracking sub-
   directories' dirents.

  (get_mergeinfo_walk_cb):

  (record_missing_subtree_roots): New function.

  (record_missing_subtree_roots): Use new function to flag missing subtree
   roots.

  (get_mergeinfo_paths): Raise a SVN_ERR_CLIENT_NOT_READY_TO_MERGE error if
   any unexpectedly missing subtrees are found.

* subversion/tests/cmdline/merge_tests.py

  (merge_with_os_deleted_subtrees): New test,

  (test_list): Add merge_with_os_deleted_subtrees.

]]]

Paul

Re: RFC: How should Subversion handle OS-deleted paths?

Posted by Paul Burba <pt...@gmail.com>.
On Thu, Aug 26, 2010 at 8:07 AM, Julian Foad <ju...@wandisco.com> wrote:
> On Wed, 2010-08-25, Paul Burba wrote:
>> On Tue, Aug 24, 2010 at 7:25 AM, Julian Foad <ju...@wandisco.com> wrote:
> [...]
>> Agreed, we simply can't be sure what the user's intention was...I'm
>> beginning to be swayed to the idea that restoring the missing subtrees
>> may not be the ideal approach...
>
> [...]
>> > The pattern that's emerging from my thoughts is: throw an error if
>> > logically we need to access the missing working version of the node.  If
>> > we don't need to access it, just let it be.  Never "restore" it unless
>> > the user specifically requests so and says what kind of restoration is
>> > required.
>> >
>> > For these reasons I think "merge" should also throw an error if it needs
>> > to merge changes into a missing node.  (If instead it needs to delete
>> > it, then it has the options I mentioned for "svn delete".)
>> >
>> > But I suspect part of the reason why "restore" seems an attractive
>> > option is because Subversion isn't very friendly when "merge" stops with
>> > an error.  We don't provide any way to resume the same merge,
>>
>> Quite right about the unfriendliness.  Resuming the merge is really a
>> hit-or-miss proposition, depending on how much of the merge was done
>> successfully prior to encountering the missing subtree.  If it
>> encountered early, before the application of any diffs, then things
>> are ok, but after that things get ugly fast, particularly if the merge
>> target originally had any local mods.
>>
>> It's worth noting again there there are *two* error-out approaches:
>>
>> i) Throw an error when a subtree the editor needs is found to be
>> missing.  This is what you are talking about here.
>>
>> ii) Identify any missing subtrees at the *start* of the merge and
>> throw an error before any editor drives are done.  Basically we
>> disallow merges with missing subtrees due to OS-level deletes.
>
> You're right, I hadn't thought of erroring out before starting the
> merge, and that is a much, much better option than erroring out in the
> middle.
>
>> > and we
>> > don't make it particularly easy to roll back the merge (although that's
>> > getting better now that "revert" is, I hope, going to remove nodes that
>> > it created by copying), and we don't have any way at all to roll back a
>> > merge performed in a non-clean WC.  So we're trying to avoid erroring
>> > out.
>> >
>> > Long term, those difficult problems are are the problems we should be
>> > looking to solve.  Short term, I suppose it's useful to avoid erroring
>> > out as much as we can.
>>
>> Equally bad is the case with trunk right now, where we simply ignore
>> missing directories, even if the merge would affect them - see
>> http://subversion.tigris.org/issues/show_bug.cgi?id=2915#desc4 for an
>> example.
>>
>> > If that's the important issue, and we recognize
>> > it as such, then I could support the idea of "merge" doing this
>> > "restore" while other commands don't.
>>
>> Given what you've said here and thinking anew about merge and missing
>> subtrees, I think the best approach might simply be what we currently
>> do with files: Skip the missing subtree and record non-inheritable
>> mergeinfo so the missing subtree is handled correctly (i.e. the
>> mergeinfo reflects the fact that the merge never touched the missing
>> subtree).
>>
>> This has a few things going for it:
>>
>> A) It's consistent with 1.5-1.6's treatment of missing files.
>>
>> B) As Daniel hinted at in
>> http://svn.haxx.se/dev/archive-2010-08/0189.shtml, it's consistent
>> with how merge tracking treats every other type of "missing" subtree
>> (e.g. shallow WCs, switches, paths missing due to authz restrictions).
>>
>> C) It makes no assumptions about what the user intended, it just deals
>> with the fact that the subtrees are missing.
>>
>> D) It allows the merge to complete, no errors mid-merge.
>>
>> E) It allows the missing subtrees to be restored via update (either
>> before or after the merge is committed) and for the merge to be
>> repeated.  The repeat merge will notice that the merge never touched
>> the subtrees and will drive the editor such that only those subtrees
>> have the merge repeated.
>>
>> Any thoughts to this approach?
>
> Another plus:
>
>  F) It means the merge algorithm has one less case that can choke it,
> which is better than relying on a check to have been done beforehand.
> It makes the subroutines easier to re-use (callable by GUIs etc.).  And
> it could be extremely difficult to make sure that such an external check
> exactly matches the merge code in all the corner cases.
>
> The only minus I can think of: In many ways we would serve our users
> better by simply not allowing them to get into complex situations and
> instead just disallowing such a merge.
>
> But the many advantages seem to outweigh that, and your suggestion
> sounds close to perfect.
>
> If you can do that without much additional complexity, +1.

I really thought that would be the case, as most of the logic is
already in place to handle other types of missing subtrees.  But after
hacking on this entirely too long and finding new bugs at every turn,
I had some serious second thoughts, mainly along the lines of, "Is all
this complexity worth it?"

I think "no" and have come to see the wisdom of something you said earlier:

  "You're right, I hadn't thought of erroring out before starting the
   merge, and that is a much, much better option than erroring out in the
   middle."

This is relatively simple to do as the attached patch demonstrates.
If there are no objections I will go in this direction instead.

Note that there are a few test failures I still need to fix:

FAIL:  merge_tests.py 16: merge into missing must not break working copy
FAIL:  merge_tests.py 104: skipped files get correct mergeinfo set
FAIL:  merge_tree_conflict_tests.py 21: tree conflicts: tree missing, leaf del
FAIL:  merge_tree_conflict_tests.py 20: tree conflicts: tree missing, leaf edit
FAIL:  merge_authz_tests.py 1: skipped paths get overriding mergeinfo

These all fail because they expect the old behavior of OS-deleted
paths to be skipped and OS-deleted directories to create
tree-conflicts.  With the patch in place, if we try to merge to a
target which is missing subtrees due to OS-level deletions, then we
get an error like this:

### Oops, move rather than svn move:

C:\SVN\src-branch-1.6.x-WCNG>move
subversion\tests\cmdline\merge_authz_tests.py
subversion\tests\cmdline\merge_auth_tests.py
        1 file(s) moved.

### Let's merge:

C:\SVN\src-branch-1.6.x-WCNG>svn merge ^^/subversion/trunk -c879766
..\..\..\subversion\svn\util.c:902: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:10548: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:10504: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:10504: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:10476: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:8611: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:8096: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:5851: (apr_err=195016)
svn: Merge tracking not allowed with missing subtrees; try restoring
these items first:
C:\SVN\src-branch-1.6.x-WCNG\subversion\tests\cmdline\merge_authz_tests.py

### Oh, I see where I screwed up:

C:\SVN\src-branch-1.6.x-WCNG>svn st
?       subversion\tests\cmdline\merge_auth_tests.py
!       subversion\tests\cmdline\merge_authz_tests.py

Note: In the interest of sanity, if there are many OS-deleted subtrees
in a merge target, the error lists only the *roots* of the missing
subtrees.

[[[
Fix issue #2915 'Handle mergeinfo for subtrees missing due to removal by
non-svn command'.

With this change, if you attempt a merge-tracking aware merge to a WC
which is missing subtrees due to an OS-level deletion, then an error is
raised before any editor drives begin.  The error message describes the
root of each missing path.

* subversion/libsvn_client/merge.c

  (get_mergeinfo_walk_baton): Add some new members for tracking sub-
   directories' dirents.

  (get_mergeinfo_walk_cb):

  (record_missing_subtree_roots): New function.

  (record_missing_subtree_roots): Use new function to flag missing subtree
   roots.

  (get_mergeinfo_paths): Raise a SVN_ERR_CLIENT_NOT_READY_TO_MERGE error if
   any unexpectedly missing subtrees are found.

* subversion/tests/cmdline/merge_tests.py

  (merge_with_os_deleted_subtrees): New test,

  (test_list): Add merge_with_os_deleted_subtrees.

]]]

Paul

Re: RFC: How should Subversion handle OS-deleted paths?

Posted by Julian Foad <ju...@wandisco.com>.
On Wed, 2010-08-25, Paul Burba wrote:
> On Tue, Aug 24, 2010 at 7:25 AM, Julian Foad <ju...@wandisco.com> wrote:
[...]
> Agreed, we simply can't be sure what the user's intention was...I'm
> beginning to be swayed to the idea that restoring the missing subtrees
> may not be the ideal approach...

[...]
> > The pattern that's emerging from my thoughts is: throw an error if
> > logically we need to access the missing working version of the node.  If
> > we don't need to access it, just let it be.  Never "restore" it unless
> > the user specifically requests so and says what kind of restoration is
> > required.
> >
> > For these reasons I think "merge" should also throw an error if it needs
> > to merge changes into a missing node.  (If instead it needs to delete
> > it, then it has the options I mentioned for "svn delete".)
> >
> > But I suspect part of the reason why "restore" seems an attractive
> > option is because Subversion isn't very friendly when "merge" stops with
> > an error.  We don't provide any way to resume the same merge,
> 
> Quite right about the unfriendliness.  Resuming the merge is really a
> hit-or-miss proposition, depending on how much of the merge was done
> successfully prior to encountering the missing subtree.  If it
> encountered early, before the application of any diffs, then things
> are ok, but after that things get ugly fast, particularly if the merge
> target originally had any local mods.
> 
> It's worth noting again there there are *two* error-out approaches:
> 
> i) Throw an error when a subtree the editor needs is found to be
> missing.  This is what you are talking about here.
> 
> ii) Identify any missing subtrees at the *start* of the merge and
> throw an error before any editor drives are done.  Basically we
> disallow merges with missing subtrees due to OS-level deletes.

You're right, I hadn't thought of erroring out before starting the
merge, and that is a much, much better option than erroring out in the
middle.

> > and we
> > don't make it particularly easy to roll back the merge (although that's
> > getting better now that "revert" is, I hope, going to remove nodes that
> > it created by copying), and we don't have any way at all to roll back a
> > merge performed in a non-clean WC.  So we're trying to avoid erroring
> > out.
> >
> > Long term, those difficult problems are are the problems we should be
> > looking to solve.  Short term, I suppose it's useful to avoid erroring
> > out as much as we can.
> 
> Equally bad is the case with trunk right now, where we simply ignore
> missing directories, even if the merge would affect them - see
> http://subversion.tigris.org/issues/show_bug.cgi?id=2915#desc4 for an
> example.
> 
> > If that's the important issue, and we recognize
> > it as such, then I could support the idea of "merge" doing this
> > "restore" while other commands don't.
> 
> Given what you've said here and thinking anew about merge and missing
> subtrees, I think the best approach might simply be what we currently
> do with files: Skip the missing subtree and record non-inheritable
> mergeinfo so the missing subtree is handled correctly (i.e. the
> mergeinfo reflects the fact that the merge never touched the missing
> subtree).
> 
> This has a few things going for it:
> 
> A) It's consistent with 1.5-1.6's treatment of missing files.
> 
> B) As Daniel hinted at in
> http://svn.haxx.se/dev/archive-2010-08/0189.shtml, it's consistent
> with how merge tracking treats every other type of "missing" subtree
> (e.g. shallow WCs, switches, paths missing due to authz restrictions).
> 
> C) It makes no assumptions about what the user intended, it just deals
> with the fact that the subtrees are missing.
> 
> D) It allows the merge to complete, no errors mid-merge.
> 
> E) It allows the missing subtrees to be restored via update (either
> before or after the merge is committed) and for the merge to be
> repeated.  The repeat merge will notice that the merge never touched
> the subtrees and will drive the editor such that only those subtrees
> have the merge repeated.
> 
> Any thoughts to this approach?

Another plus:

  F) It means the merge algorithm has one less case that can choke it,
which is better than relying on a check to have been done beforehand.
It makes the subroutines easier to re-use (callable by GUIs etc.).  And
it could be extremely difficult to make sure that such an external check
exactly matches the merge code in all the corner cases.

The only minus I can think of: In many ways we would serve our users
better by simply not allowing them to get into complex situations and
instead just disallowing such a merge.

But the many advantages seem to outweigh that, and your suggestion
sounds close to perfect.

If you can do that without much additional complexity, +1.

- Julian


Re: RFC: How should Subversion handle OS-deleted paths?

Posted by Paul Burba <pt...@gmail.com>.
On Tue, Aug 24, 2010 at 7:25 AM, Julian Foad <ju...@wandisco.com> wrote:

Hi Julian,

Thanks for taking some time to think about this.  Comments below.

> On Fri, 2010-08-20, Paul Burba wrote:
>> I think we can all agree that when a user deletes part of their WC via
>> the OS they have made a mistake of some sort.  But which mistake
>> exactly?  The obvious answer is that they really intended 'svn del
>> dirX/foo.c'.  But possibly they intended something more akin to 'svn
>> up --set-depth empty dirX'.  Or maybe it was just a true mistake, and
>> nothing was intended; they expect foo.c to be there!
>
> Or maybe the user *renamed* or *moved* the item to a different path,
> where it still exists complete with its local mods, and when the user
> realizes they should have used "svn mv" instead, they will want to issue
> the as-yet-unavailable "svn mv --retrospectively" command [1], or,
> failing that, they will want to move the item back into place and then
> issue the correct "svn mv".

Agreed, we simply can't be sure what the user's intention was...I'm
beginning to be swayed to the idea that restoring the missing subtrees
may not be the ideal approach...

> So there are several courses of action that the user might want.  Is
> there an obvious, simple and "safe" default action that Subversion could
> take in order to resolve each missing item it encounters?  I think there
> isn't.
>
>
>> I originally came to this question when dealing with merge tracking
>> and paths missing because they were deleted outside of Subversion, see
>> http://svn.haxx.se/dev/archive-2010-08/0156.shtml.
>>
>> Somewhat tardily I realized the *first* question to be answered is not
>> how merge tracking should deal with such paths, but how should
>> Subversion in general deal with them.
>>
>> I see a few basic approaches:
>>
>> 1) Consistent Approach -- Restore the Missing Paths (unless there is
>> really a compelling reason not too, but the default approach is to
>> restore)
>>
>> WCNG's single DB allows us to simply restore these missing paths as we
>> encounter them (the notable exception being 'svn st' of course, which
>> just reports the missing state).
>
> For a file, the working properties (stored in ACTUAL_NODE table) are
> still known to Subversion, whereas its working text is NOT known to
> Subversion.  So Subversion could "restore" the item by:
>
>  * reverting the file text and keeping the working properties, or
>
>  * reverting the file text and reverting the properties.
>
> For a directory, that applies to each versioned file inside it, as well
> as to the properties on the directories themselves.  (Any unversioned
> items inside it would of course be lost.)
>
> Is either of those ways obviously the "right" way to restore?  I would
> suggest neither is obviously the right one.
>
>
>> Exhibit A of this behavior: In 1.6.x and 1.7 in multiple DB mode, svn
>> revert skips missing directories.  In single-DB mode they are
>> restored.
>
> Here, they are "restored" in the second sense above - i.e. "reverted".
>
>
>> 2) Consistent Approach: Error out
>>
>> If we detect missing paths, simply throw an error asking that the user
>> restore them before proceeding.  Obviously for svn st/up/revert, the
>> current behavior is fine, but for svn ci/sw/merge we could follow this
>> route.
>
> I think this is a good approach for ci/sw/merge and in fact most
> commands.  It is the simplest approach and that counts for a lot.
>
>> 3) Case-by-Case Approach:
>>
>> Maybe there is *no* consistent approach, and sometimes we will want to
>> restore, but other times we'll do something else.  I'll leave what
>> "something else" is as an open question for the moment.
>
> I think consistency is important, but I'm not seeing a single behaviour
> that's suitable for all commands.  Let me see...
>
> "svn revert" should of course restore any missing thing back to its last
> versioned state, because that's it's job.  (In 1.6 and earlier, it was
> unable to restore missing directories, which was always known as a
> deficiency.)
>
> "svn status" should of course report the problem, because that's its
> job.
>
> "svn delete" could throw an error, or it could ignore the inconsistency
> and just schedule the thing for deletion anyway.  (I suppose we could
> consider the latter behaviour to be first restoring and then deleting,
> in one operation.)  Same for any other UI that's able to make the node
> go away such as "svn up --set-depth empty PARENT_PATH".
>
> "svn diff" and "svn blame" I think should throw an error if trying to
> access the missing text of the node.  (Same for "svn cat -rWORKING" if
> we ever support that.)  I don't see any good reason for read-only
> commands such as these to attempt to "fix" a problem.
>
> "svn copy" and "svn move" where the source is missing: throw an error.
> They could perhaps alternatively perform a copy or move in which the
> target working file/dir is still missing, but that's greater complexity
> and I see no benefit.
>
> "svn export": throw an error.  It could perhaps alternatively skip the
> missing node, on the basis that its job is to re-create the current
> working state of the WC, but I think we can define 'missing' as an
> invalid (unacceptable) state.
>
> "svn cleanup": throw an error if it needs to access the missing working
> version, else no effect.
>
> I would say "update"'s auto-restoring behaviour is a historical wart.  I
> think it was for compatibility with CVS.  I won't try to argue that we
> should change it now.
>
> The pattern that's emerging from my thoughts is: throw an error if
> logically we need to access the missing working version of the node.  If
> we don't need to access it, just let it be.  Never "restore" it unless
> the user specifically requests so and says what kind of restoration is
> required.
>
> For these reasons I think "merge" should also throw an error if it needs
> to merge changes into a missing node.  (If instead it needs to delete
> it, then it has the options I mentioned for "svn delete".)
>
> But I suspect part of the reason why "restore" seems an attractive
> option is because Subversion isn't very friendly when "merge" stops with
> an error.  We don't provide any way to resume the same merge,

Quite right about the unfriendliness.  Resuming the merge is really a
hit-or-miss proposition, depending on how much of the merge was done
successfully prior to encountering the missing subtree.  If it
encountered early, before the application of any diffs, then things
are ok, but after that things get ugly fast, particularly if the merge
target originally had any local mods.

It's worth noting again there there are *two* error-out approaches:

i) Throw an error when a subtree the editor needs is found to be
missing.  This is what you are talking about here.

ii) Identify any missing subtrees at the *start* of the merge and
throw an error before any editor drives are done.  Basically we
disallow merges with missing subtrees due to OS-level deletes.

> and we
> don't make it particularly easy to roll back the merge (although that's
> getting better now that "revert" is, I hope, going to remove nodes that
> it created by copying), and we don't have any way at all to roll back a
> merge performed in a non-clean WC.  So we're trying to avoid erroring
> out.
>
> Long term, those difficult problems are are the problems we should be
> looking to solve.  Short term, I suppose it's useful to avoid erroring
> out as much as we can.

Equally bad is the case with trunk right now, where we simply ignore
missing directories, even if the merge would affect them - see
http://subversion.tigris.org/issues/show_bug.cgi?id=2915#desc4 for an
example.

> If that's the important issue, and we recognize
> it as such, then I could support the idea of "merge" doing this
> "restore" while other commands don't.

Given what you've said here and thinking anew about merge and missing
subtrees, I think the best approach might simply be what we currently
do with files: Skip the missing subtree and record non-inheritable
mergeinfo so the missing subtree is handled correctly (i.e. the
mergeinfo reflects the fact that the merge never touched the missing
subtree).

This has a few things going for it:

A) It's consistent with 1.5-1.6's treatment of missing files.

B) As Daniel hinted at in
http://svn.haxx.se/dev/archive-2010-08/0189.shtml, it's consistent
with how merge tracking treats every other type of "missing" subtree
(e.g. shallow WCs, switches, paths missing due to authz restrictions).

C) It makes no assumptions about what the user intended, it just deals
with the fact that the subtrees are missing.

D) It allows the merge to complete, no errors mid-merge.

E) It allows the missing subtrees to be restored via update (either
before or after the merge is committed) and for the merge to be
repeated.  The repeat merge will notice that the merge never touched
the subtrees and will drive the editor such that only those subtrees
have the merge repeated.

Any thoughts to this approach?

Paul

Re: RFC: How should Subversion handle OS-deleted paths?

Posted by Julian Foad <ju...@wandisco.com>.
On Tue, 2010-08-24 at 20:50 +0200, Alan Barrett wrote:
> On Tue, 24 Aug 2010, Julian Foad wrote:
> > the as-yet-unavailable "svn mv --retrospectively" command [1],
> 
> Is this is just a special case of the as-yet-unavailable
> "--metadata-only" option, which would tell svn to adjust its metadata as
> if some operation were performed, but not to touch anything outside the
> ".svn" directory (or wherever metadata is kept)?

Yes, that's basically what I meant, just aimed at the "retrospective"
subset of the possible use cases.

(For anyone not familiar with the history: neither of these options is a
planned feature and neither of them yet has a well defined meaning; they
are just made-up option names that are convenient reminders for talking
about some ideas that have been suggested.)

- Julian


Re: RFC: How should Subversion handle OS-deleted paths?

Posted by Alan Barrett <ap...@cequrux.com>.
On Tue, 24 Aug 2010, Julian Foad wrote:
> the as-yet-unavailable "svn mv --retrospectively" command [1],

Is this is just a special case of the as-yet-unavailable
"--metadata-only" option, which would tell svn to adjust its metadata as
if some operation were performed, but not to touch anything outside the
".svn" directory (or wherever metadata is kept)?

--apb (Alan Barrett)

Re: RFC: How should Subversion handle OS-deleted paths?

Posted by Julian Foad <ju...@wandisco.com>.
On Fri, 2010-08-20, Paul Burba wrote: 
> I think we can all agree that when a user deletes part of their WC via
> the OS they have made a mistake of some sort.  But which mistake
> exactly?  The obvious answer is that they really intended 'svn del
> dirX/foo.c'.  But possibly they intended something more akin to 'svn
> up --set-depth empty dirX'.  Or maybe it was just a true mistake, and
> nothing was intended; they expect foo.c to be there!

Or maybe the user *renamed* or *moved* the item to a different path,
where it still exists complete with its local mods, and when the user
realizes they should have used "svn mv" instead, they will want to issue
the as-yet-unavailable "svn mv --retrospectively" command [1], or,
failing that, they will want to move the item back into place and then
issue the correct "svn mv".

So there are several courses of action that the user might want.  Is
there an obvious, simple and "safe" default action that Subversion could
take in order to resolve each missing item it encounters?  I think there
isn't.


> I originally came to this question when dealing with merge tracking
> and paths missing because they were deleted outside of Subversion, see
> http://svn.haxx.se/dev/archive-2010-08/0156.shtml.
> 
> Somewhat tardily I realized the *first* question to be answered is not
> how merge tracking should deal with such paths, but how should
> Subversion in general deal with them.
> 
> I see a few basic approaches:
> 
> 1) Consistent Approach -- Restore the Missing Paths (unless there is
> really a compelling reason not too, but the default approach is to
> restore)
> 
> WCNG's single DB allows us to simply restore these missing paths as we
> encounter them (the notable exception being 'svn st' of course, which
> just reports the missing state).

For a file, the working properties (stored in ACTUAL_NODE table) are
still known to Subversion, whereas its working text is NOT known to
Subversion.  So Subversion could "restore" the item by:

  * reverting the file text and keeping the working properties, or

  * reverting the file text and reverting the properties.

For a directory, that applies to each versioned file inside it, as well
as to the properties on the directories themselves.  (Any unversioned
items inside it would of course be lost.)

Is either of those ways obviously the "right" way to restore?  I would
suggest neither is obviously the right one.


> Exhibit A of this behavior: In 1.6.x and 1.7 in multiple DB mode, svn
> revert skips missing directories.  In single-DB mode they are
> restored.

Here, they are "restored" in the second sense above - i.e. "reverted".


> 2) Consistent Approach: Error out
> 
> If we detect missing paths, simply throw an error asking that the user
> restore them before proceeding.  Obviously for svn st/up/revert, the
> current behavior is fine, but for svn ci/sw/merge we could follow this
> route.

I think this is a good approach for ci/sw/merge and in fact most
commands.  It is the simplest approach and that counts for a lot.


> 3) Case-by-Case Approach:
> 
> Maybe there is *no* consistent approach, and sometimes we will want to
> restore, but other times we'll do something else.  I'll leave what
> "something else" is as an open question for the moment.

I think consistency is important, but I'm not seeing a single behaviour
that's suitable for all commands.  Let me see...

"svn revert" should of course restore any missing thing back to its last
versioned state, because that's it's job.  (In 1.6 and earlier, it was
unable to restore missing directories, which was always known as a
deficiency.)

"svn status" should of course report the problem, because that's its
job.

"svn delete" could throw an error, or it could ignore the inconsistency
and just schedule the thing for deletion anyway.  (I suppose we could
consider the latter behaviour to be first restoring and then deleting,
in one operation.)  Same for any other UI that's able to make the node
go away such as "svn up --set-depth empty PARENT_PATH".

"svn diff" and "svn blame" I think should throw an error if trying to
access the missing text of the node.  (Same for "svn cat -rWORKING" if
we ever support that.)  I don't see any good reason for read-only
commands such as these to attempt to "fix" a problem.

"svn copy" and "svn move" where the source is missing: throw an error.
They could perhaps alternatively perform a copy or move in which the
target working file/dir is still missing, but that's greater complexity
and I see no benefit.

"svn export": throw an error.  It could perhaps alternatively skip the
missing node, on the basis that its job is to re-create the current
working state of the WC, but I think we can define 'missing' as an
invalid (unacceptable) state.

"svn cleanup": throw an error if it needs to access the missing working
version, else no effect.

I would say "update"'s auto-restoring behaviour is a historical wart.  I
think it was for compatibility with CVS.  I won't try to argue that we
should change it now.

The pattern that's emerging from my thoughts is: throw an error if
logically we need to access the missing working version of the node.  If
we don't need to access it, just let it be.  Never "restore" it unless
the user specifically requests so and says what kind of restoration is
required.

For these reasons I think "merge" should also throw an error if it needs
to merge changes into a missing node.  (If instead it needs to delete
it, then it has the options I mentioned for "svn delete".)

But I suspect part of the reason why "restore" seems an attractive
option is because Subversion isn't very friendly when "merge" stops with
an error.  We don't provide any way to resume the same merge, and we
don't make it particularly easy to roll back the merge (although that's
getting better now that "revert" is, I hope, going to remove nodes that
it created by copying), and we don't have any way at all to roll back a
merge performed in a non-clean WC.  So we're trying to avoid erroring
out.

Long term, those difficult problems are are the problems we should be
looking to solve.  Short term, I suppose it's useful to avoid erroring
out as much as we can.  If that's the important issue, and we recognize
it as such, then I could support the idea of "merge" doing this
"restore" while other commands don't.


> Any ideas as to the best approach?  I suspect that some of the wcng
> folks have given some thought to this.

(It's a UI question rather than a WC layer question, so perhaps not.)

- Julian


[1] A "--retrospectively" flag for operations such as "move" and "copy"
is an enhancement that's been requested and that some other VCS's have a
way to do.