Posted to dev@subversion.apache.org by Stefan Sperling <st...@elego.de> on 2008/03/18 15:43:00 UTC

Re: Tree conflict detection in 'svn merge'

On Fri, Feb 29, 2008 at 12:20:47AM +0000, Julian Foad wrote:
> It sounds like we're trying to define what merging means (with respect to 
> what target corresponds to what source), and I don't know whether we're 
> defining it correctly.
> 
> In fact, this algorithm is a fundamental part of merging.
> 
> I'm afraid my knowledge of merging is not yet up to the task of analysing this.
> 
> I'll see what I can do, but to begin with that mainly means getting help 
> from someone who knows about merging.

Julian,

I'm currently reading through the archives, trying to get an idea about
the current state of merging and tree-conflicts. I have some comments
and questions for you and everyone else interested:

Have you found someone informed about the details of merging
who is willing to help us analyze the issue yet?
Would anyone on the list reading this volunteer to do so?

It seems that part of Stephen Butler's rather elaborate plan for
use cases 4 and 5 in notes/tree-conflicts/detection.txt has been
made obsolete by Nico's comments, which state that at least use
case 4 could be detected in a much simpler way. Can you confirm this?

If so, I think we should update detection.txt to reflect the current
status. I could try to modify the document accordingly and post a diff
for review, also because this would help me get back into the loop
after having been busy with things other than tree-conflicts
for a while. Should I do that?

Have you yourself made any advances in analysing the issue?
If so, would you mind sharing your thoughts on the list?

Having read some of the comments made by you and Nico, especially
about relating source items to target items during merging, and the
new use cases proposed by Nico, I keep thinking "Oh man, if only we had
true renames, all this would not be such a big issue...".
Do you feel the same? Do you think it's feasible trying to carry on with
implementing tree-conflicts detection without true renames?
I don't want to sound pessimistic here, far from it. I'd just like to
know whether others had or have similar thoughts on this.
Obviously, a true-rename implementation would open a whole different can
of worms and is way outside the scope of the tree-conflicts branch.
I'd still like to get at least a rough idea of how much true
renames would reduce the complexity of the problems the
tree-conflicts branch has to deal with. I've already written about what
true renames would mean for our update and switch use cases in detection.txt.

I'll be able to work again full-time on tree-conflicts for the next
four weeks at least. I hope we can get past the design phase during
this time and hopefully get some code written. I'm motivated to do so,
anyway. Oh, and don't worry, I'm not saying that I'd push for writing
code before the design is solid. I'd just very much like us to get there :)

Thanks,
-- 
Stefan Sperling <st...@elego.de>                 Software Developer
elego Software Solutions GmbH                            HRB 77719
Gustav-Meyer-Allee 25, Gebaeude 12        Tel:  +49 30 23 45 86 96 
13355 Berlin                              Fax:  +49 30 23 45 86 95
http://www.elego.de                 Geschaeftsfuehrer: Olaf Wagner

Re: Tree conflict detection in 'svn merge'

Posted by Julian Foad <ju...@btopenworld.com>.

Stefan Sperling wrote:
[...]
> Steve and I have managed to implement detection as described in detection.txt
> for use cases 4, 5, and 6 over Easter, if you haven't noticed yet :)

Yay! I'll start running actual scenarios on it instead of just talking 
theoretically about what should happen.

[...]
> I think we should start filing issues in the svn.tigris.org issue
> tracker for the branch now, both to document the progress more
> transparently and to keep track of what still needs doing on
> the branch.
[...]

OK.

- Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Tree conflict detection in 'svn merge'

Posted by Stefan Sperling <st...@elego.de>.

On Tue, Mar 25, 2008 at 05:38:05PM +0000, Julian Foad wrote:
>> What do you mean by "the target isn't similar to the merge-left source"?
>> If you mean a text-comparison, how do you define "similar"?
> 
> When merging textual changes into a file, we already have a merge algorithm 
> that succeeds if the target file is "similar enough" to the merge-left 
> source that the merge algorithm can find matching hunks of text in it. (I'm 
> not sure of the details. I think it has to find the hunks in the correct 
> order, and it may require some "context" or "overlap" at the edge of each 
> hunk being changed. The details are not important right now.)

OK, I understand that now, thanks. I wasn't aware that there is
already functionality like that. I assumed you were talking about
something that we'd need to implement from scratch.

> When we apply a modify-plus-delete onto an existing target file, my main 
> point was that we should not try to see if the textual modification has 
> already been done without our knowing it, and ignore that part if so; 
> instead, we should assume that the whole requested change is supposed to 
> apply cleanly, and if it doesn't then we should signal a conflict. Now, I 
> don't know whether we would want to compare the text in this case to 
> determine whether we should allow the deletion; that isn't the issue I was 
> debating. I'm saying that if we do decide to compare the text, it should be 
> strictly the merge-left text that we examine, and not some intermediate 
> text that may have existed after merge-left and before the deletion.

Steve and I have managed to implement detection as described in detection.txt
for use cases 4, 5, and 6 over Easter, if you haven't noticed yet :)

Check the latest commits to the branch. The implementation is pretty
much to the letter for all use cases, i.e. the basic stuff is done.
I mean to say there is no special-casing [yet].

Steve still had patches from back when he started working on the merge
use cases that turned out to be just about what we wanted. So we got
this done pretty quickly.

Use case 5 detection currently compares text, via svn_diff_file_diff_2 and
svn_diff_contains_diffs. If that needs to be tweaked, I won't object in
any way.
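
As a rough illustration of that kind of check, here is a minimal Python
sketch. It is not the branch's code: it uses Python's difflib as a stand-in
for svn_diff_file_diff_2 and svn_diff_contains_diffs, and the function names
and the choice of comparing the "last survivor" text with the target text
are assumptions made for the example.

```python
import difflib


def contains_diffs(original_lines, modified_lines):
    """Return True if the two line sequences differ anywhere.

    Stand-in for the diff-then-ask pattern of svn_diff_file_diff_2()
    followed by svn_diff_contains_diffs(); this is difflib, not
    Subversion's diff library.
    """
    matcher = difflib.SequenceMatcher(
        None, original_lines, modified_lines, autojunk=False)
    # Any opcode other than "equal" means some hunk differs.
    return any(tag != "equal" for tag, *_ in matcher.get_opcodes())


def uc5_text_differs(last_survivor_text, target_text):
    """Hypothetical helper: does the target differ from the source text?"""
    return contains_diffs(last_survivor_text.splitlines(),
                          target_text.splitlines())
```

If the two texts differ, the target would be a candidate tree conflict
victim; if they are identical, the deletion could go ahead.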

I think we should start filing issues in the svn.tigris.org issue
tracker for the branch now, both to document the progress more
transparently and to keep track of what still needs doing on
the branch.

Karl suggested putting the branch URL into the URL field of the
issues concerning the tree-conflicts branch. This way we can key
the issues by that field (URL contains "tree-conflicts") to tell
them apart from all the others.

I plan to start filing some issues tomorrow, mainly for the failing tests,
and also for some UI issues I'd like to address.

If you have anything you want to file an issue for (e.g. modify+delete?),
please do so.

Stephen Butler is now a partial committer for the tree-conflicts branch,
so you, he, and I can all commit to it. This should help the workflow :)

>> Is the decrease in complexity you mention already covered in detection.txt,
>> with or without your patch?
> 
> Yes, the current version of "detection.txt" doesn't say anything about 
> needing to follow ancestry for any of the use cases.

I'm glad to hear once again that we really don't need to bother with
ancestry :) :) :)

(I rarely hand out three smilies at once -- I really mean it!)

Re: Tree conflict detection in 'svn merge'

Posted by Julian Foad <ju...@btopenworld.com>.

Stefan Sperling wrote:
> On Wed, Mar 19, 2008 at 11:11:59PM +0000, Julian Foad wrote:
> 
>>If we accept this principle that the basic algorithm should report all such 
>>conflicts, then some of the complexities that have been mentioned 
>>disappear. For example, looking at UC5 (rename onto a modified file), we 
>>wondered about a modified version of this use case in which the 
>>merge-source included not just a rename but a modification followed by a 
>>rename. We wondered whether we should try to follow the history of 
>>Merge-Left (Foo.c) forward in the source branch to find what it looked like 
>>at the point it was finally deleted. This is unnecessary, as we have no 
>>reason to want the merge to succeed if the target isn't similar to the 
>>merge-left source.
> 
> I can't recall this discussion properly, I'm afraid.
> 
> What do you mean by "the target isn't similar to the merge-left source"?
> If you mean a text-comparison, how do you define "similar"?

When merging textual changes into a file, we already have a merge algorithm 
that succeeds if the target file is "similar enough" to the merge-left source 
that the merge algorithm can find matching hunks of text in it. (I'm not sure 
of the details. I think it has to find the hunks in the correct order, and it 
may require some "context" or "overlap" at the edge of each hunk being changed. 
The details are not important right now.)

For merging tree changes, our new algorithm should succeed if the target tree 
is "similar to" the merge-left source tree in the sense that each object that 
we want to delete or modify is there in the expected place in the target, and 
each object that we want to create is not already there. We are not yet sure 
about the precise definition of these rules, but they will be something like 
that. If the source-left and the target are not "similar enough" to meet these 
conditions, then the merge should raise a tree conflict.

I define "similar" as both of these things: the text being similar enough for 
the text-merge algorithm to succeed, and the tree being similar enough for the 
tree-merge algorithm to succeed.
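
The tree half of that definition can be sketched in a few lines of Python.
This is only an illustration under simplifying assumptions: the target tree
is modelled as a set of paths, each incoming change is an (action, path)
pair, and all names here are hypothetical. As noted above, the precise rules
are not yet settled.

```python
def find_tree_conflicts(target_paths, changes):
    """Return the incoming changes that do not fit the target tree.

    target_paths: set of paths present in the target.
    changes: list of (action, path), action in "add", "delete", "modify".
    """
    victims = []
    for action, path in changes:
        present = path in target_paths
        if action in ("delete", "modify"):
            # Objects we want to delete or modify must already be there.
            if not present:
                victims.append((action, path, "missing in target"))
        elif action == "add":
            # Objects we want to create must not already be there.
            if present:
                victims.append((action, path, "already exists in target"))
        else:
            raise ValueError("unknown action: %r" % (action,))
    return victims


target = {"Foo.c", "Bar.c"}
incoming = [("modify", "Foo.c"), ("delete", "Baz.c"),
            ("add", "Bar.c"), ("add", "New.c")]
conflicts = find_tree_conflicts(target, incoming)
```

An empty result means the trees are "similar enough" for the tree merge to
proceed; anything else would be raised as a tree conflict.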

When we apply a modify-plus-delete onto an existing target file, my main point 
was that we should not try to see if the textual modification has already been 
done without our knowing it, and ignore that part if so; instead, we should 
assume that the whole requested change is supposed to apply cleanly, and if it 
doesn't then we should signal a conflict. Now, I don't know whether we would 
want to compare the text in this case to determine whether we should allow the 
deletion; that isn't the issue I was debating. I'm saying that if we do decide 
to compare the text, it should be strictly the merge-left text that we examine, 
and not some intermediate text that may have existed after merge-left and 
before the deletion.
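
A small Python sketch of that last point, assuming we do decide to compare
text and assuming, purely for illustration, that the comparison is exact
equality. The function name is hypothetical.

```python
def delete_applies_cleanly(target_text, merge_left_text):
    """Compare the target strictly against the merge-left text.

    Intermediate source revisions between merge-left and the deletion
    are deliberately not consulted.
    """
    return target_text == merge_left_text


merge_left = "int x = 1;\n"
intermediate = "int x = 2;\n"   # source text just before it was deleted

# A target that still matches merge-left accepts the modify-plus-delete.
unchanged_target_ok = delete_applies_cleanly(merge_left, merge_left)

# A target that happens to match the intermediate text is still a conflict.
premodified_target_ok = delete_applies_cleanly(intermediate, merge_left)
```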

> Is the decrease in complexity you mention already covered in detection.txt,
> with or without your patch?

Yes, the current version of "detection.txt" doesn't say anything about needing 
to follow ancestry for any of the use cases.

> I approve your version of the patch to detection.txt, by the way,
> and will commit it. Thanks for improving my wording in several places
> by an order of magnitude :)
>  
> 
>>I expect a similar simplification can be made for UC5, but I haven't done that yet.
> 
> You must mean a different use case, because you talked about UC5
> above already :)

I meant that, now that I have simplified the text on detecting UC4 and UC6, I
expect we can simplify the text on detecting UC5 in a similar way. It looks like
you have already done so, which is fine.

- Julian

Re: Tree conflict detection in 'svn merge'

Posted by Stefan Sperling <st...@elego.de>.

On Wed, Mar 19, 2008 at 11:11:59PM +0000, Julian Foad wrote:
> If we accept this principle that the basic algorithm should report all such 
> conflicts, then some of the complexities that have been mentioned 
> disappear. For example, looking at UC5 (rename onto a modified file), we 
> wondered about a modified version of this use case in which the 
> merge-source included not just a rename but a modification followed by a 
> rename. We wondered whether we should try to follow the history of 
> Merge-Left (Foo.c) forward in the source branch to find what it looked like 
> at the point it was finally deleted. This is unnecessary, as we have no 
> reason to want the merge to succeed if the target isn't similar to the 
> merge-left source.

I can't recall this discussion properly, I'm afraid.

What do you mean by "the target isn't similar to the merge-left source"?
If you mean a text-comparison, how do you define "similar"?

Is the decrease in complexity you mention already covered in detection.txt,
with or without your patch?

I approve your version of the patch to detection.txt, by the way,
and will commit it. Thanks for improving my wording in several places
by an order of magnitude :)
 
> I expect a similar simplification can be made for UC5, but I haven't done that yet.

You must mean a different use case, because you talked about UC5
above already :)

Re: Tree conflict detection in 'svn merge'

Posted by Julian Foad <ju...@btopenworld.com>.

Hi Stefan.

I have read your changes for UC 4 and 6, and I have modified these sections. 
Please have a look at my attached patch and see if that is better. One thing I 
have done is separate the notes about how users would like to resolve these 
cases from the main text about how we detect a conflict.

First, though, let me explain why I think it is right to always flag a conflict 
in those situations.

When a user asks Subversion to apply a diff (Source-Left -> Source-Right) onto 
a target, either the user expects Subversion to choose and apply only those 
source changes that have not already been merged, using the merge tracking 
feature, or the user has selected the exact source diffs that should be applied 
to the target.

In either case, the libsvn_client merge code expects that it is applying 
changes that have not already been applied to the target branch. When it 
encounters a change that it cannot apply, the fundamentally correct response is 
for it to flag a conflict. If it is deleting a file that has already been 
deleted in the target branch, or renaming a file that has already been renamed 
to the same new name, or modifying a line of text where the same text 
modification has already been made, then these are all strictly conflicts, but 
clearly some users will prefer Subversion to automatically resolve these 
conflicts in the "obvious" way.

Our text-merge algorithm has always auto-resolved two identical modifications 
by silently ignoring one of them, because that is often what users expect, and 
it will continue to do so, but it is easy to show examples where this is the 
wrong result.

The general solution is to use a basic algorithm that detects all such cases as 
potential conflicts and reports them to a higher-level customisable conflict 
handler that can automatically resolve certain cases according to the user's 
preferences. Only the cases that are not resolved would get reported to the 
user as persistent conflicts that need further attention.
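
The split between a basic detector and a customisable handler might look
roughly like the Python sketch below. The conflict kinds, the class name and
the shape of the user's preferences are all invented for the example; nothing
here reflects an existing Subversion interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PotentialConflict:
    kind: str   # hypothetical labels, e.g. "double-delete"
    path: str


def handle_conflicts(potential, auto_resolve_kinds):
    """Split detector output into (auto_resolved, persistent).

    The detector reports everything; only the handler decides, from the
    user's preferences, which kinds may be resolved silently.
    """
    auto_resolved, persistent = [], []
    for conflict in potential:
        if conflict.kind in auto_resolve_kinds:
            auto_resolved.append(conflict)
        else:
            persistent.append(conflict)
    return auto_resolved, persistent


reported = [
    PotentialConflict("double-delete", "Foo.c"),
    PotentialConflict("identical-text-change", "Bar.c"),
    PotentialConflict("modify-missing-file", "Baz.c"),
]
resolved, remaining = handle_conflicts(reported, {"double-delete"})
```

With these preferences only the double delete is resolved automatically; the
other two stay with the user as persistent conflicts.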

If we accept this principle that the basic algorithm should report all such 
conflicts, then some of the complexities that have been mentioned disappear. 
For example, looking at UC5 (rename onto a modified file), we wondered about a 
modified version of this use case in which the merge-source included not just a 
rename but a modification followed by a rename. We wondered whether we should 
try to follow the history of Merge-Left (Foo.c) forward in the source branch to 
find what it looked like at the point it was finally deleted. This is 
unnecessary, as we have no reason to want the merge to succeed if the target 
isn't similar to the merge-left source.

I hope this makes sense.

I expect a similar simplification can be made for UC5, but I haven't done that yet.

- Julian


Stefan Sperling wrote:
> Index: notes/tree-conflicts/detection.txt
> ===================================================================
> --- notes/tree-conflicts/detection.txt	(revision 29943)
> +++ notes/tree-conflicts/detection.txt	(working copy)
[...]
> +target working copy, then the target file is a tree conflict victim.
> +The rationale behind this is that the source diff doesn't cover as
> +many revisions as it should. The file should either be brought in by
> +adding the revision that created the file to the list of revisions
> +to be merged, or changes made to it on the source branch should be
> +omitted from the merge range entirely.
> +
> +We will need to store tree-conflict information in entries about files
> +that may not actually be present in the working copy at all.
> +
> +If a modification to the nonexistent file is part of a larger diff
> +with changes to other files that should be merged, the user will
> +need to be able to manually resolve the tree-conflict while keeping
> +the desired changes.
> +
> +Users must be able to run a second merge command to resolve the
> +tree-conflict, or repeat a previous merge operation, but with
> +additional revisions, without harm.
> +
> +However, the current plan is to disallow merges into tree-conflicted
> +directories. This means that users will first have to mark the
> +tree-conflict around the missing victim as resolved before attempting
> +to merge the file again, this time including the revision that created
> +the file. This may be a bit of an awkward work flow but is required to
> +solve the problem this use case has in the current implementation,
> +namely that missing files may accidentally be overlooked during merging.
>  
>  ==========
>  USE CASE 5
> @@ -211,57 +126,85 @@ victim if its text is different from the
>  source.  We don't want to forget any text changes that are unique to
>  the file at the current URL.
>  
> +To allow uncommitted text modifications in the working copy, we should
> +do any text comparisons against the WORKING revision.
> +
>  A tree conflict exists if all of the following predicates are true:
>  
> -1. The merge operation is compatible with tree conflict detection.
> -   Same as #1 in use case 4.
> +1. We check specific fields of the merge-command baton (of type
> +   merge_cmd_baton_t). If all of the following boolean fields have
> +   the given values, we might have a tree conflict.
> +
> +    a. (same_repos == TRUE) Both of the source URLs, merge-left and
> +        merge-right, must be in the same repository. This is also
> +        a requirement of the current merge-tracking implementation.
> +
> +        Julian Foad said this requirement might be too strict in
> +        case Subversion ever grows support for merging between
> +        different repositories:
> +
> +        "I agree that the source and target must be in the same
> +        repository (unless ancestry is being ignored), but only because
> +        we currently have no way to trace the ancestry across repositories.
> +        I don't see why this same_repos condition flag exists or should
> +        be considered independently from asking whether we can find a
> +        common ancestor."
> +        http://svn.haxx.se/dev/archive-2008-02/0611.shtml
> +
> +    b. (sources_ancestral == TRUE) The merge-left URL given to the
> +        merge command must be an ancestor of the merge-right URL.
> +
> +    c. (ignore_ancestry == FALSE) The rest of the predicates below
> +        depend on ancestry queries, so if the user wants to ignore
> +        ancestry there's not much point in looking for tree conflicts.
> +
> +    d. (record_only == FALSE) A record-only merge operation updates
> +        mergeinfo without touching files. 
>  
>  2. The target file and the source-left file have a common ancestor.
>  
> +   Rationale:
> +   
> +   Otherwise the files are not related, so differences between
> +   them cannot be due to separate lines of history with conflicting
> +   tree-related changes.
> +
>     Implementation:
>  
>     Call svn_client__get_youngest_common_ancestor().  If the ancestor
>     exists, the predicate is TRUE.
>  
> -   Rationale:
> -
> -   Thankfully, the job here is a lot simpler than in use case 4,
> -   because we don't have to search the history to find a possible
> -   target file to serve as the tree conflict victim.
> -
>  3. The text of the target file does not match the text of the
>     "last surviving revision" of the file after merge-left.
>  
> +   Rationale:
> +
> +   We don't want to flag every file deletion as a tree conflict.  We
> +   want to warn the user if the file to be deleted locally is
> +   different from the file deleted in the merge source.  The user then
> +   has a chance to merge these unique changes.
> +
>     Implementation:
>  
> -   Call svn_ra_get_deleted_revnum(), as discussed in #5 in use case 4.
> +   In the repository, call svn_repos_deleted_rev().
>     The "start" revision is the source-left revision, and the "end"
>     revision is the source-right revision.  If the function returns a
>     valid "deleted" revision, decrement it by 1 to derive the "last
>     survivor" revision.
>  
> +   Note that the client library can't call svn_repos_deleted_rev()
> +   directly.  We'll have to add the corresponding function
> +   svn_ra_get_deleted_revnum() to each of the remote-access layers.
> +
>     Call svn_client_diff_summarize2() to compare the target file to the
>     "last survivor" from the merge source.  If there is a text
>     difference, the predicate is TRUE.
>  
> -   Rationale:
> -
> -   We don't want to flag every file deletion as a tree conflict.  We
> -   want to warn the user if the file to be deleted locally is
> -   different from the file deleted in the merge source.  The user then
> -   has a chance to merge these unique changes.
> -
>  ==========
>  USE CASE 6
>  ==========
>  
> -If 'svn merge' tries to delete a file that does not exist in the
> -target working copy, and the history of the target directory includes
> -a file of the same name, and this target file is related to the
> -deleted source file, then the target file is a tree conflict victim.
> -
> -The same predicates are applied as in use case 4, except that #2 is
> -skipped because the file in question does not exist at source-right.
> +The same predicates are applied as in use case 4.
>  
>  It would be nice to skip the tree conflict if the double-delete is
>  caused by two rename operations that have the same destination.
> 
>
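
For readers following the patch, the use case 5 predicates it describes can
be summarised in a short Python sketch. The field names mirror the
merge_cmd_baton_t flags named in the patch, but the class and functions are
hypothetical, and the ancestry and text checks are passed in as plain
booleans rather than computed through the Subversion API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MergeFlags:
    same_repos: bool
    sources_ancestral: bool
    ignore_ancestry: bool
    record_only: bool


def merge_allows_detection(flags: MergeFlags) -> bool:
    """Predicate 1: cheap flag checks, no repository access needed."""
    return (flags.same_repos and flags.sources_ancestral
            and not flags.ignore_ancestry and not flags.record_only)


def last_survivor_revision(deleted_rev: Optional[int]) -> Optional[int]:
    """Predicate 3 helper: the revision just before the deletion."""
    if deleted_rev is None:
        return None          # no deletion found in the merge range
    return deleted_rev - 1


def uc5_is_tree_conflict(flags, has_common_ancestor, text_differs):
    """All three predicates must hold for a tree conflict."""
    return (merge_allows_detection(flags)
            and has_common_ancestor and text_differs)
```

Checking the flags first keeps the cheap tests ahead of anything that would
need to contact the repository.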

Re: Tree conflict detection in 'svn merge'

Posted by Stefan Sperling <st...@elego.de>.

On Wed, Mar 19, 2008 at 01:19:51PM +0100, Stefan Sperling wrote:
> > Just a brief note on the rest of your questions: I'm not right back into this
> > in detail, but yes I think Nico was right on simplifying that case.
> 
> Good, I will try to update detection.txt accordingly.

Julian,

here is a diff for review.

The diff itself is a bit awkward to read. You might want to apply
it and read the file in one piece and only refer to the diff
to see what I've actually changed.

If you have anything to add or any comments, please let me know.

I will commit this diff only after you've reviewed it, since I might
have gotten some details wrong or overlooked something entirely.

Thanks :)

Index: notes/tree-conflicts/detection.txt
===================================================================
--- notes/tree-conflicts/detection.txt	(revision 29943)
+++ notes/tree-conflicts/detection.txt	(working copy)
@@ -90,117 +90,32 @@ USE CASE 4
 ==========
 
 If 'svn merge' tries to modify a file that does not exist in the
-target working copy, and the history of the target directory includes
-a file of the same name, and this target file is related to the source
-file, then the target file is a tree conflict victim.
-
-A tree conflict exists if all of the following predicates are true:
-
-1. The merge operation is compatible with tree conflict detection.
-
-   Implementation:
-
-   We check specific fields of the merge-command baton (of type
-   merge_cmd_baton_t), returning TRUE if all of the following boolean
-   fields have the given values.
-
-   a. (same_repos == TRUE) The sources (the URLs of the left and right
-       sides of the diff) and the target (the URL of the working copy)
-       must be in the same repository.
-
-   b. (sources_ancestral == TRUE)  The merge-left URL given to the
-       merge command must be an ancestor of the merge-right URL.
-
-   c. (record_only == FALSE)  A record-only merge operation updates
-       mergeinfo without touching files.
-
-   Rationale:
-
-   These are conditions under which merge tracking is done.  We check
-   them first because it's cheap to do so (no repository access
-   required).  It's not yet clear how to handle --ignore-ancestry, so
-   its field is not in the list.
-
-2. The source-left file and the source-right file have a common
-   ancestor.
-
-   Implementation:
-
-   Call svn_client__get_youngest_common_ancestor().  If the youngest
-   common ancestor (YCA) is the source-left file, the predicate is
-   TRUE.
-
-   Rationale:
-
-   This is more specific than #1b above.  This predicate applies to a
-   single file, while #1b applies to the full merge sources, which are
-   usually directory trees.
-
-3. The target directory and the corresponding source-left directory
-   have a common ancestor.
-
-   Implementation:
-
-   Call svn_client__get_youngest_common_ancestor(), passing the two
-   dirs as arguments.  If a YCA exists, the predicate is TRUE.
-
-   Rationale:
-
-   The deletion of a file is not part of the history of the file
-   itself.  Rather it is part of the history of its parent directory.
-   If the parent (target) directory has no history in common with the
-   corresponding merge source directory, then no tree conflict is
-   possible.
-
-4. The file existed in the YCA directory.
-
-   Implementation:
-
-   Call svn_ra_check_path(), passing the filename and the revision of
-   the YCA directory.  If the function says there was a file there,
-   the predicate is TRUE.
-
-5. In the history of the target directory, a file by this name has
-   been deleted.
-
-   Implementation:
-
-   In the repository, call svn_repos_deleted_rev().  The "start"
-   revision arg is the revision of the YCA directory.  The "end"
-   revision arg is the HEAD revision.  If the function sets a valid
-   "deleted" revision, then the predicate is TRUE.
-
-   Rationale:
-
-   We already know that the file doesn't exist, so we know it was
-   deleted.  The specific revision is useful in other predicates.
-
-   At first I thought the "end" revision should be the BASE of the
-   target directory.  But that could lead to undetected tree
-   conflicts.  Suppose the user has committed a file deletion and then
-   merged without updating first.  The target directory is out of
-   date, and a search limited to the target directory's BASE would not
-   find the file deletion.
-
-   Note that the client library can't call svn_repos_deleted_rev()
-   directly.  We'll have to add the corresponding function
-   svn_ra_get_deleted_revnum() to each of the remote-access layers.
-
-6. The file at source-left and the file deleted in the target
-   directory's history have a common ancestor.
-
-   Implementation:
-
-   Call svn_client__get_youngest_common_ancestor(), passing the
-   source-left file and the "last surviving revision" of the target
-   file, derived from #5 above.  If they have a common ancestor, the
-   predicate is TRUE.  We can finally declare that we have found a
-   tree conflict!
-
-   Rationale:
-
-   I suppose it would be equivalent to pass the revision at
-   source-left and the revision of the YCA directory.
+target working copy, then the target file is a tree conflict victim.
+The rationale behind this is that the source diff doesn't cover as
+many revisions as it should. The file should either be brought in by
+adding the revision that created the file to the list of revisions
+to be merged, or changes made to it on the source branch should be
+omitted from the merge range entirely.
+
+We will need to store tree-conflict information in entries about files
+that may not actually be present in the working copy at all.
+
+If a modification to the nonexistent file is part of a larger diff
+with changes to other files that should be merged, the user will
+need to be able to manually resolve the tree-conflict while keeping
+the desired changes.
+
+Users must be able to run a second merge command to resolve the
+tree-conflict, or repeat a previous merge operation, but with
+additional revisions, without harm.
+
+However, the current plan is to disallow merges into tree-conflicted
+directories. This means that users will first have to mark the
+tree-conflict around the missing victim as resolved before attempting
+to merge the file again, this time including the revision that created
+the file. This may be a bit of an awkward work flow but is required to
+solve the problem this use case has in the current implementation,
+namely that missing files may accidentally be overlooked during merging.
 
 ==========
 USE CASE 5
@@ -211,57 +126,85 @@ victim if its text is different from the
 source.  We don't want to forget any text changes that are unique to
 the file at the current URL.
 
+To allow uncommitted text modifications in the working copy, we should
+do any text comparisons against the WORKING revision.
+
 A tree conflict exists if all of the following predicates are true:
 
-1. The merge operation is compatible with tree conflict detection.
-   Same as #1 in use case 4.
+1. We check specific fields of the merge-command baton (of type
+   merge_cmd_baton_t). If all of the following boolean fields have
+   the given values, we might have a tree conflict.
+
+    a. (same_repos == TRUE) Both of the source URLs, merge-left and
+        merge-right, must be in the same repository. This is also
+        a requirement of the current merge-tracking implementation.
+
+        Julian Foad said this requirement might be too strict in
+        case Subversion ever grows support for merging between
+        different repositories:
+
+        "I agree that the source and target must be in the same
+        repository (unless ancestry is being ignored), but only because
+        we currently have no way to trace the ancestry across repositories.
+        I don't see why this same_repos condition flag exists or should
+        be considered independently from asking whether we can find a
+        common ancestor."
+        http://svn.haxx.se/dev/archive-2008-02/0611.shtml
+
+    b. (sources_ancestral == TRUE) The merge-left URL given to the
+        merge command must be an ancestor of the merge-right URL.
+
+    c. (ignore_ancestry == FALSE) The rest of the predicates below
+        depend on ancestry queries, so if the user wants to ignore
+        ancestry there's not much point in looking for tree conflicts.
+
+    d. (record_only == FALSE) A record-only merge operation updates
+        mergeinfo without touching files. 
 
 2. The target file and the source-left file have a common ancestor.
 
+   Rationale:
+
+   Otherwise the files are not related, so differences between
+   them cannot be due to separate lines of history with conflicting
+   tree-related changes.
+
    Implementation:
 
    Call svn_client__get_youngest_common_ancestor().  If the ancestor
    exists, the predicate is TRUE.
 
-   Rationale:
-
-   Thankfully, the job here is a lot simpler than in use case 4,
-   because we don't have to search the history to find a possible
-   target file to serve as the tree conflict victim.
-
 3. The text of the target file does not match the text of the
    "last surviving revision" of the file after merge-left.
 
+   Rationale:
+
+   We don't want to flag every file deletion as a tree conflict.  We
+   want to warn the user if the file to be deleted locally is
+   different from the file deleted in the merge source.  The user then
+   has a chance to merge these unique changes.
+
    Implementation:
 
-   Call svn_ra_get_deleted_revnum(), as discussed in #5 in use case 4.
+   In the repository, call svn_repos_deleted_rev().
    The "start" revision is the source-left revision, and the "end"
    revision is the source-right revision.  If the function returns a
    valid "deleted" revision, decrement it by 1 to derive the "last
    survivor" revision.
 
+   Note that the client library can't call svn_repos_deleted_rev()
+   directly.  We'll have to add the corresponding function
+   svn_ra_get_deleted_revnum() to each of the remote-access layers.
+
    Call svn_client_diff_summarize2() to compare the target file to the
    "last survivor" from the merge source.  If there is a text
    difference, the predicate is TRUE.
 
-   Rationale:
-
-   We don't want to flag every file deletion as a tree conflict.  We
-   want to warn the user if the file to be deleted locally is
-   different from the file deleted in the merge source.  The user then
-   has a chance to merge these unique changes.
-
 ==========
 USE CASE 6
 ==========
 
-If 'svn merge' tries to delete a file that does not exist in the
-target working copy, and the history of the target directory includes
-a file of the same name, and this target file is related to the
-deleted source file, then the target file is a tree conflict victim.
-
-The same predicates are applied as in use case 4, except that #2 is
-skipped because the file in question does not exist at source-right.
+The same predicates are applied as in use case 4.
 
 It would be nice to skip the tree conflict if the double-delete is
 caused by two rename operations that have the same destination.
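
For illustration only, here is a minimal C sketch of predicate 1 and of the
"last survivor" derivation from predicate 3 of use case 5. It is not taken
from the branch: sketch_merge_baton_t is a hypothetical stand-in holding only
the four boolean fields named in the patch (the real merge_cmd_baton_t has
more members), and SKETCH_INVALID_REVNUM and both function names are likewise
made up. The "deleted" revision is assumed to have been obtained already,
for example via the proposed svn_ra_get_deleted_revnum().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for merge_cmd_baton_t: only the four boolean
   fields that predicate 1 inspects. */
typedef struct sketch_merge_baton_t
{
  bool same_repos;         /* merge-left and merge-right share a repository */
  bool sources_ancestral;  /* merge-left is an ancestor of merge-right */
  bool ignore_ancestry;    /* the user asked to ignore ancestry */
  bool record_only;        /* only mergeinfo is updated, no files touched */
} sketch_merge_baton_t;

/* Hypothetical marker for "no valid revision". */
#define SKETCH_INVALID_REVNUM (-1L)

/* Predicate 1: return true if the merge operation is compatible with
   tree conflict detection, i.e. all four fields have the given values. */
static bool
sketch_merge_allows_detection(const sketch_merge_baton_t *baton)
{
  assert(baton != NULL);
  return baton->same_repos
         && baton->sources_ancestral
         && !baton->ignore_ancestry
         && !baton->record_only;
}

/* Predicate 3 helper: derive the "last survivor" revision from the
   revision in which the file was deleted.  If no valid "deleted"
   revision was found, there is no last survivor either. */
static long
sketch_last_survivor_rev(long deleted_rev)
{
  if (deleted_rev <= 0)
    return SKETCH_INVALID_REVNUM;
  return deleted_rev - 1;
}
```

With same_repos and sources_ancestral set and the other two flags clear,
the first function returns true. A file deleted in r42 has r41 as its last
survivor, which is the revision the target file would then be compared with.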


-- 
stefan
http://stsp.name                                         PGP Key: 0xF59D25F0

Re: Tree conflict detection in 'svn merge'

Posted by Stefan Sperling <st...@elego.de>.

On Wed, Mar 19, 2008 at 02:26:49AM +0000, Julian Foad wrote:
> --- Stefan Sperling <st...@elego.de> wrote:
> > Have you found someone informed about the details of merging
> > who is willing to help us analyze the issue yet?
> > Would anyone on the list reading this volunteer to do so?
> 
> I've just spent a couple of days with Paul and Mike and although we weren't
> specifically discussing tree conflicts I've gained a much better general
> understanding of how merging works in Subversion. I think I can now see how
> far we have to take this conflict detection, and it's not too far.

OK. I am eagerly waiting for details :)

> > It seems that part of Stephen Butler's rather elaborate plan about
> > use cases 4 and 5 in notes/tree-conflicts/detection.txt has been
> > made obsolete by comments made by Nico, stating that at least use
> > case 4 could be detected in a much simpler way. Can you confirm this?
> 
> Just a brief note on the rest of your questions: I'm not right back into this
> in detail, but yes I think Nico was right on simplifying that case.

Good, I will try to update detection.txt accordingly.

> On the subject of renames, yes I think we can and should proceed with detecting
> conflicts with the same degree (or lack) of sophistication with which
> "svn merge" already copes (or doesn't). I think this is feasible and useful.

Great.

-- 
Stefan Sperling <st...@elego.de>                 Software Developer
elego Software Solutions GmbH                            HRB 77719
Gustav-Meyer-Allee 25, Gebaeude 12        Tel:  +49 30 23 45 86 96 
13355 Berlin                              Fax:  +49 30 23 45 86 95
http://www.elego.de                 Geschaeftsfuehrer: Olaf Wagner

Re: Tree conflict detection in 'svn merge'

Posted by Julian Foad <ju...@btopenworld.com>.

--- Stefan Sperling <st...@elego.de> wrote:
> I'm currently reading through the archives, trying to get an idea about
> the current state of merging and tree-conflicts. I have some comments
> and questions for you and everyone else interested:
> 
> Have you found someone informed about the details of merging
> who is willing to help us analyze the issue yet?
> Would anyone on the list reading this volunteer to do so?

I've just spent a couple of days with Paul and Mike and although we weren't
specifically discussing tree conflicts I've gained a much better general
understanding of how merging works in Subversion. I think I can now see how
far we have to take this conflict detection, and it's not too far.


> It seems that part of Stephen Butler's rather elaborate plan about
> use cases 4 and 5 in notes/tree-conflicts/detection.txt has been
> made obsolete by comments made by Nico, stating that at least use
> case 4 could be detected in a much simpler way. Can you confirm this?

Just a brief note on the rest of your questions: I'm not right back into this
in detail, but yes I think Nico was right on simplifying that case. On the
subject of renames, yes I think we can and should proceed with detecting
conflicts with the same degree (or lack) of sophistication with which
"svn merge" already copes (or doesn't). I think this is feasible and useful.

- Julian


> 
> If so, I think we should update detection.txt to reflect the current
> status. I could try to modify the document accordingly and post a diff
> for review, also because this would help me getting back into the loop
> again after having been busy with other things than tree-conflicts
> for a while. Should I do that?
> 
> Have you yourself made any advances in analysing the issue?
> If so, would you mind sharing your thoughts on the list?
> 
> Having read some of the comments made by you and Nico, especially
> about relating source items to target items during merging, and the
> new use cases proposed by Nico, I keep thinking "Oh man, if only we had
> true renames, all this would not be such a big issue...".
> Do you feel the same? Do you think it's feasible trying to carry on with
> implementing tree-conflicts detection without true renames?
> I don't want to sound pessimistic here, far from it. I'd just like to
> know whether others had or have similar thoughts on this.
> Obviously, a true-rename implementation would open a whole different can
> of worms and is way outside the scope of the tree-conflicts branch.
> I'd still like to get at least a faint idea about to what degree true
> renames would help in reducing the complexity of the problems the
> tree-conflicts branch has to deal with. I've already written about what
> true renames would mean for our update and switch use cases in
> detection.txt.
> 
> I'll be able to work again full-time on tree-conflicts for the next
> four weeks at least. I hope we can get past the design phase during
> this time and hopefully get some code written. I'm motivated to do so,
> anyway. Oh, and don't worry, I'm not saying that I'd push for writing
> code before the design is solid. I'd just very much like us to get there :)


