You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@subversion.apache.org by Julian Foad <ju...@btopenworld.com> on 2012/07/17 19:28:25 UTC

Symmetric merge and subtrees

There are some merge scenarios for which it's not clear whether the user should specify '--reintegrate' or not.  We need to decide what the 'symmetric' (i.e. automatically-choosing) code should do in those cases.

The following example is adapted from merge_symmetric_tests.py 18, subtree_to_and_fro(), "reintegrate considers source subtree mergeinfo":

                   reintegrate 'everything'
                           |
  A ------o-s--------------x
      \          \        /
       \          \      /
  B     o-----s----s----s---
                   |
                 merge the subtree 'D' only

In terms of commands (assume a commit after each step):

  svn cp A B
  edit A
  edit A/D
  edit B/D  svn merge ^/A/D B/D
  edit B/D
  svn merge --reintegrate ^/B A

The output from "merge --reintegrate" is:

  svn: E195016: Reintegrate can only be used if
  revisions 2 through 8 were previously merged [...]

But why did the user choose a 'reintegrate' merge in this example, when there has been no merge of the branch root in the A->B direction?  Or, from another angle, what if we have this similar scenario?

                   merge 'everything'
                           |
  A     o-o-s--------------x
       /         \        /
      /           \      /
  B ----------s----s----s---

Given this example, it looks like a non-reintegrate merge might be the user's choice for the final merge (and for the intermediate merge... either a reintegrate or non-reintegrate); but all that's changed in this example is which branch was branched from which, and that shouldn't make any difference.  AFAIK we don't document a rule for whether '--reintegrate' should be used; it's ambiguous.

We have three possible results when we request a merge:

  (1) a correct merge
  (2) a wrong merge, with spurious conflicts
  (3) bail out

Currently, 'merge' gives (2), 'merge --reintegrate' gives (3), and 'merge --symmetric' gives (2).

The subtree_to_and_fro() test expects 'merge --symmetric' to give result (3), but the symmetric merge currently sees that case as one in which a *non-reintegrate* merge is needed, and goes ahead to produce a merged result (with a conflict):

  --- Merging r2 through r8 into 'A':
  U    A/D/gamma
  C    A/D/H/psi
  [...]

Is the symmetric merge 'broken'?  This test claims so, on the basis of expecting it to behave like a reintegrate merge.  However, we can't be strictly backward-compatible with both the reintegrate and the non-reintegrate behaviours if we have to pick one, because they differ.  What do we want to see here?

I think it depends what the user is thinking.  If the user is thinking "just help me merge everything that needs to be merged", then the user probably would have used the plain 'merge' command in 1.7, and would prefer a wrong merge with spurious conflicts, given that we haven't yet implemented a correct merge.  If the user is thinking "this is a reintegration, and I believe my branch was sync'd with the trunk recently", then the user probably would prefer this merge to bail out like 'reintegrate' does.

It almost goes without saying that if and when we have implemented a correct merge then that is the right thing to do, and that will resolve the ambiguity.  (When we do that, there is still the possibility of giving the user an option to make Subversion detect and report whether the required merge has anything complex about it -- subtrees, cherry-picked revisions, and so on -- because the user might want the mental security of confirming that in fact the merge is as simple as the user expects it to be.  But that is another topic.)

Our options for what 'symmetric' merge will do, in cases like this [1], are:

  (1) Implement correct merging.
  (2) Merge wrongly, with spurious conflicts, like plain 1.7 'merge'.
  (2a) (2) by default, withan option to check and bail out like (3).
  (3) Bail out like 1.7 'reintegrate'.
  (3a) (3) by default, with an option to force the merge to proceed like (2). 

If we choose (2), then we break strict compatibility with 1.7 'reintegrate' merge.  If we choose (3), then we break strict compatibility with 1.7 'non-reintegrate' merge.

I'd like to do (1), but, in the meantime, would we be happy with one of the alternatives?

- Julian


[1] At the moment, 'cases like this' means cases where the specified merge root node was last merged in the same direction as the current merge, or was never merged before.  That is, the symmetric merge code chooses whether to do a reintegrate style of merge or a non-reintegrate style of merge based on the direction of this merge relative to the direction of the last merge of the specified root node, not based on any merges that may have been done to subtrees.  That does NOT mean it doesn't handle subtrees; it handles them just the same as 
1.7 (reintegrate or plain) merge does.  It just means subtrees don't affect the decision of which style of merge it uses.


--
Certified & Supported Apache Subversion Downloads: http://www.wandisco.com/subversion/download

Re: Symmetric merge and subtrees

Posted by Julian Foad <ju...@btopenworld.com>.

I (Julian Foad) wrote:
> Investigating the second of three 'symmetric merge' test failures:
> 
>   merge_reintegrate_tests.py 10
> 
> There is a definite issue concerning reverse merges from the branches' own 
> history.  In this test, the merge that behaves unexpectedly is the final 
> 'reintegrate' shown in the following graph (which appears as a comment 
> in that test).

I've fixed this in r1365182 by making the 'reintegrate-like' code path of symmetric merge call the 'find_reintegrate_merge' function which rejects cases such as this one (where a revision from the common ancestry of the two branches is not recorded as merged in the source branch).

- Julian


>   # Make some more changes to A_COPY so that the same revisions have *not*
>   # been uniformly applied from A to A_COPY.  In this case the reintegrate
>   # merge should fail, but should provide a helpful message as to where the
>   # problems are.
>   #
>   #   A_COPY_3        4-------[9]--
>   #                  /          \
>   #                 /            \          [-8]___
>   #   A     -1---------5-6-7-8---10----------------\-------WC--
>   #            \ \               (D)        \       \      
> /reint.
>   #             \ \                    (mu)  \       \    /
>   #   A_COPY     2-\--------------------12---13--14--15--------
>   #                 \                   /            (D)
>   #                  \                 /
>   #   A_COPY_2        3-------------[11]--
> 
> A 'reintegrate' merge complains:
> 
> svn: E195016: Reintegrate can only be used if revisions 2 through 15 were 
> previously merged from [...]/A to the reintegrate source, but this is not the 
> case:
>   A_COPY/D
>     Missing ranges: /A/D:8
> 
> A 'symmetric' merge merges every change made on the branch A_COPY onto 
> branch 'A'.  That includes r15, the removal of change A:8, which was a 
> change to A/D/H/omega:
> 
> --- Merging differences between repository URLs into '[...]/A':
> U    [...]/A/B/E/alpha
> UU   [...]/A/mu
> U    [...]/A/D/H/omega
>  U   [...]/A/D
> --- Recording mergeinfo for merge between repository URLs into 
> '[...]/A':
>  U   [...]/A
>  U   [...]/A/D
>  G   [...]/A/mu
> 
> For completeness, let's see what a plain 1.7 merge does:
> 
> --- Merging r2 through r15 into '[...]/A':
> U    [...]/A/B/E/alpha
> UU   [...]/A/mu
> --- Merging r2 through r15 into '[...]/A/D':
>  G   [...]/A/D
> --- Recording mergeinfo for merge of r2 through r15 [...]
> 
> The plain merge does in fact merge r15, the reverse merge of r8.  (It 
> doesn't print any evidence of it there, because the result is a no-op, but 
> if we change the test to make another modification to that file, then we can see 
> it.)
> 
> 
> The root issue here is that a reverse-merge of a change from the shared 
> history[1] history is interpreted by 'reintegrate' as a user error to be 
> diagnosed, by plain 'merge' as just another change to be merged, and by 
> the current 'symmetric' code as just another change to be merged.
> 
> Without going into *why* they should behave differently right now, I think it is 
> clear that the 'symmetric' merge should in this case behave like the 
> 'reintegrate' merge and detect and report the situation.
> 
> So I'll take a stab at fixing that.
> 
> [1] We probably have very similar issues dealing with a reverse-merge of a 
> change from the history of one branch or the other (after their youngest common 
> ancestor).
> 
> - Julian
>

Re: Symmetric merge and subtrees

Posted by Julian Foad <ju...@btopenworld.com>.

Investigating the second of three 'symmetric merge' test failures:

  merge_reintegrate_tests.py 10

There is a definite issue concerning reverse merges from the branches' own history.  In this test, the merge that behaves unexpectedly is the final 'reintegrate' shown in the following graph (which appears as a comment in that test).

  # Make some more changes to A_COPY so that the same revisions have *not*
  # been uniformly applied from A to A_COPY.  In this case the reintegrate
  # merge should fail, but should provide a helpful message as to where the
  # problems are.
  #
  #   A_COPY_3        4-------[9]--
  #                  /          \
  #                 /            \          [-8]___
  #   A     -1---------5-6-7-8---10----------------\-------WC--
  #            \ \               (D)        \       \      /reint.
  #             \ \                    (mu)  \       \    /
  #   A_COPY     2-\--------------------12---13--14--15--------
  #                 \                   /            (D)
  #                  \                 /
  #   A_COPY_2        3-------------[11]--

A 'reintegrate' merge complains:

svn: E195016: Reintegrate can only be used if revisions 2 through 15 were previously merged from [...]/A to the reintegrate source, but this is not the case:
  A_COPY/D
    Missing ranges: /A/D:8

A 'symmetric' merge merges every change made on the branch A_COPY onto branch 'A'.  That includes r15, the removal of change A:8, which was a change to A/D/H/omega:

--- Merging differences between repository URLs into '[...]/A':
U    [...]/A/B/E/alpha
UU   [...]/A/mu
U    [...]/A/D/H/omega
 U   [...]/A/D
--- Recording mergeinfo for merge between repository URLs into '[...]/A':
 U   [...]/A
 U   [...]/A/D
 G   [...]/A/mu

For completeness, let's see what a plain 1.7 merge does:

--- Merging r2 through r15 into '[...]/A':
U    [...]/A/B/E/alpha
UU   [...]/A/mu
--- Merging r2 through r15 into '[...]/A/D':
 G   [...]/A/D
--- Recording mergeinfo for merge of r2 through r15 [...]

The plain merge does in fact merge r15, the reverse merge of r8.  (It doesn't print any evidence of it there, because the result is a no-op, but if we change the test to make another modification to that file, then we can see it.)


The root issue here is that a reverse-merge of a change from the shared history[1] history is interpreted by 'reintegrate' as a user error to be diagnosed, by plain 'merge' as just another change to be merged, and by the current 'symmetric' code as just another change to be merged.

Without going into *why* they should behave differently right now, I think it is clear that the 'symmetric' merge should in this case behave like the 'reintegrate' merge and detect and report the situation.

So I'll take a stab at fixing that.

[1] We probably have very similar issues dealing with a reverse-merge of a change from the history of one branch or the other (after their youngest common ancestor).

- Julian

Issue #4258 Automatic merge after subtree merge in opposite direction [was: Symmetric merge and subtrees]

Posted by Julian Foad <ju...@btopenworld.com>.

Paul Burba wrote on 2012-11-02:
> Julian Foad wrote:
>> Paul Burba wrote:

>>> Julian Foad wrote:
>>>>   The following example is adapted from
>>>> merge_symmetric_tests.py 18,  subtree_to_and_fro(),
>>>> "reintegrate considers source subtree  mergeinfo":
>>>> 
>>>>                     reintegrate 'everything'
>>>>                             |
>>>>     A------o-s--------------x
>>>>        \          \        /
>>>>          \          \      /
>>>>     B     o-----s----s----s---
>>>>                     |
>>>>                   merge the subtree 'D' only
[...]
>>>>   In terms of commands (assume a commit after each step):
>>>> 
>>>>     svn cp A B
>>>>     edit A
>>>>     edit A/D
>>>>     edit B/D  svn merge ^/A/D B/D
>>>>     edit B/D
>>>>     svn merge --reintegrate ^/B A
>>>> 
>>>>   The output from "merge --reintegrate" is:
>>>> 
>>>>     svn: E195016: Reintegrate can only be used if
>>>>     revisions 2 through 8 were previously merged [...]
[...]
>>>>   Is the symmetric merge 'broken'?  This test claims so, on 
>>>> the basis  of expecting it to behave like a reintegrate merge.
>>>> However, we can't  be strictly backward-compatible with both
>>>> the reintegrate and the  non-reintegrate behaviours if we have
>>>> to pick one, because they differ. 
>>>>  What do we want to see here?
>>>> 
>>>>   I think it depends what the user is thinking.  If the user
>>>> is  thinking  "just help me merge everything that needs to be
>>>> merged",  then the user  probably would have used the plain
>>>> 'merge' command in 1.7,  and would  prefer a wrong merge with
>>>> spurious conflicts, given that we haven't yet  implemented a
>>>> correct merge.  If the user is thinking "this is a
>>>>  reintegration, and I believe my branch was sync'd with the
>>>> trunk  recently", then the user probably would prefer this
>>>> merge to bail out  like 'reintegrate' does.
[...]
>>>>   Our options for what 'symmetric' merge will do, in cases 
>>>> like this  [1], are:
>>>> 
>>>>     (1)  Implement correct merging.
>>>>     (2)  Merge wrongly, with spurious conflicts, like plain
>>>>           1.7 'merge'.
>>>>     (2a) (2) by default, with an option to check and bail out
>>>>          like (3).
>>>>     (3)  Bail out like 1.7 'reintegrate'.
>>>>     (3a) (3) by default, with an option to force the merge to 
>>>>          proceed  like (2).
[...]
>>>  I take your points about user intent but if the symmetric merge
>>> code  is meant to do away with the --reintegrate option, I think
>>> we should  take the more cautious approach and assume the user
>>> intends a  reintegrate, so IMO (3a) is the best choice.
>> 
>>  OK, yes.  If I can try to interpret your meaning without saying 
>> "reintegrate", it would be: we should take the more cautious
>> approach which is to bail out because of the inconsistent subtree
>> state.  The rule would be, for a request to merge in the direction
>> A->B, roughly speaking:
>> 
>>    * if every node in the tree was last merged A->B or not at all
>>     -> merge in the non-reintegrate style
>> 
>>    * if every node in the tree was last merged B->A, up to the
>>     same  B revision
>>      -> merge in the reintegrate style
>> 
>>    * otherwise
>>      -> bail out and report the subtree inconsistency
>> 
>>  I'll try to formulate that rule more precisely.
> 
> Did you ever look into this any further?
> 
> I'm going over the XFailing tests to determine what is considered a
> 1.8 blocker and merge_automatic_tests.py 18: 'reintegrate considers
> source subtree mergeinfo' is obviously still set to XFail.

I have filed issue #4258 "Automatic merge after subtree merge in opposite direction" and marked the test accordingly.  I classified it as a defect because of your feeling that it's a regression, but I really feel it's an enhancement.  I set the milestone as 1.8-consider.

- Julian

Re: Symmetric merge and subtrees

Posted by Paul Burba <pt...@gmail.com>.

On Thu, Aug 2, 2012 at 10:45 AM, Julian Foad <ju...@btopenworld.com> wrote:
> Hi Paul.
>
> Thanks for looking at this and the other ('deleted subtrees') thread.  Replies in line below...
>
>
> Paul Burba wrote:
>> Julian Foad wrote:
>>>  There are some merge scenarios for which it's not clear whether the
>>> user should specify '--reintegrate' or not.  We need to decide what the
>>> 'symmetric' (i.e. automatically-choosing) code should do in those cases.
>>>
>>>  The following example is adapted from merge_symmetric_tests.py 18,
>>> subtree_to_and_fro(), "reintegrate considers source subtree
>>> mergeinfo":
>>>
>>>                    reintegrate 'everything'
>>>                            |
>>>    A ------o-s--------------x
>>>        \          \        /
>>>         \          \      /
>>>    B     o-----s----s----s---
>>>                    |
>>>                  merge the subtree 'D' only
>>
>> What does 's' represent in this diagram?
>
> Something happening to the subtree.
>
>>  In merge_symmetric_tests.py's key 's' represents 'a subtree
>> merge', but that's not the case here, AFAICT.
>
> Yup, sorry for the unclarity.
>
>>>  In terms of commands (assume a commit after each step):
>>>
>>>    svn cp A B
>>>    edit A
>>>    edit A/D
>>>    edit B/D  svn merge ^/A/D B/D
>>>    edit B/D
>>>    svn merge --reintegrate ^/B A
>>>
>>>  The output from "merge --reintegrate" is:
>>>
>>>    svn: E195016: Reintegrate can only be used if
>>>    revisions 2 through 8 were previously merged [...]
>>>
>>>  But why did the user choose a 'reintegrate' merge in this example,
>>> when there has been no merge of the branch root in the A->B
>>> direction?
>>
>> No, but we could easily have had such a merge,
>>
>>                   reintegrate 'everything'
>>                           |
>> A   -----o----o------------x
>>      \    \     \         /
>>       \    \     \       /
>> B      o----x-----s----s---
>>             |     |
>>             |   merge the subtree 'D' only
>>             |
>>           merge 'everything'
>>
>> and still be facing the same problem.
>
>
> I don't think so, because now it is clearly a 'reintegrate' that is needed.  I say 'clearly' because now the root path of the merge has mergeinfo on one side, showing that a merge was done in one direction on this *same root path* that we're now wanting to merge.  The rule for using "reintegrate", as far as I am aware, is that you use it when merging in the opposite direction to "the previous merge".  No matter whether "the previous one" means the most recent merge in the tree or the most recent merge of the given root path, in this scenario those are both in the direction A->B, and so both the user and the current symmetric merge code know what to do.  (I should check that it currently does.)

You are quite right, if there was any merge from the root of A to B,
the symmetric merge code path kicks in and alerts us that we aren't
ready for this merge:

>svn pl -vR
Properties on 'A_COPY':
  svn:mergeinfo
    /A:2-3
Properties on 'A_COPY\D':
  svn:mergeinfo
    /A/D:2-7

>svn merge ^^/A_COPY A
..\..\..\subversion\svn\util.c:913: (apr_err=195016)
..\..\..\subversion\svn\merge-cmd.c:163: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:11773: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:11695: (apr_err=195016)
..\..\..\subversion\libsvn_client\merge.c:10743: (apr_err=195016)
svn: E195016: Reintegrate can only be used if revisions 2 through 10
were previously merged from
file:///C:/SVN/src-trunk/Debug/subversion/tests/cmdline/svn-tes
t-work/repositories/merge_automatic_tests-18/A to the reintegrate
source, but this is not the case:
  A_COPY
    Missing ranges: /A:5

> On the other hand, in the test scenario, if "the previous one" means the most recent merge of the given root path (which is what I assumed, but who's to say?), then it's ambiguous because there was no such merge in either direction.
>
>
>
>>>  Or, from another angle, what if we have this similar scenario?
>>>
>>>                     merge 'everything'
>>>                             |
>>>    A     o-o-s--------------x
>>>         /         \        /
>>>        /           \      /
>>>    B ----------s----s----s---
>>>
>>>  Given this example, it looks like a non-reintegrate merge might be
>
>>> the user's choice for the final merge (and for the intermediate
>
>>> merge... either a reintegrate or non-reintegrate); but all that's
>
>>> changed in this example is which branch was branched from which, and
>
>>> that shouldn't make any difference.  AFAIK we don't document a rule
>
>>> for whether '--reintegrate' should be used; it's ambiguous.
>>>
>>>  We have three possible results when we request a merge:
>>>
>>>    (1) a correct merge
>>>    (2) a wrong merge, with spurious conflicts
>>>    (3) bail out
>>>
>>>  Currently, 'merge' gives (2), 'merge --reintegrate' gives
>>> (3), and 'merge --symmetric' gives (2).
>>>
>>>  The subtree_to_and_fro() test expects 'merge --symmetric' to give
>>> result (3), but the symmetric merge currently sees that case as one
>
>>> in which a *non-reintegrate* merge is needed, and goes ahead to
>
>>> produce a merged result (with a conflict):
>>>
>>>    --- Merging r2 through r8 into 'A':
>>>    U    A/D/gamma
>>>    C    A/D/H/psi
>>>    [...]
>>>
>>>  Is the symmetric merge 'broken'?  This test claims so, on the basis
>>> of expecting it to behave like a reintegrate merge.  However, we can't
>
>>> be strictly backward-compatible with both the reintegrate and the
>
>>> non-reintegrate behaviours if we have to pick one, because they differ.
>
>>> What do we want to see here?
>>>
>>>  I think it depends what the user is thinking.  If the user is thinking
>>> "just help me merge everything that needs to be merged", then the user
>>> probably would have used the plain 'merge' command in 1.7, and would
>>> prefer a wrong merge with spurious conflicts, given that we haven't yet
>>> implemented a correct merge.  If the user is thinking "this is a
>>> reintegration, and I believe my branch was sync'd with the trunk
>>> recently", then the user probably would prefer this merge to bail out
>
>>> like 'reintegrate' does.
>>>
>>>  It almost goes without saying that if and when we have implemented a
>>> correct merge then that is the right thing to do, and that will resolve
>
>>> the ambiguity.  (When we do that, there is still the possibility of
>
>>> giving the user an option to make Subversion detect and report whether
>
>>> the required merge has anything complex about it -- subtrees,
>
>>> cherry-picked revisions, and so on -- because the user might want the
>
>>> mental security of confirming that in fact the merge is as simple as
>
>>> the user expects it to be.  But that is another topic.)
>>>
>>>  Our options for what 'symmetric' merge will do, in cases like this
>>> [1], are:
>>>
>>>    (1) Implement correct merging.
>>>    (2) Merge wrongly, with spurious conflicts, like plain 1.7 'merge'.
>>>    (2a) (2) by default, with an option to check and bail out like (3).
>>>    (3) Bail out like 1.7 'reintegrate'.
>>>    (3a) (3) by default, with an option to force the merge to proceed
>
>>> like (2).
>>>
>>>  If we choose (2), then we break strict compatibility with 1.7
>>> 'reintegrate' merge.  If we choose (3), then we break strict
>>> compatibility with 1.7 'non-reintegrate' merge.
>>>
>>>  I'd like to do (1), but, in the meantime, would we be happy with one of
>>> the alternatives?
>>
>> #1 would be awesome :-) but in the meantime...
>>
>> I take your points about user intent but if the symmetric merge code
>> is meant to do away with the --reintegrate option, I think we should
>> take the more cautious approach and assume the user intends a
>> reintegrate, so IMO (3a) is the best choice.
>
>
> OK, yes.  If I can try to interpret your meaning without saying "reintegrate", it would be: we should take the more cautious approach which is to bail out because of the inconsistent subtree state.  The rule would be, for a request to merge in the direction A->B, roughly speaking:
>
>   * if every node in the tree was last merged A->B or not at all    -> merge in the non-reintegrate style
>
>   * if every node in the tree was last merged B->A, up to the same
>     B revision
>     -> merge in the reintegrate style
>
>
>   * otherwise
>     -> bail out and report the subtree inconsistency
>
> I'll try to formulate that rule more precisely.

Did you ever look into this any further?

I'm going over the XFailing tests to determine what is considered a
1.8 blocker and merge_automatic_tests.py 18: 'reintegrate considers
source subtree mergeinfo' is obviously still set to XFail.

-- 
Paul T. Burba
CollabNet, Inc. -- www.collab.net -- Enterprise Cloud Development
Skype: ptburba

> In particular, that rough statement doesn't address the difference between 'complete' merges (which don't leave unmerged gaps before any merged revs) and 'cherry-pick' merges (which do).
>
> - Julian
>

Re: Symmetric merge and subtrees

Posted by Julian Foad <ju...@btopenworld.com>.

Hi Paul.

Thanks for looking at this and the other ('deleted subtrees') thread.  Replies in line below...


Paul Burba wrote:
> Julian Foad wrote:
>>  There are some merge scenarios for which it's not clear whether the 
>> user should specify '--reintegrate' or not.  We need to decide what the 
>> 'symmetric' (i.e. automatically-choosing) code should do in those cases.
>> 
>>  The following example is adapted from merge_symmetric_tests.py 18, 
>> subtree_to_and_fro(), "reintegrate considers source subtree 
>> mergeinfo":
>> 
>>                    reintegrate 'everything'
>>                            |
>>    A ------o-s--------------x
>>        \          \        /
>>         \          \      /
>>    B     o-----s----s----s---
>>                    |
>>                  merge the subtree 'D' only
> 
> What does 's' represent in this diagram?

Something happening to the subtree.

>  In merge_symmetric_tests.py's key 's' represents 'a subtree 
> merge', but that's not the case here, AFAICT.

Yup, sorry for the unclarity.

>>  In terms of commands (assume a commit after each step):
>> 
>>    svn cp A B
>>    edit A
>>    edit A/D
>>    edit B/D  svn merge ^/A/D B/D
>>    edit B/D
>>    svn merge --reintegrate ^/B A
>> 
>>  The output from "merge --reintegrate" is:
>> 
>>    svn: E195016: Reintegrate can only be used if
>>    revisions 2 through 8 were previously merged [...]
>> 
>>  But why did the user choose a 'reintegrate' merge in this example, 
>> when there has been no merge of the branch root in the A->B
>> direction?
> 
> No, but we could easily have had such a merge,
> 
>                   reintegrate 'everything'
>                           |
> A   -----o----o------------x
>      \    \     \         /
>       \    \     \       /
> B      o----x-----s----s---
>             |     |
>             |   merge the subtree 'D' only
>             |
>           merge 'everything'
> 
> and still be facing the same problem.


I don't think so, because now it is clearly a 'reintegrate' that is needed.  I say 'clearly' because now the root path of the merge has mergeinfo on one side, showing that a merge was done in one direction on this *same root path* that we're now wanting to merge.  The rule for using "reintegrate", as far as I am aware, is that you use it when merging in the opposite direction to "the previous merge".  No matter whether "the previous one" means the most recent merge in the tree or the most recent merge of the given root path, in this scenario those are both in the direction A->B, and so both the user and the current symmetric merge code know what to do.  (I should check that it currently does.)

On the other hand, in the test scenario, if "the previous one" means the most recent merge of the given root path (which is what I assumed, but who's to say?), then it's ambiguous because there was no such merge in either direction.



>>  Or, from another angle, what if we have this similar scenario?
>> 
>>                     merge 'everything'
>>                             |
>>    A     o-o-s--------------x
>>         /         \        /
>>        /           \      /
>>    B ----------s----s----s---
>> 
>>  Given this example, it looks like a non-reintegrate merge might be 

>> the user's choice for the final merge (and for the intermediate 

>> merge... either a reintegrate or non-reintegrate); but all that's 

>> changed in this example is which branch was branched from which, and 

>> that shouldn't make any difference.  AFAIK we don't document a rule 

>> for whether '--reintegrate' should be used; it's ambiguous.
>> 
>>  We have three possible results when we request a merge:
>> 
>>    (1) a correct merge
>>    (2) a wrong merge, with spurious conflicts
>>    (3) bail out
>> 
>>  Currently, 'merge' gives (2), 'merge --reintegrate' gives 
>> (3), and 'merge --symmetric' gives (2).
>> 
>>  The subtree_to_and_fro() test expects 'merge --symmetric' to give 
>> result (3), but the symmetric merge currently sees that case as one 

>> in which a *non-reintegrate* merge is needed, and goes ahead to 

>> produce a merged result (with a conflict):
>> 
>>    --- Merging r2 through r8 into 'A':
>>    U    A/D/gamma
>>    C    A/D/H/psi
>>    [...]
>> 
>>  Is the symmetric merge 'broken'?  This test claims so, on the basis 
>> of expecting it to behave like a reintegrate merge.  However, we can't 

>> be strictly backward-compatible with both the reintegrate and the 

>> non-reintegrate behaviours if we have to pick one, because they differ. 

>> What do we want to see here?
>> 
>>  I think it depends what the user is thinking.  If the user is thinking 
>> "just help me merge everything that needs to be merged", then the user 
>> probably would have used the plain 'merge' command in 1.7, and would 
>> prefer a wrong merge with spurious conflicts, given that we haven't yet 
>> implemented a correct merge.  If the user is thinking "this is a 
>> reintegration, and I believe my branch was sync'd with the trunk 
>> recently", then the user probably would prefer this merge to bail out 

>> like 'reintegrate' does.
>> 
>>  It almost goes without saying that if and when we have implemented a 
>> correct merge then that is the right thing to do, and that will resolve 

>> the ambiguity.  (When we do that, there is still the possibility of 

>> giving the user an option to make Subversion detect and report whether 

>> the required merge has anything complex about it -- subtrees, 

>> cherry-picked revisions, and so on -- because the user might want the 

>> mental security of confirming that in fact the merge is as simple as 

>> the user expects it to be.  But that is another topic.)
>> 
>>  Our options for what 'symmetric' merge will do, in cases like this 
>> [1], are:
>> 
>>    (1) Implement correct merging.
>>    (2) Merge wrongly, with spurious conflicts, like plain 1.7 'merge'.
>>    (2a) (2) by default, with an option to check and bail out like (3).
>>    (3) Bail out like 1.7 'reintegrate'.
>>    (3a) (3) by default, with an option to force the merge to proceed 

>> like (2).
>> 
>>  If we choose (2), then we break strict compatibility with 1.7 
>> 'reintegrate' merge.  If we choose (3), then we break strict 
>> compatibility with 1.7 'non-reintegrate' merge.
>> 
>>  I'd like to do (1), but, in the meantime, would we be happy with one of 
>> the alternatives?
> 
> #1 would be awesome :-) but in the meantime...
> 
> I take your points about user intent but if the symmetric merge code
> is meant to do away with the --reintegrate option, I think we should
> take the more cautious approach and assume the user intends a
> reintegrate, so IMO (3a) is the best choice.


OK, yes.  If I can try to interpret your meaning without saying "reintegrate", it would be: we should take the more cautious approach which is to bail out because of the inconsistent subtree state.  The rule would be, for a request to merge in the direction A->B, roughly speaking:

  * if every node in the tree was last merged A->B or not at all    -> merge in the non-reintegrate style

  * if every node in the tree was last merged B->A, up to the same
    B revision
    -> merge in the reintegrate style


  * otherwise
    -> bail out and report the subtree inconsistency

I'll try to formulate that rule more precisely.  In particular, that rough statement doesn't address the difference between 'complete' merges (which don't leave unmerged gaps before any merged revs) and 'cherry-pick' merges (which do).

- Julian

Re: Symmetric merge and subtrees

Posted by Paul Burba <pt...@gmail.com>.

On Tue, Jul 17, 2012 at 1:28 PM, Julian Foad <ju...@btopenworld.com> wrote:
> There are some merge scenarios for which it's not clear whether the user should specify '--reintegrate' or not.  We need to decide what the 'symmetric' (i.e. automatically-choosing) code should do in those cases.
>
> The following example is adapted from merge_symmetric_tests.py 18, subtree_to_and_fro(), "reintegrate considers source subtree mergeinfo":
>
>                    reintegrate 'everything'
>                            |
>   A ------o-s--------------x
>       \          \        /
>        \          \      /
>   B     o-----s----s----s---
>                    |
>                  merge the subtree 'D' only

What does 's' represent in this diagram?  In
merge_symmetric_tests.py's key 's' represents 'a subtree merge', but
that's not the case here, AFAICT.

> In terms of commands (assume a commit after each step):
>
>   svn cp A B
>   edit A
>   edit A/D
>   edit B/D  svn merge ^/A/D B/D
>   edit B/D
>   svn merge --reintegrate ^/B A
>
> The output from "merge --reintegrate" is:
>
>   svn: E195016: Reintegrate can only be used if
>   revisions 2 through 8 were previously merged [...]
>
> But why did the user choose a 'reintegrate' merge in this example, when there has been no merge of the branch root in the A->B direction?

No, but we could easily have had such a merge,

                  reintegrate 'everything'
                          |
 A -----o----o------------x
     \    \     \        /
      \    \     \      /
 B     o----x-----s----s---
            |     |
            |   merge the subtree 'D' only
            |
          merge 'everything'

and still be facing the same problem.

> Or, from another angle, what if we have this similar scenario?
>
>                    merge 'everything'
>                            |
>   A     o-o-s--------------x
>        /         \        /
>       /           \      /
>   B ----------s----s----s---
>
> Given this example, it looks like a non-reintegrate merge might be the user's choice for the final merge (and for the intermediate merge... either a reintegrate or non-reintegrate); but all that's changed in this example is which branch was branched from which, and that shouldn't make any difference.  AFAIK we don't document a rule for whether '--reintegrate' should be used; it's ambiguous.
>
> We have three possible results when we request a merge:
>
>   (1) a correct merge
>   (2) a wrong merge, with spurious conflicts
>   (3) bail out
>
> Currently, 'merge' gives (2), 'merge --reintegrate' gives (3), and 'merge --symmetric' gives (2).
>
> The subtree_to_and_fro() test expects 'merge --symmetric' to give result (3), but the symmetric merge currently sees that case as one in which a *non-reintegrate* merge is needed, and goes ahead to produce a merged result (with a conflict):
>
>   --- Merging r2 through r8 into 'A':
>   U    A/D/gamma
>   C    A/D/H/psi
>   [...]
>
> Is the symmetric merge 'broken'?  This test claims so, on the basis of expecting it to behave like a reintegrate merge.  However, we can't be strictly backward-compatible with both the reintegrate and the non-reintegrate behaviours if we have to pick one, because they differ.  What do we want to see here?
>
> I think it depends what the user is thinking.  If the user is thinking "just help me merge everything that needs to be merged", then the user probably would have used the plain 'merge' command in 1.7, and would prefer a wrong merge with spurious conflicts, given that we haven't yet implemented a correct merge.  If the user is thinking "this is a reintegration, and I believe my branch was sync'd with the trunk recently", then the user probably would prefer this merge to bail out like 'reintegrate' does.
>
> It almost goes without saying that if and when we have implemented a correct merge then that is the right thing to do, and that will resolve the ambiguity.  (When we do that, there is still the possibility of giving the user an option to make Subversion detect and report whether the required merge has anything complex about it -- subtrees, cherry-picked revisions, and so on -- because the user might want the mental security of confirming that in fact the merge is as simple as the user expects it to be.  But that is another topic.)
>
> Our options for what 'symmetric' merge will do, in cases like this [1], are:
>
>   (1) Implement correct merging.
>   (2) Merge wrongly, with spurious conflicts, like plain 1.7 'merge'.
>   (2a) (2) by default, with an option to check and bail out like (3).
>   (3) Bail out like 1.7 'reintegrate'.
>   (3a) (3) by default, with an option to force the merge to proceed like (2).
>
> If we choose (2), then we break strict compatibility with 1.7 'reintegrate' merge.  If we choose (3), then we break strict compatibility with 1.7 'non-reintegrate' merge.
>
> I'd like to do (1), but, in the meantime, would we be happy with one of the alternatives?

#1 would be awesome :-) but in the meantime...

I take your points about user intent but if the symmetric merge code
is meant to do away with the --reintegrate option, I think we should
take the more cautious approach and assume the user intends a
reintegrate, so IMO (3a) is the best choice.

-- 
Paul T. Burba
CollabNet, Inc. -- www.collab.net -- Enterprise Cloud Development
Skype: ptburba

> - Julian
>
>
> [1] At the moment, 'cases like this' means cases where the specified merge root node was last merged in the same direction as the current merge, or was never merged before.  That is, the symmetric merge code chooses whether to do a reintegrate style of merge or a non-reintegrate style of merge based on the direction of this merge relative to the direction of the last merge of the specified root node, not based on any merges that may have been done to subtrees.  That does NOT mean it doesn't handle subtrees; it handles them just the same as
> 1.7 (reintegrate or plain) merge does.  It just means subtrees don't affect the decision of which style of merge it uses.
>
>
> --
> Certified & Supported Apache Subversion Downloads: http://www.wandisco.com/subversion/download