You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Kamesh Jayachandran <ka...@collab.net> on 2008/07/31 14:20:48 UTC

issue-2897(reflective merge solution)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi All,

Little bit background for the purpose of general audience.

Problem:
- ---------------------
- - Create /feature_branch from /trunk.
- - Synch up change(rX) from /trunk to /feature_branch and commit at rY
along with some other local changes 'Z'.
- - When you merge /feature_branch to /trunk later it should merge only
'Z' not rX again to /trunk as it is not a change originated in
/feature_branch.


Available solution:
- ------------------------
Solution 1: One can use '--reintegrate' switch to 'merge' command, but
it has lots of expectation on merge-target, which makes it unusable on a
feature_branch with renamed subtrees which is common when someone does a
refactoring there.(issue-3128).

Solution 2: issue-2897 branch 'extract non-reflective changes from a
reflective revision and apply'.
   This branch relies on availability of 'what revisions+merge_sources
got merged in a commit'. It had this information implemented in 'sqlite'
 our old 'mergeinfo' store.
   In January we have changed this 'sqlite' based mergeinfo storage to a
'fs' based 'storage', which this implementation has not caught up with yet.
   Given the issue-3128 in 'Solution 1' it seems 'Solution 2' is worth a
try.

Solution 2 explained:

Part 1: Store mergeinfo_added in a commit in the back ground(Currently I
have kind of local patch implementing the same for bdb, I will initiate
a separate discussion once my local patch becomes clean and mature).

Part 2: Retrieve the 'reflective rev and reflected merge ranges' within
a given merge range via the API. See API doc of
'svn_fs_get_commit_and_merge_ranges'
http://svn.collab.net/repos/svn/branches/issue-2897/subversion/include/svn_fs.h

Part 3: Treat the reflective revision specially as detailed below by
custom merge call back.


a) reflective text changes:
 reflective_merge_file_changed(mine, older, yours)
 mine = WC file.
 older = file@reflective_rev-1
 yours = file@reflective_rev


Normal merge_file_changed applies diff(older, yours) to mine.
This diff can contain the changes 'synch up from trunk' as well as the
local adhoc changes and conflict resolutions.

Applying the diff directly is going to cause lots of headaches.

So objective is to have a meaningful diff of *only local adhoc changes
done before committing the merge*.

Let me take the example and explain,


Case1:

We merge -r13, -r17, -r29 from /trunk to /feature_branch and commit at
r40 (Assume this is the only synch up)

Now we merge /feature_branch -r1:40 back to /trunk.

It does a merge of -r1:39 which is a normal merge.
It does a reflective merge of 40.
Let us say /feature_branch/test.c got a change from a merge at r40(Our
first merge)

To calculate the *meaningful diff*, We apply -r13 change to OLDER, and
r17 change to OLDER, r29 change to OLDER.

Now in this case after the above 3 merges(r13, r17, r29) OLDER becomes
YOURS and hence no change is applied to MINE.


Case2:

We merge -r13, -r17, -r29 from /trunk to /feature_branch + local
non-conflicting change to /feature_branch/test.c and commit at r40
(Assume this is the only synch up)

Now we merge /feature_branch -r1:40 back to /trunk.

It does a merge of -r1:39 which is a normal merge.
It does a reflective merge of r40.
Let us say /feature_branch/test.c got a change from a merge at r40(Our
first merge)

To calculate the *meaningful diff*, We apply -r13 change to OLDER, and
r17 change to OLDER, r29 change to OLDER.

Now in this case after the above 3 merges(r13, r17, r29) OLDER becomes
(YOURS-'*local non conflicting changes*') and hence
(YOURS-YOURS +'*local non conflicting changes*') is applied to MINE. i.e
'*local non conflicting changes*'

Case3:

We merge -r13, -r17(conflicting), -r29 from /trunk to /feature_branch +
local non-conflicting change + conflict resolution to r17 to
/feature_branch/test.c and commit at r40
(Assume this is the only synch up)

Now we merge /feature_branch -r1:40 back to /trunk.

It does a merge of -r1:39 which is a normal merge.
It does a reflective merge of 40.
Let us say /feature_branch/test.c got a change from a merge at r40(Our
first merge)

To calculate the *meaningful diff*, We apply -r13 change to OLDER, and
r29 change to OLDER. (SEE WE DONT APPLY r17 as it is a conflicting one)/

Now in this case after the above 2 merges(r13, r29) OLDER becomes
(YOURS-local non conflicting changes - r17 - conflict resolution to r17)
and hence
(YOURS-YOURS +local non conflicting changes + r17 + conflict resolution
to r17) is applied to MINE. i.e local non conflicting changes + r17 +
conflict resolution to r17. Moral: If synch up gave a conflict reverse
also would give.




b)reflective tree_changes dir_added/dir_deleted/file_added/file_deleted

We have code to handle some cases, have to think extensively on this
before calling it to be complete.


As things have changed quite a lot in the last 6 months, I cut one more
branch issue-2897-take2 where I will extract changes from issue-2897.


Thanks
With regards
Kamesh Jayachandran
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFIkco/3WHvyO0YTCwRAnhVAJ9zaBBok5ab+meTcpDC1A/9QAQhHACgg0dO
A0kWMWI3V6VkwBjjZQEsYHM=
=xd43
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: issue-2897(reflective merge solution)

Posted by Mark Phippard <ma...@gmail.com>.
On Mon, Aug 4, 2008 at 1:48 PM, Julian Foad <ju...@btopenworld.com> wrote:

>> Solution 1: One can use '--reintegrate' switch to 'merge' command, but
>> it has lots of expectation on merge-target, which makes it unusable on a
>> feature_branch with renamed subtrees which is common when someone does a
>> refactoring there.(issue-3128).
>
> The restriction that makes "--reintegrate" unusable with sub-tree
> mergeinfo is something we could fix. It should check whether the subtree
> mergeinfo meets the criteria for what situations the command can handle.
> Currently it doesn't attempt to check this, and just assumes the
> merrgeinfo doesn't meet the criteria.
>
> I know solution 2 can handle many more general cases, but, in the simple
> cases that "--reintegrate" is intended to handle, does your solution 2
> achieve the same result, or does it have a different concept of what is
> the right result?

Reintegrate has another problem and that is that it is not conducive
the continue working on the branch after it is reintegrated.  IOW, set
aside the subtree mergeinfo problem and assume you do not have that.

After you run reintegrate to merge a branch back to trunk, you really
need to delete and re-branch if you want to keep working on it.  If
you don't you have some problems:

1) If you try to synch your branch with trunk again it no longer works
because now you have a reflective merge situation from trunk to the
branch.  I am not sure if you can add the --reintegrate option in this
direction.  If you can, that would be one solution.  But it would be
weird to suddenly change the way you were merging.

2) If you do not somehow successfully do #1, then --reintegrate will
not work properly (when you want to put the latest changes from branch
back to trunk).  This is because --reintegrate has no way to know to
use a newer revision from trunk when doing the merge.  Basically, it
always looks at the last revision you merged from trunk to the branch.
 If you have not done any more merges, then it is going to try to
merge everything again from the branch.

Anyway, do not mean to sidetrack this thread.  I am just pointing out
that there are several things that would have to be "fixed" about
reintegrate for it to be the only solution to this problem.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: issue-2897(reflective merge solution)

Posted by Mark Phippard <ma...@gmail.com>.
On Mon, Aug 4, 2008 at 1:48 PM, Julian Foad <ju...@btopenworld.com> wrote:

> Solution 0: If you were to use Subversion 1.5's plain "svn
> merge /feature_branch /trunk" command, then Subversion would omit rY
> entirely, which is wrong because we need the conflict-resolution part of
> the change. Without that part, the merge would throw up those same (or
> similar) conflicts again, which the user would have to resolve again.

I think what we finally decided to do in 1.5 was include rY in this
scenario.  So you would get a lot of repeated merges and likely
conflicts from that.  I'd have to search our archives to be sure, but
I recall we decided it was better to not risk losing code.  Obviously
both options "omit rY" and "take rY" are imperfect.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: issue-2897(reflective merge solution)

Posted by Kamesh Jayachandran <ka...@collab.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

>>>> Available solution:
>>>> - ------------------------
>>> Solution 0: If you were to use Subversion 1.5's plain "svn
>>> merge /feature_branch /trunk" command, then Subversion would omit rY
>>> entirely, which is wrong because we need the conflict-resolution part of
>>> the change. Without that part, the merge would throw up those same (or
>>> similar) conflicts again, which the user would have to resolve again.
>> I believe you mean 'svn merge trunk_wc ^/feature_branch'
> 
> No; the source comes first: 'svn merge ^/feature_branch trunk_wc'. I
> expect that's what you meant :-)
> 

Yes source comes first, my typo :(.

>> This would *not* leave rY rather it would do a 'mad' merge
>> possibly(mostly) causing repeat merge problem.
> 
> OK, Mark explained that my assumption there was wrong and the current
> merge algorithm attempts to apply rY in this case so as not to throw
> away all the "logical conflict" resolution work that may have been
> included, even though the "physical conflict" parts will conflict again.
> 
>> Only instance it would not cause conflict is when there is enough
>> distance between 'trunk' and 'feature_branch' hunks.
> 
> I don't understand this. Each hunk from rY that conflicted and was
> resolved when rY was created from rX is almost certain to conflict
> again, is it not? I don't see how there can be any distance between
> where this hunk existed in rX and where it is going to be applied to the
> trunk again. It will be applied to the same place where it already
> exists.


Yes you are correct, it will conflict again.

When I mentioned distance between changes (hunk am I correct in my usage
of word hunk?) of synced revs of trunk and /fb changes, I had the
'simple case in mind'(no conflict in sync up) as described below.

/trunk@1 has test.c with the following content.
line1
line2
line3
line4
line5
line6
line7

i.e if you copy /fb from /trunk@1 and commit at r2, consider the
following sequence of events.

r3 - Modify /trunk/test.c 'line3->line3-changed'
r4 - merge ^/trunk fb_wc
r5 - Modify /fb/test.c 'line4->line4-changed'
merge ^/fb trunk_wc
This should not give a conflict.

But gives a conflict as below with normal merge(no reintegrate) as of
1.5.1 like follows,

<snip>
line1
line2
<<<<<<< .working
line3-changed
line4
=======
line3-changed
line4-changed
>>>>>>> .merge-right.r5
line5
line6
line7
</snip>

Both 'reintegrate' and 'issue-2897' should work well here without conflict.




> 
> 
> [...]
>>> I know solution 2 can handle many more general cases, but, in the simple
>>> cases that "--reintegrate" is intended to handle, does your solution 2
>>> achieve the same result, or does it have a different concept of what is
>>> the right result? 
>> It has no special cases, it just segregates a range of revisions to
>> merge as
>> [ordinary_range-1]...[reflective_rev-1]....[reflective-rev-2]...[ordinary_range-N]...
>>
>> Existing merge call backs are used for 'ordinary_ranges.
>>
>> Reflective call backs are used for 'reflective revs'.
>>
>> Purpose of Reflective call backs is 'extract non-reflective change and
>> merge the same'.
> 
> That doesn't really answer my question. I'm looking for a high-level
> end-user's perspective on what the merging process achieves. I think the
> answer is that, in the simple cases that "--reintegrate" is intended to
> handle, Solution 2 has the same concept of what is the right result
> (i.e. that the trunk should end up looking exactly like the
> feature_branch does) but does not achieve that result without manual
> conflict resolution.

Yes.

> 
> I wonder if it would be possible to make Solution 2 understand how to do
> the right thing in this simple case and avoid the need for conflict
> resolution. If we could do that, then it would win in every way.
>


I somehow think it should be possible in most of the cases, though I
don't have any algorithm in mind right now.


>>>
>>>   * Each conflict resolution performed on /feature_branch will correctly
>>> be included in the re-integration.
>>  Yes, after annoying with the user with the conflict again as mentioned
>> above.
> 
> Oh, yes, I suppose so. I was thinking here that changes made by the user
> for conflict resolution would not conflict again when merged back, but I
> suppose that would be highly unusual.

As said above it should be possible to make conflict-resolutions of
/feature_branch non-conflicting on the trunk.

> 
>>>   * Each revision from /trunk that conflicted when merged
>>> into /feature_branch will wrongly be included in the re-integration,
>>> creating new conflicts.
>>>
>> Yes. May be we can be bit more smart here once we have this stuff working.
> 
> I have tried drawing the merge scenario (case 3) on paper. There is a
> good way of representing merges, using tokens like "A" to represent a
> hunk of text and "Aa" or "A13" to represent a modified version of that
> hunk. This diagram shows one file starting in r9 containing three hunks
> of text "A", "B", and "C", and developing in trunk and on the branch:
>

Good.


> (You need a fixed-width typeface to display this nicely.)
> 
>     /trunk             /feature_branch
>     ____
> r9  |A
>     |B
>     |C
> 
> svn copy ^/trunk ^/feature_branch
>     ____               ____
> r10 |A                 |A
>     |B                 |B
>     |C                 |C
>

It could have been better if we have this /feature_branch r12 change at
r11 instead as r12 has commits in two branches(cross branch commit). No
problem though as technically this is possible.


> # Now various modifications are committed...
>     ____               ____
> r12 |Aa  <<            |A
>     |B                 |B
>     |C                 |Cc  <<
>     ____               ____
> r13 |Aa                |A
>     |B13 <<            |B
>     |C                 |Cc
>     ____               ____
> r17 |Aa                |A
>     |B13               |B
>     |C17 <<            |Cc
>     ____               ____
> r20 |A20 <<            |A
>     |B13               |B
>     |C17               |Cc
>     ____               ____
> r29 |A20               |A
>     |B29 <<            |B
>     |C17               |Cc
> 
> svn merge -c13,17,29 ^/trunk feature_branch_wc
>     ....               ____
> WC  ....               |A
>     ....               |B29 << auto-merged
>     ....               |<C17=C=Cc> << conflict
> 

I believe you did more than one merge with subset of revisions from
above to merge. Or else it would conflict first at r17 which if you
choose to resolve later will cause r29 to conflict too.

> # Manually resolve.
> svn commit
>     ____               ____
> r40 |A20               |A
>     |B29               |B29 << auto-merged
>     |C17               |C40 << manually resolved
>

However you did merge, final merge resolution is fine.

> svn merge ^/feature_branch trunk_wc
> 
> # phase 1: a normal merge of r10:39.

Phase1 would do bulk merge so r10:40

> # (The only operative change is r12 of /feature_branch.)
>
No r40 also is operative.


     ____
> WC  |A20
>     |B29
>     |<C17=C=Cc> << conflict
>

Above conflict is wrong, Conflict will be as follows,


     ____
 WC  |A20
     |B29 <--Even this may give a conflict as changes(see the
explanations given above for 'simple case in mind') are close.
     |<C17=C=C40> << conflict



> # Oops: it conflicts already. (Is that what we expected?)

Yes.

> # Manually resolve.
>     ____
> WC  |A20
>     |B29
>     |C40
> 
> # phase 2: a reflective-merge of r40
>


OLDER = /feature_branch/test.c@39(Though you have not mentioned test.c,
I presume it has some file like test.c)
                ____
> ( OLDER       = |?   )
>                 |?
>                 |?
>

OLDER       = A
              B
              Cc

                 ____
> ( OLDER +       |?   )
>   c13,17,29   = |?
>                 |?
>


OLDER_DASH = OLDER + c13,17,29
OLDER_DASH =   A
               B29
               Cc


> ( MINE        = |?   )
>                 |?
>                 |?
>     ____

I believe MINE is WC, YOURS is the actual name as per
subversion/libsvn_client/merge.c:merge_file_changed

YOURS = /feature_branch/test.c@40
YOURS        = A
               B29
               C40

diff(OLDER_DASH, YOURS) =
- -Cc
+C40


And hence a conflict. As I told above we can be smart here.


Thanks

With regards
Kamesh Jayachandran
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFInGUM3WHvyO0YTCwRAlilAJ4z0avYICioACdVLUKyJlOsINQtRwCeIACl
Ypr5DNHEsZPxA9Ty7smmTTE=
=H75x
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: issue-2897(reflective merge solution)

Posted by Julian Foad <ju...@btopenworld.com>.
Kamesh Jayachandran wrote:
> >> Problem:
> >> - ---------------------
> >> - - Create /feature_branch from /trunk.
> >> - - Synch up change(rX) from /trunk to /feature_branch and commit at rY
> >> along with some other local changes 'Z'.
> > 
> > To keep things simple and concrete, I will assume that, before this
> > merge, /feature_branch had already been synchronised with all of the
> > changes up to r(X-1) from /trunk.
> 
> Fine. By the above example I meant that 'final reflective merge
> solution' should support cherry picking also.

OK.

 
> > If the user is following best practices, then the extra change 'Z' is
> > what is needed to resolve conflicts.
> 
> May be, some times people may do logical conflict resolution(i.e not SVN
> conflict), for example merge may be successful but logically incorrect
> with respect to the state of things in feature branch, user may try to
> fix this inconsistency along with the merge and commit to ensure sanity
> of his commit.

Agreed. Z comprises the changes needed to resolve physical conflicts and
the changes needed to resolve logical conflicts.

 
> >> Available solution:
> >> - ------------------------
> > 
> > Solution 0: If you were to use Subversion 1.5's plain "svn
> > merge /feature_branch /trunk" command, then Subversion would omit rY
> > entirely, which is wrong because we need the conflict-resolution part of
> > the change. Without that part, the merge would throw up those same (or
> > similar) conflicts again, which the user would have to resolve again.
> 
> I believe you mean 'svn merge trunk_wc ^/feature_branch'

No; the source comes first: 'svn merge ^/feature_branch trunk_wc'. I
expect that's what you meant :-)

> This would *not* leave rY rather it would do a 'mad' merge
> possibly(mostly) causing repeat merge problem.

OK, Mark explained that my assumption there was wrong and the current
merge algorithm attempts to apply rY in this case so as not to throw
away all the "logical conflict" resolution work that may have been
included, even though the "physical conflict" parts will conflict again.

> Only instance it would not cause conflict is when there is enough
> distance between 'trunk' and 'feature_branch' hunks.

I don't understand this. Each hunk from rY that conflicted and was
resolved when rY was created from rX is almost certain to conflict
again, is it not? I don't see how there can be any distance between
where this hunk existed in rX and where it is going to be applied to the
trunk again. It will be applied to the same place where it already
exists.


[...]
> > I know solution 2 can handle many more general cases, but, in the simple
> > cases that "--reintegrate" is intended to handle, does your solution 2
> > achieve the same result, or does it have a different concept of what is
> > the right result? 
> 
> It has no special cases, it just segregates a range of revisions to
> merge as
> [ordinary_range-1]...[reflective_rev-1]....[reflective-rev-2]...[ordinary_range-N]...
> 
> Existing merge call backs are used for 'ordinary_ranges.
> 
> Reflective call backs are used for 'reflective revs'.
> 
> Purpose of Reflective call backs is 'extract non-reflective change and
> merge the same'.

That doesn't really answer my question. I'm looking for a high-level
end-user's perspective on what the merging process achieves. I think the
answer is that, in the simple cases that "--reintegrate" is intended to
handle, Solution 2 has the same concept of what is the right result
(i.e. that the trunk should end up looking exactly like the
feature_branch does) but does not achieve that result without manual
conflict resolution.

I wonder if it would be possible to make Solution 2 understand how to do
the right thing in this simple case and avoid the need for conflict
resolution. If we could do that, then it would win in every way.


[Snip lots of explanation that's all OK.]

> > So, this works well for non-conflicting changes, and for conflicting
> > changes it produces conflicts by merging the same change again.
> > 
> > In summary:
> > 
> >   * Each source change that originally merged cleanly
> > into /feature_branch will correctly be omitted from the re-integration
> > to /trunk.
> > 
> >   * Each independent change made to /feature_branch, including changes
> > made in the same revision as a merge, will correctly be re-integrated
> > to /trunk.
> > 
> >   * Each conflict resolution performed on /feature_branch will correctly
> > be included in the re-integration.
> 
>  Yes, after annoying with the user with the conflict again as mentioned
> above.

Oh, yes, I suppose so. I was thinking here that changes made by the user
for conflict resolution would not conflict again when merged back, but I
suppose that would be highly unusual.

> >   * Each revision from /trunk that conflicted when merged
> > into /feature_branch will wrongly be included in the re-integration,
> > creating new conflicts.
> > 
> Yes. May be we can be bit more smart here once we have this stuff working.

I have tried drawing the merge scenario (case 3) on paper. There is a
good way of representing merges, using tokens like "A" to represent a
hunk of text and "Aa" or "A13" to represent a modified version of that
hunk. This diagram shows one file starting in r9 containing three hunks
of text "A", "B", and "C", and developing in trunk and on the branch:

(You need a fixed-width typeface to display this nicely.)

    /trunk             /feature_branch
    ____
r9  |A
    |B
    |C

svn copy ^/trunk ^/feature_branch
    ____               ____
r10 |A                 |A
    |B                 |B
    |C                 |C

# Now various modifications are committed...
    ____               ____
r12 |Aa  <<            |A
    |B                 |B
    |C                 |Cc  <<
    ____               ____
r13 |Aa                |A
    |B13 <<            |B
    |C                 |Cc
    ____               ____
r17 |Aa                |A
    |B13               |B
    |C17 <<            |Cc
    ____               ____
r20 |A20 <<            |A
    |B13               |B
    |C17               |Cc
    ____               ____
r29 |A20               |A
    |B29 <<            |B
    |C17               |Cc

svn merge -c13,17,29 ^/trunk feature_branch_wc
    ....               ____
WC  ....               |A
    ....               |B29 << auto-merged
    ....               |<C17=C=Cc> << conflict

# Manually resolve.
svn commit
    ____               ____
r40 |A20               |A
    |B29               |B29 << auto-merged
    |C17               |C40 << manually resolved

svn merge ^/feature_branch trunk_wc

# phase 1: a normal merge of r10:39.
# (The only operative change is r12 of /feature_branch.)
    ____
WC  |A20
    |B29
    |<C17=C=Cc> << conflict

# Oops: it conflicts already. (Is that what we expected?)
# Manually resolve.
    ____
WC  |A20
    |B29
    |C40

# phase 2: a reflective-merge of r40
                ____
( OLDER       = |?   )
                |?
                |?
                ____
( OLDER +       |?   )
  c13,17,29   = |?
                |?
                ____
( MINE        = |?   )
                |?
                |?
    ____
WC  |?
    ...

Would you be interested in correcting this and filling in the rest?

- Julian



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: issue-2897(reflective merge solution)

Posted by Kamesh Jayachandran <ka...@collab.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> 
> Thanks, Kamesh. I have not followed the development of merge tracking
> very closely, so I am one of the "general audience" trying to understand
> this now.
>

Welcome.


> 
>> Problem:
>> - ---------------------
>> - - Create /feature_branch from /trunk.
>> - - Synch up change(rX) from /trunk to /feature_branch and commit at rY
>> along with some other local changes 'Z'.
> 
> To keep things simple and concrete, I will assume that, before this
> merge, /feature_branch had already been synchronised with all of the
> changes up to r(X-1) from /trunk.

Fine. By the above example I meant that 'final reflective merge
solution' should support cherry picking also.

> 
> If the user is following best practices, then the extra change 'Z' is
> what is needed to resolve conflicts.
>

May be, some times people may do logical conflict resolution(i.e not SVN
conflict), for example merge may be successful but logically incorrect
with respect to the state of things in feature branch, user may try to
fix this inconsistency along with the merge and commit to ensure sanity
of his commit.


> 
>> Available solution:
>> - ------------------------
> 
> Solution 0: If you were to use Subversion 1.5's plain "svn
> merge /feature_branch /trunk" command, then Subversion would omit rY
> entirely, which is wrong because we need the conflict-resolution part of
> the change. Without that part, the merge would throw up those same (or
> similar) conflicts again, which the user would have to resolve again.
>

I believe you mean 'svn merge trunk_wc ^/feature_branch'
This would *not* leave rY rather it would do a 'mad' merge
possibly(mostly) causing repeat merge problem.

Only instance it would not cause conflict is when there is enough
distance between 'trunk' and 'feature_branch' hunks.




>> Solution 1: One can use '--reintegrate' switch to 'merge' command, but
>> it has lots of expectation on merge-target, which makes it unusable on a
>> feature_branch with renamed subtrees which is common when someone does a
>> refactoring there.(issue-3128).
> 
> The restriction that makes "--reintegrate" unusable with sub-tree
> mergeinfo is something we could fix. It should check whether the subtree
> mergeinfo meets the criteria for what situations the command can handle.
> Currently it doesn't attempt to check this, and just assumes the
> merrgeinfo doesn't meet the criteria.

OK.

> 
> I know solution 2 can handle many more general cases, but, in the simple
> cases that "--reintegrate" is intended to handle, does your solution 2
> achieve the same result, or does it have a different concept of what is
> the right result? 

It has no special cases, it just segregates a range of revisions to
merge as
[ordinary_range-1]...[reflective_rev-1]....[reflective-rev-2]...[ordinary_range-N]...

Existing merge call backs are used for 'ordinary_ranges.

Reflective call backs are used for 'reflective revs'.

Purpose of Reflective call backs is 'extract non-reflective change and
merge the same'.

> 
>> Solution 2: issue-2897 branch 'extract non-reflective changes from a
>> reflective revision and apply'.
>>    This branch relies on availability of 'what revisions+merge_sources
>> got merged in a commit'. It had this information implemented in 'sqlite'
>>  our old 'mergeinfo' store.
>>    In January we have changed this 'sqlite' based mergeinfo storage to a
>> 'fs' based 'storage', which this implementation has not caught up with yet.
>>    Given the issue-3128 in 'Solution 1' it seems 'Solution 2' is worth a
>> try.
>>
>> Solution 2 explained:
>>
>> Part 1: Store mergeinfo_added in a commit in the back ground(Currently I
>> have kind of local patch implementing the same for bdb, I will initiate
>> a separate discussion once my local patch becomes clean and mature).
> 
> By "mergeinfo_added" do you mean the change in mergeinfo from r(X-1) to
> r(X), for each revision X? So this is a cache to improve the speed of
> mergeinfo calculations. 

Yes.

> 
>> Part 2: Retrieve the 'reflective rev and reflected merge ranges' within
>> a given merge range via the API. See API doc of
>> 'svn_fs_get_commit_and_merge_ranges'
>> http://svn.collab.net/repos/svn/branches/issue-2897/subversion/include/svn_fs.h
>>
>> Part 3: Treat the reflective revision specially as detailed below by
>> custom merge call back.
>>
>>
>> a) reflective text changes:
>>  reflective_merge_file_changed(mine, older, yours)
>>  mine = WC file.
>>  older = file@reflective_rev-1
>>  yours = file@reflective_rev
>>
>>
>> Normal merge_file_changed applies diff(older, yours) to mine.
>> This diff can contain the changes 'synch up from trunk' as well as the
>> local adhoc changes and conflict resolutions.
>>
>> Applying the diff directly is going to cause lots of headaches.
>>
>> So objective is to have a meaningful diff of *only local adhoc changes
>> done before committing the merge*.
>>
>> Let me take the example and explain,
>>
>>
>> Case1:
>>
>> We merge -r13, -r17, -r29 from /trunk to /feature_branch and commit at
>> r40 (Assume this is the only synch up)
> 
> I'm trying to follow along. Is this cherry-picking, or a complete synch?
> I'll give a running commentary of how I interpret this.

Here it is 'cherry pick', it can be a cherry harvest too.

> 
> I think you mean trunk was changed only in those three revisions, and
> we're doing a complete synch.
>

No, trunk may have other changes say in r20 and r22. Let us say while
working on feature branch I see some important changes in r13, r17 and
r29 from trunk of immediate interest to me and I am not prepared to
accept any other changes.


> Let's say no conflict resolution was necessary, and no other changes
> were made during this synch. 
> 
>> Now we merge /feature_branch -r1:40 back to /trunk.
>>
>> It does a merge of -r1:39 which is a normal merge.
> 
> "It" is "Solution 2". A "normal merge" is Subversion's present
> merge-tracking "svn merge", which will therefore merge revision ranges
> 1:12, 13:16, 17:28, 29:39 from feature_branch to trunk, omitting the
> three revisions that it sees have come from trunk.
> 

No, I meant 'normal present 1.5 merge'. To make the discussion more
complete let us say we cut '/feature_branch' from '/trunk@9' at r10.

Now 'svn-issue-2897-client merge trunk_wc ^/feature_branch' will merge
like the following,

To start with it has to merge 1:40 from ^/feature_branch, it will break
to following pieces based on 2 info.
info 1
- --------
What is already merged from ^/feature_branch to trunk_wc
Ans: implict /trunk:1-9

So effectively it has to merge r10:40 from ^/feature_branch

info 2(new ra call get_commit_and_merge_ranges)
- ---------------
This call will tell us

r40->[13, 17, 29]

Then we spilt the merge range [10:40] to [10:39] and [40]

[10:39] is normal svn 1.5 style normal merge.
[40] reflective merge.

>> It does a reflective merge of 40.
> 
> A "reflective merge" is the new algorithm that calculates and applies
> the "Z" diff that you mentioned above.
> 
> It is going to merge (r39:40 /feature_branch/test.c) to
> the /trunk/test.c working copy which has already got local scheduled
> changes in it from the r1:39 phase of this overall merge.

Yes.

> 
>> Let us say /feature_branch/test.c got a change from a merge at r40(Our
>> first merge)
>>
>> To calculate the *meaningful diff*, We apply -r13 change to OLDER, and
>> r17 change to OLDER, r29 change to OLDER.
> 
> For this "reflective merge" phase of just r40,
> 
> MINE is /trunk/test.c working copy (BASE is r40)
> OLDER is /feature_branch/test.c@39
> YOURS is /feature_branch/test.c@40
> 
> Applying diffs (-c13,17,29 /trunk/test.c) to OLDER is exactly what
> happened in r40. 
> 
>> Now in this case after the above 3 merges(r13, r17, r29) OLDER becomes
>> YOURS and hence no change is applied to MINE.
> 
> Yes, if test.c has never been modified on the branch in this example.
> 
> 

YES.

>> Case2:
>>
>> We merge -r13, -r17, -r29 from /trunk to /feature_branch + local
>> non-conflicting change to /feature_branch/test.c and commit at r40
>> (Assume this is the only synch up)
>>
>> Now we merge /feature_branch -r1:40 back to /trunk.
>>
>> It does a merge of -r1:39 which is a normal merge.
>> It does a reflective merge of r40.
>> Let us say /feature_branch/test.c got a change from a merge at r40(Our
>> first merge)
>>
>> To calculate the *meaningful diff*, We apply -r13 change to OLDER, and
>> r17 change to OLDER, r29 change to OLDER.
> 
> OK, the steps are the same so far.

YES.

> 
>> Now in this case after the above 3 merges(r13, r17, r29) OLDER becomes
>> (YOURS-'*local non conflicting changes*') and hence
> 
> Just to clarify, the "-" here is a "minus" or "subtraction" sign.

Yes you can think it of something like 'All of YOURS except *local non
conflicting changes*'

> 
>> (YOURS-YOURS +'*local non conflicting changes*') is applied to MINE. i.e
>> '*local non conflicting changes*'
> 
> OK, great: this is where Solution 2 wins over the previous solutions.
> 
> 
>> Case3:
>>
>> We merge -r13, -r17(conflicting), -r29 from /trunk to /feature_branch +
>> local non-conflicting change + conflict resolution to r17 to
>> /feature_branch/test.c and commit at r40
>> (Assume this is the only synch up)
>>
>> Now we merge /feature_branch -r1:40 back to /trunk.
>>
>> It does a merge of -r1:39 which is a normal merge.
>> It does a reflective merge of 40.
>> Let us say /feature_branch/test.c got a change from a merge at r40(Our
>> first merge)
>>
>> To calculate the *meaningful diff*, We apply -r13 change to OLDER, and
>> r29 change to OLDER. (SEE WE DONT APPLY r17 as it is a conflicting one)/
> 
> How does Subversion know that r17 was a conflicting one? Normally when
> Subversion wants to merge three successive changes from /trunk it
> applies a single diff (r1:39). So here you're saying Subversion tries to
> merge each individual source revision in turn, and notices which ones
> conflict?
>

The moment we identify 'r40' as reflective we have 'r13, r17 and r29' as
 merge_cmd_baton->reflected_ranges list.

Algo looks like this
for each range in merge_cmd_baton->reflected_ranges:
  copy OLDER to TMP
  retval = merge change represented by range from /trunk to TMP
  if retval == success:
    rename TMP OLDER


> Oh, I see: you must mean r13, r17, r29 were merged in three separate
> phases in the original merge. (This could be because some changes
> on /trunk in both of the ranges r13:16 and r17:28 had already been
> merged to the branch.) Is that right?
>

Yes they can be three independent merge or one merge with comma
separated revnum as supported by 1.5.



>> Now in this case after the above 2 merges(r13, r29) OLDER becomes
>> (YOURS-local non conflicting changes - r17 - conflict resolution to r17)
>> and hence
>> (YOURS-YOURS +local non conflicting changes + r17 + conflict resolution
>> to r17) is applied to MINE. i.e local non conflicting changes + r17 +
>> conflict resolution to r17. Moral: If synch up gave a conflict reverse
>> also would give.
> 
> So, this works well for non-conflicting changes, and for conflicting
> changes it produces conflicts by merging the same change again.
> 
> In summary:
> 
>   * Each source change that originally merged cleanly
> into /feature_branch will correctly be omitted from the re-integration
> to /trunk.
> 
>   * Each independent change made to /feature_branch, including changes
> made in the same revision as a merge, will correctly be re-integrated
> to /trunk.
> 
>   * Each conflict resolution performed on /feature_branch will correctly
> be included in the re-integration.

 Yes, after annoying with the user with the conflict again as mentioned
above.
> 
>   * Each revision from /trunk that conflicted when merged
> into /feature_branch will wrongly be included in the re-integration,
> creating new conflicts.
> 
Yes. May be we can be bit more smart here once we have this stuff working.
> 
> Have I understood this well enough?
> 

Yes, you understood well.


Sorry for the delay, it needs some quality undisturbed length of time to
think about 'reflective merges'.


Thanks for the review.

With regards
Kamesh Jayachandran
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFImdFo3WHvyO0YTCwRAsajAJ99f+BWNfbipZwLM509JAzqJVH0zQCggldN
vtcrKj35I4HaZXd/WSe66WQ=
=XuVU
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: issue-2897(reflective merge solution)

Posted by Julian Foad <ju...@btopenworld.com>.
Kamesh Jayachandran wrote: 
> Little bit background for the purpose of general audience.

Thanks, Kamesh. I have not followed the development of merge tracking
very closely, so I am one of the "general audience" trying to understand
this now.


> Problem:
> - ---------------------
> - - Create /feature_branch from /trunk.
> - - Synch up change(rX) from /trunk to /feature_branch and commit at rY
> along with some other local changes 'Z'.

To keep things simple and concrete, I will assume that, before this
merge, /feature_branch had already been synchronised with all of the
changes up to r(X-1) from /trunk.

If the user is following best practices, then the extra change 'Z' is
what is needed to resolve conflicts.

> - - When you merge /feature_branch to /trunk later it should merge only
> 'Z' not rX again to /trunk as it is not a change originated in
> /feature_branch.

OK.


> Available solution:
> - ------------------------

Solution 0: If you were to use Subversion 1.5's plain "svn
merge /feature_branch /trunk" command, then Subversion would omit rY
entirely, which is wrong because we need the conflict-resolution part of
the change. Without that part, the merge would throw up those same (or
similar) conflicts again, which the user would have to resolve again.

> Solution 1: One can use '--reintegrate' switch to 'merge' command, but
> it has lots of expectation on merge-target, which makes it unusable on a
> feature_branch with renamed subtrees which is common when someone does a
> refactoring there.(issue-3128).

The restriction that makes "--reintegrate" unusable with sub-tree
mergeinfo is something we could fix. It should check whether the subtree
mergeinfo meets the criteria for what situations the command can handle.
Currently it doesn't attempt to check this, and just assumes the
merrgeinfo doesn't meet the criteria.

I know solution 2 can handle many more general cases, but, in the simple
cases that "--reintegrate" is intended to handle, does your solution 2
achieve the same result, or does it have a different concept of what is
the right result? 

> Solution 2: issue-2897 branch 'extract non-reflective changes from a
> reflective revision and apply'.
>    This branch relies on availability of 'what revisions+merge_sources
> got merged in a commit'. It had this information implemented in 'sqlite'
>  our old 'mergeinfo' store.
>    In January we have changed this 'sqlite' based mergeinfo storage to a
> 'fs' based 'storage', which this implementation has not caught up with yet.
>    Given the issue-3128 in 'Solution 1' it seems 'Solution 2' is worth a
> try.
> 
> Solution 2 explained:
> 
> Part 1: Store mergeinfo_added in a commit in the back ground(Currently I
> have kind of local patch implementing the same for bdb, I will initiate
> a separate discussion once my local patch becomes clean and mature).

By "mergeinfo_added" do you mean the change in mergeinfo from r(X-1) to
r(X), for each revision X? So this is a cache to improve the speed of
mergeinfo calculations. 

> Part 2: Retrieve the 'reflective rev and reflected merge ranges' within
> a given merge range via the API. See API doc of
> 'svn_fs_get_commit_and_merge_ranges'
> http://svn.collab.net/repos/svn/branches/issue-2897/subversion/include/svn_fs.h
> 
> Part 3: Treat the reflective revision specially as detailed below by
> custom merge call back.
> 
> 
> a) reflective text changes:
>  reflective_merge_file_changed(mine, older, yours)
>  mine = WC file.
>  older = file@reflective_rev-1
>  yours = file@reflective_rev
> 
> 
> Normal merge_file_changed applies diff(older, yours) to mine.
> This diff can contain the changes 'synch up from trunk' as well as the
> local adhoc changes and conflict resolutions.
> 
> Applying the diff directly is going to cause lots of headaches.
> 
> So objective is to have a meaningful diff of *only local adhoc changes
> done before committing the merge*.
> 
> Let me take the example and explain,
> 
> 
> Case1:
> 
> We merge -r13, -r17, -r29 from /trunk to /feature_branch and commit at
> r40 (Assume this is the only synch up)

I'm trying to follow along. Is this cherry-picking, or a complete synch?
I'll give a running commentary of how I interpret this.

I think you mean trunk was changed only in those three revisions, and
we're doing a complete synch.

Let's say no conflict resolution was necessary, and no other changes
were made during this synch. 

> Now we merge /feature_branch -r1:40 back to /trunk.
> 
> It does a merge of -r1:39 which is a normal merge.

"It" is "Solution 2". A "normal merge" is Subversion's present
merge-tracking "svn merge", which will therefore merge revision ranges
1:12, 13:16, 17:28, 29:39 from feature_branch to trunk, omitting the
three revisions that it sees have come from trunk.

> It does a reflective merge of 40.

A "reflective merge" is the new algorithm that calculates and applies
the "Z" diff that you mentioned above.

It is going to merge (r39:40 /feature_branch/test.c) to
the /trunk/test.c working copy which has already got local scheduled
changes in it from the r1:39 phase of this overall merge.

> Let us say /feature_branch/test.c got a change from a merge at r40(Our
> first merge)
> 
> To calculate the *meaningful diff*, We apply -r13 change to OLDER, and
> r17 change to OLDER, r29 change to OLDER.

For this "reflective merge" phase of just r40,

MINE is /trunk/test.c working copy (BASE is r40)
OLDER is /feature_branch/test.c@39
YOURS is /feature_branch/test.c@40

Applying diffs (-c13,17,29 /trunk/test.c) to OLDER is exactly what
happened in r40. 

> Now in this case after the above 3 merges(r13, r17, r29) OLDER becomes
> YOURS and hence no change is applied to MINE.

Yes, if test.c has never been modified on the branch in this example.


> Case2:
> 
> We merge -r13, -r17, -r29 from /trunk to /feature_branch + local
> non-conflicting change to /feature_branch/test.c and commit at r40
> (Assume this is the only synch up)
> 
> Now we merge /feature_branch -r1:40 back to /trunk.
> 
> It does a merge of -r1:39 which is a normal merge.
> It does a reflective merge of r40.
> Let us say /feature_branch/test.c got a change from a merge at r40(Our
> first merge)
> 
> To calculate the *meaningful diff*, We apply -r13 change to OLDER, and
> r17 change to OLDER, r29 change to OLDER.

OK, the steps are the same so far.

> Now in this case after the above 3 merges(r13, r17, r29) OLDER becomes
> (YOURS-'*local non conflicting changes*') and hence

Just to clarify, the "-" here is a "minus" or "subtraction" sign.

> (YOURS-YOURS +'*local non conflicting changes*') is applied to MINE. i.e
> '*local non conflicting changes*'

OK, great: this is where Solution 2 wins over the previous solutions.


> Case3:
> 
> We merge -r13, -r17(conflicting), -r29 from /trunk to /feature_branch +
> local non-conflicting change + conflict resolution to r17 to
> /feature_branch/test.c and commit at r40
> (Assume this is the only synch up)
> 
> Now we merge /feature_branch -r1:40 back to /trunk.
> 
> It does a merge of -r1:39 which is a normal merge.
> It does a reflective merge of 40.
> Let us say /feature_branch/test.c got a change from a merge at r40(Our
> first merge)
> 
> To calculate the *meaningful diff*, We apply -r13 change to OLDER, and
> r29 change to OLDER. (SEE WE DONT APPLY r17 as it is a conflicting one)/

How does Subversion know that r17 was a conflicting one? Normally when
Subversion wants to merge three successive changes from /trunk it
applies a single diff (r1:39). So here you're saying Subversion tries to
merge each individual source revision in turn, and notices which ones
conflict?

Oh, I see: you must mean r13, r17, r29 were merged in three separate
phases in the original merge. (This could be because some changes
on /trunk in both of the ranges r13:16 and r17:28 had already been
merged to the branch.) Is that right?

> Now in this case after the above 2 merges(r13, r29) OLDER becomes
> (YOURS-local non conflicting changes - r17 - conflict resolution to r17)
> and hence
> (YOURS-YOURS +local non conflicting changes + r17 + conflict resolution
> to r17) is applied to MINE. i.e local non conflicting changes + r17 +
> conflict resolution to r17. Moral: If synch up gave a conflict reverse
> also would give.

So, this works well for non-conflicting changes, and for conflicting
changes it produces conflicts by merging the same change again.

In summary:

  * Each source change that originally merged cleanly
into /feature_branch will correctly be omitted from the re-integration
to /trunk.

  * Each independent change made to /feature_branch, including changes
made in the same revision as a merge, will correctly be re-integrated
to /trunk.

  * Each conflict resolution performed on /feature_branch will correctly
be included in the re-integration.

  * Each revision from /trunk that conflicted when merged
into /feature_branch will wrongly be included in the re-integration,
creating new conflicts.


Have I understood this well enough?

- Julian



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org