Posted to dev@subversion.apache.org by Julian Foad <ju...@wandisco.com> on 2011/11/10 18:13:41 UTC

Do we need to store redundant mergeinfo?

         2  3   4  5
BranchA--o-----------------------------------------
          \
           \      "A:2"
BranchB-----o---o----------------------------------
                 \
                  \  "A:2 B:3-4"
BranchC------------o-------------------------------

Philip and I were prompted by a customer to consider why Subversion copies
mergeinfo from branch to branch, in transitive merges (branch A -> branch B
-> branch C).  Why do we need mergeinfo on branch C that refers directly to
A?  If, as I believe to be the case, Subversion only supports merge
tracking if the branching graph is tree-shaped, then the only merges
allowed to or from branch C are those to or from branch B (and those to or
from any further branches to the "right" of it: D1, D2).  And thus the mergeinfo
on C that refers to A is not useful.  It's redundant anyway, in the sense
that if we hadn't stored it explicitly then we could crawl the branching
graph to find it, but that's not my concern; rather, I wonder if it is ever
actually providing any benefit.
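
To make "crawl the branching graph" concrete, here is a minimal sketch of
the idea. It is not how Subversion is implemented. It assumes a
hypothetical table, `recorded`, that stores for each branch only the
mergeinfo each merge commit recorded about its direct source, and a
hypothetical helper, `crawl`, that derives the transitive entries by
following those records:

```python
from typing import Dict, List, Set, Tuple

# Mergeinfo maps a merge source to the set of source revisions merged.
Mergeinfo = Dict[str, Set[int]]

# Hypothetical storage: recorded[branch][rev] is the mergeinfo that commit
# `rev` on `branch` recorded about its direct merge source only.
Recorded = Dict[str, Dict[int, Mergeinfo]]


def crawl(branch: str, recorded: Recorded) -> Mergeinfo:
    """Derive direct plus transitive mergeinfo for `branch` by following
    every merged source revision back to what that revision itself merged."""
    result: Mergeinfo = {}
    pending: List[Mergeinfo] = list(recorded.get(branch, {}).values())
    seen: Set[Tuple[str, int]] = set()
    while pending:
        info = pending.pop()
        for source, revs in info.items():
            for rev in revs:
                if (source, rev) in seen:
                    continue  # already followed this source revision
                seen.add((source, rev))
                result.setdefault(source, set()).add(rev)
                nested = recorded.get(source, {}).get(rev)
                if nested:
                    pending.append(nested)
    return result


# The diagram above: r3 on B merged A:2, and r5 on C merged B:3-4.
recorded = {
    "B": {3: {"A": {2}}},
    "C": {5: {"B": {3, 4}}},
}
print(crawl("C", recorded) == {"A": {2}, "B": {3, 4}})  # True
```

In this model the "A:2" entry on branch C is recoverable without being
stored, which is the sense in which it is redundant. The sketch ignores
later corrections to mergeinfo made in other revisions.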

It seems to me that we must have done this (propagate mergeinfo) because we
intended that Subversion's merging should support merging patterns more
complex than that.  But do we?  The big question for me at the moment is:
do people in reality rely on Subversion doing kinds of merging that make
use of this transitive mergeinfo?

- Julian

Re: Do we need to store redundant mergeinfo?

Posted by Philip Martin <ph...@wandisco.com>.

Julian Foad <ju...@wandisco.com> writes:

>          2  3   4  5
> BranchA--o-----------------------------------------
>           \
>            \      "A:2"
> BranchB-----o---o----------------------------------
>                  \
>                   \  "A:2 B:3-4"
> BranchC------------o-------------------------------
>
> Philip and I were prompted by a customer to consider why Subversion copies
> mergeinfo from branch to branch, in transitive merges (branch A -> branch B
> -> branch C).  Why do we need mergeinfo on branch C that refers directly to
> A?  If, as I believe to be the case, Subversion only supports merge
> tracking if the branching graph is tree-shaped, then the only merges
> allowed to or from branch C are those to or from branch B (and those to or
> any further branches to the "right" of it: D1, D2).  And thus the mergeinfo
> on C that refers to A is not useful.  It's redundant anyway, in the sense
> that if we hadn't stored it explicitly then we could crawl the branching
> graph to find it, but that's not my concern; rather, I wonder if it is ever
> actually providing any benefit.

Further, operations such as "log -g" and "blame -g" that do require the
transitive mergeinfo generally cope if the redundant information is not
stored, because the server follows the merges back and retrieves the
information from the other branches.  Not storing the redundant
information may even reduce the server load and make these operations
more efficient. I suppose it would fall down if a record-only merge is
made to correct the mergeinfo on B before the merge from B to C.  In
that case we rely on the corrected transitive mergeinfo being present on
C, and if we simply follow the mergeinfo to B we won't always discover
the mergeinfo change on a different revision.

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: Do we need to store redundant mergeinfo?

Posted by Mark Phippard <ma...@gmail.com>.

On Thu, Nov 10, 2011 at 1:13 PM, Philip Martin <ph...@wandisco.com> wrote:

> > This statement is not true.  You can still merge BranchA to BranchC in
> > the above scenario.  SVN does not have any limits on where you can merge
> > from and to.
>
> Yes, Subversion allows one to do it, but it doesn't work reliably.  It's
> the standard cyclic merge problem.  If we mix merges that go A-to-B-to-C
> with merges from A-to-C then conflicts can occur.  You must know this:

Yes I am aware that the cyclic merge issues would apply here as they do
elsewhere, but that is different than saying we do not support these
scenarios.  Even in the cyclic merge scenario Subversion is not preventing
you from doing a merge; it is simply not doing it as well as one could hope,
and it leads to conflicts that you should not have to resolve.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Do we need to store redundant mergeinfo?

Posted by Philip Martin <ph...@wandisco.com>.

Mark Phippard <ma...@gmail.com> writes:

> On Thu, Nov 10, 2011 at 12:13 PM, Julian Foad <ju...@wandisco.com> wrote:
>
>>          2  3   4  5
>> BranchA--o-----------------------------------------
>>           \
>>            \      "A:2"
>> BranchB-----o---o----------------------------------
>>                  \
>>                   \  "A:2 B:3-4"
>> BranchC------------o-------------------------------
>>
>> Philip and I were prompted by a customer to consider why Subversion copies
>> mergeinfo from branch to branch, in transitive merges (branch A -> branch B
>> -> branch C).  Why do we need mergeinfo on branch C that refers directly to
>> A?  If, as I believe to be the case, Subversion only supports merge
>> tracking if the branching graph is tree-shaped, then the only merges
>> allowed to or from branch C are those to or from branch B (and those to or
>> any further branches to the "right" of it: D1, D2).
>>
>>
> This statement is not true.  You can still merge BranchA to BranchC in the
> above scenario.  SVN does not have any limits on where you can merge from
> and to.

Yes, Subversion allows one to do it, but it doesn't work reliably.  It's
the standard cyclic merge problem.  If we mix merges that go A-to-B-to-C
with merges from A-to-C then conflicts can occur.  You must know this:

svnadmin create repo
svn mkdir -mm file://`pwd`/repo/A
svn cp -mm file://`pwd`/repo/A ^/B
svn cp -mm file://`pwd`/repo/A ^/C
svn mkdir -mm file://`pwd`/repo/A/Q1
svn mkdir -mm file://`pwd`/repo/A/Q2
svn co file://`pwd`/repo/B wc
svn merge ^/A@4 wc               # A-to-B
svn ci -mm wc
svn sw ^/C wc
svn merge ^/B wc                 # (A-to-)B-to-C
svn ci -mm wc
svn merge ^/A wc                 # A-to-C
svn ci -mm wc
svn sw ^/B wc
svn merge ^/A wc                 # A-to-B
svn ci -mm wc
svn sw ^/C wc
svn merge ^/B wc                 # (A-to-)B-to-C  conflict!

This is a known limitation of merge tracking.  How can the user avoid
such problems?  The only real way is to avoid cyclic merges.  If we
start from the premise that the user should not do cyclic merges because
of the limitations of merge tracking then the redundant info has no
benefit.  The only time the redundant info is useful is when the user is
doing merges that don't always "work".

RE: Do we need to store redundant mergeinfo?

Posted by Bob Jenkins <rj...@collab.net>.

Yes this feature is used by some enterprises (what percentage I would
have no idea, but it is always seen as a positive in training sessions).
I don't think we had the time or interest to put in extra functionality
that didn't come from community requirements.

 

Again, I don't think this list is an adequate way to evaluate whether
something is being used because the true customer base is not
subscribed.

 

Bob Jenkins

 

________________________________

From: Mark Phippard [mailto:markphip@gmail.com] 
Sent: Thursday, November 10, 2011 12:42 PM
To: Julian Foad
Cc: dev@subversion.apache.org
Subject: Re: Do we need to store redundant mergeinfo?

 

On Thu, Nov 10, 2011 at 12:13 PM, Julian Foad <ju...@wandisco.com>
wrote:

>          2  3   4  5
> BranchA--o-----------------------------------------
>           \
>            \      "A:2"
> BranchB-----o---o----------------------------------
>                  \
>                   \  "A:2 B:3-4"
> BranchC------------o-------------------------------
>
> Philip and I were prompted by a customer to consider why Subversion copies
> mergeinfo from branch to branch, in transitive merges (branch A -> branch B
> -> branch C).  Why do we need mergeinfo on branch C that refers directly to
> A?  If, as I believe to be the case, Subversion only supports merge
> tracking if the branching graph is tree-shaped, then the only merges
> allowed to or from branch C are those to or from branch B (and those to or
> any further branches to the "right" of it: D1, D2).

 

This statement is not true.  You can still merge BranchA to BranchC in
the above scenario.  SVN does not have any limits on where you can merge
from and to.

 

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Do we need to store redundant mergeinfo?

Posted by Mark Phippard <ma...@gmail.com>.

On Thu, Nov 10, 2011 at 12:13 PM, Julian Foad <ju...@wandisco.com> wrote:

>          2  3   4  5
> BranchA--o-----------------------------------------
>           \
>            \      "A:2"
> BranchB-----o---o----------------------------------
>                  \
>                   \  "A:2 B:3-4"
> BranchC------------o-------------------------------
>
> Philip and I were prompted by a customer to consider why Subversion copies
> mergeinfo from branch to branch, in transitive merges (branch A -> branch B
> -> branch C).  Why do we need mergeinfo on branch C that refers directly to
> A?  If, as I believe to be the case, Subversion only supports merge
> tracking if the branching graph is tree-shaped, then the only merges
> allowed to or from branch C are those to or from branch B (and those to or
> any further branches to the "right" of it: D1, D2).
>
>
This statement is not true.  You can still merge BranchA to BranchC in the
above scenario.  SVN does not have any limits on where you can merge from
and to.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Do we need to store redundant mergeinfo?

Posted by Daniel Shahaf <da...@elego.de>.

Julian Foad wrote on Thu, Nov 10, 2011 at 17:13:41 +0000:
>          2  3   4  5
> BranchA--o-----------------------------------------
>           \
>            \      "A:2"
> BranchB-----o---o----------------------------------
>                  \
>                   \  "A:2 B:3-4"
> BranchC------------o-------------------------------
> 
> Philip and I were prompted by a customer to consider why Subversion copies
> mergeinfo from branch to branch, in transitive merges (branch A -> branch B
> -> branch C).  Why do we need mergeinfo on branch C that refers directly to
> A?  If, as I believe to be the case, Subversion only supports merge
> tracking if the branching graph is tree-shaped, [...]

Can you define "tree-shaped"?  Do you mean "DAG-shaped"? Does your
definition allow merging between sibling feature branches?

> 
> It seems to me that we must have done this (propagate mergeinfo) because we
> intended that Subversion's merging should support merging patterns more
> complex than that.  But do we?  The big question for me at the moment is:
> do people in reality rely on Subversion doing kinds of merging that make
> use of this transitive mergeinfo?
> 
> - Julian

Re: Do we need to store redundant mergeinfo?

Posted by Daniel Shahaf <da...@elego.de>.

Stefan Sperling wrote on Fri, Nov 11, 2011 at 13:31:36 +0100:
> If an algorithm can't do this, then a sane version control system
> must flag a conflict and let a human resolve it.
...
> Subversion's merge-tracking, because a cyclic merge will always cause
> spurious conflicts, in the general case.

Just a note.  You keep talking about "the general case" (in both of
these quotes), but there is also the case that an algorithm exists that
solves the 99% case --- in which case we may have the option of
implementing it and flagging conflicts the rest (1%) of the time.

(This is analogous to NP problems which are NP-hard only on 1% of their
instances.)

Re: Do we need to store redundant mergeinfo?

Posted by Stefan Sperling <st...@elego.de>.

On Fri, Nov 11, 2011 at 11:31:55AM +0000, Philip Martin wrote:
> Philip Martin <ph...@wandisco.com> writes:
> 
> > Paul Burba <pt...@gmail.com> writes:
> >
> >> Does this answer your question at all (or did I answer a different
> >> question ;-)
> >
> > That's the cyclic merge again.  Storing the transitive mergeinfo makes
> > cyclic merge work in some circumstances but not in others.  What we are
> > trying to determine is whether that is the only use for transitive
> > mergeinfo.
> 
> A bit of confusion on my part, failing to recognise the difference
> between catch-up merge and a cherry-pick merge.  Cyclic catch-up merges
> can lead to conflicts due to merge tracking limitations, and the
> transitive mergeinfo doesn't really help.  But I can see that for
> cherry-pick merges the transitive mergeinfo will probably be useful in
> more cases.

I still wonder whether the cyclic merge problem can even be solved
in theory. I mean, given a change rX which is the result of merges
of "original" changes rA, rB, rC, and rD, merged across some arbitrary
number of branches, plus additional arbitrary changes for conflict
resolution thrown in, is there an algorithm to pick apart the change rX
into its constituent parts rA, rB, rC, rD, and conflict resolution changes?
If an algorithm can't do this, then a sane version control system
must flag a conflict and let a human resolve it.

I'm not saying that Subversion's current handling of these situations
is as good as it could be. But I doubt that focusing on fixing cyclic
merges is the right approach. Our problem right now is that we flag
conflicts where trivial resolutions exist. Subversion needs to get better
at resolving trivial conflicts at merge time, especially those involving
renames (it is already pretty good at everything not involving a rename).
I suspect that this is what other tools, like git, do a much better job
at right now. I think it is why people get the impression that git
is "better" at merging than Subversion. It is not. Rather, it is better
at resolving trivial conflicts.

Git has the same fundamental problem when it comes to cyclic merging.
In fact, git developers actively discourage their users from doing
cherry-picking because cherry-picked revisions throw a wrench into
their DAG-based merge-tracking functionality. So one could argue that
git's merge-tracking is broken because it doesn't support cherry-picking.
But again, that's beside the point. It is no more "broken" than
Subversion's merge-tracking, because a cyclic merge will always cause
spurious conflicts, in the general case.

Re: Do we need to store redundant mergeinfo?

Posted by Philip Martin <ph...@wandisco.com>.

Philip Martin <ph...@wandisco.com> writes:

> Paul Burba <pt...@gmail.com> writes:
>
>> Does this answer your question at all (or did I answer a different
>> question ;-)
>
> That's the cyclic merge again.  Storing the transitive mergeinfo makes
> cyclic merge work in some circumstances but not in others.  What we are
> trying to determine is whether that is the only use for transitive
> mergeinfo.

A bit of confusion on my part, failing to recognise the difference
between catch-up merge and a cherry-pick merge.  Cyclic catch-up merges
can lead to conflicts due to merge tracking limitations, and the
transitive mergeinfo doesn't really help.  But I can see that for
cherry-pick merges the transitive mergeinfo will probably be useful in
more cases.

Re: Do we need to store redundant mergeinfo?

Posted by Philip Martin <ph...@wandisco.com>.

Paul Burba <pt...@gmail.com> writes:

> Does this answer your question at all (or did I answer a different
> question ;-)

That's the cyclic merge again.  Storing the transitive mergeinfo makes
cyclic merge work in some circumstances but not in others.  What we are
trying to determine is whether that is the only use for transitive
mergeinfo.

Re: Do we need to store redundant mergeinfo?

Posted by Paul Burba <pt...@gmail.com>.

On Thu, Nov 10, 2011 at 12:13 PM, Julian Foad <ju...@wandisco.com> wrote:
>          2  3   4  5
> BranchA--o-----------------------------------------
>           \
>            \      "A:2"
> BranchB-----o---o----------------------------------
>                  \
>                   \  "A:2 B:3-4"
> BranchC------------o-------------------------------
>
> Philip and I were prompted by a customer to consider why Subversion copies
> mergeinfo from branch to branch, in transitive merges (branch A -> branch B
> -> branch C).  Why do we need mergeinfo on branch C that refers directly to
> A?

Hi Julian,

It's not clear from your diagram: is BranchB a copy of BranchA or a
copy of their common parent?  If the latter we do need to copy the
mergeinfo contained in the diff in merges from BranchB to BranchC, in
addition to recording mergeinfo describing that merge.  For example:

# Start with a vanilla greek tree from our test suite.
# Make two branches:

  >svn copy A branchA
  A         branchA

  >svn copy A branchB
  A         branchB

  >svn ci -m "create two branches from 'trunk'"
  Adding         branchA
  Adding         branchB

  Committed revision 2.

# Make a change on our "trunk":

  >echo trunk edit > A\mu

  >svn ci -m "trunk edit"
  Sending        A\mu
  Transmitting file data .
  Committed revision 3.

# Merge that "trunk" change to our first branch:

  >svn merge ^^/A branchA -c3
  --- Merging r3 into 'branchA':
  U    branchA\mu
  --- Recording mergeinfo for merge of r3 into 'branchA':
   U   branchA

  >svn pl -vR
  Properties on 'branchA':
    svn:mergeinfo
      /A:3

  >svn ci -m "Merge a trunk change from ^/A to ^/branchA"
  Sending        branchA
  Sending        branchA\mu
  Transmitting file data .
  Committed revision 4.

# Now merge that merge from branchA to branchB:

  >svn merge ^^/branchA branchB -c4
  --- Merging r4 into 'branchB':
  U    branchB\mu
   U   branchB
  --- Recording mergeinfo for merge of r4 into 'branchB':
   G   branchB

  >svn st
   M      branchB
  M       branchB\mu

# The mergeinfo from the first merge is copied and new mergeinfo
# is set to describe *this* merge:

  >svn diff --depth empty branchB
  Index: branchB
  ===================================================================
  --- branchB     (revision 2)
  +++ branchB     (working copy)

  Property changes on: branchB
  ___________________________________________________________________
  Added: svn:mergeinfo
     Merged /A:r3
     Merged /branchA:r4

# Commit the second merge:

  >svn ci -m "Merge from ^/branchA to ^/branchB of the prior merge
  from ^/A to /branchA"
  Sending        branchB
  Sending        branchB\mu
  Transmitting file data .
  Committed revision 5.

# Make a local change to branchB that would conflict with r3
# if that revision were re-merged:

  >svn up -q

  >echo local edit > branchB\mu

# Try to remerge r3 from ^/A to branchB, this is a non-conflicting
# noop because of the mergeinfo '/A:3' on the target carried from
# the merge in r5:

  >svn merge ^^/A branchB -c3
  --- Recording mergeinfo for merge of r3 into 'branchB':
   U   branchB

  >svn revert -R .
  Reverted 'branchB\mu'

# Now let's manually erase that mergeinfo with a propset:

  >svn ps svn:mergeinfo /branchA:4 branchB
  property 'svn:mergeinfo' set on 'branchB'

  >svn diff
  Index: branchB
  ===================================================================
  --- branchB     (revision 5)
  +++ branchB     (working copy)

  Property changes on: branchB
  ___________________________________________________________________
  Modified: svn:mergeinfo
     Reverse-merged /A:r3

  >svn pl -vR
  Properties on 'branchA':
    svn:mergeinfo
      /A:3
  Properties on 'branchB':
    svn:mergeinfo
      /branchA:4

  >svn ci -m "Erase transitive mergeinfo"
  Sending        branchB

  Committed revision 6.

# It's probably obvious what the problem is going to be...
# ...but if not, now let's try that merge of r3 from ^/A to
# branchB again:

  >svn up -q

# Again we make a local change to branchB that would
# conflict with r3:

  >echo local edit > branchB\mu

# And BOOM:

  >svn merge ^^/A branchB -c3
  Conflict discovered in
'C:/SVN/src-trunk/Debug/subversion/tests/cmdline/svn-test-work/working_copies/merge_tests-123/branchB/mu'.
  Select: (p) postpone, (df) diff-full, (e) edit,
          (mc) mine-conflict, (tc) theirs-conflict,
          (s) show all options: p
  --- Merging r3 into 'branchB':
  C    branchB\mu
  --- Recording mergeinfo for merge of r3 into 'branchB':
   U   branchB
  Summary of conflicts:
    Text conflicts: 1
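
The difference between the two merge attempts above can be sketched as a
set subtraction. This is a simplified model, not Subversion's actual
algorithm (which, among other things, also considers a branch's own
history). The function `revisions_to_apply` and the two dictionaries are
hypothetical names used only for illustration:

```python
from typing import Dict, Set

# Mergeinfo maps a merge source path to the set of revisions already merged.
Mergeinfo = Dict[str, Set[int]]


def revisions_to_apply(target_info: Mergeinfo, source: str,
                       requested: Set[int]) -> Set[int]:
    """Return the requested revisions that the target's mergeinfo does not
    already record as merged from `source`."""
    return requested - target_info.get(source, set())


# branchB after r5: the transitive '/A:3' entry is present.
with_transitive = {"/A": {3}, "/branchA": {4}}
# branchB after r6: the transitive entry was erased with a propset.
without_transitive = {"/branchA": {4}}

print(revisions_to_apply(with_transitive, "/A", {3}))     # set()
print(revisions_to_apply(without_transitive, "/A", {3}))  # {3}
```

With the transitive entry in place nothing is left to apply, so the merge
is a no-op. Without it, r3 is applied a second time on top of the local
edit, which is where the text conflict comes from.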

Now if instead you meant that BranchC is a copy of BranchB, which is itself
a copy of BranchA, then we still have a similar problem.  I won't show the
whole example, but suppose that in the above example we instead copied ^/A
to ^/branchA in r2, copied ^/branchA to ^/branchB in r3, made an edit to
^/A/mu in r4, merged r4 from ^/A to ^/branchA in r5, and merged r5 from
^/branchA to ^/branchB in r6.  Erasing the transitive mergeinfo then leads
to the same conflict:

  >svn ps svn:mergeinfo /branchA:5 branchB
  property 'svn:mergeinfo' set on 'branchB'

  >echo local edit > branchB\mu

  >svn merge ^^/A branchB -c4
  Conflict discovered in
'C:/SVN/src-trunk/Debug/subversion/tests/cmdline/svn-test-work/working_copies/merge_tests-124/branchB/mu'.
  Select: (p) postpone, (df) diff-full, (e) edit,
          (mc) mine-conflict, (tc) theirs-conflict,
          (s) show all options: p
  --- Merging r4 into 'branchB':
  C    branchB\mu
  --- Recording mergeinfo for merge of r4 into 'branchB':
   G   branchB
  Summary of conflicts:
    Text conflicts: 1

Does this answer your question at all (or did I answer a different question ;-)

Paul

> If, as I believe to be the case, Subversion only supports merge tracking
> if the branching graph is tree-shaped, then the only merges allowed to or
> from branch C are those to or from branch B (and those to or any further
> branches to the "right" of it: D1, D2).  And thus the mergeinfo on C that
> refers to A is not useful.  It's redundant anyway, in the sense that if we
> hadn't stored it explicitly then we could crawl the branching graph to find
> it, but that's not my concern; rather, I wonder if it is ever actually
> providing any benefit.
>
> It seems to me that we must have done this (propagate mergeinfo) because we
> intended that Subversion's merging should support merging patterns more
> complex than that.  But do we?  The big question for me at the moment is: do
> people in reality rely on Subversion doing kinds of merging that make use of
> this transitive mergeinfo?
>
> - Julian
>
>