You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@subversion.apache.org by Stefan Fuhrmann <st...@alice-dsl.de> on 2012/04/19 12:48:44 UTC

Merge policies

Hi all,

After having a closer look at merge and discussing it
with Julian on IRC, there seems to be no silver bullet.
However, we identified a few things that could be changed
and set of constellations that make merge harder than
it needs to be.

For the first, there will be another post soon. The second
boils down to policy. Luckily, SVN has a mechanism to
enforce policies: server-side hook scripts. My proposal
is to develop a small set of scripts that a user can
combine to prevent situations that her life harder than
necessary. This should give us enough time to improve
the merge logic inside the SVN libs.

The following pre-commit scripts / policies would be useful.

* Common parts [not a policy]
   We first check whether the commit contains a changed
   svn:merge-info property. This limits the performance
   impact on non-merge commits and we need to identify
   all changed svn:merge-info anyway.

   Also, the merges that happened on the source branch
   from a different location than the target branch are
   of no interest for the policy checkers. E.g.:

   r20: merge r19 from ^/sub-branch to ^/branch
   txn: merge r10-20 from ^/branch to ^/trunk
   Both merges will show up in the merge-info delta but
   we only need to evaluate the second one.

* Strict merge hierarchy
   A merge from A->B is only allowed, if the copy-from
   of A is B or vice versa and the copy source has not
   been replaced since the copy). This prevents circular
   merges and others (note 1).

   In a more sophisticated implementation, we could identify /
   allow for renamed branches as well as A and B having
   the same relative path to some parents that form a
   direct branch (i.e. allow sub-tree merges).

* No sub-tree merges
   Like the above but without the check for parents.

* No aggregate merges
   There must only be one source branch, i.e. we can't
   merge from branches A and B to C in the same revision.

* No distributive merges
   For each path being merged (i.e. having a merge-info
   delta), the relative paths in source and target must
   correspond (i.e. start as the same and then may get
   renamed etc.). This is basically the same as the
   "sophisticated" part of the check for strict merges.

* No cherry picking
   Check that the source branch does not contain revisions
   that lie before the last to-be-merged revision but
   have neither been merged before nor are being merged
   right now.

* No criss-crossing
   Prevent situations like the criss-cross examples here:
   http://wiki.apache.org/subversion/SymmetricMerge

   For a merge A->B, abort if there has been a merge
   B->A after the last revision of A to be merged to B.
   This only valid for non-cherry-picking merges and
   only if the change sets of both merges overlap.

Except for the last one those checks will simply verify
that the user followed certain policies. They should,
therefore, rarely reject a commit.

Again, the user shall be free to combine (or not use)
these policies although not all combinations are meaningful.

Thoughts?

-- Stefan^2.


Note 1:

   One thing that we might want to support is integration
   branches where a temporary branch is being used as
   an intermediate merge target:

   integrate A->B as
   rN: copy B->A_integration
   rN+1: merge A->A_integration
   rN+x: ... various changes on A_integration
   rN+y: merge A_integration->B
   rN+y+1: delete A_integration

   These checks become more complicated, requires
   naming conventions for the integration branches etc.

Re: Merge policies

Posted by Geoff Rowell <ge...@gmail.com>.

On Thu, Apr 19, 2012 at 9:26 AM, Mark Phippard <ma...@gmail.com> wrote:
> I have no objection to coming up with some hook scripts to help
> enforce merge policies.  I recall seeing this project on users@:
>
> http://sourceforge.net/apps/mediawiki/svnhook/index.php?title=Main_Page
>
> It adds the ability to block subtree merges by rejecting commits that
> add mergeinfo below the root.

The company I work for has been using the SvnHook framework for years.
We use it to prevent sub-tree merging, among other things.
Unfortunately, I'm not a Python developer, so it's written in Perl.
I'd love to see the concept re-implemented in Python.
-- 
Geoff Rowell
geoff.rowell@gmail.com

Re: Merge policies

Posted by Julian Foad <ju...@btopenworld.com>.

Paul Burba wrote on 2012-04-23:

>>>  Do you have a pointer to some details about how a mixed-rev target WC 
>>> is handled?
> 
> (Julian, apologies if most of this is old-hat for you, I'm writing as
> much for the newly involved folks as I am for you)

Hi Paul.  Thanks for this detailed exposition.

> At the most basic level, mixed-revision WCs impact how mergeinfo is
> (or isn't) inherited[1].  The best place to start is looking at the
> doc string for svn_client__get_wc_mergeinfo.  Essentially, within the
> working copy, a CHILD path without explicit mergeinfo may inherit[2]
> mergeinfo from its working copy PARENT if:
[...]
> ### Still have that mixed-rev WC:
> 
>> svn st -v
>                  8        8 pburba       .
>                  7        1 jrandom      mu
[...]
>                  7        1 jrandom      D
>                  7        1 jrandom      D\H
>                  9        9 pburba       D\H\psi
> 
> ### A repeat of the cherry pick to the root of our branch (where the
> ### explicit mergeinfo is found) is a no-op:
> 
>> svn merge ^^/A . -c4 --allow-mixed-revisions
> --- Recording mergeinfo for merge of r4 into '.':
> U   .
> 
> ### But if we perform a subtree merge to 'D', the target doesn't
> ### inherit the explicit mergeinfo on '.' because of the mixed-rev
> ### inheritance rules, so the repository is contacted to see what
> ### mergeinfo D@7 inherits (which is none). The fact that a subtree
> ### of the target, D/H/psi *does* inherit mergeinfo reflecting the
> ### previous cherry-pick is not considered, so the merge is
> ### repeated, producing a spurious text conflict:
> 
>> svn merge ^^/A/D D -c4 --allow-mixed-revisions
> Conflict discovered in '[...]D/H/psi'.

So this is a case where we know how it should work, semantically, but it's not currently implemented.

> Situations similar to this are what lead us to disallow merges to
> mixed-rev working copies by default (i.e. adoption of the
> --allow-mixed-revisions option
> http://svn.haxx.se/dev/archive-2010-10/0000.shtml).
> 
> Certainly it would be possible to detect cases like this, but doing so
> in a performant manner is another question.

OK, thanks; it's good to know what we're dealing with here.

- Julian

Re: Merge policies

Posted by Paul Burba <pt...@gmail.com>.

On Fri, Apr 20, 2012 at 5:07 PM, Paul Burba <pt...@gmail.com> wrote:
> On Fri, Apr 20, 2012 at 6:17 AM, Julian Foad <ju...@btopenworld.com> wrote:
>> Paul Burba wrote:
>>
>>> On Thu, Apr 19, 2012 at 12:57 PM, Julian Foad wrote:
>>>>  Branko Čibej wrote:
>>>>>  By the way, I'm all for removing support for merging into
>>>>> mixed-revision
>>>>>  and/or switched-subtree working copies. There's too much room for
>>>>>  unexpected results in these cases.
>>>>
>>>>  I don't necessarily disagree, but that sounds a bit unsubstantiated.
>>>> Is there too much "room for unexpected results" because you don't  have
>>>> a mental model of what to expect?  If I were to figure out and describe
>>>> some sensible semantics and a use case, might you then change your mind?
>>>>
>>>>  FWIW, I have an inkling of how a mixed-rev target WC should work: it's
>>>> logically very similar to having different mergeinfo on different subtrees,
>>>> except that the differences are on the target side of the merge rather than
>>>> the  source side of the merge.  It requires some care to be sure that the
>>>> parent-to-child inheritance of mergeinfo in the WC remains valid where base
>>>> revision numbers differ.
>>>>
>>>>  That said, I can't imagine any valid reason to need to support a
>>>> mixed-rev target WC.
>>>>
>>>>  I have thought very little about about how switched subtrees should
>>>> work [and use cases].
>>>>
>>>>  Maybe Paul can help fill in some of the "how" and "why" gaps here?
>>>
>>> I believe I answered this question already, see:
>>> http://svn.haxx.se/dev/archive-2012-03/0632.shtml  If that doesn't
>>> sufficiently answer your questions let me know.
>>
>> Hi Paul.  Yes, what you wrote there makes sense as to how we handle switched subtrees and "why" in a logical sense.  I've just followed up with a reply to that email going into a bit more detail.
>
> I replied on that thread.
>
>> I feel that with this level of functional spec, Switched Subtrees is something I can now add to the Symmetric Merge design proposal.
>>
>> I see there's already a brief statement about sparse WCs too: <http://svn.apache.org/repos/asf/subversion/trunk/notes/merge-tracking/func-spec.html#sparse-checkouts>.  I'm not clear how the whole 'depth' thing works, though, if you add children to a directory that was initially sparse or exclude children from a directory that was initially depth non-empty.
>
> Ah, welcome to my world!  Yes, this is a tricky situation.  If our
> merge target is shallow (i.e. depth < infinity) and the merge applies
> changes that would effect paths *deeper* than the immediate children
> currently present, then those children are skipped and we set
> non-inheritable mergeinfo to reflect that these changes were not
> merged.  Where it gets whacky is if the merge *adds* paths which are
> immediate children of the children currently present (e.g. merge
> target at depth empty, merge adds a immediate file to the merge
> target).  Today we add the file, despite the fact that the target's
> depth is empty.
>
> Before you claw your eyes out in frustration, read the issue I just
> filed: http://subversion.tigris.org/issues/show_bug.cgi?id=4164  It
> spells out the problem fairly clearly (I hope!).
>
> Paul
>
>> Do you have a pointer to some details about how a mixed-rev target WC is handled?

(Julian, apologies if most of this is old-hat for you, I'm writing as
much for the newly involved folks as I am for you)

At the most basic level, mixed-revision WCs impact how mergeinfo is
(or isn't) inherited[1].  The best place to start is looking at the
doc string for svn_client__get_wc_mergeinfo.  Essentially, within the
working copy, a CHILD path without explicit mergeinfo may inherit[2]
mergeinfo from its working copy PARENT if:

PARENT_CHANGED_REV <= CHILD_BASE_REV <= PARENT_BASE_REV

If the above condition doesn't hold then we need to contact the
repository for CHILD's inherited mergeinfo @BASE_REV.

~~~~~

Ok, so what?  How does this impact merge-tracking aware merges?

1) When the root of a merge target has no explicit mergeinfo.  We need
to know the target's "merge history" before we can decide what to
merge, and this "merge history" is effectively made up of the target's
inherited or explicit mergeinfo and its implicit mergeinfo (a.k.a. its
natural history).  If we can find the target's inherited mergeinfo
from it's WC parents we do so, otherwise we contact the repository.
That's pretty straightforward.

2) Any time a merge has cause to set explicit mergeinfo on a path that
previously had no explicit mergeinfo, we need to find the path's
inherited mergeinfo (if any) and combine that with the new mergeinfo
we want to record (otherwise merge history can be lost).  There are
several different cases in which we do this, for example during a
shallow merge where we need to record non-inheritable mergeinfo on the
roots of subtrees into which the merge did not extend.

3) The third impact is probably the most important and is where
inherited mergeinfo is *not* considered: Mixed revision merge targets
where subtrees without explicit mergeinfo appear to be missing merge
history that is actually present.  A simple example will explain what
I mean:

### Given the WC for a branch at a uniform revision, with no missing
subtrees, and no
### explicit mergeinfo:

>svn up
Updating '.':
At revision 7.

>svn st

>svn pl -vR

### Cherry pick a single revision and commit that merge:

>svn merge ^^/A . -c4
--- Merging r4 into '.':
U    D\H\psi
--- Recording mergeinfo for merge of r4 into '.':
 U   .

>svn ci -m "cherry pick r4 from ^/A"
Sending        .
Sending        D\H\psi
Transmitting file data .
Committed revision 8.

### We obviously have a mixed-rev WC:

>svn st -v
                 8        8 pburba       .
                 7        1 jrandom      mu
                 7        1 jrandom      B
                 7        1 jrandom      B\lambda
                 7        1 jrandom      B\E
                 7        1 jrandom      B\E\alpha
                 7        1 jrandom      B\E\beta
                 7        1 jrandom      B\F
                 7        1 jrandom      C
                 7        1 jrandom      D
                 7        1 jrandom      D\gamma
                 7        1 jrandom      D\G
                 7        1 jrandom      D\G\rho
                 7        1 jrandom      D\G\pi
                 7        1 jrandom      D\G\tau
                 7        1 jrandom      D\H
                 7        1 jrandom      D\H\omega
                 8        8 pburba       D\H\psi
                 7        1 jrandom      D\H\chi

### Now make a change to the file which was modified by the cherry pick:

>echo branch edit > D\H\psi

>svn ci -m ""
Sending        D\H\psi
Transmitting file data .
Committed revision 9.

### Still have that mixed-rev WC:

>svn st -v
                 8        8 pburba       .
                 7        1 jrandom      mu
                 7        1 jrandom      B
                 7        1 jrandom      B\lambda
                 7        1 jrandom      B\E
                 7        1 jrandom      B\E\alpha
                 7        1 jrandom      B\E\beta
                 7        1 jrandom      B\F
                 7        1 jrandom      C
                 7        1 jrandom      D
                 7        1 jrandom      D\gamma
                 7        1 jrandom      D\G
                 7        1 jrandom      D\G\rho
                 7        1 jrandom      D\G\pi
                 7        1 jrandom      D\G\tau
                 7        1 jrandom      D\H
                 9        9 pburba       D\H\psi
                 7        1 jrandom      D\H\chi
                 7        1 jrandom      D\H\omega

### A repeat of the cherry pick to the root of our branch (where the explicit
### mergeinfo is found) is a no-op:

>svn merge ^^/A . -c4 --allow-mixed-revisions
--- Recording mergeinfo for merge of r4 into '.':
 U   .

>svn st

### But if we perform a subtree merge to 'D', the target doesn't inherit the
### explicit mergeinfo on '.' because of the mixed-rev inheritance rules, so the
### repository is contacted to see what mergeinfo D@7 inherits (which is none).
### The fact that a subtree of the target, D/H/psi *does* inherit mergeinfo
### reflecting the previous cherry-pick is not considered, so the merge is
### repeated, producing a spurious text conflict:

>svn merge ^^/A/D D -c4 --allow-mixed-revisions
Conflict discovered in
'C:/SVN/src-branch-1.7.x/Debug/subversion/tests/cmdline/svn-test-work/working_copies/merge_tests-125/A_COPY/D/H/psi'.
Select: (p) postpone, (df) diff-full, (e) edit,
        (mc) mine-conflict, (tc) theirs-conflict,
        (s) show all options: p
--- Merging r4 into 'D':
C    D\H\psi
--- Recording mergeinfo for merge of r4 into 'D':
 U   D
Summary of conflicts:
  Text conflicts: 1

Situations similar to this are what lead us to disallow merges to
mixed-rev working copies by default (i.e. adoption of the
--allow-mixed-revisions option
http://svn.haxx.se/dev/archive-2010-10/0000.shtml).

Certainly it would be possible to detect cases like this, but doing so
in a performant manner is another question.

Paul

[1] Anything that affects inheritance obviously also effects elision.
I.e. is the explicit mergeinfo on CHILD identical to the mergeinfo
CHILD would inherit from PARENT if CHILD had not explicit mergeinfo?
If so then the explicit mergeinfo on CHILD is redundant and can be
elided/removed.

[2] This is an oversimplification, really what we are talking about is
the whether the child may inherit across this mixed-revision boundary,
even if the inherited mergeinfo in question doesn't explicitly exist
on the parent.

Grand-parent (explicit mergeinfo)
  |
inheritance
allowed?
 |
 V
parent (no explicit mergeinfo)
  |
inheritance
allowed?
 |
 V
child (no explicit mergeinfo)

>> I have now added a section "Mixed-Rev, Switched, or Sparse WC" to <http://wiki.apache.org/subversion/SymmetricMerge>.
>>
>>
>>> As to removing support for merging into WCs with switched subtrees,
>>> let's be clear that this *does* work today (if you think otherwise
>>> please provide a use-case demonstrating what is broken).  Now maybe we
>>> want to restrict these types of merges to make further improvements
>>> possible (e.g. Julian's symmetric merge work), that I won't argue
>>> against, but I want to be clear that we are making the choice to
>>> remove some existing functionality as a trade-off in implementing new
>>> functionality, rather than fixing a broken feature.
>>
>> Totally clear, yes, if that's what we decide to do.
>>
>> - Julian

Re: Merge policies -- sparse, mixed-rev and switched target WCs

Posted by Stefan Fuhrmann <eq...@web.de>.

Am 24.04.2012 16:32, schrieb Julian Foad:
> Paul Burba wrote:
>
>> On Fri, Apr 20, 2012 at 6:17 AM, Julian Foad wrote:
>>>   I see there's already a brief statement about sparse WCs too:
>>> <http://svn.apache.org/repos/asf/subversion/trunk/notes/merge-tracking/func-spec.html#sparse-checkouts>.
>>> I'm not clear how the whole 'depth' thing works, though, if you add
>>> children to a directory that was initially sparse or exclude children
>>> from a  directory that was initially depth non-empty.
>> Ah, welcome to my world!  [...]
>>     Where it gets whacky is if the merge *adds* paths which are
>> immediate children of the children currently present (e.g. merge
>> target at depth empty, merge adds a immediate file to the merge
>> target).  Today we add the file, despite the fact that the target's
>> depth is empty.
>>
>> Before you claw your eyes out in frustration, read the issue I just
>> filed: http://subversion.tigris.org/issues/show_bug.cgi?id=4164  It
>> spells out the problem fairly clearly (I hope!).
> The sparse-directories design<notes/sparse-directories.txt>  doesn't seem to indicate the intended behaviour when update brings in a new child.  I've added a note about that to the end of that document.
>
> It seems that the problem is not *how* to make merge behave the right way to be compatible with sparse directories, but that we haven't decided what the Right Way is.

Observation on my part: I think it is ironic that a
centralized VCS tries to do de-centralized merges
with all the missing information and special cases
that it has to allow for on the client side. Those
special cases stem from features that you need to
have (sparse wc, mixed revs) as long as you want
to have client-side WCs.

Maybe, the "right way" to do merges would be
server-side merges. But I'm not advocating to put
any effort into that right away ;)
>
> Stefan Fuhrmann wrote:
>> Am 19.04.2012 18:57, schrieb Julian Foad:
>>>   FWIW, I have an inkling of how a mixed-rev target WC should work [...]
>>>
>>>   That said, I can't imagine any valid reason to need to support a
>>> mixed-rev target WC.
>> Well, in the end one will always commit against HEAD.
>> But there are two use-cases of different importance
>> that should be supported:
>>
>> * A WC is usually not at HEAD because commits don't
>>    update unchanged nodes. On branches, the WC will
>>    often contain the latest changes as few people work
>>    on that branch. A WC of /trunk can be expected to be
>>    outdated even if it was updated to HEAD shortly before
>>    the merge.
> Those are some valid observations.  Note that your third point doesn't result in a mixed-rev WC.  But it's not a big deal to ask the user to 'svn up' before merging, so it's not obvious that any of these situations mean we need to support merging into the WC without an update first.  (Of course I acknowledge that it would be *easier* for the user, if it doesn't cause other problems.)

For experience with large working copies, I would not
want SVN to ask me to update to HEAD whenever I
happen to fall behind. It is a nuisance already that I
often have to update before I can commit the merge
because SVN would not allow to modify the directory
properties (svn:merge-info) without the directory
content being up-to-date.

The issue with large working copies is clearly the time
factor: Non-merging users can commit quickly, i.e. often.
A merging user needs to update the whole WC and then
try to commit.
>> * Mixed-rev working copies are an intermediate step
>>    in creating arbitrary configurations via 'svn cp WC URL'.
>>    Merging into the copy target should be semantically
>>    identical to merging into a mixed-rev WC.
> Yes, I agree.
>
>>>   I have thought very little about about how switched subtrees should work. 
>>> However, I *can* imagine scenarios where the user has a large chunk of WC
>>> switched, and now wants to merge something into the whole WC knowing that
>>> it won't in fact touch the switched part.  That seems like a reasonable
>>> thing  to support: untouched paths being switched.  I haven't yet thought
>>> of a use  case for switched paths that are touched by the merge.
>> We could even be more permissive: as long as the merge
>> does not cross "switch boundaries" (switched / non-
>> switched or switched to different URL) as may allow
>> the merge.
> Let's say A/B is switched relative to A.
>
> You mean a merge where the specified target is A/B should succeed?  Yes,
>   I agree with that, and that's already documented in
> <http://svn.apache.org/repos/asf/subversion/trunk/notes/merge-tracking/func-spec.html#switched-paths>.
>
> At first I thought you meant a merge where the specified target is A
> could be allowed to succeed if it only touched paths inside A/B.  That
> doesn't seem like a good idea.

OK.

>> This could be a somewhat typical use-case for people
>> using switched WCs. I would expect that these users
>> want to merge into those sub-trees just like they
>> would also 'svn up' them.
>
> One thing is clear: this whole discussion is evidence that the way merging should work in a
>   WC with sparse directories, mixed revisions and/or switched subtrees is not exactly clear.  I accept that we have existing behaviour as precedent for the common cases, as well as somne level of rationale; I'm not trying to say we have no idea at all.

Same feeling here. SvnDay @Berlin would be an excellent opportunity
to discuss some these more fundamental questions.

As I'm writing this, I'm on my way to Antalya where I can discuss
with Cristian and Andy directly which concrete improvements
can be made right now. For the time being, I would like to focus
on the rename tracking part and support testing efforts. I would
expect that to minimize the coordination overhead.

-- Stefan^2.

Re: Merge policies -- sparse, mixed-rev and switched target WCs

Posted by Julian Foad <ju...@btopenworld.com>.

Paul Burba wrote:

> On Fri, Apr 20, 2012 at 6:17 AM, Julian Foad wrote:
>>  I see there's already a brief statement about sparse WCs too:
>> <http://svn.apache.org/repos/asf/subversion/trunk/notes/merge-tracking/func-spec.html#sparse-checkouts>.
>> I'm not clear how the whole 'depth' thing works, though, if you add
>> children to a directory that was initially sparse or exclude children
>> from a  directory that was initially depth non-empty.
> 
> Ah, welcome to my world!  [...]
>   Where it gets whacky is if the merge *adds* paths which are
> immediate children of the children currently present (e.g. merge
> target at depth empty, merge adds a immediate file to the merge
> target).  Today we add the file, despite the fact that the target's
> depth is empty.
> 
> Before you claw your eyes out in frustration, read the issue I just
> filed: http://subversion.tigris.org/issues/show_bug.cgi?id=4164  It
> spells out the problem fairly clearly (I hope!).

The sparse-directories design <notes/sparse-directories.txt> doesn't seem to indicate the intended behaviour when update brings in a new child.  I've added a note about that to the end of that document.

It seems that the problem is not *how* to make merge behave the right way to be compatible with sparse directories, but that we haven't decided what the Right Way is.


Stefan Fuhrmann wrote:
> Am 19.04.2012 18:57, schrieb Julian Foad:
>>  FWIW, I have an inkling of how a mixed-rev target WC should work [...]
>> 
>>  That said, I can't imagine any valid reason to need to support a 
>> mixed-rev target WC.
> 
> Well, in the end one will always commit against HEAD.
> But there are two use-cases of different importance
> that should be supported:
> 
> * A WC is usually not at HEAD because commits don't
>   update unchanged nodes. On branches, the WC will
>   often contain the latest changes as few people work
>   on that branch. A WC of /trunk can be expected to be
>   outdated even if it was updated to HEAD shortly before
>   the merge.

Those are some valid observations.  Note that your third point doesn't result in a mixed-rev WC.  But it's not a big deal to ask the user to 'svn up' before merging, so it's not obvious that any of these situations mean we need to support merging into the WC without an update first.  (Of course I acknowledge that it would be *easier* for the user, if it doesn't cause other problems.)

> * Mixed-rev working copies are an intermediate step
>   in creating arbitrary configurations via 'svn cp WC URL'.
>   Merging into the copy target should be semantically
>   identical to merging into a mixed-rev WC.

Yes, I agree.  

>>  I have thought very little about about how switched subtrees should work.  
>> However, I *can* imagine scenarios where the user has a large chunk of WC 
>> switched, and now wants to merge something into the whole WC knowing that 
>> it won't in fact touch the switched part.  That seems like a reasonable
>> thing  to support: untouched paths being switched.  I haven't yet thought
>> of a use  case for switched paths that are touched by the merge.
> 
> We could even be more permissive: as long as the merge
> does not cross "switch boundaries" (switched / non-
> switched or switched to different URL) as may allow
> the merge.

Let's say A/B is switched relative to A.

You mean a merge where the specified target is A/B should succeed?  Yes,
 I agree with that, and that's already documented in 
<http://svn.apache.org/repos/asf/subversion/trunk/notes/merge-tracking/func-spec.html#switched-paths>.

At first I thought you meant a merge where the specified target is A 
could be allowed to succeed if it only touched paths inside A/B.  That 
doesn't seem like a good idea.

> This could be a somewhat typical use-case for people
> using switched WCs. I would expect that these users
> want to merge into those sub-trees just like they
> would also 'svn up' them.


One thing is clear: this whole discussion is evidence that the way merging should work in a
 WC with sparse directories, mixed revisions and/or switched subtrees is not exactly clear.  I accept that we have existing behaviour as precedent for the common cases, as well as somne level of rationale; I'm not trying to say we have no idea at all.

- Julian

Re: Merge policies

Posted by Paul Burba <pt...@gmail.com>.

On Fri, Apr 20, 2012 at 6:17 AM, Julian Foad <ju...@btopenworld.com> wrote:
> Paul Burba wrote:
>
>> On Thu, Apr 19, 2012 at 12:57 PM, Julian Foad wrote:
>>>  Branko Čibej wrote:
>>>>  By the way, I'm all for removing support for merging into
>>>> mixed-revision
>>>>  and/or switched-subtree working copies. There's too much room for
>>>>  unexpected results in these cases.
>>>
>>>  I don't necessarily disagree, but that sounds a bit unsubstantiated.
>>> Is there too much "room for unexpected results" because you don't  have
>>> a mental model of what to expect?  If I were to figure out and describe
>>> some sensible semantics and a use case, might you then change your mind?
>>>
>>>  FWIW, I have an inkling of how a mixed-rev target WC should work: it's
>>> logically very similar to having different mergeinfo on different subtrees,
>>> except that the differences are on the target side of the merge rather than
>>> the  source side of the merge.  It requires some care to be sure that the
>>> parent-to-child inheritance of mergeinfo in the WC remains valid where base
>>> revision numbers differ.
>>>
>>>  That said, I can't imagine any valid reason to need to support a
>>> mixed-rev target WC.
>>>
>>>  I have thought very little about about how switched subtrees should
>>> work [and use cases].
>>>
>>>  Maybe Paul can help fill in some of the "how" and "why" gaps here?
>>
>> I believe I answered this question already, see:
>> http://svn.haxx.se/dev/archive-2012-03/0632.shtml  If that doesn't
>> sufficiently answer your questions let me know.
>
> Hi Paul.  Yes, what you wrote there makes sense as to how we handle switched subtrees and "why" in a logical sense.  I've just followed up with a reply to that email going into a bit more detail.

I replied on that thread.

> I feel that with this level of functional spec, Switched Subtrees is something I can now add to the Symmetric Merge design proposal.
>
> I see there's already a brief statement about sparse WCs too: <http://svn.apache.org/repos/asf/subversion/trunk/notes/merge-tracking/func-spec.html#sparse-checkouts>.  I'm not clear how the whole 'depth' thing works, though, if you add children to a directory that was initially sparse or exclude children from a directory that was initially depth non-empty.

Ah, welcome to my world!  Yes, this is a tricky situation.  If our
merge target is shallow (i.e. depth < infinity) and the merge applies
changes that would effect paths *deeper* than the immediate children
currently present, then those children are skipped and we set
non-inheritable mergeinfo to reflect that these changes were not
merged.  Where it gets whacky is if the merge *adds* paths which are
immediate children of the children currently present (e.g. merge
target at depth empty, merge adds a immediate file to the merge
target).  Today we add the file, despite the fact that the target's
depth is empty.

Before you claw your eyes out in frustration, read the issue I just
filed: http://subversion.tigris.org/issues/show_bug.cgi?id=4164  It
spells out the problem fairly clearly (I hope!).

Paul

> Do you have a pointer to some details about how a mixed-rev target WC is handled?
>
> I have now added a section "Mixed-Rev, Switched, or Sparse WC" to <http://wiki.apache.org/subversion/SymmetricMerge>.
>
>
>> As to removing support for merging into WCs with switched subtrees,
>> let's be clear that this *does* work today (if you think otherwise
>> please provide a use-case demonstrating what is broken).  Now maybe we
>> want to restrict these types of merges to make further improvements
>> possible (e.g. Julian's symmetric merge work), that I won't argue
>> against, but I want to be clear that we are making the choice to
>> remove some existing functionality as a trade-off in implementing new
>> functionality, rather than fixing a broken feature.
>
> Totally clear, yes, if that's what we decide to do.
>
> - Julian

Re: Merge policies

Posted by Julian Foad <ju...@btopenworld.com>.

Paul Burba wrote:

> On Thu, Apr 19, 2012 at 12:57 PM, Julian Foad wrote:
>>  Branko Čibej wrote:
>>>  By the way, I'm all for removing support for merging into
>>> mixed-revision
>>>  and/or switched-subtree working copies. There's too much room for
>>>  unexpected results in these cases.
>> 
>>  I don't necessarily disagree, but that sounds a bit unsubstantiated.
>> Is there too much "room for unexpected results" because you don't  have
>> a mental model of what to expect?  If I were to figure out and describe
>> some sensible semantics and a use case, might you then change your mind?
>> 
>>  FWIW, I have an inkling of how a mixed-rev target WC should work: it's
>> logically very similar to having different mergeinfo on different subtrees,
>> except that the differences are on the target side of the merge rather than
>> the  source side of the merge.  It requires some care to be sure that the
>> parent-to-child inheritance of mergeinfo in the WC remains valid where base
>> revision numbers differ.
>> 
>>  That said, I can't imagine any valid reason to need to support a 
>> mixed-rev target WC.
>> 
>>  I have thought very little about about how switched subtrees should
>> work [and use cases].
>> 
>>  Maybe Paul can help fill in some of the "how" and "why" gaps here?
> 
> I believe I answered this question already, see:
> http://svn.haxx.se/dev/archive-2012-03/0632.shtml  If that doesn't
> sufficiently answer your questions let me know.

Hi Paul.  Yes, what you wrote there makes sense as to how we handle switched subtrees and "why" in a logical sense.  I've just followed up with a reply to that email going into a bit more detail.  I feel that with this level of functional spec, Switched Subtrees is something I can now add to the Symmetric Merge design proposal.

I see there's already a brief statement about sparse WCs too: <http://svn.apache.org/repos/asf/subversion/trunk/notes/merge-tracking/func-spec.html#sparse-checkouts>.  I'm not clear how the whole 'depth' thing works, though, if you add children to a directory that was initially sparse or exclude children from a directory that was initially depth non-empty.

Do you have a pointer to some details about how a mixed-rev target WC is handled?

I have now added a section "Mixed-Rev, Switched, or Sparse WC" to <http://wiki.apache.org/subversion/SymmetricMerge>.

> As to removing support for merging into WCs with switched subtrees,
> let's be clear that this *does* work today (if you think otherwise
> please provide a use-case demonstrating what is broken).  Now maybe we
> want to restrict these types of merges to make further improvements
> possible (e.g. Julian's symmetric merge work), that I won't argue
> against, but I want to be clear that we are making the choice to
> remove some existing functionality as a trade-off in implementing new
> functionality, rather than fixing a broken feature.

Totally clear, yes, if that's what we decide to do.

- Julian

Re: Merge policies

Posted by Paul Burba <pt...@gmail.com>.

On Thu, Apr 19, 2012 at 12:57 PM, Julian Foad
<ju...@btopenworld.com> wrote:
> Branko Čibej wrote:
>
>> Julian Foad wrote:
>>>  To get symmetric behaviour, the problem is it's freakin' hard to
>>> untangle the subtree support and the mixed-rev support and the
>>> missing-subtree  support and everything from the basic merge algorithm
>>>  outline, in the existing code.  And the problem is not so much at a
>>>  coding level, but rather a matter of understanding what, in fact, the
>>> semantics are that we're intending to implement currently.
>>
>> By the way, I'm all for removing support for merging into mixed-revision
>> and/or switched-subtree working copies. There's too much room for
>> unexpected results in these cases.
>
> I don't necessarily disagree, but that sounds a bit unsubstantiated.  Is there too much "room for unexpected results" because you don't have a mental model of what to expect?  If I were to figure out and describe some sensible semantics and a use case, might you then change your mind?
>
> FWIW, I have an inkling of how a mixed-rev target WC should work: it's logically very similar to having different mergeinfo on different subtrees, except that the differences are on the target side of the merge rather than the source side of the merge.  It requires some care to be sure that the parent-to-child inheritance of mergeinfo in the WC remains valid where base revision numbers differ.
>
> That said, I can't imagine any valid reason to need to support a mixed-rev target WC.
>
> I have thought very little about about how switched subtrees should work.  However, I *can* imagine scenarios where the user has a large chunk of WC switched, and now wants to merge something into the whole WC knowing that it won't in fact touch the switched part.  That seems like a reasonable thing to support: untouched paths being switched.  I haven't yet thought of a use case for switched paths that are touched by the merge.
>
> Maybe Paul can help fill in some of the "how" and "why" gaps here?

I believe I answered this question already, see:
http://svn.haxx.se/dev/archive-2012-03/0632.shtml  If that doesn't
sufficiently answer your questions let me know.

As to removing support for merging into WCs with switched subtrees,
let's be clear that this *does* work today (if you think otherwise
please provide a use-case demonstrating what is broken).  Now maybe we
want to restrict these types of merges to make further improvements
possible (e.g. Julian's symmetric merge work), that I won't argue
against, but I want to be clear that we are making the choice to
remove some existing functionality as a trade-off in implementing new
functionality, rather than fixing a broken feature.

Paul

> - Julian

Re: Merge policies

Posted by Julian Foad <ju...@btopenworld.com>.




----- Original Message -----
> From: Branko Čibej <br...@apache.org>
> To: dev@subversion.apache.org
> Cc: 
> Sent: Thursday, 19 April 2012, 18:10
> Subject: Re: Merge policies
> 
> On 19.04.2012 18:57, Julian Foad wrote:
>>  Branko Čibej wrote:
>> 
>>>  Julian Foad wrote:
>>>>   To get symmetric behaviour, the problem is it's freakin' 
> hard to 
>>>>  untangle the subtree support and the mixed-rev support and the
>>>>  missing-subtree  support and everything from the basic merge 
> algorithm
>>>>   outline, in the existing code.  And the problem is not so much at 
> a
>>>>   coding level, but rather a matter of understanding what, in fact, 
> the
>>>>  semantics are that we're intending to implement currently.
>>>  By the way, I'm all for removing support for merging into 
> mixed-revision
>>>  and/or switched-subtree working copies. There's too much room for
>>>  unexpected results in these cases.
>>  I don't necessarily disagree, but that sounds a bit unsubstantiated.  
> Is there too much "room for unexpected results" because you don't 
> have a mental model of what to expect?  If I were to figure out and describe 
> some sensible semantics and a use case, might you then change your mind?
> 
> No, the semantics are reasonably clear. I mean unexpected results from
> the user's perspective, and the sheer horror of trying to merge
> svn:mergeinfo changes the repository happens to contain an
> already-committed newer merge on the same branch.

But the repo already having a committed merge later on the same branch is not specific to a mixed-rev WC.  That can happen any your WC isn't at HEAD.  Unless you've locked the branch against further commits and updated to head.  Is that what we should be checking for, instead?

- Julian

Re: Merge policies

Posted by Branko Čibej <br...@apache.org>.

On 19.04.2012 18:57, Julian Foad wrote:
> Branko Čibej wrote:
>
>> Julian Foad wrote:
>>>  To get symmetric behaviour, the problem is it's freakin' hard to 
>>> untangle the subtree support and the mixed-rev support and the
>>> missing-subtree  support and everything from the basic merge algorithm
>>>  outline, in the existing code.  And the problem is not so much at a
>>>  coding level, but rather a matter of understanding what, in fact, the
>>> semantics are that we're intending to implement currently.
>> By the way, I'm all for removing support for merging into mixed-revision
>> and/or switched-subtree working copies. There's too much room for
>> unexpected results in these cases.
> I don't necessarily disagree, but that sounds a bit unsubstantiated.  Is there too much "room for unexpected results" because you don't have a mental model of what to expect?  If I were to figure out and describe some sensible semantics and a use case, might you then change your mind?

No, the semantics are reasonably clear. I mean unexpected results from
the user's perspective, and the sheer horror of trying to merge
svn:mergeinfo changes the repository happens to contain an
already-committed newer merge on the same branch.

-- Brane

Re: Merge policies

Posted by Stefan Fuhrmann <eq...@web.de>.

Branko Čibej wrote:
> On 20.04.2012 23:45, Stefan Fuhrmann wrote:
>> We could even be more permissive: as long as the merge
>> does not cross "switch boundaries" (switched / non-
>> switched or switched to different URL) as may allow
>> the merge.
>>
>> This could be a somewhat typical use-case for people
>> using switched WCs. I would expect that these users
>> want to merge into those sub-trees just like they
>> would also 'svn up' them.
> Doesn't merging in a switched subtree almost by definition imply
> creating subtree mergeinfo? I was under the impression that we wanted to
> more or less forbid such merges at some point in the  future.

If the merge does not cross "switch boundaries",
at least in theory, we could record the merge info
at the respective branches' merge info. I understand
that the respective node is not readily available
but the problem is not entirely unsolvable.

-- Stefan^2.

Re: Merge policies

Posted by Branko Čibej <br...@apache.org>.

On 20.04.2012 23:45, Stefan Fuhrmann wrote:
> We could even be more permissive: as long as the merge
> does not cross "switch boundaries" (switched / non-
> switched or switched to different URL) as may allow
> the merge.
>
> This could be a somewhat typical use-case for people
> using switched WCs. I would expect that these users
> want to merge into those sub-trees just like they
> would also 'svn up' them.

Doesn't merging in a switched subtree almost by definition imply
creating subtree mergeinfo? I was under the impression that we wanted to
more or less forbid such merges at some point in the  future.

-- Brane

Re: Merge policies

Posted by Stefan Fuhrmann <eq...@web.de>.

Am 19.04.2012 18:57, schrieb Julian Foad:
> Branko Čibej wrote:
>
>> Julian Foad wrote:
>>>   To get symmetric behaviour, the problem is it's freakin' hard to
>>> untangle the subtree support and the mixed-rev support and the
>>> missing-subtree  support and everything from the basic merge algorithm
>>>   outline, in the existing code.  And the problem is not so much at a
>>>   coding level, but rather a matter of understanding what, in fact, the
>>> semantics are that we're intending to implement currently.
>> By the way, I'm all for removing support for merging into mixed-revision
>> and/or switched-subtree working copies. There's too much room for
>> unexpected results in these cases.
> I don't necessarily disagree, but that sounds a bit unsubstantiated.  Is there too much "room for unexpected results" because you don't have a mental model of what to expect?  If I were to figure out and describe some sensible semantics and a use case, might you then change your mind?
>
> FWIW, I have an inkling of how a mixed-rev target WC should work: it's logically very similar to having different mergeinfo on different subtrees, except that the differences are on the target side of the merge rather than the source side of the merge.  It requires some care to be sure that the parent-to-child inheritance of mergeinfo in the WC remains valid where base revision numbers differ.
>
> That said, I can't imagine any valid reason to need to support a mixed-rev target WC.

Well, in the end one will always commit against HEAD.
But there are two use-cases of different importance
that should be supported:

* A WC is usually not at HEAD because commits don't
   update unchanged nodes. On branches, the WC will
   often contain the latest changes as few people work
   on that branch. A WC of /trunk can be expected to be
   outdated even if it was updated to HEAD shortly before
   the merge.
* Mixed-rev working copies are an intermediate step
   in creating arbitrary configurations via 'svn cp WC URL'.
   Merging into the copy target should be semantically
   identical to merging into a mixed-rev WC.
> I have thought very little about about how switched subtrees should work.  However, I *can* imagine scenarios where the user has a large chunk of WC switched, and now wants to merge something into the whole WC knowing that it won't in fact touch the switched part.  That seems like a reasonable thing to support: untouched paths being switched.  I haven't yet thought of a use case for switched paths that are touched by the merge.

We could even be more permissive: as long as the merge
does not cross "switch boundaries" (switched / non-
switched or switched to different URL) as may allow
the merge.

This could be a somewhat typical use-case for people
using switched WCs. I would expect that these users
want to merge into those sub-trees just like they
would also 'svn up' them.

-- Stefan^2.

Re: Merge policies

Posted by Julian Foad <ju...@btopenworld.com>.

Branko Čibej wrote:

> Julian Foad wrote:
>>  To get symmetric behaviour, the problem is it's freakin' hard to 
>> untangle the subtree support and the mixed-rev support and the
>> missing-subtree  support and everything from the basic merge algorithm
>>  outline, in the existing code.  And the problem is not so much at a
>>  coding level, but rather a matter of understanding what, in fact, the
>> semantics are that we're intending to implement currently.
> 
> By the way, I'm all for removing support for merging into mixed-revision
> and/or switched-subtree working copies. There's too much room for
> unexpected results in these cases.

I don't necessarily disagree, but that sounds a bit unsubstantiated.  Is there too much "room for unexpected results" because you don't have a mental model of what to expect?  If I were to figure out and describe some sensible semantics and a use case, might you then change your mind?

FWIW, I have an inkling of how a mixed-rev target WC should work: it's logically very similar to having different mergeinfo on different subtrees, except that the differences are on the target side of the merge rather than the source side of the merge.  It requires some care to be sure that the parent-to-child inheritance of mergeinfo in the WC remains valid where base revision numbers differ.

That said, I can't imagine any valid reason to need to support a mixed-rev target WC.

I have thought very little about about how switched subtrees should work.  However, I *can* imagine scenarios where the user has a large chunk of WC switched, and now wants to merge something into the whole WC knowing that it won't in fact touch the switched part.  That seems like a reasonable thing to support: untouched paths being switched.  I haven't yet thought of a use case for switched paths that are touched by the merge.

Maybe Paul can help fill in some of the "how" and "why" gaps here?

- Julian

Re: Merge policies

Posted by Branko Čibej <br...@apache.org>.

On 19.04.2012 16:56, Julian Foad wrote:

> To get symmetric behaviour, the problem is it's freakin' hard to untangle the subtree support and the mixed-rev support and the missing-subtree support and everything from the basic merge algorithm outline, in the existing code.  And the problem is not so much at a coding level, but rather a matter of understanding what, in fact, the semantics are that we're intending to implement currently.

By the way, I'm all for removing support for merging into mixed-revision
and/or switched-subtree working copies. There's too much room for
unexpected results in these cases.

-- Brane

Re: Merge policies

Posted by Julian Foad <ju...@btopenworld.com>.

Mark Phippard wrote:

> My concern is any link between these scripts and our merge code.  It
> sounds like the plan would be to create these policies and then come
> up with a newmerge command that does not support any of the features
> these policies block?

No, that's not this community's plan.  I evidently don't talk enough; that's a personality problem I have.  I need to communicate more.  How do I explain?  The plan is to improve Subversion's merge capabilities.  We're still working out the details of what that means.

> I honestly do not think these sorts of issues are what most of our
> users are struggling with.  You can do whole branch merges today, you
> can block subtree merges with scripts if that is important etc.  Most
> users seem to learn and adjust to --reintegrate.

It's true we *can* do successful merges now, and much of the time *we* the developers *do* it successfully, but sometimes even we get unexpected results or do what we wouldn't have wished to do.  But *typical* users *often* get it wrong.  Several of us have tried to help real users to unpick the results of where they've merged things around and about and tried to fix it themselves with --record-only and have then called us in, in despair.

It's my goal to help users avoid common pitfalls.

I'm looking at a number of ways to do that.

>  It seems like Julian
> has figured out a way to get rid of the --reintegrate option and if he
> is able to do it while still knowing when to do all the reintegrate
> checks, then subtree merges are no longer an immediate problem.

Yes, I've figured that out *in principle*.  The code I've written so far is just a crude starting point that doesn't yet lift the restrictions that --reintegrate has, and as such is not satisfactory (in fact it makes it *more* confusing because it hides the behaviour difference under a 'symmetric' user interface).

To get symmetric behaviour, the problem is it's freakin' hard to untangle the subtree support and the mixed-rev support and the missing-subtree support and everything from the basic merge algorithm outline, in the existing code.  And the problem is not so much at a coding level, but rather a matter of understanding what, in fact, the semantics are that we're intending to implement currently.

I have so far been dabbing carefully at the existing code, pulling a little more low-level structure into it (such as svn_client__pathrev_t) while extremely carefully preserving the exact behaviour of the code, and I'm doing it that way because I don't understand the intention of the code at sufficient level of detail.  But that's not efficient development.

I am currently thinking it's time to change tack here, stop trying to incrementally change the existing implementation, and instead start figuring out a design and implementation, that we can grow in stages.  Until we're confident it has a promising future, it can stay on a branch or #ifdef'd out or whatever.  I've got to start somewhere.  I can't feasibly reverse-engineer the whole semantics that we've created before I start.  (In this style of open source dev, we often don't have detailed specs for the stuff we create.)  So to get a handle on it I've gone back to the fundamentals of what the merge is meant to do, and am building up from there.  So I've got as far as the theory of how to basically merge a tree (without subtrees, and without a mixed-rev target WC, and so on).  It would be massively constructive and helpful to write some of this as code before and during getting on with figuring out a design for subtrees.  (Actually I have already
 thought and written a start on this, just not in great detail, and I have a theory that a mixed-rev target WC is a very similar issue as subtrees-have-differing-mergeinfo.)

At the same time as having a particular 'symmetric merge' idea, I'd like to take every opportunity to make it easier for others to understand the structure and behaviour of the merge code, and participate in developing it.  So for example I broke the 'reintegrate' API into two parts ('find' and 'do') so that higher level code (including svn but especially GUIs) can get some visibility and a bit more control over what's going on.  And similarly I think we should split the 'merge' API into 'automatic merge' and 'manual merge', because the use cases and the behaviour and the implementation are really quite separate (auto merge should *use* manual merge as a subroutine, not be a slight variant of it).

> Julian also seems to have some nice ideas on improving the output from
> the commands to make the process more understandable.

Yes; I haven't touched on that for a while.

> Where I see users struggling with merges is when it comes to tree
> conflicts.  People rename stuff frequently and then merge in SVN
> becomes nearly unusable.  If we want to really fix merge, I think we
> have to improve how renames are handled.

I totally agree, we really do need to fix that.

> Andy expressed a desire to provide a code review workflow.  With the
> right UI to make the branch creation easy (as well as of course the
> review), I think you could to that today.  There is no reason that
> issues like subtree merges and criss-cross merges etc. should make it
> difficult to provide a feature like that because the UI would steer
> the user away from doing that.  The blocker that does exist today is
> that we do not handle things like renames well.

Yes, agreed: if the branch isn't going to have any complex merging, and is only going to be reintegrated once, then the current merge will handle it.

> Back to your proposal, if users could easily put hooks in place to
> block certain merge features it would make our existing merge easier
> as well.  I guess I am just saying it does not get us off the hook for
> needing to support those features in our merge for the people that do
> want to use them.

Agreed.

- Julian

Re: Merge policies

Posted by Stefan Sperling <st...@elego.de>.

On Thu, Apr 19, 2012 at 08:59:58AM -0500, Hyrum K Wright wrote:
> While I've not done any experiments, it feel like it might be possible
> to heuristically detect some (but not all) moves by looking at copies,
> and asking if the source was deleted in the same revision.  There are
> probably all kinds of corner cases I haven't considered, but at first
> blush that seems like it might work.

I've tried doing just that some months ago. It can be made to work
but we'll need a better conflict resolution framework first.
I've put my notes here:
http://svn.apache.org/repos/asf/subversion/branches/moves-scan-log/BRANCH-README

> The client still needs to be taught to actually use this information.

There are on-going efforts to implement working-copy-local moves
for Subversion 1.8. The bulk of the work is done. My personal goal
for 1.8 is to fix the remaining issues with the current local-moves
implementation in trunk. Philip has been doing great work in this area
that pushed the feature beyond what I originally anticipated.

I am confident that local-moves will "just work" in 1.8, and plan to
focus on improving handling of incoming moves in releases 1.9 and later.
Wether that will involve Editor-V2, or the scan-log approach (for
compatibility with old servers), or both, remains to be seen.

But it seems we're on the right track. We're trying to attack the problem
from several angles. We now have, as a collective, a much better
understanding of the problem space than when tree-conflicts first popped
up on the radar during the development of 1.5/1.6. From my point of view,
it is a question of when, not if, Subversion will properly merge renames.

Re: Merge policies

Posted by Hyrum K Wright <hy...@wandisco.com>.

On Thu, Apr 19, 2012 at 8:51 AM, Andy Singleton <an...@assembla.com> wrote:
>  Yes, I agree.  If merge handles moves, and if merging between branch and
> trunk doesn't require arguments and reintegrate, then it would work in
> enough cases to make people comfortable with the review-and-merge workflow.
>
> It does seem that moves and renames cause a high percentage of user
> complaints.  People don't commit moves in the correct way, and they never
> will, and they get merge problems even if they do.  It is possible that you
> could eliminate about 90% of these complaints by
> * Putting in an agent that looks for moves and renames heuristically,
> assumes that it is correct when it finds something, uses the information for
> any current merge or update.
> * Add move information to next commit, to make it easier to handle moves and
> history in the future.  This step is apparently optional, as git handles
> moves more effectively than subversion using only the heuristic detection,
> without move tracking.

For the record, include move information when committing is a primary
goal of Ev2.  We currently send move information in the case of
repos->repos moves on the ev2-export branch, though there is still the
need to store this information on the server and then provide it to
the client when doing a merge.  And it will only work for future
revisions, not historic ones, as you say.

While I've not done any experiments, it feel like it might be possible
to heuristically detect some (but not all) moves by looking at copies,
and asking if the source was deleted in the same revision.  There are
probably all kinds of corner cases I haven't considered, but at first
blush that seems like it might work.

The client still needs to be taught to actually use this information.

-Hyrum

-- 

uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/

Re: Merge policies

Posted by Andy Singleton <an...@assembla.com>.

  Yes, I agree.  If merge handles moves, and if merging between branch 
and trunk doesn't require arguments and reintegrate, then it would work 
in enough cases to make people comfortable with the review-and-merge 
workflow.

It does seem that moves and renames cause a high percentage of user 
complaints.  People don't commit moves in the correct way, and they 
never will, and they get merge problems even if they do.  It is possible 
that you could eliminate about 90% of these complaints by
* Putting in an agent that looks for moves and renames heuristically, 
assumes that it is correct when it finds something, uses the information 
for any current merge or update.
* Add move information to next commit, to make it easier to handle moves 
and history in the future.  This step is apparently optional, as git 
handles moves more effectively than subversion using only the heuristic 
detection, without move tracking.


On 4/19/2012 9:26 AM, Mark Phippard wrote:
> I have no objection to coming up with some hook scripts to help
> enforce merge policies.  I recall seeing this project on users@:
>
> http://sourceforge.net/apps/mediawiki/svnhook/index.php?title=Main_Page
>
> It adds the ability to block subtree merges by rejecting commits that
> add mergeinfo below the root.
>
> My concern is any link between these scripts and our merge code.  It
> sounds like the plan would be to create these policies and then come
> up with a newmerge command that does not support any of the features
> these policies block?  You obviously have no way to know that the hook
> scripts have been installed, so I would guess at a minimum your new
> command would have to detect all these problems and error out rather
> than run and produce invalid results.
>
> I honestly do not think these sorts of issues are what most of our
> users are struggling with.  You can do whole branch merges today, you
> can block subtree merges with scripts if that is important etc.  Most
> users seem to learn and adjust to --reintegrate.  It seems like Julian
> has figured out a way to get rid of the --reintegrate option and if he
> is able to do it while still knowing when to do all the reintegrate
> checks, then subtree merges are no longer an immediate problem.
> Julian also seems to have some nice ideas on improving the output from
> the commands to make the process more understandable.
>
> Where I see users struggling with merges is when it comes to tree
> conflicts.  People rename stuff frequently and then merge in SVN
> becomes nearly unusable.  If we want to really fix merge, I think we
> have to improve how renames are handled.
>
> Andy expressed a desire to provide a code review workflow.  With the
> right UI to make the branch creation easy (as well as of course the
> review), I think you could to that today.  There is no reason that
> issues like subtree merges and criss-cross merges etc. should make it
> difficult to provide a feature like that because the UI would steer
> the user away from doing that.  The blocker that does exist today is
> that we do not handle things like renames well.
>
> Back to your proposal, if users could easily put hooks in place to
> block certain merge features it would make our existing merge easier
> as well.  I guess I am just saying it does not get us off the hook for
> needing to support those features in our merge for the people that do
> want to use them.
>
> Mark
>
>
>
> On Thu, Apr 19, 2012 at 6:48 AM, Stefan Fuhrmann
> <st...@alice-dsl.de>  wrote:
>> Hi all,
>>
>> After having a closer look at merge and discussing it
>> with Julian on IRC, there seems to be no silver bullet.
>> However, we identified a few things that could be changed
>> and set of constellations that make merge harder than
>> it needs to be.
>>
>> For the first, there will be another post soon. The second
>> boils down to policy. Luckily, SVN has a mechanism to
>> enforce policies: server-side hook scripts. My proposal
>> is to develop a small set of scripts that a user can
>> combine to prevent situations that her life harder than
>> necessary. This should give us enough time to improve
>> the merge logic inside the SVN libs.
>>
>> The following pre-commit scripts / policies would be useful.
>>
>> * Common parts [not a policy]
>>   We first check whether the commit contains a changed
>>   svn:merge-info property. This limits the performance
>>   impact on non-merge commits and we need to identify
>>   all changed svn:merge-info anyway.
>>
>>   Also, the merges that happened on the source branch
>>   from a different location than the target branch are
>>   of no interest for the policy checkers. E.g.:
>>
>>   r20: merge r19 from ^/sub-branch to ^/branch
>>   txn: merge r10-20 from ^/branch to ^/trunk
>>   Both merges will show up in the merge-info delta but
>>   we only need to evaluate the second one.
>>
>> * Strict merge hierarchy
>>   A merge from A->B is only allowed, if the copy-from
>>   of A is B or vice versa and the copy source has not
>>   been replaced since the copy). This prevents circular
>>   merges and others (note 1).
>>
>>   In a more sophisticated implementation, we could identify /
>>   allow for renamed branches as well as A and B having
>>   the same relative path to some parents that form a
>>   direct branch (i.e. allow sub-tree merges).
>>
>> * No sub-tree merges
>>   Like the above but without the check for parents.
>>
>> * No aggregate merges
>>   There must only be one source branch, i.e. we can't
>>   merge from branches A and B to C in the same revision.
>>
>> * No distributive merges
>>   For each path being merged (i.e. having a merge-info
>>   delta), the relative paths in source and target must
>>   correspond (i.e. start as the same and then may get
>>   renamed etc.). This is basically the same as the
>>   "sophisticated" part of the check for strict merges.
>>
>> * No cherry picking
>>   Check that the source branch does not contain revisions
>>   that lie before the last to-be-merged revision but
>>   have neither been merged before nor are being merged
>>   right now.
>>
>> * No criss-crossing
>>   Prevent situations like the criss-cross examples here:
>>   http://wiki.apache.org/subversion/SymmetricMerge
>>
>>   For a merge A->B, abort if there has been a merge
>>   B->A after the last revision of A to be merged to B.
>>   This only valid for non-cherry-picking merges and
>>   only if the change sets of both merges overlap.
>>
>> Except for the last one those checks will simply verify
>> that the user followed certain policies. They should,
>> therefore, rarely reject a commit.
>>
>> Again, the user shall be free to combine (or not use)
>> these policies although not all combinations are meaningful.
>>
>> Thoughts?
>>
>> -- Stefan^2.
>>
>>
>> Note 1:
>>
>>   One thing that we might want to support is integration
>>   branches where a temporary branch is being used as
>>   an intermediate merge target:
>>
>>   integrate A->B as
>>   rN: copy B->A_integration
>>   rN+1: merge A->A_integration
>>   rN+x: ... various changes on A_integration
>>   rN+y: merge A_integration->B
>>   rN+y+1: delete A_integration
>>
>>   These checks become more complicated, requires
>>   naming conventions for the integration branches etc.
>
>


-- 
Andy Singleton
Founder/CEO, Assembla Online: http://www.assembla.com
Phone: 781-328-2241
Skype: andysingleton

Re: Merge policies

Posted by Mark Phippard <ma...@gmail.com>.

I have no objection to coming up with some hook scripts to help
enforce merge policies.  I recall seeing this project on users@:

http://sourceforge.net/apps/mediawiki/svnhook/index.php?title=Main_Page

It adds the ability to block subtree merges by rejecting commits that
add mergeinfo below the root.

My concern is any link between these scripts and our merge code.  It
sounds like the plan would be to create these policies and then come
up with a newmerge command that does not support any of the features
these policies block?  You obviously have no way to know that the hook
scripts have been installed, so I would guess at a minimum your new
command would have to detect all these problems and error out rather
than run and produce invalid results.

I honestly do not think these sorts of issues are what most of our
users are struggling with.  You can do whole branch merges today, you
can block subtree merges with scripts if that is important etc.  Most
users seem to learn and adjust to --reintegrate.  It seems like Julian
has figured out a way to get rid of the --reintegrate option and if he
is able to do it while still knowing when to do all the reintegrate
checks, then subtree merges are no longer an immediate problem.
Julian also seems to have some nice ideas on improving the output from
the commands to make the process more understandable.

Where I see users struggling with merges is when it comes to tree
conflicts.  People rename stuff frequently and then merge in SVN
becomes nearly unusable.  If we want to really fix merge, I think we
have to improve how renames are handled.

Andy expressed a desire to provide a code review workflow.  With the
right UI to make the branch creation easy (as well as of course the
review), I think you could to that today.  There is no reason that
issues like subtree merges and criss-cross merges etc. should make it
difficult to provide a feature like that because the UI would steer
the user away from doing that.  The blocker that does exist today is
that we do not handle things like renames well.

Back to your proposal, if users could easily put hooks in place to
block certain merge features it would make our existing merge easier
as well.  I guess I am just saying it does not get us off the hook for
needing to support those features in our merge for the people that do
want to use them.

Mark

On Thu, Apr 19, 2012 at 6:48 AM, Stefan Fuhrmann
<st...@alice-dsl.de> wrote:
> Hi all,
>
> After having a closer look at merge and discussing it
> with Julian on IRC, there seems to be no silver bullet.
> However, we identified a few things that could be changed
> and set of constellations that make merge harder than
> it needs to be.
>
> For the first, there will be another post soon. The second
> boils down to policy. Luckily, SVN has a mechanism to
> enforce policies: server-side hook scripts. My proposal
> is to develop a small set of scripts that a user can
> combine to prevent situations that her life harder than
> necessary. This should give us enough time to improve
> the merge logic inside the SVN libs.
>
> The following pre-commit scripts / policies would be useful.
>
> * Common parts [not a policy]
>  We first check whether the commit contains a changed
>  svn:merge-info property. This limits the performance
>  impact on non-merge commits and we need to identify
>  all changed svn:merge-info anyway.
>
>  Also, the merges that happened on the source branch
>  from a different location than the target branch are
>  of no interest for the policy checkers. E.g.:
>
>  r20: merge r19 from ^/sub-branch to ^/branch
>  txn: merge r10-20 from ^/branch to ^/trunk
>  Both merges will show up in the merge-info delta but
>  we only need to evaluate the second one.
>
> * Strict merge hierarchy
>  A merge from A->B is only allowed, if the copy-from
>  of A is B or vice versa and the copy source has not
>  been replaced since the copy). This prevents circular
>  merges and others (note 1).
>
>  In a more sophisticated implementation, we could identify /
>  allow for renamed branches as well as A and B having
>  the same relative path to some parents that form a
>  direct branch (i.e. allow sub-tree merges).
>
> * No sub-tree merges
>  Like the above but without the check for parents.
>
> * No aggregate merges
>  There must only be one source branch, i.e. we can't
>  merge from branches A and B to C in the same revision.
>
> * No distributive merges
>  For each path being merged (i.e. having a merge-info
>  delta), the relative paths in source and target must
>  correspond (i.e. start as the same and then may get
>  renamed etc.). This is basically the same as the
>  "sophisticated" part of the check for strict merges.
>
> * No cherry picking
>  Check that the source branch does not contain revisions
>  that lie before the last to-be-merged revision but
>  have neither been merged before nor are being merged
>  right now.
>
> * No criss-crossing
>  Prevent situations like the criss-cross examples here:
>  http://wiki.apache.org/subversion/SymmetricMerge
>
>  For a merge A->B, abort if there has been a merge
>  B->A after the last revision of A to be merged to B.
>  This only valid for non-cherry-picking merges and
>  only if the change sets of both merges overlap.
>
> Except for the last one those checks will simply verify
> that the user followed certain policies. They should,
> therefore, rarely reject a commit.
>
> Again, the user shall be free to combine (or not use)
> these policies although not all combinations are meaningful.
>
> Thoughts?
>
> -- Stefan^2.
>
>
> Note 1:
>
>  One thing that we might want to support is integration
>  branches where a temporary branch is being used as
>  an intermediate merge target:
>
>  integrate A->B as
>  rN: copy B->A_integration
>  rN+1: merge A->A_integration
>  rN+x: ... various changes on A_integration
>  rN+y: merge A_integration->B
>  rN+y+1: delete A_integration
>
>  These checks become more complicated, requires
>  naming conventions for the integration branches etc.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Merge policies

Posted by Trent Nelson <tr...@snakebite.org>.

On Apr 19, 2012, at 1:36 PM, Trent Nelson wrote:

> 
> On Apr 19, 2012, at 8:38 AM, Stefan Sperling wrote:
> 
>> On Thu, Apr 19, 2012 at 12:48:44PM +0200, Stefan Fuhrmann wrote:
>>> Hi all,
>>> 
>>> After having a closer look at merge and discussing it
>>> with Julian on IRC, there seems to be no silver bullet.
>>> However, we identified a few things that could be changed
>>> and set of constellations that make merge harder than
>>> it needs to be.
>>> 
>>> For the first, there will be another post soon. The second
>>> boils down to policy. Luckily, SVN has a mechanism to
>>> enforce policies: server-side hook scripts. My proposal
>>> is to develop a small set of scripts that a user can
>>> combine to prevent situations that her life harder than
>>> necessary.
>> 
>> I like the idea of providing pre-commit hook samples that each
>> implement a piece of a merge-policy puzzle.
>> 
>> At elego we often advise customers about branching/merging strategies.
>> It is true that virtually every shop has different requirements (i.e.
>> there is no silver bullet). This simply stems from the fact that every
>> software project is different. At the same time, a lot of shops end up
>> using the same or similar strategy and sometimes reinvent the wheel.
>> 
>> Subversion by itself enforces virtually no restrictions, and adding
>> restrictions to the system is often a major part of the work involved
>> in getting a production setup up and running. Often restrictions are
>> added over time as people repeatedly make mistakes that cause problems.
>> 
>> Having an officially supported set of high-quality hook scripts in
>> our tools/ directory that are documented well and support a wide
>> range of use cases in a modular fashion would help a lot. I believe
>> many users would be happy to deploy a hook collection that is officially
>> maintained by the Apache Subversion project rather than relying on
>> self-written or third-party hook scripts.
>> 
>> I would definitely be interested in contributing, and maybe others
>> at elego would, too. I'd be very happy to install "Apache Subversion"
>> branded hooks rather then "elego" branded hooks at customer sites and
>> share the development effort.
>> 
>> And maybe Trent can contribute some of his experience, if he has time?
> 
> As the spirit of JFDI is so rife before bed, I'm proud to announce enversion 0.0.0: https://github.com/tpn/enversion!
> 

It's probably worth noting this particular file:

https://github.com/tpn/enversion/blob/master/evn/constants.py

That's a list of all the things Enversion currently detects/blocks/warns against.  As you can see, lots of merge validation stuff.

I'll follow up tomorrow with more info in a separate thread rather than hijacking this one.

	Trent.

--
class _Notes(Constant):
    MergeinfoRemovedFromRepositoryRoot = 'svn:mergeinfo removed from repository root'
    SubtreeMergeinfoModified = 'subtree svn:mergeinfo modified'
    SubtreeMergeinfoRemoved = 'subtree svn:mergeinfo removed'
    Merge = 'merge'
    RootRemoved = 'root removed'
    ValidMultirootCommit = 'valid multi-root commit'
    MultipleUnknownAndKnowRootsVerifiedByExternals = 'multiple known and unknown roots verified by svn:externals'
    BranchRenamed = 'branch renamed'
    TrunkRelocated = 'trunk relocated'
    FileReplacedViaCopyDuringMerge = 'file replaced via copy during merge'
    FileUnchangedButParentCopyOrRenameBug = 'file is unchanged but there is a parent rename or copy action'
    DirUnchangedButParentCopyOrRenameBug = 'directory is unchanged but there is a parent rename or copy action'
    UnchangedFileDuringMerge = 'file unchanged during merge'
    UnchangedDirDuringMerge = 'dir unchanged during merge'

n = _Notes()

class _Warnings(Constant):
    KnownRootRemoved = 'known root removed'

w = _Warnings()

class _Errors(Constant):
    TagRenamed = 'tag renamed'
    TagModified = 'tag modified'
    MultipleUnknownAndKnownRootsModified = 'multiple known and unknown roots modified in the same commit'
    MixedRootNamesInMultiRootCommit = 'mixed root names in multi-root commit'
    MixedRootTypesInMultiRootCommit = 'mixed root types in multi-root commit'
    SubversionRepositoryCheckedIn = 'subversion repository checked in'
    MergeinfoAddedToRepositoryRoot = "svn:mergeinfo added to repository root '/'"
    MergeinfoModifiedOnRepositoryRoot = "svn:mergeinfo modified on repository root '/'"
    SubtreeMergeinfoAdded = 'svn:mergeinfo added to subtree'
    RootMergeinfoRemoved = 'svn:mergeinfo removed from root'
    DirectoryReplacedDuringMerge = 'directory replaced during merge'
    EmptyMergeinfoCreated = 'empty svn:mergeinfo property set on path'
    TagDirectoryCreatedManually = 'tag directory created manually'
    BranchDirectoryCreatedManually = 'branch directory created manually'
    BranchRenamedToTrunk = 'branch renamed to trunk'
    TrunkRenamedToBranch = 'trunk renamed to branch'
    TrunkRenamedToTag = 'trunk renamed to tag'
    BranchRenamedToTag = 'branch renamed to tag'
    BranchRenamedOutsideRootBaseDir = 'branch renamed to location outside root base dir'
    TagSubtreePathRemoved = 'tag subtree path removed'
    RenameAffectsMultipleRoots = 'rename affects multiple roots'
    UncleanRenameAffectsMultipleRoots = 'unclean rename affects multiple roots'
    MultipleRootsCopied = 'multiple roots copied'
    TagCopied = 'tag copied'
    UncleanCopy = 'unclean copy'
    FileRemovedFromTag = 'file removed from tag'
    CopyKnownRootSubtreeToValidAbsRootPath = 'copy known root subtree to valid absolute root path'
    MixedRootsNotClarifiedByExternals = 'multiple known and unknown roots in commit could not be clarified by svn:externals'
    CopyKnownRootToIncorrectlyNamedRootPath = 'known root copied to an incorrectly-named new root path'
    CopyKnownRootSubtreeToIncorrectlyNamedRootPath = 'known root subtree copied to incorrectly-named new root path'
    UnknownPathRenamedToIncorrectlyNamedNewRootPath = 'unknown path renamed incorrectly to new root path name'
    RenamedKnownRootToIncorrectlyNamedRootPath = 'renamed known root to incorrectly named root path'
    MixedChangeTypesInMultiRootCommit = 'mixed change types in multi-root commit'
    CopyKnownRootToKnownRootSubtree = 'copy known root to known root subtree'
    UnknownPathCopiedToIncorrectlyNamedNewRootPath = 'unknown path copied to incorrectly named new root path'
    RenamedKnownRootToKnownRootSubtree = 'renamed root to known root subtree'
    FileUnchangedAndNoParentCopyOrRename = 'file has no text or property changes, and no parent copy or rename actions can be found'
    DirUnchangedAndNoParentCopyOrRename = 'directory has not changed, and no parent copy or rename actions can be found'
    EmptyChangeSet = 'empty change set'
    RenameRelocatedPathOutsideKnownRoot = 'rename relocated path outside known root'
    TagRemoved = 'tag removed'
    CopyKnownRootToUnknownPath = 'known root copied to unknown path'
    CopyKnownRootSubtreeToInvalidRootPath = 'known root copied to invalid root path'
    NewRootCreatedByRenamingUnknownPath = 'new root created by renaming unknown path'
    UnknownPathCopiedToKnownRootSubtree = 'unknown path copied to known root subtree'
    NewRootCreatedByCopyingUnknownPath = 'new root created by copying unknown path'
    RenamedKnownRootToUnknownPath = 'known root renamed to unknown path'
    RenamedKnownRootSubtreeToUnknownPath = 'known root subtree renamed to unknown path'
    RenamedKnownRootSubtreeToValidRootPath = 'known root subtree renamed to valid root path'
    RenamedKnownRootSubtreeToIncorrectlyNamedRootPath = 'known root subtree renamed to incorrectly-named root path'
    UncleanRename = 'unclean rename'
    PathCopiedFromOutsideRootDuringNonMerge = 'path copied from outside root during non-merge'
    UnknownDirReplacedViaCopyDuringNonMerge = 'unknown directory replaced via copy during non-merge'
    DirReplacedViaCopyDuringNonMerge = 'directory replaced via copy during non-merge'
    DirectoryReplacedDuringNonMerge = 'directory replaced during non-merge'
    PreviousPathNotMatchedToPathsInMergeinfo = 'previous path not matched to paths in mergeinfo'
    PreviousRevDiffersFromParentCopiedFromRev = 'previous rev differs from parent copied-from rev'
    PreviousPathDiffersFromParentCopiedFromPath = 'previous path differs from parent copied-from path'
    PreviousRevDiffersFromParentRenamedFromRev = 'previous rev differs from parent renamed-from rev'
    PreviousPathDiffersFromParentRenamedFromPath = 'previous path differs from parent renamed-from path'
    KnownRootPathReplacedViaCopy = 'known root path replaced via copy'
    BranchesDirShouldBeCreatedManuallyNotCopied = "'branches' directory should be created manually not copied"
    TagsDirShouldBeCreatedManuallyNotCopied = "'tags' directory should be created manually not copied"
    CopiedFromPathNotMatchedToPathsInMergeinfo = 'copied-from path not matched to paths in mergeinfo'
    InvariantViolatedModifyContainsMismatchedPreviousPath = 'invariant violated: modify contains mismatched previous path'
    InvariantViolatedModifyContainsMismatchedPreviousRev = 'invariant violated: modify contains mismatched previous path'
    InvariantViolatedCopyNewPathInRootsButNotReplace = "invariant violated: the new path name created (via copy) is already a known root, but the change isn't marked as a replace."
    MultipleRootsAffectedByRemove = 'multiple roots affected by remove'
    InvariantViolatedDieCalledWithoutErrorInfo = "invariant violated: Repository.die() method called without any error information being set"
    VersionMismatch = "version mismatch: we are at version %d, but the 'evn:version' revision property found on revision 0 reports the repository is at version %d"
    MissingOrEmptyRevPropOnRev0 = "missing or empty 'evn:%s' revision property on revision 0"
    InvalidIntegerRevPropOnRev0 = "invalid value for 'evn:%s' revision property on revision 0; expected an integer greater than or equal to %d, got: %s"
    PropertyValueConversionFailed = "failed to convert property %s's value: %s"
    PropertyValueLiteralEvalFailed = "invalid value for property '%s': %s"
    LastRevTooHigh = "the value for the revision property 'evn:last_rev' on revision 0 of the repository is too high; it indicates the last processed revision was %d, however, the highest revision in the repository is only %d"
    RepositoryOutOfSyncTxn = "the repository is out of sync and can not be committed to at this time (base revision for this transaction: %d, repository last synchronised at revision: %d, current repository revision: %d)"
    LastRevNotSetToBaseRevDuringPostCommit = "the repository is out of sync (last_rev is not set to base_rev during post-commit), preventing post-commit processing of this revision; is the pre-commit hook enabled? (this revision: %d, base revision: %d, repository last synchronised at revision: %d, current repository revision: %d)"
    OutOfOrderRevisionProcessingAttempt = "unable to process repository revision %d until the base revision %d has been processed; however, the last processed revision reported by the repository is %d (current repository revision: %d)"
    RootsMissingFromBaseRevTxn = "missing or empty 'evn:roots' revision property on base revision (this transaction: %s, base revision: %d, repository last synchronised at revision: %d, current repository revision: %d)"
    RootsMissingFromBaseRevDuringPostCommit = "missing or empty 'evn:roots' revision property on base revision, preventing post-commit processing of this revision; is the pre-commit hook enabled? (this revision: %d, base revision: %d, repository last synchronised at revision: %d, current repository revision: %d)"
    ChangeSetOnlyApplicableForRev1AndHigher = "changeset only applicable to revisions 1 and higher"
    InvalidRootsForRev = "invalid value for 'evn:roots' revision property on revision (this revision: %d, base revision: %d, repository last synchronised at revision: %d, current repository revision: %d)"
    StaleTxnProbablyDueToHighLoad = "please re-try your commit -- the repository is under load and your transaction became out-of-date while it was being queued for processing (base revision for this transaction: %d, repository last synchronised at revision: %d, current repository revision: %d)"
    AbsoluteRootOfRepositoryCopied = "absolute root of repository copied"
    PropertyChangedButOldAndNewValuesAreSame = "the property '%s' is recorded as having changed, but the old value and new value are identical ('%s')"

Re: Merge policies

Posted by Trent Nelson <tr...@snakebite.org>.

On Apr 19, 2012, at 8:38 AM, Stefan Sperling wrote:

> On Thu, Apr 19, 2012 at 12:48:44PM +0200, Stefan Fuhrmann wrote:
>> Hi all,
>> 
>> After having a closer look at merge and discussing it
>> with Julian on IRC, there seems to be no silver bullet.
>> However, we identified a few things that could be changed
>> and set of constellations that make merge harder than
>> it needs to be.
>> 
>> For the first, there will be another post soon. The second
>> boils down to policy. Luckily, SVN has a mechanism to
>> enforce policies: server-side hook scripts. My proposal
>> is to develop a small set of scripts that a user can
>> combine to prevent situations that her life harder than
>> necessary.
> 
> I like the idea of providing pre-commit hook samples that each
> implement a piece of a merge-policy puzzle.
> 
> At elego we often advise customers about branching/merging strategies.
> It is true that virtually every shop has different requirements (i.e.
> there is no silver bullet). This simply stems from the fact that every
> software project is different. At the same time, a lot of shops end up
> using the same or similar strategy and sometimes reinvent the wheel.
> 
> Subversion by itself enforces virtually no restrictions, and adding
> restrictions to the system is often a major part of the work involved
> in getting a production setup up and running. Often restrictions are
> added over time as people repeatedly make mistakes that cause problems.
> 
> Having an officially supported set of high-quality hook scripts in
> our tools/ directory that are documented well and support a wide
> range of use cases in a modular fashion would help a lot. I believe
> many users would be happy to deploy a hook collection that is officially
> maintained by the Apache Subversion project rather than relying on
> self-written or third-party hook scripts.
> 
> I would definitely be interested in contributing, and maybe others
> at elego would, too. I'd be very happy to install "Apache Subversion"
> branded hooks rather then "elego" branded hooks at customer sites and
> share the development effort.
> 
> And maybe Trent can contribute some of his experience, if he has time?

As the spirit of JFDI is so rife before bed, I'm proud to announce enversion 0.0.0: https://github.com/tpn/enversion!

Quick start guide (that I sort-of just tested on my dusty ol' Macbook):

% git clone git@github.com:tpn/enversion.git
% export PYTHONPATH=`pwd`/enversion
% export PATH=$PATH:`pwd`/enversion/bin
% evnadmin          
Type 'evnadmin <subcommand> help' for help on a specific subcommand.

Available subcommands:
    analyze
    create
    disable-remote-debug (drd)
    doctest
    dump-config (dc)
    dump-default-config (ddc)
    dump-hook-code (dhc)
    enable
    enable-remote-debug (erd)
    find-merges (fm)
    fix-hooks (fh)
    root-info (ri)
    run-hook (rh)
    show-config-file-load-order (scflo)
    show-repo-hook-status (srhs)
    show-repo-remote-debug-sessions (srrds)
    show-roots (sr)
    toggle-remote-debug (trd)
    version


If you get that far, take a look at the hasn't-even-been-proof-read-yet quick start guide: https://github.com/tpn/enversion/blob/master/doc/quick-start.rst



	Trent.

Re: Merge policies

Posted by Stefan Sperling <st...@elego.de>.

On Thu, Apr 19, 2012 at 12:48:44PM +0200, Stefan Fuhrmann wrote:
> Hi all,
> 
> After having a closer look at merge and discussing it
> with Julian on IRC, there seems to be no silver bullet.
> However, we identified a few things that could be changed
> and set of constellations that make merge harder than
> it needs to be.
> 
> For the first, there will be another post soon. The second
> boils down to policy. Luckily, SVN has a mechanism to
> enforce policies: server-side hook scripts. My proposal
> is to develop a small set of scripts that a user can
> combine to prevent situations that her life harder than
> necessary.

I like the idea of providing pre-commit hook samples that each
implement a piece of a merge-policy puzzle.

At elego we often advise customers about branching/merging strategies.
It is true that virtually every shop has different requirements (i.e.
there is no silver bullet). This simply stems from the fact that every
software project is different. At the same time, a lot of shops end up
using the same or similar strategy and sometimes reinvent the wheel.

Subversion by itself enforces virtually no restrictions, and adding
restrictions to the system is often a major part of the work involved
in getting a production setup up and running. Often restrictions are
added over time as people repeatedly make mistakes that cause problems.

Having an officially supported set of high-quality hook scripts in
our tools/ directory that are documented well and support a wide
range of use cases in a modular fashion would help a lot. I believe
many users would be happy to deploy a hook collection that is officially
maintained by the Apache Subversion project rather than relying on
self-written or third-party hook scripts.

I would definitely be interested in contributing, and maybe others
at elego would, too. I'd be very happy to install "Apache Subversion"
branded hooks rather then "elego" branded hooks at customer sites and
share the development effort.

And maybe Trent can contribute some of his experience, if he has time?

Re: Merge policies

Posted by Johan Corveleyn <jc...@gmail.com>.

On Thu, Apr 19, 2012 at 2:17 PM, Julian Foad <ju...@btopenworld.com> wrote:
> Johan Corveleyn wrote:
>> I haven't read your mail in detail yet, but [...] propgetting
>> ('svnlook propget') every changed file during pre-commit can be very
>> expensive.
> [...]
>> The problem is mainly that 'svnlook propget' doesn't support recursive
>> propgetting, nor getting all props from the entire transation, or ...
>> only the props for a single path. Which means you have to invoke it
>> separately for every item that needs to be inspected. Maybe that can
>> be improved somehow?
>
> Of course that can be improved.  Let's attack the source of the problem here.  If we want to use 'svnlook' to discover what mergeinfo is changing in the proposed commit, then we can write a function/subcommand that answers that question directly.  Or a subset of it (e.g. recursive propget) or a superset of it (e.g. get higher level merge graph-change info directly), whatever is effective.
>

+1 to any of those improvements :-). Ideally, I'd like to see
something that not only helps getting mergeinfo, but also other props
to make it easier/faster to enforce autoprops (or repository-dictated
props etc ...). In addition something specifically tuned to getting
the mergeinfo answers directly might of course be even better for this
concrete usecase.

-- 
Johan

Re: Merge policies

Posted by Julian Foad <ju...@btopenworld.com>.

Johan Corveleyn wrote:
> I haven't read your mail in detail yet, but [...] propgetting
> ('svnlook propget') every changed file during pre-commit can be very
> expensive.
[...]
> The problem is mainly that 'svnlook propget' doesn't support recursive
> propgetting, nor getting all props from the entire transation, or ...
> only the props for a single path. Which means you have to invoke it
> separately for every item that needs to be inspected. Maybe that can
> be improved somehow?

Of course that can be improved.  Let's attack the source of the problem here.  If we want to use 'svnlook' to discover what mergeinfo is changing in the proposed commit, then we can write a function/subcommand that answers that question directly.  Or a subset of it (e.g. recursive propget) or a superset of it (e.g. get higher level merge graph-change info directly), whatever is effective.

- Julian

> 
> Anyway, these are just some slightly OT random thoughts.

Re: Merge policies

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.

Johan Corveleyn wrote on Wed, May 02, 2012 at 12:20:35 +0200:
> On Fri, Apr 20, 2012 at 10:52 PM, Stefan Fuhrmann <eq...@web.de> wrote:
> > Am 19.04.2012 14:03, schrieb Johan Corveleyn:
> 
> [snip]
> 
> >> I haven't read your mail in detail yet, but just a note in passing: in
> >> my experience (with a 1.5 server with FSFS-on-NFS backend) propgetting
> >> ('svnlook propget') every changed file during pre-commit can be very
> >> expensive.
> 
> [snip]
> 
> > Random thought: is it possible to link the transactions
> > folder to /dev/shm even if the repository is on NFS?
> 
> Ah yes, thanks for the suggestion. I should try that someday. Right
> now we're working on moving our repo to a newer server, so it'll take
> me a while to experiment with this.
> 
> While we're on the subject: do you know an easy way to do this on
> Windows? It's possible that we'll be moving our server to Windows
> (from Solaris). Not sure yet, it depends on some testing /
> benchmarking.
> 
> The reason is that I can probably get faster I/O from our NAS when
> running the server on Windows, because there I can work with iSCSI
> (which is currently not supported by our sysadmins on Solaris). And I
> think iSCSI will get me better performance than NFS (there is also a
> third option: Solaris on ESX, which makes Solaris talk to the storage
> as if it's a local disk (but underlying it's still NFS I think) ...
> don't know whether that makes it faster). To be benchmarked ...
> 
> >> - maybe propgetting through one of the bindings is much faster than
> >> through svnlook, and you intend for admins to set this up with some
> >> language bindings? In that case, keep in mind that bindings are not
> >> always installed. Svnlook usually is always there.
> >
> >
> > The problem is probably that svnlook is at a hefty
> > disadvantage over e.g. the server because it has
> > to open the FSFS repo anew for each access. Don't
> > know really what to do about that.
> 
> So the only way to get the "relevant property changes" really fast
> with svnlook, would be with a command that gets everything with one
> invocation, I guess.
> 
> Actually, there is probably a way to get all those propchanges
> quicker, by parsing the output of 'svnlook diff -t TXN'. But I've not
> done the effort to do that.
> 
> Besides, by invoking diff (and discarding the actuall text diffs) a
> lot of needless work is done (and think of those edge cases where our
> internal diff can get really slow), and needless bytes are being
> written/read.
> 
> Maybe we'd need a props-only-diff, which gives you only the prop
> changes? In some easily parseable way ...
> 
> Another alternative would be to make 'svnlook propget' support
> multiple targets, so you can just propget all paths from 'svnlook
> changed' (possibly trimmed to the Adds and PropMods) in one go.

What about some svnmucc-style interface?

svnmlcc /path/to/repos propget r1 trunk/README svn:eol-style revpropget r1 svn:author propsetf r2 svn:log =(echo danielsh)

This will open one FS handle and won't require a custom script via the
bindings.

Re: Merge policies

Posted by Johan Corveleyn <jc...@gmail.com>.

On Wed, May 2, 2012 at 12:47 PM, Philip Martin
<ph...@wandisco.com> wrote:
> Johan Corveleyn <jc...@gmail.com> writes:
>
>> On Fri, Apr 20, 2012 at 10:52 PM, Stefan Fuhrmann <eq...@web.de> wrote:
>>> Am 19.04.2012 14:03, schrieb Johan Corveleyn:
>>>> - maybe propgetting through one of the bindings is much faster than
>>>> through svnlook, and you intend for admins to set this up with some
>>>> language bindings? In that case, keep in mind that bindings are not
>>>> always installed. Svnlook usually is always there.
>>>
>>>
>>> The problem is probably that svnlook is at a hefty
>>> disadvantage over e.g. the server because it has
>>> to open the FSFS repo anew for each access. Don't
>>> know really what to do about that.
>>
>> So the only way to get the "relevant property changes" really fast
>> with svnlook, would be with a command that gets everything with one
>> invocation, I guess.
>>
>> Actually, there is probably a way to get all those propchanges
>> quicker, by parsing the output of 'svnlook diff -t TXN'. But I've not
>> done the effort to do that.
>
> As suggested above, the bindings are fast:
>
> $ svnadmin create repo
> $ svn import -mm ../src file://`pwd`/repo/trunk   # import Subversion trunk
> $ svn co file://`pwd`/repo wc
> $ svn ps -R svn:mime-type foo/bar wc
> $ svn ci -mm wc
> $ ./x.py repo 2 | wc
>   2270    2270  141635
>
> Getting the mime-type for 2270 files takes less than 1 second.
>
>
> $ cat x.py
> #!/usr/bin/python
> import sys
> from svn import repos, fs
> r=repos.open(sys.argv[1])
> f=repos.fs(r)
> #t=fs.open_txn(f, sys.argv[2])
> t=fs.revision_root(f, int(sys.argv[2]))
> p=fs.paths_changed2(t)
> for key, val in p.iteritems():
>  print key + ":" + str(fs.node_prop(t, key, 'svn:mime-type'))

Nice. Thanks for testing. That's good to know.

That said, I think an improvement for svnlook is still sorely needed.
A lot of installations out there don't have bindings installed, plus a
lot of existing (sometimes publicly available) hooks are written for
svnlook (to be as generically useful as possible). Like this very nice
configurable pre-commit hook:

https://github.com/qazwart/SVN-Precommit-Kitchen-Sink-Hook

Anyway, I don't have the cycles to work on this myself, but I just
wanted to mention this as an important aspect if more and more tools
are going to validate properties in pre-commit hooks (like for
enforcing merge policies).
-- 
Johan

Re: Merge policies

Posted by Philip Martin <ph...@wandisco.com>.

Johan Corveleyn <jc...@gmail.com> writes:

> On Fri, Apr 20, 2012 at 10:52 PM, Stefan Fuhrmann <eq...@web.de> wrote:
>> Am 19.04.2012 14:03, schrieb Johan Corveleyn:
>>> - maybe propgetting through one of the bindings is much faster than
>>> through svnlook, and you intend for admins to set this up with some
>>> language bindings? In that case, keep in mind that bindings are not
>>> always installed. Svnlook usually is always there.
>>
>>
>> The problem is probably that svnlook is at a hefty
>> disadvantage over e.g. the server because it has
>> to open the FSFS repo anew for each access. Don't
>> know really what to do about that.
>
> So the only way to get the "relevant property changes" really fast
> with svnlook, would be with a command that gets everything with one
> invocation, I guess.
>
> Actually, there is probably a way to get all those propchanges
> quicker, by parsing the output of 'svnlook diff -t TXN'. But I've not
> done the effort to do that.

As suggested above, the bindings are fast:

$ svnadmin create repo
$ svn import -mm ../src file://`pwd`/repo/trunk   # import Subversion trunk
$ svn co file://`pwd`/repo wc
$ svn ps -R svn:mime-type foo/bar wc
$ svn ci -mm wc
$ ./x.py repo 2 | wc
   2270    2270  141635

Getting the mime-type for 2270 files takes less than 1 second.


$ cat x.py
#!/usr/bin/python
import sys
from svn import repos, fs
r=repos.open(sys.argv[1])
f=repos.fs(r)
#t=fs.open_txn(f, sys.argv[2])
t=fs.revision_root(f, int(sys.argv[2]))
p=fs.paths_changed2(t)
for key, val in p.iteritems():
  print key + ":" + str(fs.node_prop(t, key, 'svn:mime-type'))



-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: Merge policies

Posted by Johan Corveleyn <jc...@gmail.com>.

On Fri, Apr 20, 2012 at 10:52 PM, Stefan Fuhrmann <eq...@web.de> wrote:
> Am 19.04.2012 14:03, schrieb Johan Corveleyn:

[snip]

>> I haven't read your mail in detail yet, but just a note in passing: in
>> my experience (with a 1.5 server with FSFS-on-NFS backend) propgetting
>> ('svnlook propget') every changed file during pre-commit can be very
>> expensive.

[snip]

> Random thought: is it possible to link the transactions
> folder to /dev/shm even if the repository is on NFS?

Ah yes, thanks for the suggestion. I should try that someday. Right
now we're working on moving our repo to a newer server, so it'll take
me a while to experiment with this.

While we're on the subject: do you know an easy way to do this on
Windows? It's possible that we'll be moving our server to Windows
(from Solaris). Not sure yet, it depends on some testing /
benchmarking.

The reason is that I can probably get faster I/O from our NAS when
running the server on Windows, because there I can work with iSCSI
(which is currently not supported by our sysadmins on Solaris). And I
think iSCSI will get me better performance than NFS (there is also a
third option: Solaris on ESX, which makes Solaris talk to the storage
as if it's a local disk (but underlying it's still NFS I think) ...
don't know whether that makes it faster). To be benchmarked ...

>> - maybe propgetting through one of the bindings is much faster than
>> through svnlook, and you intend for admins to set this up with some
>> language bindings? In that case, keep in mind that bindings are not
>> always installed. Svnlook usually is always there.
>
>
> The problem is probably that svnlook is at a hefty
> disadvantage over e.g. the server because it has
> to open the FSFS repo anew for each access. Don't
> know really what to do about that.

So the only way to get the "relevant property changes" really fast
with svnlook, would be with a command that gets everything with one
invocation, I guess.

Actually, there is probably a way to get all those propchanges
quicker, by parsing the output of 'svnlook diff -t TXN'. But I've not
done the effort to do that.

Besides, by invoking diff (and discarding the actuall text diffs) a
lot of needless work is done (and think of those edge cases where our
internal diff can get really slow), and needless bytes are being
written/read.

Maybe we'd need a props-only-diff, which gives you only the prop
changes? In some easily parseable way ...

Another alternative would be to make 'svnlook propget' support
multiple targets, so you can just propget all paths from 'svnlook
changed' (possibly trimmed to the Adds and PropMods) in one go.

-- 
Johan

Re: Merge policies

Posted by Stefan Fuhrmann <eq...@web.de>.

Am 19.04.2012 14:03, schrieb Johan Corveleyn:
> On Thu, Apr 19, 2012 at 12:48 PM, Stefan Fuhrmann
> <st...@alice-dsl.de>  wrote:
>> Hi all,
>>
>> After having a closer look at merge and discussing it
>> with Julian on IRC, there seems to be no silver bullet.
>> However, we identified a few things that could be changed
>> and set of constellations that make merge harder than
>> it needs to be.
>>
>> For the first, there will be another post soon. The second
>> boils down to policy. Luckily, SVN has a mechanism to
>> enforce policies: server-side hook scripts. My proposal
>> is to develop a small set of scripts that a user can
>> combine to prevent situations that her life harder than
>> necessary. This should give us enough time to improve
>> the merge logic inside the SVN libs.
>>
>> The following pre-commit scripts / policies would be useful.
>>
>> * Common parts [not a policy]
>>   We first check whether the commit contains a changed
>>   svn:merge-info property. This limits the performance
>>   impact on non-merge commits and we need to identify
>>   all changed svn:merge-info anyway.
>>
>>   Also, the merges that happened on the source branch
>>   from a different location than the target branch are
>>   of no interest for the policy checkers. E.g.:
>>
>>   r20: merge r19 from ^/sub-branch to ^/branch
>>   txn: merge r10-20 from ^/branch to ^/trunk
>>   Both merges will show up in the merge-info delta but
>>   we only need to evaluate the second one.
> I haven't read your mail in detail yet, but just a note in passing: in
> my experience (with a 1.5 server with FSFS-on-NFS backend) propgetting
> ('svnlook propget') every changed file during pre-commit can be very
> expensive. I ran into this with a script for enforcing our autoprops
> (like svn:eol-style=native on some files), with a combination of
> 'svnlook changed' and 'svnlook propget'. On our (admittedly not very
> fast) server, users started running into client-side timeouts because
> of this, when they committed 300 or so changed files (which happens
> easily when syncing branches). The default client-side timeout for
> neon is 30 seconds (which can be increased in the 'servers' file), and
> if the pre-commit hook takes longer than that, you're screwed.
>
> So I've optimized the pre-commit validation script to only 'propget'
> files from 'svnlook changed' which were either Added ('A'), or which
> had property modifications ('*U'). So modified files and deleted files
> don't have to be propgotten. I suppose this set of pre-commit scripts
> can use the same optimization, so it only scales with the amount of
> added and prop-changed files/dirs in the transaction.
>
> But even then, someday a user comes along that wants to add 60.000
> files (really). He's willing to split it up into commits of say 5000
> files each. The pre-commit hook takes approx 2 hours to validate this,
> so he has to set his client-side timeout very high while doing this
> (after first losing a couple of days before he figures out it's time
> to contact an admin).

Thanks for giving some hard performance numbers and
your approach to mitigate bottlenecks. As you know, I'm
constantly working on eliminating those ...
> Maybe some of this is a moot point, because:
>
> - svn+FSFS+NFS might have gotten much faster at getting props??

Random thought: is it possible to link the transactions
folder to /dev/shm even if the repository is on NFS?
> - maybe propgetting through one of the bindings is much faster than
> through svnlook, and you intend for admins to set this up with some
> language bindings? In that case, keep in mind that bindings are not
> always installed. Svnlook usually is always there.

The problem is probably that svnlook is at a hefty
disadvantage over e.g. the server because it has
to open the FSFS repo anew for each access. Don't
know really what to do about that.
> - maybe you just expect the infrastructure to be much faster?
>
>
> The problem is mainly that 'svnlook propget' doesn't support recursive
> propgetting, nor getting all props from the entire transation, or ...
> only the props for a single path. Which means you have to invoke it
> separately for every item that needs to be inspected. Maybe that can
> be improved somehow?
>
> Anyway, these are just some slightly OT random thoughts.
>
Still a good post ;)

-- Stefan^2.

Re: Merge policies

Posted by Johan Corveleyn <jc...@gmail.com>.

On Thu, Apr 19, 2012 at 12:48 PM, Stefan Fuhrmann
<st...@alice-dsl.de> wrote:
> Hi all,
>
> After having a closer look at merge and discussing it
> with Julian on IRC, there seems to be no silver bullet.
> However, we identified a few things that could be changed
> and set of constellations that make merge harder than
> it needs to be.
>
> For the first, there will be another post soon. The second
> boils down to policy. Luckily, SVN has a mechanism to
> enforce policies: server-side hook scripts. My proposal
> is to develop a small set of scripts that a user can
> combine to prevent situations that her life harder than
> necessary. This should give us enough time to improve
> the merge logic inside the SVN libs.
>
> The following pre-commit scripts / policies would be useful.
>
> * Common parts [not a policy]
>  We first check whether the commit contains a changed
>  svn:merge-info property. This limits the performance
>  impact on non-merge commits and we need to identify
>  all changed svn:merge-info anyway.
>
>  Also, the merges that happened on the source branch
>  from a different location than the target branch are
>  of no interest for the policy checkers. E.g.:
>
>  r20: merge r19 from ^/sub-branch to ^/branch
>  txn: merge r10-20 from ^/branch to ^/trunk
>  Both merges will show up in the merge-info delta but
>  we only need to evaluate the second one.

I haven't read your mail in detail yet, but just a note in passing: in
my experience (with a 1.5 server with FSFS-on-NFS backend) propgetting
('svnlook propget') every changed file during pre-commit can be very
expensive. I ran into this with a script for enforcing our autoprops
(like svn:eol-style=native on some files), with a combination of
'svnlook changed' and 'svnlook propget'. On our (admittedly not very
fast) server, users started running into client-side timeouts because
of this, when they committed 300 or so changed files (which happens
easily when syncing branches). The default client-side timeout for
neon is 30 seconds (which can be increased in the 'servers' file), and
if the pre-commit hook takes longer than that, you're screwed.

So I've optimized the pre-commit validation script to only 'propget'
files from 'svnlook changed' which were either Added ('A'), or which
had property modifications ('*U'). So modified files and deleted files
don't have to be propgotten. I suppose this set of pre-commit scripts
can use the same optimization, so it only scales with the amount of
added and prop-changed files/dirs in the transaction.

But even then, someday a user comes along that wants to add 60.000
files (really). He's willing to split it up into commits of say 5000
files each. The pre-commit hook takes approx 2 hours to validate this,
so he has to set his client-side timeout very high while doing this
(after first losing a couple of days before he figures out it's time
to contact an admin).

Maybe some of this is a moot point, because:

- svn+FSFS+NFS might have gotten much faster at getting props??

- maybe propgetting through one of the bindings is much faster than
through svnlook, and you intend for admins to set this up with some
language bindings? In that case, keep in mind that bindings are not
always installed. Svnlook usually is always there.

- maybe you just expect the infrastructure to be much faster?

The problem is mainly that 'svnlook propget' doesn't support recursive
propgetting, nor getting all props from the entire transation, or ...
only the props for a single path. Which means you have to invoke it
separately for every item that needs to be inspected. Maybe that can
be improved somehow?

Anyway, these are just some slightly OT random thoughts.

-- 
Johan

Re: Merge policies

Posted by Stefan Fuhrmann <st...@alice-dsl.de>.

Am 19.04.2012 16:02, schrieb Julian Foad:
> Stefan Fuhrmann wrote:
>
>> The following pre-commit scripts / policies would be useful.
>>
>> * Common parts [not a policy]
>>    We first check whether the commit contains a changed
>>    svn:merge-info property. This limits the performance
>>    impact on non-merge commits and we need to identify
>>    all changed svn:merge-info anyway.
>>
>>    Also, the merges that happened on the source branch
>>    from a different location than the target branch are
>>    of no interest for the policy checkers. E.g.:
>>
>>    r20: merge r19 from ^/sub-branch to ^/branch
>>    txn: merge r10-20 from ^/branch to ^/trunk
>>    Both merges will show up in the merge-info delta but
>>    we only need to evaluate the second one.
> It took me a few minutes to realize that you mean
Sorry for that! I often work out details of my ideas
while I'm writing them down. So, the presentation
isn't as refined as it could be ...

> the hook is analyzing a commit to ^/trunk and is seeing mergeinfo additions that indicate both a merge from ^/sub-branch and a merge from ^/branch.  And in this example the merge from ^/sub-branch was in fact a merge to ^/branch, which was committed in r20, and which is now propagating to ^/trunk.
>
> In order for the hook to figure that out, starting from just looking at the mergeinfo change (which doesn't say what the target of each merge was), it has to examine revisions 10-20 of ^branch and notice that one of those revisions (r20, in fact) was the other merge that it's considering.
>
> That's OK, it can do that; I'm just clarifying how it can figure out which merges "happened on the source branch".

And it might get costly. But that remains to be seen.
> I'll assume the admin can configure each of these policies to apply differently on different branches.  That's mostly just a matter of inventing a suitable configuration scheme, presumably with branch name pattern matching or similar, and that would be totally necessary in a large project.

Agreed. There could also be user-based exceptions,
i.e. experts might get full freedom as a fallback option.
>> * No distributive merges
>>    For each path being merged (i.e. having a merge-info
>>    delta), the relative paths in source and target must
>>    correspond (i.e. start as the same and then may get
>>    renamed etc.). This is basically the same as the
>>    "sophisticated" part of the check for strict merges.
> In other words: in any subtree merge (that is, when a subtree of the branch root is being merged separately or differently from its parent or the branch root), the 'source-left' subtree and the 'target' subtree must be traceable back to a common ancestor.  In the absence of any renames or replacements inside either branch, this would mean that the two subtrees have the same relpath within each respective branch.
>
> What we're rejecting here is any attempt such as:
>
>    mkdir trunk/subtree1 trunk/subtree2
>    copy  trunk branch
>    merge "trunk/subtree1" into "branch/subtree2"

Correct. And some users might do that accidentally
depending on the size / complexity of the source tree.
> Agreed, that policy is both almost universally useful, and is required from a functional point of view when we're doing "automatic" merges because the merge code needs to be able to trace them toward a common ancestor.
>
> So I think such a merge would be rejected by the current (1.7) automatic-merge code already.  (Haven't tried.)
>
> Note: We also support a non-automatic merge, called a "2-URL" merge, where merge tracking is not attempted, and that (as I understand it) may allow such a merge.

Basically a merge between unrelated paths.
>> * No criss-crossing
>>    Prevent situations like the criss-cross examples here:
>>    http://wiki.apache.org/subversion/SymmetricMerge
>>
>>    For a merge A->B, abort if there has been a merge
>>    B->A after the last revision of A to be merged to B.
> Agreed.  I believe getting into a criss-cross situation would very rarely be intentional and would in general make the next merge more difficult than it would otherwise be.
>
>>    This only valid for non-cherry-picking merges and
>>    only if the change sets of both merges overlap.
> Yes; I haven't fully grokked the detail, but I agree there's no need to stop cherry-picks crossing over.
>
>> Except for the last one those checks will simply verify
>> that the user followed certain policies. They should,
>> therefore, rarely reject a commit.
> Why do you suggest "no criss-crossing" is an exception?  It seems also to be just verifying that the user followed certain policies.

Criss-crossing is hard to prevent by pure policy.
As soon as more than one user can do merges
(e.g. one pulling in from the branch while another
updates that branch), those processes may overlap
in time. That causes a criss-crossing merge graph.

-- Stefan^2.

Re: Merge policies

Posted by Julian Foad <ju...@btopenworld.com>.

Stefan Fuhrmann wrote:

> The following pre-commit scripts / policies would be useful.
> 
> * Common parts [not a policy]
>   We first check whether the commit contains a changed
>   svn:merge-info property. This limits the performance
>   impact on non-merge commits and we need to identify
>   all changed svn:merge-info anyway.
> 
>   Also, the merges that happened on the source branch
>   from a different location than the target branch are
>   of no interest for the policy checkers. E.g.:
> 
>   r20: merge r19 from ^/sub-branch to ^/branch
>   txn: merge r10-20 from ^/branch to ^/trunk
>   Both merges will show up in the merge-info delta but
>   we only need to evaluate the second one.

It took me a few minutes to realize that you mean the hook is analyzing a commit to ^/trunk and is seeing mergeinfo additions that indicate both a merge from ^/sub-branch and a merge from ^/branch.  And in this example the merge from ^/sub-branch was in fact a merge to ^/branch, which was committed in r20, and which is now propagating to ^/trunk.

In order for the hook to figure that out, starting from just looking at the mergeinfo change (which doesn't say what the target of each merge was), it has to examine revisions 10-20 of ^branch and notice that one of those revisions (r20, in fact) was the other merge that it's considering.

That's OK, it can do that; I'm just clarifying how it can figure out which merges "happened on the source branch".

I'll assume the admin can configure each of these policies to apply differently on different branches.  That's mostly just a matter of inventing a suitable configuration scheme, presumably with branch name pattern matching or similar, and that would be totally necessary in a large project.

> * Strict merge hierarchy
>   A merge from A->B is only allowed, if the copy-from
>   of A is B or vice versa and the copy source has not
>   been replaced since the copy). This prevents circular
>   merges and others (note 1).

Agreed, that policy would often be useful, even in that simple form ...

>   In a more sophisticated implementation, we could identify /
>   allow for renamed branches as well as A and B having
>   the same relative path to some parents that form a
>   direct branch (i.e. allow sub-tree merges).

... and very widely useful with those additions.

> * No sub-tree merges
>   Like the above but without the check for parents.

Agreed: that policy would often be useful.

> * No aggregate merges
>   There must only be one source branch, i.e. we can't
>   merge from branches A and B to C in the same revision.

Agreed: that policy would often be useful.

> * No distributive merges
>   For each path being merged (i.e. having a merge-info
>   delta), the relative paths in source and target must
>   correspond (i.e. start as the same and then may get
>   renamed etc.). This is basically the same as the
>   "sophisticated" part of the check for strict merges.

In other words: in any subtree merge (that is, when a subtree of the branch root is being merged separately or differently from its parent or the branch root), the 'source-left' subtree and the 'target' subtree must be traceable back to a common ancestor.  In the absence of any renames or replacements inside either branch, this would mean that the two subtrees have the same relpath within each respective branch.

What we're rejecting here is any attempt such as:

  mkdir trunk/subtree1 trunk/subtree2
  copy  trunk branch
  merge "trunk/subtree1" into "branch/subtree2"

Agreed, that policy is both almost universally useful, and is required from a functional point of view when we're doing "automatic" merges because the merge code needs to be able to trace them toward a common ancestor.

So I think such a merge would be rejected by the current (1.7) automatic-merge code already.  (Haven't tried.)

Note: We also support a non-automatic merge, called a "2-URL" merge, where merge tracking is not attempted, and that (as I understand it) may allow such a merge.

> * No cherry picking
>   Check that the source branch does not contain revisions
>   that lie before the last to-be-merged revision but
>   have neither been merged before nor are being merged
>   right now.

Agreed: that's sometimes useful. Maybe not so often as most of the other suggestions.

> * No criss-crossing
>   Prevent situations like the criss-cross examples here:
>   http://wiki.apache.org/subversion/SymmetricMerge
> 
>   For a merge A->B, abort if there has been a merge
>   B->A after the last revision of A to be merged to B.

Agreed.  I believe getting into a criss-cross situation would very rarely be intentional and would in general make the next merge more difficult than it would otherwise be.

>   This only valid for non-cherry-picking merges and
>   only if the change sets of both merges overlap.

Yes; I haven't fully grokked the detail, but I agree there's no need to stop cherry-picks crossing over.

> Except for the last one those checks will simply verify
> that the user followed certain policies. They should,
> therefore, rarely reject a commit.

Why do you suggest "no criss-crossing" is an exception?  It seems also to be just verifying that the user followed certain policies.

> Again, the user shall be free to combine (or not use)
> these policies although not all combinations are meaningful.
> 
> Thoughts?
> 
> -- Stefan^2.
> 
> 
> Note 1:
> 
>   One thing that we might want to support is integration
>   branches where a temporary branch is being used as
>   an intermediate merge target:
> 
>   integrate A->B as
>   rN: copy B->A_integration
>   rN+1: merge A->A_integration
>   rN+x: ... various changes on A_integration
>   rN+y: merge A_integration->B
>   rN+y+1: delete A_integration

That's a classic case of merging among three branches.  After doing that integration, the user will then expect to be able to merge A->B and B->A, and later do another similar integration via a new branch, B -> A_integration_2 -> A.

The only way in which I can see this scenario can be simpler than the general case, is that it would be reasonable to disallow A->B and B->A merges during the time while B->integ->A is happening.

>   These checks become more complicated, requires
>   naming conventions for the integration branches etc.

- Julian

Re: Merge policies

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.

Paul Burba wrote on Wed, May 02, 2012 at 11:42:18 -0400:
> And for some reason the spacing of ASCII diagram I posted in [2] is
> wrong on svn.haxx.se, it's a lot easier to understand here:
> 

svn.haxx.se gets the spacing correctly in the HTML source view, just not
in the rendered view.

(Also, the HTML source has the message-ID somewhere --- which facilitates
constructing the URL to the same message in mail-archives.a.o and gmane :))

> http://mail-archives.apache.org/mod_mbox/subversion-users/201011.mbox/%3CAANLkTin0xJwxg78rU-83QafOt-bcfgML_pB2AHWt2z1s@mail.gmail.com%3E

Re: Merge policies

Posted by Paul Burba <pt...@gmail.com>.

On Wed, May 2, 2012 at 11:24 AM, Johan Corveleyn <jc...@gmail.com> wrote:
> On Thu, Apr 19, 2012 at 12:48 PM, Stefan Fuhrmann
> <st...@alice-dsl.de> wrote:
>
> [snip]
>
>> The following pre-commit scripts / policies would be useful.
>>
>> * Common parts [not a policy]
>>  We first check whether the commit contains a changed
>>  svn:merge-info property. This limits the performance
>>  impact on non-merge commits and we need to identify
>>  all changed svn:merge-info anyway.
>>
>>  Also, the merges that happened on the source branch
>>  from a different location than the target branch are
>>  of no interest for the policy checkers. E.g.:
>>
>>  r20: merge r19 from ^/sub-branch to ^/branch
>>  txn: merge r10-20 from ^/branch to ^/trunk
>>  Both merges will show up in the merge-info delta but
>>  we only need to evaluate the second one.
>>
>> * Strict merge hierarchy
>>  A merge from A->B is only allowed, if the copy-from
>>  of A is B or vice versa and the copy source has not
>>  been replaced since the copy). This prevents circular
>>  merges and others (note 1).
>>
>>  In a more sophisticated implementation, we could identify /
>>  allow for renamed branches as well as A and B having
>>  the same relative path to some parents that form a
>>  direct branch (i.e. allow sub-tree merges).
>>
>> * No sub-tree merges
>>  Like the above but without the check for parents.
>
> I just remembered a users@ discussion from way back: "URL-only renames
> adds svn:mergeinfo property" [1]. Paul gave a very interesting
> explanation in that thread as to why URL-[WC|URL] copies/moves
> currently generate explicit mergeinfo [2] (has something to do with
> the fact that merge tracking doesn't "follow" copies/moves).

And that explanation still describes how I feel on that matter, i.e.
ambivalent and open to other alternatives.

And for some reason the spacing of ASCII diagram I posted in [2] is
wrong on svn.haxx.se, it's a lot easier to understand here:

http://mail-archives.apache.org/mod_mbox/subversion-users/201011.mbox/%3CAANLkTin0xJwxg78rU-83QafOt-bcfgML_pB2AHWt2z1s@mail.gmail.com%3E

Paul

> Just thought I'd mention this here, because that's one of the ways
> people can currently (inadvertently) generate subtree mergeinfo. This
> can happen when simply performing a server-side rename of a file (for
> instance to perform a case-only rename --- prior to 1.7 this couldn't
> be done client-side on a case-insensitive platform, so this was one of
> the workarounds mentioned in the FAQ).
>
> Don't know if this is relevant to this discussion, so just ignore this
> if it isn't :-).
>
> --
> Johan
>
> [1] http://svn.haxx.se/users/archive-2010-11/0357.shtml
> [2] http://svn.haxx.se/users/archive-2010-11/0466.shtml

Re: Merge policies

Posted by Johan Corveleyn <jc...@gmail.com>.

On Thu, Apr 19, 2012 at 12:48 PM, Stefan Fuhrmann
<st...@alice-dsl.de> wrote:

[snip]

> The following pre-commit scripts / policies would be useful.
>
> * Common parts [not a policy]
>  We first check whether the commit contains a changed
>  svn:merge-info property. This limits the performance
>  impact on non-merge commits and we need to identify
>  all changed svn:merge-info anyway.
>
>  Also, the merges that happened on the source branch
>  from a different location than the target branch are
>  of no interest for the policy checkers. E.g.:
>
>  r20: merge r19 from ^/sub-branch to ^/branch
>  txn: merge r10-20 from ^/branch to ^/trunk
>  Both merges will show up in the merge-info delta but
>  we only need to evaluate the second one.
>
> * Strict merge hierarchy
>  A merge from A->B is only allowed, if the copy-from
>  of A is B or vice versa and the copy source has not
>  been replaced since the copy). This prevents circular
>  merges and others (note 1).
>
>  In a more sophisticated implementation, we could identify /
>  allow for renamed branches as well as A and B having
>  the same relative path to some parents that form a
>  direct branch (i.e. allow sub-tree merges).
>
> * No sub-tree merges
>  Like the above but without the check for parents.

I just remembered a users@ discussion from way back: "URL-only renames
adds svn:mergeinfo property" [1]. Paul gave a very interesting
explanation in that thread as to why URL-[WC|URL] copies/moves
currently generate explicit mergeinfo [2] (has something to do with
the fact that merge tracking doesn't "follow" copies/moves).

Just thought I'd mention this here, because that's one of the ways
people can currently (inadvertently) generate subtree mergeinfo. This
can happen when simply performing a server-side rename of a file (for
instance to perform a case-only rename --- prior to 1.7 this couldn't
be done client-side on a case-insensitive platform, so this was one of
the workarounds mentioned in the FAQ).

Don't know if this is relevant to this discussion, so just ignore this
if it isn't :-).

-- 
Johan

[1] http://svn.haxx.se/users/archive-2010-11/0357.shtml
[2] http://svn.haxx.se/users/archive-2010-11/0466.shtml

Re: Merge policies

Posted by Branko Čibej <br...@apache.org>.

On 19.04.2012 16:02, Paul Burba wrote:
> Just one thing I'm wondering about:
>> This should give us enough time to improve
>> the merge logic inside the SVN libs.
> Which improvements are these exactly?  What is documented in
> http://wiki.apache.org/subversion/SymmetricMerge?action=show&redirect=SvnMergeTheory
> or something more?
>
> Also curious about your mention of "Enough time", is there a deadline
> your group is under?

I missed this part before ... I'd like to point out that a developing a
set of optional hooks that add restrictions to merging will not, in
fact, give you any more time or let you drop support for edge cases from
the merge algorithm (implementation). The key word here is "optional",
so all you actually get is time /spent/ for developing the hooks, you
can never force everyone to install all your restrictive hooks.

-- Brane

Re: Merge policies

Posted by Paul Burba <pt...@gmail.com>.

On Thu, Apr 19, 2012 at 6:48 AM, Stefan Fuhrmann
<st...@alice-dsl.de> wrote:
> Hi all,
>
> After having a closer look at merge and discussing it
> with Julian on IRC, there seems to be no silver bullet.
> However, we identified a few things that could be changed
> and set of constellations that make merge harder than
> it needs to be.
>
> For the first, there will be another post soon. The second
> boils down to policy. Luckily, SVN has a mechanism to
> enforce policies: server-side hook scripts. My proposal
> is to develop a small set of scripts that a user can
> combine to prevent situations that her life harder than
> necessary.

Hi Stefan,

No objections to developing a set of hook scripts to optionally
enforce different merge policies; that seems completely reasonable
(I'm not expressing an opinion one way or the other regarding the
details of the particular scripts).

Just one thing I'm wondering about:

> This should give us enough time to improve
> the merge logic inside the SVN libs.

Which improvements are these exactly?  What is documented in
http://wiki.apache.org/subversion/SymmetricMerge?action=show&redirect=SvnMergeTheory
or something more?

Also curious about your mention of "Enough time", is there a deadline
your group is under?

Paul

> The following pre-commit scripts / policies would be useful.
>
> * Common parts [not a policy]
>  We first check whether the commit contains a changed
>  svn:merge-info property. This limits the performance
>  impact on non-merge commits and we need to identify
>  all changed svn:merge-info anyway.
>
>  Also, the merges that happened on the source branch
>  from a different location than the target branch are
>  of no interest for the policy checkers. E.g.:
>
>  r20: merge r19 from ^/sub-branch to ^/branch
>  txn: merge r10-20 from ^/branch to ^/trunk
>  Both merges will show up in the merge-info delta but
>  we only need to evaluate the second one.
>
> * Strict merge hierarchy
>  A merge from A->B is only allowed, if the copy-from
>  of A is B or vice versa and the copy source has not
>  been replaced since the copy). This prevents circular
>  merges and others (note 1).
>
>  In a more sophisticated implementation, we could identify /
>  allow for renamed branches as well as A and B having
>  the same relative path to some parents that form a
>  direct branch (i.e. allow sub-tree merges).
>
> * No sub-tree merges
>  Like the above but without the check for parents.
>
> * No aggregate merges
>  There must only be one source branch, i.e. we can't
>  merge from branches A and B to C in the same revision.
>
> * No distributive merges
>  For each path being merged (i.e. having a merge-info
>  delta), the relative paths in source and target must
>  correspond (i.e. start as the same and then may get
>  renamed etc.). This is basically the same as the
>  "sophisticated" part of the check for strict merges.
>
> * No cherry picking
>  Check that the source branch does not contain revisions
>  that lie before the last to-be-merged revision but
>  have neither been merged before nor are being merged
>  right now.
>
> * No criss-crossing
>  Prevent situations like the criss-cross examples here:
>  http://wiki.apache.org/subversion/SymmetricMerge
>
>  For a merge A->B, abort if there has been a merge
>  B->A after the last revision of A to be merged to B.
>  This only valid for non-cherry-picking merges and
>  only if the change sets of both merges overlap.
>
> Except for the last one those checks will simply verify
> that the user followed certain policies. They should,
> therefore, rarely reject a commit.
>
> Again, the user shall be free to combine (or not use)
> these policies although not all combinations are meaningful.
>
> Thoughts?
>
> -- Stefan^2.
>
>
> Note 1:
>
>  One thing that we might want to support is integration
>  branches where a temporary branch is being used as
>  an intermediate merge target:
>
>  integrate A->B as
>  rN: copy B->A_integration
>  rN+1: merge A->A_integration
>  rN+x: ... various changes on A_integration
>  rN+y: merge A_integration->B
>  rN+y+1: delete A_integration
>
>  These checks become more complicated, requires
>  naming conventions for the integration branches etc.

Re: Merge policies

Posted by Stefan Fuhrmann <eq...@web.de>.

Julian Foad wrote:
> Stefan,
>
> What I understand you're saying is:
>
> Here are some bits of merging policy, which users quite commonly may wish to enforce for some of their branches.  And in particular these are bits of policy that could perhaps be enforced by means of hook scripts.  If we help to create a bunch of such hooks, then users (and their admins) could more easily set up some merge policy enforcement which would help them to avoid accidents such as a user merging onto a different branch than he thought it was.
>
> Is that right?  Sounds useful to me.  (Haven't studied the details yet.)
>
> - Julian

This is exactly correct. The post was motivated by
people now and then proposing that "if we limited
SVN merge this way or the other, things will be
much simpler". Limiting SVN's capabilities does not
appeal to me but I'd like to offer some guidelines
that people may adopt at their own choosing.

> Branko Čibej wrote:
>
>> On 19.04.2012 12:48, Stefan Fuhrmann wrote:
>>>     Also, the merges that happened on the source branch
>>>     from a different location than the target branch are
>>>     of no interest for the policy checkers. E.g.:
>>>
>>>     r20: merge r19 from ^/sub-branch to ^/branch
>>>     txn: merge r10-20 from ^/branch to ^/trunk
>>>     Both merges will show up in the merge-info delta but
>>>     we only need to evaluate the second one.
>> Ignoring aggregate mergeinfo strikes me as an over-simplification. I
>> can't point my finger at why right now, but the idea makes me
>> uncomfortable ...

I'm not ignoring it. This is rather an implementation
note. The merge from ^/sub-branch into ^/branch will
show up in the merge-info delta of the current merge
attempt from ^/branch to ^/trunk. Since the sub-branch
merge already took place, there is nothing left to be
checked for validity.

>>>   * Strict merge hierarchy
>>>     A merge from A->B is only allowed, if the copy-from
>>>     of A is B or vice versa and the copy source has not
>>>     been replaced since the copy). This prevents circular
>>>     merges and others (note 1).
>> This would prevent merges between sibling branches (e.g., branch A->B
>> and A->C then merge B->C). This situation is rather common, for example,
>> it happens in our release branch maintenance workflow every time we have
>> to create a backport branch for some non-trivial change. You cover that
>> in (note 1) but also mention naming conventions, which makes me want to
>> light fuses on some very big firecrackers. ...

Yes, this is a weak point. A strict branch (and merge)
hierarchy is very useful, in my experience. But there
must be some formalized exception to the rule such that
the rule may still be enforced. One way to do that may
be a user-defined path regex or a limitation to specific
users etc.

>>>   * No sub-tree merges
>>>     Like the above but without the check for parents.
>> The "like the above" had me scratching my head for a while there. I
>> always have to remember that Subversion doesn't really have branches ...
>> never mind. :)

I (partly) disagree. Subversion's semantics of branches
being a sub-set of copies is correct. The only problem
I see is that other sub-sets like moves and splits are
not recognized as such.

>> Other than that, I think this is a reasonable restriction, as long as it
>> doesn't prevent people from merging branches that already have subtree
>> mergeinfo.

The scripts may need to handle / ignore deleted mergeinfos
on sub-tree nodes in that case.

>>>   * No aggregate merges
>>>     There must only be one source branch, i.e. we can't
>>>     merge from branches A and B to C in the same revision.
>> I can't think of a case where splitting such a merge into two commits
>> would change the semantics of the operation, so +1.
>>
>>>   * No distributive merges
>>>     For each path being merged (i.e. having a merge-info
>>>     delta), the relative paths in source and target must
>>>     correspond (i.e. start as the same and then may get
>>>     renamed etc.). This is basically the same as the
>>>     "sophisticated" part of the check for strict merges.
>> I don't understand this case. Can you elaborate with a more detailed
>> example, please?

This covers a case where the user probably accidentally
merged files or sub-trees that don't correspond to each
other. E.g. merge from ^/branch/path/foo into unrelated
file ^/trunk/foo.

Maybe, this should be called "unrelated merges".

>>>   * No cherry picking
>>>     Check that the source branch does not contain revisions
>>>     that lie before the last to-be-merged revision but
>>>     have neither been merged before nor are being merged
>>>     right now.
>> I'm not sure this is a good restriction to have. Cherry-picking is an
>> almost indispensable merging tool. We use it all the time when
>> backporting changes to maintenance branches. Since the merge algorithm
>> will have to account for cherry-picks in any case, what benefit would
>> the user get from enabling such a restriction?

It may simplify future merges, i.e. reduce the risk of
conflicts.

But I agree that this is a very restrictive policy.
Again, I'm not proposing to limit the SVN library
implementation in any way. If the user wants to apply
restrictions, server-side hooks seem to be the way
to go.

-- Stefan^2.

Re: Merge policies

Posted by Julian Foad <ju...@btopenworld.com>.

Stefan,

What I understand you're saying is:

Here are some bits of merging policy, which users quite commonly may wish to enforce for some of their branches.  And in particular these are bits of policy that could perhaps be enforced by means of hook scripts.  If we help to create a bunch of such hooks, then users (and their admins) could more easily set up some merge policy enforcement which would help them to avoid accidents such as a user merging onto a different branch than he thought it was.

Is that right?  Sounds useful to me.  (Haven't studied the details yet.)

- Julian


Branko Čibej wrote:

> On 19.04.2012 12:48, Stefan Fuhrmann wrote:
>>    Also, the merges that happened on the source branch
>>    from a different location than the target branch are
>>    of no interest for the policy checkers. E.g.:
>> 
>>    r20: merge r19 from ^/sub-branch to ^/branch
>>    txn: merge r10-20 from ^/branch to ^/trunk
>>    Both merges will show up in the merge-info delta but
>>    we only need to evaluate the second one.
> 
> Ignoring aggregate mergeinfo strikes me as an over-simplification. I
> can't point my finger at why right now, but the idea makes me
> uncomfortable ...
> 
>>  * Strict merge hierarchy
>>    A merge from A->B is only allowed, if the copy-from
>>    of A is B or vice versa and the copy source has not
>>    been replaced since the copy). This prevents circular
>>    merges and others (note 1).
> 
> This would prevent merges between sibling branches (e.g., branch A->B
> and A->C then merge B->C). This situation is rather common, for example,
> it happens in our release branch maintenance workflow every time we have
> to create a backport branch for some non-trivial change. You cover that
> in (note 1) but also mention naming conventions, which makes me want to
> light fuses on some very big firecrackers. ...
> 
>>  * No sub-tree merges
>>    Like the above but without the check for parents.
> 
> The "like the above" had me scratching my head for a while there. I
> always have to remember that Subversion doesn't really have branches ...
> never mind. :)
> 
> Other than that, I think this is a reasonable restriction, as long as it
> doesn't prevent people from merging branches that already have subtree
> mergeinfo.
> 
>>  * No aggregate merges
>>    There must only be one source branch, i.e. we can't
>>    merge from branches A and B to C in the same revision.
> 
> I can't think of a case where splitting such a merge into two commits
> would change the semantics of the operation, so +1.
> 
>>  * No distributive merges
>>    For each path being merged (i.e. having a merge-info
>>    delta), the relative paths in source and target must
>>    correspond (i.e. start as the same and then may get
>>    renamed etc.). This is basically the same as the
>>    "sophisticated" part of the check for strict merges.
> 
> I don't understand this case. Can you elaborate with a more detailed
> example, please?
> 
>>  * No cherry picking
>>    Check that the source branch does not contain revisions
>>    that lie before the last to-be-merged revision but
>>    have neither been merged before nor are being merged
>>    right now.
> 
> I'm not sure this is a good restriction to have. Cherry-picking is an
> almost indispensable merging tool. We use it all the time when
> backporting changes to maintenance branches. Since the merge algorithm
> will have to account for cherry-picks in any case, what benefit would
> the user get from enabling such a restriction?
> 
> -- Brane
>

Re: Merge policies

Posted by Branko Čibej <br...@apache.org>.

On 19.04.2012 12:48, Stefan Fuhrmann wrote:
>   Also, the merges that happened on the source branch
>   from a different location than the target branch are
>   of no interest for the policy checkers. E.g.:
>
>   r20: merge r19 from ^/sub-branch to ^/branch
>   txn: merge r10-20 from ^/branch to ^/trunk
>   Both merges will show up in the merge-info delta but
>   we only need to evaluate the second one.

Ignoring aggregate mergeinfo strikes me as an over-simplification. I
can't point my finger at why right now, but the idea makes me
uncomfortable ...

> * Strict merge hierarchy
>   A merge from A->B is only allowed, if the copy-from
>   of A is B or vice versa and the copy source has not
>   been replaced since the copy). This prevents circular
>   merges and others (note 1).

This would prevent merges between sibling branches (e.g., branch A->B
and A->C then merge B->C). This situation is rather common, for example,
it happens in our release branch maintenance workflow every time we have
to create a backport branch for some non-trivial change. You cover that
in (note 1) but also mention naming conventions, which makes me want to
light fuses on some very big firecrackers. ...

> * No sub-tree merges
>   Like the above but without the check for parents.

The "like the above" had me scratching my head for a while there. I
always have to remember that Subversion doesn't really have branches ...
never mind. :)

Other than that, I think this is a reasonable restriction, as long as it
doesn't prevent people from merging branches that already have subtree
mergeinfo.

> * No aggregate merges
>   There must only be one source branch, i.e. we can't
>   merge from branches A and B to C in the same revision.

I can't think of a case where splitting such a merge into two commits
would change the semantics of the operation, so +1.

> * No distributive merges
>   For each path being merged (i.e. having a merge-info
>   delta), the relative paths in source and target must
>   correspond (i.e. start as the same and then may get
>   renamed etc.). This is basically the same as the
>   "sophisticated" part of the check for strict merges.

I don't understand this case. Can you elaborate with a more detailed
example, please?

> * No cherry picking
>   Check that the source branch does not contain revisions
>   that lie before the last to-be-merged revision but
>   have neither been merged before nor are being merged
>   right now.

I'm not sure this is a good restriction to have. Cherry-picking is an
almost indispensable merging tool. We use it all the time when
backporting changes to maintenance branches. Since the merge algorithm
will have to account for cherry-picks in any case, what benefit would
the user get from enabling such a restriction?

-- Brane