You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Paul Burba <pt...@gmail.com> on 2013/02/27 22:21:50 UTC

Problems Reintegrating the fsfs-format7 branch (Was: FSFS format7 status and first results)

(Stefan - If you don't have time to read all this please at least take
a look at the short questions at the very end)

On Mon, Feb 18, 2013 at 11:54 AM, Mark Phippard <ma...@gmail.com> wrote:

> BTW, how are you managing your branch?  I tried merging it back to
> trunk to get an idea on the diff and there were a lot of text and tree
> conflicts.  I thought I had seen you doing synch merges from trunk in
> the past so I assumed it would merge back.

On Thu, Feb 21, 2013 at 5:05 AM, Stefan Fuhrmann
<st...@wandisco.com> wrote:

> Hm. I split fsfs.c and .h into multiple files on the
> branch and the first merge from /trunk required
> significant manual intervention. Since that, merges
> have been clean because fsfs.* wasn't touched
> on /trunk.
>
> If I understand Julian's merge changes in 1.8,
> reintegrating should work because there has been
> no cherry picking etc.

On Thu, Feb 21, 2013 at 11:04 AM, Mark Phippard <ma...@gmail.com> wrote:

> I see this when using 1.8:
>
> $ svn mergeinfo ^/subversion/branches/fsfs-format7
>     youngest common ancestor
>     |         last full merge
>     |         |        tip of branch
>     |         |        |         repository path
>
>     1414755            1448697
>     |                  |
>        --| |------------         subversion/branches/fsfs-format7
>       /
>      /
>   -------| |------------         subversion/trunk
>                        |
>                        1447423
>
> It seems to imply that it does not think you have ever synched with
> trunk.  So maybe it is trying to replay every revision from your
> branch when I merge back and that is why it gets so many conflicts?

I found the cause of the conflict filled reintegrate merge.  The
automatic merge code seems to be doing the right thing re Mark's
automatic merge above, the problem arises earlier.

First, here is the relationship between the two branches in question:

trunk@1414755------------@1442089---@r1445080------------>
       |                    |           |           ^
      copy                 sync        sync         |
       |                   merge       merge   conflict-filled
       |                    |           |         automatic
       |                    |           |          merge
       |                    |           |           |
       V                    V           V           |
   fsfs-format7@1414757---r1442344---r1445479------------>

The first problem is that the "sync" merge in r1442344 recorded the
wrong mergeinfo:

>svn log ^^/subversion/branches/fsfs-format7 -r1442344
------------------------------------------------------------------------
r1442344 | stefan2 | 2013-02-04 15:48:05 -0500 (Mon, 04 Feb 2013) | 2 lines

On the fsfs-format7: ketchup merge from /trunk up to and including r1442089.
Some conflicts due to splitting up fs_fs.c needed to be resolved.
------------------------------------------------------------------------

>svn diff --depth empty ^^/subversion/branches/fsfs-format7 -c1442344
Index: .
===================================================================
--- .   (revision 1442343)
+++ .   (revision 1442344)

Property changes on: .
___________________________________________________________________
Modified: svn:ignore
## -49,3 +49,6 ##
 zlib
 sqlite-amalgamation
 serf
+gtest
+.git
+.gitignore
Modified: svn:mergeinfo
   Merged /subversion/branches/issue-4116-dev:r1424719-1425040
   Merged /subversion/trunk:r1414758-1442089
                             ^^^^^^^
                             Why not r1414756?
                             That is the first revision on trunk after
                             the branch was created.

   Merged /subversion/branches/tweak-build-take-two:r1424288-1425049,1425051-1425613
   Merged /subversion/branches/in-repo-authz:r1414342-1424779
   Merged /subversion/branches/issue-4194-dev:r1410507-1414880
   Merged /subversion/branches/wc-collate-path:r1407642

The mergeinfo recorded makes it look like a big cherrypick of
r1414758-1442089 was done from ^/subversion/trunk to
^/subversion/branches/fsfs-format7, rather than a proper sync merge.
Well, not to worry, the subsequent sync merge committed in r1445479
will notice that /subversion/trunk:r1414756-1414757 has *not* been
merged and merge those missing revisions -- they are inoperative but
that doesn't matter, the mergeinfo should still be recorded.  Except
that didn't happen either:

>svn diff --depth empty ^^/subversion/branches/fsfs-format7 -c1445479
Index: .
===================================================================
--- .   (revision 1445478)
+++ .   (revision 1445479)

Property changes on: .
___________________________________________________________________
Modified: svn:mergeinfo
   Merged /subversion/trunk:r1442090-1445080

So why is this a problem?  Because later, when we try to merge the
branch back to trunk, the automatic merge code is trying to determine
if it is dealing with a "sync" or "reintegrate" style merge, it sees
that there is a gap in the mergeinfo on the source branch (i.e.
/subversion/trunk:r1414756-1414757) that makes it appear that the
branch is *not* fully in sync with the trunk.  Not-in-sync means no
reintegrate-style merge, but rather a normal sync style merge, which
(just like 1.7 if the --reintegrate option isn't used) doesn't do what
we want.  It tries to merge all of the branches changes, resulting in
dozens of tree conflicts.

So that is the cause of the problem Mark observed, but why did the
sync merges Stefan performed in r1442344 and 1445479 do the wrong
thing?  Repeating those two offending merges I am not able to
replicate the faulty mergeinfo in either case:

The first sync merge done in r1442344:

C:\SVN\src-branch-fsfs-format7>svn info
Path: .
Working Copy Root Path: C:\SVN\src-branch-fsfs-format7
URL: https://svn.apache.org/repos/asf/subversion/branches/fsfs-format7
Relative URL: ^/subversion/branches/fsfs-format7
Repository Root: https://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 1442343
Node Kind: directory
Schedule: normal
Last Changed Author: stefan2
Last Changed Rev: 1442081
Last Changed Date: 2013-02-04 06:27:56 -0500 (Mon, 04 Feb 2013)

C:\SVN\src-branch-fsfs-format7>svn st

C:\SVN\src-branch-fsfs-format7>svn merge ^^/subversion/trunk . --accept postpone
--- Merging r1414756 through r1450917 into '.':
U    get-deps.sh
U    Makefile.in
U    build.conf
.
<SNIP>
.
U    autogen.sh
U    aclocal.m4
D    packages
 U   .
--- Recording mergeinfo for merge of r1414756 through r1450917 into '.':
 G   .
Summary of conflicts:
  Text conflicts: 5
  Tree conflicts: 1

C:\SVN\src-branch-fsfs-format7>svn diff --depth empty
Index: .
===================================================================
--- .   (revision 1442343)
+++ .   (working copy)

Property changes on: .
___________________________________________________________________
Modified: svn:mergeinfo
   Merged /subversion/branches/issue-4116-dev:r1424719-1425040
   Merged /subversion/trunk:r1414756-1450917
                             ^^^^^^^
                             That's right! The mergeinfo picks up right
                             after the yca.
   Merged /subversion/branches/tweak-build-take-two:r1424288-1425049,1425051-1425613
   Merged /subversion/branches/in-repo-authz:r1414342-1424779
   Merged /subversion/branches/issue-4194-dev:r1410507-1414880
   Merged /subversion/branches/wc-collate-path:r1407642
Modified: svn:ignore
## -49,3 +49,6 ##
 zlib
 sqlite-amalgamation
 serf
+gtest
+.git
+.gitignore


The second merge in r1445479, which should have filled in the
mergeinfo gap, also works when I tried it:

C:\SVN\src-branch-fsfs-format7>svn info
Path: .
Working Copy Root Path: C:\SVN\src-branch-fsfs-format7
URL: https://svn.apache.org/repos/asf/subversion/branches/fsfs-format7
Relative URL: ^/subversion/branches/fsfs-format7
Repository Root: https://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 1445478
Node Kind: directory
Schedule: normal
Last Changed Author: stefan2
Last Changed Rev: 1445080
Last Changed Date: 2013-02-12 05:02:13 -0500 (Tue, 12 Feb 2013)


C:\SVN\src-branch-fsfs-format7>svn st

C:\SVN\src-branch-fsfs-format7>svn merge ^^/subversion/trunk --accept postpone
--- Merging r1442090 through r1450954 into '.':
U    get-deps.sh
U    Makefile.in
U    build.conf
.
<SNIP>
.
U    INSTALL
U    CHANGES
D    packages
--- Recording mergeinfo for merge of r1414756 through r1450954 into '.':
 U   .
Summary of conflicts:
  Text conflicts: 1

C:\SVN\src-branch-fsfs-format7>svn diff --depth empty
Index: .
===================================================================
--- .   (revision 1445478)
+++ .   (working copy)

Property changes on: .
___________________________________________________________________
Modified: svn:mergeinfo
   Merged /subversion/trunk:r1414756-1414757,1442090-1450954
                             ^^^^^^^^^^^^^^^
                             The gap is plugged!

So what happened?  There are a few options that I can see:

1) Transient mergeinfo recording bug that afflicted the client that
Stefan used to do the merges (I don't recall anything like this
recently and this is pretty basic stuff that has been in place since
1.6).

2) Stefan didn't actually do a sync merge, but rather did a cherry
pick, e.g. 'svn merge ^/subversion/trunk branch-wc -r1414757:HEAD'.
Stefan - do you recall how you did these merges?  How about what
client version you used?  Did you use --allow-mixed-revisions by
chance

3) Some as of yet undiscovered bug

-- 
Paul T. Burba
CollabNet, Inc. -- www.collab.net -- Enterprise Cloud Development
Skype: ptburba

P.S. Stefan, In the meantime I suggest you do the following record
only merge to plug the gap, so when your branch is ultimately merged
back to trunk it will be possible via a simple automatic merge:

svn merge ^/subversion/trunk --record-only -r1414755:1414757

Re: Problems Reintegrating the fsfs-format7 branch

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Thu, Mar 21, 2013 at 4:25 PM, Paul Burba <pt...@gmail.com> wrote:

> fyi http://subversion.tigris.org/issues/show_bug.cgi?id=4329#desc7
> explains my proposed patch in terms of how it changes the symmetric
> merge algorithm.
>
>
Without reviewing the actual patch, I'm absolutely +1
with what it tries to achieve according to your description
in that issue.

-- Stefan^2.

-- 
*Join one of our free daily demo sessions on* *Scaling Subversion for the
Enterprise <http://www.wandisco.com/training/webinars>*
*

*

Re: Problems Reintegrating the fsfs-format7 branch

Posted by Julian Foad <ju...@btopenworld.com>.
Paul Burba wrote:
> On Fri, Mar 22, 2013 at 2:39 PM, Julian Foad wrote:
>> Paul Burba wrote:
>>> On Fri, Mar 8, 2013 at 4:17 PM, Paul Burba <pt...@gmail.com> wrote:
>>>> The attached patch accomplishes, let's call it "2.5", because it
>>>> handles the more general case I outlined above, but not the case
>>>> represented by issue #4255.  Would you mind taking a look?
>>> 
>>> fyi http://subversion.tigris.org/issues/show_bug.cgi?id=4329#desc7
>>> explains my proposed patch in terms of how it changes the symmetric
>>> merge algorithm.
>> 
>> Hi Paul.  That did help, yes.
>> 
>> It took me a while to get my head back into this stuff, but yes it looks like
>> this hangs together theoretically as well as fixing the immediate use case.
>> 
>> As for the implementation, in find_last_merged_location() the task is to
>> find the last location on SOURCE_BRANCH such that all changes on SOURCE_BRANCH
>> after YCA up to and including *BASE_P have already been merged into the target
>> branch.  The function finds the first eligible revision, considering mergeinfo
>> only on the branch root, and then you add a call to svn_client_mergeinfo_log2()
>> to find the first eligible revision after that, considering subtree mergeinfo
>> and considering only operative changes.  Two questions:
>> 
>>    - Why do you only call svn_client_mergeinfo_log2() if there has been a
>> merge from SOURCE to TARGET recorded at the root of the branch?  That would
>> seem to miss the cases where there have only been subtree merges.
> 
> You are quite right.  This can be done entirely with
> svn_client_mergeinfo_log2() and catch the case where only subtree
> merges have been performed, but are "complete" (do we have an official
> term to describe the situation where a given subtree merge of rX is
> effectively the same as a merge of rX to the branch root?).
> 
>>    - Why not just use svn_client_mergeinfo_log2(), if it answers the
>> question, and delete all the prior code that first looks at just the branch-root
>> mergeinfo?  If the initial look at branch-root mergeinfo is a performance
>> optimization, then it should be inside svn_client_mergeinfo_log2(), I would
>> assume.
> 
> I reworked the patch to do just this and committed it in r1464102.
> 
>> You might consider committing the part that adds a 'limit' argument to a
>> mergeinfo API as a separate change, as it looks like a fairly large part of
>> the patch.  But if we can use svn_client_mergeinfo_log2() to answer the whole
>> question in one go, then it probably won't need the limit argument anyway.
> 
> r1464102 simply raises a SVN_ERR_CEASE_INVOCATION error in the
> svn_log_entry_receiver_t callback.  This, combined with my earlier
> commit (r1463237) to allow to ask for revisions in youngest:oldest or
> oldest:younger order, means that there is no need for the limit code;
> we can call svn_client_mergeinfo_log2() in such a way as to find the
> youngest merged revision or the oldest elibigle revision in the first
> callback.

Fantastic.  Thanks, Paul.

For the record, this resolves the two issues <http://subversion.tigris.org/issues/show_bug.cgi?id=4258> "Automatic merge after subtree merge in opposite direction" and <http://subversion.tigris.org/issues/show_bug.cgi?id=4329> "automatic merge uses reintegrate type merge if source is fully synced".  (You've already updated them, I just want to mention them here.)


> What I've struggled with is here is the (not unexpected) catch: Using
> svn_client_mergeinfo_log2(), rather than simply looking at the
> mergeinfo on the branch roots, may be more correct, but obviously
> takes more time.  During my ad hoc testing with some of our own
> branches, it adds anywhere from 5 to 25 seconds of real time.
> 
> I'm still looking if there are any places we can claw back some of
> that performance loss before 1.8...but after a long history of
> speeding up merge since 1.5, we might have to take a performance hit
> in the name of correctness.

One candidate for optimisation is that I added a check whether the source and target branches are related, which is called from 'svn' before the merge API is called.  This check involves retrieving the location-segment histories of the two branch root dirs.  We then throw away these location histories and retrieve them again later in the merge.  We should store them and re-use them.  (I'm pretty sure they were already being fetched more than once even before I added this relatedness check.)

- Julian

Re: Problems Reintegrating the fsfs-format7 branch

Posted by Paul Burba <pt...@gmail.com>.
On Fri, Mar 22, 2013 at 2:39 PM, Julian Foad <ju...@btopenworld.com> wrote:
> Paul Burba wrote:
>
>> On Fri, Mar 8, 2013 at 4:17 PM, Paul Burba <pt...@gmail.com> wrote:
>>>  On Tue, Mar 5, 2013 at 11:37 AM, Julian Foad wrote:
>>>>  So, what to do exactly?  Options seem to be:
>>>>
>>>>    0) leave it as it is
>>>>
>>>>    1) detect this specific case and warn or error out
>>>>
>>>>    2) detect this specific case and do a reintegrate
>>>
>>>  The attached patch accomplishes, let's call it "2.5", because it
>>>  handles the more general case I outlined above, but not the case
>>>  represented by issue #4255.  Would you mind taking a look?
>>
>> fyi http://subversion.tigris.org/issues/show_bug.cgi?id=4329#desc7
>> explains my proposed patch in terms of how it changes the symmetric
>> merge algorithm.
>
> Hi Paul.  That did help, yes.
>
> It took me a while to get my head back into this stuff, but yes it looks like this hangs together theoretically as well as fixing the immediate use case.
>
> As for the implementation, in find_last_merged_location() the task is to find the last location on SOURCE_BRANCH such that all changes on SOURCE_BRANCH after YCA up to and including *BASE_P have already been merged into the target branch.  The function finds the first eligible revision, considering mergeinfo only on the branch root, and then you add a call to svn_client_mergeinfo_log2() to find the first eligible revision after that, considering subtree mergeinfo and considering only operative changes.  Two questions:
>
>   - Why do you only call svn_client_mergeinfo_log2() if there has been a merge from SOURCE to TARGET recorded at the root of the branch?  That would seem to miss the cases where there have only been subtree merges.

You are quite right.  This can be done entirely with
svn_client_mergeinfo_log2() and catch the case where only subtree
merges have been performed, but are "complete" (do we have an official
term to describe the situation where a given subtree merge of rX is
effectively the same as a merge of rX to the branch root?).

>   - Why not just use svn_client_mergeinfo_log2(), if it answers the question, and delete all the prior code that first looks at just the branch-root mergeinfo?  If the initial look at branch-root mergeinfo is a performance optimization, then it should be inside svn_client_mergeinfo_log2(), I would assume.

I reworked the patch to do just this and committed it in r1464102.

> You might consider committing the part that adds a 'limit' argument to a mergeinfo API as a separate change, as it looks like a fairly large part of the patch.  But if we can use svn_client_mergeinfo_log2() to answer the whole question in one go, then it probably won't need the limit argument anyway.

r1464102 simply raises a SVN_ERR_CEASE_INVOCATION error in the
svn_log_entry_receiver_t callback.  This, combined with my earlier
commit (r1463237) to allow to ask for revisions in youngest:oldest or
oldest:younger order, means that there is no need for the limit code;
we can call svn_client_mergeinfo_log2() in such a way as to find the
youngest merged revision or the oldest elibigle revision in the first
callback.

What I've struggled with is here is the (not unexpected) catch: Using
svn_client_mergeinfo_log2(), rather than simply looking at the
mergeinfo on the branch roots, may be more correct, but obviously
takes more time.  During my ad hoc testing with some of our own
branches, it adds anywhere from 5 to 25 seconds of real time.

I'm still looking if there are any places we can claw back some of
that performance loss before 1.8...but after a long history of
speeding up merge since 1.5, we might have to take a performance hit
in the name of correctness.

--
Paul T. Burba
CollabNet, Inc. -- www.collab.net -- Enterprise Cloud Development
Skype: ptburba

Re: Problems Reintegrating the fsfs-format7 branch

Posted by Julian Foad <ju...@btopenworld.com>.
I (Julian Foad) wrote:

> It took me a while to get my head back into this stuff, but yes it looks like 
> this hangs together theoretically as well as fixing the immediate use case.

To say a bit more here, my concern was about the impact further down the road of leaving the history of "automatic" merges recorded by mergeinfo that only necessarily records the operative changes.  If any software such as a merge history graphing tool (in general; I'm not talking about a specific one), and even the svn client merge code itself, wanted to know whether a merge in the past was a "complete" merge (of all the possible changes from the source branch), it would need to reexamine the history, scanning for whether the changes were operative or inoperative.  The place I have reached in my thinking is merely that it's OK to do this and that it shouldn't impact anything currently and doesn't scuttle any longer term plans that I know of.  It's not as if we're forcing people into this situation or even making it easy for them to get there: as soon as they do an automatic merge in the other direction, it will fill in the gaps.

When I was planning and writing the code, I originally wanted the merge to ignore no-op changes.  There's a comment mentioning it, just inside the body of find_last_merged_location().  When considering how this would work in the more general case of trying to account for merges across all branches (not just directly between the source and target branch), this became overly complex.  Then I realized that we wouldn't actually ever run into this situation if we just do complete merges (where all possible revisions are recorded as merged), so I dropped the idea at that point.  Reviving it now seems right.

- Julian

Re: Problems Reintegrating the fsfs-format7 branch

Posted by Julian Foad <ju...@btopenworld.com>.
Paul Burba wrote:

> On Fri, Mar 8, 2013 at 4:17 PM, Paul Burba <pt...@gmail.com> wrote:
>>  On Tue, Mar 5, 2013 at 11:37 AM, Julian Foad wrote:
>>>  So, what to do exactly?  Options seem to be:
>>> 
>>>    0) leave it as it is
>>> 
>>>    1) detect this specific case and warn or error out
>>> 
>>>    2) detect this specific case and do a reintegrate
>> 
>>  The attached patch accomplishes, let's call it "2.5", because it
>>  handles the more general case I outlined above, but not the case
>>  represented by issue #4255.  Would you mind taking a look?
> 
> fyi http://subversion.tigris.org/issues/show_bug.cgi?id=4329#desc7
> explains my proposed patch in terms of how it changes the symmetric
> merge algorithm.

Hi Paul.  That did help, yes.

It took me a while to get my head back into this stuff, but yes it looks like this hangs together theoretically as well as fixing the immediate use case.

As for the implementation, in find_last_merged_location() the task is to find the last location on SOURCE_BRANCH such that all changes on SOURCE_BRANCH after YCA up to and including *BASE_P have already been merged into the target branch.  The function finds the first eligible revision, considering mergeinfo only on the branch root, and then you add a call to svn_client_mergeinfo_log2() to find the first eligible revision after that, considering subtree mergeinfo and considering only operative changes.  Two questions:

  - Why do you only call svn_client_mergeinfo_log2() if there has been a merge from SOURCE to TARGET recorded at the root of the branch?  That would seem to miss the cases where there have only been subtree merges.

  - Why not just use svn_client_mergeinfo_log2(), if it answers the question, and delete all the prior code that first looks at just the branch-root mergeinfo?  If the initial look at branch-root mergeinfo is a performance optimization, then it should be inside svn_client_mergeinfo_log2(), I would assume.

You might consider committing the part that adds a 'limit' argument to a mergeinfo API as a separate change, as it looks like a fairly large part of the patch.  But if we can use svn_client_mergeinfo_log2() to answer the whole question in one go, then it probably won't need the limit argument anyway.

- Julian

Re: Problems Reintegrating the fsfs-format7 branch

Posted by Paul Burba <pt...@gmail.com>.
On Fri, Mar 8, 2013 at 4:17 PM, Paul Burba <pt...@gmail.com> wrote:
> On Tue, Mar 5, 2013 at 11:37 AM, Julian Foad <ju...@btopenworld.com> wrote:
>> Paul Burba wrote:
>>> On Wed, Feb 27, 2013 at 5:04 PM, Julian Foad wrote:
>>>>  Paul Burba wrote:
>>>>>  I found the cause of the conflict filled reintegrate merge.  The
>>>>>  automatic merge code seems to be doing the right thing re Mark's
>>>>>  automatic merge above, the problem arises earlier.
>>>>
>>>>  Paul, thanks for investigating.  While I'm glad to hear the automatic
>>>> merge was not the main root cause of the problem, it would make sense
>>>> for it to detect that merges in the opposite direction have been done
>>>> and at least warn the user.
>>>
>>> Hi Julian,
>>>
>>> What are we going to warn them to do exactly?
>>
>> What I was thinking of here is to detect the most general condition that isn't handled properly.  Let me explain this part, although I'm sure you're familiar with it.
>>
>> Whenever the direction of merge is from A to B and the automatic merge code doesn't detect that a reintegrate merge is required, and yet one or more cherry-picked revision ranges and/or subtree merges have been merged from B to A, Subversion currently doesn't notice those merges from B to A, and proceeds to merge "everything" from A to B including those changes that were the result of B-to-A merges.  That is what the plain "merge" from A to B has always done -- it has always ignored any mergeinfo describing merges from B to A.
>
> Yes, http://subversion.tigris.org/issues/show_bug.cgi?id=4255 for the curious.
>
>> In the example at hand, there has been a B-to-A merge of everything from B (except some non-operative revisions), and so we get loads of conflicts by ignoring this when we merge the other way.  The intelligent user thinks, "This is clearly a reintegrate scenario."
>
> Do they?  I'm not very confident that most users will immediately
> think that, especially since we've deprecated the --reintegrate
> option.  Even if they do think "I need a reintegrate style merge!" I
> suspect they are more likely to do just that, explicitly using the
> --reintegrate option, rather than re-syncing B->A and then repeating
> the automatic "reintegrate style" merge.
>
>> On the other hand, if there has been a B-to-A merge of not quite everything from B (leaving an operative unmerged part at the beginning or somewhere in the middle), then an old "reintegrate" merge from A to B would error out, whereas an old non-reintegrate merge would just do the merge and wrongly re-merge all the B-to-A changes back to B.  The new automatic merge, if it determines (based on the merging of the branch root node) to do a non-reintegrate, currently does the latter.
>>
>> At some point, maybe not for 1.8, if we can detect any such condition, it would be a good to warn the user:
>>
>>   A clean merge from A to B is not possible because some but not all
>>   changes have been merged from B to A.  You're going to see conflicts
>>   (unless all your potential conflicts are trivially auto-resolved).
>
> But a reintegrate style merge already does this, if the automatic
> merge could only determine up front that that is the correct type of
> merge.
>
>> We could add some suggestions to the warning:
>>
>>   If you are expecting this to be a reintegrate-like merge, check that
>>   all changes on B including non-operative revision ranges have been
>>   merged to A and recorded in the mergeinfo; try an automatic sync merge
>>   from B to A to fill in any gaps.
>>
>>   If you are expecting a sync merge, note that there have been some
>>   cherry-pick and/or subtree merges from B to A, and this merge will
>>   attempt to merge these same changes back to B, which will likely
>>   produce conflicts.
>>
>> We could add more specific details of what's been found, and more specific suggestions related to the details, such as the case at hand where all operative revs have been merged B-to-A, if we think it's worth the effort.
>>
>> A basic implementation would not be cheap if there is a significant amout of subtree mergeinfo to check.  And I have carefully avoided the issue of whether we should error out or proceed with the merge, as there's no point bike-shedding about that unless we really decide to do this.
>>
>> So let's take a closer look at what we could do for the specific case at hand.
>>
>>
>>>  To explicitly use
>>> --reintegrate option?  Because that would work in Stefan's case, so we
>>> should just do it rather than warn...unless you mean something else.
>>
>> So, you're thinking more about detecting the very specific kind of scenario that Stefan encountered -- where everything except some leading non-operative revisions had been merged B-to-A, and a reintegrate would give the "correct" result.
>
> Why is "correct" in quotes?  How is a reintegrate style merge ever the
> wrong choice in that scenario?
>
> Actually, don't answer that, because I'm thinking of a more general
> case: The missing non-operative revisions may be leading, as in
> Stefan's case, or they may be interspersed with operative revisions
> that have already been merged (including subtree merges where the
> change is only operative in the given subtree -- i.e. would be a no-op
> even if repeated at the branch root).  Right now the automatic merge
> code (i.e. merge.c:find_last_merged_location()) obviously doesn't
> handle this, but we already have code that can answer this question:
> svn_client_mergeinfo_log2[1].
>
> Here's an example from our test suite of what I mean:
>
> Run merge_reintegrate_tests.py 13 then revert the local changes the
> test leaves behind.  We want to merge A_COPY to A, but there have been
> prior cherrypicks and subtree merges from A to A_COPY:
>
>>svn pl -vR A_COPY
> Properties on 'A_COPY':
>   svn:mergeinfo
>     /A:6
> Properties on 'A_COPY\B':
>   svn:mergeinfo
>     /A/B:5
> Properties on 'A_COPY\D\G\rho':
>   svn:mergeinfo
>     /A/D/G/rho:4
> Properties on 'A_COPY\D\H':
>   svn:mergeinfo
>     /A/D/H:3,6
>
> However, we can see that these prior cherrypicks and subtree merges
> actually constitute all operative revisions up to r8:
>
>>svn mergeinfo --show-revs eligible -R ^^/A A_COPY
> r9
>
> So an explicit reintegrate merge works fine:
>
>>svn merge ^^/A_COPY A --reintegrate
> --- Merging differences between repository URLs into 'A':
> U    A\mu
> --- Recording mergeinfo for merge between repository URLs into 'A':
>  U   A
>
> But an automatic merge doesn't notice this, wrongly chooses a sync
> style merge and results in spurious changes/conflicts:
>
>>svn revert -Rq .
>
>>svn merge ^^/A_COPY A
> --- Merging r2 through r9 into 'A':
> U    A\mu
> C    A\D\H\psi
> --- Recording mergeinfo for merge of r2 through r9 into 'A':
>  U   A
> Summary of conflicts:
>   Text conflicts: 1
> Conflict discovered in file 'A\D\H\psi'.
> Select: (p) postpone, (df) diff-full, (e) edit, (m) merge,
>         (mc) mine-conflict, (tc) theirs-conflict, (s) show all options: p
>
> So my point is simply that if an explicit --reintegrate merge can
> DTRT, why can't an automatic merge?
>
> [1] calculate_left_hand_side also does effectively the same thing --
> which makes me wonder if we can't use svn_client_mergeinfo_log2 there,
> but that's another issue.
>
>> In an immediate sense, certainly we could "just do it", special-case this particular scenario (have you thought about what the exact condition should be?) and automatically choose the reintegrate code path in that case.
>
> See above "I'm thinking of a more general case".
>
>> The trouble is, that doesn't resolve the situation completely.  It gives the correct result in terms of the changes merged, but it doesn't straighten out the mergeinfo.
>
> You mean the mergeinfo on the source?
>
>> That means we have to make sure the rest of the merge subsystem consistently handles this case too: the "sync"-style merge, merge graphing tools that determine what kind of merge was done and what kinds of merge are eligible, and so on.  See below.
>
> I'm not sure how this is a problem.  If we do a reintegrate style
> merge in the general situation I describe above, how does that break
> subsequent sync style merges?  Can you point to an example?  in terms
> of determining eligible revisions, 'svn mergeinfo' already handles
> these gaps.  Which specific graphing tools are you speaking of?
>
>>> I filed an issue for this:
>>> http://subversion.tigris.org/issues/show_bug.cgi?id=4329 and added a
>>> test which demonstrates a very simple version of the problem.  I set
>>> issue #4329 a 1.8 blocker.  While I think what Stefan is doing is
>>> somewhat atypical[1], if we are deprecating --reintegrate, then
>>> automatic merges should handle cases like this.  After all, if we
>>> explicitly use the --reintegrate option the merge works fine.  What do
>>> others think?
>>
>> I agree it's atypical (or rather unintended) usage, and I also acknowledge that if Stefan used merge like this, then sometimes other people will do too.
>>
>> However, I don't want us to put any more special cases in the code, especially without putting them into the *design*.  If the old "--reintegrate" accepts a missing leading no-op revision range in considering whether "everything" has been merged, that should now be considered a bug.
>
> Seriously?  As long as we allow cherrypicks and subtree merges I think
> we want to support use cases like this...to call it a bug that we
> DTRT, I guess I just don't get it.
>
>> The new rule is that if every revision on B is *recorded* as merged to A, then we can reintegrate from A to B, else we can't.
>
> I never came away with the understanding in all the discussion of
> automatic merges...but maybe I'm the only one.
>
>> The main alternative would be to adopt some principle such as "When determining what's been merged and what needs to be merged, we only notice operative revisions, aka non-null changes".  I considered that during the design of automatic merge, and was strongly drawn to the idea and started prototyping around it, but that turned out to be a rat-hole involving potentially exhaustive traversal of the whole mergeinfo graph (all branches) to determine whether a given bit of mergeinfo represented an operative or an inoperative change.  And the benefits are near zero, because both sync and reintegrate merges were already filling in any gaps in the mergeinfo in the direction of the merge so the situation can only arise if you use manual merges.
>>
>> The sensible choice in the end was to require that revision ranges are explicitly recorded whether operative or not.
>>
>> Having found a case that we suspect may occur frequently enough to cause annoyance, I believe the best thing to do is to admit that we don't handle it automatically.  Just because the old reintegrate handles this particular case doesn't mean we're making a mistake if the new automatic merge doesn't.  This behaviour change is a small adjustment towards overall more consistent merge behaviour, and consistency is key to large-scale usability, much more so than special casing is.
>>
>> So, what to do exactly?  Options seem to be:
>>
>>   0) leave it as it is
>>
>>   1) detect this specific case and warn or error out
>>
>>   2) detect this specific case and do a reintegrate
>
> The attached patch accomplishes, let's call it "2.5", because it
> handles the more general case I outlined above, but not the case
> represented by issue #4255.  Would you mind taking a look?

fyi http://subversion.tigris.org/issues/show_bug.cgi?id=4329#desc7
explains my proposed patch in terms of how it changes the symmetric
merge algorithm.

-- 
Paul T. Burba
CollabNet, Inc. -- www.collab.net -- Enterprise Cloud Development
Skype: ptburba

Re: Problems Reintegrating the fsfs-format7 branch

Posted by Paul Burba <pt...@gmail.com>.
On Tue, Mar 5, 2013 at 11:37 AM, Julian Foad <ju...@btopenworld.com> wrote:
> Paul Burba wrote:
>> On Wed, Feb 27, 2013 at 5:04 PM, Julian Foad wrote:
>>>  Paul Burba wrote:
>>>>  I found the cause of the conflict filled reintegrate merge.  The
>>>>  automatic merge code seems to be doing the right thing re Mark's
>>>>  automatic merge above, the problem arises earlier.
>>>
>>>  Paul, thanks for investigating.  While I'm glad to hear the automatic
>>> merge was not the main root cause of the problem, it would make sense
>>> for it to detect that merges in the opposite direction have been done
>>> and at least warn the user.
>>
>> Hi Julian,
>>
>> What are we going to warn them to do exactly?
>
> What I was thinking of here is to detect the most general condition that isn't handled properly.  Let me explain this part, although I'm sure you're familiar with it.
>
> Whenever the direction of merge is from A to B and the automatic merge code doesn't detect that a reintegrate merge is required, and yet one or more cherry-picked revision ranges and/or subtree merges have been merged from B to A, Subversion currently doesn't notice those merges from B to A, and proceeds to merge "everything" from A to B including those changes that were the result of B-to-A merges.  That is what the plain "merge" from A to B has always done -- it has always ignored any mergeinfo describing merges from B to A.

Yes, http://subversion.tigris.org/issues/show_bug.cgi?id=4255 for the curious.

> In the example at hand, there has been a B-to-A merge of everything from B (except some non-operative revisions), and so we get loads of conflicts by ignoring this when we merge the other way.  The intelligent user thinks, "This is clearly a reintegrate scenario."

Do they?  I'm not very confident that most users will immediately
think that, especially since we've deprecated the --reintegrate
option.  Even if they do think "I need a reintegrate style merge!" I
suspect they are more likely to do just that, explicitly using the
--reintegrate option, rather than re-syncing B->A and then repeating
the automatic "reintegrate style" merge.

> On the other hand, if there has been a B-to-A merge of not quite everything from B (leaving an operative unmerged part at the beginning or somewhere in the middle), then an old "reintegrate" merge from A to B would error out, whereas an old non-reintegrate merge would just do the merge and wrongly re-merge all the B-to-A changes back to B.  The new automatic merge, if it determines (based on the merging of the branch root node) to do a non-reintegrate, currently does the latter.
>
> At some point, maybe not for 1.8, if we can detect any such condition, it would be a good to warn the user:
>
>   A clean merge from A to B is not possible because some but not all
>   changes have been merged from B to A.  You're going to see conflicts
>   (unless all your potential conflicts are trivially auto-resolved).

But a reintegrate style merge already does this, if the automatic
merge could only determine up front that that is the correct type of
merge.

> We could add some suggestions to the warning:
>
>   If you are expecting this to be a reintegrate-like merge, check that
>   all changes on B including non-operative revision ranges have been
>   merged to A and recorded in the mergeinfo; try an automatic sync merge
>   from B to A to fill in any gaps.
>
>   If you are expecting a sync merge, note that there have been some
>   cherry-pick and/or subtree merges from B to A, and this merge will
>   attempt to merge these same changes back to B, which will likely
>   produce conflicts.
>
> We could add more specific details of what's been found, and more specific suggestions related to the details, such as the case at hand where all operative revs have been merged B-to-A, if we think it's worth the effort.
>
> A basic implementation would not be cheap if there is a significant amout of subtree mergeinfo to check.  And I have carefully avoided the issue of whether we should error out or proceed with the merge, as there's no point bike-shedding about that unless we really decide to do this.
>
> So let's take a closer look at what we could do for the specific case at hand.
>
>
>>  To explicitly use
>> --reintegrate option?  Because that would work in Stefan's case, so we
>> should just do it rather than warn...unless you mean something else.
>
> So, you're thinking more about detecting the very specific kind of scenario that Stefan encountered -- where everything except some leading non-operative revisions had been merged B-to-A, and a reintegrate would give the "correct" result.

Why is "correct" in quotes?  How is a reintegrate style merge ever the
wrong choice in that scenario?

Actually, don't answer that, because I'm thinking of a more general
case: The missing non-operative revisions may be leading, as in
Stefan's case, or they may be interspersed with operative revisions
that have already been merged (including subtree merges where the
change is only operative in the given subtree -- i.e. would be a no-op
even if repeated at the branch root).  Right now the automatic merge
code (i.e. merge.c:find_last_merged_location()) obviously doesn't
handle this, but we already have code that can answer this question:
svn_client_mergeinfo_log2[1].

Here's an example from our test suite of what I mean:

Run merge_reintegrate_tests.py 13 then revert the local changes the
test leaves behind.  We want to merge A_COPY to A, but there have been
prior cherrypicks and subtree merges from A to A_COPY:

>svn pl -vR A_COPY
Properties on 'A_COPY':
  svn:mergeinfo
    /A:6
Properties on 'A_COPY\B':
  svn:mergeinfo
    /A/B:5
Properties on 'A_COPY\D\G\rho':
  svn:mergeinfo
    /A/D/G/rho:4
Properties on 'A_COPY\D\H':
  svn:mergeinfo
    /A/D/H:3,6

However, we can see that these prior cherrypicks and subtree merges
actually constitute all operative revisions up to r8:

>svn mergeinfo --show-revs eligible -R ^^/A A_COPY
r9

So an explicit reintegrate merge works fine:

>svn merge ^^/A_COPY A --reintegrate
--- Merging differences between repository URLs into 'A':
U    A\mu
--- Recording mergeinfo for merge between repository URLs into 'A':
 U   A

But an automatic merge doesn't notice this, wrongly chooses a sync
style merge and results in spurious changes/conflicts:

>svn revert -Rq .

>svn merge ^^/A_COPY A
--- Merging r2 through r9 into 'A':
U    A\mu
C    A\D\H\psi
--- Recording mergeinfo for merge of r2 through r9 into 'A':
 U   A
Summary of conflicts:
  Text conflicts: 1
Conflict discovered in file 'A\D\H\psi'.
Select: (p) postpone, (df) diff-full, (e) edit, (m) merge,
        (mc) mine-conflict, (tc) theirs-conflict, (s) show all options: p

So my point is simply that if an explicit --reintegrate merge can
DTRT, why can't an automatic merge?

[1] calculate_left_hand_side also does effectively the same thing --
which makes me wonder if we can't use svn_client_mergeinfo_log2 there,
but that's another issue.

> In an immediate sense, certainly we could "just do it", special-case this particular scenario (have you thought about what the exact condition should be?) and automatically choose the reintegrate code path in that case.

See above "I'm thinking of a more general case".

> The trouble is, that doesn't resolve the situation completely.  It gives the correct result in terms of the changes merged, but it doesn't straighten out the mergeinfo.

You mean the mergeinfo on the source?

> That means we have to make sure the rest of the merge subsystem consistently handles this case too: the "sync"-style merge, merge graphing tools that determine what kind of merge was done and what kinds of merge are eligible, and so on.  See below.

I'm not sure how this is a problem.  If we do a reintegrate style
merge in the general situation I describe above, how does that break
subsequent sync style merges?  Can you point to an example?  in terms
of determining eligible revisions, 'svn mergeinfo' already handles
these gaps.  Which specific graphing tools are you speaking of?

>> I filed an issue for this:
>> http://subversion.tigris.org/issues/show_bug.cgi?id=4329 and added a
>> test which demonstrates a very simple version of the problem.  I set
>> issue #4329 a 1.8 blocker.  While I think what Stefan is doing is
>> somewhat atypical[1], if we are deprecating --reintegrate, then
>> automatic merges should handle cases like this.  After all, if we
>> explicitly use the --reintegrate option the merge works fine.  What do
>> others think?
>
> I agree it's atypical (or rather unintended) usage, and I also acknowledge that if Stefan used merge like this, then sometimes other people will do too.
>
> However, I don't want us to put any more special cases in the code, especially without putting them into the *design*.  If the old "--reintegrate" accepts a missing leading no-op revision range in considering whether "everything" has been merged, that should now be considered a bug.

Seriously?  As long as we allow cherrypicks and subtree merges I think
we want to support use cases like this...to call it a bug that we
DTRT, I guess I just don't get it.

> The new rule is that if every revision on B is *recorded* as merged to A, then we can reintegrate from A to B, else we can't.

I never came away with the understanding in all the discussion of
automatic merges...but maybe I'm the only one.

> The main alternative would be to adopt some principle such as "When determining what's been merged and what needs to be merged, we only notice operative revisions, aka non-null changes".  I considered that during the design of automatic merge, and was strongly drawn to the idea and started prototyping around it, but that turned out to be a rat-hole involving potentially exhaustive traversal of the whole mergeinfo graph (all branches) to determine whether a given bit of mergeinfo represented an operative or an inoperative change.  And the benefits are near zero, because both sync and reintegrate merges were already filling in any gaps in the mergeinfo in the direction of the merge so the situation can only arise if you use manual merges.
>
> The sensible choice in the end was to require that revision ranges are explicitly recorded whether operative or not.
>
> Having found a case that we suspect may occur frequently enough to cause annoyance, I believe the best thing to do is to admit that we don't handle it automatically.  Just because the old reintegrate handles this particular case doesn't mean we're making a mistake if the new automatic merge doesn't.  This behaviour change is a small adjustment towards overall more consistent merge behaviour, and consistency is key to large-scale usability, much more so than special casing is.
>
> So, what to do exactly?  Options seem to be:
>
>   0) leave it as it is
>
>   1) detect this specific case and warn or error out
>
>   2) detect this specific case and do a reintegrate

The attached patch accomplishes, let's call it "2.5", because it
handles the more general case I outlined above, but not the case
represented by issue #4255.  Would you mind taking a look?

-- 
Paul T. Burba
CollabNet, Inc. -- www.collab.net -- Enterprise Cloud Development
Skype: ptburba

>   3) detect the general case and warn or error out
>
>   4) detect the general case and merge properly (hah! that's for 1.9)
>
> I would say 0/1/2 are achievable and 3/4 are too difficult, and 2 is a bad idea.
>
> Thoughts?
>
> - Julian

Re: Problems Reintegrating the fsfs-format7 branch

Posted by Julian Foad <ju...@btopenworld.com>.
Paul Burba wrote:
> On Wed, Feb 27, 2013 at 5:04 PM, Julian Foad wrote:
>>  Paul Burba wrote:
>>>  I found the cause of the conflict filled reintegrate merge.  The
>>>  automatic merge code seems to be doing the right thing re Mark's
>>>  automatic merge above, the problem arises earlier.
>> 
>>  Paul, thanks for investigating.  While I'm glad to hear the automatic 
>> merge was not the main root cause of the problem, it would make sense
>> for it to detect that merges in the opposite direction have been done
>> and at least warn the user.
> 
> Hi Julian,
> 
> What are we going to warn them to do exactly?

What I was thinking of here is to detect the most general condition that isn't handled properly.  Let me explain this part, although I'm sure you're familiar with it.

Whenever the direction of merge is from A to B and the automatic merge code doesn't detect that a reintegrate merge is required, and yet one or more cherry-picked revision ranges and/or subtree merges have been merged from B to A, Subversion currently doesn't notice those merges from B to A, and proceeds to merge "everything" from A to B including those changes that were the result of B-to-A merges.  That is what the plain "merge" from A to B has always done -- it has always ignored any mergeinfo describing merges from B to A.

In the example at hand, there has been a B-to-A merge of everything from B (except some non-operative revisions), and so we get loads of conflicts by ignoring this when we merge the other way.  The intelligent user thinks, "This is clearly a reintegrate scenario."

On the other hand, if there has been a B-to-A merge of not quite everything from B (leaving an operative unmerged part at the beginning or somewhere in the middle), then an old "reintegrate" merge from A to B would error out, whereas an old non-reintegrate merge would just do the merge and wrongly re-merge all the B-to-A changes back to B.  The new automatic merge, if it determines (based on the merging of the branch root node) to do a non-reintegrate, currently does the latter.

At some point, maybe not for 1.8, if we can detect any such condition, it would be a good to warn the user:

  A clean merge from A to B is not possible because some but not all
  changes have been merged from B to A.  You're going to see conflicts
  (unless all your potential conflicts are trivially auto-resolved).

We could add some suggestions to the warning:

  If you are expecting this to be a reintegrate-like merge, check that
  all changes on B including non-operative revision ranges have been
  merged to A and recorded in the mergeinfo; try an automatic sync merge
  from B to A to fill in any gaps.

  If you are expecting a sync merge, note that there have been some
  cherry-pick and/or subtree merges from B to A, and this merge will
  attempt to merge these same changes back to B, which will likely
  produce conflicts.

We could add more specific details of what's been found, and more specific suggestions related to the details, such as the case at hand where all operative revs have been merged B-to-A, if we think it's worth the effort.

A basic implementation would not be cheap if there is a significant amout of subtree mergeinfo to check.  And I have carefully avoided the issue of whether we should error out or proceed with the merge, as there's no point bike-shedding about that unless we really decide to do this.

So let's take a closer look at what we could do for the specific case at hand.


>  To explicitly use
> --reintegrate option?  Because that would work in Stefan's case, so we
> should just do it rather than warn...unless you mean something else.

So, you're thinking more about detecting the very specific kind of scenario that Stefan encountered -- where everything except some leading non-operative revisions had been merged B-to-A, and a reintegrate would give the "correct" result.

In an immediate sense, certainly we could "just do it", special-case this particular scenario (have you thought about what the exact condition should be?) and automatically choose the reintegrate code path in that case.

The trouble is, that doesn't resolve the situation completely.  It gives the correct result in terms of the changes merged, but it doesn't straighten out the mergeinfo.  That means we have to make sure the rest of the merge subsystem consistently handles this case too: the "sync"-style merge, merge graphing tools that determine what kind of merge was done and what kinds of merge are eligible, and so on.  See below.

> I filed an issue for this:
> http://subversion.tigris.org/issues/show_bug.cgi?id=4329 and added a
> test which demonstrates a very simple version of the problem.  I set
> issue #4329 a 1.8 blocker.  While I think what Stefan is doing is
> somewhat atypical[1], if we are deprecating --reintegrate, then
> automatic merges should handle cases like this.  After all, if we
> explicitly use the --reintegrate option the merge works fine.  What do
> others think?

I agree it's atypical (or rather unintended) usage, and I also acknowledge that if Stefan used merge like this, then sometimes other people will do too.

However, I don't want us to put any more special cases in the code, especially without putting them into the *design*.  If the old "--reintegrate" accepts a missing leading no-op revision range in considering whether "everything" has been merged, that should now be considered a bug.  The new rule is that if every revision on B is *recorded* as merged to A, then we can reintegrate from A to B, else we can't.

The main alternative would be to adopt some principle such as "When determining what's been merged and what needs to be merged, we only notice operative revisions, aka non-null changes".  I considered that during the design of automatic merge, and was strongly drawn to the idea and started prototyping around it, but that turned out to be a rat-hole involving potentially exhaustive traversal of the whole mergeinfo graph (all branches) to determine whether a given bit of mergeinfo represented an operative or an inoperative change.  And the benefits are near zero, because both sync and reintegrate merges were already filling in any gaps in the mergeinfo in the direction of the merge so the situation can only arise if you use manual merges.

The sensible choice in the end was to require that revision ranges are explicitly recorded whether operative or not.

Having found a case that we suspect may occur frequently enough to cause annoyance, I believe the best thing to do is to admit that we don't handle it automatically.  Just because the old reintegrate handles this particular case doesn't mean we're making a mistake if the new automatic merge doesn't.  This behaviour change is a small adjustment towards overall more consistent merge behaviour, and consistency is key to large-scale usability, much more so than special casing is.

So, what to do exactly?  Options seem to be:

  0) leave it as it is

  1) detect this specific case and warn or error out

  2) detect this specific case and do a reintegrate

  3) detect the general case and warn or error out

  4) detect the general case and merge properly (hah! that's for 1.9)

I would say 0/1/2 are achievable and 3/4 are too difficult, and 2 is a bad idea.

Thoughts?

- Julian

Re: Problems Reintegrating the fsfs-format7 branch (Was: FSFS format7 status and first results)

Posted by Paul Burba <pt...@gmail.com>.
On Wed, Feb 27, 2013 at 5:04 PM, Julian Foad <ju...@btopenworld.com> wrote:
> Paul Burba wrote:
>
>> On Mon, Feb 18, 2013 at 11:54 AM, Mark Phippard <ma...@gmail.com>
>> wrote:
>>
>>>  BTW, how are you managing your branch?  I tried merging it back to
>>>  trunk to get an idea on the diff and there were a lot of text and tree
>>>  conflicts.  I thought I had seen you doing synch merges from trunk in
>>>  the past so I assumed it would merge back.
>>
>> On Thu, Feb 21, 2013 at 5:05 AM, Stefan Fuhrmann
>> <st...@wandisco.com> wrote:
>>
>>>  Hm. I split fsfs.c and .h into multiple files on the
>>>  branch and the first merge from /trunk required
>>>  significant manual intervention. Since that, merges
>>>  have been clean because fsfs.* wasn't touched
>>>  on /trunk.
>>>
>>>  If I understand Julian's merge changes in 1.8,
>>>  reintegrating should work because there has been
>>>  no cherry picking etc.
>>
>> On Thu, Feb 21, 2013 at 11:04 AM, Mark Phippard <ma...@gmail.com>
>> wrote:
>>
>>>  I see this when using 1.8:
>>>
>>>  $ svn mergeinfo ^/subversion/branches/fsfs-format7
>>>      youngest common ancestor
>>>      |         last full merge
>>>      |         |        tip of branch
>>>      |         |        |         repository path
>>>
>>>      1414755            1448697
>>>      |                  |
>>>         --| |------------         subversion/branches/fsfs-format7
>>>        /
>>>       /
>>>    -------| |------------         subversion/trunk
>>>                         |
>>>                         1447423
>>>
>>>  It seems to imply that it does not think you have ever synched with
>>>  trunk.  So maybe it is trying to replay every revision from your
>>>  branch when I merge back and that is why it gets so many conflicts?
>>
>> I found the cause of the conflict filled reintegrate merge.  The
>> automatic merge code seems to be doing the right thing re Mark's
>> automatic merge above, the problem arises earlier.
>
> Paul, thanks for investigating.  While I'm glad to hear the automatic merge was not the main root cause of the problem, it would make sense for it to detect that merges in the opposite direction have been done and at least warn the user.

Hi Julian,

What are we going to warn them to do exactly?  To explicitly use
--reintegrate option?  Because that would work in Stefan's case, so we
should just do it rather than warn...unless you mean something else.

I filed an issue for this:
http://subversion.tigris.org/issues/show_bug.cgi?id=4329 and added a
test which demonstrates a very simple version of the problem.  I set
issue #4329 a 1.8 blocker.  While I think what Stefan is doing is
somewhat atypical[1], if we are deprecating --reintegrate, then
automatic merges should handle cases like this.  After all, if we
explicitly use the --reintegrate option the merge works fine.  What do
others think?

[1] Doing *cherrypick* merges to effectively keep a branch in sync
with trunk and then using an *automatic* merge to "reintegrate" the
branch back to trunk.

>  It really doesn't make sense to proceed with the merging in a case like this.
>
> - Julian

-- 
Paul T. Burba
CollabNet, Inc. -- www.collab.net -- Enterprise Cloud Development
Skype: ptburba

Re: Problems Reintegrating the fsfs-format7 branch (Was: FSFS format7 status and first results)

Posted by Julian Foad <ju...@btopenworld.com>.
Paul Burba wrote:

> On Mon, Feb 18, 2013 at 11:54 AM, Mark Phippard <ma...@gmail.com> 
> wrote:
> 
>>  BTW, how are you managing your branch?  I tried merging it back to
>>  trunk to get an idea on the diff and there were a lot of text and tree
>>  conflicts.  I thought I had seen you doing synch merges from trunk in
>>  the past so I assumed it would merge back.
> 
> On Thu, Feb 21, 2013 at 5:05 AM, Stefan Fuhrmann
> <st...@wandisco.com> wrote:
> 
>>  Hm. I split fsfs.c and .h into multiple files on the
>>  branch and the first merge from /trunk required
>>  significant manual intervention. Since that, merges
>>  have been clean because fsfs.* wasn't touched
>>  on /trunk.
>> 
>>  If I understand Julian's merge changes in 1.8,
>>  reintegrating should work because there has been
>>  no cherry picking etc.
> 
> On Thu, Feb 21, 2013 at 11:04 AM, Mark Phippard <ma...@gmail.com> 
> wrote:
> 
>>  I see this when using 1.8:
>> 
>>  $ svn mergeinfo ^/subversion/branches/fsfs-format7
>>      youngest common ancestor
>>      |         last full merge
>>      |         |        tip of branch
>>      |         |        |         repository path
>> 
>>      1414755            1448697
>>      |                  |
>>         --| |------------         subversion/branches/fsfs-format7
>>        /
>>       /
>>    -------| |------------         subversion/trunk
>>                         |
>>                         1447423
>> 
>>  It seems to imply that it does not think you have ever synched with
>>  trunk.  So maybe it is trying to replay every revision from your
>>  branch when I merge back and that is why it gets so many conflicts?
> 
> I found the cause of the conflict filled reintegrate merge.  The
> automatic merge code seems to be doing the right thing re Mark's
> automatic merge above, the problem arises earlier.

Paul, thanks for investigating.  While I'm glad to hear the automatic merge was not the main root cause of the problem, it would make sense for it to detect that merges in the opposite direction have been done and at least warn the user.  It really doesn't make sense to proceed with the merging in a case like this.

- Julian

Re: Problems Reintegrating the fsfs-format7 branch (Was: FSFS format7 status and first results)

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, Feb 27, 2013 at 10:21 PM, Paul Burba <pt...@gmail.com> wrote:

> (Stefan - If you don't have time to read all this please at least take
> a look at the short questions at the very end)
>

No worries :) Thanks for digging into this problem.

General remark: I'm working at and committing from my
Ubuntu 12.04 laptop since mid January and will continue
to do so until Easter. That means: Subversion 1.6.17.


> On Mon, Feb 18, 2013 at 11:54 AM, Mark Phippard <ma...@gmail.com>
> wrote:
>
> > BTW, how are you managing your branch?  I tried merging it back to
> > trunk to get an idea on the diff and there were a lot of text and tree
> > conflicts.  I thought I had seen you doing synch merges from trunk in
> > the past so I assumed it would merge back.
>
> On Thu, Feb 21, 2013 at 5:05 AM, Stefan Fuhrmann
> <st...@wandisco.com> wrote:
>
> > Hm. I split fsfs.c and .h into multiple files on the
> > branch and the first merge from /trunk required
> > significant manual intervention. Since that, merges
> > have been clean because fsfs.* wasn't touched
> > on /trunk.
> >
> > If I understand Julian's merge changes in 1.8,
> > reintegrating should work because there has been
> > no cherry picking etc.
>
> On Thu, Feb 21, 2013 at 11:04 AM, Mark Phippard <ma...@gmail.com>
> wrote:
>
> > I see this when using 1.8:
> >
> > $ svn mergeinfo ^/subversion/branches/fsfs-format7
> >     youngest common ancestor
> >     |         last full merge
> >     |         |        tip of branch
> >     |         |        |         repository path
> >
> >     1414755            1448697
> >     |                  |
> >        --| |------------         subversion/branches/fsfs-format7
> >       /
> >      /
> >   -------| |------------         subversion/trunk
> >                        |
> >                        1447423
> >
> > It seems to imply that it does not think you have ever synched with
> > trunk.  So maybe it is trying to replay every revision from your
> > branch when I merge back and that is why it gets so many conflicts?
>
> I found the cause of the conflict filled reintegrate merge.  The
> automatic merge code seems to be doing the right thing re Mark's
> automatic merge above, the problem arises earlier.
>
> First, here is the relationship between the two branches in question:
>
> trunk@1414755------------@1442089---@r1445080------------>
>        |                    |           |           ^
>       copy                 sync        sync         |
>        |                   merge       merge   conflict-filled
>        |                    |           |         automatic
>        |                    |           |          merge
>        |                    |           |           |
>        V                    V           V           |
>    fsfs-format7@1414757---r1442344---r1445479------------>
>
> The first problem is that the "sync" merge in r1442344 recorded the
> wrong mergeinfo:
>
> >svn log ^^/subversion/branches/fsfs-format7 -r1442344
> ------------------------------------------------------------------------
> r1442344 | stefan2 | 2013-02-04 15:48:05 -0500 (Mon, 04 Feb 2013) | 2 lines
>
> On the fsfs-format7: ketchup merge from /trunk up to and including
> r1442089.
> Some conflicts due to splitting up fs_fs.c needed to be resolved.
> ------------------------------------------------------------------------
>
> >svn diff --depth empty ^^/subversion/branches/fsfs-format7 -c1442344
> Index: .
> ===================================================================
> --- .   (revision 1442343)
> +++ .   (revision 1442344)
>
> Property changes on: .
> ___________________________________________________________________
> Modified: svn:ignore
> ## -49,3 +49,6 ##
>  zlib
>  sqlite-amalgamation
>  serf
> +gtest
> +.git
> +.gitignore
> Modified: svn:mergeinfo
>    Merged /subversion/branches/issue-4116-dev:r1424719-1425040
>    Merged /subversion/trunk:r1414758-1442089
>                              ^^^^^^^
>                              Why not r1414756?
>                              That is the first revision on trunk after
>                              the branch was created.
>

I simply didn't notice it, so I wrongly assumed at the
branch was created from 1414756. Don't remember
which tool I used to created the repo-repo copy to
open the branch.

   Merged
> /subversion/branches/tweak-build-take-two:r1424288-1425049,1425051-1425613
>    Merged /subversion/branches/in-repo-authz:r1414342-1424779
>    Merged /subversion/branches/issue-4194-dev:r1410507-1414880
>    Merged /subversion/branches/wc-collate-path:r1407642
>
> The mergeinfo recorded makes it look like a big cherrypick of
> r1414758-1442089 was done from ^/subversion/trunk to
> ^/subversion/branches/fsfs-format7, rather than a proper sync merge.
> Well, not to worry, the subsequent sync merge committed in r1445479
> will notice that /subversion/trunk:r1414756-1414757 has *not* been
> merged and merge those missing revisions -- they are inoperative but
> that doesn't matter, the mergeinfo should still be recorded.  Except
> that didn't happen either:
>

That is no surprise as I merged (lastmerge+1):headRev.

>svn diff --depth empty ^^/subversion/branches/fsfs-format7 -c1445479
> Index: .
> ===================================================================
> --- .   (revision 1445478)
> +++ .   (revision 1445479)
>
> Property changes on: .
> ___________________________________________________________________
> Modified: svn:mergeinfo
>    Merged /subversion/trunk:r1442090-1445080
>
> So why is this a problem?  Because later, when we try to merge the
> branch back to trunk, the automatic merge code is trying to determine
> if it is dealing with a "sync" or "reintegrate" style merge, it sees
> that there is a gap in the mergeinfo on the source branch (i.e.
> /subversion/trunk:r1414756-1414757) that makes it appear that the
> branch is *not* fully in sync with the trunk.  Not-in-sync means no
> reintegrate-style merge, but rather a normal sync style merge, which
> (just like 1.7 if the --reintegrate option isn't used) doesn't do what
> we want.  It tries to merge all of the branches changes, resulting in
> dozens of tree conflicts.
>

Why doesn't the merge code recognize r1414756 as inconsequential?

So that is the cause of the problem Mark observed, but why did the
> sync merges Stefan performed in r1442344 and 1445479 do the wrong
> thing?  Repeating those two offending merges I am not able to
> replicate the faulty mergeinfo in either case:
>
> The first sync merge done in r1442344:
>
> C:\SVN\src-branch-fsfs-format7>svn info
> Path: .
> Working Copy Root Path: C:\SVN\src-branch-fsfs-format7
> URL: https://svn.apache.org/repos/asf/subversion/branches/fsfs-format7
> Relative URL: ^/subversion/branches/fsfs-format7
> Repository Root: https://svn.apache.org/repos/asf
> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
> Revision: 1442343
> Node Kind: directory
> Schedule: normal
> Last Changed Author: stefan2
> Last Changed Rev: 1442081
> Last Changed Date: 2013-02-04 06:27:56 -0500 (Mon, 04 Feb 2013)
>
> C:\SVN\src-branch-fsfs-format7>svn st
>
> C:\SVN\src-branch-fsfs-format7>svn merge ^^/subversion/trunk . --accept
> postpone
> --- Merging r1414756 through r1450917 into '.':
> U    get-deps.sh
> U    Makefile.in
> U    build.conf
> .
> <SNIP>
> .
> U    autogen.sh
> U    aclocal.m4
> D    packages
>  U   .
> --- Recording mergeinfo for merge of r1414756 through r1450917 into '.':
>  G   .
> Summary of conflicts:
>   Text conflicts: 5
>   Tree conflicts: 1
>
> C:\SVN\src-branch-fsfs-format7>svn diff --depth empty
> Index: .
> ===================================================================
> --- .   (revision 1442343)
> +++ .   (working copy)
>
> Property changes on: .
> ___________________________________________________________________
> Modified: svn:mergeinfo
>    Merged /subversion/branches/issue-4116-dev:r1424719-1425040
>    Merged /subversion/trunk:r1414756-1450917
>                              ^^^^^^^
>                              That's right! The mergeinfo picks up right
>                              after the yca.
>    Merged
> /subversion/branches/tweak-build-take-two:r1424288-1425049,1425051-1425613
>    Merged /subversion/branches/in-repo-authz:r1414342-1424779
>    Merged /subversion/branches/issue-4194-dev:r1410507-1414880
>    Merged /subversion/branches/wc-collate-path:r1407642
> Modified: svn:ignore
> ## -49,3 +49,6 ##
>  zlib
>  sqlite-amalgamation
>  serf
> +gtest
> +.git
> +.gitignore
>
>
> The second merge in r1445479, which should have filled in the
> mergeinfo gap, also works when I tried it:
>
> C:\SVN\src-branch-fsfs-format7>svn info
> Path: .
> Working Copy Root Path: C:\SVN\src-branch-fsfs-format7
> URL: https://svn.apache.org/repos/asf/subversion/branches/fsfs-format7
> Relative URL: ^/subversion/branches/fsfs-format7
> Repository Root: https://svn.apache.org/repos/asf
> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
> Revision: 1445478
> Node Kind: directory
> Schedule: normal
> Last Changed Author: stefan2
> Last Changed Rev: 1445080
> Last Changed Date: 2013-02-12 05:02:13 -0500 (Tue, 12 Feb 2013)
>
>
> C:\SVN\src-branch-fsfs-format7>svn st
>
> C:\SVN\src-branch-fsfs-format7>svn merge ^^/subversion/trunk --accept
> postpone
> --- Merging r1442090 through r1450954 into '.':
> U    get-deps.sh
> U    Makefile.in
> U    build.conf
> .
> <SNIP>
> .
> U    INSTALL
> U    CHANGES
> D    packages
> --- Recording mergeinfo for merge of r1414756 through r1450954 into '.':
>  U   .
> Summary of conflicts:
>   Text conflicts: 1
>
> C:\SVN\src-branch-fsfs-format7>svn diff --depth empty
> Index: .
> ===================================================================
> --- .   (revision 1445478)
> +++ .   (working copy)
>
> Property changes on: .
> ___________________________________________________________________
> Modified: svn:mergeinfo
>    Merged /subversion/trunk:r1414756-1414757,1442090-1450954
>                              ^^^^^^^^^^^^^^^
>                              The gap is plugged!
>
> So what happened?  There are a few options that I can see:
>
> 1) Transient mergeinfo recording bug that afflicted the client that
> Stefan used to do the merges (I don't recall anything like this
> recently and this is pretty basic stuff that has been in place since
> 1.6).
>

Used 1.6.17. But that does not seem to be the root cause:

$ /usr/bin/svn merge ^/subversion/trunk .
--- Merging r1445081 through r1450992 into '.':
              ^^^^
       So, it recognizes that r1414756 as inconsequential.
U    get-deps.sh
... gazillion changes and a conflict on fs_fs.c ...

$ /usr/bin/svn pl -v | grep "trunk"
    /subversion/trunk:1414756-1450992

The mergeinfo now covers r1414756 as well.

2) Stefan didn't actually do a sync merge, but rather did a cherry
> pick, e.g. 'svn merge ^/subversion/trunk branch-wc -r1414757:HEAD'.
> Stefan - do you recall how you did these merges?  How about what
> client version you used?  Did you use --allow-mixed-revisions by
> chance
>

The client that I used will always send revision ranges
to the SVN libs and that is the trigger of what we are
seeing here. So, yes, technically, a cherry pick as in
"picking all cherries that there are".

I guess the question is why we make that a user problem.
The user (me) intended to merge all changes from /trunk
to the branch and told svn where to find these. The fact
that SVN gets confused by completely unrelated changes
not being listed explicitly is a tool (SVN) problem.

3) Some as of yet undiscovered bug
>

Only a usability issue.


> P.S. Stefan, In the meantime I suggest you do the following record
> only merge to plug the gap, so when your branch is ultimately merged
> back to trunk it will be possible via a simple automatic merge:
>
> svn merge ^/subversion/trunk --record-only -r1414755:1414757
>

Done in r1451004.

-- Stefan^2.

-- 
Certified & Supported Apache Subversion Downloads:
*

http://www.wandisco.com/subversion/download
*