You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Julian Foad <ju...@btopenworld.com> on 2008/08/06 11:29:44 UTC

Re: [PATCH] Merge deleting a file - compare its content

On Tue, 2008-08-05 at 15:57 +0100, Julian Foad wrote:
> On Fri, 2008-07-25 at 12:52 +0100, Julian Foad wrote:
> > On Thu, 2008-07-24 at 11:57 -0400, Paul Burba wrote:
> > > Now we try the reverse merge, and as in < 1.5 the files scheduled for
> > > addition are skipped:
> > > 
> > >   1.5.1>svn.exe merge -r3:2 %URL%/branches/mybranch
> > >   Skipped 'newfile1'
> > >   Skipped 'newfile2'
> > >   Skipped 'newfile3'
> > 
> > A good enhancement will be to compare the locally-added file with the
> > one that the incoming change thinks it's deleting (i.e. the merge-left
> > source), and only skip if they're different. That check is something
> > we're doing for tree conflict detection anyway, but can be done
> > independently.
> > 
> > I'll have a go at a patch to make this happen. (It so happens that this
> > is what I am about to work on for tree conflicts anyway.) I haven't
> > missed any reason why we'd not want to change this, have I?
> 
> Here we go.
> 
> Can anyone take a look at the attached patch and comment on the places
> I've flagged with "###" and generally on whether it's going the right
> way?
> 
> Much appreciated...

Here's an improved version. It now compares the properties as well as
the text content. It ignores the "svn:mergeinfo" property because that
property does not contribute to a difference in the file's current state
but only tells how the file reached its current state.

Just one test fails: merge_tests.py 77. It looks like the reverse-merge
"svn merge -r9:2 A A_COPY" is omitting r8 from the revision ranges it
thinks it needs to handle. r8 was a text mod to A/D/H/nu. Is this
because Subversion thinks, "there's no point merging r8 (a text mod)
because I can see that I would follow that with deleting the file, so it
would be a waste of effort"?

Paul or anyone, can you show me where this logic takes place in the
merge code path?

[[[
Properties on 'A_COPY/D/H/nu':
  svn:mergeinfo : /A/D/H/nu:5-8

CMD: svn ci -m "log msg"
Sending        A_COPY
Sending        A_COPY/B/E/beta
Sending        A_COPY/D/H
Adding         A_COPY/D/H/nu
Sending        A_COPY/D/H/omega
Transmitting file data ...
Committed revision 9.

CMD: svn merge -r9:2 A A_COPY
--- Reverse-merging r7 through r5 into 'A_COPY/D/H':
U    A_COPY/D/H/omega
Skipped 'A_COPY/D/H/nu'
--- Reverse-merging r6 through r5 into 'A_COPY':
U    A_COPY/B/E/beta
]]]

The file 'nu' is skipped because the change r7-r5 wants to delete it but
its content in A/D/H/nu@7 does not match A_COPY/D/H/nu@9. (The content
of A/D/H/nu@8 does match. That's the "merge-left source" that we need to
provide to the "file_deleted()" method.)

- Julian


Re: [PATCH] Merge deleting a file - compare its content

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Aug 08, 2008 at 05:53:01PM +0100, Julian Foad wrote:
> I must declare that this makes a semantic change to all the
> svn_client_merge3() family of APIs. I recently told someone they
> couldn't do that, and I should tell myself I can't do this.

Heh, I believe that was me :)

Good you mentioned it again, I've filed an issue now so we don't
forget about it: "API regression in svn_client_copy4/3/2/1()"
http://subversion.tigris.org/issues/show_bug.cgi?id=3256

Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Merge deleting a file - compare its content

Posted by Julian Foad <ju...@btopenworld.com>.
On Fri, 2008-08-08 at 11:53 +0100, Julian Foad wrote:
> On Thu, 2008-08-07 at 19:47 -0400, Paul Burba wrote:
> > On Thu, Aug 7, 2008 at 7:34 AM, Julian Foad <ju...@btopenworld.com> wrote:
> > > On Wed, 2008-08-06 at 17:53 -0400, Paul Burba wrote:
> > I think you can commit it and mark merge_tests.py 77 as XFail (with a
> > link to this thread in the test's comments).  I'm still fairly certain

Oops, I forgot to add a link to this thread... r32407.

> > I can tweak the 3067-related code to work with it (though I still
> > haven't quite got it working)...
> 
> Phew. Thanks. I can't commit until the comparison accounts for
> "translation" (keywords, line endings) between repository-normal form
> and working-copy form, but that's what I'm doing now.

Done and committed in r32405.

I must declare that this makes a semantic change to all the
svn_client_merge3() family of APIs. I recently told someone they
couldn't do that, and I should tell myself I can't do this. Instead, I
should create a new set of APIs with the new semantics and preserve the
old semantics on the old ones, as third-party tool authors might be
relying on the behaviour.

The API functions affected are
  svn_client_merge/2/3()
  svn_client_merge_reintegrate()
  svn_client_merge_peg/2/3()

The changes required to do preserve the old semantics are:

  New API revs:
    svn_client_merge4()
    svn_client_merge_reintegrate2()
    svn_client_merge_peg4()

  Add some flag into the merge baton that the implementation uses,
  and use it to select "old" or "new" behaviour in merge_file_deleted().

Do folks think this needs to be done, or is the change enough of a
harmless one that we're OK with it?

Personally, I think we (I) should rev the APIs to maintain the old
behaviour, but I'm asking as I don't like to do the extra work if it's
not necessary.

- Julian



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Merge deleting a file - compare its content

Posted by Julian Foad <ju...@btopenworld.com>.
On Thu, 2008-08-07 at 19:47 -0400, Paul Burba wrote:
> On Thu, Aug 7, 2008 at 7:34 AM, Julian Foad <ju...@btopenworld.com> wrote:
> > On Wed, 2008-08-06 at 17:53 -0400, Paul Burba wrote:
> >> I'm playing around with some tweaks to filter_merged_revisions() and
> >> prepare_subtree_ranges()'s behaviors in case F (and maybe it's forward
> >> merge equivalent described in case A) to see if your patch and the
> >> issue #3067 fixes can coexist.
> >
> > Cool. Thanks.
> >
> > May I commit my patch with this one case not yet working? Or, less
> > onerously, do you agree that the merge_file_deleted method is being
> > called incorrectly and we should look to fix it as a separate exercise?
> 
> I think you can commit it and mark merge_tests.py 77 as XFail (with a
> link to this thread in the test's comments).  I'm still fairly certain
> I can tweak the 3067-related code to work with it (though I still
> haven't quite got it working)...

Phew. Thanks. I can't commit until the comparison accounts for
"translation" (keywords, line endings) between repository-normal form
and working-copy form, but that's what I'm doing now.


> P.S. Now do you know why everyone was afraid to review the 3067
> backport for 1.5.1? :-D

Argh. Only too well!

- Julian



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Doc string of prepare_subtree_ranges() [was: Merge deleting a file - compare its content]

Posted by Julian Foad <ju...@btopenworld.com>.
On Fri, 2008-08-08 at 12:05 +0100, Julian Foad wrote:
> Paul Burba wrote:
> > Probably just stating the obvious, but keep in mind that
> > prepare_subtree_ranges() is a dedicated helper of
> > filter_merged_revisions() and almost all the arguments to the former
> > are straight from the latter.  So understanding this function in a
> > vacuum is quite difficult.
> 
> Yes, ack. Didn't mean to be too harsh. [...]

But then I continued...
[...]
> OK to an extent. But again, the reader does also need to know the common
> single definition of the output, otherwise he can't make use of the
> output without knowing which case it comes from.
[...]

Paul,

I've "gone off on one" about this particular function's doc string
because you drew it to my attention, but it may not be the best place to
spend energy now so don't feel you have to respond or do anything with
this.

Plenty of other stuff we need to do...

- Julian



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Doc string of prepare_subtree_ranges() [was: Merge deleting a file - compare its content]

Posted by Julian Foad <ju...@btopenworld.com>.
On Thu, 2008-08-07 at 19:47 -0400, Paul Burba wrote:
> On Thu, Aug 7, 2008 at 7:34 AM, Julian Foad <ju...@btopenworld.com> wrote:
[...]
> > We are reverse-merging changes r9:2 from a source branch "A" to a
> > destination branch "A_COPY". The only revisions we could possibly
> > consider merging are those within 9:2 that were operative in the source
> > branch's natural history (the revisions in which any kind of change was
> > committed to this tree). In this test, "svn log" shows that those are
> > A/: r3,4,5,6,7,8; A/D/H: r3,6,7,8; A/D/H/nu: r7,8.
> >
> > I think are saying that our merge-tracking logic deems that, of those
> > possible source revisions, the only ones that it makes sense to
> > reverse-merge are those that the mergeinfo says are present in the
> > target. (For A_COPY/, those are r5,6. For A_COPY/D/H/nu, the
> > intersection is r7,8.)
> 
> Exactly right for this example, but it's important to note that things
> aren't always so simple.  With the current implementation of merge
> tracking, particularly the nature of svn:mergeinfo as an inheritable
> property, a path can end up with inherited or explicit mergeinfo that
> is *not* part of its natural history - see

I don't understand. I acknowledge that a path can have (inherited or
explicit) mergeinfo that is not part of its natural history; isn't that
in fact how mergeinfo is always supposed to be? I thought #3157#desc8
says it is an aberration that mergeinfo sometimes DOES include bits of
the path's natural history.

So I'm not sure what less-simple case you're thinking of.


> http://subversion.tigris.org/issues/show_bug.cgi?id=3157#desc8,
> particularly the section "Several potential problems remain though".

[...]
> >> First 'A_COPY', this is simple, it needs r5 and r6 reverse merged.
> >>
> >> Next 'A_COPY\D\H', again this is straightforward, needing r5, r6, and
> >> r7 reverse merged.
> >>
> >> Lastly 'A_COPY\D\H\nu', this is where things get slightly more
> >> complicated.  On the face of it (going simply by the mergeinfo) this
> >> path needs r5-r8 reverse merged.  But 'nu' doesn't exist in the merge
> >> source until r7 so we don't want to reverse merge r5 or r6 into 'nu'
> >> directly as this would break the editor since 'nu' doesn't exist in
> >> the source at those revisions.
> >
> > Sure, that's clear.
> >
> >> I don't want to bury you in #3067 minutiae but now might be a good
> >> time to read the doc string for merge.c:prepare_subtree_ranges().
> >
> > Urg. :-) Comments/review, just of the doc string:
> 
> Probably just stating the obvious, but keep in mind that
> prepare_subtree_ranges() is a dedicated helper of
> filter_merged_revisions() and almost all the arguments to the former
> are straight from the latter.  So understanding this function in a
> vacuum is quite difficult.

Yes, ack. Didn't mean to be too harsh. I did of course (briefly) read up
on the remaining arguments where they're defined in the parent or
grandparent function. As you are only too well aware, there is a huge
mass of knowledge and understanding of how it all works required to
understand how any part of it works. Where I'm trying to help is by
identifying certain bits of this knowledge that need not be deeply
entangled and should be readily understandable by an outsider in a
"black box description" way.

> > Can you add a sentence that says what the REQUESTED_RANGELIST parameter
> > is for and what its element type is? There are partial descriptions of
> > it in Note1, Note2 and some of the 8 cases, but an overall statement
> > about it would be helpful.
> 
> merge.c's private functions use the term "rangelist" quite frequently,
> the assumption being the reader of this code will have looked at
> svn_mergeinfo.h and seen its "Terminology for data structures that
> contain mergeinfo" comment.  I'm torn between being concise and being
> pedantic!

Yes. Good to be concise. I understand the notion of a "rangelist" and
it's great that it's documented elsewhere. What I want to know here is
(1) the type of parameter, and especially (2) its purpose (meaning) with
respect to this function's job.

(1) Thanks for your update in r32401. That helps a lot. It makes clear
that it is an output not in-out, allocated by this function not its
caller, of element type (svn_merge_range_t *).

(BTW, how's the attached patch to the descriptions in
"svn_mergeinfo.h"?)

(2) As a caller, I ask myself "What do I get from this function?" As
well as the case-by-case description which provides understanding at a
deeper level, I need to know, "These are the ranges in which a node
exists at PRIMARY_URL" (no) or "These are the ranges in which the object
PRIMARY_URL@REVISION1 exists" (no) or "These are the ranges which ..."
WHAT?!!! What's the common feature of this output, that I can make use
of without knowing which case (A, B, ..., G, H) it came from?

The key description is this:

  [...] The purpose of this helper is to identify these cases of broken
history and where possible to adjust the requested range
REVISION1:REVISION2 being merged to the subtree so that we don't try to
describe invalid path/revisions to the merge report editor [...]

OK, so maybe, just maybe ...

  "Set *REQUESTED_RANGELIST to the list of unbroken history segments of
PARENT_PATH in which changes occurred to this sub-tree."

(Am I getting warmer with each guess?)

 
> > Maybe something like:
> >
> >  Set *REQUESTED_RANGELIST to a newly allocated array of
> >  unordered
> 
> Did you see a case where they would be unordered?  This shouldn't be the case.

The sentence "Then take the intersection of REVISION1:N [...] and add it
to *REQUESTED_RANGELIST" gave me a feeling of uncertainty about it, so I
wrote that so you'd check it. I'm glad it's an ordered list.

> > (svn_merge_range_t) elements representing the
> >  revision ranges in which the node PRIMARY_URL@REVISION1
> >  existed, plus any range in which its parent existed before
> >  it did, between REVISION1 and REVISION2. ("Before" being
> >  relative to the direction REVISION1 >> REVISION2.) (???)
> 
> I'd prefer not to do this at it would be redundant with the case A-H
> listing that is already there no?  I did rework the comment in r32401
> in an effort to make it clearer...let me know if it helps.

OK to an extent. But again, the reader does also need to know the common
single definition of the output, otherwise he can't make use of the
output without knowing which case it comes from.

[...]
> > Intro: "Note in *CHILD_DELETED_OR_NONEXISTANT if the subtree doesn't
> > exist, is deleted, or is renamed in the requested range." Say, "or is
> > renamed or replaced". Should this say, "doesn't exist at all in the
> > requested range, or is deleted, renamed or replaced before the end of
> > the range"?
> 
> r32401 removed that comment, referring to the 8 cases, rather than
> trying to sum up the behavior.

Again, that's now a more precise specification of what the function
does, but the reader also needs to know what the common meaning is, in
order to be able to make use of that flag.

> > What I'm trying to get at is it's not clear about how the
> > object is allowed to come into existence after the start of the range.
> 
> Not entirely sure I follow you here...do you mean how can we be asking
> about PRIMARY_URL's history between REVISION1:REVISION2 when that
> history might not be complete?  If so see my next comment.  If that's
> not what you meant, can you say it again, but with different words?
> :-P

Well, I'm not too clear either. Something like: you're saying that this
function returns the history segments in which "the object" existed...
but if "the object" didn't exist at REVISION1 (its peg rev), then how
can we claim to return any information about "the object"? Or, if you're
saying that this function returns the history segments in which "this
PATH" existed, or something else, then, well, maybe that story would
hang together better...


Very sorry but though you've written a lot more below I can't take any
more today and am skipping to the end (in a separate reply) :-)

> > Case B: I'm afraid I can't follow the description. "Ranges ... at which
> > PRIMARY_URL exists": meaning all ranges in which any node exists at
> > PRIMARY_URL? If there are two or more, does it matter that these might
> > be history segments of different objects?
> 
> Not really, because we are dealing with merge target subtrees here,
> recall the comment (emphasis on subtrees added):
> 
> "Since this function is only invoked for *SUBTREES* of the merge
> target, the guarantees afforded by normalize_merge_sources() don't
> apply.  Therefore it is possible that PRIMARY_URL@REVISION1 and
> PRIMARY_URL@REVISION2 don't describe an unbroken line of history."
> 
> This might be a good time to look at normalize_merge_sources() and the
> 'MERGEINFO MERGE SOURCE NORMALIZATION' comment at the top of merge.c
> if you haven't already.
> 
> Anyway, the fact that subtrees don't adhere the the merge source
> normalization rules is the crux of issue #3067.  It might help to
> think of it as being a side effect of how merging -rX:Y to PATH can
> have a different effect on PATH/SUBTREE than can merging -rX:Y
> *directly* to PATH/SUBTREE.
> 
> For example, say we have this repos structure:
> 
> /
> trunk/
> trunk/file
> branch/file
> 
> Where trunk and trunk/file were created in r1
> 
> branch was copied from trunk in r2
> 
> trunk/file was modified in r3
> 
> trunk/file was deleted in r4
> 
> trunk/file was added (with identical content to trunk/file@2) without
> history in r5
> 
> Now we could merge r3 from trunk_URL to branch and branch/file will be
> updated.  This is because the merge only cares that trunk@2:3 and
> trunk@HEAD are ancestrally related so it does the merge.  But if we
> merge r3 from trunk_URL/file to branch/file directly, then nothing
> happens, trunk/file@2:3 is not ancestrally related to trunk/file@HEAD
> so the merge is a no-op.
> 
> Now imagine if we made many changes to trunk/file before it was
> replaced and then made many more after.  Imagine that subset of the
> changes from before and after had been merged in such a way that
> branch/file had it's own explicit mergeinfo, which might look like:
> 
>   svn:mergeinfo : /trunk/file:3-4,7,23,25-40,42
> 
> This is valid, but realize that /trunk/file:3-4,7 and say
> /trunk/file:23,25-40,42 could be two completely unrelated lines of
> history (let's say trunk/file was replaced without history in r12).
> 
> Let's further assume that r5 and r45 are mods to /trunk/file on those
> two different lines of history.  What happens if we merge -r3:45
> trunk_URL to branch?  Since trunk/file is a subtree of the merge
> target it should attempt to get both r5 and r45 merged into it.
> 
> Sorry, that is the long answer as to why it doesn't "matter that these might
> > be history segments of different objects".  I swear to you I tried to formulate a more concise answer...but it made no sense :-(
> 
> > Then, adding a segment in
> > which it does not exist: is that behaviour specific to a particular
> > requirement that the current caller has in this case, or something that
> > can be explained more generally?
> 
> It's to do with how merge.c:drive_merge_report_editor() works.  The
> connection is probably not obvious, so here it is:
> 
> do_directory_merge() uses this group of helper functions to determine
> what ranges to merge:
> 
> populate_remaining_ranges
> calculate_remaining_ranges
> filter_merged_revisions
> prepare_subtree_ranges
> 
> Then do_directory_merge() calls drive_merge_report_editor() to
> actually merge those revision.
> 
> Sidebar: It's almost certainly possible to streamline that first group
> of helpers, but it won't be easy.
> 
> Anyway, look in drive_merge_report_editor() where we loop over the
> CHILDREN_WITH_MERGEINFO, right after the comment "Describe children
> with mergeinfo overlapping this merge operation such that no repeated
> diff is retrieved for them from the repository."  This is where issue
> #3067 ultimately manifested itself, those reporter->set_path() calls
> at the bottom of the loop were describing *subtree* path/revisions
> that didn't exist.  But notice this bit:
> 
>           /* If a subtree needs the same range applied as it's nearest parent
>              with mergeinfo, then we don't need to describe the subtree
>              separately. */
> 
> That's why case 'B' sets a subtree's remaining_ranges to include the
> ranges it's nearest parent also has, so we don't try to describe the
> subtree at a revision it doesn't exist.
> 
> First, does that make sense at all?
> 
> Second, if it does, do you have a suggestion as to how to better
> document it in the code?  Maybe refer to that specific part of
> drive_merge_report_editor() in prepare_subtree_ranges?
> 
> I know, this code is not clearly organized but I'd like to move in
> that direction as much as possible (and resist the urge to enumerate
> the reasons it got this way).
> 
> > "PARENT->REQUESTED_RANGELIST": do you
> > mean PARENT->REMAINING_RANGES?
> 
> Yes, that was a typo, fixed.
> 
> >
> > Case D says "not replaced within that range". Whereas case H is
> > complemented by case F which covers the broken-line-of-history variant,
> > case D is not complemented by such a case.
> 
> The 4 forward merge cases are not exactly analogous to the 4 reverse
> merge cases.  This is because of the way the workhorse of this
> function, namely svn_client__repos_location_segments(), works.
> svn_client__repos_location_segments() requires that its START_REV
> argument be younger than it's END_REV arg and PEG_REV must be at least
> as young as START_REV.
> 
> 
> > Case E: The "###" note: Urg? Sorry, overwhelmed and baffled :-0
> 
> Sorry about the format of the comment, this was in place in a 3067
> patch I submitted to dev and was intended to to draw reviewers into
> the part of the 3067 fix I was least confident in...I'm not sure that
> happened.  Anyway it should have been cleaned up before going into
> trunk, but got committed as-is.  I didn't touch it much in my r32401
> cleanup...I hesitate to remove them as the comments still seem valid.
> 
> Ok, since you are probably as sick of reading this as I am of writing
> it :-) I'll say this:
> 
> Tweak merge_tests.py 98 'subtree merge source might not exist' to
> raise a failure right after the "# Now test a reverse merge where part
> of the requested range postdates" comment.  Check out what the test is
> doing there and see if Case E starts to make any sense.

- Julian


Re: [PATCH] Merge deleting a file - compare its content

Posted by Paul Burba <pt...@gmail.com>.
On Thu, Aug 7, 2008 at 7:34 AM, Julian Foad <ju...@btopenworld.com> wrote:
> On Wed, 2008-08-06 at 17:53 -0400, Paul Burba wrote:
>> On Wed, Aug 6, 2008 at 7:29 AM, Julian Foad <ju...@btopenworld.com> wrote:
>> > Here's an improved version. It now compares the properties as well as
>> > the text content. It ignores the "svn:mergeinfo" property because that
>> > property does not contribute to a difference in the file's current state
>> > but only tells how the file reached its current state.
>>
>> +1
>>
>> > Just one test fails: merge_tests.py 77. It looks like the reverse-merge
>> > "svn merge -r9:2 A A_COPY" is omitting r8 from the revision ranges it
>> > thinks it needs to handle. r8 was a text mod to A/D/H/nu. Is this
>> > because Subversion thinks, "there's no point merging r8 (a text mod)
>> > because I can see that I would follow that with deleting the file, so it
>> > would be a waste of effort"?
>>
>> Your guess is pretty much spot on, r8 is being ignored, as far as 'nu'
>> is concerned, because 'nu' will be deleted anyway when r7 is reverse
>> merged.  Well, to be accurate, it's not doing this simply because 'nu'
>> will be deleted, rather it is taking advantage of that fact as part of
>> the fix for issue #3067, which tries to avoid describing
>> path/revisions to the editor which don't actually exist.
>>
>> It might help to walk through exactly what is happening in merge_tests.py 77:
>
> Thanks for all this, Paul.
>
>> Right before we reverse merge -r9:2 from 'A' to 'A_COPY', the tree
>> rooted at 'A_COPY' has the following paths explicit mergeinfo:
>>
>>                               Path             Mergeinfo
>> 1) The merge target itself  : A_COPY        :  /A:5-6
>> 2) A subtree with mergeinfo : A_COPY\D\H    :  /A/D/H:5-7
>> 3) A subtree with mergeinfo : A_COPY\D\H\nu :  /A/D/H/nu:5-8
>>
>> (Forgive me if I over-explain but I'm not sure how much you have
>> looked at these parts of the merge code)
>>
>> Each of these three paths are stored in the ubiquitous
>> children_with_mergeinfo array in a svn_client__merge_path_t struct
>> which contains a slew of merge-related info about the path, including
>> the remaining_ranges to be merged.
>>
>> The merge looks at each path (this is all happening in
>> lbisvn_client/merge.c:populate_remaining_ranges() and its many
>> helpers) and determines what revisions of the requested r9:2 should be
>> reverse merged into that path.
>
> OK.
>
>> Note, that since this *is* a reverse merge we only consider revisions
>> which are present in the implicit mergeinfo (i.e. natural history) or
>> its inherited/explicit mergeinfo.
>
> I didn't quite follow this sentence at first, but I think I get it now.
> Your mention of "natural history"

(Kamesh, added you to the cc as you know as much about this topic as
me, so if you have any other comments please have at it!)

The terms "natural history" and "implicit mergeinfo" were coined
during the development of merge tracking (well the latter term
certainly was anyway).  Originally tied to the idea that a copy would
create explict mergeinfo describing its natural history or implicit
mergeinfo on the copy destination.  Of course this was later seen to
be unnecessary, the repository knows the copies history after all, but
the term stuck (in my head at least!) - see
svn_client__get_history_as_mergeinfo() for example.  Also see "Natural
History and Implicit Mergeinfo" in
http://www.collab.net/community/subversion/articles/merge-info.html.

> refers to the general case where we
> might be doing a reverse-merge from our branch's own history, but that's
> not what we're doing in this case.

Correct.  In stating the general rule about what is eligible for
reverse merging I just muddied the waters.

> We are reverse-merging changes r9:2 from a source branch "A" to a
> destination branch "A_COPY". The only revisions we could possibly
> consider merging are those within 9:2 that were operative in the source
> branch's natural history (the revisions in which any kind of change was
> committed to this tree). In this test, "svn log" shows that those are
> A/: r3,4,5,6,7,8; A/D/H: r3,6,7,8; A/D/H/nu: r7,8.
>
> I think are saying that our merge-tracking logic deems that, of those
> possible source revisions, the only ones that it makes sense to
> reverse-merge are those that the mergeinfo says are present in the
> target. (For A_COPY/, those are r5,6. For A_COPY/D/H/nu, the
> intersection is r7,8.)

Exactly right for this example, but it's important to note that things
aren't always so simple.  With the current implementation of merge
tracking, particularly the nature of svn:mergeinfo as an inheritable
property, a path can end up with inherited or explicit mergeinfo that
is *not* part of its natural history - see
http://subversion.tigris.org/issues/show_bug.cgi?id=3157#desc8,
particularly the section "Several potential problems remain though".

Sidebar: Notice I said it is a problem with the implementation rather
than the design (though the design was mute or vague at best on these
issues).  I think this could be fixed, I just wonder at what cost of
effort, code complexity, and performance.

>> First 'A_COPY', this is simple, it needs r5 and r6 reverse merged.
>>
>> Next 'A_COPY\D\H', again this is straightforward, needing r5, r6, and
>> r7 reverse merged.
>>
>> Lastly 'A_COPY\D\H\nu', this is where things get slightly more
>> complicated.  On the face of it (going simply by the mergeinfo) this
>> path needs r5-r8 reverse merged.  But 'nu' doesn't exist in the merge
>> source until r7 so we don't want to reverse merge r5 or r6 into 'nu'
>> directly as this would break the editor since 'nu' doesn't exist in
>> the source at those revisions.
>
> Sure, that's clear.
>
>> I don't want to bury you in #3067 minutiae but now might be a good
>> time to read the doc string for merge.c:prepare_subtree_ranges().
>
> Urg. :-) Comments/review, just of the doc string:

Probably just stating the obvious, but keep in mind that
prepare_subtree_ranges() is a dedicated helper of
filter_merged_revisions() and almost all the arguments to the former
are straight from the latter.  So understanding this function in a
vacuum is quite difficult.

> Can you add a sentence that says what the REQUESTED_RANGELIST parameter
> is for and what its element type is? There are partial descriptions of
> it in Note1, Note2 and some of the 8 cases, but an overall statement
> about it would be helpful.

merge.c's private functions use the term "rangelist" quite frequently,
the assumption being the reader of this code will have looked at
svn_mergeinfo.h and seen its "Terminology for data structures that
contain mergeinfo" comment.  I'm torn between being concise and being
pedantic!

> Maybe something like:
>
>  Set *REQUESTED_RANGELIST to a newly allocated array of
>  unordered

Did you see a case where they would be unordered?  This shouldn't be the case.

> (svn_merge_range_t) elements representing the
>  revision ranges in which the node PRIMARY_URL@REVISION1
>  existed, plus any range in which its parent existed before
>  it did, between REVISION1 and REVISION2. ("Before" being
>  relative to the direction REVISION1 >> REVISION2.) (???)

I'd prefer not to do this at it would be redundant with the case A-H
listing that is already there no?  I did rework the comment in r32401
in an effort to make it clearer...let me know if it helps.

> Sorry, that attempt is surely wrong and unhelpful!
>
> Intro: "Filter the requested ranges" -> "Filter the requested range
> REVISION1:REVISION2"?

Done.

> Intro: "Note in *CHILD_DELETED_OR_NONEXISTANT if the subtree doesn't
> exist, is deleted, or is renamed in the requested range." Say, "or is
> renamed or replaced". Should this say, "doesn't exist at all in the
> requested range, or is deleted, renamed or replaced before the end of
> the range"?

r32401 removed that comment, referring to the 8 cases, rather than
trying to sum up the behavior.

> What I'm trying to get at is it's not clear about how the
> object is allowed to come into existence after the start of the range.

Not entirely sure I follow you here...do you mean how can we be asking
about PRIMARY_URL's history between REVISION1:REVISION2 when that
history might not be complete?  If so see my next comment.  If that's
not what you meant, can you say it again, but with different words?
:-P

> Case B: I'm afraid I can't follow the description. "Ranges ... at which
> PRIMARY_URL exists": meaning all ranges in which any node exists at
> PRIMARY_URL? If there are two or more, does it matter that these might
> be history segments of different objects?

Not really, because we are dealing with merge target subtrees here,
recall the comment (emphasis on subtrees added):

"Since this function is only invoked for *SUBTREES* of the merge
target, the guarantees afforded by normalize_merge_sources() don't
apply.  Therefore it is possible that PRIMARY_URL@REVISION1 and
PRIMARY_URL@REVISION2 don't describe an unbroken line of history."

This might be a good time to look at normalize_merge_sources() and the
'MERGEINFO MERGE SOURCE NORMALIZATION' comment at the top of merge.c
if you haven't already.

Anyway, the fact that subtrees don't adhere the the merge source
normalization rules is the crux of issue #3067.  It might help to
think of it as being a side effect of how merging -rX:Y to PATH can
have a different effect on PATH/SUBTREE than can merging -rX:Y
*directly* to PATH/SUBTREE.

For example, say we have this repos structure:

/
trunk/
trunk/file
branch/file

Where trunk and trunk/file were created in r1

branch was copied from trunk in r2

trunk/file was modified in r3

trunk/file was deleted in r4

trunk/file was added (with identical content to trunk/file@2) without
history in r5

Now we could merge r3 from trunk_URL to branch and branch/file will be
updated.  This is because the merge only cares that trunk@2:3 and
trunk@HEAD are ancestrally related so it does the merge.  But if we
merge r3 from trunk_URL/file to branch/file directly, then nothing
happens, trunk/file@2:3 is not ancestrally related to trunk/file@HEAD
so the merge is a no-op.

Now imagine if we made many changes to trunk/file before it was
replaced and then made many more after.  Imagine that subset of the
changes from before and after had been merged in such a way that
branch/file had it's own explicit mergeinfo, which might look like:

  svn:mergeinfo : /trunk/file:3-4,7,23,25-40,42

This is valid, but realize that /trunk/file:3-4,7 and say
/trunk/file:23,25-40,42 could be two completely unrelated lines of
history (let's say trunk/file was replaced without history in r12).

Let's further assume that r5 and r45 are mods to /trunk/file on those
two different lines of history.  What happens if we merge -r3:45
trunk_URL to branch?  Since trunk/file is a subtree of the merge
target it should attempt to get both r5 and r45 merged into it.

Sorry, that is the long answer as to why it doesn't "matter that these might
> be history segments of different objects".  I swear to you I tried to formulate a more concise answer...but it made no sense :-(

> Then, adding a segment in
> which it does not exist: is that behaviour specific to a particular
> requirement that the current caller has in this case, or something that
> can be explained more generally?

It's to do with how merge.c:drive_merge_report_editor() works.  The
connection is probably not obvious, so here it is:

do_directory_merge() uses this group of helper functions to determine
what ranges to merge:

populate_remaining_ranges
calculate_remaining_ranges
filter_merged_revisions
prepare_subtree_ranges

Then do_directory_merge() calls drive_merge_report_editor() to
actually merge those revision.

Sidebar: It's almost certainly possible to streamline that first group
of helpers, but it won't be easy.

Anyway, look in drive_merge_report_editor() where we loop over the
CHILDREN_WITH_MERGEINFO, right after the comment "Describe children
with mergeinfo overlapping this merge operation such that no repeated
diff is retrieved for them from the repository."  This is where issue
#3067 ultimately manifested itself, those reporter->set_path() calls
at the bottom of the loop were describing *subtree* path/revisions
that didn't exist.  But notice this bit:

          /* If a subtree needs the same range applied as it's nearest parent
             with mergeinfo, then we don't need to describe the subtree
             separately. */

That's why case 'B' sets a subtree's remaining_ranges to include the
ranges it's nearest parent also has, so we don't try to describe the
subtree at a revision it doesn't exist.

First, does that make sense at all?

Second, if it does, do you have a suggestion as to how to better
document it in the code?  Maybe refer to that specific part of
drive_merge_report_editor() in prepare_subtree_ranges?

I know, this code is not clearly organized but I'd like to move in
that direction as much as possible (and resist the urge to enumerate
the reasons it got this way).

> "PARENT->REQUESTED_RANGELIST": do you
> mean PARENT->REMAINING_RANGES?

Yes, that was a typo, fixed.

>
> Case D says "not replaced within that range". Whereas case H is
> complemented by case F which covers the broken-line-of-history variant,
> case D is not complemented by such a case.

The 4 forward merge cases are not exactly analogous to the 4 reverse
merge cases.  This is because of the way the workhorse of this
function, namely svn_client__repos_location_segments(), works.
svn_client__repos_location_segments() requires that its START_REV
argument be younger than it's END_REV arg and PEG_REV must be at least
as young as START_REV.


> Case E: The "###" note: Urg? Sorry, overwhelmed and baffled :-0

Sorry about the format of the comment, this was in place in a 3067
patch I submitted to dev and was intended to to draw reviewers into
the part of the 3067 fix I was least confident in...I'm not sure that
happened.  Anyway it should have been cleaned up before going into
trunk, but got committed as-is.  I didn't touch it much in my r32401
cleanup...I hesitate to remove them as the comments still seem valid.

Ok, since you are probably as sick of reading this as I am of writing
it :-) I'll say this:

Tweak merge_tests.py 98 'subtree merge source might not exist' to
raise a failure right after the "# Now test a reverse merge where part
of the requested range postdates" comment.  Check out what the test is
doing there and see if Case E starts to make any sense.

>> What we have merge_tests.py 77 is described in case 'F' in that doc
>> string.  prepare_subtree_ranges() figures out that 'nu' will be
>> deleted by the merge, so doesn't do any filering of the requested
>> ranges, rather it reports to its caller filter_merged_revisions() that
>> 'nu' is going to be deleted.  Now see "A little trick" in
>> filter_merged_revisions() -- It sees that the subtree 'nu' will be
>> deleted, so it sets its remaining_ranges to be the same as its nearest
>> parent in the children_with_mergeinfo array, namely 'A_COPY/D/H', so
>> r8 is never reverse merged, which obviously breaks your patch in this
>> case.
>>
>> I'm playing around with some tweaks to filter_merged_revisions() and
>> prepare_subtree_ranges()'s behaviors in case F (and maybe it's forward
>> merge equivalent described in case A) to see if your patch and the
>> issue #3067 fixes can coexist.
>
> Cool. Thanks.
>
> May I commit my patch with this one case not yet working? Or, less
> onerously, do you agree that the merge_file_deleted method is being
> called incorrectly and we should look to fix it as a separate exercise?

I think you can commit it and mark merge_tests.py 77 as XFail (with a
link to this thread in the test's comments).  I'm still fairly certain
I can tweak the 3067-related code to work with it (though I still
haven't quite got it working)...

Paul

P.S. Now do you know why everyone was afraid to review the 3067
backport for 1.5.1? :-D

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Merge deleting a file - compare its content

Posted by Julian Foad <ju...@btopenworld.com>.
On Wed, 2008-08-06 at 17:53 -0400, Paul Burba wrote:
> On Wed, Aug 6, 2008 at 7:29 AM, Julian Foad <ju...@btopenworld.com> wrote:
> > Here's an improved version. It now compares the properties as well as
> > the text content. It ignores the "svn:mergeinfo" property because that
> > property does not contribute to a difference in the file's current state
> > but only tells how the file reached its current state.
> 
> +1
> 
> > Just one test fails: merge_tests.py 77. It looks like the reverse-merge
> > "svn merge -r9:2 A A_COPY" is omitting r8 from the revision ranges it
> > thinks it needs to handle. r8 was a text mod to A/D/H/nu. Is this
> > because Subversion thinks, "there's no point merging r8 (a text mod)
> > because I can see that I would follow that with deleting the file, so it
> > would be a waste of effort"?
> 
> Your guess is pretty much spot on, r8 is being ignored, as far as 'nu'
> is concerned, because 'nu' will be deleted anyway when r7 is reverse
> merged.  Well, to be accurate, it's not doing this simply because 'nu'
> will be deleted, rather it is taking advantage of that fact as part of
> the fix for issue #3067, which tries to avoid describing
> path/revisions to the editor which don't actually exist.
> 
> It might help to walk through exactly what is happening in merge_tests.py 77:

Thanks for all this, Paul.

> Right before we reverse merge -r9:2 from 'A' to 'A_COPY', the tree
> rooted at 'A_COPY' has the following paths explicit mergeinfo:
> 
>                               Path             Mergeinfo
> 1) The merge target itself  : A_COPY        :  /A:5-6
> 2) A subtree with mergeinfo : A_COPY\D\H    :  /A/D/H:5-7
> 3) A subtree with mergeinfo : A_COPY\D\H\nu :  /A/D/H/nu:5-8
> 
> (Forgive me if I over-explain but I'm not sure how much you have
> looked at these parts of the merge code)
> 
> Each of these three paths are stored in the ubiquitous
> children_with_mergeinfo array in a svn_client__merge_path_t struct
> which contains a slew of merge-related info about the path, including
> the remaining_ranges to be merged.
> 
> The merge looks at each path (this is all happening in
> lbisvn_client/merge.c:populate_remaining_ranges() and its many
> helpers) and determines what revisions of the requested r9:2 should be
> reverse merged into that path.

OK.

> Note, that since this *is* a reverse merge we only consider revisions
> which are present in the implicit mergeinfo (i.e. natural history) or
> its inherited/explicit mergeinfo.

I didn't quite follow this sentence at first, but I think I get it now.
Your mention of "natural history" refers to the general case where we
might be doing a reverse-merge from our branch's own history, but that's
not what we're doing in this case.

We are reverse-merging changes r9:2 from a source branch "A" to a
destination branch "A_COPY". The only revisions we could possibly
consider merging are those within 9:2 that were operative in the source
branch's natural history (the revisions in which any kind of change was
committed to this tree). In this test, "svn log" shows that those are
A/: r3,4,5,6,7,8; A/D/H: r3,6,7,8; A/D/H/nu: r7,8.

I think are saying that our merge-tracking logic deems that, of those
possible source revisions, the only ones that it makes sense to
reverse-merge are those that the mergeinfo says are present in the
target. (For A_COPY/, those are r5,6. For A_COPY/D/H/nu, the
intersection is r7,8.)

> First 'A_COPY', this is simple, it needs r5 and r6 reverse merged.
> 
> Next 'A_COPY\D\H', again this is straightforward, needing r5, r6, and
> r7 reverse merged.
> 
> Lastly 'A_COPY\D\H\nu', this is where things get slightly more
> complicated.  On the face of it (going simply by the mergeinfo) this
> path needs r5-r8 reverse merged.  But 'nu' doesn't exist in the merge
> source until r7 so we don't want to reverse merge r5 or r6 into 'nu'
> directly as this would break the editor since 'nu' doesn't exist in
> the source at those revisions.

Sure, that's clear.

> I don't want to bury you in #3067 minutiae but now might be a good
> time to read the doc string for merge.c:prepare_subtree_ranges().

Urg. :-) Comments/review, just of the doc string:

Can you add a sentence that says what the REQUESTED_RANGELIST parameter
is for and what its element type is? There are partial descriptions of
it in Note1, Note2 and some of the 8 cases, but an overall statement
about it would be helpful. Maybe something like:

  Set *REQUESTED_RANGELIST to a newly allocated array of
  unordered (svn_merge_range_t) elements representing the
  revision ranges in which the node PRIMARY_URL@REVISION1
  existed, plus any range in which its parent existed before
  it did, between REVISION1 and REVISION2. ("Before" being
  relative to the direction REVISION1 >> REVISION2.) (???)

Sorry, that attempt is surely wrong and unhelpful!

Intro: "Filter the requested ranges" -> "Filter the requested range
REVISION1:REVISION2"?

Intro: "Note in *CHILD_DELETED_OR_NONEXISTANT if the subtree doesn't
exist, is deleted, or is renamed in the requested range." Say, "or is
renamed or replaced". Should this say, "doesn't exist at all in the
requested range, or is deleted, renamed or replaced before the end of
the range"? What I'm trying to get at is it's not clear about how the
object is allowed to come into existence after the start of the range.

Case B: I'm afraid I can't follow the description. "Ranges ... at which
PRIMARY_URL exists": meaning all ranges in which any node exists at
PRIMARY_URL? If there are two or more, does it matter that these might
be history segments of different objects? Then, adding a segment in
which it does not exist: is that behaviour specific to a particular
requirement that the current caller has in this case, or something that
can be explained more generally? "PARENT->REQUESTED_RANGELIST": do you
mean PARENT->REMAINING_RANGES?

Case D says "not replaced within that range". Whereas case H is
complemented by case F which covers the broken-line-of-history variant,
case D is not complemented by such a case.

Case E: The "###" note: Urg? Sorry, overwhelmed and baffled :-0


> What we have merge_tests.py 77 is described in case 'F' in that doc
> string.  prepare_subtree_ranges() figures out that 'nu' will be
> deleted by the merge, so doesn't do any filering of the requested
> ranges, rather it reports to its caller filter_merged_revisions() that
> 'nu' is going to be deleted.  Now see "A little trick" in
> filter_merged_revisions() -- It sees that the subtree 'nu' will be
> deleted, so it sets its remaining_ranges to be the same as its nearest
> parent in the children_with_mergeinfo array, namely 'A_COPY/D/H', so
> r8 is never reverse merged, which obviously breaks your patch in this
> case.
> 
> I'm playing around with some tweaks to filter_merged_revisions() and
> prepare_subtree_ranges()'s behaviors in case F (and maybe it's forward
> merge equivalent described in case A) to see if your patch and the
> issue #3067 fixes can coexist.

Cool. Thanks.

May I commit my patch with this one case not yet working? Or, less
onerously, do you agree that the merge_file_deleted method is being
called incorrectly and we should look to fix it as a separate exercise?

- Julian



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Merge deleting a file - compare its content

Posted by Julian Foad <ju...@btopenworld.com>.
On Thu, 2008-08-07 at 14:08 +0200, Jens Seidel wrote:
> On Wed, Aug 06, 2008 at 05:53:53PM -0400, Paul Burba wrote:
> > On Wed, Aug 6, 2008 at 7:29 AM, Julian Foad <ju...@btopenworld.com> wrote:
> > > Here's an improved version. It now compares the properties as well as
> > > the text content. It ignores the "svn:mergeinfo" property because that
> > > property does not contribute to a difference in the file's current state
> > > but only tells how the file reached its current state.
> 
> The svn:executable, svn:mime-type and svn:eol-style should not matter as
> well. This file is not changed just because it differs in these properties.

In some cases, you are right: the user would not care about the change
they made to one of those properties, if the file is simply being
deleted. However, in other cases, especially if the file is deleted
because the incoming change is renaming/moving it, then the user
probably does care and they will want to notice this difference and copy
the new property value to the new file. Therefore I think it is better
not to guess that the user won't care, but instead guess that they will
care.

>  I often change these in one branch but forget it in another
> one (and merging is not always possible if the branches diverged a lot).

I'm not sure about "merging is not always possible": remember that the
very case we're discussing IS a merge.

> If the content is equal but other (non-trivial) properties are not, it

I cannot think of a rule that we could use to distinguish "important"
properties from "trivial" ones.

> is a good idea to keep the properties as proposed. But would it be OK to
> change the file to an empty one (the content matched the to be removed
> file, we are in trouble because of the properties only)? Maybe also
> changing the file content into:
> 
> <<<<<<<<
> ========
> Subversion note: This file was indented to be removed but
> the properties didn't match. That's why it still exists with
> this pseudo conflict in it but with removed content.
> >>>>>>>>

Eww, no, I don't think it would be nice to have yet another "in-between"
result. Let us stick to the simple results: either the deletion is done
or it is not done.

Note that tree conflict handling will make this better. Specifically, if
the properties differ then a conflict will be raised and the user will
easily be able to say "resolve by using Theirs" which means delete the
file.

Thanks,
- Julian



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Merge deleting a file - compare its content

Posted by Jens Seidel <je...@users.sourceforge.net>.
On Wed, Aug 06, 2008 at 05:53:53PM -0400, Paul Burba wrote:
> On Wed, Aug 6, 2008 at 7:29 AM, Julian Foad <ju...@btopenworld.com> wrote:
> > On Tue, 2008-08-05 at 15:57 +0100, Julian Foad wrote:
> >> On Fri, 2008-07-25 at 12:52 +0100, Julian Foad wrote:
> >> > On Thu, 2008-07-24 at 11:57 -0400, Paul Burba wrote:
> >> > > Now we try the reverse merge, and as in < 1.5 the files scheduled for
> >> > > addition are skipped:
> >> > >
> >> > >   1.5.1>svn.exe merge -r3:2 %URL%/branches/mybranch
> >> > >   Skipped 'newfile1'
> >> > >   Skipped 'newfile2'
> >> > >   Skipped 'newfile3'
> >> >
> >> > A good enhancement will be to compare the locally-added file with the
> >> > one that the incoming change thinks it's deleting (i.e. the merge-left
> >> > source), and only skip if they're different. That check is something
> >> > we're doing for tree conflict detection anyway, but can be done
> >> > independently.

> > Here's an improved version. It now compares the properties as well as
> > the text content. It ignores the "svn:mergeinfo" property because that
> > property does not contribute to a difference in the file's current state
> > but only tells how the file reached its current state.

The svn:executable, svn:mime-type and svn:eol-style should not matter as
well. This file is not changed just because it differs in these
properties. I often change these in one branch but forget it in another
one (and merging is not always possible if the branches diverged a lot).

If the content is equal but other (non-trivial) properties are not, it
is a good idea to keep the properties as proposed. But would it be OK to
change the file to an empty one (the content matched the to be removed
file, we are in trouble because of the properties only)? Maybe also
changing the file content into:

<<<<<<<<
========
Subversion note: This file was indented to be removed but
the properties didn't match. That's why it still exists with
this pseudo conflict in it but with removed content.
>>>>>>>>

Jens

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Merge deleting a file - compare its content

Posted by Paul Burba <pt...@gmail.com>.
On Wed, Aug 6, 2008 at 7:29 AM, Julian Foad <ju...@btopenworld.com> wrote:
> On Tue, 2008-08-05 at 15:57 +0100, Julian Foad wrote:
>> On Fri, 2008-07-25 at 12:52 +0100, Julian Foad wrote:
>> > On Thu, 2008-07-24 at 11:57 -0400, Paul Burba wrote:
>> > > Now we try the reverse merge, and as in < 1.5 the files scheduled for
>> > > addition are skipped:
>> > >
>> > >   1.5.1>svn.exe merge -r3:2 %URL%/branches/mybranch
>> > >   Skipped 'newfile1'
>> > >   Skipped 'newfile2'
>> > >   Skipped 'newfile3'
>> >
>> > A good enhancement will be to compare the locally-added file with the
>> > one that the incoming change thinks it's deleting (i.e. the merge-left
>> > source), and only skip if they're different. That check is something
>> > we're doing for tree conflict detection anyway, but can be done
>> > independently.
>> >
>> > I'll have a go at a patch to make this happen. (It so happens that this
>> > is what I am about to work on for tree conflicts anyway.) I haven't
>> > missed any reason why we'd not want to change this, have I?
>>
>> Here we go.
>>
>> Can anyone take a look at the attached patch and comment on the places
>> I've flagged with "###" and generally on whether it's going the right
>> way?
>>
>> Much appreciated...
>
> Here's an improved version. It now compares the properties as well as
> the text content. It ignores the "svn:mergeinfo" property because that
> property does not contribute to a difference in the file's current state
> but only tells how the file reached its current state.

+1

> Just one test fails: merge_tests.py 77. It looks like the reverse-merge
> "svn merge -r9:2 A A_COPY" is omitting r8 from the revision ranges it
> thinks it needs to handle. r8 was a text mod to A/D/H/nu. Is this
> because Subversion thinks, "there's no point merging r8 (a text mod)
> because I can see that I would follow that with deleting the file, so it
> would be a waste of effort"?

Your guess is pretty much spot on, r8 is being ignored, as far as 'nu'
is concerned, because 'nu' will be deleted anyway when r7 is reverse
merged.  Well, to be accurate, it's not doing this simply because 'nu'
will be deleted, rather it is taking advantage of that fact as part of
the fix for issue #3067, which tries to avoid describing
path/revisions to the editor which don't actually exist.

It might help to walk through exactly what is happening in merge_tests.py 77:

Right before we reverse merge -r9:2 from 'A' to 'A_COPY', the tree
rooted at 'A_COPY' has the following paths explicit mergeinfo:

                              Path             Mergeinfo
1) The merge target itself  : A_COPY        :  /A:5-6
2) A subtree with mergeinfo : A_COPY\D\H    :  /A/D/H:5-7
3) A subtree with mergeinfo : A_COPY\D\H\nu :  /A/D/H/nu:5-8

(Forgive me if I over-explain but I'm not sure how much you have
looked at these parts of the merge code)

Each of these three paths are stored in the ubiquitous
children_with_mergeinfo array in a svn_client__merge_path_t struct
which contains a slew of merge-related info about the path, including
the remaining_ranges to be merged.

The merge looks at each path (this is all happening in
lbisvn_client/merge.c:populate_remaining_ranges() and its many
helpers) and determines what revisions of the requested r9:2 should be
reverse merged into that path.

Note, that since this *is* a reverse merge we only consider revisions
which are present in the implicit mergeinfo (i.e. natural history) or
its inherited/explicit mergeinfo.

First 'A_COPY', this is simple, it needs r5 and r6 reverse merged.

Next 'A_COPY\D\H', again this is straightforward, needing r5, r6, and
r7 reverse merged.

Lastly 'A_COPY\D\H\nu', this is where things get slightly more
complicated.  On the face of it (going simply by the mergeinfo) this
path needs r5-r8 reverse merged.  But 'nu' doesn't exist in the merge
source until r7 so we don't want to reverse merge r5 or r6 into 'nu'
directly as this would break the editor since 'nu' doesn't exist in
the source at those revisions.

I don't want to bury you in #3067 minutiae but now might be a good
time to read the doc string for merge.c:prepare_subtree_ranges().
What we have merge_tests.py 77 is described in case 'F' in that doc
string.  prepare_subtree_ranges() figures out that 'nu' will be
deleted by the merge, so doesn't do any filering of the requested
ranges, rather it reports to its caller filter_merged_revisions() that
'nu' is going to be deleted.  Now see "A little trick" in
filter_merged_revisions() -- It sees that the subtree 'nu' will be
deleted, so it sets its remaining_ranges to be the same as its nearest
parent in the children_with_mergeinfo array, namely 'A_COPY/D/H', so
r8 is never reverse merged, which obviously breaks your patch in this
case.

I'm playing around with some tweaks to filter_merged_revisions() and
prepare_subtree_ranges()'s behaviors in case F (and maybe it's forward
merge equivalent described in case A) to see if your patch and the
issue #3067 fixes can coexist.

Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org