Posted to users@subversion.apache.org by James Hanley <jh...@dgtlrift.com> on 2014/01/03 04:16:24 UTC

Keyword expansion from merged changes

I've used the Rev keyword in some of our code, and we noticed that there
may be a use case gap for the Rev/Revision and possibly Id keyword.

As expected, the keyword gets updated with any checkin, but there may be a
need for a merge-history aware version of these keywords, meaning that
the Rev shown after a merge isn't the checkin of the merge itself, but the
change origin of the merge.

So the use case would be branching a project P from the trunk to do some
work, then merging those changes back.  In its current form, the Rev takes
the revision of the merge checkin rather than the origin of the merged
change.

We see a need for this capability of having a merge-history aware keyword,
and one way to accomplish this without adding yet another keyword would be
to have the existing keywords support some kind of argument-like options.

For example, to set the Rev keyword as it works right now, simply set
Rev
for the property svn:keywords, but if the user really wanted Rev to
indicate the delta origin of the rev, they would use
rev.g
or perhaps
rev -g
in the svn:keywords entry...

The disadvantage of this method, versus creating new svn keywords or
embedding the options directly in the keyword referenced within the
file, is that you cannot have two different revs in the same file.
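
For reference, here is a minimal sketch of how the existing (not merge-aware) keywords are enabled today. The helper name enable_rev_keyword and the file name in the usage note are hypothetical; svn propset and svn propget are the stock commands. The rev.g and "rev -g" spellings above are only proposals from this message and are not shown here.

```shell
# Enable the standard Rev and Id keywords on one versioned file, then
# print the property back so the caller can confirm what was set.
# Usage: enable_rev_keyword PATH   (PATH is a file in a working copy)
enable_rev_keyword() {
  svn propset svn:keywords "Rev Id" "$1" &&
  svn propget svn:keywords "$1"
}
```

Called as "enable_rev_keyword foo.c" and followed by a commit, this makes $Rev$ in foo.c expand to the last revision in which the file changed on that path, which is the behavior the proposal wants to make configurable.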

Fuel for thought?
-Jim

RE: Keyword expansion from merged changes

Posted by Andrew Reedick <An...@cbeyond.net>.


> -----Original Message-----
> From: James Hanley [mailto:jhanley@dgtlrift.com]
> Sent: Saturday, January 04, 2014 2:47 AM
> To: Ben Reser
> Cc: users@subversion.apache.org
> Subject: Re: Keyword expansion from merged changes
> 
> 
> > So in my opinion I don't think this is a good suggested feature.
> 
> Fair enough, and one of the workarounds you previously mentioned may be
> useful, but in my opinion there is still gap between keywords and merge
> history even if this specific feature proposal is not a desired
> solution to close that gap.
> 
> Where do we go from here?

Nowhere.

<soapbox>

IMHO, you should change your process to not rely on keywords, for the simple reason that merge edge-cases require human intervention and/or interpretation (e.g. extra changes made in the merge revision, non-trivial conflict resolution, partial merges, reverse merges, cherry picked merges, record only merges, etc.)  The svn tools (e.g. 'svn diff' or 'svn mergeinfo --show-revs eligible') simply help to notify you that there may be a problem that needs to be investigated.  A difference between exported code bases could be acceptable, but only a human can make that determination.  "Missing" merges may be okay if the skipped revisions represent an unwanted or incomplete feature, (i.e. you don't want to merge incomplete work to trunk.)  

From a previous post:
"our need would be for during the release process for validating all changes are completely synchronized and that there are no missing changes between branches, but aside from our need, it just doesn't seem right that there would show differences between the exported branches"

There are two types of bicycle riders:  those who have fallen and those who have yet to fall.  Right now you have a very easy-to-merge code base, since you can "safely" assume that exported merges should be identical.  But trust me, you will eventually hit a merge edge case that completely negates your ability to maintain that assumption.

Your process, workflow, and issue/defect/bug/ticket tracking system should be instrumental in ensuring that work is being tracked (in addition to 'svn mergeinfo' and 'svn diff'.)  A "merge aware" keyword just isn't enough, because even if a "merge aware" keyword were implemented, it would become useless once you hit a merge edge case.

</soapbox>

Re: Keyword expansion from merged changes

Posted by James Hanley <jh...@dgtlrift.com>.


> On Jan 3, 2014, at 12:58 PM, Ben Reser <be...@reser.org> wrote:
> 
>> On 1/3/14, 8:59 AM, James Hanley wrote:
>> Can you expand on this - I am missing where the preceding differences would be
>> an issue.  From what I can see, if there is a delta, it is either the result of
>> direct modification or a merge - if the former, that would be the rev expanding
>> the keyword, if the later, the library/client/server would need to trace
>> through the most recent svn:mergeinfo in an analogous method to "log -g" until
>> it finds a delta without mergeinfo - which should be a a direct modification
>> and the delta indicated.
> 
> At least in my opinion the entire purpose of the keywords feature is to allow
> people to embed data into exports and checkouts that uniquely identify the
> files so that the exact version being used can be found in version control.

This is probably the best argument against the idea.  The small caveat to our flow (which I didn't show in my example graph) is that /trunk is off limits for direct deltas on our program, meaning it is a merge-only destination, even for the most minor of edits.  So large sweeps of changes in a branch are bundled under one merge change, unless each delta from the branch is merged and checked in independently.

> 
> Your feature subverts that entirely.  You seem to understand that with your
> response to my example.
> 
>> However, in the simple case of merging back changes:
>> 
>> 1 - * /trunk - create project
>>     |\
>> 2 - | * /branches/1.8.x - some change
>> 3 - * | /trunk - some more changes
>> 4 - * | /trunk - even more changes
>> 5 - | * /branches/1.8.x - some more changes
>> 6 - | * /branches/1.8.x - even more changes
>> 7 - * | /trunk - one more change
>>     |\|
>> 8 - | * /branches/1.8.x - merge missing changes from trunk
>>     |/|
>> 9 - * | /trunk - merge missing changes from branch 1.8.x
>>     ! !
>>     : :
>> 
>> After r9, in theory, if there was an export done on the trunk and the branch,
>> they should be identical, but the keywords (without the concept of merge aware
>> keywords) would show a difference on every file that the keywords are used,
>> since the branch would expand to show r9 and the trunk would expand to show
>> r8.  In the case of a merge aware keyword, the keyword would expand to the last
>> non-merged delta - so a file in r5 changed would show r5 on the trunk after the
>> merge of r9, and a file changed in r4 would show r4 in the branch after the
>> merge of r8 - even though the file may have changed in r6, but chronologically
>> the last change was the merge of r6.
> 
> I'm going to ignore the fact that in your example they might not be identical
> since you made changes in r5 and r6 to /branches/1.8.x that weren't coming from
> trunk, since I don't think it's important to understanding what you want.
> Let's just pretend those were merges from trunk.

I believe you are correct in saying it shouldn't matter, but those were changes in that branch, independent of the trunk, and later merged into the trunk in r9 via the hidden auto-reintegrate of svn 1.8.

> 
> So what you're trying to solve is that a diff of /branches/1.8.x and /trunk
> show spurious changes?  Why don't you just ask Subversion to compare them
> before doing the export.  The diffs we produce ignore the keywords.

Unfortunately, that's not an option, but I understand it would easily overcome the false positive check for spurious changes.  
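
If the comparison has to run on trees that were already exported, one possible sketch is to collapse expanded keywords back to their unexpanded form before diffing. The function name strip_keywords and the list of keyword names are assumptions on my part; any custom keyword definitions would have to be added to the pattern by hand.

```shell
# Collapse expanded keywords such as "$Rev: 9 $" back to "$Rev$" so that
# two exported files can be compared without keyword noise.
# Reads text on stdin and writes the normalized text to stdout.
strip_keywords() {
  sed -E 's/\$(LastChangedDate|LastChangedRevision|LastChangedBy|HeadURL|Revision|Author|Header|Date|Rev|URL|Id)(:[^$]*)?\$/$\1$/g'
}
```

Piping both copies of a file through strip_keywords before running diff -u should leave only real content differences, at the cost of maintaining the keyword list yourself.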

> 
> For example consider the README file at the root of the Subversion source.
> 
> As you can see with a diff command between 1.7.x and trunk I get a delta on the
> keyword:
> [[[
> $ diff -u svn-1.7.x/README svn-trunk/README
> --- svn-1.7.x/README    2013-07-16 09:54:11.000000000 -0700
> +++ svn-trunk/README    2013-11-01 13:46:21.000000000 -0700
> @@ -2,7 +2,7 @@
>                Subversion, a version control system.
>                =====================================
> 
> -$LastChangedDate: 2011-06-30 10:54:40 -0700 (Thu, 30 Jun 2011) $
> +$LastChangedDate: 2012-02-10 06:58:53 -0800 (Fri, 10 Feb 2012) $
> 
> Contents:
> 
> ]]]
> 
> However if I compare the two with SVN:
> [[[
> $ svn diff https://svn.apache.org/repos/asf/subversion/branches/1.7.x/README
> https://svn.apache.org/repos/asf/subversion/trunk/README
> 
> ]]]
> 
> I get no output because there are no differences.
> 
> If that doesn't work for you then you can export with --ignore-keywords.  And
> do the diff the same way you are now.

This might be the best workaround for us.

> 
> Keep in mind you can use -r options in order to make sure you're comparing
> exactly the same versions as your real export is going to use when you do it.
> 
> You could also use the mergeinfo command with --show-revs to list out what is
> eligible to merge.  I.E.
> 
> [[[
> $ svn mergeinfo --show-revs=eligible ^/subversion/trunk ^/subversion/branches/1.8.x
> ]]]
> 
> Which is going to list of all of the changes that have not been merged
> completely to the branch.  In your case if all of the changes made to trunk
> since 1.8.x was branched, then you should get no output.
> 
> In 1.9.0 mergeinfo should have a --log option which will give you output like
> log would instead of just a list of revision numbers (trunk does but I can't
> promise that it won't be removed or changed before 1.9.0).

This is intriguing but we cannot use 1.9.0 until officially released.

> 
>> So the example above with merging changes back down to the trunk would be an
>> example, and our need would be for during the release process for validating
>> all changes are completely synchronized and that there are no missing changes
>> between branches, but aside from our need, it just doesn't seem right that
>> there would show differences between the exported branches. I'm not suggesting
>> the existing behavior be changed, just give the option to the user to configure
>> the expansion.   
> 
> I don't think your suggested solution to your issue is better than the existing
> solutions.
> 
> Your suggested solution would complicate keywords and make them harder to use
> for their intended purpose.  If a user 

Agreed that it complicates things; the intention was to maintain the existing behavior unmodified while providing a way to augment the expansion.  But the concept of keywords pre-dates merge history metadata, so the original intent for keywords had no visibility into such use cases or information, as it didn't exist.

> turned your proposed feature on as in my
> example then the keywords would not be functional for determining the exact
> code in source control that the user was using.  Using this setting would be
> error prone because a user would have to know their branching model when they
> set it which may change in the future.

Right now it tells you the change precisely, but not accurately if it was the result of a merge: precisely because you know which revision the merge was checked in as, but not accurately because you now have to run through log -g to trace the origin of that merge.  That is all this is trying to do for the user, if they choose to use it for that file.

> 
> So in my opinion I don't think this is a good suggested feature.

Fair enough, and one of the workarounds you previously mentioned may be useful, but in my opinion there is still gap between keywords and merge history even if this specific feature proposal is not a desired solution to close that gap.

Where do we go from here?
Jim

Re: Keyword expansion from merged changes

Posted by Ben Reser <be...@reser.org>.

On 1/3/14, 8:59 AM, James Hanley wrote:
> Can you expand on this - I am missing where the preceding differences would be
> an issue.  From what I can see, if there is a delta, it is either the result of
> direct modification or a merge - if the former, that would be the rev expanding
> the keyword, if the later, the library/client/server would need to trace
> through the most recent svn:mergeinfo in an analogous method to "log -g" until
> it finds a delta without mergeinfo - which should be a a direct modification
> and the delta indicated.

At least in my opinion the entire purpose of the keywords feature is to allow
people to embed data into exports and checkouts that uniquely identify the
files so that the exact version being used can be found in version control.

Your feature subverts that entirely.  You seem to understand that with your
response to my example.

> However, in the simple case of merging back changes:
> 
>  1 - * /trunk - create project
>      |\
>  2 - | * /branches/1.8.x - some change
>  3 - * | /trunk - some more changes
>  4 - * | /trunk - even more changes
>  5 - | * /branches/1.8.x - some more changes
>  6 - | * /branches/1.8.x - even more changes
>  7 - * | /trunk - one more change
>      |\|
>  8 - | * /branches/1.8.x - merge missing changes from trunk
>      |/|
>  9 - * | /trunk - merge missing changes from branch 1.8.x
>      ! !
>      : :
> 
> After r9, in theory, if there was an export done on the trunk and the branch,
> they should be identical, but the keywords (without the concept of merge aware
> keywords) would show a difference on every file that the keywords are used,
> since the branch would expand to show r9 and the trunk would expand to show
> r8.  In the case of a merge aware keyword, the keyword would expand to the last
> non-merged delta - so a file in r5 changed would show r5 on the trunk after the
> merge of r9, and a file changed in r4 would show r4 in the branch after the
> merge of r8 - even though the file may have changed in r6, but chronologically
> the last change was the merge of r6.

I'm going to ignore the fact that in your example they might not be identical
since you made changes in r5 and r6 to /branches/1.8.x that weren't coming from
trunk, since I don't think it's important to understanding what you want.
Let's just pretend those were merges from trunk.

So what you're trying to solve is that a diff of /branches/1.8.x and /trunk
shows spurious changes?  Why don't you just ask Subversion to compare them
before doing the export?  The diffs we produce ignore the keywords.

For example consider the README file at the root of the Subversion source.

As you can see with a diff command between 1.7.x and trunk I get a delta on the
keyword:
[[[
$ diff -u svn-1.7.x/README svn-trunk/README
--- svn-1.7.x/README	2013-07-16 09:54:11.000000000 -0700
+++ svn-trunk/README	2013-11-01 13:46:21.000000000 -0700
@@ -2,7 +2,7 @@
                Subversion, a version control system.
                =====================================

-$LastChangedDate: 2011-06-30 10:54:40 -0700 (Thu, 30 Jun 2011) $
+$LastChangedDate: 2012-02-10 06:58:53 -0800 (Fri, 10 Feb 2012) $

 Contents:

]]]

However if I compare the two with SVN:
[[[
$ svn diff https://svn.apache.org/repos/asf/subversion/branches/1.7.x/README \
    https://svn.apache.org/repos/asf/subversion/trunk/README

]]]

I get no output because there are no differences.

If that doesn't work for you, then you can export with --ignore-keywords and
do the diff the same way you are now.
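
As a rough sketch of that workaround, the function below exports two URLs without keyword expansion and compares the resulting trees. The name compare_exports, the temporary directory layout, and the optional third argument are assumptions made for illustration; only svn export with --ignore-keywords and -r, plus plain diff, are relied on.

```shell
# Export two URLs without keyword expansion and compare the trees.
# Usage: compare_exports URL_A URL_B [REV]
# REV defaults to HEAD; pass the revision of the real export to pin both sides.
# Returns diff's status: 0 when identical, 1 when the trees differ.
compare_exports() {
  tmp=$(mktemp -d) || return 2
  rev=${3:-HEAD}
  svn export -q --ignore-keywords -r "$rev" "$1" "$tmp/a" &&
  svn export -q --ignore-keywords -r "$rev" "$2" "$tmp/b" &&
  diff -ru "$tmp/a" "$tmp/b"
  status=$?
  rm -rf "$tmp"
  return "$status"
}
```

Because both exports skip keyword expansion, any output from diff reflects a difference in content rather than in $Rev$ or $LastChangedDate$ values.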

Keep in mind you can use -r options in order to make sure you're comparing
exactly the same versions as your real export is going to use when you do it.

You could also use the mergeinfo command with --show-revs to list out what is
eligible to merge, e.g.:

[[[
$ svn mergeinfo --show-revs=eligible ^/subversion/trunk ^/subversion/branches/1.8.x
]]]

This lists all of the changes that have not been completely merged
to the branch.  In your case, if all of the changes made to trunk
since 1.8.x was branched have been merged, then you should get no output.
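
A release script could turn that "no output" condition into an exit status. The following is only a sketch: the function name fully_merged and its return codes are hypothetical, and it assumes the mergeinfo command prints nothing when no revisions are eligible, as described above.

```shell
# Succeed only when nothing from SOURCE is still eligible to merge into TARGET.
# Usage: fully_merged SOURCE TARGET
# Returns 0 when fully merged, 1 when revisions remain, 2 when svn itself fails.
fully_merged() {
  eligible=$(svn mergeinfo --show-revs=eligible "$1" "$2") || return 2
  if [ -n "$eligible" ]; then
    printf 'not yet merged from %s:\n%s\n' "$1" "$eligible" >&2
    return 1
  fi
}
```

Running it in both directions (trunk into the branch, and the branch into trunk) before the export would flag missing merges without relying on keyword values at all.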

In 1.9.0 mergeinfo should have a --log option which will give you output like
log would instead of just a list of revision numbers (trunk does but I can't
promise that it won't be removed or changed before 1.9.0).

> So the example above with merging changes back down to the trunk would be an
> example, and our need would be for during the release process for validating
> all changes are completely synchronized and that there are no missing changes
> between branches, but aside from our need, it just doesn't seem right that
> there would show differences between the exported branches. I'm not suggesting
> the existing behavior be changed, just give the option to the user to configure
> the expansion.   

I don't think your suggested solution to your issue is better than the existing
solutions.

Your suggested solution would complicate keywords and make them harder to use
for their intended purpose.  If a user turned your proposed feature on as in my
example then the keywords would not be functional for determining the exact
code in source control that the user was using.  Using this setting would be
error prone because a user would have to know their branching model when they
set it which may change in the future.

So in my opinion I don't think this is a good suggested feature.

Re: Keyword expansion from merged changes

Posted by Branko Čibej <br...@wandisco.com>.

On 03.01.2014 17:59, James Hanley wrote:
> Can you expand on this - I am missing where the preceding differences
> would be an issue.  From what I can see, if there is a delta, it is
> either the result of direct modification or a merge

There is no guarantee that committed changes are the result of either a
merge or a modification. It could be both, or several merges, or any
combination of merges and modifications in any order.

-- Brane

-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. brane@wandisco.com

Re: Keyword expansion from merged changes

Posted by James Hanley <jh...@dgtlrift.com>.

On Fri, Jan 3, 2014 at 12:35 AM, Ben Reser <be...@reser.org> wrote:

> On 1/2/14, 7:16 PM, James Hanley wrote:
> > I've used the Rev keyword in some of our code, and we noticed that there
> may be
> > a use case gap for the Rev/Revision and possibly Id keyword.
> >
> > As expected the keyword gets updated with any checkin, but there may be
> a need
> > to have a merge-history aware version these keywords.  Meaning that the
> Rev
> > shown from a merge isn't the actual checkin of the merge, but the change
> origin
> > of the merge.
> >
> > So the use case would be branching a project P from the trunk to do some
> work,
> > them merging those changes back.  In its current form, the rev takes the
> > revision of the merged checkin rather then the origin of the merged
> change.
> >
> > We see a need for this capability of having a merge-history aware
> keyword, and
> > one way to accomplish this without adding yet another keyword would be
> to have
> > the existing keywords support some kind of argument like options.
>
> The ONLY way to uniquely identify a point in history for a path is with the
> revision for that path (even a timestamp isn't unique since we allow dates
> that
> aren't sequential in the repository, yes this breaks using dates with -r).
>
> You can't identify the point in history a file is from on a path from the
> source of the merge because that doesn't account for any preceding
> differences.
>

Can you expand on this?  I am missing where the preceding differences would
be an issue.  From what I can see, if there is a delta, it is either the
result of direct modification or a merge.  If the former, that would be the
rev expanding the keyword; if the latter, the library/client/server would
need to trace through the most recent svn:mergeinfo, in a method analogous
to "log -g", until it finds a delta without mergeinfo, which should be a
direct modification and the delta indicated.

One corner case may be "undoing" changes with "svn merge -c -X
filewithkeyword.txt ; svn ci" but this should use the original keyword
expansion prior to X - assuming the keyword would be using merge history.

>
> Consider a change to /trunk that gets merged to /branches/1.7.x and
> /branches/1.8.x.  If you set the keywords based on the trunk revision that
> was
> merged then the changed files in 1.7.x and 1.8.x would show the same
> keyword
> info, even though they may be drastically different.
>

I believe I understand the case you speak of.  For my own benefit of
visualizing the case better, suppose we have the change graph:

 1 - * /trunk - create project
     |\
     | \
     |  \
 2 - |   * /branches/1.7.x - create branch
 3 - |   * /branches/1.7.x - some 1.7 specific changes
     |\  |
 4 - | * | /branches/1.8.x - create branch
 5 - | * | /branches/1.8.x - some 1.8 specific changes
 6 - * | | /trunk - some trunk generic changes
 7 - | | * /branches/1.7.x - some more 1.7 specific changes
 8 - | * | /branches/1.8.x - some more 1.8 specific changes
     |\| |
 9 - | * | /branches/1.8.x - merge from trunk
     |\| |
     | \ |
     | |\|
10 - | | * /branches/1.7.x - merge from trunk
     ! ! !
     : : :

In this case, the merge-aware keywords for r9 and r10 would both expand
to r6, even though each branch had its own later changes to those files
after r6 (r8 and r7 respectively).  So it is possible to have very
different contents for each file on each branch.

However, in the simple case of merging back changes:

 1 - * /trunk - create project
     |\
 2 - | * /branches/1.8.x - some change
 3 - * | /trunk - some more changes
 4 - * | /trunk - even more changes
 5 - | * /branches/1.8.x - some more changes
 6 - | * /branches/1.8.x - even more changes
 7 - * | /trunk - one more change
     |\|
 8 - | * /branches/1.8.x - merge missing changes from trunk
     |/|
 9 - * | /trunk - merge missing changes from branch 1.8.x
     ! !
     : :

After r9, in theory, if there was an export done on the trunk and the
branch, they should be identical, but the keywords (without the concept of
merge-aware keywords) would show a difference in every file where the
keywords are used, since the branch would expand to show r8 and the trunk
would expand to show r9.  In the case of a merge-aware keyword, the keyword
would expand to the last non-merged delta, so a file changed in r5 would
show r5 on the trunk after the merge of r9, and a file changed in r4 would
show r4 in the branch after the merge of r8, even though the file may have
changed in r6, but chronologically the last change was the merge of r6.

>
> You say that "you see a need for this capability" and then immediately
> launch
> into ideas for how to implement it.  I'd focus on explaining why you see
> the
> need for the capability before we get to any discussion about how to
> implement it.
>

So the example above, with merging changes back down to the trunk, would be
one example.  Our need arises during the release process, when validating
that all changes are completely synchronized and that there are no
missing changes between branches.  But aside from our need, it just doesn't
seem right that the exported branches would show differences.
I'm not suggesting the existing behavior be changed, just that the user be
given the option to configure the expansion.

Thanks much,
-Jim

Re: Keyword expansion from merged changes

Posted by Ben Reser <be...@reser.org>.

On 1/2/14, 7:16 PM, James Hanley wrote:
> I've used the Rev keyword in some of our code, and we noticed that there may be
> a use case gap for the Rev/Revision and possibly Id keyword.
> 
> As expected the keyword gets updated with any checkin, but there may be a need
> to have a merge-history aware version these keywords.  Meaning that the Rev
> shown from a merge isn't the actual checkin of the merge, but the change origin
> of the merge.
> 
> So the use case would be branching a project P from the trunk to do some work,
> them merging those changes back.  In its current form, the rev takes the
> revision of the merged checkin rather then the origin of the merged change.
> 
> We see a need for this capability of having a merge-history aware keyword, and
> one way to accomplish this without adding yet another keyword would be to have
> the existing keywords support some kind of argument like options.

The ONLY way to uniquely identify a point in history for a path is with the
revision for that path (even a timestamp isn't unique since we allow dates that
aren't sequential in the repository, yes this breaks using dates with -r).

You can't identify the point in history a file is from on a path from the
source of the merge because that doesn't account for any preceding differences.

Consider a change to /trunk that gets merged to /branches/1.7.x and
/branches/1.8.x.  If you set the keywords based on the trunk revision that was
merged then the changed files in 1.7.x and 1.8.x would show the same keyword
info, even though they may be drastically different.

You say that "you see a need for this capability" and then immediately launch
into ideas for how to implement it.  I'd focus on explaining why you see the
need for the capability before we get to any discussion about how to implement it.