Posted to dev@subversion.apache.org by Ben Reser <be...@reser.org> on 2013/08/30 00:20:10 UTC

Improving CHANGES (or at least making it easier to produce)

Right now we produce the CHANGES file by having someone go through the log,
look at the individual commits, and come up with the entries for CHANGES.
It's an after-the-fact process.

The problem with this is that it's not always obvious from commit messages what
the user impact is.  I could probably find some examples but I'm not going to
bother to pick on anyone in particular.  Ultimately, our commit messages are
for developers and the CHANGES entries are for users.  There's a wide gap
sometimes between what goes where.

So I'd like to suggest that we start including a Changes field in the STATUS
file entries.  I haven't exactly worked out the details so nobody needs to rush
out right now and start doing it immediately.

Since the people proposing the backport and the people voting for it usually
have the best idea of the impact, this should improve the quality of our
CHANGES file.

If a STATUS entry doesn't require a CHANGES entry (e.g. an improvement to an
already merged change that wasn't released yet) then we can just omit this
line.  I can then simply search through the commit logs (since backport.pl
includes the STATUS entries in the commit log when it commits) and find all the
CHANGES entries.

It'll still take some editing for consistency and style probably.  But it'll be
a lot better in my humble opinion.

This of course does nothing to help produce the CHANGES file for a 1.x.0
release, because there are tons of changes going into trunk that never get
backported.  A huge thing that can help there is to start describing
why a user, and not just a developer, would care about the commit.  This is
something that I think we all can put a little bit more effort into on our
trunk commits, and it'll help us when we produce 1.9.0.

Re: Improving CHANGES (or at least making it easier to produce)

Posted by Johan Corveleyn <jc...@gmail.com>.
On Fri, Aug 30, 2013 at 12:26 PM, Stefan Sperling <st...@elego.de> wrote:
> On Fri, Aug 30, 2013 at 11:48:56AM +0200, Johan Corveleyn wrote:
>> At my workplace, we have a convention (enforced by pre-commit hook) to
>> use a prefix between square brackets ([U] for the user-facing text,
>> [S9] for the developer details (our team is called the "system9" team)
>> -> which also get extracted to another text file for an overview of
>> all the dev-messages of a single release). Here we could use something
>> similar to the contribulyzer syntax (for instance "Change: blablabla
>> (issue #1234)").
>
> I like this. No need for pre-commit enforcements, of course.
>
> We often don't know in advance which changes will be backported
> to a release. I guess we would have to delay updates to CHANGES
> until the moment tarballs are rolled. And instead of updating
> CHANGES manually we'd simply add or change 'Change:' annotations
> in log messages, and the release.py script would update CHANGES
> as necessary before tagging a release.

Why do it only for changes that will be backported? If you try to do
it immediately for every relevant change, you don't necessarily have
to come to it later, and both 1.8.x CHANGES and 1.9.0 CHANGES can be
extracted at any point.

Of course, there is always the difficulty of determining whether a
commit is relevant (and with which granularity you should document
your changes), but that problem is always there. Maybe it's a little
bit easier to answer the relevance question (and the concrete
phrasing) with some retrospect, but perhaps 90% can be done on the
spot, when you make the commit, and the other 10% can be filled in
later by editing the log message.

-- 
Johan

Re: Improving CHANGES (or at least making it easier to produce)

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Aug 30, 2013 at 11:48:56AM +0200, Johan Corveleyn wrote:
> At my workplace, we have a convention (enforced by pre-commit hook) to
> use a prefix between square brackets ([U] for the user-facing text,
> [S9] for the developer details (our team is called the "system9" team)
> -> which also get extracted to another text file for an overview of
> all the dev-messages of a single release). Here we could use something
> similar to the contribulyzer syntax (for instance "Change: blablabla
> (issue #1234)").

I like this. No need for pre-commit enforcements, of course.

We often don't know in advance which changes will be backported
to a release. I guess we would have to delay updates to CHANGES
until the moment tarballs are rolled. And instead of updating
CHANGES manually we'd simply add or change 'Change:' annotations
in log messages, and the release.py script would update CHANGES
as necessary before tagging a release.

Re: Improving CHANGES (or at least making it easier to produce)

Posted by Johan Corveleyn <jc...@gmail.com>.
On Fri, Aug 30, 2013 at 1:01 PM, Branko Čibej <br...@wandisco.com> wrote:
> On 30.08.2013 11:48, Johan Corveleyn wrote:
>> On Fri, Aug 30, 2013 at 12:20 AM, Ben Reser <be...@reser.org> wrote:
>>> Right now we produce the CHANGES file by someone going through the log and
>>> looking at the individual commits and coming up with the entries for CHANGES.
>>> It's an after the fact process.
>>>
>>> The problem with this is that it's not always obvious from commit messages what
>>> the user impact is.  I could probably find some examples but I'm not going to
>>> bother to pick on anyone in particular.  Ultimately, our commit messages are
>>> for developers and the CHANGES entries are for users.  There's a wide gap
>>> sometimes between what goes where.
>>>
>>> So I'd like to suggest that we start including a Changes field in the STATUS
>>> file entries.  I haven't exactly worked out the details so nobody needs to rush
>>> out right now and start doing it immediately.
>>>
>>> Since the people proposing the backport and the people voting for it usually
>>> have the best idea of the impact it should improve the quality of our CHANGES file.
>>>
>>> If a STATUS entry doesn't require a CHANGES entry (e.g. improvement to an
>>> already merged change that wasn't released yet) then we can just omit this
>>> line.  I can then simply search through the commit logs (since the backport.pl
>>> includes the STATUS entries in the commit log when it commits) and find all the
>>> CHANGES entries.
>>>
>>> It'll still take some editing for consistency and style probably.  But it'll be
>>> a lot better in my humble opinion.
>>>
>>> This of course does nothing to help producing the CHANGES file for a 1.x.0
>>> release, because there are tons of changes going on trunk that do not ever get
>>> backported.  A huge thing that can help there is to start trying to describe
>>> why a user would care about the commit and not just a developer.  This is
>>> something that I think we all can put a little bit more effort into on our
>>> trunk commits that'll help us when we produce 1.9.0.
>> Here is an alternative approach, that addresses both problems (CHANGES
>> for backports, and CHANGES for next big release), and it's the way we
>> currently do this at my workplace:
>>
>> Put the information for end-users directly in the commit message which
>> makes the change (or one of the commit messages if there is a whole
>> series of related commits), encoded in some parseable way. That way a
>> release tool can extract this information and construct a draft of
>> CHANGES (which can then still get final edits).
>>
>> This avoids duplication of information over both STATUS and other
>> places (the description of the change is usually the same for trunk
>> (next 1.x.0 to be) as for backports). At the cost of the developer who
>> commits the change having to think, at that point, about how one would
>> phrase this for end-users (but this is probably the best time (and the
>> best person) to think about this -- and if need be, it can always be
>> added after the fact to the log message). As an added bonus, the
>> user-facing message is then also immediately there in the log message,
>> which can be handy for devs looking through a file's history.
>>
>> At my workplace, we have a convention (enforced by pre-commit hook) to
>> use a prefix between square brackets ([U] for the user-facing text,
>> [S9] for the developer details (our team is called the "system9" team)
>> -> which also get extracted to another text file for an overview of
>> all the dev-messages of a single release). Here we could use something
>> similar to the contribulyzer syntax (for instance "Change: blablabla
>> (issue #1234)").
>>
>> Any revision numbers that are related to such a "change" can maybe be
>> extracted automatically (for inclusion in the CHANGES, if there is no
>> issue number mentioned). "Change" entries which are identical (same
>> Change in 10 different commits) would of course be folded into one
>> line in CHANGES.
>
> I'm a bit confused by the idea that you'd require a log message to
> describe a change twice. Doesn't make much sense to me at all.

It would not be a requirement, just optional (just like we don't
require every commit to have a corresponding entry in CHANGES).

Re. the describing twice: that's already the case currently, except we
describe them twice in two different places (once in the log message,
and once in CHANGES), and often with much time in between and by
different people (which makes it more fragile IMO).

On top of the message for developers ("describe what changed in the
code, and why"), you get an additional line of comment targeted
towards users (which can be formulated quite differently).

> A log message should describe what changed in the code. Automatically
> generating release notes and/or CHANGES from log messages is, in my
> experience, quite impractical. A better approach would be to require the
> CHANGES file to be updated in the same commit as the actual relevant
> change. But even that's not realistic, because often such a change will
> be split across several commits -- or, for example, developed on a branch.

Wouldn't that be almost the same as my suggestion, just that you split it
across two places (log message and CHANGES file)? I don't think
that's better than putting it right there in the log message, where it
can be extracted by a tool into any CHANGES file that has this commit
backported.

In fact, when you put it in the log messages, you might not even have
to keep a CHANGES file in version control (except perhaps for the
manual post-processing -- though perhaps you could also put those
edits directly in the "sources", i.e. the log message).

Anyway, it's just an idea. It's not perfect, but I've seen it work
quite well in practice.
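
A minimal sketch of what such an extraction tool could look like, assuming a
"Change: text (issue #NNNN)" annotation on its own line in the log message.
The annotation format, the function names and the draft layout are all
hypothetical; nothing like this exists in release.py today. Identical entries
are folded into one line, and revision numbers are appended only when no issue
number is mentioned.

```python
import re

# Hypothetical annotation: a line of the form "Change: <text>",
# optionally ending in "(issue #NNNN)".
CHANGE_RE = re.compile(r"^Change:[ \t]*(?P<text>\S.*?)[ \t]*$", re.MULTILINE)


def draft_changes(log_messages):
    """Map each distinct Change text to the revisions that carry it.

    log_messages is an iterable of (revision_number, message) pairs.
    Identical texts from several commits are folded into one entry.
    """
    folded = {}
    for rev, message in log_messages:
        for match in CHANGE_RE.finditer(message):
            folded.setdefault(match.group("text"), []).append(rev)
    return folded


def format_draft(folded):
    """Render draft lines; add revision numbers when no issue is mentioned."""
    lines = []
    for text, revs in folded.items():
        if "issue #" in text:
            lines.append(f"  * {text}")
        else:
            rev_list = ", ".join(f"r{r}" for r in revs)
            lines.append(f"  * {text} ({rev_list})")
    return lines
```

The output is only a draft; it would still need the final manual edits
mentioned above.
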
-- 
Johan

Re: Improving CHANGES (or at least making it easier to produce)

Posted by Ben Reser <be...@reser.org>.
On Fri Aug 30 14:45:24 2013, Branko Čibej wrote:
> Right. The question then is, how do we migrate the 1.8 backport CHANGES
> to trunk? Or do you expect this to be a manual step during the creation
> of the release branch?

I was expecting to do something like this:

1. svn-role commits the changes with the STATUS entries in the commit
   message (as it already does).
2. Add a command to release.py that filters out the Changes lines since
   the previous minor release (it could also list revisions that change
   files other than just STATUS but are missing the line, so these can
   be reviewed).
3. The RM takes that output, edits it, and puts it on trunk.
4. The RM merges the trunk changes to the branch.

Essentially I want to shift most of the effort in writing the CHANGES 
entries for minor releases to the people proposing the backports.  
There still needs to be some editing for style and other reasons.  But 
I'd suspect this will reduce the time spent producing this 
significantly since the RM doesn't end up having to try to understand 
the details of every change in order to write a CHANGES entry.
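
The filtering step could be sketched roughly as follows. This assumes the
backport commit message carries the STATUS entry with a "Changes:" field on its
own line; the field name, the tuple layout and the function name are
assumptions for illustration, not an existing release.py command.

```python
def split_backport_commits(commits, field="Changes:"):
    """Separate backport commits into CHANGES drafts and review candidates.

    commits is an iterable of (revision, changed_paths, log_message) tuples.
    Returns (entries, needs_review): entries holds (revision, text) pairs
    for the RM to edit; needs_review holds revisions that touch files other
    than STATUS but carry no Changes line.
    """
    entries = []
    needs_review = []
    for rev, paths, message in commits:
        texts = [
            line.strip()[len(field):].strip()
            for line in message.splitlines()
            if line.strip().startswith(field)
        ]
        if texts:
            entries.extend((rev, text) for text in texts)
        elif any(not path.endswith("/STATUS") for path in paths):
            # Code changed but nobody wrote a Changes line: flag it.
            needs_review.append(rev)
    return entries, needs_review
```

Commits that only touch STATUS (votes, nominations) are skipped silently, since
they cannot need a CHANGES entry.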

Re: Improving CHANGES (or at least making it easier to produce)

Posted by Branko Čibej <br...@wandisco.com>.
On 30.08.2013 23:38, Ben Reser wrote:
> On 8/30/13 4:01 AM, Branko Čibej wrote:
>> A log message should describe what changed in the code. Automatically
>> generating release notes and/or CHANGES from log messages is, in my
>> experience, quite impractical. A better approach would be to require the
>> CHANGES file to be updated in the same commit as the actual relevant
>> change. But even that's not realistic, because often such a change will
>> be split across several commits -- or, for example, developed on a branch.
> I'm going to respond to Branko's email because he brings up some important
> points, but in general I'm replying to everyone.
>
> I agree completely automated generation is not going to happen.  My
> motivation is to just make the job easier.  It can take a lot of time to figure
> out what to say to users when producing CHANGES.  This is a much less ambitious
> goal than full automation.
>
> Making commits to CHANGES alongside your other changes is a bad idea.  Let me
> explain why:
>
> 1) Conflicts.  CHANGES on trunk is drastically different than CHANGES on the
> branches.  So it'll increase the hoop jumping we have to do to avoid conflicts
> when backporting changes.
>
> 2) Backporting.  We are never really sure what we're going to backport.  1.9.x
> should not mention a change that was included in say 1.8.6.  It's not a change
> from the user's perspective.  So it's entirely unclear where you should add
> your data to the CHANGES file.  Already the fact that we start putting 1.9.x
> CHANGES entries into trunk messes up release.py's attempt to detect unmerged
> CHANGES.  I've changed that to a warning.  I haven't objected to this practice
> because I think it makes sense to put things we know we'll never backport in
> CHANGES.  But we also can't just start doing it all the time.
>
> We also can't require changes entries be attached to commits since sometimes
> we're committing fixes to things that were never released in a broken state.  So
> there is no effective change as far as a user is concerned.
>
> If you look at the 411 Content-Length issue I think any attempt to put the
> CHANGES entry in the log files would have been a mess.
>
> Putting details from the user perspective in the commit message is still
> helpful in that case, since the person nominating a change may not be the
> person who wrote the change.  Also sometimes what we thought the user impact
> was at commit time is incomplete.  We find many times where a change made for
> one reason fixes something else and we decide to backport it for that reason.
>
> So I felt that STATUS was a reasonable compromise for now.

Right. The question then is, how do we migrate the 1.8 backport CHANGES
to trunk? Or do you expect this to be a manual step during the creation
of the release branch?

-- Brane

-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. brane@wandisco.com

Re: Improving CHANGES (or at least making it easier to produce)

Posted by Johan Corveleyn <jc...@gmail.com>.
On Wed, Sep 11, 2013 at 9:38 PM, Ben Reser <be...@reser.org> wrote:
> On 9/1/13 2:42 AM, Johan Corveleyn wrote:
>> That's not really an argument, because that's true whether you do it
>> at commit time or afterwards. You don't save any time (quite the
>> contrary) by postponing it until CHANGES needs to be produced, or
>> until the revision needs to be backported. Except perhaps by having
>> better understanding of which changes matter to users and why, but I'd
>> say that's true for perhaps 5% of CHANGES.
>>
>> Unless you're saying you don't want to put this burden on the
>> committer (right at the point of committing), because that would
>> hinder commits? Then I'd argue that this is just a habit, and once you
>> get used to it, it doesn't really slow you down much (just like we are
>> all used to writing good, concise, high-information-density log
>> messages for our fellow developers). And for those really hard ones,
>> you can always come back to the log message later, and massage the
>> user-facing text (or add it if it's not there).
>>
>> So that would leave the argument of "that's not what we are used to
>> do", and I can accept that (habits matter, and can be hard to change).
>> It's an important argument, but not a technical one.
>>
>> So far, I haven't read a really rational argument against my
>> suggestion. "That's not what log messages are for"? Why not?
>
> I think I already replied giving those reasons which you reply to further on down.
>
> It essentially comes down to whatever we put in commit messages on trunk won't
> necessarily be accurate when we go to produce a new minor release (as in
> major.minor.patch)
>
>> Fair enough. I understand you're currently only trying to solve the
>> CHANGES-for-backports problem for the RM, and that's okay.
>>
>> I just want to put this in the larger context of producing user-facing
>> messages for our changes for *all* our releases, minor or major. Think
>> back to all the time spent by several developers while preparing for
>> the 1.8.0 release, sifting through all the log messages since the last
>> release (in batches of 500 commits), trying to extract user-facing
>> information [1]. That's a huge waste of valuable developer-time, IMO.
>
> If there really is a better solution I'm open to it, I'm just not convinced
> that there is a good solution for trunk -> minor release.
>
> There are so many problems with automating it that I think it'll take a lot of
> time to sort them all out and for all practical purposes minor releases are
> relatively rare.
>
> If we start doing them more often then a lot of the problems with automating it
> go away because there's less likelihood that the Changes: line in the commit is
> out of date by the time we do the minor release.
>
>> I agree, making concurrent commits to CHANGES is a bad idea. Much
>> better to put it in the log message :-).
>
> Realize I'm not opposed to putting better information in our commit messages to
> help produce the CHANGES later, I just don't think we can automate the
> production in a very useful way.
>
> I do think we can put a Change: line in, but I still think someone is going to
> have to review all of them for reasons I'll get into below.
>
>> Ah, but that can be solved perfectly automatically (we do this at my
>> workplace too): it's the perfect use case for 'svn mergeinfo
>> --show-revs eligible'.
>>
>> When you're about to produce CHANGES for 1.9.0, and 1.8.6 is currently
>> the latest 1.8 release, you'll want to process all the revisions
>> produced by
>>
>>     svn mergeinfo --show-revs eligible ^/branches/1.9.x ^/tags/1.8.6
>
> This doesn't completely work, sometimes we make changes that only apply to a
> given branch due to trunk having a completely different fix that can't be
> backported.

Actually, it works perfectly. Note that I requested the eligibles
between the "to-be-released-branch" and the
"previously-released-tag-of-which-we-don't-want-to-repeat-changes".

If you apply rN to ^/branches/1.9.x, that revision will show up in the
above query.

I don't have our script handy, but with some simple operations, it
works fine for collecting the commit messages of:
- revisions applied directly to the new branch
- revisions backported to the branch (and not to the previous tag)
- revisions from the natural history of the branch (and not part of
the previous tag)
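
As a rough illustration of the first step only (this is not the workplace
script mentioned above), the revision list could be collected like this. The
helper names are hypothetical; the svn invocation is the one quoted earlier,
and the '^/' shorthand only resolves when run from inside a working copy.

```python
import subprocess


def parse_eligible(output):
    """Turn 'svn mergeinfo --show-revs eligible' output into revision numbers."""
    revs = []
    for line in output.splitlines():
        # svn may mark a partially merged revision with a trailing '*'.
        token = line.strip().rstrip("*")
        if token.startswith("r") and token[1:].isdigit():
            revs.append(int(token[1:]))
    return revs


def eligible_revisions(source, target, svn="svn"):
    """Run svn and return revisions on source that are not merged to target."""
    result = subprocess.run(
        [svn, "mergeinfo", "--show-revs", "eligible", source, target],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_eligible(result.stdout)


# Example call (needs a working copy and network access, so not run here):
# eligible_revisions("^/branches/1.9.x", "^/tags/1.8.6")
```

Each returned revision would then be fed to 'svn log' to collect its commit
message.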

>> Okay, but usually the developer who makes such a change knows he's
>> fixing something (on trunk or whatever) that's never been in the hands
>> of users. In that case, of course he doesn't write such a user-facing
>> note in the log message. The entire idea hinges on the fact that a
>> developer usually knows what his change means to users.
>
> My concern here is that people will forget to put them in when they should.  If
> we automate it then there's no chance to catch this.

Okay, but then I say: this can be caught by regular post-commit
review. Similar to how we catch other omissions to the log message.

Occasionally something will be missed, but with good post-commit
review this will be rare. With the current process, with one single
person scanning hundreds of commit messages, there will also be
occasional mistakes / omissions.

>> Also, when working with feature branches, it's clear that commits to
>> the branch are usually not interesting to users. In this case you
>> usually end up with one summary "change", when the branch gets
>> reintegrated (e.g. "Change: add new FSX repository back-end
>> (experimental)", as a result of integrating the fsx branch).
>>
>>> If you look at the 411 Content-Length issue I think any attempt to put the
>>> CHANGES entry in the log files would have been a mess.
>>
>> You'll have to refresh my memory here. But there can be exceptions of
>> course, where it's better to wait a bit for the dust to settle on a
>> particular issue, and then come up with what it means to users. But
>> those are exceptions.
>
> We went back and forth between several different implementations of the fix.
> With entirely different default behavior.
>
> The other issue here is how do you remove a CHANGES entry that's no longer
> important?  The only way I can see if you put it in commit logs is to go back
> and edit the commit log to remove the Changes: line.  But I'm not convinced
> that we will actually do that.

Hm, you may have a point there. With the weekly release cycle at our
company such situations are quite rare. Still, sometimes we end up
with conflicting or overlapping notes (not often, but it happens).

Like:
- (issue #1234) Change color of SomeForm to have more contrast.
- (issue #1234) Rollback color change to SomeForm. Testers pointed out
potential problems for color-blind people.

Or:
- (issue #1234) Change color of SomeForm to yellow.
- (issue #1234) Change color of SomeForm to green.
- (issue #1234) Make color of SomeForm user-configurable.
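
A tool cannot decide which of these notes should survive, but it could at least
flag them for a human. A minimal sketch, assuming every note carries its issue
number in the "(issue #NNNN)" form used above; the function name is
hypothetical.

```python
import re
from collections import defaultdict

ISSUE_RE = re.compile(r"\(issue #(\d+)\)")


def overlapping_notes(notes):
    """Return {issue_number: [notes]} for issues mentioned by several notes.

    These are the entries most likely to be rollbacks or successive
    refinements of the same change, so they deserve a manual look.
    """
    by_issue = defaultdict(list)
    for note in notes:
        for issue in ISSUE_RE.findall(note):
            by_issue[int(issue)].append(note)
    return {issue: texts for issue, texts in by_issue.items() if len(texts) > 1}
```

Notes without an issue number are not covered by this check and would still
have to be caught by review.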


>> Also, most commits won't end up with a "Change:" note, just like not
>> every commit has a "Review by:" note. Only when you have something
>> interesting to say to users.
>
> Sure I understand that.  But that's exactly the problem that complicates this.
>  Those attribution tags get missed all the time as well.  The difference though
> is that the person who should have been attributed can say something or fix it
> themselves.
>
> If it was something we needed on every commit it's much easier to review for
> that.  But when it may be appropriate to omit it then it's unlikely someone is
> going to point out it is missing because then they'd have to go through the
> process of figuring out if it was necessary.
>
> Maybe the alternative is to require some sort of Change line in every commit
> and to specify why one isn't necessary somehow when that's the case.  But I
> suspect people will object to that because it's just more tedious work to put
> into a commit.

Indeed, it's not intended to be added to every commit, that would be
tedious. Like I said, I think post-commit review would have to ensure
that no things are missed.

But I agree that "evolving changes" and "changes being rolled back",
without going back to edit the outdated "change lines", could be a
significant problem (especially with long release cycles). So I won't
push this any further :-), and leave it at that.

-- 
Johan

Re: Improving CHANGES (or at least making it easier to produce)

Posted by Ben Reser <be...@reser.org>.
On 9/1/13 2:42 AM, Johan Corveleyn wrote:
> That's not really an argument, because that's true whether you do it
> at commit time or afterwards. You don't save any time (quite the
> contrary) by postponing it until CHANGES needs to be produced, or
> until the revision needs to be backported. Except perhaps by having
> better understanding of which changes matter to users and why, but I'd
> say that's true for perhaps 5% of CHANGES.
> 
> Unless you're saying you don't want to put this burden on the
> committer (right at the point of committing), because that would
> hinder commits? Then I'd argue that this is just a habit, and once you
> get used to it, it doesn't really slow you down much (just like we are
> all used to writing good, concise, high-information-density log
> messages for our fellow developers). And for those really hard ones,
> you can always come back to the log message later, and massage the
> user-facing text (or add it if it's not there).
> 
> So that would leave the argument of "that's not what we are used to
> do", and I can accept that (habits matter, and can be hard to change).
> It's an important argument, but not a technical one.
> 
> So far, I haven't read a really rational argument against my
> suggestion. "That's not what log messages are for"? Why not?

I think I already gave those reasons in my earlier reply; you respond to them
further down.

It essentially comes down to this: whatever we put in commit messages on trunk
won't necessarily be accurate when we go to produce a new minor release (as in
major.minor.patch).

> Fair enough. I understand you're currently only trying to solve the
> CHANGES-for-backports problem for the RM, and that's okay.
> 
> I just want to put this in the larger context of producing user-facing
> messages for our changes for *all* our releases, minor or major. Think
> back to all the time spent by several developers while preparing for
> the 1.8.0 release, sifting through all the log messages since the last
> release (in batches of 500 commits), trying to extract user-facing
> information [1]. That's a huge waste of valuable developer-time, IMO.

If there really is a better solution I'm open to it, I'm just not convinced
that there is a good solution for trunk -> minor release.

There are so many problems with automating it that I think it'll take a lot of
time to sort them all out, and for all practical purposes minor releases are
relatively rare.

If we start doing them more often then a lot of the problems with automating it
go away because there's less likelihood that the Changes: line in the commit is
out of date by the time we do the minor release.

> I agree, making concurrent commits to CHANGES is a bad idea. Much
> better to put it in the log message :-).

Realize I'm not opposed to putting better information in our commit messages to
help produce the CHANGES later, I just don't think we can automate the
production in a very useful way.

I do think we can put a Change: line in, but I still think someone is going to
have to review all of them for reasons I'll get into below.

> Ah, but that can be solved perfectly automatically (we do this at my
> workplace too): it's the perfect use case for 'svn mergeinfo
> --show-revs eligible'.
> 
> When you're about to produce CHANGES for 1.9.0, and 1.8.6 is currently
> the latest 1.8 release, you'll want to process all the revisions
> produced by
> 
>     svn mergeinfo --show-revs eligible ^/branches/1.9.x ^/tags/1.8.6

This doesn't completely work: sometimes we make changes that only apply to a
given branch because trunk has a completely different fix that can't be
backported.

> Okay, but usually the developer who makes such a change knows he's
> fixing something (on trunk or whatever) that's never been in the hands
> of users. In that case, of course he doesn't write such a user-facing
> note in the log message. The entire idea hinges on the fact that a
> developer usually knows what his change means to users.

My concern here is that people will forget to put them in when they should.  If
we automate it then there's no chance to catch this.

> Also, when working with feature branches, it's clear that commits to
> the branch are usually not interesting to users. In this case you
> usually end up with one summary "change", when the branch gets
> reintegrated (e.g. "Change: add new FSX repository back-end
> (experimental)", as a result of integrating the fsx branch).
> 
>> If you look at the 411 Content-Length issue I think any attempt to put the
>> CHANGES entry in the log files would have been a mess.
> 
> You'll have to refresh my memory here. But there can be exceptions of
> course, where it's better to wait a bit for the dust to settle on a
> particular issue, and then come up with what it means to users. But
> those are exceptions.

We went back and forth between several different implementations of the fix,
with entirely different default behavior.

The other issue here is how you remove a CHANGES entry that's no longer
important.  The only way I can see, if you put it in commit logs, is to go back
and edit the commit log to remove the Changes: line.  But I'm not convinced
that we will actually do that.

> Also, most commits won't end up with a "Change:" note, just like not
> every commit has a "Review by:" note. Only when you have something
> interesting to say to users.

Sure, I understand that.  But that's exactly the problem that complicates this.
Those attribution tags get missed all the time as well.  The difference though
is that the person who should have been attributed can say something or fix it
themselves.

If it were something we needed on every commit, it would be much easier to review
for that.  But when it may be appropriate to omit it, it's unlikely someone is
going to point out that it is missing, because then they'd have to go through the
process of figuring out if it was necessary.

Maybe the alternative is to require some sort of Change line in every commit
and to specify why one isn't necessary somehow when that's the case.  But I
suspect people will object to that because it's just more tedious work to put
into a commit.

Re: Improving CHANGES (or at least making it easier to produce)

Posted by Johan Corveleyn <jc...@gmail.com>.
On Fri, Aug 30, 2013 at 11:38 PM, Ben Reser <be...@reser.org> wrote:
> On 8/30/13 4:01 AM, Branko Čibej wrote:
>> A log message should describe what changed in the code. Automatically
>> generating release notes and/or CHANGES from log messages is, in my
>> experience, quite impractical. A better approach would be to require the
>> CHANGES file to be updated in the same commit as the actual relevant
>> change. But even that's not realistic, because often such a change will
>> be split across several commits -- or, for example, developed on a branch.
>
> I'm going to respond to Branko's email because he brings up some important
> points, but in general I'm replying to everyone.
>
> I agree completely automated generation is not going to happen.  My
> motivation is to just make the job easier.  It can take a lot of time to figure
> out what to say to users when producing CHANGES.

That's not really an argument, because that's true whether you do it
at commit time or afterwards. You don't save any time (quite the
contrary) by postponing it until CHANGES needs to be produced, or
until the revision needs to be backported. Except perhaps by having
better understanding of which changes matter to users and why, but I'd
say that's true for perhaps 5% of CHANGES.

Unless you're saying you don't want to put this burden on the
committer (right at the point of committing), because that would
hinder commits? Then I'd argue that this is just a habit, and once you
get used to it, it doesn't really slow you down much (just like we are
all used to writing good, concise, high-information-density log
messages for our fellow developers). And for those really hard ones,
you can always come back to the log message later, and massage the
user-facing text (or add it if it's not there).

So that would leave the argument of "that's not what we are used to
do", and I can accept that (habits matter, and can be hard to change).
It's an important argument, but not a technical one.

So far, I haven't read a really rational argument against my
suggestion. "That's not what log messages are for"? Why not?

> This is a much less ambitious
> goal than full automation.

Fair enough. I understand you're currently only trying to solve the
CHANGES-for-backports problem for the RM, and that's okay.

I just want to put this in the larger context of producing user-facing
messages for our changes for *all* our releases, minor or major. Think
back to all the time spent by several developers while preparing for
the 1.8.0 release, sifting through all the log messages since the last
release (in batches of 500 commits), trying to extract user-facing
information [1]. That's a huge waste of valuable developer-time, IMO.

> Making commits to CHANGES alongside your other changes is a bad idea.  Let me
> explain why:
>
> 1) Conflicts.  CHANGES on trunk is drastically different than CHANGES on the
> branches.  So it'll increase the hoop jumping we have to do to avoid conflicts
> when backporting changes.

I agree, making concurrent commits to CHANGES is a bad idea. Much
better to put it in the log message :-).

> 2) Backporting.  We are never really sure what we're going to backport.  1.9.x
> should not mention a change that was included in say 1.8.6.  It's not a change
> from the user's perspective.  So it's entirely unclear where you should add
> your data to the CHANGES file.

Ah, but that can be solved perfectly automatically (we do this at my
workplace too): it's the perfect use case for 'svn mergeinfo
--show-revs eligible'.

When you're about to produce CHANGES for 1.9.0, and 1.8.6 is currently
the latest 1.8 release, you'll want to process all the revisions
produced by

    svn mergeinfo --show-revs eligible ^/branches/1.9.x ^/tags/1.8.6
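
A release script could consume that output roughly as follows. This is
only a sketch: the helper name parse_eligible_revs is made up for
illustration, and it assumes the command prints one revision per line in
the form "r1234", with an optional trailing "*" on revisions that are only
partially merged.

```python
import re

# Hypothetical helper, not part of release.py or any existing tool.
# Assumed line format: "r1234", optionally followed by "*".
_REV_LINE = re.compile(r"^r(\d+)(\*?)$")


def parse_eligible_revs(output):
    """Return a list of (revision, partially_merged) tuples."""
    revs = []
    for line in output.splitlines():
        match = _REV_LINE.match(line.strip())
        if match:
            # Lines that don't look like a revision are silently skipped.
            revs.append((int(match.group(1)), match.group(2) == "*"))
    return revs


if __name__ == "__main__":
    sample = "r1500001\nr1500042*\n\nr1500100\n"
    for rev, partial in parse_eligible_revs(sample):
        print(rev, "(partial)" if partial else "")
```

The resulting revision list is what you would then feed to 'svn log' to
collect the log messages for the CHANGES draft.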

> Already the fact that we start putting 1.9.x
> CHANGES entries into trunk messes up release.py's attempt to detect unmerged
> CHANGES.  I've changed that to a warning.  I haven't objected to this practice
> because I think it makes sense to put things we know we'll never backport in
> CHANGES.  But we also can't just start doing it all the time.
>
> We also can't require CHANGES entries be attached to commits since sometimes
> we're committing fixes to things that were never released in a broken state.  So
> there is no effective change as far as a user is concerned.

Okay, but usually the developer who makes such a change knows he's
fixing something (on trunk or whatever) that's never been in the hands
of users. In that case, of course he doesn't write such a user-facing
note in the log message. The entire idea hinges on the fact that a
developer usually knows what his change means to users.

Also, when working with feature branches, it's clear that commits to
the branch are usually not interesting to users. In this case you
usually end up with one summary "change", when the branch gets
reintegrated (e.g. "Change: add new FSX repository back-end
(experimental)", as a result of integrating the fsx branch).

> If you look at the 411 Content-Length issue I think any attempt to put the
> CHANGES entry in the log files would have been a mess.

You'll have to refresh my memory here. But there can be exceptions of
course, where it's better to wait a bit for the dust to settle on a
particular issue, and then come up with what it means to users. But
those are exceptions.

Also, most commits won't end up with a "Change:" note, just like not
every commit has a "Review by:" note. Only when you have something
interesting to say to users.

> Putting details from the user perspective in the commit message is still
> helpful in that case, since the person nominating a change may not be the
> person who wrote the change.  Also sometimes what we thought the user impact
> was at commit time is incomplete.  We find many times where a change made for
> one reason fixes something else and we decide to backport it for that reason.

Yes, in that case the log message can be amended after the fact. This
can happen both for changes that need to be backported, and for
changes that are just slated for the next major release.

> So I felt that STATUS was a reasonable compromise for now.

Okay, and it's a very reasonable solution if you're only focused on
solving the CHANGES-for-backports problem.

[1] http://wiki.apache.org/subversion/Svn18Changes

-- 
Johan

Re: Improving CHANGES (or at least making it easier to produce)

Posted by Ben Reser <be...@reser.org>.
On 8/30/13 4:01 AM, Branko Čibej wrote:
> A log message should describe what changed in the code. Automatically
> generating release notes and/or CHANGES from log messages is, in my
> experience, quite impractical. A better approach would be to require the
> CHANGES file to be updated in the same commit as the actual relevant
> change. But even that's not realistic, because often such a change will
> be split across several commits -- or, for example, developed on a branch.

I'm going to respond to Branko's email because he brings up some important
points, but in general I'm replying to everyone.

I agree completely automated generation is not going to happen.  My
motivation is to just make the job easier.  It can take a lot of time to figure
out what to say to users when producing CHANGES.  This is a much less ambitious
goal than full automation.

Making commits to CHANGES alongside your other changes is a bad idea.  Let me
explain why:

1) Conflicts.  CHANGES on trunk is drastically different than CHANGES on the
branches.  So it'll increase the hoop jumping we have to do to avoid conflicts
when backporting changes.

2) Backporting.  We are never really sure what we're going to backport.  1.9.x
should not mention a change that was included in say 1.8.6.  It's not a change
from the user's perspective.  So it's entirely unclear where you should add
your data to the CHANGES file.  Already the fact that we start putting 1.9.x
CHANGES entries into trunk messes up release.py's attempt to detect unmerged
CHANGES.  I've changed that to a warning.  I haven't objected to this practice
because I think it makes sense to put things we know we'll never backport in
CHANGES.  But we also can't just start doing it all the time.

We also can't require CHANGES entries be attached to commits since sometimes
we're committing fixes to things that were never released in a broken state.  So
there is no effective change as far as a user is concerned.

If you look at the 411 Content-Length issue I think any attempt to put the
CHANGES entry in the log files would have been a mess.

Putting details from the user perspective in the commit message is still
helpful in that case, since the person nominating a change may not be the
person who wrote the change.  Also sometimes what we thought the user impact
was at commit time is incomplete.  We find many times where a change made for
one reason fixes something else and we decide to backport it for that reason.

So I felt that STATUS was a reasonable compromise for now.

Re: Improving CHANGES (or at least making it easier to produce)

Posted by Branko Čibej <br...@wandisco.com>.
On 30.08.2013 11:48, Johan Corveleyn wrote:
> On Fri, Aug 30, 2013 at 12:20 AM, Ben Reser <be...@reser.org> wrote:
>> Right now we produce the CHANGES file by someone going through the log and
>> looking at the individual commits and coming up with the entries for CHANGES.
>> It's an after the fact process.
>>
>> The problem with this is that it's not always obvious from commit messages what
>> the user impact is.  I could probably find some examples but I'm not going to
>> bother to pick on anyone in particular.  Ultimately, our commit messages are
>> for developers and the CHANGES entries are for users.  There's a wide gap
>> sometimes between what goes where.
>>
>> So I'd like to suggest that we start including a Changes field in the STATUS
>> file entries.  I haven't exactly worked out the details so nobody needs to rush
>> out right now and start doing it immediately.
>>
>> Since the people proposing the backport and the people voting for it usually
>> have the best idea of the impact it should improve the quality of our CHANGES file.
>>
>> If a STATUS entry doesn't require a CHANGES entry (e.g. improvement to an
>> already merged change that wasn't released yet) then we can just omit this
>> line.  I can then simply search through the commit logs (since the backport.pl
>> includes the STATUS entries in the commit log when it commits) and find all the
>> CHANGES entries.
>>
>> It'll still take some editing for consistency and style probably.  But it'll be
>> a lot better in my humble opinion.
>>
>> This of course does nothing to help producing the CHANGES file for a 1.x.0
>> release, because there are tons of changes going on trunk that do not ever get
>> backported.  A huge thing that can help there is to start trying to describe
>> why a user would care about the commit and not just a developer.  This is
>> something that I think we all can put a little bit more effort into on our
>> trunk commits that'll help us when we produce 1.9.0.
> Here is an alternative approach, that addresses both problems (CHANGES
> for backports, and CHANGES for next big release), and it's the way we
> currently do this at my workplace:
>
> Put the information for end-users directly in the commit message which
> makes the change (or one of the commit messages if there is a whole
> series of related commits), encoded in some parseable way. That way a
> release tool can extract this information and construct a draft of
> CHANGES (which can then still get final edits).
>
> This avoids duplication of information over both STATUS and other
> places (the description of the change is usually the same for trunk
> (next 1.x.0 to be) as for backports). At the cost of the developer who
> commits the change having to think, at that point, about how one would
> phrase this for end-users (but this is probably the best time (and the
> best person) to think about this -- and if need be, it can always be
> added after the fact to the log message). As an added bonus, the
> user-facing message is then also immediately there in the log message,
> which can be handy for devs looking through a file's history.
>
> At my workplace, we have a convention (enforced by pre-commit hook) to
> use a prefix between square brackets ([U] for the user-facing text,
> [S9] for the developer details (our team is called the "system9" team)
> -> which also get extracted to another text file for an overview of
> all the dev-messages of a single release). Here we could use something
> similar to the contribulyzer syntax (for instance "Change: blablabla
> (issue #1234)").
>
> Any revision numbers that are related to such a "change" can maybe be
> extracted automatically (for inclusion in the CHANGES, if there is no
> issue number mentioned). "Change" entries which are identical (same
> Change in 10 different commits) would of course be folded into one
> line in CHANGES.

I'm a bit confused by the idea that you'd require a log message to
describe a change twice. Doesn't make much sense to me at all.

A log message should describe what changed in the code. Automatically
generating release notes and/or CHANGES from log messages is, in my
experience, quite impractical. A better approach would be to require the
CHANGES file to be updated in the same commit as the actual relevant
change. But even that's not realistic, because often such a change will
be split across several commits -- or, for example, developed on a branch.

-- Brane

-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. brane@wandisco.com

Re: Improving CHANGES (or at least making it easier to produce)

Posted by Johan Corveleyn <jc...@gmail.com>.
On Fri, Aug 30, 2013 at 12:20 AM, Ben Reser <be...@reser.org> wrote:
> Right now we produce the CHANGES file by someone going through the log and
> looking at the individual commits and coming up with the entries for CHANGES.
> It's an after the fact process.
>
> The problem with this is that it's not always obvious from commit messages what
> the user impact is.  I could probably find some examples but I'm not going to
> bother to pick on anyone in particular.  Ultimately, our commit messages are
> for developers and the CHANGES entries are for users.  There's a wide gap
> sometimes between what goes where.
>
> So I'd like to suggest that we start including a Changes field in the STATUS
> file entries.  I haven't exactly worked out the details so nobody needs to rush
> out right now and start doing it immediately.
>
> Since the people proposing the backport and the people voting for it usually
> have the best idea of the impact it should improve the quality of our CHANGES file.
>
> If a STATUS entry doesn't require a CHANGES entry (e.g. improvement to an
> already merged change that wasn't released yet) then we can just omit this
> line.  I can then simply search through the commit logs (since the backport.pl
> includes the STATUS entries in the commit log when it commits) and find all the
> CHANGES entries.
>
> It'll still take some editing for consistency and style probably.  But it'll be
> a lot better in my humble opinion.
>
> This of course does nothing to help producing the CHANGES file for a 1.x.0
> release, because there are tons of changes going on trunk that do not ever get
> backported.  A huge thing that can help there is to start trying to describe
> why a user would care about the commit and not just a developer.  This is
> something that I think we all can put a little bit more effort into on our
> trunk commits that'll help us when we produce 1.9.0.

Here is an alternative approach, that addresses both problems (CHANGES
for backports, and CHANGES for next big release), and it's the way we
currently do this at my workplace:

Put the information for end-users directly in the commit message which
makes the change (or one of the commit messages if there is a whole
series of related commits), encoded in some parseable way. That way a
release tool can extract this information and construct a draft of
CHANGES (which can then still get final edits).

This avoids duplication of information over both STATUS and other
places (the description of the change is usually the same for trunk
(next 1.x.0 to be) as for backports). At the cost of the developer who
commits the change having to think, at that point, about how one would
phrase this for end-users (but this is probably the best time (and the
best person) to think about this -- and if need be, it can always be
added after the fact to the log message). As an added bonus, the
user-facing message is then also immediately there in the log message,
which can be handy for devs looking through a file's history.

At my workplace, we have a convention (enforced by pre-commit hook) to
use a prefix between square brackets ([U] for the user-facing text,
[S9] for the developer details (our team is called the "system9" team)
-> which also get extracted to another text file for an overview of
all the dev-messages of a single release). Here we could use something
similar to the contribulyzer syntax (for instance "Change: blablabla
(issue #1234)").

Any revision numbers that are related to such a "change" can maybe be
extracted automatically (for inclusion in the CHANGES, if there is no
issue number mentioned). "Change" entries which are identical (same
Change in 10 different commits) would of course be folded into one
line in CHANGES.
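
To make the idea a bit more concrete, here is a rough sketch of what such
an extraction step might look like. Everything in it is an assumption for
the sake of illustration: the function name draft_changes, the rule that
the user-facing text sits on a single line starting with "Change:", and
the "* text (rNNN)" output layout. It is not an existing tool.

```python
import re
from collections import OrderedDict

# Assumed convention: one line starting with "Change:" carries the
# user-facing text. Wrapped (multi-line) entries are not handled here.
_CHANGE_LINE = re.compile(r"^Change:[ \t]*(.+)$", re.MULTILINE)
_ISSUE_REF = re.compile(r"issue #\d+")


def draft_changes(logs):
    """Build draft CHANGES lines from (revision, log_message) pairs."""
    folded = OrderedDict()  # change text -> revisions that carry it
    for rev, message in logs:
        for text in _CHANGE_LINE.findall(message):
            folded.setdefault(text.strip(), []).append(rev)

    lines = []
    for text, revs in folded.items():
        if _ISSUE_REF.search(text):
            # An issue number is already mentioned, so keep the text as is.
            lines.append("* " + text)
        else:
            # No issue number: append the related revision numbers instead.
            rev_list = ", ".join("r%d" % r for r in revs)
            lines.append("* %s (%s)" % (text, rev_list))
    return lines


if __name__ == "__main__":
    logs = [
        (10, "Fix foo.\n\nChange: fix crash in 'svn foo' (issue #1234)\n"),
        (11, "Followup.\n\nChange: fix crash in 'svn foo' (issue #1234)\n"),
        (12, "Internal cleanup, nothing for users.\n"),
    ]
    for line in draft_changes(logs):
        print(line)
```

Identical "Change:" texts collapse into one entry because the text itself
is used as the dictionary key, and commits without a "Change:" line simply
contribute nothing to the draft.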

-- 
Johan