You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@beam.apache.org by Etienne Chauchot <ec...@apache.org> on 2019/04/04 13:18:51 UTC

Re: [PROPOSAL] commit granularity in master

Brain,It is good that you automated commits quality checks, thanks. But it don't agree with reducing the commit history
of a PR to only one commit, I was just referring about meaningless commits such as fixup, checktyle, spotless ... I
prefer not to squash everything and only squash meaningless commits because:- sometimes small related fixes to different
parts (with different jiras) are done in the same PR and they should stay separate commits because they deal with
different problems- more important, keeping the commit at a relative small size but still isolable is better to track
bugs/regressions (among other things during bisect sessions) that if the commit is big.
Etienne

Le vendredi 22 mars 2019 à 09:38 -0700, Brian Hulette a écrit :
> It sounds like maybe we've already reached a consensus that committers just need to be more vigilant about squashing
> fixup commits, and hopefully automate it as much as possible. But I just thought I'd also point out how the arrow
> project handles this as a point of reference, since it's kind of interesting. 
> 
> They've written a merge_arrow_pr.py script [1], which committers run to merge a PR. It enforces that the PR has an
> associated JIRA in the title, squashes the entire PR into a single commit, and closes the associated JIRA with the
> appropriate fix version.
> 
> As a result, the commit granularity is equal to the granularity of PRs, JIRAs are always linked to PRs, and the commit
> history on master is completely linear (they also have to force push master after releases in order to maintain this,
> which is the subject of much consternation and debate).
> 
> The simplicity of 1 PR = 1 commit is a nice way to avoid the manual intervention required to squash fixup commits and
> enforce that every commit has passed CI, but it does have down-sides as Etienne already pointed out.
> 
> Brian
> 
> [1] https://github.com/apache/arrow/blob/master/dev/merge_arrow_pr.py
> 
> 
> On Fri, Mar 22, 2019 at 7:46 AM Mikhail Gryzykhin <gr...@gmail.com> wrote:
> > I agree with keeping history clean.
> > Although, Small commits like address PR comments are useful during review process. They allow reviewer to see only
> > new changes, not review whole diff again. Best to squash then before/on merge though.
> > On Fri, Mar 22, 2019, 07:34 Ismaël Mejía <ie...@gmail.com> wrote:
> > > > I like the extra delimitation the brackets give, worth the two extra
> > > 
> > > > characters to me. More importantly, it's nice to have consistency, and
> > > 
> > > > the only way to be consistent with the past is leave them there.
> > > 
> > > 
> > > 
> > > My point with the brackets is that we are 'getting close' to 10K issue
> > > 
> > > so we will then have 3 chars less, probably it does not change much
> > > 
> > > but still.
> > > 
> > > 
> > > 
> > > On Fri, Mar 22, 2019 at 3:19 PM Robert Bradshaw <ro...@google.com> wrote:
> > > 
> > > >
> > > 
> > > > On Fri, Mar 22, 2019 at 3:02 PM Ismaël Mejía <ie...@gmail.com> wrote:
> > > 
> > > > >
> > > 
> > > > > It is good to remind committers of their responsability on the
> > > 
> > > > > 'cleanliness' of the merged code. Github sadly does not have an easy
> > > 
> > > > > interface to do this and this should be done manually in many cases,
> > > 
> > > > > sadly I have seen many committers just merging code with multiple
> > > 
> > > > > 'fixup' style commits by clicking Github's merge button. Maybe it is
> > > 
> > > > > time to find a way to automatically detect these cases and disallow
> > > 
> > > > > the merge or maybe we should reconsider the policy altogether if they
> > > 
> > > > > are people who don't see the value of this.
> > > 
> > > >
> > > 
> > > > I agree about keeping our history clean and useful, and think those
> > > 
> > > > four points summarize things well (but a clarification on fixup
> > > 
> > > > commits would be good).
> > > 
> > > >
> > > 
> > > > +1 to an automated check that there are many extraneous commits.
> > > 
> > > > Anything the person hitting the merge button would easily see before
> > > 
> > > > doing the merge.
> > > 
> > > >
> > > 
> > > > > I would like to propose a small modification to the commit title style
> > > 
> > > > > on that guide. We use two brackets to enclose the issue id, but that
> > > 
> > > > > really does not improve much the readibility and uses 2 extra spaces
> > > 
> > > > > of the already short space title, what about getting rid of them?
> > > 
> > > > >
> > > 
> > > > > Current style: "[BEAM-XXXX] Commit title"
> > > 
> > > > > Proposed style: "BEAM-XXXX Commit title"
> > > 
> > > > >
> > > 
> > > > > Any ideas or opinons pro/con ?
> > > 
> > > >
> > > 
> > > > I like the extra delimitation the brackets give, worth the two extra
> > > 
> > > > characters to me. More importantly, it's nice to have consistency, and
> > > 
> > > > the only way to be consistent with the past is leave them there.
> > > 
> > > >
> > > 
> > > > > On Fri, Mar 22, 2019 at 2:32 PM Etienne Chauchot <ec...@apache.org> wrote:
> > > 
> > > > > >
> > > 
> > > > > > Thanks Alexey to point this out. I did not know about these 4 points in the guide. I agree with them also. I
> > > would just add "Avoid keeping in history formatting messages such as checktyle or spotless fixes"
> > > 
> > > > > > If it is ok, I'll submit a PR to add this point.
> > > 
> > > > > > Le vendredi 22 mars 2019 à 11:33 +0100, Alexey Romanenko a écrit :
> > > 
> > > > > >
> > > 
> > > > > > Etienne, thanks for bringing this topic.
> > > 
> > > > > >
> > > 
> > > > > > I think it was already discussed several times before and we have finally came to what we have in the
> > > current Committer guide “Granularity of changes" [1].
> > > 
> > > > > >
> > > 
> > > > > > Personally, I completely agree with these 4 rules presented there. The main concern is that all committers
> > > should follow them as well, otherwise we still have sometimes a bunch of small commits with inexpressive messages
> > > (I believe they were added during review process and were not squashed before merging).
> > > 
> > > > > >
> > > 
> > > > > > In my opinion, the most important rule is that every commit should be atomic in terms of added/fixed
> > > functionality and rolling it back should not break master branch.
> > > 
> > > > > >
> > > 
> > > > > > [1] https://beam.apache.org/contribute/committer-guide/#pull-request-review-objectives
> > > 
> > > > > >
> > > 
> > > > > >
> > > 
> > > > > > On 22 Mar 2019, at 10:16, Etienne Chauchot <ec...@apache.org> wrote:
> > > 
> > > > > >
> > > 
> > > > > > Hi all,
> > > 
> > > > > > It has already been discussed partially but I would like that we agree on the commit granularity that we
> > > want in our history.
> > > 
> > > > > > Some features were squashed to only one commit which seems a bit too granular to me for a big feature.
> > > 
> > > > > > On the contrary I see PRs with very small commits such as "apply spotless" or "fix checkstyle".
> > > 
> > > > > >
> > > 
> > > > > > IMHO I think a good commit size is an isolable portion of a feature such as for ex "implement Read part of
> > > Kudu IO" or "reduce concurrency in Test A". Such a granularity allows to isolate problems easily (git bisect for
> > > ex) and rollback only a part if necessary.
> > > 
> > > > > > WDYT about:
> > > 
> > > > > > - squashing non meaningful commits such as "apply review comments" (and rather state what they do and group
> > > them if needed), or "apply spotless" or "fix checkstyle"
> > > 
> > > > > > - trying to stick to a commit size as described above
> > > 
> > > > > >
> > > 
> > > > > > => and of course update the contribution guide at the end
> > > 
> > > > > > ?
> > > 
> > > > > >
> > > 
> > > > > > Best
> > > 
> > > > > > Etienne
> > > 
> > > > > >
> > > 
> > > > > >
> > > 

Re: [PROPOSAL] commit granularity in master

Posted by Robert Bradshaw <ro...@google.com>.
On Thu, Apr 4, 2019 at 3:18 PM Etienne Chauchot <ec...@apache.org> wrote:
>
> Brain,
> It is good that you automated commits quality checks, thanks.
>
> But it don't agree with reducing the commit history of a PR to only one commit, I was just referring about meaningless commits such as fixup, checktyle, spotless ... I prefer not to squash everything and only squash meaningless commits because:
> - sometimes small related fixes to different parts (with different jiras) are done in the same PR and they should stay separate commits because they deal with different problems
> - more important, keeping the commit at a relative small size but still isolable is better to track bugs/regressions (among other things during bisect sessions) that if the commit is big.

Agreed, we should not enforce one commit per PR; there are many good
reasons to break a PR into multiple commits.

> Le vendredi 22 mars 2019 à 09:38 -0700, Brian Hulette a écrit :
>
> It sounds like maybe we've already reached a consensus that committers just need to be more vigilant about squashing fixup commits, and hopefully automate it as much as possible. But I just thought I'd also point out how the arrow project handles this as a point of reference, since it's kind of interesting.
>
> They've written a merge_arrow_pr.py script [1], which committers run to merge a PR. It enforces that the PR has an associated JIRA in the title, squashes the entire PR into a single commit, and closes the associated JIRA with the appropriate fix version.
>
> As a result, the commit granularity is equal to the granularity of PRs, JIRAs are always linked to PRs, and the commit history on master is completely linear (they also have to force push master after releases in order to maintain this, which is the subject of much consternation and debate).
>
> The simplicity of 1 PR = 1 commit is a nice way to avoid the manual intervention required to squash fixup commits and enforce that every commit has passed CI, but it does have down-sides as Etienne already pointed out.
>
> Brian
>
> [1] https://github.com/apache/arrow/blob/master/dev/merge_arrow_pr.py
>
>
> On Fri, Mar 22, 2019 at 7:46 AM Mikhail Gryzykhin <gr...@gmail.com> wrote:
>
> I agree with keeping history clean.
>
> Although, Small commits like address PR comments are useful during review process. They allow reviewer to see only new changes, not review whole diff again. Best to squash then before/on merge though.
>
> On Fri, Mar 22, 2019, 07:34 Ismaël Mejía <ie...@gmail.com> wrote:
>
> > I like the extra delimitation the brackets give, worth the two extra
> > characters to me. More importantly, it's nice to have consistency, and
> > the only way to be consistent with the past is leave them there.
>
> My point with the brackets is that we are 'getting close' to 10K issue
> so we will then have 3 chars less, probably it does not change much
> but still.
>
> On Fri, Mar 22, 2019 at 3:19 PM Robert Bradshaw <ro...@google.com> wrote:
> >
> > On Fri, Mar 22, 2019 at 3:02 PM Ismaël Mejía <ie...@gmail.com> wrote:
> > >
> > > It is good to remind committers of their responsability on the
> > > 'cleanliness' of the merged code. Github sadly does not have an easy
> > > interface to do this and this should be done manually in many cases,
> > > sadly I have seen many committers just merging code with multiple
> > > 'fixup' style commits by clicking Github's merge button. Maybe it is
> > > time to find a way to automatically detect these cases and disallow
> > > the merge or maybe we should reconsider the policy altogether if they
> > > are people who don't see the value of this.
> >
> > I agree about keeping our history clean and useful, and think those
> > four points summarize things well (but a clarification on fixup
> > commits would be good).
> >
> > +1 to an automated check that there are many extraneous commits.
> > Anything the person hitting the merge button would easily see before
> > doing the merge.
> >
> > > I would like to propose a small modification to the commit title style
> > > on that guide. We use two brackets to enclose the issue id, but that
> > > really does not improve much the readibility and uses 2 extra spaces
> > > of the already short space title, what about getting rid of them?
> > >
> > > Current style: "[BEAM-XXXX] Commit title"
> > > Proposed style: "BEAM-XXXX Commit title"
> > >
> > > Any ideas or opinons pro/con ?
> >
> > I like the extra delimitation the brackets give, worth the two extra
> > characters to me. More importantly, it's nice to have consistency, and
> > the only way to be consistent with the past is leave them there.
> >
> > > On Fri, Mar 22, 2019 at 2:32 PM Etienne Chauchot <ec...@apache.org> wrote:
> > > >
> > > > Thanks Alexey to point this out. I did not know about these 4 points in the guide. I agree with them also. I would just add "Avoid keeping in history formatting messages such as checktyle or spotless fixes"
> > > > If it is ok, I'll submit a PR to add this point.
> > > > Le vendredi 22 mars 2019 à 11:33 +0100, Alexey Romanenko a écrit :
> > > >
> > > > Etienne, thanks for bringing this topic.
> > > >
> > > > I think it was already discussed several times before and we have finally came to what we have in the current Committer guide “Granularity of changes" [1].
> > > >
> > > > Personally, I completely agree with these 4 rules presented there. The main concern is that all committers should follow them as well, otherwise we still have sometimes a bunch of small commits with inexpressive messages (I believe they were added during review process and were not squashed before merging).
> > > >
> > > > In my opinion, the most important rule is that every commit should be atomic in terms of added/fixed functionality and rolling it back should not break master branch.
> > > >
> > > > [1] https://beam.apache.org/contribute/committer-guide/#pull-request-review-objectives
> > > >
> > > >
> > > > On 22 Mar 2019, at 10:16, Etienne Chauchot <ec...@apache.org> wrote:
> > > >
> > > > Hi all,
> > > > It has already been discussed partially but I would like that we agree on the commit granularity that we want in our history.
> > > > Some features were squashed to only one commit which seems a bit too granular to me for a big feature.
> > > > On the contrary I see PRs with very small commits such as "apply spotless" or "fix checkstyle".
> > > >
> > > > IMHO I think a good commit size is an isolable portion of a feature such as for ex "implement Read part of Kudu IO" or "reduce concurrency in Test A". Such a granularity allows to isolate problems easily (git bisect for ex) and rollback only a part if necessary.
> > > > WDYT about:
> > > > - squashing non meaningful commits such as "apply review comments" (and rather state what they do and group them if needed), or "apply spotless" or "fix checkstyle"
> > > > - trying to stick to a commit size as described above
> > > >
> > > > => and of course update the contribution guide at the end
> > > > ?
> > > >
> > > > Best
> > > > Etienne
> > > >
> > > >