You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@arrow.apache.org by Wes McKinney <we...@gmail.com> on 2018/12/19 13:42:46 UTC

Arrow pull requests: please limit squashing your commits

hi folks,

As the contributor base has grown, our development styles have grown
increasingly diverse.

Sometimes contributors are used to working in a Gerrit-style workflow
where patches are always squashed with `git rebase -i` into a single
patch, and then force pushed to the PR branch.

I'd like to ask you to avoid doing this, as it can make things harder
for maintainers. Let me explain:

* When you rebase and force-push, GitHub fails to generate an e-mail
notification. I use the GitHub notifications to tell which branches
are being actively developed and may need to be reviewed again. Many
times now I have thought a branch was inactive only to look more
closely and see that it's been updated via force-push. Since it took
GitHub 10 years to start showing force push changes at all in their UI
I'm not holding out for them to send e-mail notifications about this

* GitHub is not Gerrit. We don't have the awesome incremental diff
feature. So in lieu of this it's easier to be able to look at
incremental diffs (e.g. responses to code review comments) by clicking
on the individual commits

* Our PR merge tool (dev/merge_arrow_py.py) squashes all the commits
anyway, so squashing twice is redundant

Sometimes I'll have commits like this in my branch

* IMPLEMENTING THE FEATURE
* lint
* fixing CI
* fixing toolchain issue
* code review commits
* fixing CI issues
* more code review comments
* documentation

I think it's fine to combine some of the commits this so long as the
produced commits reflect the logical evolution of your patch, for the
purposes of code review.

In the event of a gnarly rebase on master, sometimes it is easiest to
create a single commit and then rebase that.

Thanks,
Wes

Re: Arrow pull requests: please limit squashing your commits

Posted by Wes McKinney <we...@gmail.com>.
On Wed, Dec 19, 2018 at 7:47 AM Antoine Pitrou <an...@python.org> wrote:
>
>
> Le 19/12/2018 à 14:42, Wes McKinney a écrit :
> >
> > * Our PR merge tool (dev/merge_arrow_py.py) squashes all the commits
> > anyway, so squashing twice is redundant
>
> The problem is you can then get spurious conflicts if you base a PR on
> another.  Happened to me several times.

Agreed -- I didn't say "never squash" but to "avoid" or "limit" it.
The stacked PR use case is a good example where things can be painful
if all your commits are not atomic. This does not describe the average
pull request, though

>
> Regards
>
> Antoine.

Re: Arrow pull requests: please limit squashing your commits

Posted by Antoine Pitrou <an...@python.org>.
Le 19/12/2018 à 14:42, Wes McKinney a écrit :
> 
> * Our PR merge tool (dev/merge_arrow_py.py) squashes all the commits
> anyway, so squashing twice is redundant

The problem is you can then get spurious conflicts if you base a PR on
another.  Happened to me several times.

Regards

Antoine.

Re: Arrow pull requests: please limit squashing your commits

Posted by Wes McKinney <we...@gmail.com>.
On Wed, Dec 19, 2018 at 7:47 AM Francois Saint-Jacques
<fs...@gmail.com> wrote:
>
> No issue with this.
>
> When the final squash is done, which title/body is preserved?

The PR title (in GitHub) and the PR description are what matter. The
commit messages don't really matter

>
> On Wed, Dec 19, 2018 at 8:43 AM Wes McKinney <we...@gmail.com> wrote:
>
> > hi folks,
> >
> > As the contributor base has grown, our development styles have grown
> > increasingly diverse.
> >
> > Sometimes contributors are used to working in a Gerrit-style workflow
> > where patches are always squashed with `git rebase -i` into a single
> > patch, and then force pushed to the PR branch.
> >
> > I'd like to ask you to avoid doing this, as it can make things harder
> > for maintainers. Let me explain:
> >
> > * When you rebase and force-push, GitHub fails to generate an e-mail
> > notification. I use the GitHub notifications to tell which branches
> > are being actively developed and may need to be reviewed again. Many
> > times now I have thought a branch was inactive only to look more
> > closely and see that it's been updated via force-push. Since it took
> > GitHub 10 years to start showing force push changes at all in their UI
> > I'm not holding out for them to send e-mail notifications about this
> >
> > * GitHub is not Gerrit. We don't have the awesome incremental diff
> > feature. So in lieu of this it's easier to be able to look at
> > incremental diffs (e.g. responses to code review comments) by clicking
> > on the individual commits
> >
> > * Our PR merge tool (dev/merge_arrow_py.py) squashes all the commits
> > anyway, so squashing twice is redundant
> >
> > Sometimes I'll have commits like this in my branch
> >
> > * IMPLEMENTING THE FEATURE
> > * lint
> > * fixing CI
> > * fixing toolchain issue
> > * code review commits
> > * fixing CI issues
> > * more code review comments
> > * documentation
> >
> > I think it's fine to combine some of the commits this so long as the
> > produced commits reflect the logical evolution of your patch, for the
> > purposes of code review.
> >
> > In the event of a gnarly rebase on master, sometimes it is easiest to
> > create a single commit and then rebase that.
> >
> > Thanks,
> > Wes
> >

Re: Arrow pull requests: please limit squashing your commits

Posted by Francois Saint-Jacques <fs...@gmail.com>.
No issue with this.

When the final squash is done, which title/body is preserved?

On Wed, Dec 19, 2018 at 8:43 AM Wes McKinney <we...@gmail.com> wrote:

> hi folks,
>
> As the contributor base has grown, our development styles have grown
> increasingly diverse.
>
> Sometimes contributors are used to working in a Gerrit-style workflow
> where patches are always squashed with `git rebase -i` into a single
> patch, and then force pushed to the PR branch.
>
> I'd like to ask you to avoid doing this, as it can make things harder
> for maintainers. Let me explain:
>
> * When you rebase and force-push, GitHub fails to generate an e-mail
> notification. I use the GitHub notifications to tell which branches
> are being actively developed and may need to be reviewed again. Many
> times now I have thought a branch was inactive only to look more
> closely and see that it's been updated via force-push. Since it took
> GitHub 10 years to start showing force push changes at all in their UI
> I'm not holding out for them to send e-mail notifications about this
>
> * GitHub is not Gerrit. We don't have the awesome incremental diff
> feature. So in lieu of this it's easier to be able to look at
> incremental diffs (e.g. responses to code review comments) by clicking
> on the individual commits
>
> * Our PR merge tool (dev/merge_arrow_py.py) squashes all the commits
> anyway, so squashing twice is redundant
>
> Sometimes I'll have commits like this in my branch
>
> * IMPLEMENTING THE FEATURE
> * lint
> * fixing CI
> * fixing toolchain issue
> * code review commits
> * fixing CI issues
> * more code review comments
> * documentation
>
> I think it's fine to combine some of the commits this so long as the
> produced commits reflect the logical evolution of your patch, for the
> purposes of code review.
>
> In the event of a gnarly rebase on master, sometimes it is easiest to
> create a single commit and then rebase that.
>
> Thanks,
> Wes
>