You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@geode.apache.org by Owen Nichols <on...@pivotal.io> on 2020/04/29 23:06:10 UTC

[DISCUSS] etiquette around breaking the pipeline

There are many ways a commit can break the Geode CI pipeline[1] despite our required PR checks, such as:
- merge a PR that failed some non-required PR checks
- merge a PR that last ran checks awhile ago and now interacts unexpectedly with other recent changes
- merge a PR that fails Windows tests
- merge a PR that fails Benchmarks tests

When a bad commit gets through for whatever reason, what should happen next?  For example:
- bring it up on the dev list
- someone should revert the change as soon as possible, allowing the pipeline to remain green while the issue is addressed
- a reasonable amount of time should be allowed for someone to make a quick fix, otherwise revert.
- the pipeline should remain broken for as long as it takes to fix, as long as there is clear communication that someone is working on it
- everyone look the other way and hope it fixes itself

I’m sure there are other ideas as well.  A related question is *who* can or should revert a bad commit?  Only the person that merged the PR?  Only the original author of the PR?  The first person to notice the problem?

What is a reasonable amount of time for this to happen?  2 hours? 2 days? 2 weeks? Does it depend whether the bad commit is also affecting PR checks for everyone else vs only depriving downstream consumers of new Geode -SNAPSHOTs?

Would you take offense if someone else reverted your commit?  Does is make a difference if your commit is exposing an existing issue (as opposed to introducing a new bug)?

Is there a perception that reverts create a lot of extra work? (they shouldn’t--just start your rework PR with a revert of the revert, then add additional commits that resolve the issue, so reviewers can easily see what was missing the first time)

This is a discussion thread, not a vote.  We trust committers to do what’s best.  Would embracing a “anyone can revert, no shame” revert-first-then-fix culture benefit our community, or is our current easygoing approach working just fine?

[1] https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main

Re: [DISCUSS] etiquette around breaking the pipeline

Posted by Jinmei Liao <ji...@pivotal.io>.
1. create the revert PR ASAP
2. work on the fix properly and create the fix PR
3. wait and merge whichever goes green and approved first.

On Wed, Apr 29, 2020 at 7:13 PM Joris Melchior <jm...@vmware.com> wrote:

> Recent experience makes me lean towards quick revert as well. Takes the
> pressure off when trying to produce a fix and if done soon enough it is
> usually quite straightforward.
> ________________________________
> From: Owen Nichols <on...@pivotal.io>
> Sent: April 29, 2020 7:06 PM
> To: dev@geode.apache.org <de...@geode.apache.org>
> Subject: [DISCUSS] etiquette around breaking the pipeline
>
> There are many ways a commit can break the Geode CI pipeline[1] despite
> our required PR checks, such as:
> - merge a PR that failed some non-required PR checks
> - merge a PR that last ran checks awhile ago and now interacts
> unexpectedly with other recent changes
> - merge a PR that fails Windows tests
> - merge a PR that fails Benchmarks tests
>
> When a bad commit gets through for whatever reason, what should happen
> next?  For example:
> - bring it up on the dev list
> - someone should revert the change as soon as possible, allowing the
> pipeline to remain green while the issue is addressed
> - a reasonable amount of time should be allowed for someone to make a
> quick fix, otherwise revert.
> - the pipeline should remain broken for as long as it takes to fix, as
> long as there is clear communication that someone is working on it
> - everyone look the other way and hope it fixes itself
>
> I’m sure there are other ideas as well.  A related question is *who* can
> or should revert a bad commit?  Only the person that merged the PR?  Only
> the original author of the PR?  The first person to notice the problem?
>
> What is a reasonable amount of time for this to happen?  2 hours? 2 days?
> 2 weeks? Does it depend whether the bad commit is also affecting PR checks
> for everyone else vs only depriving downstream consumers of new Geode
> -SNAPSHOTs?
>
> Would you take offense if someone else reverted your commit?  Does is make
> a difference if your commit is exposing an existing issue (as opposed to
> introducing a new bug)?
>
> Is there a perception that reverts create a lot of extra work? (they
> shouldn’t--just start your rework PR with a revert of the revert, then add
> additional commits that resolve the issue, so reviewers can easily see what
> was missing the first time)
>
> This is a discussion thread, not a vote.  We trust committers to do what’s
> best.  Would embracing a “anyone can revert, no shame”
> revert-first-then-fix culture benefit our community, or is our current
> easygoing approach working just fine?
>
> [1]
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fconcourse.apachegeode-ci.info%2Fteams%2Fmain%2Fpipelines%2Fapache-develop-main&amp;data=02%7C01%7Cjmelchior%40vmware.com%7C147db9dcc029489fb48a08d7ec91f5c1%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637237983950472313&amp;sdata=iEWXjnfwt1tUx6s2ODrF5aAoaNuFAphKjIo0ozE0AXU%3D&amp;reserved=0
>


-- 
Cheers

Jinmei

Re: [DISCUSS] etiquette around breaking the pipeline

Posted by Joris Melchior <jm...@vmware.com>.
Recent experience makes me lean towards quick revert as well. Takes the pressure off when trying to produce a fix and if done soon enough it is usually quite straightforward.
________________________________
From: Owen Nichols <on...@pivotal.io>
Sent: April 29, 2020 7:06 PM
To: dev@geode.apache.org <de...@geode.apache.org>
Subject: [DISCUSS] etiquette around breaking the pipeline

There are many ways a commit can break the Geode CI pipeline[1] despite our required PR checks, such as:
- merge a PR that failed some non-required PR checks
- merge a PR that last ran checks awhile ago and now interacts unexpectedly with other recent changes
- merge a PR that fails Windows tests
- merge a PR that fails Benchmarks tests

When a bad commit gets through for whatever reason, what should happen next?  For example:
- bring it up on the dev list
- someone should revert the change as soon as possible, allowing the pipeline to remain green while the issue is addressed
- a reasonable amount of time should be allowed for someone to make a quick fix, otherwise revert.
- the pipeline should remain broken for as long as it takes to fix, as long as there is clear communication that someone is working on it
- everyone look the other way and hope it fixes itself

I’m sure there are other ideas as well.  A related question is *who* can or should revert a bad commit?  Only the person that merged the PR?  Only the original author of the PR?  The first person to notice the problem?

What is a reasonable amount of time for this to happen?  2 hours? 2 days? 2 weeks? Does it depend whether the bad commit is also affecting PR checks for everyone else vs only depriving downstream consumers of new Geode -SNAPSHOTs?

Would you take offense if someone else reverted your commit?  Does is make a difference if your commit is exposing an existing issue (as opposed to introducing a new bug)?

Is there a perception that reverts create a lot of extra work? (they shouldn’t--just start your rework PR with a revert of the revert, then add additional commits that resolve the issue, so reviewers can easily see what was missing the first time)

This is a discussion thread, not a vote.  We trust committers to do what’s best.  Would embracing a “anyone can revert, no shame” revert-first-then-fix culture benefit our community, or is our current easygoing approach working just fine?

[1] https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fconcourse.apachegeode-ci.info%2Fteams%2Fmain%2Fpipelines%2Fapache-develop-main&amp;data=02%7C01%7Cjmelchior%40vmware.com%7C147db9dcc029489fb48a08d7ec91f5c1%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637237983950472313&amp;sdata=iEWXjnfwt1tUx6s2ODrF5aAoaNuFAphKjIo0ozE0AXU%3D&amp;reserved=0

Re: [DISCUSS] etiquette around breaking the pipeline

Posted by Ernest Burghardt <eb...@pivotal.io>.
+1 to quick reverts.  echoing @Donal its best if the committer of the
offending commit does the revert

On Thu, Apr 30, 2020 at 1:28 AM Ju@N <ju...@gmail.com> wrote:

> I'm in favour of quick reverts as well.
> Even though a failure might seem easy to fix at a first glance, experience
> has proven otherwise in many cases, so quickly reverting the offending
> commit seems the right thing to do.
> Whom should revert the offending commit?, I'd say the first person that
> detects the failure... It should, of course, be communicated to the
> originator of the PR so he/she can work towards solving the problem and
> submit a new PR afterwards. In my opinion nobody should be offended by a
> revert, as long as the person reverting the commit
> communicates empathetically and helps the originator of the PR to
> understand what was wrong, why it was reverted and, how can be solved (if
> needed).
> Maybe it's just me, but I'd feel more offended if my commit is breaking the
> pipeline and preventing other community members from continue working
> normally, than if somebody just reverts the bad changes I introduced.
> Cheers.
>
> On Thu, 30 Apr 2020 at 05:18, Owen Nichols <on...@pivotal.io> wrote:
>
> > Hi Mark, I’ve noticed some voluntarily quick reverts, which is awesome,
> > but other times CI has stayed broken for days, so it doesn’t feel like
> > we’re all on the same page.  I cannot find anything in the wiki or dev
> list
> > archives to suggest this was “settled” previously, but please share a
> link
> > if I missed something.  Consensus that "quick reverts are good” sounds
> > nice, but the details of who can notice and initiate the revert make a
> > difference (do we really expect every contributor to stare at CI for the
> > next 10 hours after their PR is merged?).
> >
> > > On Apr 29, 2020, at 8:12 PM, Mark Hanson <mh...@pivotal.io> wrote:
> > >
> > > As far as I am aware, I think this is already a settled decision. The
> > decision was quick revert.
> > > In almost every case the build is more important than the change.
> > >
> > > Thanks,
> > > Mark
> > >
> > >> On Apr 29, 2020, at 4:14 PM, Robert Houghton <rh...@pivotal.io>
> > wrote:
> > >>
> > >> I am very pro-revert for breaking changes. The Benchmark or Windows
> > tests
> > >> being a culprit is unfortunate, since the PR pipeline cannot tell us
> > about
> > >> those, but thats the cost of our excellence :)
> > >>
> > >> On Wed, Apr 29, 2020 at 4:06 PM Owen Nichols <onichols@pivotal.io
> > <ma...@pivotal.io>> wrote:
> > >>
> > >>> There are many ways a commit can break the Geode CI pipeline[1]
> despite
> > >>> our required PR checks, such as:
> > >>> - merge a PR that failed some non-required PR checks
> > >>> - merge a PR that last ran checks awhile ago and now interacts
> > >>> unexpectedly with other recent changes
> > >>> - merge a PR that fails Windows tests
> > >>> - merge a PR that fails Benchmarks tests
> > >>>
> > >>> When a bad commit gets through for whatever reason, what should
> happen
> > >>> next?  For example:
> > >>> - bring it up on the dev list
> > >>> - someone should revert the change as soon as possible, allowing the
> > >>> pipeline to remain green while the issue is addressed
> > >>> - a reasonable amount of time should be allowed for someone to make a
> > >>> quick fix, otherwise revert.
> > >>> - the pipeline should remain broken for as long as it takes to fix,
> as
> > >>> long as there is clear communication that someone is working on it
> > >>> - everyone look the other way and hope it fixes itself
> > >>>
> > >>> I’m sure there are other ideas as well.  A related question is *who*
> > can
> > >>> or should revert a bad commit?  Only the person that merged the PR?
> > Only
> > >>> the original author of the PR?  The first person to notice the
> problem?
> > >>>
> > >>> What is a reasonable amount of time for this to happen?  2 hours? 2
> > days?
> > >>> 2 weeks? Does it depend whether the bad commit is also affecting PR
> > checks
> > >>> for everyone else vs only depriving downstream consumers of new Geode
> > >>> -SNAPSHOTs?
> > >>>
> > >>> Would you take offense if someone else reverted your commit?  Does is
> > make
> > >>> a difference if your commit is exposing an existing issue (as opposed
> > to
> > >>> introducing a new bug)?
> > >>>
> > >>> Is there a perception that reverts create a lot of extra work? (they
> > >>> shouldn’t--just start your rework PR with a revert of the revert,
> then
> > add
> > >>> additional commits that resolve the issue, so reviewers can easily
> see
> > what
> > >>> was missing the first time)
> > >>>
> > >>> This is a discussion thread, not a vote.  We trust committers to do
> > what’s
> > >>> best.  Would embracing a “anyone can revert, no shame”
> > >>> revert-first-then-fix culture benefit our community, or is our
> current
> > >>> easygoing approach working just fine?
> > >>>
> > >>> [1]
> > >>>
> >
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fconcourse.apachegeode-ci.info%2Fteams%2Fmain%2Fpipelines%2Fapache-develop-main&amp;data=02%7C01%7Chansonm%40vmware.com%7C631cbe6f65c14bbb030108d7ec931267%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637237988725318710&amp;sdata=2ecuh535cj7G9i8STCT32zyunsnrcupg4c8dowrGwvo%3D&amp;reserved=0
> > <
> >
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fconcourse.apachegeode-ci.info%2Fteams%2Fmain%2Fpipelines%2Fapache-develop-main&amp;data=02%7C01%7Chansonm%40vmware.com%7C631cbe6f65c14bbb030108d7ec931267%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637237988725318710&amp;sdata=2ecuh535cj7G9i8STCT32zyunsnrcupg4c8dowrGwvo%3D&amp;reserved=0
> > >
> >
> >
>
> --
> Ju@N
>

Re: [DISCUSS] etiquette around breaking the pipeline

Posted by "Ju@N" <ju...@gmail.com>.
I'm in favour of quick reverts as well.
Even though a failure might seem easy to fix at a first glance, experience
has proven otherwise in many cases, so quickly reverting the offending
commit seems the right thing to do.
Whom should revert the offending commit?, I'd say the first person that
detects the failure... It should, of course, be communicated to the
originator of the PR so he/she can work towards solving the problem and
submit a new PR afterwards. In my opinion nobody should be offended by a
revert, as long as the person reverting the commit
communicates empathetically and helps the originator of the PR to
understand what was wrong, why it was reverted and, how can be solved (if
needed).
Maybe it's just me, but I'd feel more offended if my commit is breaking the
pipeline and preventing other community members from continue working
normally, than if somebody just reverts the bad changes I introduced.
Cheers.

On Thu, 30 Apr 2020 at 05:18, Owen Nichols <on...@pivotal.io> wrote:

> Hi Mark, I’ve noticed some voluntarily quick reverts, which is awesome,
> but other times CI has stayed broken for days, so it doesn’t feel like
> we’re all on the same page.  I cannot find anything in the wiki or dev list
> archives to suggest this was “settled” previously, but please share a link
> if I missed something.  Consensus that "quick reverts are good” sounds
> nice, but the details of who can notice and initiate the revert make a
> difference (do we really expect every contributor to stare at CI for the
> next 10 hours after their PR is merged?).
>
> > On Apr 29, 2020, at 8:12 PM, Mark Hanson <mh...@pivotal.io> wrote:
> >
> > As far as I am aware, I think this is already a settled decision. The
> decision was quick revert.
> > In almost every case the build is more important than the change.
> >
> > Thanks,
> > Mark
> >
> >> On Apr 29, 2020, at 4:14 PM, Robert Houghton <rh...@pivotal.io>
> wrote:
> >>
> >> I am very pro-revert for breaking changes. The Benchmark or Windows
> tests
> >> being a culprit is unfortunate, since the PR pipeline cannot tell us
> about
> >> those, but thats the cost of our excellence :)
> >>
> >> On Wed, Apr 29, 2020 at 4:06 PM Owen Nichols <onichols@pivotal.io
> <ma...@pivotal.io>> wrote:
> >>
> >>> There are many ways a commit can break the Geode CI pipeline[1] despite
> >>> our required PR checks, such as:
> >>> - merge a PR that failed some non-required PR checks
> >>> - merge a PR that last ran checks awhile ago and now interacts
> >>> unexpectedly with other recent changes
> >>> - merge a PR that fails Windows tests
> >>> - merge a PR that fails Benchmarks tests
> >>>
> >>> When a bad commit gets through for whatever reason, what should happen
> >>> next?  For example:
> >>> - bring it up on the dev list
> >>> - someone should revert the change as soon as possible, allowing the
> >>> pipeline to remain green while the issue is addressed
> >>> - a reasonable amount of time should be allowed for someone to make a
> >>> quick fix, otherwise revert.
> >>> - the pipeline should remain broken for as long as it takes to fix, as
> >>> long as there is clear communication that someone is working on it
> >>> - everyone look the other way and hope it fixes itself
> >>>
> >>> I’m sure there are other ideas as well.  A related question is *who*
> can
> >>> or should revert a bad commit?  Only the person that merged the PR?
> Only
> >>> the original author of the PR?  The first person to notice the problem?
> >>>
> >>> What is a reasonable amount of time for this to happen?  2 hours? 2
> days?
> >>> 2 weeks? Does it depend whether the bad commit is also affecting PR
> checks
> >>> for everyone else vs only depriving downstream consumers of new Geode
> >>> -SNAPSHOTs?
> >>>
> >>> Would you take offense if someone else reverted your commit?  Does is
> make
> >>> a difference if your commit is exposing an existing issue (as opposed
> to
> >>> introducing a new bug)?
> >>>
> >>> Is there a perception that reverts create a lot of extra work? (they
> >>> shouldn’t--just start your rework PR with a revert of the revert, then
> add
> >>> additional commits that resolve the issue, so reviewers can easily see
> what
> >>> was missing the first time)
> >>>
> >>> This is a discussion thread, not a vote.  We trust committers to do
> what’s
> >>> best.  Would embracing a “anyone can revert, no shame”
> >>> revert-first-then-fix culture benefit our community, or is our current
> >>> easygoing approach working just fine?
> >>>
> >>> [1]
> >>>
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fconcourse.apachegeode-ci.info%2Fteams%2Fmain%2Fpipelines%2Fapache-develop-main&amp;data=02%7C01%7Chansonm%40vmware.com%7C631cbe6f65c14bbb030108d7ec931267%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637237988725318710&amp;sdata=2ecuh535cj7G9i8STCT32zyunsnrcupg4c8dowrGwvo%3D&amp;reserved=0
> <
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fconcourse.apachegeode-ci.info%2Fteams%2Fmain%2Fpipelines%2Fapache-develop-main&amp;data=02%7C01%7Chansonm%40vmware.com%7C631cbe6f65c14bbb030108d7ec931267%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637237988725318710&amp;sdata=2ecuh535cj7G9i8STCT32zyunsnrcupg4c8dowrGwvo%3D&amp;reserved=0
> >
>
>

-- 
Ju@N

Re: [DISCUSS] etiquette around breaking the pipeline

Posted by Owen Nichols <on...@pivotal.io>.
Hi Mark, I’ve noticed some voluntarily quick reverts, which is awesome, but other times CI has stayed broken for days, so it doesn’t feel like we’re all on the same page.  I cannot find anything in the wiki or dev list archives to suggest this was “settled” previously, but please share a link if I missed something.  Consensus that "quick reverts are good” sounds nice, but the details of who can notice and initiate the revert make a difference (do we really expect every contributor to stare at CI for the next 10 hours after their PR is merged?).

> On Apr 29, 2020, at 8:12 PM, Mark Hanson <mh...@pivotal.io> wrote:
> 
> As far as I am aware, I think this is already a settled decision. The decision was quick revert. 
> In almost every case the build is more important than the change.
> 
> Thanks,
> Mark
> 
>> On Apr 29, 2020, at 4:14 PM, Robert Houghton <rh...@pivotal.io> wrote:
>> 
>> I am very pro-revert for breaking changes. The Benchmark or Windows tests
>> being a culprit is unfortunate, since the PR pipeline cannot tell us about
>> those, but thats the cost of our excellence :)
>> 
>> On Wed, Apr 29, 2020 at 4:06 PM Owen Nichols <onichols@pivotal.io <ma...@pivotal.io>> wrote:
>> 
>>> There are many ways a commit can break the Geode CI pipeline[1] despite
>>> our required PR checks, such as:
>>> - merge a PR that failed some non-required PR checks
>>> - merge a PR that last ran checks awhile ago and now interacts
>>> unexpectedly with other recent changes
>>> - merge a PR that fails Windows tests
>>> - merge a PR that fails Benchmarks tests
>>> 
>>> When a bad commit gets through for whatever reason, what should happen
>>> next?  For example:
>>> - bring it up on the dev list
>>> - someone should revert the change as soon as possible, allowing the
>>> pipeline to remain green while the issue is addressed
>>> - a reasonable amount of time should be allowed for someone to make a
>>> quick fix, otherwise revert.
>>> - the pipeline should remain broken for as long as it takes to fix, as
>>> long as there is clear communication that someone is working on it
>>> - everyone look the other way and hope it fixes itself
>>> 
>>> I’m sure there are other ideas as well.  A related question is *who* can
>>> or should revert a bad commit?  Only the person that merged the PR?  Only
>>> the original author of the PR?  The first person to notice the problem?
>>> 
>>> What is a reasonable amount of time for this to happen?  2 hours? 2 days?
>>> 2 weeks? Does it depend whether the bad commit is also affecting PR checks
>>> for everyone else vs only depriving downstream consumers of new Geode
>>> -SNAPSHOTs?
>>> 
>>> Would you take offense if someone else reverted your commit?  Does is make
>>> a difference if your commit is exposing an existing issue (as opposed to
>>> introducing a new bug)?
>>> 
>>> Is there a perception that reverts create a lot of extra work? (they
>>> shouldn’t--just start your rework PR with a revert of the revert, then add
>>> additional commits that resolve the issue, so reviewers can easily see what
>>> was missing the first time)
>>> 
>>> This is a discussion thread, not a vote.  We trust committers to do what’s
>>> best.  Would embracing a “anyone can revert, no shame”
>>> revert-first-then-fix culture benefit our community, or is our current
>>> easygoing approach working just fine?
>>> 
>>> [1]
>>> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fconcourse.apachegeode-ci.info%2Fteams%2Fmain%2Fpipelines%2Fapache-develop-main&amp;data=02%7C01%7Chansonm%40vmware.com%7C631cbe6f65c14bbb030108d7ec931267%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637237988725318710&amp;sdata=2ecuh535cj7G9i8STCT32zyunsnrcupg4c8dowrGwvo%3D&amp;reserved=0 <https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fconcourse.apachegeode-ci.info%2Fteams%2Fmain%2Fpipelines%2Fapache-develop-main&amp;data=02%7C01%7Chansonm%40vmware.com%7C631cbe6f65c14bbb030108d7ec931267%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637237988725318710&amp;sdata=2ecuh535cj7G9i8STCT32zyunsnrcupg4c8dowrGwvo%3D&amp;reserved=0>


Re: [DISCUSS] etiquette around breaking the pipeline

Posted by Mark Hanson <mh...@pivotal.io>.
As far as I am aware, I think this is already a settled decision. The decision was quick revert. 
In almost every case the build is more important than the change.

Thanks,
Mark

> On Apr 29, 2020, at 4:14 PM, Robert Houghton <rh...@pivotal.io> wrote:
> 
> I am very pro-revert for breaking changes. The Benchmark or Windows tests
> being a culprit is unfortunate, since the PR pipeline cannot tell us about
> those, but thats the cost of our excellence :)
> 
> On Wed, Apr 29, 2020 at 4:06 PM Owen Nichols <onichols@pivotal.io <ma...@pivotal.io>> wrote:
> 
>> There are many ways a commit can break the Geode CI pipeline[1] despite
>> our required PR checks, such as:
>> - merge a PR that failed some non-required PR checks
>> - merge a PR that last ran checks awhile ago and now interacts
>> unexpectedly with other recent changes
>> - merge a PR that fails Windows tests
>> - merge a PR that fails Benchmarks tests
>> 
>> When a bad commit gets through for whatever reason, what should happen
>> next?  For example:
>> - bring it up on the dev list
>> - someone should revert the change as soon as possible, allowing the
>> pipeline to remain green while the issue is addressed
>> - a reasonable amount of time should be allowed for someone to make a
>> quick fix, otherwise revert.
>> - the pipeline should remain broken for as long as it takes to fix, as
>> long as there is clear communication that someone is working on it
>> - everyone look the other way and hope it fixes itself
>> 
>> I’m sure there are other ideas as well.  A related question is *who* can
>> or should revert a bad commit?  Only the person that merged the PR?  Only
>> the original author of the PR?  The first person to notice the problem?
>> 
>> What is a reasonable amount of time for this to happen?  2 hours? 2 days?
>> 2 weeks? Does it depend whether the bad commit is also affecting PR checks
>> for everyone else vs only depriving downstream consumers of new Geode
>> -SNAPSHOTs?
>> 
>> Would you take offense if someone else reverted your commit?  Does is make
>> a difference if your commit is exposing an existing issue (as opposed to
>> introducing a new bug)?
>> 
>> Is there a perception that reverts create a lot of extra work? (they
>> shouldn’t--just start your rework PR with a revert of the revert, then add
>> additional commits that resolve the issue, so reviewers can easily see what
>> was missing the first time)
>> 
>> This is a discussion thread, not a vote.  We trust committers to do what’s
>> best.  Would embracing a “anyone can revert, no shame”
>> revert-first-then-fix culture benefit our community, or is our current
>> easygoing approach working just fine?
>> 
>> [1]
>> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fconcourse.apachegeode-ci.info%2Fteams%2Fmain%2Fpipelines%2Fapache-develop-main&amp;data=02%7C01%7Chansonm%40vmware.com%7C631cbe6f65c14bbb030108d7ec931267%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637237988725318710&amp;sdata=2ecuh535cj7G9i8STCT32zyunsnrcupg4c8dowrGwvo%3D&amp;reserved=0 <https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fconcourse.apachegeode-ci.info%2Fteams%2Fmain%2Fpipelines%2Fapache-develop-main&amp;data=02%7C01%7Chansonm%40vmware.com%7C631cbe6f65c14bbb030108d7ec931267%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637237988725318710&amp;sdata=2ecuh535cj7G9i8STCT32zyunsnrcupg4c8dowrGwvo%3D&amp;reserved=0>

Re: [DISCUSS] etiquette around breaking the pipeline

Posted by Donal Evans <do...@pivotal.io>.
+1 for a quick revert from me. Preferably by the person whose commit is
causing the issue, since I think it's healthy to encourage a
stigma-free revert process and a sense of individual responsibility for the
health of the pipeline, but theoretically by anyone, as long there's a
concrete idea of who has ownership of the problem following the revert.

As far as the (perceived?) pain of reverting, my recent personal experience
with reverting a troublesome commit has made me even more in favour of
doing it. I hit an unanticipated test ordering issue in the Windows unit
tests after merging a commit and thanks to good communication and
coordination from people who were monitoring the pipeline was able to get
it reverted within 2 hours. The fix was very straightforward, and I was
tempted to leave the problem commit in place and just apply the fix on top
of it, but by swiftly reverting the commit that was exposing the problem, I
stopped several subsequent commits from other committers from getting
blocked in the brief time it took to get the fix verified and merged. The
extra work involved consisted of a few messages asking for PR approvals and
30 seconds in git, so in this instance, at least, it's difficult for me to
see a justification in leaving the pipeline red while working on a fix.

On Wed, Apr 29, 2020 at 4:14 PM Robert Houghton <rh...@pivotal.io>
wrote:

> I am very pro-revert for breaking changes. The Benchmark or Windows tests
> being a culprit is unfortunate, since the PR pipeline cannot tell us about
> those, but thats the cost of our excellence :)
>
> On Wed, Apr 29, 2020 at 4:06 PM Owen Nichols <on...@pivotal.io> wrote:
>
> > There are many ways a commit can break the Geode CI pipeline[1] despite
> > our required PR checks, such as:
> > - merge a PR that failed some non-required PR checks
> > - merge a PR that last ran checks awhile ago and now interacts
> > unexpectedly with other recent changes
> > - merge a PR that fails Windows tests
> > - merge a PR that fails Benchmarks tests
> >
> > When a bad commit gets through for whatever reason, what should happen
> > next?  For example:
> > - bring it up on the dev list
> > - someone should revert the change as soon as possible, allowing the
> > pipeline to remain green while the issue is addressed
> > - a reasonable amount of time should be allowed for someone to make a
> > quick fix, otherwise revert.
> > - the pipeline should remain broken for as long as it takes to fix, as
> > long as there is clear communication that someone is working on it
> > - everyone look the other way and hope it fixes itself
> >
> > I’m sure there are other ideas as well.  A related question is *who* can
> > or should revert a bad commit?  Only the person that merged the PR?  Only
> > the original author of the PR?  The first person to notice the problem?
> >
> > What is a reasonable amount of time for this to happen?  2 hours? 2 days?
> > 2 weeks? Does it depend whether the bad commit is also affecting PR
> checks
> > for everyone else vs only depriving downstream consumers of new Geode
> > -SNAPSHOTs?
> >
> > Would you take offense if someone else reverted your commit?  Does is
> make
> > a difference if your commit is exposing an existing issue (as opposed to
> > introducing a new bug)?
> >
> > Is there a perception that reverts create a lot of extra work? (they
> > shouldn’t--just start your rework PR with a revert of the revert, then
> add
> > additional commits that resolve the issue, so reviewers can easily see
> what
> > was missing the first time)
> >
> > This is a discussion thread, not a vote.  We trust committers to do
> what’s
> > best.  Would embracing a “anyone can revert, no shame”
> > revert-first-then-fix culture benefit our community, or is our current
> > easygoing approach working just fine?
> >
> > [1]
> >
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main
>

Re: [DISCUSS] etiquette around breaking the pipeline

Posted by Robert Houghton <rh...@pivotal.io>.
I am very pro-revert for breaking changes. The Benchmark or Windows tests
being a culprit is unfortunate, since the PR pipeline cannot tell us about
those, but thats the cost of our excellence :)

On Wed, Apr 29, 2020 at 4:06 PM Owen Nichols <on...@pivotal.io> wrote:

> There are many ways a commit can break the Geode CI pipeline[1] despite
> our required PR checks, such as:
> - merge a PR that failed some non-required PR checks
> - merge a PR that last ran checks awhile ago and now interacts
> unexpectedly with other recent changes
> - merge a PR that fails Windows tests
> - merge a PR that fails Benchmarks tests
>
> When a bad commit gets through for whatever reason, what should happen
> next?  For example:
> - bring it up on the dev list
> - someone should revert the change as soon as possible, allowing the
> pipeline to remain green while the issue is addressed
> - a reasonable amount of time should be allowed for someone to make a
> quick fix, otherwise revert.
> - the pipeline should remain broken for as long as it takes to fix, as
> long as there is clear communication that someone is working on it
> - everyone look the other way and hope it fixes itself
>
> I’m sure there are other ideas as well.  A related question is *who* can
> or should revert a bad commit?  Only the person that merged the PR?  Only
> the original author of the PR?  The first person to notice the problem?
>
> What is a reasonable amount of time for this to happen?  2 hours? 2 days?
> 2 weeks? Does it depend whether the bad commit is also affecting PR checks
> for everyone else vs only depriving downstream consumers of new Geode
> -SNAPSHOTs?
>
> Would you take offense if someone else reverted your commit?  Does is make
> a difference if your commit is exposing an existing issue (as opposed to
> introducing a new bug)?
>
> Is there a perception that reverts create a lot of extra work? (they
> shouldn’t--just start your rework PR with a revert of the revert, then add
> additional commits that resolve the issue, so reviewers can easily see what
> was missing the first time)
>
> This is a discussion thread, not a vote.  We trust committers to do what’s
> best.  Would embracing a “anyone can revert, no shame”
> revert-first-then-fix culture benefit our community, or is our current
> easygoing approach working just fine?
>
> [1]
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main