You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@beam.apache.org by Damon Douglas <da...@google.com> on 2022/06/27 19:01:51 UTC

Update PR description template

I write this to prevent a recent experience I encountered contributing to
the Apache Beam repository.  To address a system issue, I write in SBAR
<https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
format for clarity.  If there are no objections to the recommendation, I'm
happy to fulfill the first recommendation i.e. create GitHub issues.

*Subject*

I submitted a PR <https://github.com/apache/beam/pull/21953> that was
subsequently
reverted <https://github.com/apache/beam/pull/22044>.

*Background*

Despite passing the following gradle tasks, the PreCommit_Java_Commit failed
with a test unrelated to the code in the PR
<https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
The PR was merged and then reverted due to discovered vendor dependency
issues and failed style checks.

./gradlew rat
./gradlew spotlessCheck
./gradlew sdks:java:io:google-cloud-platform:test

*Assessment*

Output from the following gradle commands did not surface prior to the PR
merge.

./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
./gradlew sdks:java:io:google-cloud-platform:checkStyleTest

Both the aforementioned failing PreCommit_Java_Commit test and a lack of
knowledge contributed to the root cause.

*Recommendation / Proposal*

I propose updating PR template description
<https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md>
that
prescribes gradle tasks to be successfully executed prior to submitting a
PR.  Toward this aim:

1.  Four separate GitHub issues in https://github.com/apache/beam
representative of *common*, and language specific: *Java*, *Python*, and
*Go*, prescribed gradle tasks
2.  Don't reply to this email with specifics on how; delegation of
comments/discussion within the relevant context of the GitHub issues in 1
3.  PRs that fix the GitHub issues in 1

-- 



*Damon Douglas*

Strategic Cloud Engineer, Data & Analytics, Google Cloud

damondouglas@google.com

Re: Update PR description template

Posted by Danny McCormick via dev <de...@beam.apache.org>.
Hey Yi,

Thanks for pushing this forward! I'm +1 on doing all 3 action items, along
with making the github action a required check once auto-merge is
implemented.

Thanks,
Danny

On Thu, Jul 7, 2022 at 12:08 PM Yi Hu via dev <de...@beam.apache.org> wrote:

> Hi all,
>
> I have drafted a document summarizing the ideas suggested in this thread.
>
>
> https://docs.google.com/document/d/10FlXOo_hL2QYTPhwS8uHSyJbQCzwC3K3C12tFccANA8/edit#
>
> where I extracted the following action items:
>
> short-term (easy) items
> - Split the linting task (e.g. java checkStyle) out from the test (java
> percommit) step
>
> mid-to-long-term items
> - A single github action that indicates the PR is ready to merge
> - Merging the PR through action
>
> I am happy to volunteer to take on this task. Once we have decided the
> direction(s) to proceed I am going to detail the design.
>
> Regards,
> Yi
>
>
>
> On Tue, Jun 28, 2022 at 10:40 PM Ahmet Altay <al...@google.com> wrote:
>
>> I am worried about adding our own custom tooling. They require
>> maintenance, they introduce new flakiness, and so far we have not been
>> great about maintaining custom infra.
>>
>> On Tue, Jun 28, 2022 at 1:36 PM Kenneth Knowles <ke...@apache.org> wrote:
>>
>>> Thanks for all the super useful info. I can add some experience from the
>>> early days of Beam: Because the GitHub "merge" button did not exist, we
>>> rolled our own "merge bot". The pros/cons below are not general to the
>>> concept, but to our experience with it:
>>>
>>> PROs:
>>>  - It single-threaded commits to the repo so there weren't race
>>> conditions on test results. I think this will be OK for our scale for the
>>> foreseeable future.
>>>  - It re-ran tests after merge but before pushing to master, which is
>>> the second half of eliminating that race condition.
>>>
>>> CONs:
>>>  - It was very flaky and often just didn't stay up. We danced for joy
>>> when it was gone.
>>>  - It ran a kind of arbitrary set of tests that didn't match the PR
>>> statuses. It did not have any filter on which tests were run during merge.
>>> We were small enough that mostly just "run the tests" was specific enough.
>>>  - It always squashed the commits and then pushed the squashed commit
>>> with a comment "this closes #<pr>". This messes up PRs that have multiple
>>> commits that should remain separate. But more importantly it made it
>>> impossible to easily distinguish PRs that were merged versus those that
>>> were closed without merge. And of course it is way harder to navigate
>>> history when the commits on master have different hashes than the commits
>>> that were authored.
>>>  - Getting reasonable logs with error messages when a merge failed for
>>> whatever reason was hard or impossible IIRC.
>>>
>>> So I think in what you have said there is an option to get the best of
>>> all, something like:
>>>
>>>  - Do merges through an action. Merge commit or squash-and-merge would
>>> be separate labels. Or not bother with squash and merge, instead using
>>> heuristics to block PRs with bad commit histories.
>>>  - Have the workflow check that all PR statuses are green before
>>> continuing to merge.
>>>
>>
>> This sounds like a reasonable process. I assume the only addition in this
>> case would be another github action.
>>
>>
>>>
>>> Kenn
>>>
>>> On Tue, Jun 28, 2022 at 8:03 AM Danny McCormick <
>>> dannymccormick@google.com> wrote:
>>>
>>>> After looking into this a little bit more, I need to revise my opinion
>>>> on how we would do this; I don't think it's practical to have all required
>>>> status checks via the .asf.yml file because those required checks can't be
>>>> filtered by path (for example, if we want to require Python precommit on
>>>> Python PRs, it would need to be required on *all *PRs). That's a GitHub
>>>> limitation
>>>> <https://docs.github.com/en/enterprise-cloud@latest/repositories/configuring-branches-and-merges-in-your-repository/defining-the-mergeability-of-pull-requests/troubleshooting-required-status-checks#handling-skipped-but-required-checks>,
>>>> not an ASF one.
>>>>
>>>> One option is to write an action that makes sure no checks are failing
>>>> (except maybe codecov?) and put a single required check on that. That would
>>>> also make it easy to build in logic to override required checks like Robert
>>>> suggested ("specific wording would have to be in the last comment"). We
>>>> already have logic in the PR bot that does some of this
>>>> <https://github.com/apache/beam/blob/master/scripts/ci/pr-bot/shared/checks.ts>
>>>> .
>>>>
>>>> The downside to that approach is that it's not clear what the best way
>>>> to trigger that workflow is since it has to run after all other checks have
>>>> completed. We could have it trigger on some label (e.g. "ready to merge")
>>>> and then automatically merge the PR when it's done or comment and remove
>>>> the label if checks are failing/incomplete. This changes the workflow for
>>>> committers from "click the merge button" to "add a label", but doesn't
>>>> require significantly more action or oversight and is pretty similar to how
>>>> Kubernetes <https://github.com/kubernetes/kubernetes/pull/110812> and
>>>> some other large repos run things.
>>>>
>>>> Another option would be to trigger that check at the end of every
>>>> Actions run and use the check_suite trigger for external runs
>>>> (unfortunately actions doesn't trigger that). That would require a lot of
>>>> boilerplate on every Actions workflow though. There are other similar
>>>> options which may be cleaner that also require modifying every Actions
>>>> workflow, but I don't love that option.
>>>>
>>>> I'm still in favor of doing this via the "ready to merge" label option
>>>> I just described, but this has dampened my enthusiasm a little bit since
>>>> we'd need to build out/maintain our own tooling.
>>>>
>>>> On Tue, Jun 28, 2022 at 8:49 AM Danny McCormick <
>>>> dannymccormick@google.com> wrote:
>>>>
>>>>> Regarding code cov - it does check overall % coverage as well as the
>>>>> coverage of the diff, but I don't think it's designed to be a blocking
>>>>> metric. A good example of where this is not helpful is this pr to
>>>>> remove dead code <https://github.com/apache/beam/pull/22019>. Because
>>>>> it removes tested code, the pr lowers our coverage percentage and fails the
>>>>> check, but it's a totally reasonable PR that doesn't need additional
>>>>> testing (because it only removes functionality). Other classes of changes
>>>>> that are problematic include things like improving hard to cover error
>>>>> messages or tough to test proto changes which can only be covered in an
>>>>> integration test that doesn't make it into codecov (these are both common
>>>>> at least in the Go SDK).
>>>>>
>>>>> I'd be opposed to including that as a required check, though I support
>>>>> adding pretty much every other suite.
>>>>>
>>>>> > I assume we would have a way to override it in case the repo gets
>>>>> into a bad state (for example needing to modify the workflow definitions in
>>>>> order to get them to pass).
>>>>>
>>>>> The way we make a check required is by adding it in the .asf.yml file
>>>>> <https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Branchprotection>.
>>>>> So we could always update that if needed to push an emergency fix as long
>>>>> as we don't accidentally include any required checks on the .asf.yml
>>>>> itself. If we mess that up we would need to bring in Infra to help.
>>>>>
>>>>> On Mon, Jun 27, 2022 at 7:59 PM Ahmet Altay <al...@google.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 27, 2022 at 4:55 PM Robert Bradshaw <ro...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Mon, Jun 27, 2022 at 3:06 PM Kenneth Knowles <ke...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I am in favor of requiring precommit green before merge. To make
>>>>>>>> this better, we should:
>>>>>>>>
>>>>>>>> (1) run fewer irrelevant tests, for example almost every
>>>>>>>> language-based presubmit tests vastly more code than could possibly be
>>>>>>>> affected.
>>>>>>>> (2) run more relevant tests, for example when an IO is touched
>>>>>>>> running slightly more heavyweight integration tests before merge is
>>>>>>>> reasonable.
>>>>>>>>
>>>>>>>
>>>>>>> Does it have to be all or nothing? E.g. we could start
>>>>>>> with requiring a smaller subset of tests that must pass, with the goal of
>>>>>>> eventually requiring (almost?) everything.
>>>>>>>
>>>>>>>
>>>>>>>> I worry that requiring total line coverage in unit tests /
>>>>>>>> presubmits will encourage mock-based tests that don't demonstrate any
>>>>>>>> meaningful correctness. But I am OK to try it.
>>>>>>>>
>>>>>>>
>>>>>> IIRC, codecov test is checking that % coverage is not reduced with
>>>>>> the new PR and it is not checking for total line coverage. I agree with you
>>>>>> that total line coverage is probably not meaningful in most cases.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> +1. I think there are a class of presubmits (coverage among them)
>>>>>>> that are advisory rather than binding.
>>>>>>>
>>>>>>
>>>>>> +1 - especially if we can make the tools distinguish between required
>>>>>> and advisory presubmits.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>> I assume we would have a way to override it in case the repo gets
>>>>>>>> into a bad state (for example needing to modify the workflow definitions in
>>>>>>>> order to get them to pass).
>>>>>>>>
>>>>>>>
>>>>>>> +1. But making this explicit (e.g. specific wording would have to be
>>>>>>> in the last comment) would be a good barrier to have even if it's not
>>>>>>> absolute.
>>>>>>>
>>>>>>>
>>>>>>>> On Mon, Jun 27, 2022 at 2:58 PM Ahmet Altay <al...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Just an opinion: If we have a precommit (including codecov) we
>>>>>>>>> should be willing to enforce those passing before merging a PR. Selectively
>>>>>>>>> applying when/which precommits would be blockers eventually leads to more
>>>>>>>>> opinions. I think codecov is actually providing value by checking that
>>>>>>>>> delta coverage is not dropped. In my experience, for PRs that fail the
>>>>>>>>> codecov precommit there tends to be an opportunity for adding unit tests.
>>>>>>>>>
>>>>>>>>> On Mon, Jun 27, 2022 at 2:14 PM Danny McCormick <
>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Brian, you keep beating my emails by an instant :) I think
>>>>>>>>>> codecov provides value independent of blocking a PR - it is a good sanity
>>>>>>>>>> check (e.g. if codecov drops from 50% to 5%) and it gives us a nice metric
>>>>>>>>>> to track over time.
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick <
>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yeah, codecov is the one exception I had in mind, there may be
>>>>>>>>>>> others that are reasonable to exclude as well
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <
>>>>>>>>>>> jrmccluskey@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +1 for enforcing some precommit checks as required, but making
>>>>>>>>>>>> them all required would be a bit of a nightmare with how Codecov works. Any
>>>>>>>>>>>> Python or Go patch gets flagged as failing codecov if its changes aren't
>>>>>>>>>>>> covered at or above the current overall coverage level at master (around
>>>>>>>>>>>> 75% right now.) This is a nice goal, but ultimately not every change lends
>>>>>>>>>>>> itself to that level of unit testing. But I would be in favor of things
>>>>>>>>>>>> like the various ValidatesRunner suites being required.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Jack McCluskey
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <
>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Brian beat me by a few seconds, but I actually agree that
>>>>>>>>>>>>> there is a problem here - when Damon's PR was submitted there was basically
>>>>>>>>>>>>> no way of knowing that checks would break without having detailed knowledge
>>>>>>>>>>>>> of how the build system works (so I'm on board with your
>>>>>>>>>>>>> Subject/Background/Assessment). With that said, I disagree with the
>>>>>>>>>>>>> recommended path forward - our automation failed us here and I don't think
>>>>>>>>>>>>> the correct response is to make our manual validation better. That shifts
>>>>>>>>>>>>> the burden of verifying correctness from robots (easy to validate,
>>>>>>>>>>>>> basically free) to humans (expensive, forgetful, and hard to validate,
>>>>>>>>>>>>> especially for newer contributors). Instead, I think we should fix the
>>>>>>>>>>>>> underlying technical problems, which are IMO:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) Flaky tests unrelated to the PR caused it to fail
>>>>>>>>>>>>>
>>>>>>>>>>>>
>> I missed this part earlier. I would encourage all committers who are
>> blocked on flaky tests to file a new github issue with a link to the flaked
>> test before re-running the flaking test. Re-running tests until they pass
>> also makes it easier to introduce new flaky tests.
>>
>>
>>> 2) There were no guardrails in place to avoid merging a PR that wasn't
>>>>>>>>>>>>> fully validated.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Things that would've made this better which I think we should
>>>>>>>>>>>>> focus our attention on:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) (easy) Split the linting task out from the test step.
>>>>>>>>>>>>> 2) This one is more controversial, but I would vote we start
>>>>>>>>>>>>> enforcing some (or all?) precommit checks as required. I don't really blame
>>>>>>>>>>>>> anyone for merging this PR since it had repeatedly run into unrelated
>>>>>>>>>>>>> flakes, but that's kinda the point - we need to feel some of that pain to
>>>>>>>>>>>>> avoid accidentally merging bad code. Blocking PRs on failed precommit
>>>>>>>>>>>>> suites would force us to deal with these kinds of flakes and incentivize a
>>>>>>>>>>>>> lot of good behavior in the repo (and disincentivize bad tests).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <
>>>>>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Damon,
>>>>>>>>>>>>>> I think our CI system was working as intended for your PR. We
>>>>>>>>>>>>>> select which PreCommits to run based on the files that are edited, and the
>>>>>>>>>>>>>> author/reviewer should collaborate to get those green before merging the
>>>>>>>>>>>>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>>>>>>>>>>>>> failing gradle tasks and run them locally to reproduce.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It looks like what broke down in this case was that the PR
>>>>>>>>>>>>>> was accidentally merged despite the failing Java PreCommit run.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Brian
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <
>>>>>>>>>>>>>> damondouglas@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I write this to prevent a recent experience I encountered
>>>>>>>>>>>>>>> contributing to the Apache Beam repository.  To address a system issue, I
>>>>>>>>>>>>>>> write in SBAR
>>>>>>>>>>>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>>>>>>>>>>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>>>>>>>>>>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Subject*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953>
>>>>>>>>>>>>>>> that was subsequently reverted
>>>>>>>>>>>>>>> <https://github.com/apache/beam/pull/22044>.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Background*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Despite passing the following gradle tasks,
>>>>>>>>>>>>>>> the PreCommit_Java_Commit failed with a test unrelated to
>>>>>>>>>>>>>>> the code in the PR
>>>>>>>>>>>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>>>>>>>>>>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>>>>>>>>>>>>> issues and failed style checks.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ./gradlew rat
>>>>>>>>>>>>>>> ./gradlew spotlessCheck
>>>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Assessment*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Output from the following gradle commands did not surface
>>>>>>>>>>>>>>> prior to the PR merge.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Both the aforementioned failing PreCommit_Java_Commit test
>>>>>>>>>>>>>>> and a lack of knowledge contributed to the root cause.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Recommendation / Proposal*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I propose updating PR template description
>>>>>>>>>>>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>>>>>>>>>>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>>>>>>>>>>>>> PR.  Toward this aim:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1.  Four separate GitHub issues in
>>>>>>>>>>>>>>> https://github.com/apache/beam representative of *common*,
>>>>>>>>>>>>>>> and language specific: *Java*, *Python*, and *Go*,
>>>>>>>>>>>>>>> prescribed gradle tasks
>>>>>>>>>>>>>>> 2.  Don't reply to this email with specifics on how;
>>>>>>>>>>>>>>> delegation of comments/discussion within the relevant context of the GitHub
>>>>>>>>>>>>>>> issues in 1
>>>>>>>>>>>>>>> 3.  PRs that fix the GitHub issues in 1
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Damon Douglas*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> damondouglas@google.com
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>

Re: Update PR description template

Posted by Yi Hu via dev <de...@beam.apache.org>.
Hi all,

I have drafted a document summarizing the ideas suggested in this thread.

https://docs.google.com/document/d/10FlXOo_hL2QYTPhwS8uHSyJbQCzwC3K3C12tFccANA8/edit#

where I extracted the following action items:

short-term (easy) items
- Split the linting task (e.g. java checkStyle) out from the test (java
percommit) step

mid-to-long-term items
- A single github action that indicates the PR is ready to merge
- Merging the PR through action

I am happy to volunteer to take on this task. Once we have decided the
direction(s) to proceed I am going to detail the design.

Regards,
Yi



On Tue, Jun 28, 2022 at 10:40 PM Ahmet Altay <al...@google.com> wrote:

> I am worried about adding our own custom tooling. They require
> maintenance, they introduce new flakiness, and so far we have not been
> great about maintaining custom infra.
>
> On Tue, Jun 28, 2022 at 1:36 PM Kenneth Knowles <ke...@apache.org> wrote:
>
>> Thanks for all the super useful info. I can add some experience from the
>> early days of Beam: Because the GitHub "merge" button did not exist, we
>> rolled our own "merge bot". The pros/cons below are not general to the
>> concept, but to our experience with it:
>>
>> PROs:
>>  - It single-threaded commits to the repo so there weren't race
>> conditions on test results. I think this will be OK for our scale for the
>> foreseeable future.
>>  - It re-ran tests after merge but before pushing to master, which is the
>> second half of eliminating that race condition.
>>
>> CONs:
>>  - It was very flaky and often just didn't stay up. We danced for joy
>> when it was gone.
>>  - It ran a kind of arbitrary set of tests that didn't match the PR
>> statuses. It did not have any filter on which tests were run during merge.
>> We were small enough that mostly just "run the tests" was specific enough.
>>  - It always squashed the commits and then pushed the squashed commit
>> with a comment "this closes #<pr>". This messes up PRs that have multiple
>> commits that should remain separate. But more importantly it made it
>> impossible to easily distinguish PRs that were merged versus those that
>> were closed without merge. And of course it is way harder to navigate
>> history when the commits on master have different hashes than the commits
>> that were authored.
>>  - Getting reasonable logs with error messages when a merge failed for
>> whatever reason was hard or impossible IIRC.
>>
>> So I think in what you have said there is an option to get the best of
>> all, something like:
>>
>>  - Do merges through an action. Merge commit or squash-and-merge would be
>> separate labels. Or not bother with squash and merge, instead using
>> heuristics to block PRs with bad commit histories.
>>  - Have the workflow check that all PR statuses are green before
>> continuing to merge.
>>
>
> This sounds like a reasonable process. I assume the only addition in this
> case would be another github action.
>
>
>>
>> Kenn
>>
>> On Tue, Jun 28, 2022 at 8:03 AM Danny McCormick <
>> dannymccormick@google.com> wrote:
>>
>>> After looking into this a little bit more, I need to revise my opinion
>>> on how we would do this; I don't think it's practical to have all required
>>> status checks via the .asf.yml file because those required checks can't be
>>> filtered by path (for example, if we want to require Python precommit on
>>> Python PRs, it would need to be required on *all *PRs). That's a GitHub
>>> limitation
>>> <https://docs.github.com/en/enterprise-cloud@latest/repositories/configuring-branches-and-merges-in-your-repository/defining-the-mergeability-of-pull-requests/troubleshooting-required-status-checks#handling-skipped-but-required-checks>,
>>> not an ASF one.
>>>
>>> One option is to write an action that makes sure no checks are failing
>>> (except maybe codecov?) and put a single required check on that. That would
>>> also make it easy to build in logic to override required checks like Robert
>>> suggested ("specific wording would have to be in the last comment"). We
>>> already have logic in the PR bot that does some of this
>>> <https://github.com/apache/beam/blob/master/scripts/ci/pr-bot/shared/checks.ts>
>>> .
>>>
>>> The downside to that approach is that it's not clear what the best way
>>> to trigger that workflow is since it has to run after all other checks have
>>> completed. We could have it trigger on some label (e.g. "ready to merge")
>>> and then automatically merge the PR when it's done or comment and remove
>>> the label if checks are failing/incomplete. This changes the workflow for
>>> committers from "click the merge button" to "add a label", but doesn't
>>> require significantly more action or oversight and is pretty similar to how
>>> Kubernetes <https://github.com/kubernetes/kubernetes/pull/110812> and
>>> some other large repos run things.
>>>
>>> Another option would be to trigger that check at the end of every
>>> Actions run and use the check_suite trigger for external runs
>>> (unfortunately actions doesn't trigger that). That would require a lot of
>>> boilerplate on every Actions workflow though. There are other similar
>>> options which may be cleaner that also require modifying every Actions
>>> workflow, but I don't love that option.
>>>
>>> I'm still in favor of doing this via the "ready to merge" label option I
>>> just described, but this has dampened my enthusiasm a little bit since we'd
>>> need to build out/maintain our own tooling.
>>>
>>> On Tue, Jun 28, 2022 at 8:49 AM Danny McCormick <
>>> dannymccormick@google.com> wrote:
>>>
>>>> Regarding code cov - it does check overall % coverage as well as the
>>>> coverage of the diff, but I don't think it's designed to be a blocking
>>>> metric. A good example of where this is not helpful is this pr to
>>>> remove dead code <https://github.com/apache/beam/pull/22019>. Because
>>>> it removes tested code, the pr lowers our coverage percentage and fails the
>>>> check, but it's a totally reasonable PR that doesn't need additional
>>>> testing (because it only removes functionality). Other classes of changes
>>>> that are problematic include things like improving hard to cover error
>>>> messages or tough to test proto changes which can only be covered in an
>>>> integration test that doesn't make it into codecov (these are both common
>>>> at least in the Go SDK).
>>>>
>>>> I'd be opposed to including that as a required check, though I support
>>>> adding pretty much every other suite.
>>>>
>>>> > I assume we would have a way to override it in case the repo gets
>>>> into a bad state (for example needing to modify the workflow definitions in
>>>> order to get them to pass).
>>>>
>>>> The way we make a check required is by adding it in the .asf.yml file
>>>> <https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Branchprotection>.
>>>> So we could always update that if needed to push an emergency fix as long
>>>> as we don't accidentally include any required checks on the .asf.yml
>>>> itself. If we mess that up we would need to bring in Infra to help.
>>>>
>>>> On Mon, Jun 27, 2022 at 7:59 PM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Jun 27, 2022 at 4:55 PM Robert Bradshaw <ro...@google.com>
>>>>> wrote:
>>>>>
>>>>>> On Mon, Jun 27, 2022 at 3:06 PM Kenneth Knowles <ke...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> I am in favor of requiring precommit green before merge. To make
>>>>>>> this better, we should:
>>>>>>>
>>>>>>> (1) run fewer irrelevant tests, for example almost every
>>>>>>> language-based presubmit tests vastly more code than could possibly be
>>>>>>> affected.
>>>>>>> (2) run more relevant tests, for example when an IO is touched
>>>>>>> running slightly more heavyweight integration tests before merge is
>>>>>>> reasonable.
>>>>>>>
>>>>>>
>>>>>> Does it have to be all or nothing? E.g. we could start with requiring
>>>>>> a smaller subset of tests that must pass, with the goal of eventually
>>>>>> requiring (almost?) everything.
>>>>>>
>>>>>>
>>>>>>> I worry that requiring total line coverage in unit tests /
>>>>>>> presubmits will encourage mock-based tests that don't demonstrate any
>>>>>>> meaningful correctness. But I am OK to try it.
>>>>>>>
>>>>>>
>>>>> IIRC, codecov test is checking that % coverage is not reduced with the
>>>>> new PR and it is not checking for total line coverage. I agree with you
>>>>> that total line coverage is probably not meaningful in most cases.
>>>>>
>>>>>
>>>>>>
>>>>>> +1. I think there are a class of presubmits (coverage among them)
>>>>>> that are advisory rather than binding.
>>>>>>
>>>>>
>>>>> +1 - especially if we can make the tools distinguish between required
>>>>> and advisory presubmits.
>>>>>
>>>>>
>>>>>>
>>>>>>> I assume we would have a way to override it in case the repo gets
>>>>>>> into a bad state (for example needing to modify the workflow definitions in
>>>>>>> order to get them to pass).
>>>>>>>
>>>>>>
>>>>>> +1. But making this explicit (e.g. specific wording would have to be
>>>>>> in the last comment) would be a good barrier to have even if it's not
>>>>>> absolute.
>>>>>>
>>>>>>
>>>>>>> On Mon, Jun 27, 2022 at 2:58 PM Ahmet Altay <al...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Just an opinion: If we have a precommit (including codecov) we
>>>>>>>> should be willing to enforce those passing before merging a PR. Selectively
>>>>>>>> applying when/which precommits would be blockers eventually leads to more
>>>>>>>> opinions. I think codecov is actually providing value by checking that
>>>>>>>> delta coverage is not dropped. In my experience, for PRs that fail the
>>>>>>>> codecov precommit there tends to be an opportunity for adding unit tests.
>>>>>>>>
>>>>>>>> On Mon, Jun 27, 2022 at 2:14 PM Danny McCormick <
>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>
>>>>>>>>> Brian, you keep beating my emails by an instant :) I think codecov
>>>>>>>>> provides value independent of blocking a PR - it is a good sanity check
>>>>>>>>> (e.g. if codecov drops from 50% to 5%) and it gives us a nice metric to
>>>>>>>>> track over time.
>>>>>>>>>
>>>>>>>>> On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick <
>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Yeah, codecov is the one exception I had in mind, there may be
>>>>>>>>>> others that are reasonable to exclude as well
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <
>>>>>>>>>> jrmccluskey@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> +1 for enforcing some precommit checks as required, but making
>>>>>>>>>>> them all required would be a bit of a nightmare with how Codecov works. Any
>>>>>>>>>>> Python or Go patch gets flagged as failing codecov if its changes aren't
>>>>>>>>>>> covered at or above the current overall coverage level at master (around
>>>>>>>>>>> 75% right now.) This is a nice goal, but ultimately not every change lends
>>>>>>>>>>> itself to that level of unit testing. But I would be in favor of things
>>>>>>>>>>> like the various ValidatesRunner suites being required.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Jack McCluskey
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <
>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Brian beat me by a few seconds, but I actually agree that there
>>>>>>>>>>>> is a problem here - when Damon's PR was submitted there was basically no
>>>>>>>>>>>> way of knowing that checks would break without having detailed knowledge of
>>>>>>>>>>>> how the build system works (so I'm on board with your
>>>>>>>>>>>> Subject/Background/Assessment). With that said, I disagree with the
>>>>>>>>>>>> recommended path forward - our automation failed us here and I don't think
>>>>>>>>>>>> the correct response is to make our manual validation better. That shifts
>>>>>>>>>>>> the burden of verifying correctness from robots (easy to validate,
>>>>>>>>>>>> basically free) to humans (expensive, forgetful, and hard to validate,
>>>>>>>>>>>> especially for newer contributors). Instead, I think we should fix the
>>>>>>>>>>>> underlying technical problems, which are IMO:
>>>>>>>>>>>>
>>>>>>>>>>>> 1) Flaky tests unrelated to the PR caused it to fail
>>>>>>>>>>>>
>>>>>>>>>>>
> I missed this part earlier. I would encourage all committers who are
> blocked on flaky tests to file a new github issue with a link to the flaked
> test before re-running the flaking test. Re-running tests until they pass
> also makes it easier to introduce new flaky tests.
>
>
>> 2) There were no guardrails in place to avoid merging a PR that wasn't
>>>>>>>>>>>> fully validated.
>>>>>>>>>>>>
>>>>>>>>>>>> Things that would've made this better which I think we should
>>>>>>>>>>>> focus our attention on:
>>>>>>>>>>>>
>>>>>>>>>>>> 1) (easy) Split the linting task out from the test step.
>>>>>>>>>>>> 2) This one is more controversial, but I would vote we start
>>>>>>>>>>>> enforcing some (or all?) precommit checks as required. I don't really blame
>>>>>>>>>>>> anyone for merging this PR since it had repeatedly run into unrelated
>>>>>>>>>>>> flakes, but that's kinda the point - we need to feel some of that pain to
>>>>>>>>>>>> avoid accidentally merging bad code. Blocking PRs on failed precommit
>>>>>>>>>>>> suites would force us to deal with these kinds of flakes and incentivize a
>>>>>>>>>>>> lot of good behavior in the repo (and disincentivize bad tests).
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Danny
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <
>>>>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Damon,
>>>>>>>>>>>>> I think our CI system was working as intended for your PR. We
>>>>>>>>>>>>> select which PreCommits to run based on the files that are edited, and the
>>>>>>>>>>>>> author/reviewer should collaborate to get those green before merging the
>>>>>>>>>>>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>>>>>>>>>>>> failing gradle tasks and run them locally to reproduce.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It looks like what broke down in this case was that the PR was
>>>>>>>>>>>>> accidentally merged despite the failing Java PreCommit run.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Brian
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <
>>>>>>>>>>>>> damondouglas@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I write this to prevent a recent experience I encountered
>>>>>>>>>>>>>> contributing to the Apache Beam repository.  To address a system issue, I
>>>>>>>>>>>>>> write in SBAR
>>>>>>>>>>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>>>>>>>>>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>>>>>>>>>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *Subject*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953>
>>>>>>>>>>>>>> that was subsequently reverted
>>>>>>>>>>>>>> <https://github.com/apache/beam/pull/22044>.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *Background*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Despite passing the following gradle tasks,
>>>>>>>>>>>>>> the PreCommit_Java_Commit failed with a test unrelated to
>>>>>>>>>>>>>> the code in the PR
>>>>>>>>>>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>>>>>>>>>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>>>>>>>>>>>> issues and failed style checks.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ./gradlew rat
>>>>>>>>>>>>>> ./gradlew spotlessCheck
>>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *Assessment*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Output from the following gradle commands did not surface
>>>>>>>>>>>>>> prior to the PR merge.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Both the aforementioned failing PreCommit_Java_Commit test
>>>>>>>>>>>>>> and a lack of knowledge contributed to the root cause.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *Recommendation / Proposal*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I propose updating PR template description
>>>>>>>>>>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>>>>>>>>>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>>>>>>>>>>>> PR.  Toward this aim:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1.  Four separate GitHub issues in
>>>>>>>>>>>>>> https://github.com/apache/beam representative of *common*,
>>>>>>>>>>>>>> and language specific: *Java*, *Python*, and *Go*,
>>>>>>>>>>>>>> prescribed gradle tasks
>>>>>>>>>>>>>> 2.  Don't reply to this email with specifics on how;
>>>>>>>>>>>>>> delegation of comments/discussion within the relevant context of the GitHub
>>>>>>>>>>>>>> issues in 1
>>>>>>>>>>>>>> 3.  PRs that fix the GitHub issues in 1
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *Damon Douglas*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> damondouglas@google.com
>>>>>>>>>>>>>>
>>>>>>>>>>>>>

Re: Update PR description template

Posted by Ahmet Altay <al...@google.com>.
I am worried about adding our own custom tooling. They require maintenance,
they introduce new flakiness, and so far we have not been great about
maintaining custom infra.

On Tue, Jun 28, 2022 at 1:36 PM Kenneth Knowles <ke...@apache.org> wrote:

> Thanks for all the super useful info. I can add some experience from the
> early days of Beam: Because the GitHub "merge" button did not exist, we
> rolled our own "merge bot". The pros/cons below are not general to the
> concept, but to our experience with it:
>
> PROs:
>  - It single-threaded commits to the repo so there weren't race conditions
> on test results. I think this will be OK for our scale for the foreseeable
> future.
>  - It re-ran tests after merge but before pushing to master, which is the
> second half of eliminating that race condition.
>
> CONs:
>  - It was very flaky and often just didn't stay up. We danced for joy when
> it was gone.
>  - It ran a kind of arbitrary set of tests that didn't match the PR
> statuses. It did not have any filter on which tests were run during merge.
> We were small enough that mostly just "run the tests" was specific enough.
>  - It always squashed the commits and then pushed the squashed commit with
> a comment "this closes #<pr>". This messes up PRs that have multiple
> commits that should remain separate. But more importantly it made it
> impossible to easily distinguish PRs that were merged versus those that
> were closed without merge. And of course it is way harder to navigate
> history when the commits on master have different hashes than the commits
> that were authored.
>  - Getting reasonable logs with error messages when a merge failed for
> whatever reason was hard or impossible IIRC.
>
> So I think in what you have said there is an option to get the best of
> all, something like:
>
>  - Do merges through an action. Merge commit or squash-and-merge would be
> separate labels. Or not bother with squash and merge, instead using
> heuristics to block PRs with bad commit histories.
>  - Have the workflow check that all PR statuses are green before
> continuing to merge.
>

This sounds like a reasonable process. I assume the only addition in this
case would be another github action.


>
> Kenn
>
> On Tue, Jun 28, 2022 at 8:03 AM Danny McCormick <da...@google.com>
> wrote:
>
>> After looking into this a little bit more, I need to revise my opinion on
>> how we would do this; I don't think it's practical to have all required
>> status checks via the .asf.yml file because those required checks can't be
>> filtered by path (for example, if we want to require Python precommit on
>> Python PRs, it would need to be required on *all *PRs). That's a GitHub
>> limitation
>> <https://docs.github.com/en/enterprise-cloud@latest/repositories/configuring-branches-and-merges-in-your-repository/defining-the-mergeability-of-pull-requests/troubleshooting-required-status-checks#handling-skipped-but-required-checks>,
>> not an ASF one.
>>
>> One option is to write an action that makes sure no checks are failing
>> (except maybe codecov?) and put a single required check on that. That would
>> also make it easy to build in logic to override required checks like Robert
>> suggested ("specific wording would have to be in the last comment"). We
>> already have logic in the PR bot that does some of this
>> <https://github.com/apache/beam/blob/master/scripts/ci/pr-bot/shared/checks.ts>
>> .
>>
>> The downside to that approach is that it's not clear what the best way to
>> trigger that workflow is since it has to run after all other checks have
>> completed. We could have it trigger on some label (e.g. "ready to merge")
>> and then automatically merge the PR when it's done or comment and remove
>> the label if checks are failing/incomplete. This changes the workflow for
>> committers from "click the merge button" to "add a label", but doesn't
>> require significantly more action or oversight and is pretty similar to how
>> Kubernetes <https://github.com/kubernetes/kubernetes/pull/110812> and
>> some other large repos run things.
>>
>> Another option would be to trigger that check at the end of every Actions
>> run and use the check_suite trigger for external runs (unfortunately
>> actions doesn't trigger that). That would require a lot of boilerplate on
>> every Actions workflow though. There are other similar options which may be
>> cleaner that also require modifying every Actions workflow, but I don't
>> love that option.
>>
>> I'm still in favor of doing this via the "ready to merge" label option I
>> just described, but this has dampened my enthusiasm a little bit since we'd
>> need to build out/maintain our own tooling.
>>
>> On Tue, Jun 28, 2022 at 8:49 AM Danny McCormick <
>> dannymccormick@google.com> wrote:
>>
>>> Regarding code cov - it does check overall % coverage as well as the
>>> coverage of the diff, but I don't think it's designed to be a blocking
>>> metric. A good example of where this is not helpful is this pr to
>>> remove dead code <https://github.com/apache/beam/pull/22019>. Because
>>> it removes tested code, the pr lowers our coverage percentage and fails the
>>> check, but it's a totally reasonable PR that doesn't need additional
>>> testing (because it only removes functionality). Other classes of changes
>>> that are problematic include things like improving hard to cover error
>>> messages or tough to test proto changes which can only be covered in an
>>> integration test that doesn't make it into codecov (these are both common
>>> at least in the Go SDK).
>>>
>>> I'd be opposed to including that as a required check, though I support
>>> adding pretty much every other suite.
>>>
>>> > I assume we would have a way to override it in case the repo gets into
>>> a bad state (for example needing to modify the workflow definitions in
>>> order to get them to pass).
>>>
>>> The way we make a check required is by adding it in the .asf.yml file
>>> <https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Branchprotection>.
>>> So we could always update that if needed to push an emergency fix as long
>>> as we don't accidentally include any required checks on the .asf.yml
>>> itself. If we mess that up we would need to bring in Infra to help.
>>>
>>> On Mon, Jun 27, 2022 at 7:59 PM Ahmet Altay <al...@google.com> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Jun 27, 2022 at 4:55 PM Robert Bradshaw <ro...@google.com>
>>>> wrote:
>>>>
>>>>> On Mon, Jun 27, 2022 at 3:06 PM Kenneth Knowles <ke...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> I am in favor of requiring precommit green before merge. To make this
>>>>>> better, we should:
>>>>>>
>>>>>> (1) run fewer irrelevant tests, for example almost every
>>>>>> language-based presubmit tests vastly more code than could possibly be
>>>>>> affected.
>>>>>> (2) run more relevant tests, for example when an IO is touched
>>>>>> running slightly more heavyweight integration tests before merge is
>>>>>> reasonable.
>>>>>>
>>>>>
>>>>> Does it have to be all or nothing? E.g. we could start with requiring
>>>>> a smaller subset of tests that must pass, with the goal of eventually
>>>>> requiring (almost?) everything.
>>>>>
>>>>>
>>>>>> I worry that requiring total line coverage in unit tests / presubmits
>>>>>> will encourage mock-based tests that don't demonstrate any meaningful
>>>>>> correctness. But I am OK to try it.
>>>>>>
>>>>>
>>>> IIRC, codecov test is checking that % coverage is not reduced with the
>>>> new PR and it is not checking for total line coverage. I agree with you
>>>> that total line coverage is probably not meaningful in most cases.
>>>>
>>>>
>>>>>
>>>>> +1. I think there are a class of presubmits (coverage among them) that
>>>>> are advisory rather than binding.
>>>>>
>>>>
>>>> +1 - especially if we can make the tools distinguish between required
>>>> and advisory presubmits.
>>>>
>>>>
>>>>>
>>>>>> I assume we would have a way to override it in case the repo gets
>>>>>> into a bad state (for example needing to modify the workflow definitions in
>>>>>> order to get them to pass).
>>>>>>
>>>>>
>>>>> +1. But making this explicit (e.g. specific wording would have to be
>>>>> in the last comment) would be a good barrier to have even if it's not
>>>>> absolute.
>>>>>
>>>>>
>>>>>> On Mon, Jun 27, 2022 at 2:58 PM Ahmet Altay <al...@google.com> wrote:
>>>>>>
>>>>>>> Just an opinion: If we have a precommit (including codecov) we
>>>>>>> should be willing to enforce those passing before merging a PR. Selectively
>>>>>>> applying when/which precommits would be blockers eventually leads to more
>>>>>>> opinions. I think codecov is actually providing value by checking that
>>>>>>> delta coverage is not dropped. In my experience, for PRs that fail the
>>>>>>> codecov precommit there tends to be an opportunity for adding unit tests.
>>>>>>>
>>>>>>> On Mon, Jun 27, 2022 at 2:14 PM Danny McCormick <
>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>
>>>>>>>> Brian, you keep beating my emails by an instant :) I think codecov
>>>>>>>> provides value independent of blocking a PR - it is a good sanity check
>>>>>>>> (e.g. if codecov drops from 50% to 5%) and it gives us a nice metric to
>>>>>>>> track over time.
>>>>>>>>
>>>>>>>> On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick <
>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>
>>>>>>>>> Yeah, codecov is the one exception I had in mind, there may be
>>>>>>>>> others that are reasonable to exclude as well
>>>>>>>>>
>>>>>>>>> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <
>>>>>>>>> jrmccluskey@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> +1 for enforcing some precommit checks as required, but making
>>>>>>>>>> them all required would be a bit of a nightmare with how Codecov works. Any
>>>>>>>>>> Python or Go patch gets flagged as failing codecov if its changes aren't
>>>>>>>>>> covered at or above the current overall coverage level at master (around
>>>>>>>>>> 75% right now.) This is a nice goal, but ultimately not every change lends
>>>>>>>>>> itself to that level of unit testing. But I would be in favor of things
>>>>>>>>>> like the various ValidatesRunner suites being required.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Jack McCluskey
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <
>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Brian beat me by a few seconds, but I actually agree that there
>>>>>>>>>>> is a problem here - when Damon's PR was submitted there was basically no
>>>>>>>>>>> way of knowing that checks would break without having detailed knowledge of
>>>>>>>>>>> how the build system works (so I'm on board with your
>>>>>>>>>>> Subject/Background/Assessment). With that said, I disagree with the
>>>>>>>>>>> recommended path forward - our automation failed us here and I don't think
>>>>>>>>>>> the correct response is to make our manual validation better. That shifts
>>>>>>>>>>> the burden of verifying correctness from robots (easy to validate,
>>>>>>>>>>> basically free) to humans (expensive, forgetful, and hard to validate,
>>>>>>>>>>> especially for newer contributors). Instead, I think we should fix the
>>>>>>>>>>> underlying technical problems, which are IMO:
>>>>>>>>>>>
>>>>>>>>>>> 1) Flaky tests unrelated to the PR caused it to fail
>>>>>>>>>>>
>>>>>>>>>>
I missed this part earlier. I would encourage all committers who are
blocked on flaky tests to file a new github issue with a link to the flaked
test before re-running the flaking test. Re-running tests until they pass
also makes it easier to introduce new flaky tests.


> 2) There were no guardrails in place to avoid merging a PR that wasn't
>>>>>>>>>>> fully validated.
>>>>>>>>>>>
>>>>>>>>>>> Things that would've made this better which I think we should
>>>>>>>>>>> focus our attention on:
>>>>>>>>>>>
>>>>>>>>>>> 1) (easy) Split the linting task out from the test step.
>>>>>>>>>>> 2) This one is more controversial, but I would vote we start
>>>>>>>>>>> enforcing some (or all?) precommit checks as required. I don't really blame
>>>>>>>>>>> anyone for merging this PR since it had repeatedly run into unrelated
>>>>>>>>>>> flakes, but that's kinda the point - we need to feel some of that pain to
>>>>>>>>>>> avoid accidentally merging bad code. Blocking PRs on failed precommit
>>>>>>>>>>> suites would force us to deal with these kinds of flakes and incentivize a
>>>>>>>>>>> lot of good behavior in the repo (and disincentivize bad tests).
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Danny
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <
>>>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Damon,
>>>>>>>>>>>> I think our CI system was working as intended for your PR. We
>>>>>>>>>>>> select which PreCommits to run based on the files that are edited, and the
>>>>>>>>>>>> author/reviewer should collaborate to get those green before merging the
>>>>>>>>>>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>>>>>>>>>>> failing gradle tasks and run them locally to reproduce.
>>>>>>>>>>>>
>>>>>>>>>>>> It looks like what broke down in this case was that the PR was
>>>>>>>>>>>> accidentally merged despite the failing Java PreCommit run.
>>>>>>>>>>>>
>>>>>>>>>>>> Brian
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <
>>>>>>>>>>>> damondouglas@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I write this to prevent a recent experience I encountered
>>>>>>>>>>>>> contributing to the Apache Beam repository.  To address a system issue, I
>>>>>>>>>>>>> write in SBAR
>>>>>>>>>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>>>>>>>>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>>>>>>>>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Subject*
>>>>>>>>>>>>>
>>>>>>>>>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953>
>>>>>>>>>>>>> that was subsequently reverted
>>>>>>>>>>>>> <https://github.com/apache/beam/pull/22044>.
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Background*
>>>>>>>>>>>>>
>>>>>>>>>>>>> Despite passing the following gradle tasks,
>>>>>>>>>>>>> the PreCommit_Java_Commit failed with a test unrelated to the
>>>>>>>>>>>>> code in the PR
>>>>>>>>>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>>>>>>>>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>>>>>>>>>>> issues and failed style checks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> ./gradlew rat
>>>>>>>>>>>>> ./gradlew spotlessCheck
>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Assessment*
>>>>>>>>>>>>>
>>>>>>>>>>>>> Output from the following gradle commands did not surface
>>>>>>>>>>>>> prior to the PR merge.
>>>>>>>>>>>>>
>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>>>>>>>>>>
>>>>>>>>>>>>> Both the aforementioned failing PreCommit_Java_Commit test and
>>>>>>>>>>>>> a lack of knowledge contributed to the root cause.
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Recommendation / Proposal*
>>>>>>>>>>>>>
>>>>>>>>>>>>> I propose updating PR template description
>>>>>>>>>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>>>>>>>>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>>>>>>>>>>> PR.  Toward this aim:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1.  Four separate GitHub issues in
>>>>>>>>>>>>> https://github.com/apache/beam representative of *common*,
>>>>>>>>>>>>> and language specific: *Java*, *Python*, and *Go*, prescribed
>>>>>>>>>>>>> gradle tasks
>>>>>>>>>>>>> 2.  Don't reply to this email with specifics on how;
>>>>>>>>>>>>> delegation of comments/discussion within the relevant context of the GitHub
>>>>>>>>>>>>> issues in 1
>>>>>>>>>>>>> 3.  PRs that fix the GitHub issues in 1
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Damon Douglas*
>>>>>>>>>>>>>
>>>>>>>>>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>>>>>>>>>>
>>>>>>>>>>>>> damondouglas@google.com
>>>>>>>>>>>>>
>>>>>>>>>>>>

Re: Update PR description template

Posted by Kenneth Knowles <ke...@apache.org>.
Thanks for all the super useful info. I can add some experience from the
early days of Beam: Because the GitHub "merge" button did not exist, we
rolled our own "merge bot". The pros/cons below are not general to the
concept, but to our experience with it:

PROs:
 - It single-threaded commits to the repo so there weren't race conditions
on test results. I think this will be OK for our scale for the foreseeable
future.
 - It re-ran tests after merge but before pushing to master, which is the
second half of eliminating that race condition.

CONs:
 - It was very flaky and often just didn't stay up. We danced for joy when
it was gone.
 - It ran a kind of arbitrary set of tests that didn't match the PR
statuses. It did not have any filter on which tests were run during merge.
We were small enough that mostly just "run the tests" was specific enough.
 - It always squashed the commits and then pushed the squashed commit with
a comment "this closes #<pr>". This messes up PRs that have multiple
commits that should remain separate. But more importantly it made it
impossible to easily distinguish PRs that were merged versus those that
were closed without merge. And of course it is way harder to navigate
history when the commits on master have different hashes than the commits
that were authored.
 - Getting reasonable logs with error messages when a merge failed for
whatever reason was hard or impossible IIRC.

So I think in what you have said there is an option to get the best of all,
something like:

 - Do merges through an action. Merge commit or squash-and-merge would be
separate labels. Or not bother with squash and merge, instead using
heuristics to block PRs with bad commit histories.
 - Have the workflow check that all PR statuses are green before continuing
to merge.

Kenn

On Tue, Jun 28, 2022 at 8:03 AM Danny McCormick <da...@google.com>
wrote:

> After looking into this a little bit more, I need to revise my opinion on
> how we would do this; I don't think it's practical to have all required
> status checks via the .asf.yml file because those required checks can't be
> filtered by path (for example, if we want to require Python precommit on
> Python PRs, it would need to be required on *all *PRs). That's a GitHub
> limitation
> <https://docs.github.com/en/enterprise-cloud@latest/repositories/configuring-branches-and-merges-in-your-repository/defining-the-mergeability-of-pull-requests/troubleshooting-required-status-checks#handling-skipped-but-required-checks>,
> not an ASF one.
>
> One option is to write an action that makes sure no checks are failing
> (except maybe codecov?) and put a single required check on that. That would
> also make it easy to build in logic to override required checks like Robert
> suggested ("specific wording would have to be in the last comment"). We
> already have logic in the PR bot that does some of this
> <https://github.com/apache/beam/blob/master/scripts/ci/pr-bot/shared/checks.ts>
> .
>
> The downside to that approach is that it's not clear what the best way to
> trigger that workflow is since it has to run after all other checks have
> completed. We could have it trigger on some label (e.g. "ready to merge")
> and then automatically merge the PR when it's done or comment and remove
> the label if checks are failing/incomplete. This changes the workflow for
> committers from "click the merge button" to "add a label", but doesn't
> require significantly more action or oversight and is pretty similar to how
> Kubernetes <https://github.com/kubernetes/kubernetes/pull/110812> and
> some other large repos run things.
>
> Another option would be to trigger that check at the end of every Actions
> run and use the check_suite trigger for external runs (unfortunately
> actions doesn't trigger that). That would require a lot of boilerplate on
> every Actions workflow though. There are other similar options which may be
> cleaner that also require modifying every Actions workflow, but I don't
> love that option.
>
> I'm still in favor of doing this via the "ready to merge" label option I
> just described, but this has dampened my enthusiasm a little bit since we'd
> need to build out/maintain our own tooling.
>
> On Tue, Jun 28, 2022 at 8:49 AM Danny McCormick <da...@google.com>
> wrote:
>
>> Regarding code cov - it does check overall % coverage as well as the
>> coverage of the diff, but I don't think it's designed to be a blocking
>> metric. A good example of where this is not helpful is this pr to remove
>> dead code <https://github.com/apache/beam/pull/22019>. Because it
>> removes tested code, the pr lowers our coverage percentage and fails the
>> check, but it's a totally reasonable PR that doesn't need additional
>> testing (because it only removes functionality). Other classes of changes
>> that are problematic include things like improving hard to cover error
>> messages or tough to test proto changes which can only be covered in an
>> integration test that doesn't make it into codecov (these are both common
>> at least in the Go SDK).
>>
>> I'd be opposed to including that as a required check, though I support
>> adding pretty much every other suite.
>>
>> > I assume we would have a way to override it in case the repo gets into
>> a bad state (for example needing to modify the workflow definitions in
>> order to get them to pass).
>>
>> The way we make a check required is by adding it in the .asf.yml file
>> <https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Branchprotection>.
>> So we could always update that if needed to push an emergency fix as long
>> as we don't accidentally include any required checks on the .asf.yml
>> itself. If we mess that up we would need to bring in Infra to help.
>>
>> On Mon, Jun 27, 2022 at 7:59 PM Ahmet Altay <al...@google.com> wrote:
>>
>>>
>>>
>>> On Mon, Jun 27, 2022 at 4:55 PM Robert Bradshaw <ro...@google.com>
>>> wrote:
>>>
>>>> On Mon, Jun 27, 2022 at 3:06 PM Kenneth Knowles <ke...@apache.org>
>>>> wrote:
>>>>
>>>>> I am in favor of requiring precommit green before merge. To make this
>>>>> better, we should:
>>>>>
>>>>> (1) run fewer irrelevant tests, for example almost every
>>>>> language-based presubmit tests vastly more code than could possibly be
>>>>> affected.
>>>>> (2) run more relevant tests, for example when an IO is touched running
>>>>> slightly more heavyweight integration tests before merge is reasonable.
>>>>>
>>>>
>>>> Does it have to be all or nothing? E.g. we could start with requiring a
>>>> smaller subset of tests that must pass, with the goal of eventually
>>>> requiring (almost?) everything.
>>>>
>>>>
>>>>> I worry that requiring total line coverage in unit tests / presubmits
>>>>> will encourage mock-based tests that don't demonstrate any meaningful
>>>>> correctness. But I am OK to try it.
>>>>>
>>>>
>>> IIRC, codecov test is checking that % coverage is not reduced with the
>>> new PR and it is not checking for total line coverage. I agree with you
>>> that total line coverage is probably not meaningful in most cases.
>>>
>>>
>>>>
>>>> +1. I think there are a class of presubmits (coverage among them) that
>>>> are advisory rather than binding.
>>>>
>>>
>>> +1 - especially if we can make the tools distinguish between required
>>> and advisory presubmits.
>>>
>>>
>>>>
>>>>> I assume we would have a way to override it in case the repo gets into
>>>>> a bad state (for example needing to modify the workflow definitions in
>>>>> order to get them to pass).
>>>>>
>>>>
>>>> +1. But making this explicit (e.g. specific wording would have to be in
>>>> the last comment) would be a good barrier to have even if it's not
>>>> absolute.
>>>>
>>>>
>>>>> On Mon, Jun 27, 2022 at 2:58 PM Ahmet Altay <al...@google.com> wrote:
>>>>>
>>>>>> Just an opinion: If we have a precommit (including codecov) we should
>>>>>> be willing to enforce those passing before merging a PR. Selectively
>>>>>> applying when/which precommits would be blockers eventually leads to more
>>>>>> opinions. I think codecov is actually providing value by checking that
>>>>>> delta coverage is not dropped. In my experience, for PRs that fail the
>>>>>> codecov precommit there tends to be an opportunity for adding unit tests.
>>>>>>
>>>>>> On Mon, Jun 27, 2022 at 2:14 PM Danny McCormick <
>>>>>> dannymccormick@google.com> wrote:
>>>>>>
>>>>>>> Brian, you keep beating my emails by an instant :) I think codecov
>>>>>>> provides value independent of blocking a PR - it is a good sanity check
>>>>>>> (e.g. if codecov drops from 50% to 5%) and it gives us a nice metric to
>>>>>>> track over time.
>>>>>>>
>>>>>>> On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick <
>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>
>>>>>>>> Yeah, codecov is the one exception I had in mind, there may be
>>>>>>>> others that are reasonable to exclude as well
>>>>>>>>
>>>>>>>> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <
>>>>>>>> jrmccluskey@google.com> wrote:
>>>>>>>>
>>>>>>>>> +1 for enforcing some precommit checks as required, but making
>>>>>>>>> them all required would be a bit of a nightmare with how Codecov works. Any
>>>>>>>>> Python or Go patch gets flagged as failing codecov if its changes aren't
>>>>>>>>> covered at or above the current overall coverage level at master (around
>>>>>>>>> 75% right now.) This is a nice goal, but ultimately not every change lends
>>>>>>>>> itself to that level of unit testing. But I would be in favor of things
>>>>>>>>> like the various ValidatesRunner suites being required.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Jack McCluskey
>>>>>>>>>
>>>>>>>>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <
>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Brian beat me by a few seconds, but I actually agree that there
>>>>>>>>>> is a problem here - when Damon's PR was submitted there was basically no
>>>>>>>>>> way of knowing that checks would break without having detailed knowledge of
>>>>>>>>>> how the build system works (so I'm on board with your
>>>>>>>>>> Subject/Background/Assessment). With that said, I disagree with the
>>>>>>>>>> recommended path forward - our automation failed us here and I don't think
>>>>>>>>>> the correct response is to make our manual validation better. That shifts
>>>>>>>>>> the burden of verifying correctness from robots (easy to validate,
>>>>>>>>>> basically free) to humans (expensive, forgetful, and hard to validate,
>>>>>>>>>> especially for newer contributors). Instead, I think we should fix the
>>>>>>>>>> underlying technical problems, which are IMO:
>>>>>>>>>>
>>>>>>>>>> 1) Flaky tests unrelated to the PR caused it to fail
>>>>>>>>>> 2) There were no guardrails in place to avoid merging a PR that
>>>>>>>>>> wasn't fully validated.
>>>>>>>>>>
>>>>>>>>>> Things that would've made this better which I think we should
>>>>>>>>>> focus our attention on:
>>>>>>>>>>
>>>>>>>>>> 1) (easy) Split the linting task out from the test step.
>>>>>>>>>> 2) This one is more controversial, but I would vote we start
>>>>>>>>>> enforcing some (or all?) precommit checks as required. I don't really blame
>>>>>>>>>> anyone for merging this PR since it had repeatedly run into unrelated
>>>>>>>>>> flakes, but that's kinda the point - we need to feel some of that pain to
>>>>>>>>>> avoid accidentally merging bad code. Blocking PRs on failed precommit
>>>>>>>>>> suites would force us to deal with these kinds of flakes and incentivize a
>>>>>>>>>> lot of good behavior in the repo (and disincentivize bad tests).
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Danny
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <
>>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Damon,
>>>>>>>>>>> I think our CI system was working as intended for your PR. We
>>>>>>>>>>> select which PreCommits to run based on the files that are edited, and the
>>>>>>>>>>> author/reviewer should collaborate to get those green before merging the
>>>>>>>>>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>>>>>>>>>> failing gradle tasks and run them locally to reproduce.
>>>>>>>>>>>
>>>>>>>>>>> It looks like what broke down in this case was that the PR was
>>>>>>>>>>> accidentally merged despite the failing Java PreCommit run.
>>>>>>>>>>>
>>>>>>>>>>> Brian
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <
>>>>>>>>>>> damondouglas@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I write this to prevent a recent experience I encountered
>>>>>>>>>>>> contributing to the Apache Beam repository.  To address a system issue, I
>>>>>>>>>>>> write in SBAR
>>>>>>>>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>>>>>>>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>>>>>>>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>>>>>>>>>
>>>>>>>>>>>> *Subject*
>>>>>>>>>>>>
>>>>>>>>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953>
>>>>>>>>>>>> that was subsequently reverted
>>>>>>>>>>>> <https://github.com/apache/beam/pull/22044>.
>>>>>>>>>>>>
>>>>>>>>>>>> *Background*
>>>>>>>>>>>>
>>>>>>>>>>>> Despite passing the following gradle tasks,
>>>>>>>>>>>> the PreCommit_Java_Commit failed with a test unrelated to the
>>>>>>>>>>>> code in the PR
>>>>>>>>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>>>>>>>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>>>>>>>>>> issues and failed style checks.
>>>>>>>>>>>>
>>>>>>>>>>>> ./gradlew rat
>>>>>>>>>>>> ./gradlew spotlessCheck
>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>>>>>>>>>
>>>>>>>>>>>> *Assessment*
>>>>>>>>>>>>
>>>>>>>>>>>> Output from the following gradle commands did not surface prior
>>>>>>>>>>>> to the PR merge.
>>>>>>>>>>>>
>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>>>>>>>>>
>>>>>>>>>>>> Both the aforementioned failing PreCommit_Java_Commit test and
>>>>>>>>>>>> a lack of knowledge contributed to the root cause.
>>>>>>>>>>>>
>>>>>>>>>>>> *Recommendation / Proposal*
>>>>>>>>>>>>
>>>>>>>>>>>> I propose updating PR template description
>>>>>>>>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>>>>>>>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>>>>>>>>>> PR.  Toward this aim:
>>>>>>>>>>>>
>>>>>>>>>>>> 1.  Four separate GitHub issues in
>>>>>>>>>>>> https://github.com/apache/beam representative of *common*, and
>>>>>>>>>>>> language specific: *Java*, *Python*, and *Go*, prescribed
>>>>>>>>>>>> gradle tasks
>>>>>>>>>>>> 2.  Don't reply to this email with specifics on how; delegation
>>>>>>>>>>>> of comments/discussion within the relevant context of the GitHub issues in 1
>>>>>>>>>>>> 3.  PRs that fix the GitHub issues in 1
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> *Damon Douglas*
>>>>>>>>>>>>
>>>>>>>>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>>>>>>>>>
>>>>>>>>>>>> damondouglas@google.com
>>>>>>>>>>>>
>>>>>>>>>>>

Re: Update PR description template

Posted by Danny McCormick <da...@google.com>.
After looking into this a little bit more, I need to revise my opinion on
how we would do this; I don't think it's practical to have all required
status checks via the .asf.yml file because those required checks can't be
filtered by path (for example, if we want to require Python precommit on
Python PRs, it would need to be required on *all *PRs). That's a GitHub
limitation
<https://docs.github.com/en/enterprise-cloud@latest/repositories/configuring-branches-and-merges-in-your-repository/defining-the-mergeability-of-pull-requests/troubleshooting-required-status-checks#handling-skipped-but-required-checks>,
not an ASF one.

One option is to write an action that makes sure no checks are failing
(except maybe codecov?) and put a single required check on that. That would
also make it easy to build in logic to override required checks like Robert
suggested ("specific wording would have to be in the last comment"). We
already have logic in the PR bot that does some of this
<https://github.com/apache/beam/blob/master/scripts/ci/pr-bot/shared/checks.ts>
.

The downside to that approach is that it's not clear what the best way to
trigger that workflow is since it has to run after all other checks have
completed. We could have it trigger on some label (e.g. "ready to merge")
and then automatically merge the PR when it's done or comment and remove
the label if checks are failing/incomplete. This changes the workflow for
committers from "click the merge button" to "add a label", but doesn't
require significantly more action or oversight and is pretty similar to how
Kubernetes <https://github.com/kubernetes/kubernetes/pull/110812> and some
other large repos run things.

Another option would be to trigger that check at the end of every Actions
run and use the check_suite trigger for external runs (unfortunately
actions doesn't trigger that). That would require a lot of boilerplate on
every Actions workflow though. There are other similar options which may be
cleaner that also require modifying every Actions workflow, but I don't
love that option.

I'm still in favor of doing this via the "ready to merge" label option I
just described, but this has dampened my enthusiasm a little bit since we'd
need to build out/maintain our own tooling.

On Tue, Jun 28, 2022 at 8:49 AM Danny McCormick <da...@google.com>
wrote:

> Regarding code cov - it does check overall % coverage as well as the
> coverage of the diff, but I don't think it's designed to be a blocking
> metric. A good example of where this is not helpful is this pr to remove
> dead code <https://github.com/apache/beam/pull/22019>. Because it removes
> tested code, the pr lowers our coverage percentage and fails the check, but
> it's a totally reasonable PR that doesn't need additional testing (because
> it only removes functionality). Other classes of changes that are
> problematic include things like improving hard to cover error messages or
> tough to test proto changes which can only be covered in an integration
> test that doesn't make it into codecov (these are both common at least in
> the Go SDK).
>
> I'd be opposed to including that as a required check, though I support
> adding pretty much every other suite.
>
> > I assume we would have a way to override it in case the repo gets into a
> bad state (for example needing to modify the workflow definitions in order
> to get them to pass).
>
> The way we make a check required is by adding it in the .asf.yml file
> <https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Branchprotection>.
> So we could always update that if needed to push an emergency fix as long
> as we don't accidentally include any required checks on the .asf.yml
> itself. If we mess that up we would need to bring in Infra to help.
>
> On Mon, Jun 27, 2022 at 7:59 PM Ahmet Altay <al...@google.com> wrote:
>
>>
>>
>> On Mon, Jun 27, 2022 at 4:55 PM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> On Mon, Jun 27, 2022 at 3:06 PM Kenneth Knowles <ke...@apache.org> wrote:
>>>
>>>> I am in favor of requiring precommit green before merge. To make this
>>>> better, we should:
>>>>
>>>> (1) run fewer irrelevant tests, for example almost every language-based
>>>> presubmit tests vastly more code than could possibly be affected.
>>>> (2) run more relevant tests, for example when an IO is touched running
>>>> slightly more heavyweight integration tests before merge is reasonable.
>>>>
>>>
>>> Does it have to be all or nothing? E.g. we could start with requiring a
>>> smaller subset of tests that must pass, with the goal of eventually
>>> requiring (almost?) everything.
>>>
>>>
>>>> I worry that requiring total line coverage in unit tests / presubmits
>>>> will encourage mock-based tests that don't demonstrate any meaningful
>>>> correctness. But I am OK to try it.
>>>>
>>>
>> IIRC, codecov test is checking that % coverage is not reduced with the
>> new PR and it is not checking for total line coverage. I agree with you
>> that total line coverage is probably not meaningful in most cases.
>>
>>
>>>
>>> +1. I think there are a class of presubmits (coverage among them) that
>>> are advisory rather than binding.
>>>
>>
>> +1 - especially if we can make the tools distinguish between required and
>> advisory presubmits.
>>
>>
>>>
>>>> I assume we would have a way to override it in case the repo gets into
>>>> a bad state (for example needing to modify the workflow definitions in
>>>> order to get them to pass).
>>>>
>>>
>>> +1. But making this explicit (e.g. specific wording would have to be in
>>> the last comment) would be a good barrier to have even if it's not
>>> absolute.
>>>
>>>
>>>> On Mon, Jun 27, 2022 at 2:58 PM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>> Just an opinion: If we have a precommit (including codecov) we should
>>>>> be willing to enforce those passing before merging a PR. Selectively
>>>>> applying when/which precommits would be blockers eventually leads to more
>>>>> opinions. I think codecov is actually providing value by checking that
>>>>> delta coverage is not dropped. In my experience, for PRs that fail the
>>>>> codecov precommit there tends to be an opportunity for adding unit tests.
>>>>>
>>>>> On Mon, Jun 27, 2022 at 2:14 PM Danny McCormick <
>>>>> dannymccormick@google.com> wrote:
>>>>>
>>>>>> Brian, you keep beating my emails by an instant :) I think codecov
>>>>>> provides value independent of blocking a PR - it is a good sanity check
>>>>>> (e.g. if codecov drops from 50% to 5%) and it gives us a nice metric to
>>>>>> track over time.
>>>>>>
>>>>>> On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick <
>>>>>> dannymccormick@google.com> wrote:
>>>>>>
>>>>>>> Yeah, codecov is the one exception I had in mind, there may be
>>>>>>> others that are reasonable to exclude as well
>>>>>>>
>>>>>>> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <
>>>>>>> jrmccluskey@google.com> wrote:
>>>>>>>
>>>>>>>> +1 for enforcing some precommit checks as required, but making them
>>>>>>>> all required would be a bit of a nightmare with how Codecov works. Any
>>>>>>>> Python or Go patch gets flagged as failing codecov if its changes aren't
>>>>>>>> covered at or above the current overall coverage level at master (around
>>>>>>>> 75% right now.) This is a nice goal, but ultimately not every change lends
>>>>>>>> itself to that level of unit testing. But I would be in favor of things
>>>>>>>> like the various ValidatesRunner suites being required.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Jack McCluskey
>>>>>>>>
>>>>>>>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <
>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>
>>>>>>>>> Brian beat me by a few seconds, but I actually agree that there is
>>>>>>>>> a problem here - when Damon's PR was submitted there was basically no way
>>>>>>>>> of knowing that checks would break without having detailed knowledge of how
>>>>>>>>> the build system works (so I'm on board with your
>>>>>>>>> Subject/Background/Assessment). With that said, I disagree with the
>>>>>>>>> recommended path forward - our automation failed us here and I don't think
>>>>>>>>> the correct response is to make our manual validation better. That shifts
>>>>>>>>> the burden of verifying correctness from robots (easy to validate,
>>>>>>>>> basically free) to humans (expensive, forgetful, and hard to validate,
>>>>>>>>> especially for newer contributors). Instead, I think we should fix the
>>>>>>>>> underlying technical problems, which are IMO:
>>>>>>>>>
>>>>>>>>> 1) Flaky tests unrelated to the PR caused it to fail
>>>>>>>>> 2) There were no guardrails in place to avoid merging a PR that
>>>>>>>>> wasn't fully validated.
>>>>>>>>>
>>>>>>>>> Things that would've made this better which I think we should
>>>>>>>>> focus our attention on:
>>>>>>>>>
>>>>>>>>> 1) (easy) Split the linting task out from the test step.
>>>>>>>>> 2) This one is more controversial, but I would vote we start
>>>>>>>>> enforcing some (or all?) precommit checks as required. I don't really blame
>>>>>>>>> anyone for merging this PR since it had repeatedly run into unrelated
>>>>>>>>> flakes, but that's kinda the point - we need to feel some of that pain to
>>>>>>>>> avoid accidentally merging bad code. Blocking PRs on failed precommit
>>>>>>>>> suites would force us to deal with these kinds of flakes and incentivize a
>>>>>>>>> lot of good behavior in the repo (and disincentivize bad tests).
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Danny
>>>>>>>>>
>>>>>>>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <bh...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Damon,
>>>>>>>>>> I think our CI system was working as intended for your PR. We
>>>>>>>>>> select which PreCommits to run based on the files that are edited, and the
>>>>>>>>>> author/reviewer should collaborate to get those green before merging the
>>>>>>>>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>>>>>>>>> failing gradle tasks and run them locally to reproduce.
>>>>>>>>>>
>>>>>>>>>> It looks like what broke down in this case was that the PR was
>>>>>>>>>> accidentally merged despite the failing Java PreCommit run.
>>>>>>>>>>
>>>>>>>>>> Brian
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <
>>>>>>>>>> damondouglas@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I write this to prevent a recent experience I encountered
>>>>>>>>>>> contributing to the Apache Beam repository.  To address a system issue, I
>>>>>>>>>>> write in SBAR
>>>>>>>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>>>>>>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>>>>>>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>>>>>>>>
>>>>>>>>>>> *Subject*
>>>>>>>>>>>
>>>>>>>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953>
>>>>>>>>>>> that was subsequently reverted
>>>>>>>>>>> <https://github.com/apache/beam/pull/22044>.
>>>>>>>>>>>
>>>>>>>>>>> *Background*
>>>>>>>>>>>
>>>>>>>>>>> Despite passing the following gradle tasks,
>>>>>>>>>>> the PreCommit_Java_Commit failed with a test unrelated to the
>>>>>>>>>>> code in the PR
>>>>>>>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>>>>>>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>>>>>>>>> issues and failed style checks.
>>>>>>>>>>>
>>>>>>>>>>> ./gradlew rat
>>>>>>>>>>> ./gradlew spotlessCheck
>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>>>>>>>>
>>>>>>>>>>> *Assessment*
>>>>>>>>>>>
>>>>>>>>>>> Output from the following gradle commands did not surface prior
>>>>>>>>>>> to the PR merge.
>>>>>>>>>>>
>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>>>>>>>>
>>>>>>>>>>> Both the aforementioned failing PreCommit_Java_Commit test and a
>>>>>>>>>>> lack of knowledge contributed to the root cause.
>>>>>>>>>>>
>>>>>>>>>>> *Recommendation / Proposal*
>>>>>>>>>>>
>>>>>>>>>>> I propose updating PR template description
>>>>>>>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>>>>>>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>>>>>>>>> PR.  Toward this aim:
>>>>>>>>>>>
>>>>>>>>>>> 1.  Four separate GitHub issues in
>>>>>>>>>>> https://github.com/apache/beam representative of *common*, and
>>>>>>>>>>> language specific: *Java*, *Python*, and *Go*, prescribed
>>>>>>>>>>> gradle tasks
>>>>>>>>>>> 2.  Don't reply to this email with specifics on how; delegation
>>>>>>>>>>> of comments/discussion within the relevant context of the GitHub issues in 1
>>>>>>>>>>> 3.  PRs that fix the GitHub issues in 1
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *Damon Douglas*
>>>>>>>>>>>
>>>>>>>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>>>>>>>>
>>>>>>>>>>> damondouglas@google.com
>>>>>>>>>>>
>>>>>>>>>>

Re: Update PR description template

Posted by Danny McCormick <da...@google.com>.
Regarding code cov - it does check overall % coverage as well as the
coverage of the diff, but I don't think it's designed to be a blocking
metric. A good example of where this is not helpful is this pr to remove
dead code <https://github.com/apache/beam/pull/22019>. Because it removes
tested code, the pr lowers our coverage percentage and fails the check, but
it's a totally reasonable PR that doesn't need additional testing (because
it only removes functionality). Other classes of changes that are
problematic include things like improving hard to cover error messages or
tough to test proto changes which can only be covered in an integration
test that doesn't make it into codecov (these are both common at least in
the Go SDK).

I'd be opposed to including that as a required check, though I support
adding pretty much every other suite.

> I assume we would have a way to override it in case the repo gets into a
bad state (for example needing to modify the workflow definitions in order
to get them to pass).

The way we make a check required is by adding it in the .asf.yml file
<https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Branchprotection>.
So we could always update that if needed to push an emergency fix as long
as we don't accidentally include any required checks on the .asf.yml
itself. If we mess that up we would need to bring in Infra to help.

On Mon, Jun 27, 2022 at 7:59 PM Ahmet Altay <al...@google.com> wrote:

>
>
> On Mon, Jun 27, 2022 at 4:55 PM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> On Mon, Jun 27, 2022 at 3:06 PM Kenneth Knowles <ke...@apache.org> wrote:
>>
>>> I am in favor of requiring precommit green before merge. To make this
>>> better, we should:
>>>
>>> (1) run fewer irrelevant tests, for example almost every language-based
>>> presubmit tests vastly more code than could possibly be affected.
>>> (2) run more relevant tests, for example when an IO is touched running
>>> slightly more heavyweight integration tests before merge is reasonable.
>>>
>>
>> Does it have to be all or nothing? E.g. we could start with requiring a
>> smaller subset of tests that must pass, with the goal of eventually
>> requiring (almost?) everything.
>>
>>
>>> I worry that requiring total line coverage in unit tests / presubmits
>>> will encourage mock-based tests that don't demonstrate any meaningful
>>> correctness. But I am OK to try it.
>>>
>>
> IIRC, codecov test is checking that % coverage is not reduced with the new
> PR and it is not checking for total line coverage. I agree with you that
> total line coverage is probably not meaningful in most cases.
>
>
>>
>> +1. I think there are a class of presubmits (coverage among them) that
>> are advisory rather than binding.
>>
>
> +1 - especially if we can make the tools distinguish between required and
> advisory presubmits.
>
>
>>
>>> I assume we would have a way to override it in case the repo gets into a
>>> bad state (for example needing to modify the workflow definitions in order
>>> to get them to pass).
>>>
>>
>> +1. But making this explicit (e.g. specific wording would have to be in
>> the last comment) would be a good barrier to have even if it's not
>> absolute.
>>
>>
>>> On Mon, Jun 27, 2022 at 2:58 PM Ahmet Altay <al...@google.com> wrote:
>>>
>>>> Just an opinion: If we have a precommit (including codecov) we should
>>>> be willing to enforce those passing before merging a PR. Selectively
>>>> applying when/which precommits would be blockers eventually leads to more
>>>> opinions. I think codecov is actually providing value by checking that
>>>> delta coverage is not dropped. In my experience, for PRs that fail the
>>>> codecov precommit there tends to be an opportunity for adding unit tests.
>>>>
>>>> On Mon, Jun 27, 2022 at 2:14 PM Danny McCormick <
>>>> dannymccormick@google.com> wrote:
>>>>
>>>>> Brian, you keep beating my emails by an instant :) I think codecov
>>>>> provides value independent of blocking a PR - it is a good sanity check
>>>>> (e.g. if codecov drops from 50% to 5%) and it gives us a nice metric to
>>>>> track over time.
>>>>>
>>>>> On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick <
>>>>> dannymccormick@google.com> wrote:
>>>>>
>>>>>> Yeah, codecov is the one exception I had in mind, there may be others
>>>>>> that are reasonable to exclude as well
>>>>>>
>>>>>> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <
>>>>>> jrmccluskey@google.com> wrote:
>>>>>>
>>>>>>> +1 for enforcing some precommit checks as required, but making them
>>>>>>> all required would be a bit of a nightmare with how Codecov works. Any
>>>>>>> Python or Go patch gets flagged as failing codecov if its changes aren't
>>>>>>> covered at or above the current overall coverage level at master (around
>>>>>>> 75% right now.) This is a nice goal, but ultimately not every change lends
>>>>>>> itself to that level of unit testing. But I would be in favor of things
>>>>>>> like the various ValidatesRunner suites being required.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Jack McCluskey
>>>>>>>
>>>>>>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <
>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>
>>>>>>>> Brian beat me by a few seconds, but I actually agree that there is
>>>>>>>> a problem here - when Damon's PR was submitted there was basically no way
>>>>>>>> of knowing that checks would break without having detailed knowledge of how
>>>>>>>> the build system works (so I'm on board with your
>>>>>>>> Subject/Background/Assessment). With that said, I disagree with the
>>>>>>>> recommended path forward - our automation failed us here and I don't think
>>>>>>>> the correct response is to make our manual validation better. That shifts
>>>>>>>> the burden of verifying correctness from robots (easy to validate,
>>>>>>>> basically free) to humans (expensive, forgetful, and hard to validate,
>>>>>>>> especially for newer contributors). Instead, I think we should fix the
>>>>>>>> underlying technical problems, which are IMO:
>>>>>>>>
>>>>>>>> 1) Flaky tests unrelated to the PR caused it to fail
>>>>>>>> 2) There were no guardrails in place to avoid merging a PR that
>>>>>>>> wasn't fully validated.
>>>>>>>>
>>>>>>>> Things that would've made this better which I think we should focus
>>>>>>>> our attention on:
>>>>>>>>
>>>>>>>> 1) (easy) Split the linting task out from the test step.
>>>>>>>> 2) This one is more controversial, but I would vote we start
>>>>>>>> enforcing some (or all?) precommit checks as required. I don't really blame
>>>>>>>> anyone for merging this PR since it had repeatedly run into unrelated
>>>>>>>> flakes, but that's kinda the point - we need to feel some of that pain to
>>>>>>>> avoid accidentally merging bad code. Blocking PRs on failed precommit
>>>>>>>> suites would force us to deal with these kinds of flakes and incentivize a
>>>>>>>> lot of good behavior in the repo (and disincentivize bad tests).
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Danny
>>>>>>>>
>>>>>>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <bh...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Damon,
>>>>>>>>> I think our CI system was working as intended for your PR. We
>>>>>>>>> select which PreCommits to run based on the files that are edited, and the
>>>>>>>>> author/reviewer should collaborate to get those green before merging the
>>>>>>>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>>>>>>>> failing gradle tasks and run them locally to reproduce.
>>>>>>>>>
>>>>>>>>> It looks like what broke down in this case was that the PR was
>>>>>>>>> accidentally merged despite the failing Java PreCommit run.
>>>>>>>>>
>>>>>>>>> Brian
>>>>>>>>>
>>>>>>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <
>>>>>>>>> damondouglas@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> I write this to prevent a recent experience I encountered
>>>>>>>>>> contributing to the Apache Beam repository.  To address a system issue, I
>>>>>>>>>> write in SBAR
>>>>>>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>>>>>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>>>>>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>>>>>>>
>>>>>>>>>> *Subject*
>>>>>>>>>>
>>>>>>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953>
>>>>>>>>>> that was subsequently reverted
>>>>>>>>>> <https://github.com/apache/beam/pull/22044>.
>>>>>>>>>>
>>>>>>>>>> *Background*
>>>>>>>>>>
>>>>>>>>>> Despite passing the following gradle tasks,
>>>>>>>>>> the PreCommit_Java_Commit failed with a test unrelated to the
>>>>>>>>>> code in the PR
>>>>>>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>>>>>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>>>>>>>> issues and failed style checks.
>>>>>>>>>>
>>>>>>>>>> ./gradlew rat
>>>>>>>>>> ./gradlew spotlessCheck
>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>>>>>>>
>>>>>>>>>> *Assessment*
>>>>>>>>>>
>>>>>>>>>> Output from the following gradle commands did not surface prior
>>>>>>>>>> to the PR merge.
>>>>>>>>>>
>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>>>>>>>
>>>>>>>>>> Both the aforementioned failing PreCommit_Java_Commit test and a
>>>>>>>>>> lack of knowledge contributed to the root cause.
>>>>>>>>>>
>>>>>>>>>> *Recommendation / Proposal*
>>>>>>>>>>
>>>>>>>>>> I propose updating PR template description
>>>>>>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>>>>>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>>>>>>>> PR.  Toward this aim:
>>>>>>>>>>
>>>>>>>>>> 1.  Four separate GitHub issues in https://github.com/apache/beam
>>>>>>>>>> representative of *common*, and language specific: *Java*,
>>>>>>>>>> *Python*, and *Go*, prescribed gradle tasks
>>>>>>>>>> 2.  Don't reply to this email with specifics on how; delegation
>>>>>>>>>> of comments/discussion within the relevant context of the GitHub issues in 1
>>>>>>>>>> 3.  PRs that fix the GitHub issues in 1
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *Damon Douglas*
>>>>>>>>>>
>>>>>>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>>>>>>>
>>>>>>>>>> damondouglas@google.com
>>>>>>>>>>
>>>>>>>>>

Re: Update PR description template

Posted by Ahmet Altay <al...@google.com>.
On Mon, Jun 27, 2022 at 4:55 PM Robert Bradshaw <ro...@google.com> wrote:

> On Mon, Jun 27, 2022 at 3:06 PM Kenneth Knowles <ke...@apache.org> wrote:
>
>> I am in favor of requiring precommit green before merge. To make this
>> better, we should:
>>
>> (1) run fewer irrelevant tests, for example almost every language-based
>> presubmit tests vastly more code than could possibly be affected.
>> (2) run more relevant tests, for example when an IO is touched running
>> slightly more heavyweight integration tests before merge is reasonable.
>>
>
> Does it have to be all or nothing? E.g. we could start with requiring a
> smaller subset of tests that must pass, with the goal of eventually
> requiring (almost?) everything.
>
>
>> I worry that requiring total line coverage in unit tests / presubmits
>> will encourage mock-based tests that don't demonstrate any meaningful
>> correctness. But I am OK to try it.
>>
>
IIRC, codecov test is checking that % coverage is not reduced with the new
PR and it is not checking for total line coverage. I agree with you that
total line coverage is probably not meaningful in most cases.


>
> +1. I think there are a class of presubmits (coverage among them) that are
> advisory rather than binding.
>

+1 - especially if we can make the tools distinguish between required and
advisory presubmits.


>
>> I assume we would have a way to override it in case the repo gets into a
>> bad state (for example needing to modify the workflow definitions in order
>> to get them to pass).
>>
>
> +1. But making this explicit (e.g. specific wording would have to be in
> the last comment) would be a good barrier to have even if it's not
> absolute.
>
>
>> On Mon, Jun 27, 2022 at 2:58 PM Ahmet Altay <al...@google.com> wrote:
>>
>>> Just an opinion: If we have a precommit (including codecov) we should be
>>> willing to enforce those passing before merging a PR. Selectively applying
>>> when/which precommits would be blockers eventually leads to more opinions.
>>> I think codecov is actually providing value by checking that delta coverage
>>> is not dropped. In my experience, for PRs that fail the codecov precommit
>>> there tends to be an opportunity for adding unit tests.
>>>
>>> On Mon, Jun 27, 2022 at 2:14 PM Danny McCormick <
>>> dannymccormick@google.com> wrote:
>>>
>>>> Brian, you keep beating my emails by an instant :) I think codecov
>>>> provides value independent of blocking a PR - it is a good sanity check
>>>> (e.g. if codecov drops from 50% to 5%) and it gives us a nice metric to
>>>> track over time.
>>>>
>>>> On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick <
>>>> dannymccormick@google.com> wrote:
>>>>
>>>>> Yeah, codecov is the one exception I had in mind, there may be others
>>>>> that are reasonable to exclude as well
>>>>>
>>>>> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <jr...@google.com>
>>>>> wrote:
>>>>>
>>>>>> +1 for enforcing some precommit checks as required, but making them
>>>>>> all required would be a bit of a nightmare with how Codecov works. Any
>>>>>> Python or Go patch gets flagged as failing codecov if its changes aren't
>>>>>> covered at or above the current overall coverage level at master (around
>>>>>> 75% right now.) This is a nice goal, but ultimately not every change lends
>>>>>> itself to that level of unit testing. But I would be in favor of things
>>>>>> like the various ValidatesRunner suites being required.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Jack McCluskey
>>>>>>
>>>>>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <
>>>>>> dannymccormick@google.com> wrote:
>>>>>>
>>>>>>> Brian beat me by a few seconds, but I actually agree that there is a
>>>>>>> problem here - when Damon's PR was submitted there was basically no way of
>>>>>>> knowing that checks would break without having detailed knowledge of how
>>>>>>> the build system works (so I'm on board with your
>>>>>>> Subject/Background/Assessment). With that said, I disagree with the
>>>>>>> recommended path forward - our automation failed us here and I don't think
>>>>>>> the correct response is to make our manual validation better. That shifts
>>>>>>> the burden of verifying correctness from robots (easy to validate,
>>>>>>> basically free) to humans (expensive, forgetful, and hard to validate,
>>>>>>> especially for newer contributors). Instead, I think we should fix the
>>>>>>> underlying technical problems, which are IMO:
>>>>>>>
>>>>>>> 1) Flaky tests unrelated to the PR caused it to fail
>>>>>>> 2) There were no guardrails in place to avoid merging a PR that
>>>>>>> wasn't fully validated.
>>>>>>>
>>>>>>> Things that would've made this better which I think we should focus
>>>>>>> our attention on:
>>>>>>>
>>>>>>> 1) (easy) Split the linting task out from the test step.
>>>>>>> 2) This one is more controversial, but I would vote we start
>>>>>>> enforcing some (or all?) precommit checks as required. I don't really blame
>>>>>>> anyone for merging this PR since it had repeatedly run into unrelated
>>>>>>> flakes, but that's kinda the point - we need to feel some of that pain to
>>>>>>> avoid accidentally merging bad code. Blocking PRs on failed precommit
>>>>>>> suites would force us to deal with these kinds of flakes and incentivize a
>>>>>>> lot of good behavior in the repo (and disincentivize bad tests).
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Danny
>>>>>>>
>>>>>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <bh...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Damon,
>>>>>>>> I think our CI system was working as intended for your PR. We
>>>>>>>> select which PreCommits to run based on the files that are edited, and the
>>>>>>>> author/reviewer should collaborate to get those green before merging the
>>>>>>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>>>>>>> failing gradle tasks and run them locally to reproduce.
>>>>>>>>
>>>>>>>> It looks like what broke down in this case was that the PR was
>>>>>>>> accidentally merged despite the failing Java PreCommit run.
>>>>>>>>
>>>>>>>> Brian
>>>>>>>>
>>>>>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <
>>>>>>>> damondouglas@google.com> wrote:
>>>>>>>>
>>>>>>>>> I write this to prevent a recent experience I encountered
>>>>>>>>> contributing to the Apache Beam repository.  To address a system issue, I
>>>>>>>>> write in SBAR
>>>>>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>>>>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>>>>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>>>>>>
>>>>>>>>> *Subject*
>>>>>>>>>
>>>>>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953> that
>>>>>>>>> was subsequently reverted
>>>>>>>>> <https://github.com/apache/beam/pull/22044>.
>>>>>>>>>
>>>>>>>>> *Background*
>>>>>>>>>
>>>>>>>>> Despite passing the following gradle tasks,
>>>>>>>>> the PreCommit_Java_Commit failed with a test unrelated to the
>>>>>>>>> code in the PR
>>>>>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>>>>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>>>>>>> issues and failed style checks.
>>>>>>>>>
>>>>>>>>> ./gradlew rat
>>>>>>>>> ./gradlew spotlessCheck
>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>>>>>>
>>>>>>>>> *Assessment*
>>>>>>>>>
>>>>>>>>> Output from the following gradle commands did not surface prior to
>>>>>>>>> the PR merge.
>>>>>>>>>
>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>>>>>>
>>>>>>>>> Both the aforementioned failing PreCommit_Java_Commit test and a
>>>>>>>>> lack of knowledge contributed to the root cause.
>>>>>>>>>
>>>>>>>>> *Recommendation / Proposal*
>>>>>>>>>
>>>>>>>>> I propose updating PR template description
>>>>>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>>>>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>>>>>>> PR.  Toward this aim:
>>>>>>>>>
>>>>>>>>> 1.  Four separate GitHub issues in https://github.com/apache/beam
>>>>>>>>> representative of *common*, and language specific: *Java*,
>>>>>>>>> *Python*, and *Go*, prescribed gradle tasks
>>>>>>>>> 2.  Don't reply to this email with specifics on how; delegation of
>>>>>>>>> comments/discussion within the relevant context of the GitHub issues in 1
>>>>>>>>> 3.  PRs that fix the GitHub issues in 1
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *Damon Douglas*
>>>>>>>>>
>>>>>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>>>>>>
>>>>>>>>> damondouglas@google.com
>>>>>>>>>
>>>>>>>>

Re: Update PR description template

Posted by Robert Bradshaw <ro...@google.com>.
On Mon, Jun 27, 2022 at 3:06 PM Kenneth Knowles <ke...@apache.org> wrote:

> I am in favor of requiring precommit green before merge. To make this
> better, we should:
>
> (1) run fewer irrelevant tests, for example almost every language-based
> presubmit tests vastly more code than could possibly be affected.
> (2) run more relevant tests, for example when an IO is touched running
> slightly more heavyweight integration tests before merge is reasonable.
>

Does it have to be all or nothing? E.g. we could start with requiring a
smaller subset of tests that must pass, with the goal of eventually
requiring (almost?) everything.


> I worry that requiring total line coverage in unit tests / presubmits will
> encourage mock-based tests that don't demonstrate any meaningful
> correctness. But I am OK to try it.
>

+1. I think there are a class of presubmits (coverage among them) that are
advisory rather than binding.


> I assume we would have a way to override it in case the repo gets into a
> bad state (for example needing to modify the workflow definitions in order
> to get them to pass).
>

+1. But making this explicit (e.g. specific wording would have to be in the
last comment) would be a good barrier to have even if it's not absolute.


> On Mon, Jun 27, 2022 at 2:58 PM Ahmet Altay <al...@google.com> wrote:
>
>> Just an opinion: If we have a precommit (including codecov) we should be
>> willing to enforce those passing before merging a PR. Selectively applying
>> when/which precommits would be blockers eventually leads to more opinions.
>> I think codecov is actually providing value by checking that delta coverage
>> is not dropped. In my experience, for PRs that fail the codecov precommit
>> there tends to be an opportunity for adding unit tests.
>>
>> On Mon, Jun 27, 2022 at 2:14 PM Danny McCormick <
>> dannymccormick@google.com> wrote:
>>
>>> Brian, you keep beating my emails by an instant :) I think codecov
>>> provides value independent of blocking a PR - it is a good sanity check
>>> (e.g. if codecov drops from 50% to 5%) and it gives us a nice metric to
>>> track over time.
>>>
>>> On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick <
>>> dannymccormick@google.com> wrote:
>>>
>>>> Yeah, codecov is the one exception I had in mind, there may be others
>>>> that are reasonable to exclude as well
>>>>
>>>> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <jr...@google.com>
>>>> wrote:
>>>>
>>>>> +1 for enforcing some precommit checks as required, but making them
>>>>> all required would be a bit of a nightmare with how Codecov works. Any
>>>>> Python or Go patch gets flagged as failing codecov if its changes aren't
>>>>> covered at or above the current overall coverage level at master (around
>>>>> 75% right now.) This is a nice goal, but ultimately not every change lends
>>>>> itself to that level of unit testing. But I would be in favor of things
>>>>> like the various ValidatesRunner suites being required.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Jack McCluskey
>>>>>
>>>>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <
>>>>> dannymccormick@google.com> wrote:
>>>>>
>>>>>> Brian beat me by a few seconds, but I actually agree that there is a
>>>>>> problem here - when Damon's PR was submitted there was basically no way of
>>>>>> knowing that checks would break without having detailed knowledge of how
>>>>>> the build system works (so I'm on board with your
>>>>>> Subject/Background/Assessment). With that said, I disagree with the
>>>>>> recommended path forward - our automation failed us here and I don't think
>>>>>> the correct response is to make our manual validation better. That shifts
>>>>>> the burden of verifying correctness from robots (easy to validate,
>>>>>> basically free) to humans (expensive, forgetful, and hard to validate,
>>>>>> especially for newer contributors). Instead, I think we should fix the
>>>>>> underlying technical problems, which are IMO:
>>>>>>
>>>>>> 1) Flaky tests unrelated to the PR caused it to fail
>>>>>> 2) There were no guardrails in place to avoid merging a PR that
>>>>>> wasn't fully validated.
>>>>>>
>>>>>> Things that would've made this better which I think we should focus
>>>>>> our attention on:
>>>>>>
>>>>>> 1) (easy) Split the linting task out from the test step.
>>>>>> 2) This one is more controversial, but I would vote we start
>>>>>> enforcing some (or all?) precommit checks as required. I don't really blame
>>>>>> anyone for merging this PR since it had repeatedly run into unrelated
>>>>>> flakes, but that's kinda the point - we need to feel some of that pain to
>>>>>> avoid accidentally merging bad code. Blocking PRs on failed precommit
>>>>>> suites would force us to deal with these kinds of flakes and incentivize a
>>>>>> lot of good behavior in the repo (and disincentivize bad tests).
>>>>>>
>>>>>> Thanks,
>>>>>> Danny
>>>>>>
>>>>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <bh...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Damon,
>>>>>>> I think our CI system was working as intended for your PR. We select
>>>>>>> which PreCommits to run based on the files that are edited, and the
>>>>>>> author/reviewer should collaborate to get those green before merging the
>>>>>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>>>>>> failing gradle tasks and run them locally to reproduce.
>>>>>>>
>>>>>>> It looks like what broke down in this case was that the PR was
>>>>>>> accidentally merged despite the failing Java PreCommit run.
>>>>>>>
>>>>>>> Brian
>>>>>>>
>>>>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <
>>>>>>> damondouglas@google.com> wrote:
>>>>>>>
>>>>>>>> I write this to prevent a recent experience I encountered
>>>>>>>> contributing to the Apache Beam repository.  To address a system issue, I
>>>>>>>> write in SBAR
>>>>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>>>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>>>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>>>>>
>>>>>>>> *Subject*
>>>>>>>>
>>>>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953> that
>>>>>>>> was subsequently reverted
>>>>>>>> <https://github.com/apache/beam/pull/22044>.
>>>>>>>>
>>>>>>>> *Background*
>>>>>>>>
>>>>>>>> Despite passing the following gradle tasks,
>>>>>>>> the PreCommit_Java_Commit failed with a test unrelated to the code
>>>>>>>> in the PR
>>>>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>>>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>>>>>> issues and failed style checks.
>>>>>>>>
>>>>>>>> ./gradlew rat
>>>>>>>> ./gradlew spotlessCheck
>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>>>>>
>>>>>>>> *Assessment*
>>>>>>>>
>>>>>>>> Output from the following gradle commands did not surface prior to
>>>>>>>> the PR merge.
>>>>>>>>
>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>>>>>
>>>>>>>> Both the aforementioned failing PreCommit_Java_Commit test and a
>>>>>>>> lack of knowledge contributed to the root cause.
>>>>>>>>
>>>>>>>> *Recommendation / Proposal*
>>>>>>>>
>>>>>>>> I propose updating PR template description
>>>>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>>>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>>>>>> PR.  Toward this aim:
>>>>>>>>
>>>>>>>> 1.  Four separate GitHub issues in https://github.com/apache/beam
>>>>>>>> representative of *common*, and language specific: *Java*, *Python*,
>>>>>>>> and *Go*, prescribed gradle tasks
>>>>>>>> 2.  Don't reply to this email with specifics on how; delegation of
>>>>>>>> comments/discussion within the relevant context of the GitHub issues in 1
>>>>>>>> 3.  PRs that fix the GitHub issues in 1
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *Damon Douglas*
>>>>>>>>
>>>>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>>>>>
>>>>>>>> damondouglas@google.com
>>>>>>>>
>>>>>>>

Re: Update PR description template

Posted by Kenneth Knowles <ke...@apache.org>.
I am in favor of requiring precommit green before merge. To make this
better, we should:

(1) run fewer irrelevant tests, for example almost every language-based
presubmit tests vastly more code than could possibly be affected.
(2) run more relevant tests, for example when an IO is touched running
slightly more heavyweight integration tests before merge is reasonable.

I worry that requiring total line coverage in unit tests / presubmits will
encourage mock-based tests that don't demonstrate any meaningful
correctness. But I am OK to try it.

I assume we would have a way to override it in case the repo gets into a
bad state (for example needing to modify the workflow definitions in order
to get them to pass).

Kenn

On Mon, Jun 27, 2022 at 2:58 PM Ahmet Altay <al...@google.com> wrote:

> Just an opinion: If we have a precommit (including codecov) we should be
> willing to enforce those passing before merging a PR. Selectively applying
> when/which precommits would be blockers eventually leads to more opinions.
> I think codecov is actually providing value by checking that delta coverage
> is not dropped. In my experience, for PRs that fail the codecov precommit
> there tends to be an opportunity for adding unit tests.
>
> On Mon, Jun 27, 2022 at 2:14 PM Danny McCormick <da...@google.com>
> wrote:
>
>> Brian, you keep beating my emails by an instant :) I think codecov
>> provides value independent of blocking a PR - it is a good sanity check
>> (e.g. if codecov drops from 50% to 5%) and it gives us a nice metric to
>> track over time.
>>
>> On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick <
>> dannymccormick@google.com> wrote:
>>
>>> Yeah, codecov is the one exception I had in mind, there may be others
>>> that are reasonable to exclude as well
>>>
>>> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <jr...@google.com>
>>> wrote:
>>>
>>>> +1 for enforcing some precommit checks as required, but making them all
>>>> required would be a bit of a nightmare with how Codecov works. Any Python
>>>> or Go patch gets flagged as failing codecov if its changes aren't covered
>>>> at or above the current overall coverage level at master (around 75% right
>>>> now.) This is a nice goal, but ultimately not every change lends itself to
>>>> that level of unit testing. But I would be in favor of things like the
>>>> various ValidatesRunner suites being required.
>>>>
>>>> Thanks,
>>>>
>>>> Jack McCluskey
>>>>
>>>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <
>>>> dannymccormick@google.com> wrote:
>>>>
>>>>> Brian beat me by a few seconds, but I actually agree that there is a
>>>>> problem here - when Damon's PR was submitted there was basically no way of
>>>>> knowing that checks would break without having detailed knowledge of how
>>>>> the build system works (so I'm on board with your
>>>>> Subject/Background/Assessment). With that said, I disagree with the
>>>>> recommended path forward - our automation failed us here and I don't think
>>>>> the correct response is to make our manual validation better. That shifts
>>>>> the burden of verifying correctness from robots (easy to validate,
>>>>> basically free) to humans (expensive, forgetful, and hard to validate,
>>>>> especially for newer contributors). Instead, I think we should fix the
>>>>> underlying technical problems, which are IMO:
>>>>>
>>>>> 1) Flaky tests unrelated to the PR caused it to fail
>>>>> 2) There were no guardrails in place to avoid merging a PR that wasn't
>>>>> fully validated.
>>>>>
>>>>> Things that would've made this better which I think we should focus
>>>>> our attention on:
>>>>>
>>>>> 1) (easy) Split the linting task out from the test step.
>>>>> 2) This one is more controversial, but I would vote we start enforcing
>>>>> some (or all?) precommit checks as required. I don't really blame anyone
>>>>> for merging this PR since it had repeatedly run into unrelated flakes, but
>>>>> that's kinda the point - we need to feel some of that pain to avoid
>>>>> accidentally merging bad code. Blocking PRs on failed precommit suites
>>>>> would force us to deal with these kinds of flakes and incentivize a lot of
>>>>> good behavior in the repo (and disincentivize bad tests).
>>>>>
>>>>> Thanks,
>>>>> Danny
>>>>>
>>>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <bh...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Damon,
>>>>>> I think our CI system was working as intended for your PR. We select
>>>>>> which PreCommits to run based on the files that are edited, and the
>>>>>> author/reviewer should collaborate to get those green before merging the
>>>>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>>>>> failing gradle tasks and run them locally to reproduce.
>>>>>>
>>>>>> It looks like what broke down in this case was that the PR was
>>>>>> accidentally merged despite the failing Java PreCommit run.
>>>>>>
>>>>>> Brian
>>>>>>
>>>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <
>>>>>> damondouglas@google.com> wrote:
>>>>>>
>>>>>>> I write this to prevent a recent experience I encountered
>>>>>>> contributing to the Apache Beam repository.  To address a system issue, I
>>>>>>> write in SBAR
>>>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>>>>
>>>>>>> *Subject*
>>>>>>>
>>>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953> that
>>>>>>> was subsequently reverted
>>>>>>> <https://github.com/apache/beam/pull/22044>.
>>>>>>>
>>>>>>> *Background*
>>>>>>>
>>>>>>> Despite passing the following gradle tasks,
>>>>>>> the PreCommit_Java_Commit failed with a test unrelated to the code
>>>>>>> in the PR
>>>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>>>>> issues and failed style checks.
>>>>>>>
>>>>>>> ./gradlew rat
>>>>>>> ./gradlew spotlessCheck
>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>>>>
>>>>>>> *Assessment*
>>>>>>>
>>>>>>> Output from the following gradle commands did not surface prior to
>>>>>>> the PR merge.
>>>>>>>
>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>>>>
>>>>>>> Both the aforementioned failing PreCommit_Java_Commit test and a
>>>>>>> lack of knowledge contributed to the root cause.
>>>>>>>
>>>>>>> *Recommendation / Proposal*
>>>>>>>
>>>>>>> I propose updating PR template description
>>>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>>>>> PR.  Toward this aim:
>>>>>>>
>>>>>>> 1.  Four separate GitHub issues in https://github.com/apache/beam
>>>>>>> representative of *common*, and language specific: *Java*, *Python*,
>>>>>>> and *Go*, prescribed gradle tasks
>>>>>>> 2.  Don't reply to this email with specifics on how; delegation of
>>>>>>> comments/discussion within the relevant context of the GitHub issues in 1
>>>>>>> 3.  PRs that fix the GitHub issues in 1
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *Damon Douglas*
>>>>>>>
>>>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>>>>
>>>>>>> damondouglas@google.com
>>>>>>>
>>>>>>

Re: Update PR description template

Posted by Ahmet Altay <al...@google.com>.
On Mon, Jun 27, 2022 at 3:00 PM Pablo Estrada <pa...@google.com> wrote:

> This was my bad, as I misread the precommit failure. I will be careful in
> the future : (
>

I made the same mistake before. It is normal for humans to do this, we need
to fix the process and the tools.


> On Mon, Jun 27, 2022 at 2:58 PM Ahmet Altay <al...@google.com> wrote:
>
>> Just an opinion: If we have a precommit (including codecov) we should be
>> willing to enforce those passing before merging a PR. Selectively applying
>> when/which precommits would be blockers eventually leads to more opinions.
>> I think codecov is actually providing value by checking that delta coverage
>> is not dropped. In my experience, for PRs that fail the codecov precommit
>> there tends to be an opportunity for adding unit tests.
>>
>> On Mon, Jun 27, 2022 at 2:14 PM Danny McCormick <
>> dannymccormick@google.com> wrote:
>>
>>> Brian, you keep beating my emails by an instant :) I think codecov
>>> provides value independent of blocking a PR - it is a good sanity check
>>> (e.g. if codecov drops from 50% to 5%) and it gives us a nice metric to
>>> track over time.
>>>
>>> On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick <
>>> dannymccormick@google.com> wrote:
>>>
>>>> Yeah, codecov is the one exception I had in mind, there may be others
>>>> that are reasonable to exclude as well
>>>>
>>>> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <jr...@google.com>
>>>> wrote:
>>>>
>>>>> +1 for enforcing some precommit checks as required, but making them
>>>>> all required would be a bit of a nightmare with how Codecov works. Any
>>>>> Python or Go patch gets flagged as failing codecov if its changes aren't
>>>>> covered at or above the current overall coverage level at master (around
>>>>> 75% right now.) This is a nice goal, but ultimately not every change lends
>>>>> itself to that level of unit testing. But I would be in favor of things
>>>>> like the various ValidatesRunner suites being required.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Jack McCluskey
>>>>>
>>>>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <
>>>>> dannymccormick@google.com> wrote:
>>>>>
>>>>>> Brian beat me by a few seconds, but I actually agree that there is a
>>>>>> problem here - when Damon's PR was submitted there was basically no way of
>>>>>> knowing that checks would break without having detailed knowledge of how
>>>>>> the build system works (so I'm on board with your
>>>>>> Subject/Background/Assessment). With that said, I disagree with the
>>>>>> recommended path forward - our automation failed us here and I don't think
>>>>>> the correct response is to make our manual validation better. That shifts
>>>>>> the burden of verifying correctness from robots (easy to validate,
>>>>>> basically free) to humans (expensive, forgetful, and hard to validate,
>>>>>> especially for newer contributors). Instead, I think we should fix the
>>>>>> underlying technical problems, which are IMO:
>>>>>>
>>>>>> 1) Flaky tests unrelated to the PR caused it to fail
>>>>>> 2) There were no guardrails in place to avoid merging a PR that
>>>>>> wasn't fully validated.
>>>>>>
>>>>>> Things that would've made this better which I think we should focus
>>>>>> our attention on:
>>>>>>
>>>>>> 1) (easy) Split the linting task out from the test step.
>>>>>> 2) This one is more controversial, but I would vote we start
>>>>>> enforcing some (or all?) precommit checks as required. I don't really blame
>>>>>> anyone for merging this PR since it had repeatedly run into unrelated
>>>>>> flakes, but that's kinda the point - we need to feel some of that pain to
>>>>>> avoid accidentally merging bad code. Blocking PRs on failed precommit
>>>>>> suites would force us to deal with these kinds of flakes and incentivize a
>>>>>> lot of good behavior in the repo (and disincentivize bad tests).
>>>>>>
>>>>>> Thanks,
>>>>>> Danny
>>>>>>
>>>>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <bh...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Damon,
>>>>>>> I think our CI system was working as intended for your PR. We select
>>>>>>> which PreCommits to run based on the files that are edited, and the
>>>>>>> author/reviewer should collaborate to get those green before merging the
>>>>>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>>>>>> failing gradle tasks and run them locally to reproduce.
>>>>>>>
>>>>>>> It looks like what broke down in this case was that the PR was
>>>>>>> accidentally merged despite the failing Java PreCommit run.
>>>>>>>
>>>>>>> Brian
>>>>>>>
>>>>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <
>>>>>>> damondouglas@google.com> wrote:
>>>>>>>
>>>>>>>> I write this to prevent a recent experience I encountered
>>>>>>>> contributing to the Apache Beam repository.  To address a system issue, I
>>>>>>>> write in SBAR
>>>>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>>>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>>>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>>>>>
>>>>>>>> *Subject*
>>>>>>>>
>>>>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953> that
>>>>>>>> was subsequently reverted
>>>>>>>> <https://github.com/apache/beam/pull/22044>.
>>>>>>>>
>>>>>>>> *Background*
>>>>>>>>
>>>>>>>> Despite passing the following gradle tasks,
>>>>>>>> the PreCommit_Java_Commit failed with a test unrelated to the code
>>>>>>>> in the PR
>>>>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>>>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>>>>>> issues and failed style checks.
>>>>>>>>
>>>>>>>> ./gradlew rat
>>>>>>>> ./gradlew spotlessCheck
>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>>>>>
>>>>>>>> *Assessment*
>>>>>>>>
>>>>>>>> Output from the following gradle commands did not surface prior to
>>>>>>>> the PR merge.
>>>>>>>>
>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>>>>>
>>>>>>>> Both the aforementioned failing PreCommit_Java_Commit test and a
>>>>>>>> lack of knowledge contributed to the root cause.
>>>>>>>>
>>>>>>>> *Recommendation / Proposal*
>>>>>>>>
>>>>>>>> I propose updating PR template description
>>>>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>>>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>>>>>> PR.  Toward this aim:
>>>>>>>>
>>>>>>>> 1.  Four separate GitHub issues in https://github.com/apache/beam
>>>>>>>> representative of *common*, and language specific: *Java*, *Python*,
>>>>>>>> and *Go*, prescribed gradle tasks
>>>>>>>> 2.  Don't reply to this email with specifics on how; delegation of
>>>>>>>> comments/discussion within the relevant context of the GitHub issues in 1
>>>>>>>> 3.  PRs that fix the GitHub issues in 1
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *Damon Douglas*
>>>>>>>>
>>>>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>>>>>
>>>>>>>> damondouglas@google.com
>>>>>>>>
>>>>>>>

Re: Update PR description template

Posted by Pablo Estrada <pa...@google.com>.
This was my bad, as I misread the precommit failure. I will be careful in
the future : (

On Mon, Jun 27, 2022 at 2:58 PM Ahmet Altay <al...@google.com> wrote:

> Just an opinion: If we have a precommit (including codecov) we should be
> willing to enforce those passing before merging a PR. Selectively applying
> when/which precommits would be blockers eventually leads to more opinions.
> I think codecov is actually providing value by checking that delta coverage
> is not dropped. In my experience, for PRs that fail the codecov precommit
> there tends to be an opportunity for adding unit tests.
>
> On Mon, Jun 27, 2022 at 2:14 PM Danny McCormick <da...@google.com>
> wrote:
>
>> Brian, you keep beating my emails by an instant :) I think codecov
>> provides value independent of blocking a PR - it is a good sanity check
>> (e.g. if codecov drops from 50% to 5%) and it gives us a nice metric to
>> track over time.
>>
>> On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick <
>> dannymccormick@google.com> wrote:
>>
>>> Yeah, codecov is the one exception I had in mind, there may be others
>>> that are reasonable to exclude as well
>>>
>>> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <jr...@google.com>
>>> wrote:
>>>
>>>> +1 for enforcing some precommit checks as required, but making them all
>>>> required would be a bit of a nightmare with how Codecov works. Any Python
>>>> or Go patch gets flagged as failing codecov if its changes aren't covered
>>>> at or above the current overall coverage level at master (around 75% right
>>>> now.) This is a nice goal, but ultimately not every change lends itself to
>>>> that level of unit testing. But I would be in favor of things like the
>>>> various ValidatesRunner suites being required.
>>>>
>>>> Thanks,
>>>>
>>>> Jack McCluskey
>>>>
>>>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <
>>>> dannymccormick@google.com> wrote:
>>>>
>>>>> Brian beat me by a few seconds, but I actually agree that there is a
>>>>> problem here - when Damon's PR was submitted there was basically no way of
>>>>> knowing that checks would break without having detailed knowledge of how
>>>>> the build system works (so I'm on board with your
>>>>> Subject/Background/Assessment). With that said, I disagree with the
>>>>> recommended path forward - our automation failed us here and I don't think
>>>>> the correct response is to make our manual validation better. That shifts
>>>>> the burden of verifying correctness from robots (easy to validate,
>>>>> basically free) to humans (expensive, forgetful, and hard to validate,
>>>>> especially for newer contributors). Instead, I think we should fix the
>>>>> underlying technical problems, which are IMO:
>>>>>
>>>>> 1) Flaky tests unrelated to the PR caused it to fail
>>>>> 2) There were no guardrails in place to avoid merging a PR that wasn't
>>>>> fully validated.
>>>>>
>>>>> Things that would've made this better which I think we should focus
>>>>> our attention on:
>>>>>
>>>>> 1) (easy) Split the linting task out from the test step.
>>>>> 2) This one is more controversial, but I would vote we start enforcing
>>>>> some (or all?) precommit checks as required. I don't really blame anyone
>>>>> for merging this PR since it had repeatedly run into unrelated flakes, but
>>>>> that's kinda the point - we need to feel some of that pain to avoid
>>>>> accidentally merging bad code. Blocking PRs on failed precommit suites
>>>>> would force us to deal with these kinds of flakes and incentivize a lot of
>>>>> good behavior in the repo (and disincentivize bad tests).
>>>>>
>>>>> Thanks,
>>>>> Danny
>>>>>
>>>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <bh...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Damon,
>>>>>> I think our CI system was working as intended for your PR. We select
>>>>>> which PreCommits to run based on the files that are edited, and the
>>>>>> author/reviewer should collaborate to get those green before merging the
>>>>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>>>>> failing gradle tasks and run them locally to reproduce.
>>>>>>
>>>>>> It looks like what broke down in this case was that the PR was
>>>>>> accidentally merged despite the failing Java PreCommit run.
>>>>>>
>>>>>> Brian
>>>>>>
>>>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <
>>>>>> damondouglas@google.com> wrote:
>>>>>>
>>>>>>> I write this to prevent a recent experience I encountered
>>>>>>> contributing to the Apache Beam repository.  To address a system issue, I
>>>>>>> write in SBAR
>>>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>>>>
>>>>>>> *Subject*
>>>>>>>
>>>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953> that
>>>>>>> was subsequently reverted
>>>>>>> <https://github.com/apache/beam/pull/22044>.
>>>>>>>
>>>>>>> *Background*
>>>>>>>
>>>>>>> Despite passing the following gradle tasks,
>>>>>>> the PreCommit_Java_Commit failed with a test unrelated to the code
>>>>>>> in the PR
>>>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>>>>> issues and failed style checks.
>>>>>>>
>>>>>>> ./gradlew rat
>>>>>>> ./gradlew spotlessCheck
>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>>>>
>>>>>>> *Assessment*
>>>>>>>
>>>>>>> Output from the following gradle commands did not surface prior to
>>>>>>> the PR merge.
>>>>>>>
>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>>>>
>>>>>>> Both the aforementioned failing PreCommit_Java_Commit test and a
>>>>>>> lack of knowledge contributed to the root cause.
>>>>>>>
>>>>>>> *Recommendation / Proposal*
>>>>>>>
>>>>>>> I propose updating PR template description
>>>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>>>>> PR.  Toward this aim:
>>>>>>>
>>>>>>> 1.  Four separate GitHub issues in https://github.com/apache/beam
>>>>>>> representative of *common*, and language specific: *Java*, *Python*,
>>>>>>> and *Go*, prescribed gradle tasks
>>>>>>> 2.  Don't reply to this email with specifics on how; delegation of
>>>>>>> comments/discussion within the relevant context of the GitHub issues in 1
>>>>>>> 3.  PRs that fix the GitHub issues in 1
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *Damon Douglas*
>>>>>>>
>>>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>>>>
>>>>>>> damondouglas@google.com
>>>>>>>
>>>>>>

Re: Update PR description template

Posted by Ahmet Altay <al...@google.com>.
Just an opinion: If we have a precommit (including codecov) we should be
willing to enforce those passing before merging a PR. Selectively applying
when/which precommits would be blockers eventually leads to more opinions.
I think codecov is actually providing value by checking that delta coverage
is not dropped. In my experience, for PRs that fail the codecov precommit
there tends to be an opportunity for adding unit tests.

On Mon, Jun 27, 2022 at 2:14 PM Danny McCormick <da...@google.com>
wrote:

> Brian, you keep beating my emails by an instant :) I think codecov
> provides value independent of blocking a PR - it is a good sanity check
> (e.g. if codecov drops from 50% to 5%) and it gives us a nice metric to
> track over time.
>
> On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick <da...@google.com>
> wrote:
>
>> Yeah, codecov is the one exception I had in mind, there may be others
>> that are reasonable to exclude as well
>>
>> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <jr...@google.com>
>> wrote:
>>
>>> +1 for enforcing some precommit checks as required, but making them all
>>> required would be a bit of a nightmare with how Codecov works. Any Python
>>> or Go patch gets flagged as failing codecov if its changes aren't covered
>>> at or above the current overall coverage level at master (around 75% right
>>> now.) This is a nice goal, but ultimately not every change lends itself to
>>> that level of unit testing. But I would be in favor of things like the
>>> various ValidatesRunner suites being required.
>>>
>>> Thanks,
>>>
>>> Jack McCluskey
>>>
>>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <
>>> dannymccormick@google.com> wrote:
>>>
>>>> Brian beat me by a few seconds, but I actually agree that there is a
>>>> problem here - when Damon's PR was submitted there was basically no way of
>>>> knowing that checks would break without having detailed knowledge of how
>>>> the build system works (so I'm on board with your
>>>> Subject/Background/Assessment). With that said, I disagree with the
>>>> recommended path forward - our automation failed us here and I don't think
>>>> the correct response is to make our manual validation better. That shifts
>>>> the burden of verifying correctness from robots (easy to validate,
>>>> basically free) to humans (expensive, forgetful, and hard to validate,
>>>> especially for newer contributors). Instead, I think we should fix the
>>>> underlying technical problems, which are IMO:
>>>>
>>>> 1) Flaky tests unrelated to the PR caused it to fail
>>>> 2) There were no guardrails in place to avoid merging a PR that wasn't
>>>> fully validated.
>>>>
>>>> Things that would've made this better which I think we should focus our
>>>> attention on:
>>>>
>>>> 1) (easy) Split the linting task out from the test step.
>>>> 2) This one is more controversial, but I would vote we start enforcing
>>>> some (or all?) precommit checks as required. I don't really blame anyone
>>>> for merging this PR since it had repeatedly run into unrelated flakes, but
>>>> that's kinda the point - we need to feel some of that pain to avoid
>>>> accidentally merging bad code. Blocking PRs on failed precommit suites
>>>> would force us to deal with these kinds of flakes and incentivize a lot of
>>>> good behavior in the repo (and disincentivize bad tests).
>>>>
>>>> Thanks,
>>>> Danny
>>>>
>>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <bh...@google.com>
>>>> wrote:
>>>>
>>>>> Hi Damon,
>>>>> I think our CI system was working as intended for your PR. We select
>>>>> which PreCommits to run based on the files that are edited, and the
>>>>> author/reviewer should collaborate to get those green before merging the
>>>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>>>> failing gradle tasks and run them locally to reproduce.
>>>>>
>>>>> It looks like what broke down in this case was that the PR was
>>>>> accidentally merged despite the failing Java PreCommit run.
>>>>>
>>>>> Brian
>>>>>
>>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <
>>>>> damondouglas@google.com> wrote:
>>>>>
>>>>>> I write this to prevent a recent experience I encountered
>>>>>> contributing to the Apache Beam repository.  To address a system issue, I
>>>>>> write in SBAR
>>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>>>
>>>>>> *Subject*
>>>>>>
>>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953> that
>>>>>> was subsequently reverted <https://github.com/apache/beam/pull/22044>
>>>>>> .
>>>>>>
>>>>>> *Background*
>>>>>>
>>>>>> Despite passing the following gradle tasks, the PreCommit_Java_Commit failed
>>>>>> with a test unrelated to the code in the PR
>>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>>>> issues and failed style checks.
>>>>>>
>>>>>> ./gradlew rat
>>>>>> ./gradlew spotlessCheck
>>>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>>>
>>>>>> *Assessment*
>>>>>>
>>>>>> Output from the following gradle commands did not surface prior to
>>>>>> the PR merge.
>>>>>>
>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>>>
>>>>>> Both the aforementioned failing PreCommit_Java_Commit test and a lack
>>>>>> of knowledge contributed to the root cause.
>>>>>>
>>>>>> *Recommendation / Proposal*
>>>>>>
>>>>>> I propose updating PR template description
>>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>>>> PR.  Toward this aim:
>>>>>>
>>>>>> 1.  Four separate GitHub issues in https://github.com/apache/beam
>>>>>> representative of *common*, and language specific: *Java*, *Python*,
>>>>>> and *Go*, prescribed gradle tasks
>>>>>> 2.  Don't reply to this email with specifics on how; delegation of
>>>>>> comments/discussion within the relevant context of the GitHub issues in 1
>>>>>> 3.  PRs that fix the GitHub issues in 1
>>>>>>
>>>>>> --
>>>>>>
>>>>>>
>>>>>>
>>>>>> *Damon Douglas*
>>>>>>
>>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>>>
>>>>>> damondouglas@google.com
>>>>>>
>>>>>

Re: Update PR description template

Posted by Danny McCormick <da...@google.com>.
Brian, you keep beating my emails by an instant :) I think codecov provides
value independent of blocking a PR - it is a good sanity check (e.g. if
codecov drops from 50% to 5%) and it gives us a nice metric to track over
time.

On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick <da...@google.com>
wrote:

> Yeah, codecov is the one exception I had in mind, there may be others that
> are reasonable to exclude as well
>
> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <jr...@google.com>
> wrote:
>
>> +1 for enforcing some precommit checks as required, but making them all
>> required would be a bit of a nightmare with how Codecov works. Any Python
>> or Go patch gets flagged as failing codecov if its changes aren't covered
>> at or above the current overall coverage level at master (around 75% right
>> now.) This is a nice goal, but ultimately not every change lends itself to
>> that level of unit testing. But I would be in favor of things like the
>> various ValidatesRunner suites being required.
>>
>> Thanks,
>>
>> Jack McCluskey
>>
>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <
>> dannymccormick@google.com> wrote:
>>
>>> Brian beat me by a few seconds, but I actually agree that there is a
>>> problem here - when Damon's PR was submitted there was basically no way of
>>> knowing that checks would break without having detailed knowledge of how
>>> the build system works (so I'm on board with your
>>> Subject/Background/Assessment). With that said, I disagree with the
>>> recommended path forward - our automation failed us here and I don't think
>>> the correct response is to make our manual validation better. That shifts
>>> the burden of verifying correctness from robots (easy to validate,
>>> basically free) to humans (expensive, forgetful, and hard to validate,
>>> especially for newer contributors). Instead, I think we should fix the
>>> underlying technical problems, which are IMO:
>>>
>>> 1) Flaky tests unrelated to the PR caused it to fail
>>> 2) There were no guardrails in place to avoid merging a PR that wasn't
>>> fully validated.
>>>
>>> Things that would've made this better which I think we should focus our
>>> attention on:
>>>
>>> 1) (easy) Split the linting task out from the test step.
>>> 2) This one is more controversial, but I would vote we start enforcing
>>> some (or all?) precommit checks as required. I don't really blame anyone
>>> for merging this PR since it had repeatedly run into unrelated flakes, but
>>> that's kinda the point - we need to feel some of that pain to avoid
>>> accidentally merging bad code. Blocking PRs on failed precommit suites
>>> would force us to deal with these kinds of flakes and incentivize a lot of
>>> good behavior in the repo (and disincentivize bad tests).
>>>
>>> Thanks,
>>> Danny
>>>
>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <bh...@google.com>
>>> wrote:
>>>
>>>> Hi Damon,
>>>> I think our CI system was working as intended for your PR. We select
>>>> which PreCommits to run based on the files that are edited, and the
>>>> author/reviewer should collaborate to get those green before merging the
>>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>>> failing gradle tasks and run them locally to reproduce.
>>>>
>>>> It looks like what broke down in this case was that the PR was
>>>> accidentally merged despite the failing Java PreCommit run.
>>>>
>>>> Brian
>>>>
>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <da...@google.com>
>>>> wrote:
>>>>
>>>>> I write this to prevent a recent experience I encountered contributing
>>>>> to the Apache Beam repository.  To address a system issue, I write in
>>>>> SBAR
>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>>
>>>>> *Subject*
>>>>>
>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953> that was subsequently
>>>>> reverted <https://github.com/apache/beam/pull/22044>.
>>>>>
>>>>> *Background*
>>>>>
>>>>> Despite passing the following gradle tasks, the PreCommit_Java_Commit failed
>>>>> with a test unrelated to the code in the PR
>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>>> issues and failed style checks.
>>>>>
>>>>> ./gradlew rat
>>>>> ./gradlew spotlessCheck
>>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>>
>>>>> *Assessment*
>>>>>
>>>>> Output from the following gradle commands did not surface prior to the
>>>>> PR merge.
>>>>>
>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>>
>>>>> Both the aforementioned failing PreCommit_Java_Commit test and a lack
>>>>> of knowledge contributed to the root cause.
>>>>>
>>>>> *Recommendation / Proposal*
>>>>>
>>>>> I propose updating PR template description
>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>>> PR.  Toward this aim:
>>>>>
>>>>> 1.  Four separate GitHub issues in https://github.com/apache/beam
>>>>> representative of *common*, and language specific: *Java*, *Python*,
>>>>> and *Go*, prescribed gradle tasks
>>>>> 2.  Don't reply to this email with specifics on how; delegation of
>>>>> comments/discussion within the relevant context of the GitHub issues in 1
>>>>> 3.  PRs that fix the GitHub issues in 1
>>>>>
>>>>> --
>>>>>
>>>>>
>>>>>
>>>>> *Damon Douglas*
>>>>>
>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>>
>>>>> damondouglas@google.com
>>>>>
>>>>

Re: Update PR description template

Posted by Danny McCormick <da...@google.com>.
Yeah, codecov is the one exception I had in mind, there may be others that
are reasonable to exclude as well

On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <jr...@google.com>
wrote:

> +1 for enforcing some precommit checks as required, but making them all
> required would be a bit of a nightmare with how Codecov works. Any Python
> or Go patch gets flagged as failing codecov if its changes aren't covered
> at or above the current overall coverage level at master (around 75% right
> now.) This is a nice goal, but ultimately not every change lends itself to
> that level of unit testing. But I would be in favor of things like the
> various ValidatesRunner suites being required.
>
> Thanks,
>
> Jack McCluskey
>
> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <da...@google.com>
> wrote:
>
>> Brian beat me by a few seconds, but I actually agree that there is a
>> problem here - when Damon's PR was submitted there was basically no way of
>> knowing that checks would break without having detailed knowledge of how
>> the build system works (so I'm on board with your
>> Subject/Background/Assessment). With that said, I disagree with the
>> recommended path forward - our automation failed us here and I don't think
>> the correct response is to make our manual validation better. That shifts
>> the burden of verifying correctness from robots (easy to validate,
>> basically free) to humans (expensive, forgetful, and hard to validate,
>> especially for newer contributors). Instead, I think we should fix the
>> underlying technical problems, which are IMO:
>>
>> 1) Flaky tests unrelated to the PR caused it to fail
>> 2) There were no guardrails in place to avoid merging a PR that wasn't
>> fully validated.
>>
>> Things that would've made this better which I think we should focus our
>> attention on:
>>
>> 1) (easy) Split the linting task out from the test step.
>> 2) This one is more controversial, but I would vote we start enforcing
>> some (or all?) precommit checks as required. I don't really blame anyone
>> for merging this PR since it had repeatedly run into unrelated flakes, but
>> that's kinda the point - we need to feel some of that pain to avoid
>> accidentally merging bad code. Blocking PRs on failed precommit suites
>> would force us to deal with these kinds of flakes and incentivize a lot of
>> good behavior in the repo (and disincentivize bad tests).
>>
>> Thanks,
>> Danny
>>
>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <bh...@google.com>
>> wrote:
>>
>>> Hi Damon,
>>> I think our CI system was working as intended for your PR. We select
>>> which PreCommits to run based on the files that are edited, and the
>>> author/reviewer should collaborate to get those green before merging the
>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>> failing gradle tasks and run them locally to reproduce.
>>>
>>> It looks like what broke down in this case was that the PR was
>>> accidentally merged despite the failing Java PreCommit run.
>>>
>>> Brian
>>>
>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <da...@google.com>
>>> wrote:
>>>
>>>> I write this to prevent a recent experience I encountered contributing
>>>> to the Apache Beam repository.  To address a system issue, I write in
>>>> SBAR
>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>
>>>> *Subject*
>>>>
>>>> I submitted a PR <https://github.com/apache/beam/pull/21953> that was subsequently
>>>> reverted <https://github.com/apache/beam/pull/22044>.
>>>>
>>>> *Background*
>>>>
>>>> Despite passing the following gradle tasks, the PreCommit_Java_Commit failed
>>>> with a test unrelated to the code in the PR
>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>> issues and failed style checks.
>>>>
>>>> ./gradlew rat
>>>> ./gradlew spotlessCheck
>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>
>>>> *Assessment*
>>>>
>>>> Output from the following gradle commands did not surface prior to the
>>>> PR merge.
>>>>
>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>
>>>> Both the aforementioned failing PreCommit_Java_Commit test and a lack
>>>> of knowledge contributed to the root cause.
>>>>
>>>> *Recommendation / Proposal*
>>>>
>>>> I propose updating PR template description
>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>> PR.  Toward this aim:
>>>>
>>>> 1.  Four separate GitHub issues in https://github.com/apache/beam
>>>> representative of *common*, and language specific: *Java*, *Python*,
>>>> and *Go*, prescribed gradle tasks
>>>> 2.  Don't reply to this email with specifics on how; delegation of
>>>> comments/discussion within the relevant context of the GitHub issues in 1
>>>> 3.  PRs that fix the GitHub issues in 1
>>>>
>>>> --
>>>>
>>>>
>>>>
>>>> *Damon Douglas*
>>>>
>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>
>>>> damondouglas@google.com
>>>>
>>>

Re: Update PR description template

Posted by Brian Hulette <bh...@google.com>.
> when Damon's PR was submitted there was basically no way of knowing that
checks would break

I was confused on this point - the context I was missing is that the test
failure was a flake, masking the checkstyle issues. I thought it was just a
surprising breakage.

Anyway I agree with both of Danny's proposed actions. An argument against
(1) for the Java PreCommit might be longer builds due to fewer cache hits?

A counterargument to Jack's point - if we don't think the codecov check is
worth blocking a PR over, is it worth having?

On Mon, Jun 27, 2022 at 1:59 PM Jack McCluskey <jr...@google.com>
wrote:

> +1 for enforcing some precommit checks as required, but making them all
> required would be a bit of a nightmare with how Codecov works. Any Python
> or Go patch gets flagged as failing codecov if its changes aren't covered
> at or above the current overall coverage level at master (around 75% right
> now.) This is a nice goal, but ultimately not every change lends itself to
> that level of unit testing. But I would be in favor of things like the
> various ValidatesRunner suites being required.
>
> Thanks,
>
> Jack McCluskey
>
> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <da...@google.com>
> wrote:
>
>> Brian beat me by a few seconds, but I actually agree that there is a
>> problem here - when Damon's PR was submitted there was basically no way of
>> knowing that checks would break without having detailed knowledge of how
>> the build system works (so I'm on board with your
>> Subject/Background/Assessment). With that said, I disagree with the
>> recommended path forward - our automation failed us here and I don't think
>> the correct response is to make our manual validation better. That shifts
>> the burden of verifying correctness from robots (easy to validate,
>> basically free) to humans (expensive, forgetful, and hard to validate,
>> especially for newer contributors). Instead, I think we should fix the
>> underlying technical problems, which are IMO:
>>
>> 1) Flaky tests unrelated to the PR caused it to fail
>> 2) There were no guardrails in place to avoid merging a PR that wasn't
>> fully validated.
>>
>> Things that would've made this better which I think we should focus our
>> attention on:
>>
>> 1) (easy) Split the linting task out from the test step.
>> 2) This one is more controversial, but I would vote we start enforcing
>> some (or all?) precommit checks as required. I don't really blame anyone
>> for merging this PR since it had repeatedly run into unrelated flakes, but
>> that's kinda the point - we need to feel some of that pain to avoid
>> accidentally merging bad code. Blocking PRs on failed precommit suites
>> would force us to deal with these kinds of flakes and incentivize a lot of
>> good behavior in the repo (and disincentivize bad tests).
>>
>> Thanks,
>> Danny
>>
>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <bh...@google.com>
>> wrote:
>>
>>> Hi Damon,
>>> I think our CI system was working as intended for your PR. We select
>>> which PreCommits to run based on the files that are edited, and the
>>> author/reviewer should collaborate to get those green before merging the
>>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>>> failing gradle tasks and run them locally to reproduce.
>>>
>>> It looks like what broke down in this case was that the PR was
>>> accidentally merged despite the failing Java PreCommit run.
>>>
>>> Brian
>>>
>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <da...@google.com>
>>> wrote:
>>>
>>>> I write this to prevent a recent experience I encountered contributing
>>>> to the Apache Beam repository.  To address a system issue, I write in
>>>> SBAR
>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>> format for clarity.  If there are no objections to the recommendation, I'm
>>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>>
>>>> *Subject*
>>>>
>>>> I submitted a PR <https://github.com/apache/beam/pull/21953> that was subsequently
>>>> reverted <https://github.com/apache/beam/pull/22044>.
>>>>
>>>> *Background*
>>>>
>>>> Despite passing the following gradle tasks, the PreCommit_Java_Commit failed
>>>> with a test unrelated to the code in the PR
>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>> The PR was merged and then reverted due to discovered vendor dependency
>>>> issues and failed style checks.
>>>>
>>>> ./gradlew rat
>>>> ./gradlew spotlessCheck
>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>
>>>> *Assessment*
>>>>
>>>> Output from the following gradle commands did not surface prior to the
>>>> PR merge.
>>>>
>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>
>>>> Both the aforementioned failing PreCommit_Java_Commit test and a lack
>>>> of knowledge contributed to the root cause.
>>>>
>>>> *Recommendation / Proposal*
>>>>
>>>> I propose updating PR template description
>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>>> PR.  Toward this aim:
>>>>
>>>> 1.  Four separate GitHub issues in https://github.com/apache/beam
>>>> representative of *common*, and language specific: *Java*, *Python*,
>>>> and *Go*, prescribed gradle tasks
>>>> 2.  Don't reply to this email with specifics on how; delegation of
>>>> comments/discussion within the relevant context of the GitHub issues in 1
>>>> 3.  PRs that fix the GitHub issues in 1
>>>>
>>>> --
>>>>
>>>>
>>>>
>>>> *Damon Douglas*
>>>>
>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>
>>>> damondouglas@google.com
>>>>
>>>

Re: Update PR description template

Posted by Jack McCluskey <jr...@google.com>.
+1 for enforcing some precommit checks as required, but making them all
required would be a bit of a nightmare with how Codecov works. Any Python
or Go patch gets flagged as failing codecov if its changes aren't covered
at or above the current overall coverage level at master (around 75% right
now.) This is a nice goal, but ultimately not every change lends itself to
that level of unit testing. But I would be in favor of things like the
various ValidatesRunner suites being required.

Thanks,

Jack McCluskey

On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <da...@google.com>
wrote:

> Brian beat me by a few seconds, but I actually agree that there is a
> problem here - when Damon's PR was submitted there was basically no way of
> knowing that checks would break without having detailed knowledge of how
> the build system works (so I'm on board with your
> Subject/Background/Assessment). With that said, I disagree with the
> recommended path forward - our automation failed us here and I don't think
> the correct response is to make our manual validation better. That shifts
> the burden of verifying correctness from robots (easy to validate,
> basically free) to humans (expensive, forgetful, and hard to validate,
> especially for newer contributors). Instead, I think we should fix the
> underlying technical problems, which are IMO:
>
> 1) Flaky tests unrelated to the PR caused it to fail
> 2) There were no guardrails in place to avoid merging a PR that wasn't
> fully validated.
>
> Things that would've made this better which I think we should focus our
> attention on:
>
> 1) (easy) Split the linting task out from the test step.
> 2) This one is more controversial, but I would vote we start enforcing
> some (or all?) precommit checks as required. I don't really blame anyone
> for merging this PR since it had repeatedly run into unrelated flakes, but
> that's kinda the point - we need to feel some of that pain to avoid
> accidentally merging bad code. Blocking PRs on failed precommit suites
> would force us to deal with these kinds of flakes and incentivize a lot of
> good behavior in the repo (and disincentivize bad tests).
>
> Thanks,
> Danny
>
> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <bh...@google.com> wrote:
>
>> Hi Damon,
>> I think our CI system was working as intended for your PR. We select
>> which PreCommits to run based on the files that are edited, and the
>> author/reviewer should collaborate to get those green before merging the
>> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
>> failing gradle tasks and run them locally to reproduce.
>>
>> It looks like what broke down in this case was that the PR was
>> accidentally merged despite the failing Java PreCommit run.
>>
>> Brian
>>
>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <da...@google.com>
>> wrote:
>>
>>> I write this to prevent a recent experience I encountered contributing
>>> to the Apache Beam repository.  To address a system issue, I write in
>>> SBAR
>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>> format for clarity.  If there are no objections to the recommendation, I'm
>>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>>
>>> *Subject*
>>>
>>> I submitted a PR <https://github.com/apache/beam/pull/21953> that was subsequently
>>> reverted <https://github.com/apache/beam/pull/22044>.
>>>
>>> *Background*
>>>
>>> Despite passing the following gradle tasks, the PreCommit_Java_Commit failed
>>> with a test unrelated to the code in the PR
>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>> The PR was merged and then reverted due to discovered vendor dependency
>>> issues and failed style checks.
>>>
>>> ./gradlew rat
>>> ./gradlew spotlessCheck
>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>
>>> *Assessment*
>>>
>>> Output from the following gradle commands did not surface prior to the
>>> PR merge.
>>>
>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>
>>> Both the aforementioned failing PreCommit_Java_Commit test and a lack of
>>> knowledge contributed to the root cause.
>>>
>>> *Recommendation / Proposal*
>>>
>>> I propose updating PR template description
>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>>> prescribes gradle tasks to be successfully executed prior to submitting a
>>> PR.  Toward this aim:
>>>
>>> 1.  Four separate GitHub issues in https://github.com/apache/beam
>>> representative of *common*, and language specific: *Java*, *Python*,
>>> and *Go*, prescribed gradle tasks
>>> 2.  Don't reply to this email with specifics on how; delegation of
>>> comments/discussion within the relevant context of the GitHub issues in 1
>>> 3.  PRs that fix the GitHub issues in 1
>>>
>>> --
>>>
>>>
>>>
>>> *Damon Douglas*
>>>
>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>
>>> damondouglas@google.com
>>>
>>

Re: Update PR description template

Posted by Danny McCormick <da...@google.com>.
Brian beat me by a few seconds, but I actually agree that there is a
problem here - when Damon's PR was submitted there was basically no way of
knowing that checks would break without having detailed knowledge of how
the build system works (so I'm on board with your
Subject/Background/Assessment). With that said, I disagree with the
recommended path forward - our automation failed us here and I don't think
the correct response is to make our manual validation better. That shifts
the burden of verifying correctness from robots (easy to validate,
basically free) to humans (expensive, forgetful, and hard to validate,
especially for newer contributors). Instead, I think we should fix the
underlying technical problems, which are IMO:

1) Flaky tests unrelated to the PR caused it to fail
2) There were no guardrails in place to avoid merging a PR that wasn't
fully validated.

Things that would've made this better which I think we should focus our
attention on:

1) (easy) Split the linting task out from the test step.
2) This one is more controversial, but I would vote we start enforcing some
(or all?) precommit checks as required. I don't really blame anyone for
merging this PR since it had repeatedly run into unrelated flakes, but
that's kinda the point - we need to feel some of that pain to avoid
accidentally merging bad code. Blocking PRs on failed precommit suites
would force us to deal with these kinds of flakes and incentivize a lot of
good behavior in the repo (and disincentivize bad tests).

Thanks,
Danny

On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <bh...@google.com> wrote:

> Hi Damon,
> I think our CI system was working as intended for your PR. We select which
> PreCommits to run based on the files that are edited, and the
> author/reviewer should collaborate to get those green before merging the
> PR. Sometimes that will involve inspecting the logs in Jenkins to find the
> failing gradle tasks and run them locally to reproduce.
>
> It looks like what broke down in this case was that the PR was
> accidentally merged despite the failing Java PreCommit run.
>
> Brian
>
> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <da...@google.com>
> wrote:
>
>> I write this to prevent a recent experience I encountered contributing to
>> the Apache Beam repository.  To address a system issue, I write in SBAR
>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>> format for clarity.  If there are no objections to the recommendation, I'm
>> happy to fulfill the first recommendation i.e. create GitHub issues.
>>
>> *Subject*
>>
>> I submitted a PR <https://github.com/apache/beam/pull/21953> that was subsequently
>> reverted <https://github.com/apache/beam/pull/22044>.
>>
>> *Background*
>>
>> Despite passing the following gradle tasks, the PreCommit_Java_Commit failed
>> with a test unrelated to the code in the PR
>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>> The PR was merged and then reverted due to discovered vendor dependency
>> issues and failed style checks.
>>
>> ./gradlew rat
>> ./gradlew spotlessCheck
>> ./gradlew sdks:java:io:google-cloud-platform:test
>>
>> *Assessment*
>>
>> Output from the following gradle commands did not surface prior to the PR
>> merge.
>>
>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>
>> Both the aforementioned failing PreCommit_Java_Commit test and a lack of
>> knowledge contributed to the root cause.
>>
>> *Recommendation / Proposal*
>>
>> I propose updating PR template description
>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
>> prescribes gradle tasks to be successfully executed prior to submitting a
>> PR.  Toward this aim:
>>
>> 1.  Four separate GitHub issues in https://github.com/apache/beam
>> representative of *common*, and language specific: *Java*, *Python*, and
>> *Go*, prescribed gradle tasks
>> 2.  Don't reply to this email with specifics on how; delegation of
>> comments/discussion within the relevant context of the GitHub issues in 1
>> 3.  PRs that fix the GitHub issues in 1
>>
>> --
>>
>>
>>
>> *Damon Douglas*
>>
>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>
>> damondouglas@google.com
>>
>

Re: Update PR description template

Posted by Brian Hulette <bh...@google.com>.
Hi Damon,
I think our CI system was working as intended for your PR. We select which
PreCommits to run based on the files that are edited, and the
author/reviewer should collaborate to get those green before merging the
PR. Sometimes that will involve inspecting the logs in Jenkins to find the
failing gradle tasks and run them locally to reproduce.

It looks like what broke down in this case was that the PR was accidentally
merged despite the failing Java PreCommit run.

Brian

On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <da...@google.com>
wrote:

> I write this to prevent a recent experience I encountered contributing to
> the Apache Beam repository.  To address a system issue, I write in SBAR
> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
> format for clarity.  If there are no objections to the recommendation, I'm
> happy to fulfill the first recommendation i.e. create GitHub issues.
>
> *Subject*
>
> I submitted a PR <https://github.com/apache/beam/pull/21953> that was subsequently
> reverted <https://github.com/apache/beam/pull/22044>.
>
> *Background*
>
> Despite passing the following gradle tasks, the PreCommit_Java_Commit failed
> with a test unrelated to the code in the PR
> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
> The PR was merged and then reverted due to discovered vendor dependency
> issues and failed style checks.
>
> ./gradlew rat
> ./gradlew spotlessCheck
> ./gradlew sdks:java:io:google-cloud-platform:test
>
> *Assessment*
>
> Output from the following gradle commands did not surface prior to the PR
> merge.
>
> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>
> Both the aforementioned failing PreCommit_Java_Commit test and a lack of
> knowledge contributed to the root cause.
>
> *Recommendation / Proposal*
>
> I propose updating PR template description
> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> that
> prescribes gradle tasks to be successfully executed prior to submitting a
> PR.  Toward this aim:
>
> 1.  Four separate GitHub issues in https://github.com/apache/beam
> representative of *common*, and language specific: *Java*, *Python*, and
> *Go*, prescribed gradle tasks
> 2.  Don't reply to this email with specifics on how; delegation of
> comments/discussion within the relevant context of the GitHub issues in 1
> 3.  PRs that fix the GitHub issues in 1
>
> --
>
>
>
> *Damon Douglas*
>
> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>
> damondouglas@google.com
>