Posted to commits@beam.apache.org by me...@apache.org on 2018/08/20 18:01:19 UTC

[beam-site] 01/03: Add precommit policies and triage guide.

This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 6fe36cc22a3b67fa0a4d4e635a463646e607577f
Author: Udi Meiri <eh...@google.com>
AuthorDate: Thu Aug 2 18:20:51 2018 -0700

    Add precommit policies and triage guide.
    
    Also update some paragraphs regarding precommits and postcommits in the
    testing guide.
---
 src/_includes/section-menu/contribute.html  |   4 +
 src/contribute/precommit-policies.md        |  66 +++++++++++++++
 src/contribute/precommit-triage-guide.md    | 125 ++++++++++++++++++++++++++++
 src/contribute/testing.md                   |  51 +++++++-----
 src/images/precommit_durations.png          | Bin 0 -> 45673 bytes
 src/images/precommit_graph_queuing_time.png | Bin 0 -> 25809 bytes
 6 files changed, 224 insertions(+), 22 deletions(-)

diff --git a/src/_includes/section-menu/contribute.html b/src/_includes/section-menu/contribute.html
index 07affbc..7a70f62 100644
--- a/src/_includes/section-menu/contribute.html
+++ b/src/_includes/section-menu/contribute.html
@@ -25,6 +25,9 @@
 
   <ul class="section-nav-list">
     <li><a href="{{ site.baseurl }}/contribute/testing/">Testing guide</a></li>
+    <ul>
+      <li><a href="{{ site.baseurl }}/contribute/precommit-triage-guide/">Precommit Slowness Triage Guide</a></li>
+    </ul>
     <li><a href="{{ site.baseurl }}/contribute/ptransform-style-guide/">PTransform style guide</a></li>
     <li><a href="{{ site.baseurl }}/contribute/runner-guide/">Runner authoring guide</a></li>
     <li><a href="{{ site.baseurl }}/contribute/portability/">Portability Framework</a></li>
@@ -36,6 +39,7 @@
 <li>
   <span class="section-nav-list-title">Policies</span>
   <ul class="section-nav-list">
+    <li><a href="{{ site.baseurl }}/contribute/precommit-policies/">Precommit test policies</a></li>
     <li><a href="{{ site.baseurl }}/contribute/postcommits-policies/">Post-commit tests policies</a></li>
   </ul>
 </li>
diff --git a/src/contribute/precommit-policies.md b/src/contribute/precommit-policies.md
new file mode 100644
index 0000000..7261283
--- /dev/null
+++ b/src/contribute/precommit-policies.md
@@ -0,0 +1,66 @@
+---
+layout: section
+title: "Precommit Test Policies"
+permalink: /contribute/precommit-policies/
+section_menu: section-menu/contribute.html
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# Precommit test policies
+
+## Definitions
+
+- Precommit test - Any single test in a precommit test suite.
+- Precommit test suite - A collection of precommit tests that have a common
+denominator. A test suite runs in a single Jenkins job. Currently, suites are
+grouped by SDK languages, e.g., Python, Java, and Go.
+
+## Policies
+
+### Pull Requests
+
+- A PR must pass precommit tests before being committed to the main Beam repo.
+  - The relevant precommit test suites are automatically launched according to
+    PR contents.
+
+### Problems
+
+#### Breakage
+
+Breakage is when one or more tests in a precommit test suite fail or
+are flaky (fail occasionally).
+
+- Breakages should be fixed within 8 hours.
+
+#### Slowness
+
+Slowness is when the total time to run a precommit suite exceeds 30 minutes\*,
+including the time the job spends in the Jenkins queue.
+
+- Slowness should be fixed within 24 hours.
+
+\* See the [Precommit Slowness Triage
+Guide](/contribute/precommit-triage-guide/) for a precise definition of slowness
+and for information on dealing with slowness.
+
+### Problem Resolution
+
+For any problem, choose one of the following options:
+
+- Roll back the culprit PR.
+- Roll out a fix within 24 hours.
+- Disable the slow test or feature temporarily (make sure there's a tracking
+  issue to re-enable it).
+
diff --git a/src/contribute/precommit-triage-guide.md b/src/contribute/precommit-triage-guide.md
new file mode 100644
index 0000000..4fc67a8
--- /dev/null
+++ b/src/contribute/precommit-triage-guide.md
@@ -0,0 +1,125 @@
+---
+layout: section
+title: "Precommit Slowness Triage Guide"
+permalink: /contribute/precommit-triage-guide/
+section_menu: section-menu/contribute.html
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# Precommit Slowness Triage Guide
+
+Beam precommit jobs are suites of tests run automatically on Jenkins build
+machines for each pull request (PR) submitted to
+[apache/beam](https://github.com/apache/beam). For more information and the
+difference between precommits and postcommits, see
+[testing](/contribute/testing/).
+
+## What are fast precommits?
+
+Precommit tests are required to pass before a pull request (PR) may be merged.
+When these tests are slow, they slow down Beam's development process.
+
+The aim is to have 95% of precommit jobs complete within 30 minutes
+(failing or passing).
+Technically, the 95th percentile of running time should be below 30 minutes over
+the past 4 weeks, where running time is the time the job spends in the Jenkins
+queue plus the time it actually spends running.
+
+## Determining Slowness
+
+The current method for determining if precommits are slow is to look at the
+[Jupyter
+notebook](https://github.com/apache/beam/tree/master/.test-infra/jupyter)
+`precommit_job_times.ipynb`.
+
+Run the notebook. It should output a table with running times. The numbers in
+the column `totalDurationMinutes_all` and the rows with a `job_name` like `4
+weeks 95th` contain the target numbers for determining slowness.
+If any of these numbers are above 30, triaging is required.
+
+### Example
+Here's an example table of running times:
+![example precommit duration table](/images/precommit_durations.png)
+
+In this example, Go precommits are taking approximately 14 minutes, which is
+fast. Java and Python precommits are taking 78 and 32 minutes respectively,
+which is slow. Both Java and Python precommits require triage.
+
+## Triage Process
+
+1. [Search for existing
+   issues](https://issues.apache.org/jira/issues/?filter=12344461)
+1. Create a new issue if needed: [Apache
+   JIRA](https://issues.apache.org/jira/issues)
+  - Project: Beam
+  - Components: testing, anything else relevant
+  - Label: precommit
+  - Reference this page in the description.
+1. Determine where the slowness is coming from and identify issues. Open
+   additional issues if needed (such as for multiple issues).
+1. Assign the issue as appropriate, e.g., to the test's or PR's author.
+
+## Resolution
+
+It is expected that slowness is resolved promptly. See [precommit test
+policies](/contribute/precommit-policies/) for details.
+
+## Possible Causes and Solutions
+
+This section lists some starting points for fixing precommit slowness.
+
+### Jenkins
+
+Have a look at the graphs in the Jupyter notebook. Does the rise in total
+duration match the rise in queuing time? If so, the slowness might be unrelated
+to this specific precommit job.
+
+Example of when total and queuing durations rise and fall together (mostly):
+![graph of precommit times](/images/precommit_graph_queuing_time.png)
+
+Since Jenkins machines are a limited resource, other jobs can
+affect precommit queuing times. Try to figure out whether other jobs have
+recently become slower or more frequent, or whether new jobs have been introduced.
+
+Another option is to look at adding more Jenkins machines.
+
+### Slow individual tests
+
+Sometimes a precommit job is slowed down due to one or more tests. One way of
+determining if this is the case is by looking at individual test timings.
+
+Where to find individual test timings:
+
+- Look at the `Gradle Build Scan` link on the precommit job's Jenkins page. This
+  page will contain individual test timings for Java tests only (2018-08).
+- Look at the `Test Result` link on the precommit job's Jenkins page. This
+  should be available for Java and Python tests (2018-08).
+
+Sometimes tests can be made faster by refactoring. A test that spends a lot of
+time waiting (such as an integration test) could be made to run concurrently with
+the other tests.
+
+If a test is determined to be too slow to be part of precommit tests, it could
+be removed from precommit and placed in postcommit instead (but it should be in
+postcommit already). In addition, ensure that the code covered by the removed
+test is covered by a unit test in precommit.
+
+### Slow integration tests
+
+Integration test slowdowns may be caused by dependent services.
+
+## References
+
+- [Beam Fast Precommits design doc](https://docs.google.com/document/d/1udtvggmS2LTMmdwjEtZCcUQy6aQAiYTI3OrTP8CLfJM/edit?usp=sharing)
diff --git a/src/contribute/testing.md b/src/contribute/testing.md
index ef0814b..301b931 100644
--- a/src/contribute/testing.md
+++ b/src/contribute/testing.md
@@ -26,30 +26,37 @@ systems at the bottom.
 
 ## Testing Scenarios
 
-With the tools at our disposal, we have a good set of utilities which we can use
-to verify Beam correctness. To ensure an ongoing high quality of code, we use
-precommit and postcommit testing.
+Ideally, all available tests should be run against a pull request (PR) before
+it's allowed to be committed to Beam's [GitHub](https://github.com/apache/beam)
+repo. This is not possible, however, due to a combination of time and resource
+constraints. Running all tests for each PR would take hours or even days using
+available resources, which would slow down development considerably.
+
+Thus tests are split into *precommit* and *postcommit* suites. Precommit is
+fast, while postcommit is comprehensive. (Or at least that's the idea.) As their
+names imply, precommit tests are run on each PR before it is committed, while
+postcommits run periodically against the master branch (i.e. on already
+committed PRs).
+
+Beam uses [Jenkins](https://builds.apache.org/view/A-D/view/Beam/) to run
+precommit and postcommit tests.
 
 ### Precommit
 
-For precommit testing, Beam uses
-[Jenkins](https://builds.apache.org/view/A-D/view/Beam/) and a code coverage tool
-called [Coveralls](https://coveralls.io/github/apache/beam), hooked up
-to [Github](https://github.com/apache/beam), to ensure that pull
-requests meet a certain quality bar. These precommits verify correctness via two
-of the below testing tools: unit tests (with coverage monitored by Coveralls)
-and E2E tests. We run the full slate of unit tests in precommit, ensuring
-correctness at a basic level, and then run the WordCount E2E test in both batch
-and streaming (coming soon!) against each supported SDK / runner combination as
-a smoke test, to verify that a basic level of functionality exists. We think
-that this hits the appropriate tradeoff between a desire for short (ideally
-\<30m) precommit times and a desire to verify that pull requests going into Beam
-function in the way in which they are intended.
-
-Precommit tests are kicked off when a user makes a Pull Request against the
-`apache/beam` repository and the Jenkins and Coveralls statuses are displayed at
-the bottom of the pull request page. Clicking on “Details” will open the status
-page in the selected tool; there, test status and output can be viewed.
+The precommit test suite verifies correctness via two testing tools: unit tests
+and end-to-end (E2E) tests. Unit tests ensure correctness at a basic level,
+while WordCount E2E tests are run against each supported SDK / runner
+combination as a smoke test, to verify that a basic level of functionality
+exists.
+
+This combination of tests hits the appropriate tradeoff between a desire for
+short (ideally \<30m) precommit times and a desire to verify that PRs going into
+Beam function in the way in which they are intended.
+
+Precommit jobs are kicked off when a contributor makes a PR against the
+`apache/beam` repository. Job statuses are displayed at the bottom of the PR
+page. Clicking on “Details” will open the status page in the selected tool;
+there, test status and output can be viewed.
 
 ### Postcommit
 
@@ -87,7 +94,7 @@ To run all unit tests, execute the following command in the ``sdks/python``
 subdirectory
 
 ```
-python setup.py test [-s apache_beam.package.module.TestClass.test_method]
+$ python setup.py test [-s apache_beam.package.module.TestClass.test_method]
 ```
 
 We also provide a [tox](https://tox.readthedocs.io/en/latest/) configuration
diff --git a/src/images/precommit_durations.png b/src/images/precommit_durations.png
new file mode 100644
index 0000000..c659677
Binary files /dev/null and b/src/images/precommit_durations.png differ
diff --git a/src/images/precommit_graph_queuing_time.png b/src/images/precommit_graph_queuing_time.png
new file mode 100644
index 0000000..5082943
Binary files /dev/null and b/src/images/precommit_graph_queuing_time.png differ