Posted to commits@beam.apache.org by ke...@apache.org on 2019/01/17 02:11:59 UTC

[beam] branch master updated: Move the contribution testing guide to the confluence wiki (#7486)

This is an automated email from the ASF dual-hosted git repository.

kenn pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
     new 1c1c7be  Move the contribution testing guide to the confluence wiki (#7486)
1c1c7be is described below

commit 1c1c7bee2361f6fc45c3c82a904419b777a4b3ef
Author: Alex Amato <aj...@google.com>
AuthorDate: Wed Jan 16 18:11:51 2019 -0800

    Move the contribution testing guide to the confluence wiki (#7486)
    
    * Move the contribution testing guide to the confluence wiki
    
    * Remove old page entirely and update links to it to go to the new wiki page
    
    * Made old testing contribution guide webpage redirect to new wiki webpage
    
    * remove redirect_in testing.md
    
    * add ignore for link checking
---
 website/Rakefile                                   |   1 +
 website/src/_includes/section-menu/contribute.html |   2 +-
 website/src/contribute/index.md                    |  10 +-
 website/src/contribute/postcommits-policies.md     |   2 +-
 website/src/contribute/precommit-triage-guide.md   |   2 +-
 website/src/contribute/ptransform-style-guide.md   |   2 +-
 website/src/contribute/testing.md                  | 416 +--------------------
 7 files changed, 12 insertions(+), 423 deletions(-)

diff --git a/website/Rakefile b/website/Rakefile
index 31ec492..14206eb 100644
--- a/website/Rakefile
+++ b/website/Rakefile
@@ -12,6 +12,7 @@ task :test do
     :check_html => true,
     :file_ignore => [/v2/],
     :url_ignore => [
+        /104.154.241.245/,  # Pre-commit dashboard is down [BEAM-6455].
         # To ignore link checking for a URL, i.e. if it is temporarily down,
         # add a URL regex below with a tracking JIRA issue. For example:
         # /example.com/, # BEAM-1234 failing due to expired SSL cert
diff --git a/website/src/_includes/section-menu/contribute.html b/website/src/_includes/section-menu/contribute.html
index 7b34c90..98417d2 100644
--- a/website/src/_includes/section-menu/contribute.html
+++ b/website/src/_includes/section-menu/contribute.html
@@ -17,7 +17,7 @@
   <span class="section-nav-list-title">Technical Docs</span>
 
   <ul class="section-nav-list">
-    <li><a href="{{ site.baseurl }}/contribute/testing/">Testing guide</a></li>
+    <li><a href="https://cwiki.apache.org/confluence/display/BEAM/Contribution+Testing+Guide">Testing guide</a></li>
     <li><a href="{{ site.baseurl }}/contribute/precommit-triage-guide/">Pre-commit Slowness Triage</a></li>
     <li><a href="{{ site.baseurl }}/contribute/ptransform-style-guide/">PTransform style guide</a></li>
     <li><a href="{{ site.baseurl }}/contribute/runner-guide/">Runner authoring guide</a></li>
diff --git a/website/src/contribute/index.md b/website/src/contribute/index.md
index 865cd8c..011cfe2 100644
--- a/website/src/contribute/index.md
+++ b/website/src/contribute/index.md
@@ -101,7 +101,7 @@ To contribute code, you need
    [examples](https://s.apache.org/beam-design-docs)) and email it to the dev@ mailing list.
 
 ### Development Setup
-   
+
 1. If you need help with git forking, cloning, branching, committing, pull requests, and squashing commits, see
    [Git workflow tips](https://cwiki.apache.org/confluence/display/BEAM/Git+Tips)
 1. Familiarize yourself with gradle and the project structure. At the root of the git repository, run:
@@ -131,7 +131,7 @@ To contribute code, you need
        $ ./gradlew -p sdks/go check
        $ ./gradlew -p sdks/java/io/cassandra check
        $ ./gradlew -p runners/flink check
-       
+
 1. Now you may want to set up your preferred IDE and other aspects of your development
    environment. See the Developers' wiki for tips, guides, and FAQs on:
    - [IntelliJ](https://cwiki.apache.org/confluence/display/BEAM/Using+IntelliJ+IDE)
@@ -142,9 +142,9 @@ To contribute code, you need
    - [Gradle](https://cwiki.apache.org/confluence/display/BEAM/Gradle+Tips)
    - [Jenkins](https://cwiki.apache.org/confluence/display/BEAM/Jenkins+Tips)
    - [FAQ](https://cwiki.apache.org/confluence/display/BEAM/Contributor+FAQ)
-   
+
 ### Make your change
-   
+
 1. Make your code change. Every source file needs to include the Apache license header. Every new dependency needs to
    have an open source license [compatible](https://www.apache.org/legal/resolved.html#criteria) with Apache.
 1. Add unit tests for your change
@@ -155,7 +155,7 @@ To contribute code, you need
    Use descriptive commit messages that make it easy to identify changes and provide a clear history.
    To support efficient and quality review, avoid tiny or out-of-context changes and huge mega-changes.
 1. The pull request and any changes pushed to it will trigger [pre-commit
-   jobs](/contribute/testing/). If a test fails and appears unrelated to your
+   jobs](https://cwiki.apache.org/confluence/display/BEAM/Contribution+Testing+Guide#ContributionTestingGuide-Pre-commit). If a test fails and appears unrelated to your
    change, you can cause tests to be re-run by adding a single line comment on your
    PR
 
diff --git a/website/src/contribute/postcommits-policies.md b/website/src/contribute/postcommits-policies.md
index 5e2989a..7cf362b 100644
--- a/website/src/contribute/postcommits-policies.md
+++ b/website/src/contribute/postcommits-policies.md
@@ -94,7 +94,7 @@ If the bug is not in your code, here is how to "create a fix":
 
 ## Useful links
 
-*   [Best practices for writing tests]({{ site.baseurl }}/contribute/testing/index.html#best_practices)
+*   [Best practices for writing tests](https://cwiki.apache.org/confluence/display/BEAM/Contribution+Testing+Guide#ContributionTestingGuide-Bestpracticesforwritingtests)
 
 ## References
 
diff --git a/website/src/contribute/precommit-triage-guide.md b/website/src/contribute/precommit-triage-guide.md
index 6b48e78..33a7e08 100644
--- a/website/src/contribute/precommit-triage-guide.md
+++ b/website/src/contribute/precommit-triage-guide.md
@@ -24,7 +24,7 @@ Beam pre-commit jobs are suites of tests run automatically on Jenkins build
 machines for each pull request (PR) submitted to
 [apache/beam](https://github.com/apache/beam). For more information and the
 difference between pre-commits and post-commits, see
-[testing](/contribute/testing/).
+[testing](https://cwiki.apache.org/confluence/display/BEAM/Contribution+Testing+Guide).
 
 ## What are fast pre-commits?
 
diff --git a/website/src/contribute/ptransform-style-guide.md b/website/src/contribute/ptransform-style-guide.md
index d07529c..acf09de 100644
--- a/website/src/contribute/ptransform-style-guide.md
+++ b/website/src/contribute/ptransform-style-guide.md
@@ -181,7 +181,7 @@ Data processing is tricky, full of corner cases, and difficult to debug, because
 * To unit test `DoFn`s, `CombineFn`s, and `BoundedSource`s, consider using `DoFnTester`, `CombineFnTester`, and `SourceTestUtils` respectively which can exercise the code in non-trivial ways to flesh out potential bugs.
 * For transforms that work over unbounded collections, test their behavior in the presence of late or out-of-order data using `TestStream`.
 * Tests must pass 100% of the time, including in hostile, CPU- or network-constrained environments (continuous integration servers). Never put timing-dependent code (e.g. sleeps) into tests. Experience shows that no reasonable amount of sleeping is enough - code can be suspended for more than several seconds.
-* For detailed instructions on test code organization, see the [Beam Testing Guide]({{ site.baseurl }}/contribute/testing/).
+* For detailed instructions on test code organization, see the [Beam Testing Guide](https://cwiki.apache.org/confluence/display/BEAM/Contribution+Testing+Guide).
 
 #### Testing transform construction and validation
 
diff --git a/website/src/contribute/testing.md b/website/src/contribute/testing.md
index 7e21c22..37d21e9 100644
--- a/website/src/contribute/testing.md
+++ b/website/src/contribute/testing.md
@@ -3,428 +3,16 @@ layout: section
 title: 'Beam Testing'
 section_menu: section-menu/contribute.html
 permalink: /contribute/testing/
+redirect_to: https://cwiki.apache.org/confluence/display/BEAM/Contribution+Testing+Guide
 ---
 <!--
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at
-
 http://www.apache.org/licenses/LICENSE-2.0
-
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
--->
-
-# Beam Testing
-
-This document outlines how to write tests, which tests are appropriate where,
-and when tests are run, with some additional information about the testing
-systems at the bottom.
-
-## Testing Scenarios
-
-Ideally, all available tests should be run against a pull request (PR) before
-it's allowed to be committed to Beam's [Github](https://github.com/apache/beam)
-repo. This is not possible, however, due to a combination of time and resource
-constraints. Running all tests for each PR would take hours or even days using
-available resources, which would slow down development considerably.
-
-Thus tests are split into *pre-commit* and *post-commit* suites. Pre-commit is
-fast, while post-commit is comprehensive. As their names imply, pre-commit tests
-are run on each PR before it is committed, while post-commits run periodically
-against the master branch (i.e. on already committed PRs).
-
-Beam uses [Jenkins](https://builds.apache.org/view/A-D/view/Beam/) to run
-pre-commit and post-commit tests.
-
-### Pre-commit
-
-The pre-commit test suite verifies correctness via two testing tools: unit tests
-and end-to-end (E2E) tests. Unit tests ensure correctness at a basic level,
-while WordCount E2E tests are run against each supported SDK / runner
-combination as a smoke test, to verify that a basic level of functionality
-exists.
-
-This combination of tests strikes the appropriate tradeoff between the desire
-for short (ideally \<30m) pre-commit times and the desire to verify that PRs
-going into Beam function as intended.
-
-Pre-commit jobs are kicked off when a contributor makes a PR against the
-`apache/beam` repository. Job statuses are displayed at the bottom of the PR
-page. Clicking on “Details” will open the status page in the selected tool;
-there, you can view test status and output.
-
-### Post-commit
-
-Running in post-commit relaxes the time constraint, which gives us the ability
-to do more comprehensive testing. In post-commit we have a test
-suite running the ValidatesRunner tests against each supported runner, and
-another for running the full set of E2E tests against each runner.
-Currently supported runners are Dataflow, Flink, Spark, and Gearpump, with
-others soon to follow. Work is ongoing to enable Flink, Spark, and Gearpump in
-the E2E framework, with full support targeted for end of August 2016.
-Post-commit tests run periodically, with timing defined in their Jenkins
-configurations.
-
-Adding new post-commit E2E tests is generally as easy as adding a \*IT.java file
-to the repository - Failsafe will notice it and run it - but if you want to do
-more interesting things, take a look at
-[WordCountIT.java](https://github.com/apache/beam/blob/master/examples/java/src/test/java/org/apache/beam/examples/WordCountIT.java).
-
-Post-commit test results can be found in
-[Jenkins](https://builds.apache.org/view/A-D/view/Beam/).
-
-## Testing Types
-
-### Unit
-
-Unit tests are, in Beam as everywhere else, the first line of defense in
-ensuring software correctness. As all of the contributors to Beam understand the
-importance of testing, Beam has a robust set of unit tests, as well as testing
-coverage measurement tools, which protect the codebase from simple to moderate
-breakages. Beam Java unit tests are written in JUnit.
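-
-For reference, a minimal JUnit unit test might look like the following sketch;
-the class and the helper under test are hypothetical and only illustrate the
-usual structure:
-
-```java
-import static org.junit.Assert.assertEquals;
-
-import org.junit.Test;
-import org.junit.runner.RunWith;
-import org.junit.runners.JUnit4;
-
-/** A hypothetical unit test for a plain helper method, written in JUnit 4. */
-@RunWith(JUnit4.class)
-public class WordFormatterTest {
-
-  // Hypothetical helper under test; not part of the Beam codebase.
-  static String formatWord(String word, long count) {
-    return word + ": " + count;
-  }
-
-  @Test
-  public void formattingWordAndCountProducesColonSeparatedPair() {
-    assertEquals("flink: 3", formatWord("flink", 3));
-  }
-}
-```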
-
-#### How to run Python unit tests
-
-Python tests are written using the standard Python unittest library.
-To run all unit tests, execute the following command in the ``sdks/python``
-subdirectory
-
-```
-$ python setup.py test [-s apache_beam.package.module.TestClass.test_method]
-```
-
-We also provide a [tox](https://tox.readthedocs.io/en/latest/) configuration
-in that same directory to run all the tests, including lint, cleanly in all
-desired configurations.
-
-#### How to run Java NeedsRunner tests
-
-NeedsRunner is a category of tests that require a Beam runner. To run
-NeedsRunner tests:
-
-```
-$ ./gradlew beam-runners-direct-java:needsRunnerTests
-```
-
-To run a single NeedsRunner test, use the `--tests` filter, e.g.
-
-```
-$ ./gradlew beam-runners-direct-java:needsRunnerTests --tests org.apache.beam.sdk.transforms.MapElementsTest.testMapBasic
-```
-
-will run the `MapElementsTest.testMapBasic()` test.
-
-NeedsRunner tests in modules that are not required to build runners (e.g.
-`sdks/java/io/google-cloud-platform`) can be executed with the `gradle test`
-command:
-
-```
-$ ./gradlew beam-sdks-java-io-google-cloud-platform:test --tests org.apache.beam.sdk.io.gcp.spanner.SpannerIOWriteTest
-```
-
-### ValidatesRunner
-
-ValidatesRunner tests combine aspects of both component and end-to-end
-tests. They fulfill the typical purpose of a component test - they are meant to
-test a well-scoped piece of Beam functionality or the interactions between two
-such pieces and can be run in a component-test-type fashion against the
-DirectRunner. Additionally, they are built with the ability to run in an
-end-to-end fashion against a runner, allowing them to verify not only core Beam
-functionality, but runner functionality as well. They are more lightweight than
-a traditional end-to-end test and, because of their well-scoped nature, provide
-good signal as to what exactly is working or broken against a particular runner.
-
-### E2E
-
-End-to-End tests are meant to verify at the very highest level that the Beam
-codebase is working as intended. Because they are implemented as a thin wrapper
-around existing pipelines, they can be used to prove that the core Beam
-functionality is available. They will be used to verify runner correctness, but
-they can also be used for IO connectors and other core functionality.
-
-## Testing Systems
-
-### E2E Testing Framework
-
-The Beam end-to-end testing framework is designed in a runner-agnostic fashion
-to exercise the entire lifecycle of a Beam pipeline. We run a pipeline as a
-user would and allow it to run to completion in the same way, verifying after
-completion that it behaved as we expected. Using pipelines
-from the Beam examples, or custom-built pipelines, the framework will provide
-hooks during several pipeline lifecycle events, e.g., pipeline creation,
-pipeline success, and pipeline failure, to allow verification of pipeline state.
-
-The E2E testing framework is currently built to execute the tests in [PerfKit
-Benchmarker](https://github.com/GoogleCloudPlatform/PerfKitBenchmarker),
-invoked via Gradle tasks. Once it is determined how Python and other future
-languages will integrate into the overall build/test system (via Gradle or
-otherwise) we will adjust this. The framework provides a wrapper around actual
-Beam pipelines, enabling those pipelines to be run in an environment which
-facilitates verification of pipeline results and details.
-
-Verifiers include:
-
-*   Output verification. Output verifiers ensure that the pipeline has produced
-    the expected output. Current verifiers check text-based output, but future
-    verifiers could support other output such as BigQuery and Datastore.
-*   Aggregator verification. Aggregator verifiers ensure that the user-defined
-    aggregators present in the pipelines under test finish in the expected
-    state.
-
-The E2E framework will support running on various different configurations of
-environments. We currently provide the ability to run against the DirectRunner,
-against a local Spark instance, a local Flink instance, and against the Google
-Cloud Dataflow service.
-
-### ValidatesRunner Tests
-
-ValidatesRunner tests are tests built to use the Beam TestPipeline class,
-which enables test authors to write simple functionality verification. They are
-meant to use some of the built-in utilities of the SDK, namely PAssert, to
-verify that the simple pipelines they run end in the correct state.
-
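-As a rough sketch (the transform and values here are illustrative, not taken
-from the Beam test suite), such a test typically looks like:
-
-```java
-@Rule
-public final transient TestPipeline pipeline = TestPipeline.create();
-
-@Test
-@Category(ValidatesRunner.class)
-public void testCreateProducesExpectedElements() {
-  PCollection<String> output =
-      pipeline.apply(Create.of("a", "b", "c").withCoder(StringUtf8Coder.of()));
-
-  // PAssert verifies, on the runner under test, that the pipeline ends in the
-  // expected state.
-  PAssert.that(output).containsInAnyOrder("a", "b", "c");
-
-  pipeline.run().waitUntilFinish();
-}
-```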
-
-### Effective use of the TestPipeline JUnit rule
-
-`TestPipeline` is a JUnit rule designed to facilitate testing pipelines.
-In combination with `PAssert`, the two can be used for testing and
-writing assertions over pipelines. However, in order for these assertions
-to be effective, the constructed pipeline **must** be run by a pipeline
-runner. If the pipeline is not run (i.e., executed) then the
-constructed `PAssert` statements will not be triggered, and will thus
-be ineffective.
-
-To prevent such cases, `TestPipeline` has some protection mechanisms in place.
-
-__Abandoned node detection (performed automatically)__
-
-Abandoned nodes are `PTransforms`, `PAsserts` included, that were not
-executed by the pipeline runner. Abandoned nodes are most likely to occur
-due to one of the following scenarios:
- 1. Lack of a `pipeline.run()` statement at the end of a test.
- 2. Addition of `PTransform`s  after the pipeline has already run.
-
-Abandoned node detection is *automatically enabled* when a real pipeline
-runner (i.e. not a `CrashingRunner`) and/or a
-`@NeedsRunner` / `@ValidatesRunner` annotation are detected.
-
-Consider the following test:
-
-```java
-// Note the @Rule annotation here
-@Rule
-public final transient TestPipeline pipeline = TestPipeline.create();
-
-@Test
-@Category(NeedsRunner.class)
-public void myPipelineTest() throws Exception {
-
-  final PCollection<String> pCollection =
-    pipeline
-      .apply("Create", Create.of(WORDS).withCoder(StringUtf8Coder.of()))
-      .apply(
-          "Map1",
-          MapElements.via(
-              new SimpleFunction<String, String>() {
-
-                @Override
-                public String apply(final String input) {
-                  return WHATEVER;
-                }
-              }));
-
-  PAssert.that(pCollection).containsInAnyOrder(WHATEVER);
-
-  /* ERROR: pipeline.run() is missing, PAsserts are ineffective */
-}
-```
-
-```py
-# The suggested pattern of using pipelines as targets of with statements
-# eliminates the possibility of this kind of error, and with it the need
-# for a framework to catch it.
-
-with beam.Pipeline(...) as p:
-    [...arbitrary construction...]
-    # p.run() is automatically called on successfully exiting the context
-```
-
-The `PAssert` at the end of this test method will not be executed, since
-`pipeline` is never run, making this test ineffective. If this test method
-is run using an actual pipeline runner, an exception will be thrown
-indicating that there was no `run()` invocation in the test.
-
-Exceptions that are thrown prior to executing a pipeline will fail
-the test unless handled by an `ExpectedException` rule.
-
-Consider the following test:  
-
-```java
-// Note the @Rule annotations here
-@Rule
-public final transient TestPipeline pipeline = TestPipeline.create();
-
-@Rule
-public final transient ExpectedException thrown = ExpectedException.none();
-
-@Test
-public void testReadingFailsTableDoesNotExist() throws Exception {
-  final String table = "TEST-TABLE";
-
-  BigtableIO.Read read =
-      BigtableIO.read()
-          .withBigtableOptions(BIGTABLE_OPTIONS)
-          .withTableId(table)
-          .withBigtableService(service);
-
-  // Exception will be thrown by read.validate() when read is applied.
-  thrown.expect(IllegalArgumentException.class);
-  thrown.expectMessage(String.format("Table %s does not exist", table));
-
-  pipeline.apply(read);
-}
-```
-
-```py
-# Unneeded in Beam's Python SDK.
-```  
-
-The application of the `read` transform throws an exception, which is then
-handled by the `thrown` `ExpectedException` rule.
-In light of this exception, the fact that this test has abandoned nodes
-(the `read` transform) does not play a role since the test fails before
-the pipeline would have been executed (had there been a `run()` statement).
-
-__Auto-add `pipeline.run()` (disabled by default)__
-
-A `TestPipeline` instance can be configured to auto-add a missing `run()`
-statement by setting `testPipeline.enableAutoRunIfMissing(true/false)`.
-If this feature is enabled, no exception will be thrown in case of a
-missing `run()` statement; instead, one will be added automatically.
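-
-A brief sketch of opting in, assuming the rule-based test setup shown above:
-
-```java
-@Rule
-public final transient TestPipeline pipeline = TestPipeline.create();
-
-@Before
-public void setUp() {
-  // Opt in: a missing run() at the end of a test is added automatically
-  // instead of triggering an exception.
-  pipeline.enableAutoRunIfMissing(true);
-}
-```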
-
-
-### API Surface testing
-
-The surface of an API is the set of public classes that are exposed to the
-outside world. In order to keep the API tight and avoid unnecessarily exposing
-classes, Beam provides the `ApiSurface` utility class.
-Using the `ApiSurface` class, we can assert the API surface against an
-expected set of classes.
-
-Consider the following snippet:
-```java
-@Test
-public void testMyApiSurface() throws Exception {
-
-    final Package thisPackage = getClass().getPackage();
-    final ClassLoader thisClassLoader = getClass().getClassLoader();
-
-    final ApiSurface apiSurface =
-        ApiSurface.ofPackage(thisPackage, thisClassLoader)
-            .pruningPattern("org[.]apache[.]beam[.].*Test.*")
-            .pruningPattern("org[.]apache[.]beam[.].*IT")
-            .pruningPattern("java[.]lang.*");
-
-    @SuppressWarnings("unchecked")
-    final Set<Matcher<Class<?>>> allowed =
-        ImmutableSet.of(
-            classesInPackage("org.apache.beam.x"),
-            classesInPackage("org.apache.beam.y"),
-            classesInPackage("org.apache.beam.z"),
-            Matchers.<Class<?>>equalTo(Other.class));
-
-    assertThat(apiSurface, containsOnlyClassesMatching(allowed));
-}
-```
-
-```py
-# Unsupported in Beam's Python SDK.
-```
-
-This test will fail if the set of classes exposed by `getClass().getPackage()`,
-ignoring classes that match `"org[.]apache[.]beam[.].*Test.*"`,
-`"org[.]apache[.]beam[.].*IT"`, or `"java[.]lang.*"`, contains a class that
-belongs to none of the packages `org.apache.beam.x`, `org.apache.beam.y`, and
-`org.apache.beam.z`, and is not equal to `Other.class`.
-
-## Best practices for writing tests {#best_practices}
-
-The following best practices help you to write reliable and maintainable tests.
-
-### Aim for one failure path
-
-An ideal test has one failure path. When you create your tests, minimize the
-possible reasons for a test failure. A developer can debug a problem more
-easily when there are fewer failure paths.
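-
-For example (the `parse` and `format` helpers below are hypothetical), prefer
-two focused tests over one test that can fail for unrelated reasons:
-
-```java
-// Harder to triage: a single test with two unrelated failure paths.
-@Test
-public void testParseAndFormat() {
-  assertEquals(42, parse("42"));
-  assertEquals("42", format(42));
-}
-
-// Easier to triage: each test has exactly one failure path.
-@Test
-public void parsingDecimalStringReturnsInt() {
-  assertEquals(42, parse("42"));
-}
-
-@Test
-public void formattingIntReturnsDecimalString() {
-  assertEquals("42", format(42));
-}
-```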
-
-### Avoid non-deterministic code
-
-Reliable tests are predictable and deterministic. Tests that contain
-non-deterministic code are hard to debug and are often flaky. Non-deterministic
-code includes the use of randomness, time, and multithreading.
-
-To avoid non-deterministic code, mock the corresponding methods or classes.
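-
-For instance, here is a hypothetical sketch of injecting a clock instead of
-reading the system time directly, so the test stays deterministic:
-
-```java
-// Hypothetical code under test: takes the clock as a dependency rather than
-// calling System.currentTimeMillis() directly.
-class EventTagger {
-  private final Supplier<Long> clock;
-
-  EventTagger(Supplier<Long> clock) {
-    this.clock = clock;
-  }
-
-  String tag(String event) {
-    return event + "@" + clock.get();
-  }
-}
-
-@Test
-public void taggingUsesTheInjectedClock() {
-  // A fixed clock makes the expected output deterministic.
-  EventTagger tagger = new EventTagger(() -> 1000L);
-  assertEquals("login@1000", tagger.tag("login"));
-}
-```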
-
-### Use descriptive test names
-
-Helpful test names contain details about your test, such as test parameters and
-the expected result. Ideally, a developer can read the test name and know where
-the buggy code is and how to reproduce the bug.
-
-An easy and effective way to name your methods is to use these three questions:
-
-*   What are you testing?
-*   What are the parameters of the test?
-*   What is the expected result of the test?
-
-For example, consider a scenario where you want to add a test for the
-`Divide` method:
-
-```java
-int Divide(int dividend, int divisor) {
-  return dividend / divisor;
-}
-
-...
-
-@Test
-void <--TestMethodName-->() {
-  assertThrows(ArithmeticException.class, () -> Divide(10, 0));
-}
-```
-
-If you use a simple test name, such as `testDivide()`, you are missing important
-information such as the expected action, parameter information, and expected
-test result. As a result, triaging a test failure requires you to look at the
-test implementation to see what the test does.
-
-Instead, use a name such as `invokingDivideWithDivisorEqualToZeroThrowsException()`,
-which specifies:
-
-*   the expected action of the test (`invokingDivide`)
-*   details about important parameters (the divisor is zero)
-*   the expected result (the test throws an exception)
-
-If this test fails, you can look at the descriptive test name to find the most
-probable cause of the failure. In addition, test frameworks and test result
-dashboards use the test name when reporting test results. Descriptive names
-enable contributors to look at test suite results and easily see what
-features are failing.
-
-Long method names are not a problem for test code. Test names are rarely used
-(usually when you triage and debug), and when you do need to look at a
-test, it is helpful to have descriptive names.
-
-
-### Use a pre-commit test if possible
-
-Post-commit tests validate that Beam works correctly in a broad variety of
-scenarios. The tests catch errors that are hard to predict in the design and
-implementation stages.
-
-However, we often write a test to verify a specific scenario. In this situation,
-it is usually possible to implement the test as a unit test or a component test.
-You can add your unit tests or component tests to the pre-commit test suite, and
-the pre-commit test results give you faster code health feedback during the
-development stage, when a bug is cheap to fix.
+-->
\ No newline at end of file