You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@beam.apache.org by fr...@apache.org on 2016/10/20 18:11:16 UTC

[1/3] incubator-beam-site git commit: Add Blog Post on TestStream

Repository: incubator-beam-site
Updated Branches:
  refs/heads/asf-site a7be66d9e -> fdd12fa2c


Add Blog Post on TestStream

The post goes into details on the previous and expanded testing
infrastructure provided by the Beam SDK and provides examples of the
times at which elements arrive, and how that impacts the outputs
produced by a PipelineRunner.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/d4ebf2a6
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/d4ebf2a6
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/d4ebf2a6

Branch: refs/heads/asf-site
Commit: d4ebf2a67e58d2428710bc7b930d6caf62f4d630
Parents: a7be66d
Author: Thomas Groh <tg...@google.com>
Authored: Thu Oct 13 17:26:30 2016 -0700
Committer: Frances Perry <fj...@google.com>
Committed: Thu Oct 20 11:03:03 2016 -0700

----------------------------------------------------------------------
 _data/authors.yml                               |   5 +-
 _posts/2016-10-20-test-stream.md                | 309 +++++++++++++++++++
 .../blog/test-stream/elements-all-on-time.png   | Bin 0 -> 14817 bytes
 .../test-stream/elements-droppably-late.png     | Bin 0 -> 14958 bytes
 .../test-stream/elements-observably-late.png    | Bin 0 -> 16338 bytes
 .../elements-processing-speculative.png         | Bin 0 -> 17074 bytes
 .../test-stream/elements-unobservably-late.png  | Bin 0 -> 14550 bytes
 .../blog/test-stream/elements-all-on-time.png   | Bin 0 -> 14743 bytes
 .../test-stream/elements-droppably-late.png     | Bin 0 -> 14800 bytes
 .../test-stream/elements-observably-late.png    | Bin 0 -> 16219 bytes
 .../elements-processing-speculative.png         | Bin 0 -> 16987 bytes
 .../test-stream/elements-unobservably-late.png  | Bin 0 -> 14437 bytes
 12 files changed, 313 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/d4ebf2a6/_data/authors.yml
----------------------------------------------------------------------
diff --git a/_data/authors.yml b/_data/authors.yml
index 808513f..ffcb419 100644
--- a/_data/authors.yml
+++ b/_data/authors.yml
@@ -26,6 +26,9 @@ takidau:
     name: Tyler Akidau
     email: takidau@apache.org
     twitter: takidau
+tgroh:
+    name: Thomas Groh
+    email: tgroh@google.com
 jesseanderson:
     name: Jesse Anderson
-    twitter: jessetanderson
\ No newline at end of file
+    twitter: jessetanderson

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/d4ebf2a6/_posts/2016-10-20-test-stream.md
----------------------------------------------------------------------
diff --git a/_posts/2016-10-20-test-stream.md b/_posts/2016-10-20-test-stream.md
new file mode 100644
index 0000000..2bd3e79
--- /dev/null
+++ b/_posts/2016-10-20-test-stream.md
@@ -0,0 +1,309 @@
+---
+layout: post
+title:  "Testing Unbounded Pipelines in Apache Beam"
+date:   2016-10-20 10:00:00 -0800
+excerpt_separator: <!--more-->
+categories: blog
+authors:
+- tgroh
+---
+
+The Beam Programming Model unifies writing pipelines for Batch and Streaming
+pipelines. We\u2019ve recently introduced a new PTransform to write tests for
+pipelines that will be run over unbounded datasets and must handle out-of-order
+and delayed data.
+<!--more-->
+
+Watermarks, Windows and Triggers form a core part of the Beam programming model
+-- they respectively determine how your data are grouped, when your input is
+complete, and when to produce results. This is true for all pipelines,
+regardless of if they are processing bounded or unbounded inputs. If you\u2019re not
+familiar with watermarks, windowing, and triggering in the Beam model,
+[Streaming 101](https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101)
+and [Streaming 102](https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102)
+are an excellent place to get started. A key takeaway from
+these articles: in realistic streaming scenarios with intermittent failures and
+disconnected users, data can arrive out of order or be delayed. Beam\u2019s
+primitives provide a way for users to perform useful, powerful, and correct
+computations in spite of these challenges.
+
+As Beam pipeline authors, we need comprehensive tests that cover crucial
+failure scenarios and corner cases to gain real confidence that a pipeline is
+ready for production. The existing testing infrastructure within the Beam SDKs
+permits tests to be written which examine the contents of a Pipeline at
+execution time. However, writing unit tests for pipelines that may receive
+late data or trigger multiple times has historically ranged from complex to
+not possible, as pipelines that read from unbounded sources do not shut down
+without external intervention, while pipelines that read from bounded sources
+exclusively cannot test behavior with late data nor most speculative triggers.
+Without additional tools, pipelines that use custom triggers and handle
+out-of-order data could not be easily tested.
+
+This blog post introduces our new framework for writing tests for pipelines that
+handle delayed and out-of-order data in the context of the LeaderBoard pipeline
+from the Mobile Gaming example series.
+
+## LeaderBoard and the Mobile Gaming Example
+
+[LeaderBoard](https://github.com/apache/incubator-beam/blob/master/examples/java8/src/main/java/org/apache/beam/examples/complete/game/LeaderBoard.java#L177)
+is part of the [Beam mobile gaming examples](https://github.com/apache/incubator-beam/tree/master/examples/java8/src/main/java/org/apache/beam/examples/complete/game)
+(and [walkthroughs](http://beam.incubator.apache.org/use/walkthroughs/))
+which produces a continuous accounting of user and team scores. User scores are
+calculated over the lifetime of the program, while team scores are calculated
+within fixed windows with a default duration of one hour. The LeaderBoard
+pipeline produces speculative and late panes as appropriate, based on the
+configured triggering and allowed lateness of the pipeline. The expected outputs
+of the LeaderBoard pipeline vary depending on when elements arrive in relation
+to the watermark and the progress of processing time, which could not previously
+be controlled within a test.
+
+## Writing Deterministic Tests to Emulate Nondeterminism
+
+The Beam testing infrastructure provides the
+[PAssert](http://beam.incubator.apache.org/learn/sdks/javadoc/0.2.0-incubating/)
+methods, which assert properties about the contents of a PCollection from within
+a pipeline. We have expanded this infrastructure to include
+[TestStream](https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/TestStream.java),
+which is a PTransform that performs a series of events, consisting of adding
+additional elements to a pipeline, advancing the watermark of the TestStream,
+and advancing the pipeline processing time clock. TestStream permits tests which
+observe the effects of triggers on the output a pipeline produces.
+
+While executing a pipeline that reads from a TestStream, the read waits for all
+of the consequences of each event to complete before continuing on to the next
+event, ensuring that when processing time advances, triggers that are based on
+processing time fire as appropriate. With this transform, the effect of
+triggering and allowed lateness can be observed on a pipeline, including
+reactions to speculative and late panes and dropped data.
+
+## Element Timings
+
+Elements arrive either behind, with, or after the watermark, which categorizes
+them into the "early", "on-time", and "late" divisions. "Late" elements can be
+further subdivided into "unobservably", "observably", and "droppably" late,
+depending on the window to which they are assigned and the maximum allowed
+lateness, as specified by the windowing strategy. Elements that arrive with
+these timings are emitted into panes, which can be "EARLY", "ON-TIME", or
+"LATE", depending on the position of the watermark when the pane was emitted.
+
+Using TestStream, we can write tests that demonstrate that speculative panes are
+output after their trigger condition is met, that the advancing of the watermark
+causes the on-time pane to be produced, and that late-arriving data produces
+refinements when it arrives before the maximum allowed lateness, and is dropped
+after.
+
+The following examples demonstrate how you can use TestStream to provide a
+sequence of events to the Pipeline, where the arrival of elements is interspersed
+with updates to the watermark and the advance of processing time. Each of these
+events runs to completion before additional events occur.
+
+In the diagrams, the time at which events occurred in "real" (event) time
+progresses as the graph moves to the right. The time at which the pipeline
+receives them progresses as the graph goes upwards. The watermark is represented
+by the squiggly red line, and each starburst is the firing of a trigger and the
+associated pane.
+
+<img class="center-block" src="{{ "/images/blog/test-stream/elements-all-on-time.png" | prepend: site.baseurl }}" alt="Elements on the Event and Processing time axes, with the Watermark and produced panes" width="442">
+
+### Everything arrives on-time
+
+For example, if we create a TestStream where all the data arrives before the
+watermark and provide the result PCollection as input to the CalculateTeamScores
+PTransform:
+
+```
+TestStream<GameActionInfo> infos = TestStream.create(AvroCoder.of(GameActionInfo.class))
+    .addElements(new GameActionInfo("sky", "blue", 12, new Instant(0L)),
+ ����������������new GameActionInfo("navy", "blue", 3, new Instant(0L)),
+ ����������������new GameActionInfo("navy", "blue", 3, new Instant(0L).plus(Duration.standardMinutes(3))))
+ ���// Move the watermark past the end the end of the window
+    .advanceWatermarkTo(new Instant(0L).plus(TEAM_WINDOW_DURATION)
+ �������������������������������       .plus(Duration.standardMinutes(1)))
+    .advanceWatermarkToInfinity();
+
+PCollection<KV<String, Integer>> teamScores = p.apply(createEvents)
+    .apply(new CalculateTeamScores(TEAM_WINDOW_DURATION, ALLOWED_LATENESS));
+```
+
+we can then assert that the result PCollection contains elements that arrived:
+
+<img class="center-block" src="{{ "/images/blog/test-stream/elements-all-on-time.png" | prepend: site.baseurl }}" alt="Elements all arrive before the watermark, and are produced in the on-time pane" width="442">
+
+```
+// Only one value is emitted for the blue team
+PAssert.that(teamScores)
+       .inWindow(window)
+       .containsInAnyOrder(KV.of("blue", 18));
+p.run();
+```
+
+### Some elements are late, but arrive before the end of the window
+
+We can also add data to the TestStream after the watermark, but before the end
+of the window (shown below to the left of the red watermark), which demonstrates
+"unobservably late" data - that is, data that arrives late, but is promoted by
+the system to be on time, as it arrives before the watermark passes the end of
+the window
+
+```
+TestStream<GameActionInfo> infos = TestStream.create(AvroCoder.of(GameActionInfo.class))
+    .addElements(new GameActionInfo("sky", "blue", 3, new Instant(0L)),
+ �������         new GameActionInfo("navy", "blue", 3, new Instant(0L).plus(Duration.standardMinutes(3))))
+ ���// Move the watermark up to "near" the end of the window
+    .advanceWatermarkTo(new Instant(0L).plus(TEAM_WINDOW_DURATION)
+ ���������������       ����������������.minus(Duration.standardMinutes(1)))
+    .addElements(new GameActionInfo("sky", "blue", 12, Duration.ZERO))
+    .advanceWatermarkToInfinity();
+
+PCollection<KV<String, Integer>> teamScores = p.apply(createEvents)
+    .apply(new CalculateTeamScores(TEAM_WINDOW_DURATION, ALLOWED_LATENESS));
+```
+
+<img class="center-block" src="{{ "/images/blog/test-stream/elements-unobservably-late.png" | prepend: site.baseurl }}" alt="An element arrives late, but before the watermark passes the end of the window, and is produced in the on-time pane" width="442">
+
+```
+// Only one value is emitted for the blue team
+PAssert.that(teamScores)
+       .inWindow(window)
+       .containsInAnyOrder(KV.of("blue", 18));
+p.run();
+```
+
+### Elements are late, and arrive after the end of the window
+
+By advancing the watermark farther in time before adding the late data, we can
+demonstrate the triggering behavior that causes the system to emit an on-time
+pane, and then after the late data arrives, a pane that refines the result.
+
+```
+TestStream<GameActionInfo> infos = TestStream.create(AvroCoder.of(GameActionInfo.class))
+    .addElements(new GameActionInfo("sky", "blue", 3, new Instant(0L)),
+       ����������new GameActionInfo("navy", "blue", 3, new Instant(0L).plus(Duration.standardMinutes(3))))
+� ��// Move the watermark up to "near" the end of the window
+    .advanceWatermarkTo(new Instant(0L).plus(TEAM_WINDOW_DURATION)
+��������������������������������       .minus(Duration.standardMinutes(1)))
+    .addElements(new GameActionInfo("sky", "blue", 12, Duration.ZERO))
+    .advanceWatermarkToInfinity();
+
+PCollection<KV<String, Integer>> teamScores = p.apply(createEvents)
+    .apply(new CalculateTeamScores(TEAM_WINDOW_DURATION, ALLOWED_LATENESS));
+```
+
+<img class="center-block" src="{{ "/images/blog/test-stream/elements-observably-late.png" | prepend: site.baseurl }}" alt="Elements all arrive before the watermark, and are produced in the on-time pane" width="442">
+
+```
+// An on-time pane is emitted with the events that arrived before the window closed
+PAssert.that(teamScores)
+       .inOnTimePane(window)
+       .containsInAnyOrder(KV.of("blue", 6));
+// The final pane contains the late refinement
+PAssert.that(teamScores)
+       .inFinalPane(window)
+       .containsInAnyOrder(KV.of("blue", 18));
+p.run();
+```
+
+### Elements are late, and after the end of the window plus the allowed lateness
+
+If we push the watermark even further into the future, beyond the maximum
+configured allowed lateness, we can demonstrate that the late element is dropped
+by the system.
+
+```
+TestStream<GameActionInfo> infos = TestStream.create(AvroCoder.of(GameActionInfo.class))
+    .addElements(new GameActionInfo("sky", "blue", 3, Duration.ZERO),
+        ���������new GameActionInfo("navy", "blue", 3, Duration.standardMinutes(3)))
+����// Move the watermark up to "near" the end of the window
+    .advanceWatermarkTo(new Instant(0).plus(TEAM_WINDOW_DURATION)
+ ����������������������������������������.plus(ALLOWED_LATENESS)
+ ����������������������������������������.plus(Duration.standardMinutes(1)))
+    .addElements(new GameActionInfo(
+���������������������"sky",
+���������������������"blue",
+���������������������12,
+���������������������new Instant(0).plus(TEAM_WINDOW_DURATION).minus(Duration.standardMinutes(1))))
+    .advanceWatermarkToInfinity();
+
+PCollection<KV<String, Integer>> teamScores = p.apply(createEvents)
+    .apply(new CalculateTeamScores(TEAM_WINDOW_DURATION, ALLOWED_LATENESS));
+```
+
+<img class="center-block" src="{{ "/images/blog/test-stream/elements-droppably-late.png" | prepend: site.baseurl }}" alt="Elements all arrive before the watermark, and are produced in the on-time pane" width="442">
+
+```
+// An on-time pane is emitted with the events that arrived before the window closed
+PAssert.that(teamScores)
+       .inWindow(window)
+       .containsInAnyOrder(KV.of("blue", 6));
+
+p.run();
+```
+
+### Elements arrive before the end of the window, and some processing time passes
+Using additional methods, we can demonstrate the behavior of speculative
+triggers by advancing the processing time of the TestStream. If we add elements
+to an input PCollection, occasionally advancing the processing time clock, and
+apply `CalculateUserScores`
+
+```
+TestStream.create(AvroCoder.of(GameActionInfo.class))
+ ���.addElements(new GameActionInfo("scarlet", "red", 3, new Instant(0L)),
+ ��������������  new GameActionInfo("scarlet", "red", 2, new Instant(0L).plus(Duration.standardMinutes(1))))
+    .advanceProcessingTime(Duration.standardMinutes(12))
+ ���.addElements(new GameActionInfo("oxblood", "red", 2, new Instant(0L)).plus(Duration.standardSeconds(22)),
+ ��������������  new GameActionInfo("scarlet", "red", 4, new Instant(0L).plus(Duration.standardMinutes(2))))
+    .advanceProcessingTime(Duration.standardMinutes(15))
+    .advanceWatermarkToInfinity();
+
+PCollection<KV<String, Integer>> userScores =
+ ���p.apply(infos).apply(new CalculateUserScores(ALLOWED_LATENESS));
+```
+
+<img class="center-block" src="{{ "/images/blog/test-stream/elements-processing-speculative.png" | prepend: site.baseurl }}" alt="Elements all arrive before the watermark, and are produced in the on-time pane" width="442">
+
+```
+PAssert.that(userScores)
+       .inEarlyGlobalWindowPanes()
+       .containsInAnyOrder(KV.of("scarlet", 5),
+   ������������������������KV.of("scarlet", 9),
+                           KV.of("oxblood", 2));
+
+p.run();
+```
+
+## TestStream - Under the Hood
+
+TestStream relies on a pipeline concept we\u2019ve introduced, called quiescence, to
+utilize the existing runner infrastructure while providing guarantees about when
+a root transform will called by the runner. This consists of properties about
+pending elements and triggers, namely:
+
+* No trigger is permitted to fire but has not fired
+* All elements are either buffered in state or cannot progress until a side input becomes available
+
+Simplified, this means that, in the absence of an advancement in input
+watermarks or processing time, or additional elements being added to the
+pipeline, the pipeline will not make progress. Whenever the TestStream PTransform
+performs an action, the runner must not reinvoke the same instance until the
+pipeline has quiesced. This ensures that the events specified by TestStream
+happen "in-order", which ensures that input watermarks and the system clock do
+not advance ahead of the elements they hoped to hold up.
+
+The DirectRunner has been modified to use quiescence as the signal that it
+should add more work to the Pipeline, and the implementation of TestStream in
+that runner uses this fact to perform a single output per event. The DirectRunner
+implementation also directly controls the runner\u2019s system clock, ensuring that
+tests will complete promptly even if there is a multi-minute processing time
+trigger located within the pipeline.
+
+The TestStream transform is supported in the DirectRunner. For most users, tests
+written using TestPipeline and PAsserts will automatically function while using
+TestStream.
+
+## Summary
+
+The addition of TestStream alongside window and pane-specific matchers in PAssert
+has enabled the testing of Pipelines which produce speculative and late panes.
+This permits tests for all styles of pipeline to be expressed directly within the
+Java SDK. If you have questions or comments, we\u2019d love to hear them on the
+[mailing lists](http://beam.incubator.apache.org/use/mailing-lists/).

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/d4ebf2a6/content/images/blog/test-stream/elements-all-on-time.png
----------------------------------------------------------------------
diff --git a/content/images/blog/test-stream/elements-all-on-time.png b/content/images/blog/test-stream/elements-all-on-time.png
new file mode 100644
index 0000000..c16112f
Binary files /dev/null and b/content/images/blog/test-stream/elements-all-on-time.png differ

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/d4ebf2a6/content/images/blog/test-stream/elements-droppably-late.png
----------------------------------------------------------------------
diff --git a/content/images/blog/test-stream/elements-droppably-late.png b/content/images/blog/test-stream/elements-droppably-late.png
new file mode 100644
index 0000000..9a12870
Binary files /dev/null and b/content/images/blog/test-stream/elements-droppably-late.png differ

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/d4ebf2a6/content/images/blog/test-stream/elements-observably-late.png
----------------------------------------------------------------------
diff --git a/content/images/blog/test-stream/elements-observably-late.png b/content/images/blog/test-stream/elements-observably-late.png
new file mode 100644
index 0000000..a874a1d
Binary files /dev/null and b/content/images/blog/test-stream/elements-observably-late.png differ

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/d4ebf2a6/content/images/blog/test-stream/elements-processing-speculative.png
----------------------------------------------------------------------
diff --git a/content/images/blog/test-stream/elements-processing-speculative.png b/content/images/blog/test-stream/elements-processing-speculative.png
new file mode 100644
index 0000000..8d1224c
Binary files /dev/null and b/content/images/blog/test-stream/elements-processing-speculative.png differ

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/d4ebf2a6/content/images/blog/test-stream/elements-unobservably-late.png
----------------------------------------------------------------------
diff --git a/content/images/blog/test-stream/elements-unobservably-late.png b/content/images/blog/test-stream/elements-unobservably-late.png
new file mode 100644
index 0000000..f3bcfa1
Binary files /dev/null and b/content/images/blog/test-stream/elements-unobservably-late.png differ

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/d4ebf2a6/images/blog/test-stream/elements-all-on-time.png
----------------------------------------------------------------------
diff --git a/images/blog/test-stream/elements-all-on-time.png b/images/blog/test-stream/elements-all-on-time.png
new file mode 100644
index 0000000..5892e88
Binary files /dev/null and b/images/blog/test-stream/elements-all-on-time.png differ

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/d4ebf2a6/images/blog/test-stream/elements-droppably-late.png
----------------------------------------------------------------------
diff --git a/images/blog/test-stream/elements-droppably-late.png b/images/blog/test-stream/elements-droppably-late.png
new file mode 100644
index 0000000..0fadb0f
Binary files /dev/null and b/images/blog/test-stream/elements-droppably-late.png differ

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/d4ebf2a6/images/blog/test-stream/elements-observably-late.png
----------------------------------------------------------------------
diff --git a/images/blog/test-stream/elements-observably-late.png b/images/blog/test-stream/elements-observably-late.png
new file mode 100644
index 0000000..274b068
Binary files /dev/null and b/images/blog/test-stream/elements-observably-late.png differ

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/d4ebf2a6/images/blog/test-stream/elements-processing-speculative.png
----------------------------------------------------------------------
diff --git a/images/blog/test-stream/elements-processing-speculative.png b/images/blog/test-stream/elements-processing-speculative.png
new file mode 100644
index 0000000..b905d2c
Binary files /dev/null and b/images/blog/test-stream/elements-processing-speculative.png differ

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/d4ebf2a6/images/blog/test-stream/elements-unobservably-late.png
----------------------------------------------------------------------
diff --git a/images/blog/test-stream/elements-unobservably-late.png b/images/blog/test-stream/elements-unobservably-late.png
new file mode 100644
index 0000000..74c0195
Binary files /dev/null and b/images/blog/test-stream/elements-unobservably-late.png differ


[3/3] incubator-beam-site git commit: This closes #46

Posted by fr...@apache.org.
This closes #46


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/fdd12fa2
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/fdd12fa2
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/fdd12fa2

Branch: refs/heads/asf-site
Commit: fdd12fa2c8e41ba1f6c560459ba7b726884cc2ed
Parents: a7be66d 41a3691
Author: Frances Perry <fj...@google.com>
Authored: Thu Oct 20 11:10:37 2016 -0700
Committer: Frances Perry <fj...@google.com>
Committed: Thu Oct 20 11:10:37 2016 -0700

----------------------------------------------------------------------
 _data/authors.yml                               |   5 +-
 _posts/2016-10-20-test-stream.md                | 309 ++++++++++++
 content/blog/2016/10/20/test-stream.html        | 481 +++++++++++++++++++
 content/blog/index.html                         |  19 +
 content/feed.xml                                | 351 ++++++++++++--
 .../blog/test-stream/elements-all-on-time.png   | Bin 0 -> 14743 bytes
 .../test-stream/elements-droppably-late.png     | Bin 0 -> 14800 bytes
 .../test-stream/elements-observably-late.png    | Bin 0 -> 16219 bytes
 .../elements-processing-speculative.png         | Bin 0 -> 16987 bytes
 .../test-stream/elements-unobservably-late.png  | Bin 0 -> 14437 bytes
 content/index.html                              |   2 +
 .../learn/runners/capability-matrix/index.html  |   2 +-
 .../blog/test-stream/elements-all-on-time.png   | Bin 0 -> 14743 bytes
 .../test-stream/elements-droppably-late.png     | Bin 0 -> 14800 bytes
 .../test-stream/elements-observably-late.png    | Bin 0 -> 16219 bytes
 .../elements-processing-speculative.png         | Bin 0 -> 16987 bytes
 .../test-stream/elements-unobservably-late.png  | Bin 0 -> 14437 bytes
 17 files changed, 1131 insertions(+), 38 deletions(-)
----------------------------------------------------------------------



[2/3] incubator-beam-site git commit: Regenerate html.

Posted by fr...@apache.org.
Regenerate html.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/41a36913
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/41a36913
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/41a36913

Branch: refs/heads/asf-site
Commit: 41a3691329b39bfdb24b28d050832f72638abc21
Parents: d4ebf2a
Author: Frances Perry <fj...@google.com>
Authored: Thu Oct 20 11:08:17 2016 -0700
Committer: Frances Perry <fj...@google.com>
Committed: Thu Oct 20 11:08:17 2016 -0700

----------------------------------------------------------------------
 content/blog/2016/10/20/test-stream.html        | 481 +++++++++++++++++++
 content/blog/index.html                         |  19 +
 content/feed.xml                                | 351 ++++++++++++--
 .../blog/test-stream/elements-all-on-time.png   | Bin 14817 -> 14743 bytes
 .../test-stream/elements-droppably-late.png     | Bin 14958 -> 14800 bytes
 .../test-stream/elements-observably-late.png    | Bin 16338 -> 16219 bytes
 .../elements-processing-speculative.png         | Bin 17074 -> 16987 bytes
 .../test-stream/elements-unobservably-late.png  | Bin 14550 -> 14437 bytes
 content/index.html                              |   2 +
 .../learn/runners/capability-matrix/index.html  |   2 +-
 10 files changed, 818 insertions(+), 37 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/41a36913/content/blog/2016/10/20/test-stream.html
----------------------------------------------------------------------
diff --git a/content/blog/2016/10/20/test-stream.html b/content/blog/2016/10/20/test-stream.html
new file mode 100644
index 0000000..8a48ba7
--- /dev/null
+++ b/content/blog/2016/10/20/test-stream.html
@@ -0,0 +1,481 @@
+<!DOCTYPE html>
+<html lang="en">
+
+  <head>
+  <meta charset="utf-8">
+  <meta http-equiv="X-UA-Compatible" content="IE=edge">
+  <meta name="viewport" content="width=device-width, initial-scale=1">
+
+  <title>Testing Unbounded Pipelines in Apache Beam</title>
+  <meta name="description" content="The Beam Programming Model unifies writing pipelines for Batch and Streamingpipelines. We\u2019ve recently introduced a new PTransform to write tests forpipelines...">
+
+  <link rel="stylesheet" href="/styles/site.css">
+  <link rel="stylesheet" href="/css/theme.css">
+  <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js"></script>
+  <script src="/js/bootstrap.min.js"></script>
+  <link rel="canonical" href="http://beam.incubator.apache.org/blog/2016/10/20/test-stream.html" data-proofer-ignore>
+  <link rel="alternate" type="application/rss+xml" title="Apache Beam (incubating)" href="http://beam.incubator.apache.org/feed.xml">
+  <script>
+    (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+    (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
+    m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+    })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+
+    ga('create', 'UA-73650088-1', 'auto');
+    ga('send', 'pageview');
+
+  </script>
+  <link rel="shortcut icon" type="image/x-icon" href="/images/favicon.ico">
+</head>
+
+
+  <body role="document">
+
+    <nav class="navbar navbar-default navbar-fixed-top">
+  <div class="container">
+    <div class="navbar-header">
+      <a href="/" class="navbar-brand" >
+        <img alt="Brand" style="height: 25px" src="/images/beam_logo_navbar.png">
+      </a>
+      <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false" aria-controls="navbar">
+        <span class="sr-only">Toggle navigation</span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+      </button>
+    </div>
+    <div id="navbar" class="navbar-collapse collapse">
+      <ul class="nav navbar-nav">
+        <li class="dropdown">
+		  <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Use <span class="caret"></span></a>
+		  <ul class="dropdown-menu">
+			  <li><a href="/use">User Hub</a></li>
+			  <li role="separator" class="divider"></li>
+			  <li class="dropdown-header">General</li>
+			  <li><a href="/use/beam-overview/">Beam Overview</a></li>
+			  <li><a href="/use/quickstart/">Quickstart</a></li>  
+			  <li><a href="/use/releases">Release Information</a></li>
+			  <li role="separator" class="divider"></li>
+			  <li class="dropdown-header">Example Walkthroughs</li>
+			  <li><a href="/use/walkthroughs/">WordCount</a></li>
+			  <li><a href="/use/walkthroughs/">Mobile Gaming</a></li>
+			  <li role="separator" class="divider"></li>
+			  <li class="dropdown-header">Support</li>
+			  <li><a href="/use/mailing-lists/">Mailing Lists</a></li>
+              <li><a href="/use/issue-tracking/">Issue Tracking</a></li>
+			  <li><a href="http://stackoverflow.com/questions/tagged/apache-beam">Beam on StackOverflow</a></li>
+              <li><a href="http://apachebeam.slack.com">Beam Slack Channel</a></li>
+		  </ul>
+	    </li>
+        <li class="dropdown">
+		  <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Learn <span class="caret"></span></a>
+		  <ul class="dropdown-menu">
+			  <li><a href="/learn">Learner Hub</a></li>
+			  <li role="separator" class="divider"></li>
+			  <li class="dropdown-header">Beam Concepts</li>
+			  <li><a href="/learn/programming-guide/">Programming Guide</a></li>
+			  <li><a href="/learn/presentation-materials/">Presentation Materials</a></li>
+			  <li><a href="/learn/resources/">Additional Resources</a></li>
+			  <li role="separator" class="divider"></li>
+			  <li class="dropdown-header">SDKs</li>
+			  <li><a href="/learn/sdks/java/">Java SDK</a></li>
+			  <li><a href="/learn/sdks/javadoc/">Java SDK API Reference</a></li>
+			  <li role="separator" class="divider"></li>
+			  <li class="dropdown-header">Runners</li>
+			  <li><a href="/learn/runners/capability-matrix/">Capability Matrix</a></li>
+			  <li><a href="/learn/runners/direct/">Direct Runner</a></li>
+			  <li><a href="/learn/runners/flink/">Apache Flink Runner</a></li>
+			  <li><a href="/learn/runners/spark/">Apache Spark Runner</a></li>
+			  <li><a href="/learn/runners/dataflow/">Cloud Dataflow Runner</a></li>
+		  </ul>
+	    </li>
+        <li class="dropdown">
+		  <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Contribute <span class="caret"></span></a>
+		  <ul class="dropdown-menu">
+			  <li><a href="/contribute">Contributor Hub</a></li>
+			  <li role="separator" class="divider"></li>
+			  <li class="dropdown-header">Basics</li>
+			  <li><a href="/contribute/contribution-guide/">Contribution Guide</a></li>
+			  <li><a href="/contribute/work-in-progress/">Work In Progress</a></li>
+			  <li><a href="/use/mailing-lists/">Mailing Lists</a></li>
+              <li><a href="/contribute/source-repository/">Source Repository</a></li>
+              <li><a href="/use/issue-tracking/">Issue Tracking</a></li>
+              <li role="separator" class="divider"></li>
+			  <li class="dropdown-header">Technical References</li>
+			  <li><a href="/contribute/testing/">Testing</a></li>
+              <li><a href="/contribute/design-principles/">Design Principles</a></li>
+			  <li><a href="https://goo.gl/nk5OM0">Technical Vision</a></li>
+		  </ul>
+	    </li>
+        <li><a href="/blog">Blog</a></li>
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Project<span class="caret"></span></a>
+          <ul class="dropdown-menu">
+            <li><a href="/project/logos/">Logos and design</a></li>
+            <li><a href="/project/public-meetings/">Public Meetings</a></li>
+			<li><a href="/project/team/">Team</a></li>
+          </ul>
+        </li>
+      </ul>
+      <ul class="nav navbar-nav navbar-right">
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false"><img src="https://www.apache.org/foundation/press/kit/feather_small.png" alt="Apache Logo" style="height:24px;">Apache Software Foundation<span class="caret"></span></a>
+          <ul class="dropdown-menu dropdown-menu-right">
+            <li><a href="http://www.apache.org/">ASF Homepage</a></li>
+            <li><a href="http://www.apache.org/licenses/">License</a></li>
+            <li><a href="http://www.apache.org/security/">Security</a></li>
+            <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+            <li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>
+            <li><a href="https://www.apache.org/foundation/policies/conduct">Code of Conduct</a></li>
+          </ul>
+        </li>
+      </ul>
+    </div><!--/.nav-collapse -->
+  </div>
+</nav>
+
+
+<link rel="stylesheet" href="">
+
+
+    <div class="container" role="main">
+
+      <div class="row">
+        
+
+<article class="post" itemscope itemtype="http://schema.org/BlogPosting">
+
+  <header class="post-header">
+    <h1 class="post-title" itemprop="name headline">Testing Unbounded Pipelines in Apache Beam</h1>
+    <p class="post-meta"><time datetime="2016-10-20T11:00:00-07:00" itemprop="datePublished">Oct 20, 2016</time> \u2022  Thomas Groh 
+</p>
+  </header>
+
+  <div class="post-content" itemprop="articleBody">
+    <p>The Beam Programming Model unifies writing pipelines for Batch and Streaming
+pipelines. We\u2019ve recently introduced a new PTransform to write tests for
+pipelines that will be run over unbounded datasets and must handle out-of-order
+and delayed data.
+<!--more--></p>
+
+<p>Watermarks, Windows and Triggers form a core part of the Beam programming model
+\u2013 they respectively determine how your data are grouped, when your input is
+complete, and when to produce results. This is true for all pipelines,
+regardless of if they are processing bounded or unbounded inputs. If you\u2019re not
+familiar with watermarks, windowing, and triggering in the Beam model,
+<a href="https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101">Streaming 101</a>
+and <a href="https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102">Streaming 102</a>
+are an excellent place to get started. A key takeaway from
+these articles: in realistic streaming scenarios with intermittent failures and
+disconnected users, data can arrive out of order or be delayed. Beam\u2019s
+primitives provide a way for users to perform useful, powerful, and correct
+computations in spite of these challenges.</p>
+
+<p>As Beam pipeline authors, we need comprehensive tests that cover crucial
+failure scenarios and corner cases to gain real confidence that a pipeline is
+ready for production. The existing testing infrastructure within the Beam SDKs
+permits tests to be written which examine the contents of a Pipeline at
+execution time. However, writing unit tests for pipelines that may receive
+late data or trigger multiple times has historically ranged from complex to
+not possible, as pipelines that read from unbounded sources do not shut down
+without external intervention, while pipelines that read from bounded sources
+exclusively cannot test behavior with late data nor most speculative triggers.
+Without additional tools, pipelines that use custom triggers and handle
+out-of-order data could not be easily tested.</p>
+
+<p>This blog post introduces our new framework for writing tests for pipelines that
+handle delayed and out-of-order data in the context of the LeaderBoard pipeline
+from the Mobile Gaming example series.</p>
+
+<h2 id="leaderboard-and-the-mobile-gaming-example">LeaderBoard and the Mobile Gaming Example</h2>
+
+<p><a href="https://github.com/apache/incubator-beam/blob/master/examples/java8/src/main/java/org/apache/beam/examples/complete/game/LeaderBoard.java#L177">LeaderBoard</a>
+is part of the <a href="https://github.com/apache/incubator-beam/tree/master/examples/java8/src/main/java/org/apache/beam/examples/complete/game">Beam mobile gaming examples</a>
+(and <a href="http://beam.incubator.apache.org/use/walkthroughs/">walkthroughs</a>)
+which produces a continuous accounting of user and team scores. User scores are
+calculated over the lifetime of the program, while team scores are calculated
+within fixed windows with a default duration of one hour. The LeaderBoard
+pipeline produces speculative and late panes as appropriate, based on the
+configured triggering and allowed lateness of the pipeline. The expected outputs
+of the LeaderBoard pipeline vary depending on when elements arrive in relation
+to the watermark and the progress of processing time, which could not previously
+be controlled within a test.</p>
+
+<h2 id="writing-deterministic-tests-to-emulate-nondeterminism">Writing Deterministic Tests to Emulate Nondeterminism</h2>
+
+<p>The Beam testing infrastructure provides the
+<a href="http://beam.incubator.apache.org/learn/sdks/javadoc/0.2.0-incubating/">PAssert</a>
+methods, which assert properties about the contents of a PCollection from within
+a pipeline. We have expanded this infrastructure to include
+<a href="https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/TestStream.java">TestStream</a>,
+which is a PTransform that performs a series of events, consisting of adding
+additional elements to a pipeline, advancing the watermark of the TestStream,
+and advancing the pipeline processing time clock. TestStream permits tests which
+observe the effects of triggers on the output a pipeline produces.</p>
+
+<p>While executing a pipeline that reads from a TestStream, the read waits for all
+of the consequences of each event to complete before continuing on to the next
+event, ensuring that when processing time advances, triggers that are based on
+processing time fire as appropriate. With this transform, the effect of
+triggering and allowed lateness can be observed on a pipeline, including
+reactions to speculative and late panes and dropped data.</p>
+
+<h2 id="element-timings">Element Timings</h2>
+
+<p>Elements arrive either behind, with, or after the watermark, which categorizes
+them into the \u201cearly\u201d, \u201con-time\u201d, and \u201clate\u201d divisions. \u201cLate\u201d elements can be
+further subdivided into \u201cunobservably\u201d, \u201cobservably\u201d, and \u201cdroppably\u201d late,
+depending on the window to which they are assigned and the maximum allowed
+lateness, as specified by the windowing strategy. Elements that arrive with
+these timings are emitted into panes, which can be \u201cEARLY\u201d, \u201cON-TIME\u201d, or
+\u201cLATE\u201d, depending on the position of the watermark when the pane was emitted.</p>
+
+<p>Using TestStream, we can write tests that demonstrate that speculative panes are
+output after their trigger condition is met, that the advancing of the watermark
+causes the on-time pane to be produced, and that late-arriving data produces
+refinements when it arrives before the maximum allowed lateness, and is dropped
+after.</p>
+
+<p>The following examples demonstrate how you can use TestStream to provide a
+sequence of events to the Pipeline, where the arrival of elements is interspersed
+with updates to the watermark and the advance of processing time. Each of these
+events runs to completion before additional events occur.</p>
+
+<p>In the diagrams, the time at which events occurred in \u201creal\u201d (event) time
+progresses as the graph moves to the right. The time at which the pipeline
+receives them progresses as the graph goes upwards. The watermark is represented
+by the squiggly red line, and each starburst is the firing of a trigger and the
+associated pane.</p>
+
+<p><img class="center-block" src="/images/blog/test-stream/elements-all-on-time.png" alt="Elements on the Event and Processing time axes, with the Watermark and produced panes" width="442" /></p>
+
+<h3 id="everything-arrives-on-time">Everything arrives on-time</h3>
+
+<p>For example, if we create a TestStream where all the data arrives before the
+watermark and provide the result PCollection as input to the CalculateTeamScores
+PTransform:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>TestStream&lt;GameActionInfo&gt; infos = TestStream.create(AvroCoder.of(GameActionInfo.class))
+    .addElements(new GameActionInfo("sky", "blue", 12, new Instant(0L)),
+ ����������������new GameActionInfo("navy", "blue", 3, new Instant(0L)),
+ ����������������new GameActionInfo("navy", "blue", 3, new Instant(0L).plus(Duration.standardMinutes(3))))
+ ���// Move the watermark past the end the end of the window
+    .advanceWatermarkTo(new Instant(0L).plus(TEAM_WINDOW_DURATION)
+ �������������������������������       .plus(Duration.standardMinutes(1)))
+    .advanceWatermarkToInfinity();
+
+PCollection&lt;KV&lt;String, Integer&gt;&gt; teamScores = p.apply(createEvents)
+    .apply(new CalculateTeamScores(TEAM_WINDOW_DURATION, ALLOWED_LATENESS));
+</code></pre>
+</div>
+
+<p>we can then assert that the result PCollection contains elements that arrived:</p>
+
+<p><img class="center-block" src="/images/blog/test-stream/elements-all-on-time.png" alt="Elements all arrive before the watermark, and are produced in the on-time pane" width="442" /></p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>// Only one value is emitted for the blue team
+PAssert.that(teamScores)
+       .inWindow(window)
+       .containsInAnyOrder(KV.of("blue", 18));
+p.run();
+</code></pre>
+</div>
+
+<h3 id="some-elements-are-late-but-arrive-before-the-end-of-the-window">Some elements are late, but arrive before the end of the window</h3>
+
+<p>We can also add data to the TestStream after the watermark, but before the end
+of the window (shown below to the left of the red watermark), which demonstrates
+\u201cunobservably late\u201d data - that is, data that arrives late, but is promoted by
+the system to be on time, as it arrives before the watermark passes the end of
+the window</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>TestStream&lt;GameActionInfo&gt; infos = TestStream.create(AvroCoder.of(GameActionInfo.class))
+    .addElements(new GameActionInfo("sky", "blue", 3, new Instant(0L)),
+ �������         new GameActionInfo("navy", "blue", 3, new Instant(0L).plus(Duration.standardMinutes(3))))
+ ���// Move the watermark up to "near" the end of the window
+    .advanceWatermarkTo(new Instant(0L).plus(TEAM_WINDOW_DURATION)
+ ���������������       ����������������.minus(Duration.standardMinutes(1)))
+    .addElements(new GameActionInfo("sky", "blue", 12, Duration.ZERO))
+    .advanceWatermarkToInfinity();
+
+PCollection&lt;KV&lt;String, Integer&gt;&gt; teamScores = p.apply(createEvents)
+    .apply(new CalculateTeamScores(TEAM_WINDOW_DURATION, ALLOWED_LATENESS));
+</code></pre>
+</div>
+
+<p><img class="center-block" src="/images/blog/test-stream/elements-unobservably-late.png" alt="An element arrives late, but before the watermark passes the end of the window, and is produced in the on-time pane" width="442" /></p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>// Only one value is emitted for the blue team
+PAssert.that(teamScores)
+       .inWindow(window)
+       .containsInAnyOrder(KV.of("blue", 18));
+p.run();
+</code></pre>
+</div>
+
+<h3 id="elements-are-late-and-arrive-after-the-end-of-the-window">Elements are late, and arrive after the end of the window</h3>
+
+<p>By advancing the watermark farther in time before adding the late data, we can
+demonstrate the triggering behavior that causes the system to emit an on-time
+pane, and then after the late data arrives, a pane that refines the result.</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>TestStream&lt;GameActionInfo&gt; infos = TestStream.create(AvroCoder.of(GameActionInfo.class))
+    .addElements(new GameActionInfo("sky", "blue", 3, new Instant(0L)),
+       ����������new GameActionInfo("navy", "blue", 3, new Instant(0L).plus(Duration.standardMinutes(3))))
+� ��// Move the watermark up to "near" the end of the window
+    .advanceWatermarkTo(new Instant(0L).plus(TEAM_WINDOW_DURATION)
+��������������������������������       .minus(Duration.standardMinutes(1)))
+    .addElements(new GameActionInfo("sky", "blue", 12, Duration.ZERO))
+    .advanceWatermarkToInfinity();
+
+PCollection&lt;KV&lt;String, Integer&gt;&gt; teamScores = p.apply(createEvents)
+    .apply(new CalculateTeamScores(TEAM_WINDOW_DURATION, ALLOWED_LATENESS));
+</code></pre>
+</div>
+
+<p><img class="center-block" src="/images/blog/test-stream/elements-observably-late.png" alt="Elements all arrive before the watermark, and are produced in the on-time pane" width="442" /></p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>// An on-time pane is emitted with the events that arrived before the window closed
+PAssert.that(teamScores)
+       .inOnTimePane(window)
+       .containsInAnyOrder(KV.of("blue", 6));
+// The final pane contains the late refinement
+PAssert.that(teamScores)
+       .inFinalPane(window)
+       .containsInAnyOrder(KV.of("blue", 18));
+p.run();
+</code></pre>
+</div>
+
+<h3 id="elements-are-late-and-after-the-end-of-the-window-plus-the-allowed-lateness">Elements are late, and after the end of the window plus the allowed lateness</h3>
+
+<p>If we push the watermark even further into the future, beyond the maximum
+configured allowed lateness, we can demonstrate that the late element is dropped
+by the system.</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>TestStream&lt;GameActionInfo&gt; infos = TestStream.create(AvroCoder.of(GameActionInfo.class))
+    .addElements(new GameActionInfo("sky", "blue", 3, Duration.ZERO),
+        ���������new GameActionInfo("navy", "blue", 3, Duration.standardMinutes(3)))
+����// Move the watermark up to "near" the end of the window
+    .advanceWatermarkTo(new Instant(0).plus(TEAM_WINDOW_DURATION)
+ ����������������������������������������.plus(ALLOWED_LATENESS)
+ ����������������������������������������.plus(Duration.standardMinutes(1)))
+    .addElements(new GameActionInfo(
+���������������������"sky",
+���������������������"blue",
+���������������������12,
+���������������������new Instant(0).plus(TEAM_WINDOW_DURATION).minus(Duration.standardMinutes(1))))
+    .advanceWatermarkToInfinity();
+
+PCollection&lt;KV&lt;String, Integer&gt;&gt; teamScores = p.apply(createEvents)
+    .apply(new CalculateTeamScores(TEAM_WINDOW_DURATION, ALLOWED_LATENESS));
+</code></pre>
+</div>
+
+<p><img class="center-block" src="/images/blog/test-stream/elements-droppably-late.png" alt="Elements all arrive before the watermark, and are produced in the on-time pane" width="442" /></p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>// An on-time pane is emitted with the events that arrived before the window closed
+PAssert.that(teamScores)
+       .inWindow(window)
+       .containsInAnyOrder(KV.of("blue", 6));
+
+p.run();
+</code></pre>
+</div>
+
+<h3 id="elements-arrive-before-the-end-of-the-window-and-some-processing-time-passes">Elements arrive before the end of the window, and some processing time passes</h3>
+<p>Using additional methods, we can demonstrate the behavior of speculative
+triggers by advancing the processing time of the TestStream. If we add elements
+to an input PCollection, occasionally advancing the processing time clock, and
+apply <code class="highlighter-rouge">CalculateUserScores</code></p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>TestStream.create(AvroCoder.of(GameActionInfo.class))
+ ���.addElements(new GameActionInfo("scarlet", "red", 3, new Instant(0L)),
+ ��������������  new GameActionInfo("scarlet", "red", 2, new Instant(0L).plus(Duration.standardMinutes(1))))
+    .advanceProcessingTime(Duration.standardMinutes(12))
+ ���.addElements(new GameActionInfo("oxblood", "red", 2, new Instant(0L)).plus(Duration.standardSeconds(22)),
+ ��������������  new GameActionInfo("scarlet", "red", 4, new Instant(0L).plus(Duration.standardMinutes(2))))
+    .advanceProcessingTime(Duration.standardMinutes(15))
+    .advanceWatermarkToInfinity();
+
+PCollection&lt;KV&lt;String, Integer&gt;&gt; userScores =
+ ���p.apply(infos).apply(new CalculateUserScores(ALLOWED_LATENESS));
+</code></pre>
+</div>
+
+<p><img class="center-block" src="/images/blog/test-stream/elements-processing-speculative.png" alt="Elements all arrive before the watermark, and are produced in the on-time pane" width="442" /></p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>PAssert.that(userScores)
+       .inEarlyGlobalWindowPanes()
+       .containsInAnyOrder(KV.of("scarlet", 5),
+   ������������������������KV.of("scarlet", 9),
+                           KV.of("oxblood", 2));
+
+p.run();
+</code></pre>
+</div>
+
+<h2 id="teststream---under-the-hood">TestStream - Under the Hood</h2>
+
+<p>TestStream relies on a pipeline concept we\u2019ve introduced, called quiescence, to
+utilize the existing runner infrastructure while providing guarantees about when
+a root transform will called by the runner. This consists of properties about
+pending elements and triggers, namely:</p>
+
+<ul>
+  <li>No trigger is permitted to fire but has not fired</li>
+  <li>All elements are either buffered in state or cannot progress until a side input becomes available</li>
+</ul>
+
+<p>Simplified, this means that, in the absence of an advancement in input
+watermarks or processing time, or additional elements being added to the
+pipeline, the pipeline will not make progress. Whenever the TestStream PTransform
+performs an action, the runner must not reinvoke the same instance until the
+pipeline has quiesced. This ensures that the events specified by TestStream
+happen \u201cin-order\u201d, which ensures that input watermarks and the system clock do
+not advance ahead of the elements they hoped to hold up.</p>
+
+<p>The DirectRunner has been modified to use quiescence as the signal that it
+should add more work to the Pipeline, and the implementation of TestStream in
+that runner uses this fact to perform a single output per event. The DirectRunner
+implementation also directly controls the runner\u2019s system clock, ensuring that
+tests will complete promptly even if there is a multi-minute processing time
+trigger located within the pipeline.</p>
+
+<p>The TestStream transform is supported in the DirectRunner. For most users, tests
+written using TestPipeline and PAsserts will automatically function while using
+TestStream.</p>
+
+<h2 id="summary">Summary</h2>
+
+<p>The addition of TestStream alongside window and pane-specific matchers in PAssert
+has enabled the testing of Pipelines which produce speculative and late panes.
+This permits tests for all styles of pipeline to be expressed directly within the
+Java SDK. If you have questions or comments, we\u2019d love to hear them on the
+<a href="http://beam.incubator.apache.org/use/mailing-lists/">mailing lists</a>.</p>
+
+  </div>
+
+</article>
+
+      </div>
+
+
+    <hr>
+  <div class="row">
+      <div class="col-xs-12">
+          <footer>
+              <p class="text-center">&copy; Copyright 2016
+                <a href="http://www.apache.org">The Apache Software Foundation.</a> All Rights Reserved.</p>
+                <p class="text-center"><a href="/privacy_policy">Privacy Policy</a> |
+                <a href="/feed.xml">RSS Feed</a></p>
+          </footer>
+      </div>
+  </div>
+  <!-- container div end -->
+</div>
+
+
+  </body>
+
+</html>

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/41a36913/content/blog/index.html
----------------------------------------------------------------------
diff --git a/content/blog/index.html b/content/blog/index.html
index c287f49..e3a8882 100644
--- a/content/blog/index.html
+++ b/content/blog/index.html
@@ -147,6 +147,25 @@
 <p>This is the blog for the Apache Beam project. This blog contains news and updates
 for the project.</p>
 
+<h3 id="a-classpost-link-hrefblog20161020test-streamhtmltesting-unbounded-pipelines-in-apache-beama"><a class="post-link" href="/blog/2016/10/20/test-stream.html">Testing Unbounded Pipelines in Apache Beam</a></h3>
+<p><i>Oct 20, 2016 \u2022  Thomas Groh 
+</i></p>
+
+<p>The Beam Programming Model unifies writing pipelines for Batch and Streaming
+pipelines. We\u2019ve recently introduced a new PTransform to write tests for
+pipelines that will be run over unbounded datasets and must handle out-of-order
+and delayed data.</p>
+
+<!-- Render a "read more" button if the post is longer than the excerpt -->
+
+<p>
+<a class="btn btn-default btn-sm" href="/blog/2016/10/20/test-stream.html" role="button">
+Read more&nbsp;<span class="glyphicon glyphicon-menu-right" aria-hidden="true"></span>
+</a>
+</p>
+
+<hr />
+
 <h3 id="a-classpost-link-hrefbeamupdate20161011strata-hadoop-world-and-beamhtmlstratahadoop-world-and-beama"><a class="post-link" href="/beam/update/2016/10/11/strata-hadoop-world-and-beam.html">Strata+Hadoop World and Beam</a></h3>
 <p><i>Oct 11, 2016 \u2022  Jesse Anderson [<a href="https://twitter.com/jessetanderson">@jessetanderson</a>]
 </i></p>

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/41a36913/content/feed.xml
----------------------------------------------------------------------
diff --git a/content/feed.xml b/content/feed.xml
index b9015ce..aec2646 100644
--- a/content/feed.xml
+++ b/content/feed.xml
@@ -6,11 +6,324 @@
 </description>
     <link>http://beam.incubator.apache.org/</link>
     <atom:link href="http://beam.incubator.apache.org/feed.xml" rel="self" type="application/rss+xml"/>
-    <pubDate>Tue, 18 Oct 2016 20:58:20 -0700</pubDate>
-    <lastBuildDate>Tue, 18 Oct 2016 20:58:20 -0700</lastBuildDate>
+    <pubDate>Thu, 20 Oct 2016 11:04:34 -0700</pubDate>
+    <lastBuildDate>Thu, 20 Oct 2016 11:04:34 -0700</lastBuildDate>
     <generator>Jekyll v3.2.0</generator>
     
       <item>
+        <title>Testing Unbounded Pipelines in Apache Beam</title>
+        <description>&lt;p&gt;The Beam Programming Model unifies writing pipelines for Batch and Streaming
+pipelines. We\u2019ve recently introduced a new PTransform to write tests for
+pipelines that will be run over unbounded datasets and must handle out-of-order
+and delayed data.
+&lt;!--more--&gt;&lt;/p&gt;
+
+&lt;p&gt;Watermarks, Windows and Triggers form a core part of the Beam programming model
+\u2013 they respectively determine how your data are grouped, when your input is
+complete, and when to produce results. This is true for all pipelines,
+regardless of if they are processing bounded or unbounded inputs. If you\u2019re not
+familiar with watermarks, windowing, and triggering in the Beam model,
+&lt;a href=&quot;https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101&quot;&gt;Streaming 101&lt;/a&gt;
+and &lt;a href=&quot;https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102&quot;&gt;Streaming 102&lt;/a&gt;
+are an excellent place to get started. A key takeaway from
+these articles: in realistic streaming scenarios with intermittent failures and
+disconnected users, data can arrive out of order or be delayed. Beam\u2019s
+primitives provide a way for users to perform useful, powerful, and correct
+computations in spite of these challenges.&lt;/p&gt;
+
+&lt;p&gt;As Beam pipeline authors, we need comprehensive tests that cover crucial
+failure scenarios and corner cases to gain real confidence that a pipeline is
+ready for production. The existing testing infrastructure within the Beam SDKs
+permits tests to be written which examine the contents of a Pipeline at
+execution time. However, writing unit tests for pipelines that may receive
+late data or trigger multiple times has historically ranged from complex to
+not possible, as pipelines that read from unbounded sources do not shut down
+without external intervention, while pipelines that read from bounded sources
+exclusively cannot test behavior with late data nor most speculative triggers.
+Without additional tools, pipelines that use custom triggers and handle
+out-of-order data could not be easily tested.&lt;/p&gt;
+
+&lt;p&gt;This blog post introduces our new framework for writing tests for pipelines that
+handle delayed and out-of-order data in the context of the LeaderBoard pipeline
+from the Mobile Gaming example series.&lt;/p&gt;
+
+&lt;h2 id=&quot;leaderboard-and-the-mobile-gaming-example&quot;&gt;LeaderBoard and the Mobile Gaming Example&lt;/h2&gt;
+
+&lt;p&gt;&lt;a href=&quot;https://github.com/apache/incubator-beam/blob/master/examples/java8/src/main/java/org/apache/beam/examples/complete/game/LeaderBoard.java#L177&quot;&gt;LeaderBoard&lt;/a&gt;
+is part of the &lt;a href=&quot;https://github.com/apache/incubator-beam/tree/master/examples/java8/src/main/java/org/apache/beam/examples/complete/game&quot;&gt;Beam mobile gaming examples&lt;/a&gt;
+(and &lt;a href=&quot;http://beam.incubator.apache.org/use/walkthroughs/&quot;&gt;walkthroughs&lt;/a&gt;)
+which produces a continuous accounting of user and team scores. User scores are
+calculated over the lifetime of the program, while team scores are calculated
+within fixed windows with a default duration of one hour. The LeaderBoard
+pipeline produces speculative and late panes as appropriate, based on the
+configured triggering and allowed lateness of the pipeline. The expected outputs
+of the LeaderBoard pipeline vary depending on when elements arrive in relation
+to the watermark and the progress of processing time, which could not previously
+be controlled within a test.&lt;/p&gt;
+
+&lt;h2 id=&quot;writing-deterministic-tests-to-emulate-nondeterminism&quot;&gt;Writing Deterministic Tests to Emulate Nondeterminism&lt;/h2&gt;
+
+&lt;p&gt;The Beam testing infrastructure provides the
+&lt;a href=&quot;http://beam.incubator.apache.org/learn/sdks/javadoc/0.2.0-incubating/&quot;&gt;PAssert&lt;/a&gt;
+methods, which assert properties about the contents of a PCollection from within
+a pipeline. We have expanded this infrastructure to include
+&lt;a href=&quot;https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/TestStream.java&quot;&gt;TestStream&lt;/a&gt;,
+which is a PTransform that performs a series of events, consisting of adding
+additional elements to a pipeline, advancing the watermark of the TestStream,
+and advancing the pipeline processing time clock. TestStream permits tests which
+observe the effects of triggers on the output a pipeline produces.&lt;/p&gt;
+
+&lt;p&gt;While executing a pipeline that reads from a TestStream, the read waits for all
+of the consequences of each event to complete before continuing on to the next
+event, ensuring that when processing time advances, triggers that are based on
+processing time fire as appropriate. With this transform, the effect of
+triggering and allowed lateness can be observed on a pipeline, including
+reactions to speculative and late panes and dropped data.&lt;/p&gt;
+
+&lt;h2 id=&quot;element-timings&quot;&gt;Element Timings&lt;/h2&gt;
+
+&lt;p&gt;Elements arrive either behind, with, or after the watermark, which categorizes
+them into the \u201cearly\u201d, \u201con-time\u201d, and \u201clate\u201d divisions. \u201cLate\u201d elements can be
+further subdivided into \u201cunobservably\u201d, \u201cobservably\u201d, and \u201cdroppably\u201d late,
+depending on the window to which they are assigned and the maximum allowed
+lateness, as specified by the windowing strategy. Elements that arrive with
+these timings are emitted into panes, which can be \u201cEARLY\u201d, \u201cON-TIME\u201d, or
+\u201cLATE\u201d, depending on the position of the watermark when the pane was emitted.&lt;/p&gt;
+
+&lt;p&gt;Using TestStream, we can write tests that demonstrate that speculative panes are
+output after their trigger condition is met, that the advancing of the watermark
+causes the on-time pane to be produced, and that late-arriving data produces
+refinements when it arrives before the maximum allowed lateness, and is dropped
+after.&lt;/p&gt;
+
+&lt;p&gt;The following examples demonstrate how you can use TestStream to provide a
+sequence of events to the Pipeline, where the arrival of elements is interspersed
+with updates to the watermark and the advance of processing time. Each of these
+events runs to completion before additional events occur.&lt;/p&gt;
+
+&lt;p&gt;In the diagrams, the time at which events occurred in \u201creal\u201d (event) time
+progresses as the graph moves to the right. The time at which the pipeline
+receives them progresses as the graph goes upwards. The watermark is represented
+by the squiggly red line, and each starburst is the firing of a trigger and the
+associated pane.&lt;/p&gt;
+
+&lt;p&gt;&lt;img class=&quot;center-block&quot; src=&quot;/images/blog/test-stream/elements-all-on-time.png&quot; alt=&quot;Elements on the Event and Processing time axes, with the Watermark and produced panes&quot; width=&quot;442&quot; /&gt;&lt;/p&gt;
+
+&lt;h3 id=&quot;everything-arrives-on-time&quot;&gt;Everything arrives on-time&lt;/h3&gt;
+
+&lt;p&gt;For example, if we create a TestStream where all the data arrives before the
+watermark and provide the result PCollection as input to the CalculateTeamScores
+PTransform:&lt;/p&gt;
+
+&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;TestStream&amp;lt;GameActionInfo&amp;gt; infos = TestStream.create(AvroCoder.of(GameActionInfo.class))
+    .addElements(new GameActionInfo(&quot;sky&quot;, &quot;blue&quot;, 12, new Instant(0L)),
+ ����������������new GameActionInfo(&quot;navy&quot;, &quot;blue&quot;, 3, new Instant(0L)),
+ ����������������new GameActionInfo(&quot;navy&quot;, &quot;blue&quot;, 3, new Instant(0L).plus(Duration.standardMinutes(3))))
+ ���// Move the watermark past the end the end of the window
+    .advanceWatermarkTo(new Instant(0L).plus(TEAM_WINDOW_DURATION)
+ �������������������������������       .plus(Duration.standardMinutes(1)))
+    .advanceWatermarkToInfinity();
+
+PCollection&amp;lt;KV&amp;lt;String, Integer&amp;gt;&amp;gt; teamScores = p.apply(createEvents)
+    .apply(new CalculateTeamScores(TEAM_WINDOW_DURATION, ALLOWED_LATENESS));
+&lt;/code&gt;&lt;/pre&gt;
+&lt;/div&gt;
+
+&lt;p&gt;we can then assert that the result PCollection contains elements that arrived:&lt;/p&gt;
+
+&lt;p&gt;&lt;img class=&quot;center-block&quot; src=&quot;/images/blog/test-stream/elements-all-on-time.png&quot; alt=&quot;Elements all arrive before the watermark, and are produced in the on-time pane&quot; width=&quot;442&quot; /&gt;&lt;/p&gt;
+
+&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Only one value is emitted for the blue team
+PAssert.that(teamScores)
+       .inWindow(window)
+       .containsInAnyOrder(KV.of(&quot;blue&quot;, 18));
+p.run();
+&lt;/code&gt;&lt;/pre&gt;
+&lt;/div&gt;
+
+&lt;h3 id=&quot;some-elements-are-late-but-arrive-before-the-end-of-the-window&quot;&gt;Some elements are late, but arrive before the end of the window&lt;/h3&gt;
+
+&lt;p&gt;We can also add data to the TestStream after the watermark, but before the end
+of the window (shown below to the left of the red watermark), which demonstrates
+\u201cunobservably late\u201d data - that is, data that arrives late, but is promoted by
+the system to be on time, as it arrives before the watermark passes the end of
+the window&lt;/p&gt;
+
+&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;TestStream&amp;lt;GameActionInfo&amp;gt; infos = TestStream.create(AvroCoder.of(GameActionInfo.class))
+    .addElements(new GameActionInfo(&quot;sky&quot;, &quot;blue&quot;, 3, new Instant(0L)),
+ �������         new GameActionInfo(&quot;navy&quot;, &quot;blue&quot;, 3, new Instant(0L).plus(Duration.standardMinutes(3))))
+ ���// Move the watermark up to &quot;near&quot; the end of the window
+    .advanceWatermarkTo(new Instant(0L).plus(TEAM_WINDOW_DURATION)
+ ���������������       ����������������.minus(Duration.standardMinutes(1)))
+    .addElements(new GameActionInfo(&quot;sky&quot;, &quot;blue&quot;, 12, Duration.ZERO))
+    .advanceWatermarkToInfinity();
+
+PCollection&amp;lt;KV&amp;lt;String, Integer&amp;gt;&amp;gt; teamScores = p.apply(createEvents)
+    .apply(new CalculateTeamScores(TEAM_WINDOW_DURATION, ALLOWED_LATENESS));
+&lt;/code&gt;&lt;/pre&gt;
+&lt;/div&gt;
+
+&lt;p&gt;&lt;img class=&quot;center-block&quot; src=&quot;/images/blog/test-stream/elements-unobservably-late.png&quot; alt=&quot;An element arrives late, but before the watermark passes the end of the window, and is produced in the on-time pane&quot; width=&quot;442&quot; /&gt;&lt;/p&gt;
+
+&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Only one value is emitted for the blue team
+PAssert.that(teamScores)
+       .inWindow(window)
+       .containsInAnyOrder(KV.of(&quot;blue&quot;, 18));
+p.run();
+&lt;/code&gt;&lt;/pre&gt;
+&lt;/div&gt;
+
+&lt;h3 id=&quot;elements-are-late-and-arrive-after-the-end-of-the-window&quot;&gt;Elements are late, and arrive after the end of the window&lt;/h3&gt;
+
+&lt;p&gt;By advancing the watermark farther in time before adding the late data, we can
+demonstrate the triggering behavior that causes the system to emit an on-time
+pane, and then after the late data arrives, a pane that refines the result.&lt;/p&gt;
+
+&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;TestStream&amp;lt;GameActionInfo&amp;gt; infos = TestStream.create(AvroCoder.of(GameActionInfo.class))
+    .addElements(new GameActionInfo(&quot;sky&quot;, &quot;blue&quot;, 3, new Instant(0L)),
+       ����������new GameActionInfo(&quot;navy&quot;, &quot;blue&quot;, 3, new Instant(0L).plus(Duration.standardMinutes(3))))
+� ��// Move the watermark up to &quot;near&quot; the end of the window
+    .advanceWatermarkTo(new Instant(0L).plus(TEAM_WINDOW_DURATION)
+��������������������������������       .minus(Duration.standardMinutes(1)))
+    .addElements(new GameActionInfo(&quot;sky&quot;, &quot;blue&quot;, 12, Duration.ZERO))
+    .advanceWatermarkToInfinity();
+
+PCollection&amp;lt;KV&amp;lt;String, Integer&amp;gt;&amp;gt; teamScores = p.apply(createEvents)
+    .apply(new CalculateTeamScores(TEAM_WINDOW_DURATION, ALLOWED_LATENESS));
+&lt;/code&gt;&lt;/pre&gt;
+&lt;/div&gt;
+
+&lt;p&gt;&lt;img class=&quot;center-block&quot; src=&quot;/images/blog/test-stream/elements-observably-late.png&quot; alt=&quot;Elements all arrive before the watermark, and are produced in the on-time pane&quot; width=&quot;442&quot; /&gt;&lt;/p&gt;
+
+&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// An on-time pane is emitted with the events that arrived before the window closed
+PAssert.that(teamScores)
+       .inOnTimePane(window)
+       .containsInAnyOrder(KV.of(&quot;blue&quot;, 6));
+// The final pane contains the late refinement
+PAssert.that(teamScores)
+       .inFinalPane(window)
+       .containsInAnyOrder(KV.of(&quot;blue&quot;, 18));
+p.run();
+&lt;/code&gt;&lt;/pre&gt;
+&lt;/div&gt;
+
+&lt;h3 id=&quot;elements-are-late-and-after-the-end-of-the-window-plus-the-allowed-lateness&quot;&gt;Elements are late, and after the end of the window plus the allowed lateness&lt;/h3&gt;
+
+&lt;p&gt;If we push the watermark even further into the future, beyond the maximum
+configured allowed lateness, we can demonstrate that the late element is dropped
+by the system.&lt;/p&gt;
+
+&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;TestStream&amp;lt;GameActionInfo&amp;gt; infos = TestStream.create(AvroCoder.of(GameActionInfo.class))
+    .addElements(new GameActionInfo(&quot;sky&quot;, &quot;blue&quot;, 3, Duration.ZERO),
+        ���������new GameActionInfo(&quot;navy&quot;, &quot;blue&quot;, 3, Duration.standardMinutes(3)))
+����// Move the watermark up to &quot;near&quot; the end of the window
+    .advanceWatermarkTo(new Instant(0).plus(TEAM_WINDOW_DURATION)
+ ����������������������������������������.plus(ALLOWED_LATENESS)
+ ����������������������������������������.plus(Duration.standardMinutes(1)))
+    .addElements(new GameActionInfo(
+���������������������&quot;sky&quot;,
+���������������������&quot;blue&quot;,
+���������������������12,
+���������������������new Instant(0).plus(TEAM_WINDOW_DURATION).minus(Duration.standardMinutes(1))))
+    .advanceWatermarkToInfinity();
+
+PCollection&amp;lt;KV&amp;lt;String, Integer&amp;gt;&amp;gt; teamScores = p.apply(createEvents)
+    .apply(new CalculateTeamScores(TEAM_WINDOW_DURATION, ALLOWED_LATENESS));
+&lt;/code&gt;&lt;/pre&gt;
+&lt;/div&gt;
+
+&lt;p&gt;&lt;img class=&quot;center-block&quot; src=&quot;/images/blog/test-stream/elements-droppably-late.png&quot; alt=&quot;Elements all arrive before the watermark, and are produced in the on-time pane&quot; width=&quot;442&quot; /&gt;&lt;/p&gt;
+
+&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// An on-time pane is emitted with the events that arrived before the window closed
+PAssert.that(teamScores)
+       .inWindow(window)
+       .containsInAnyOrder(KV.of(&quot;blue&quot;, 6));
+
+p.run();
+&lt;/code&gt;&lt;/pre&gt;
+&lt;/div&gt;
+
+&lt;h3 id=&quot;elements-arrive-before-the-end-of-the-window-and-some-processing-time-passes&quot;&gt;Elements arrive before the end of the window, and some processing time passes&lt;/h3&gt;
+&lt;p&gt;Using additional methods, we can demonstrate the behavior of speculative
+triggers by advancing the processing time of the TestStream. If we add elements
+to an input PCollection, occasionally advancing the processing time clock, and
+apply &lt;code class=&quot;highlighter-rouge&quot;&gt;CalculateUserScores&lt;/code&gt;&lt;/p&gt;
+
+&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;TestStream.create(AvroCoder.of(GameActionInfo.class))
+ ���.addElements(new GameActionInfo(&quot;scarlet&quot;, &quot;red&quot;, 3, new Instant(0L)),
+ ��������������  new GameActionInfo(&quot;scarlet&quot;, &quot;red&quot;, 2, new Instant(0L).plus(Duration.standardMinutes(1))))
+    .advanceProcessingTime(Duration.standardMinutes(12))
+ ���.addElements(new GameActionInfo(&quot;oxblood&quot;, &quot;red&quot;, 2, new Instant(0L)).plus(Duration.standardSeconds(22)),
+ ��������������  new GameActionInfo(&quot;scarlet&quot;, &quot;red&quot;, 4, new Instant(0L).plus(Duration.standardMinutes(2))))
+    .advanceProcessingTime(Duration.standardMinutes(15))
+    .advanceWatermarkToInfinity();
+
+PCollection&amp;lt;KV&amp;lt;String, Integer&amp;gt;&amp;gt; userScores =
+ ���p.apply(infos).apply(new CalculateUserScores(ALLOWED_LATENESS));
+&lt;/code&gt;&lt;/pre&gt;
+&lt;/div&gt;
+
+&lt;p&gt;&lt;img class=&quot;center-block&quot; src=&quot;/images/blog/test-stream/elements-processing-speculative.png&quot; alt=&quot;Elements all arrive before the watermark, and are produced in the on-time pane&quot; width=&quot;442&quot; /&gt;&lt;/p&gt;
+
+&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;PAssert.that(userScores)
+       .inEarlyGlobalWindowPanes()
+       .containsInAnyOrder(KV.of(&quot;scarlet&quot;, 5),
+   ������������������������KV.of(&quot;scarlet&quot;, 9),
+                           KV.of(&quot;oxblood&quot;, 2));
+
+p.run();
+&lt;/code&gt;&lt;/pre&gt;
+&lt;/div&gt;
+
+&lt;h2 id=&quot;teststream---under-the-hood&quot;&gt;TestStream - Under the Hood&lt;/h2&gt;
+
+&lt;p&gt;TestStream relies on a pipeline concept we\u2019ve introduced, called quiescence, to
+utilize the existing runner infrastructure while providing guarantees about when
+a root transform will called by the runner. This consists of properties about
+pending elements and triggers, namely:&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;No trigger is permitted to fire but has not fired&lt;/li&gt;
+  &lt;li&gt;All elements are either buffered in state or cannot progress until a side input becomes available&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;p&gt;Simplified, this means that, in the absence of an advancement in input
+watermarks or processing time, or additional elements being added to the
+pipeline, the pipeline will not make progress. Whenever the TestStream PTransform
+performs an action, the runner must not reinvoke the same instance until the
+pipeline has quiesced. This ensures that the events specified by TestStream
+happen \u201cin-order\u201d, which ensures that input watermarks and the system clock do
+not advance ahead of the elements they hoped to hold up.&lt;/p&gt;
+
+&lt;p&gt;The DirectRunner has been modified to use quiescence as the signal that it
+should add more work to the Pipeline, and the implementation of TestStream in
+that runner uses this fact to perform a single output per event. The DirectRunner
+implementation also directly controls the runner\u2019s system clock, ensuring that
+tests will complete promptly even if there is a multi-minute processing time
+trigger located within the pipeline.&lt;/p&gt;
+
+&lt;p&gt;The TestStream transform is supported in the DirectRunner. For most users, tests
+written using TestPipeline and PAsserts will automatically function while using
+TestStream.&lt;/p&gt;
+
+&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;
+
+&lt;p&gt;The addition of TestStream alongside window and pane-specific matchers in PAssert
+has enabled the testing of Pipelines which produce speculative and late panes.
+This permits tests for all styles of pipeline to be expressed directly within the
+Java SDK. If you have questions or comments, we\u2019d love to hear them on the
+&lt;a href=&quot;http://beam.incubator.apache.org/use/mailing-lists/&quot;&gt;mailing lists&lt;/a&gt;.&lt;/p&gt;
+</description>
+        <pubDate>Thu, 20 Oct 2016 11:00:00 -0700</pubDate>
+        <link>http://beam.incubator.apache.org/blog/2016/10/20/test-stream.html</link>
+        <guid isPermaLink="true">http://beam.incubator.apache.org/blog/2016/10/20/test-stream.html</guid>
+        
+        
+        <category>blog</category>
+        
+      </item>
+    
+      <item>
         <title>Strata+Hadoop World and Beam</title>
         <description>&lt;p&gt;Tyler Akidau and I gave a &lt;a href=&quot;http://conferences.oreilly.com/strata/hadoop-big-data-ny/public/schedule/detail/52129&quot;&gt;three-hour tutorial&lt;/a&gt; on Apache Beam at Strata+Hadoop World 2016. We had a plethora of help from our TAs: Kenn Knowles, Reuven Lax, Felipe Hoffa, Slava Chernyak, and Jamie Grier. There were a total of 66 people that attended the session.&lt;!--more--&gt;&lt;/p&gt;
 
@@ -1146,39 +1459,5 @@ included the &lt;a href=&quot;https://github.com/GoogleCloudPlatform/DataflowJav
         
       </item>
     
-      <item>
-        <title>Apache Beam has a logo!</title>
-        <description>&lt;p&gt;One of the major benefits of Apache Beam is the fact that it unifies both
-both batch and stream processing into one powerful model. In fact, this unification
-is so important, the name Beam itself comes from the union of &lt;strong&gt;B&lt;/strong&gt;atch + str&lt;strong&gt;EAM&lt;/strong&gt; = Beam&lt;/p&gt;
-
-&lt;p&gt;When the project started, we wanted a logo which was both appealing and visually
-represented this unification. &lt;!--more--&gt; Thanks to the &lt;strong&gt;amazing&lt;/strong&gt; work of Stephanie Smythies, the Apache Beam project
-now has a logo.&lt;/p&gt;
-
-&lt;p&gt;&lt;em&gt;drum roll&lt;/em&gt; - &lt;strong&gt;Presenting, the Apache Beam Logo!&lt;/strong&gt;&lt;/p&gt;
-
-&lt;p&gt;&lt;img src=&quot;/images/beam_logo_s.png&quot; alt=&quot;Apache Beam Logo&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;We are excited about this logo because it is &lt;strong&gt;simple&lt;/strong&gt;, &lt;strong&gt;bright&lt;/strong&gt;, and shows the
-unification of bath and streaming, as beams of light, within the \u2018B\u2019. We will base
-our future website and documentation design around this logo and its coloring. We
-will also make various permutations and resolutions of this logo available in the
-coming weeks. For any questions or comments, send an email to the &lt;code class=&quot;highlighter-rouge&quot;&gt;dev@&lt;/code&gt; email list
-for Apache Beam.&lt;/p&gt;
-</description>
-        <pubDate>Mon, 22 Feb 2016 10:21:48 -0800</pubDate>
-        <link>http://beam.incubator.apache.org/beam/update/website/2016/02/22/beam-has-a-logo.html</link>
-        <guid isPermaLink="true">http://beam.incubator.apache.org/beam/update/website/2016/02/22/beam-has-a-logo.html</guid>
-        
-        
-        <category>beam</category>
-        
-        <category>update</category>
-        
-        <category>website</category>
-        
-      </item>
-    
   </channel>
 </rss>

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/41a36913/content/images/blog/test-stream/elements-all-on-time.png
----------------------------------------------------------------------
diff --git a/content/images/blog/test-stream/elements-all-on-time.png b/content/images/blog/test-stream/elements-all-on-time.png
index c16112f..5892e88 100644
Binary files a/content/images/blog/test-stream/elements-all-on-time.png and b/content/images/blog/test-stream/elements-all-on-time.png differ

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/41a36913/content/images/blog/test-stream/elements-droppably-late.png
----------------------------------------------------------------------
diff --git a/content/images/blog/test-stream/elements-droppably-late.png b/content/images/blog/test-stream/elements-droppably-late.png
index 9a12870..0fadb0f 100644
Binary files a/content/images/blog/test-stream/elements-droppably-late.png and b/content/images/blog/test-stream/elements-droppably-late.png differ

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/41a36913/content/images/blog/test-stream/elements-observably-late.png
----------------------------------------------------------------------
diff --git a/content/images/blog/test-stream/elements-observably-late.png b/content/images/blog/test-stream/elements-observably-late.png
index a874a1d..274b068 100644
Binary files a/content/images/blog/test-stream/elements-observably-late.png and b/content/images/blog/test-stream/elements-observably-late.png differ

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/41a36913/content/images/blog/test-stream/elements-processing-speculative.png
----------------------------------------------------------------------
diff --git a/content/images/blog/test-stream/elements-processing-speculative.png b/content/images/blog/test-stream/elements-processing-speculative.png
index 8d1224c..b905d2c 100644
Binary files a/content/images/blog/test-stream/elements-processing-speculative.png and b/content/images/blog/test-stream/elements-processing-speculative.png differ

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/41a36913/content/images/blog/test-stream/elements-unobservably-late.png
----------------------------------------------------------------------
diff --git a/content/images/blog/test-stream/elements-unobservably-late.png b/content/images/blog/test-stream/elements-unobservably-late.png
index f3bcfa1..74c0195 100644
Binary files a/content/images/blog/test-stream/elements-unobservably-late.png and b/content/images/blog/test-stream/elements-unobservably-late.png differ

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/41a36913/content/index.html
----------------------------------------------------------------------
diff --git a/content/index.html b/content/index.html
index ffa552b..681944f 100644
--- a/content/index.html
+++ b/content/index.html
@@ -222,6 +222,8 @@ The Apache Beam project is in the process of bootstrapping. This includes the we
     <h3>Blog</h3>
     <div class="list-group">
     
+    <a class="list-group-item" href="/blog/2016/10/20/test-stream.html">Oct 20, 2016 - Testing Unbounded Pipelines in Apache Beam</a>
+    
     <a class="list-group-item" href="/beam/update/2016/10/11/strata-hadoop-world-and-beam.html">Oct 11, 2016 - Strata+Hadoop World and Beam</a>
     
     <a class="list-group-item" href="/blog/2016/08/03/six-months.html">Aug 3, 2016 - Apache Beam: Six Months in Incubation</a>

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/41a36913/content/learn/runners/capability-matrix/index.html
----------------------------------------------------------------------
diff --git a/content/learn/runners/capability-matrix/index.html b/content/learn/runners/capability-matrix/index.html
index 8b374b2..b08f901 100644
--- a/content/learn/runners/capability-matrix/index.html
+++ b/content/learn/runners/capability-matrix/index.html
@@ -143,7 +143,7 @@
 
       <div class="row">
         <h1 id="beam-capability-matrix">Beam Capability Matrix</h1>
-<p><span style="font-size:11px;float:none">Last updated: 2016-10-18 20:58 PDT</span></p>
+<p><span style="font-size:11px;float:none">Last updated: 2016-10-20 11:04 PDT</span></p>
 
 <p>Apache Beam (incubating) provides a portable API layer for building sophisticated data-parallel processing engines that may be executed across a diversity of exeuction engines, or <i>runners</i>. The core concepts of this layer are based upon the Beam Model (formerly referred to as the <a href="http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf">Dataflow Model</a>), and implemented to varying degrees in each Beam runner. To help clarify the capabilities of individual runners, we\u2019ve created the capability matrix below.</p>