Posted to commits@beam.apache.org by da...@apache.org on 2016/11/15 00:35:29 UTC

[1/3] incubator-beam-site git commit: [BEAM-505] Fill in the documentation/runners/direct portion of the website

Repository: incubator-beam-site
Updated Branches:
  refs/heads/asf-site 6ab73c79a -> a82a0f3bb


[BEAM-505] Fill in the documentation/runners/direct portion of the website


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/fe87fb80
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/fe87fb80
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/fe87fb80

Branch: refs/heads/asf-site
Commit: fe87fb807a310fe8e68298d4fe6fde86d7c65522
Parents: 6ab73c7
Author: melissa <me...@google.com>
Authored: Fri Nov 11 10:44:13 2016 -0800
Committer: Davor Bonaci <da...@google.com>
Committed: Mon Nov 14 16:34:51 2016 -0800

----------------------------------------------------------------------
 src/documentation/runners/direct.md | 40 ++++++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/fe87fb80/src/documentation/runners/direct.md
----------------------------------------------------------------------
diff --git a/src/documentation/runners/direct.md b/src/documentation/runners/direct.md
index 094d44e..1d7470d 100644
--- a/src/documentation/runners/direct.md
+++ b/src/documentation/runners/direct.md
@@ -1,9 +1,45 @@
 ---
 layout: default
-title: "Apache Direct Runner"
+title: "Direct Runner"
 permalink: /documentation/runners/direct/
 redirect_from: /learn/runners/direct/
 ---
 # Using the Direct Runner
 
-This page is under construction ([BEAM-505](https://issues.apache.org/jira/browse/BEAM-505)).
+The Direct Runner executes pipelines on your machine and is designed to validate that pipelines adhere to the Apache Beam model as closely as possible. Instead of focusing on efficient pipeline execution, the Direct Runner performs additional checks to ensure that users do not rely on semantics that are not guaranteed by the model. Some of these checks include:
+
+* enforcing immutability of elements
+* enforcing encodability of elements
+* processing elements in an arbitrary order at all points
+* enforcing serializability of user functions (`DoFn`, `CombineFn`, etc.)
+
+Using the Direct Runner for testing and development helps ensure that pipelines are robust across different Beam runners. In addition, debugging a failed run can be a non-trivial task when a pipeline executes on a remote cluster, so it is often faster and simpler to unit test your pipeline code locally. Local unit testing also allows you to use your preferred local debugging tools.
+
+Here are some resources with information about how to test your pipelines.
+* [Testing Unbounded Pipelines in Apache Beam]({{ site.baseurl }}/blog/2016/10/20/test-stream.html) talks about the use of Java classes [`PAssert`]({{ site.baseurl }}/documentation/sdks/javadoc/{{ site.release_latest }}/index.html?org/apache/beam/sdk/testing/PAssert.html) and [`TestStream`]({{ site.baseurl }}/documentation/sdks/javadoc/{{ site.release_latest }}/index.html?org/apache/beam/sdk/testing/TestStream.html) to test your pipelines.
+* The [Apache Beam WordCount Example]({{ site.baseurl }}/get-started/wordcount-example/) contains an example of logging and testing a pipeline with [`PAssert`]({{ site.baseurl }}/documentation/sdks/javadoc/{{ site.release_latest }}/index.html?org/apache/beam/sdk/testing/PAssert.html).
+
+
+## Direct Runner prerequisites and setup
+
+To use the Direct Runner, you must declare a dependency on it in your project, for example in a Maven `pom.xml`:
+
+```java
+<dependency>
+   <groupId>org.apache.beam</groupId>
+   <artifactId>beam-runners-direct-java</artifactId>
+   <version>0.3.0-incubating</version>
+   <scope>runtime</scope>
+</dependency>
+```
+
+## Pipeline options for the Direct Runner
+
+When executing your pipeline from the command line, set `runner` to `direct`. The default values for the other pipeline options are generally sufficient.
+
+See the reference documentation for the  <span class="language-java">[`DirectOptions`]({{ site.baseurl }}/documentation/sdks/javadoc/{{ site.release_latest }}/index.html?org/apache/beam/runners/direct/DirectOptions.html)</span><span class="language-python">[`PipelineOptions`](https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/utils/options.py)</span> interface (and its subinterfaces) for defaults and the complete list of pipeline configuration options.
+
+## Additional information and caveats
+
+Local execution is limited by the memory available in your local environment. It is highly recommended that you run your pipeline with data sets small enough to fit in local memory. You can create a small in-memory data set using a <span class="language-java">[`Create`]({{ site.baseurl }}/documentation/sdks/javadoc/{{ site.release_latest }}/index.html?org/apache/beam/sdk/transforms/Create.html)</span><span class="language-python">[`Create`](https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/transforms/core.py)</span> transform, or you can use a <span class="language-java">[`Read`]({{ site.baseurl }}/documentation/sdks/javadoc/{{ site.release_latest }}/index.html?org/apache/beam/sdk/io/Read.html)</span><span class="language-python">[`Read`](https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/iobase.py)</span> transform to work with small local or remote files.
+


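The element checks listed in the direct.md change above (immutability, encodability, arbitrary processing order) can be illustrated with a small standalone sketch. This is not the Direct Runner's implementation and uses no Beam API: `ElementCheckError`, `check_encodable`, and `apply_checked` are hypothetical names, and `pickle` merely stands in for a Beam coder.

```python
import copy
import pickle
import random


class ElementCheckError(Exception):
    """Raised when an element or a user function violates one of the checks."""


def check_encodable(element):
    """Round-trip the element through pickle (a stand-in for a Beam coder)."""
    try:
        decoded = pickle.loads(pickle.dumps(element))
    except Exception as exc:
        raise ElementCheckError("element is not encodable: %r" % (exc,))
    if decoded != element:
        raise ElementCheckError("element changed after an encode/decode round trip")


def apply_checked(user_fn, elements, seed=None):
    """Apply user_fn to every element, in a shuffled order, with checks."""
    shuffled = list(elements)
    # Shuffle so that callers cannot depend on any particular processing order.
    random.Random(seed).shuffle(shuffled)
    outputs = []
    for element in shuffled:
        check_encodable(element)
        # Snapshot the input so in-place mutation by user_fn can be detected.
        snapshot = copy.deepcopy(element)
        outputs.append(user_fn(element))
        if element != snapshot:
            raise ElementCheckError("user function mutated its input: %r" % (snapshot,))
    return outputs
```

A function that returns a new value passes, while one that appends to its input list raises `ElementCheckError`. Because the processing order is shuffled, callers should compare outputs without assuming an order, for example by sorting them first.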
[3/3] incubator-beam-site git commit: This closes #76

Posted by da...@apache.org.
This closes #76


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/a82a0f3b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/a82a0f3b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/a82a0f3b

Branch: refs/heads/asf-site
Commit: a82a0f3bb80cbbc41eeb9fb56042fc4a33abace1
Parents: 6ab73c7 159ff48
Author: Davor Bonaci <da...@google.com>
Authored: Mon Nov 14 16:35:14 2016 -0800
Committer: Davor Bonaci <da...@google.com>
Committed: Mon Nov 14 16:35:14 2016 -0800

----------------------------------------------------------------------
 content/documentation/runners/direct/index.html | 43 +++++++++++++++++++-
 src/documentation/runners/direct.md             | 40 +++++++++++++++++-
 2 files changed, 79 insertions(+), 4 deletions(-)
----------------------------------------------------------------------



[2/3] incubator-beam-site git commit: Regenerate website

Posted by da...@apache.org.
Regenerate website


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/159ff482
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/159ff482
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/159ff482

Branch: refs/heads/asf-site
Commit: 159ff4821ebfb63a93899b41000bb8d00fcfb978
Parents: fe87fb8
Author: Davor Bonaci <da...@google.com>
Authored: Mon Nov 14 16:35:14 2016 -0800
Committer: Davor Bonaci <da...@google.com>
Committed: Mon Nov 14 16:35:14 2016 -0800

----------------------------------------------------------------------
 content/documentation/runners/direct/index.html | 43 +++++++++++++++++++-
 1 file changed, 41 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/159ff482/content/documentation/runners/direct/index.html
----------------------------------------------------------------------
diff --git a/content/documentation/runners/direct/index.html b/content/documentation/runners/direct/index.html
index d47b51b..6cc31f9 100644
--- a/content/documentation/runners/direct/index.html
+++ b/content/documentation/runners/direct/index.html
@@ -6,7 +6,7 @@
   <meta http-equiv="X-UA-Compatible" content="IE=edge">
   <meta name="viewport" content="width=device-width, initial-scale=1">
 
-  <title>Apache Direct Runner</title>
+  <title>Direct Runner</title>
   <meta name="description" content="Apache Beam is an open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, and also data ingestion and integration flows, supporting Enterprise Integration Patterns (EIPs) and Domain Specific Languages (DSLs). Dataflow pipelines simplify the mechanics of large-scale batch and streaming data processing and can run on a number of runtimes like Apache Flink, Apache Spark, and Google Cloud Dataflow (a cloud service). Beam also brings DSL in different languages, allowing users to easily implement their data integration processes.
 ">
 
@@ -142,7 +142,46 @@
       <div class="row">
         <h1 id="using-the-direct-runner">Using the Direct Runner</h1>
 
-<p>This page is under construction (<a href="https://issues.apache.org/jira/browse/BEAM-505">BEAM-505</a>).</p>
+<p>The Direct Runner executes pipelines on your machine and is designed to validate that pipelines adhere to the Apache Beam model as closely as possible. Instead of focusing on efficient pipeline execution, the Direct Runner performs additional checks to ensure that users do not rely on semantics that are not guaranteed by the model. Some of these checks include:</p>
+
+<ul>
+  <li>enforcing immutability of elements</li>
+  <li>enforcing encodability of elements</li>
+  <li>processing elements in an arbitrary order at all points</li>
+  <li>enforcing serializability of user functions (<code class="highlighter-rouge">DoFn</code>, <code class="highlighter-rouge">CombineFn</code>, etc.)</li>
+</ul>
+
+<p>Using the Direct Runner for testing and development helps ensure that pipelines are robust across different Beam runners. In addition, debugging a failed run can be a non-trivial task when a pipeline executes on a remote cluster, so it is often faster and simpler to unit test your pipeline code locally. Local unit testing also allows you to use your preferred local debugging tools.</p>
+
+<p>Here are some resources with information about how to test your pipelines.</p>
+<ul>
+  <li><a href="/blog/2016/10/20/test-stream.html">Testing Unbounded Pipelines in Apache Beam</a> talks about the use of Java classes <a href="/documentation/sdks/javadoc/0.3.0-incubating/index.html?org/apache/beam/sdk/testing/PAssert.html"><code class="highlighter-rouge">PAssert</code></a> and <a href="/documentation/sdks/javadoc/0.3.0-incubating/index.html?org/apache/beam/sdk/testing/TestStream.html"><code class="highlighter-rouge">TestStream</code></a> to test your pipelines.</li>
+  <li>The <a href="/get-started/wordcount-example/">Apache Beam WordCount Example</a> contains an example of logging and testing a pipeline with <a href="/documentation/sdks/javadoc/0.3.0-incubating/index.html?org/apache/beam/sdk/testing/PAssert.html"><code class="highlighter-rouge">PAssert</code></a>.</li>
+</ul>
+
+<h2 id="direct-runner-prerequisites-and-setup">Direct Runner prerequisites and setup</h2>
+
+<p>You must specify your dependency on the Direct Runner.</p>
+
+<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="o">&lt;</span><span class="n">dependency</span><span class="o">&gt;</span>
+   <span class="o">&lt;</span><span class="n">groupId</span><span class="o">&gt;</span><span class="n">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">beam</span><span class="o">&lt;/</span><span class="n">groupId</span><span class="o">&gt;</span>
+   <span class="o">&lt;</span><span class="n">artifactId</span><span class="o">&gt;</span><span class="n">beam</span><span class="o">-</span><span class="n">runners</span><span class="o">-</span><span class="n">direct</span><span class="o">-</span><span class="n">java</span><span class="o">&lt;/</span><span class="n">artifactId</span><span class="o">&gt;</span>
+   <span class="o">&lt;</span><span class="n">version</span><span class="o">&gt;</span><span class="mf">0.3</span><span class="o">.</span><span class="mi">0</span><span class="o">-</span><span class="n">incubating</span><span class="o">&lt;/</span><span class="n">version</span><span class="o">&gt;</span>
+   <span class="o">&lt;</span><span class="n">scope</span><span class="o">&gt;</span><span class="n">runtime</span><span class="o">&lt;/</span><span class="n">scope</span><span class="o">&gt;</span>
+<span class="o">&lt;/</span><span class="n">dependency</span><span class="o">&gt;</span>
+</code></pre>
+</div>
+
+<h2 id="pipeline-options-for-the-direct-runner">Pipeline options for the Direct Runner</h2>
+
+<p>When executing your pipeline from the command line, set <code class="highlighter-rouge">runner</code> to <code class="highlighter-rouge">direct</code>. The default values for the other pipeline options are generally sufficient.</p>
+
+<p>See the reference documentation for the  <span class="language-java"><a href="/documentation/sdks/javadoc/0.3.0-incubating/index.html?org/apache/beam/runners/direct/DirectOptions.html"><code class="highlighter-rouge">DirectOptions</code></a></span><span class="language-python"><a href="https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/utils/options.py"><code class="highlighter-rouge">PipelineOptions</code></a></span> interface (and its subinterfaces) for defaults and the complete list of pipeline configuration options.</p>
+
+<h2 id="additional-information-and-caveats">Additional information and caveats</h2>
+
+<p>Local execution is limited by the memory available in your local environment. It is highly recommended that you run your pipeline with data sets small enough to fit in local memory. You can create a small in-memory data set using a <span class="language-java"><a href="/documentation/sdks/javadoc/0.3.0-incubating/index.html?org/apache/beam/sdk/transforms/Create.html"><code class="highlighter-rouge">Create</code></a></span><span class="language-python"><a href="https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/transforms/core.py"><code class="highlighter-rouge">Create</code></a></span> transform, or you can use a <span class="language-java"><a href="/documentation/sdks/javadoc/0.3.0-incubating/index.html?org/apache/beam/sdk/io/Read.html"><code class="highlighter-rouge">Read</code></a></span><span class="language-python"><a href="https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/iobase.py"><code class="highlighter-rouge">Read</code></a></span> transform to work with small local or remote files.</p>
+
 
       </div>