Posted to commits@beam.apache.org by da...@apache.org on 2017/05/04 07:41:31 UTC

[1/3] beam-site git commit: Transfer some content from Create Your Pipeline to the Programming Guide.

Repository: beam-site
Updated Branches:
  refs/heads/asf-site 7b3e24f3b -> 8c9a89ebf


Transfer some content from Create Your Pipeline to the Programming Guide.


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/8ea44819
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/8ea44819
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/8ea44819

Branch: refs/heads/asf-site
Commit: 8ea448195fdf7ea04c44e53886d23a148dfcadfc
Parents: 7b3e24f
Author: Hadar Hod <ha...@google.com>
Authored: Wed May 3 15:08:40 2017 -0700
Committer: Davor Bonaci <da...@google.com>
Committed: Thu May 4 00:36:12 2017 -0700

----------------------------------------------------------------------
 .../pipelines/create-your-pipeline.md           |  76 +----------
 src/documentation/programming-guide.md          | 133 ++++++++++++-------
 2 files changed, 91 insertions(+), 118 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/beam-site/blob/8ea44819/src/documentation/pipelines/create-your-pipeline.md
----------------------------------------------------------------------
diff --git a/src/documentation/pipelines/create-your-pipeline.md b/src/documentation/pipelines/create-your-pipeline.md
index 120ec35..b765467 100644
--- a/src/documentation/pipelines/create-your-pipeline.md
+++ b/src/documentation/pipelines/create-your-pipeline.md
@@ -22,7 +22,7 @@ A Beam program often starts by creating a `Pipeline` object.
 
 In the Beam SDKs, each pipeline is represented by an explicit object of type `Pipeline`. Each `Pipeline` object is an independent entity that encapsulates both the data the pipeline operates over and the transforms that get applied to that data.
 
-To create a pipeline, declare a `Pipeline` object, and pass it some configuration options, which are explained in a section below. You pass the configuration options by creating an object of type `PipelineOptions`, which you can build by using the static method `PipelineOptionsFactory.create()`.
+To create a pipeline, declare a `Pipeline` object, and pass it some [configuration options]({{ site.baseurl }}/documentation/programming-guide#options).
 
 ```java
 // Start by defining the options for the pipeline.
@@ -32,71 +32,6 @@ PipelineOptions options = PipelineOptionsFactory.create();
 Pipeline p = Pipeline.create(options);
 ```
 
-### Configuring Pipeline Options
-
-Use the pipeline options to configure different aspects of your pipeline, such as the pipeline runner that will execute your pipeline and any runner-specific configuration required by the chosen runner. Your pipeline options will potentially include information such as your project ID or a location for storing files. 
-
-When you run the pipeline on a runner of your choice, a copy of the PipelineOptions will be available to your code. For example, you can read PipelineOptions from a DoFn's Context.
-
-#### Setting PipelineOptions from Command-Line Arguments
-
-While you can configure your pipeline by creating a `PipelineOptions` object and setting the fields directly, the Beam SDKs include a command-line parser that you can use to set fields in `PipelineOptions` using command-line arguments.
-
-To read options from the command-line, construct your `PipelineOptions` object as demonstrated in the following example code:
-
-```java
-MyOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().create();
-```
-
-This interprets command-line arguments that follow the format:
-
-```java
---<option>=<value>
-```
-
-> **Note:** Appending the method `.withValidation` will check for required command-line arguments and validate argument values.
-
-Building your `PipelineOptions` this way lets you specify any of the options as a command-line argument.
-
-> **Note:** The [WordCount example pipeline]({{ site.baseurl }}/get-started/wordcount-example) demonstrates how to set pipeline options at runtime by using command-line options.
-
-#### Creating Custom Options
-
-You can add your own custom options in addition to the standard `PipelineOptions`. To add your own options, define an interface with getter and setter methods for each option, as in the following example:
-
-```java
-public interface MyOptions extends PipelineOptions {
-    String getMyCustomOption();
-    void setMyCustomOption(String myCustomOption);
-  }
-```
-
-You can also specify a description, which appears when a user passes `--help` as a command-line argument, and a default value.
-
-You set the description and default value using annotations, as follows:
-
-```java
-public interface MyOptions extends PipelineOptions {
-    @Description("My custom command line argument.")
-    @Default.String("DEFAULT")
-    String getMyCustomOption();
-    void setMyCustomOption(String myCustomOption);
-  }
-```
-
-It's recommended that you register your interface with `PipelineOptionsFactory` and then pass the interface when creating the `PipelineOptions` object. When you register your interface with `PipelineOptionsFactory`, the `--help` can find your custom options interface and add it to the output of the `--help` command. `PipelineOptionsFactory` will also validate that your custom options are compatible with all other registered options.
-
-The following example code shows how to register your custom options interface with `PipelineOptionsFactory`:
-
-```java
-PipelineOptionsFactory.register(MyOptions.class);
-MyOptions options = PipelineOptionsFactory.fromArgs(args)
-                                                .withValidation()
-                                                .as(MyOptions.class);
-```
-
-Now your pipeline can accept `--myCustomOption=value` as a command-line argument.
-
 ## Reading Data Into Your Pipeline
 
 To create your pipeline's initial `PCollection`, you apply a root transform to your pipeline object. A root transform creates a `PCollection` from either an external data source or some local data you specify.
@@ -112,13 +47,7 @@ PCollection<String> lines = p.apply(
 
 ## Applying Transforms to Process Pipeline Data
 
-To use transforms in your pipeline, you **apply** them to the `PCollection` that you want to transform.
-
-To apply a transform, you call the `apply` method on each `PCollection` that you want to process, passing the desired transform object as an argument.
-
-The Beam SDKs contain a number of different transforms that you can apply to your pipeline's `PCollection`s. These include general-purpose core transforms, such as [ParDo]({{ site.baseurl }}/documentation/programming-guide/#transforms-pardo) or [Combine]({{ site.baseurl }}/documentation/programming-guide/#transforms-combine). There are also pre-written [composite transforms]({{ site.baseurl }}/documentation/programming-guide/#transforms-composite) included in the SDKs, which combine one or more of the core transforms in a useful processing pattern, such as counting or combining elements in a collection. You can also define your own more complex composite transforms to fit your pipeline's exact use case.
-
-In the Beam Java SDK, each transform is a subclass of the base class `PTransform`. When you call `apply` on a `PCollection`, you pass the `PTransform` you want to use as an argument.
+You can manipulate your data using the various [transforms]({{ site.baseurl }}/documentation/programming-guide/#transforms) provided in the Beam SDKs. To do this, you **apply** the transforms to your pipeline's `PCollection` by calling the `apply` method on each `PCollection` that you want to process and passing the desired transform object as an argument.
 
 The following code shows how to `apply` a transform to a `PCollection` of strings. The transform is a user-defined custom transform that reverses the contents of each string and outputs a new `PCollection` containing the reversed strings.
 
@@ -158,4 +87,5 @@ p.run().waitUntilFinish();
 
 ## What's next
 
+*   [Programming Guide]({{ site.baseurl }}/documentation/programming-guide) - Learn the details of creating your pipeline, configuring pipeline options, and applying transforms.
 *   [Test your pipeline]({{ site.baseurl }}/documentation/pipelines/test-your-pipeline).
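
Reviewer note: the `--<option>=<value>` argument format described in the section this commit moves can be illustrated with a minimal, standard-library-only sketch. This is not Beam's parser and makes no claim about how `PipelineOptionsFactory.fromArgs` is implemented; the class name `OptionArgsSketch`, its `parse` method, and the option names are hypothetical, chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in: shows the --<option>=<value> convention only.
public class OptionArgsSketch {

  /** Parses arguments of the form --<option>=<value> into an insertion-ordered map. */
  public static Map<String, String> parse(String[] args) {
    Map<String, String> options = new LinkedHashMap<>();
    for (String arg : args) {
      int eq = arg.indexOf('=');
      // Require the leading "--" and at least one character of option name before '='.
      if (!arg.startsWith("--") || eq < 3) {
        throw new IllegalArgumentException("Expected --<option>=<value>, got: " + arg);
      }
      options.put(arg.substring(2, eq), arg.substring(eq + 1));
    }
    return options;
  }

  public static void main(String[] args) {
    Map<String, String> parsed =
        parse(new String[] {"--myCustomOption=value", "--anotherOption=42"});
    System.out.println(parsed); // prints {myCustomOption=value, anotherOption=42}
  }
}
```

In a real pipeline the same arguments would be handed to `PipelineOptionsFactory.fromArgs(args)`, as the moved section shows; the sketch only makes the naming convention concrete.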

http://git-wip-us.apache.org/repos/asf/beam-site/blob/8ea44819/src/documentation/programming-guide.md
----------------------------------------------------------------------
diff --git a/src/documentation/programming-guide.md b/src/documentation/programming-guide.md
index e3cd2d9..11ec86d 100644
--- a/src/documentation/programming-guide.md
+++ b/src/documentation/programming-guide.md
@@ -11,7 +11,6 @@ redirect_from:
 
 The **Beam Programming Guide** is intended for Beam users who want to use the Beam SDKs to create data processing pipelines. It provides guidance for using the Beam SDK classes to build and test your pipeline. It is not intended as an exhaustive reference, but as a language-agnostic, high-level guide to programmatically building your Beam pipeline. As the programming guide is filled out, the text will include code samples in multiple languages to help illustrate how to implement Beam concepts in your pipelines.
 
-
 <nav class="language-switcher">
   <strong>Adapt for:</strong>
   <ul>
@@ -24,14 +23,10 @@ The **Beam Programming Guide** is intended for Beam users who want to use the Be
 
 * [Overview](#overview)
 * [Creating the Pipeline](#pipeline)
+  * [Configuring Pipeline Options](#options)
 * [Working with PCollections](#pcollection)
   * [Creating a PCollection](#pccreate)
   * [PCollection Characteristics](#pccharacteristics)
-    * [Element Type](#pcelementtype)
-    * [Immutability](#pcimmutability)
-    * [Random Access](#pcrandomaccess)
-    * [Size and Boundedness](#pcsizebound)
-    * [Element Timestamps](#pctimestamps)
 * [Applying Transforms](#transforms)
   * [Using ParDo](#transforms-pardo)
   * [Using GroupByKey](#transforms-gbk)
@@ -42,7 +37,6 @@ The **Beam Programming Guide** is intended for Beam users who want to use the Be
   * [Additional Outputs](#transforms-outputs)
 * [Composite Transforms](#transforms-composite)
 * [Pipeline I/O](#io)
-* [Running the Pipeline](#running)
 * [Data Encoding and Type Safety](#coders)
 * [Working with Windowing](#windowing)
 * [Working with Triggers](#triggers)
@@ -77,30 +71,100 @@ The `Pipeline` abstraction encapsulates all the data and steps in your data proc
 
 To use Beam, your driver program must first create an instance of the Beam SDK class `Pipeline` (typically in the `main()` function). When you create your `Pipeline`, you'll also need to set some **configuration options**. You can set your pipeline's configuration options programmatically, but it's often easier to set the options ahead of time (or read them from the command line) and pass them to the `Pipeline` object when you create the object.
 
-The pipeline configuration options determine, among other things, the `PipelineRunner` that determines where the pipeline gets executed: locally, or using a distributed back-end of your choice. Depending on where your pipeline gets executed and what your specified Runner requires, the options can also help you specify other aspects of execution.
+```java
+// Start by defining the options for the pipeline.
+PipelineOptions options = PipelineOptionsFactory.create();
+
+// Then create the pipeline.
+Pipeline p = Pipeline.create(options);
+```
+
+```py
+{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets.py tag:pipelines_constructing_creating
+%}
+```
+
+### <a name="options"></a>Configuring Pipeline Options
+
+Use the pipeline options to configure different aspects of your pipeline, such as the pipeline runner that will execute your pipeline and any runner-specific configuration required by the chosen runner. Your pipeline options will potentially include information such as your project ID or a location for storing files. 
+
+When you run the pipeline on a runner of your choice, a copy of the PipelineOptions will be available to your code. For example, you can read PipelineOptions from a DoFn's Context.
 
-To set your pipeline's configuration options and create the pipeline, create an object of type <span class="language-java">[PipelineOptions]({{ site.baseurl }}/documentation/sdks/javadoc/{{ site.release_latest }}/index.html?org/apache/beam/sdk/options/PipelineOptions.html)</span><span class="language-py">[PipelineOptions](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/pipeline_options.py)</span> and pass it to `Pipeline.Create()`. The most common way to do this is by parsing arguments from the command-line:
+#### Setting PipelineOptions from Command-Line Arguments
+
+While you can configure your pipeline by creating a `PipelineOptions` object and setting the fields directly, the Beam SDKs include a command-line parser that you can use to set fields in `PipelineOptions` using command-line arguments.
+
+To read options from the command-line, construct your `PipelineOptions` object as demonstrated in the following example code:
 
 ```java
-public static void main(String[] args) {
-   // Will parse the arguments passed into the application and construct a PipelineOptions
-   // Note that --help will print registered options, and --help=PipelineOptionsClassName
-   // will print out usage for the specific class.
-   PipelineOptions options =
-       PipelineOptionsFactory.fromArgs(args).create();
+PipelineOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().create();
+```
 
-   Pipeline p = Pipeline.create(options);
+```py
+{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets.py tag:pipelines_constructing_creating
+%}
+```
+
+This interprets command-line arguments that follow the format:
+
+```
+--<option>=<value>
+```
+
+> **Note:** Appending the method `.withValidation` will check for required command-line arguments and validate argument values.
+
+Building your `PipelineOptions` this way lets you specify any of the options as a command-line argument.
+
+> **Note:** The [WordCount example pipeline]({{ site.baseurl }}/get-started/wordcount-example) demonstrates how to set pipeline options at runtime by using command-line options.
+
+#### Creating Custom Options
+
+You can add your own custom options in addition to the standard `PipelineOptions`. To add your own options, define an interface with getter and setter methods for each option, as in the following example:
+
+```java
+public interface MyOptions extends PipelineOptions {
+    String getMyCustomOption();
+    void setMyCustomOption(String myCustomOption);
+  }
 ```
 
 ```py
-# Will parse the arguments passed into the application and construct a PipelineOptions object.
-# Note that --help will print registered options.
+{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets.py tag:pipeline_options_define_custom
+%}
+```
 
-{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets.py tag:pipelines_constructing_creating
+You can also specify a description, which appears when a user passes `--help` as a command-line argument, and a default value.
+
+You set the description and default value using annotations, as follows:
+
+```java
+public interface MyOptions extends PipelineOptions {
+    @Description("My custom command line argument.")
+    @Default.String("DEFAULT")
+    String getMyCustomOption();
+    void setMyCustomOption(String myCustomOption);
+  }
+```
+
+```py
+{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets.py tag:pipeline_options_define_custom_with_help_and_default
 %}
 ```
 
-The Beam SDKs contain various subclasses of `PipelineOptions` that correspond to different Runners. For example, `DirectPipelineOptions` contains options for the Direct (local) pipeline runner, while `DataflowPipelineOptions` contains options for using the runner for Google Cloud Dataflow. You can also define your own custom `PipelineOptions` by creating an interface that extends the Beam SDKs' `PipelineOptions` class.
+{:.language-java}
+It's recommended that you register your interface with `PipelineOptionsFactory` and then pass the interface when creating the `PipelineOptions` object. When you register your interface with `PipelineOptionsFactory`, the `--help` command can find your custom options interface and add it to its output. `PipelineOptionsFactory` will also validate that your custom options are compatible with all other registered options.
+
+{:.language-java}
+The following example code shows how to register your custom options interface with `PipelineOptionsFactory`:
+
+```java
+PipelineOptionsFactory.register(MyOptions.class);
+MyOptions options = PipelineOptionsFactory.fromArgs(args)
+                                                .withValidation()
+                                                .as(MyOptions.class);
+```
+
+Now your pipeline can accept `--myCustomOption=value` as a command-line argument.
 
 ## <a name="pcollection"></a>Working with PCollections
 
@@ -125,6 +189,7 @@ public static void main(String[] args) {
         PipelineOptionsFactory.fromArgs(args).create();
     Pipeline p = Pipeline.create(options);
 
+    // Create the PCollection 'lines' by applying a 'Read' transform.
     PCollection<String> lines = p.apply(
       "ReadMyFile", TextIO.Read.from("protocol://path/to/some/inputData.txt"));
 }
@@ -214,7 +279,9 @@ You can manually assign timestamps to the elements of a `PCollection` if the sou
 
 In the Beam SDKs, **transforms** are the operations in your pipeline. A transform takes a `PCollection` (or more than one `PCollection`) as input, performs an operation that you specify on each element in that collection, and produces a new output `PCollection`. To invoke a transform, you must **apply** it to the input `PCollection`.
 
-In Beam SDK each transform has a generic `apply` method <span class="language-py">(or pipe operator `|`)</span>. Invoking multiple Beam transforms is similar to *method chaining*, but with one slight difference: You apply the transform to the input `PCollection`, passing the transform itself as an argument, and the operation returns the output `PCollection`. This takes the general form:
+The Beam SDKs contain a number of different transforms that you can apply to your pipeline's `PCollection`s. These include general-purpose core transforms, such as [ParDo]({{ site.baseurl }}/documentation/programming-guide/#transforms-pardo) or [Combine]({{ site.baseurl }}/documentation/programming-guide/#transforms-combine). There are also pre-written [composite transforms]({{ site.baseurl }}/documentation/programming-guide/#transforms-composite) included in the SDKs, which combine one or more of the core transforms in a useful processing pattern, such as counting or combining elements in a collection. You can also define your own more complex composite transforms to fit your pipeline's exact use case.
+
+Each transform in the Beam SDKs has a generic `apply` method <span class="language-py">(or pipe operator `|`)</span>. Invoking multiple Beam transforms is similar to *method chaining*, but with one slight difference: You apply the transform to the input `PCollection`, passing the transform itself as an argument, and the operation returns the output `PCollection`. This takes the general form:
 
 ```java
 [Output PCollection] = [Input PCollection].apply([Transform])
@@ -1106,29 +1173,6 @@ records.apply("WriteToText",
 ### Beam-provided I/O Transforms
 See the  [Beam-provided I/O Transforms]({{site.baseurl }}/documentation/io/built-in/) page for a list of the currently available I/O transforms.
 
-
-## <a name="running"></a>Running the pipeline
-
-To run your pipeline, use the `run` method. The program you create sends a specification for your pipeline to a pipeline runner, which then constructs and runs the actual series of pipeline operations. Pipelines are executed asynchronously by default.
-
-```java
-pipeline.run();
-```
-
-```py
-pipeline.run()
-```
-
-For blocking execution, append the <span class="language-java">`waitUntilFinish`</span> <span class="language-py">`wait_until_finish`</span> method:
-
-```java
-pipeline.run().waitUntilFinish();
-```
-
-```py
-pipeline.run().wait_until_finish()
-```
-
 ## <a name="coders"></a>Data encoding and type safety
 
 When you create or output pipeline data, you'll need to specify how the elements in your `PCollection`s are encoded and decoded to and from byte strings. Byte strings are used for intermediate storage as well as reading from sources and writing to sinks. The Beam SDKs use objects called coders to describe how the elements of a given `PCollection` should be encoded and decoded.
@@ -1773,4 +1817,3 @@ You can also build other sorts of composite triggers. The following example code
 ```py
   # The Beam SDK for Python does not support triggers.
 ```
-
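
Reviewer note: the "Creating Custom Options" text added above asks users to define an interface with a getter and setter per option, plus an annotation for the default value. The sketch below shows one way such an interface could be backed at runtime using only the Java standard library. It is an illustration under stated assumptions, not a description of Beam's internals: `OptionsProxySketch`, its `as` helper, and the `@DefaultString` annotation are hypothetical stand-ins for `PipelineOptionsFactory` and `@Default.String`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

public class OptionsProxySketch {

  /** Hypothetical default-value annotation; a stand-in, not Beam's @Default.String. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  public @interface DefaultString {
    String value();
  }

  /** Hypothetical options interface in the getter/setter style from the guide. */
  public interface MyOptions {
    @DefaultString("DEFAULT")
    String getMyCustomOption();

    void setMyCustomOption(String myCustomOption);
  }

  /** Returns a proxy that stores values passed to setters and falls back to annotated defaults. */
  public static <T> T as(Class<T> iface) {
    Map<String, Object> values = new HashMap<>();
    InvocationHandler handler = (proxy, method, args) -> {
      String name = method.getName();
      if (name.startsWith("set") && args != null && args.length == 1) {
        // Store under the property name, e.g. "MyCustomOption".
        values.put(name.substring(3), args[0]);
        return null;
      }
      if (name.startsWith("get") && (args == null || args.length == 0)) {
        String key = name.substring(3);
        if (values.containsKey(key)) {
          return values.get(key);
        }
        DefaultString def = method.getAnnotation(DefaultString.class);
        return def == null ? null : def.value();
      }
      throw new UnsupportedOperationException(name);
    };
    return iface.cast(
        Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler));
  }

  public static void main(String[] args) {
    MyOptions options = as(MyOptions.class);
    System.out.println(options.getMyCustomOption()); // prints DEFAULT
    options.setMyCustomOption("value");
    System.out.println(options.getMyCustomOption()); // prints value
  }
}
```

The sketch handles only `String` getters and setters; a getter that returns a primitive with no default would need extra handling, which is omitted here for brevity.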


[2/3] beam-site git commit: Regenerate website

Posted by da...@apache.org.
Regenerate website


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/722bdfb7
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/722bdfb7
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/722bdfb7

Branch: refs/heads/asf-site
Commit: 722bdfb7820a87c318f48fb015b3e7341930900c
Parents: 8ea4481
Author: Davor Bonaci <da...@google.com>
Authored: Thu May 4 00:38:28 2017 -0700
Committer: Davor Bonaci <da...@google.com>
Committed: Thu May 4 00:38:28 2017 -0700

----------------------------------------------------------------------
 .../pipelines/create-your-pipeline/index.html   |  89 +----------
 .../documentation/programming-guide/index.html  | 160 +++++++++++++------
 2 files changed, 116 insertions(+), 133 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/beam-site/blob/722bdfb7/content/documentation/pipelines/create-your-pipeline/index.html
----------------------------------------------------------------------
diff --git a/content/documentation/pipelines/create-your-pipeline/index.html b/content/documentation/pipelines/create-your-pipeline/index.html
index 8911488..6cfe938 100644
--- a/content/documentation/pipelines/create-your-pipeline/index.html
+++ b/content/documentation/pipelines/create-your-pipeline/index.html
@@ -154,14 +154,7 @@
         <h1 id="create-your-pipeline">Create Your Pipeline</h1>
 
 <ul id="markdown-toc">
-  <li><a href="#creating-your-pipeline-object" id="markdown-toc-creating-your-pipeline-object">Creating Your Pipeline Object</a>    <ul>
-      <li><a href="#configuring-pipeline-options" id="markdown-toc-configuring-pipeline-options">Configuring Pipeline Options</a>        <ul>
-          <li><a href="#setting-pipelineoptions-from-command-line-arguments" id="markdown-toc-setting-pipelineoptions-from-command-line-arguments">Setting PipelineOptions from Command-Line Arguments</a></li>
-          <li><a href="#creating-custom-options" id="markdown-toc-creating-custom-options">Creating Custom Options</a></li>
-        </ul>
-      </li>
-    </ul>
-  </li>
+  <li><a href="#creating-your-pipeline-object" id="markdown-toc-creating-your-pipeline-object">Creating Your Pipeline Object</a></li>
   <li><a href="#reading-data-into-your-pipeline" id="markdown-toc-reading-data-into-your-pipeline">Reading Data Into Your Pipeline</a></li>
   <li><a href="#applying-transforms-to-process-pipeline-data" id="markdown-toc-applying-transforms-to-process-pipeline-data">Applying Transforms to Process Pipeline Data</a></li>
   <li><a href="#writing-or-outputting-your-final-pipeline-data" id="markdown-toc-writing-or-outputting-your-final-pipeline-data">Writing or Outputting Your Final Pipeline Data</a></li>
@@ -185,7 +178,7 @@
 
 <p>In the Beam SDKs, each pipeline is represented by an explicit object of type <code class="highlighter-rouge">Pipeline</code>. Each <code class="highlighter-rouge">Pipeline</code> object is an independent entity that encapsulates both the data the pipeline operates over and the transforms that get applied to that data.</p>
 
-<p>To create a pipeline, declare a <code class="highlighter-rouge">Pipeline</code> object, and pass it some configuration options, which are explained in a section below. You pass the configuration options by creating an object of type <code class="highlighter-rouge">PipelineOptions</code>, which you can build by using the static method <code class="highlighter-rouge">PipelineOptionsFactory.create()</code>.</p>
+<p>To create a pipeline, declare a <code class="highlighter-rouge">Pipeline</code> object, and pass it some <a href="/documentation/programming-guide#options">configuration options</a>.</p>
 
 <div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="c1">// Start by defining the options for the pipeline.</span>
 <span class="n">PipelineOptions</span> <span class="n">options</span> <span class="o">=</span> <span class="n">PipelineOptionsFactory</span><span class="o">.</span><span class="na">create</span><span class="o">();</span>
@@ -195,75 +188,6 @@
 </code></pre>
 </div>
 
-<h3 id="configuring-pipeline-options">Configuring Pipeline Options</h3>
-
-<p>Use the pipeline options to configure different aspects of your pipeline, such as the pipeline runner that will execute your pipeline and any runner-specific configuration required by the chosen runner. Your pipeline options will potentially include information such as your project ID or a location for storing files.</p>
-
-<p>When you run the pipeline on a runner of your choice, a copy of the PipelineOptions will be available to your code. For example, you can read PipelineOptions from a DoFn’s Context.</p>
-
-<h4 id="setting-pipelineoptions-from-command-line-arguments">Setting PipelineOptions from Command-Line Arguments</h4>
-
-<p>While you can configure your pipeline by creating a <code class="highlighter-rouge">PipelineOptions</code> object and setting the fields directly, the Beam SDKs include a command-line parser that you can use to set fields in <code class="highlighter-rouge">PipelineOptions</code> using command-line arguments.</p>
-
-<p>To read options from the command-line, construct your <code class="highlighter-rouge">PipelineOptions</code> object as demonstrated in the following example code:</p>
-
-<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="n">MyOptions</span> <span class="n">options</span> <span class="o">=</span> <span class="n">PipelineOptionsFactory</span><span class="o">.</span><span class="na">fromArgs</span><span class="o">(</span><span class="n">args</span><span class="o">).</span><span class="na">withValidation</span><span class="o">().</span><span class="na">create</span><span class="o">();</span>
-</code></pre>
-</div>
-
-<p>This interprets command-line arguments that follow the format:</p>
-
-<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="o">--&lt;</span><span class="n">option</span><span class="o">&gt;=&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-</code></pre>
-</div>
-
-<blockquote>
-  <p><strong>Note:</strong> Appending the method <code class="highlighter-rouge">.withValidation</code> will check for required command-line arguments and validate argument values.</p>
-</blockquote>
-
-<p>Building your <code class="highlighter-rouge">PipelineOptions</code> this way lets you specify any of the options as a command-line argument.</p>
-
-<blockquote>
-  <p><strong>Note:</strong> The <a href="/get-started/wordcount-example">WordCount example pipeline</a> demonstrates how to set pipeline options at runtime by using command-line options.</p>
-</blockquote>
-
-<h4 id="creating-custom-options">Creating Custom Options</h4>
-
-<p>You can add your own custom options in addition to the standard <code class="highlighter-rouge">PipelineOptions</code>. To add your own options, define an interface with getter and setter methods for each option, as in the following example:</p>
-
-<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">interface</span> <span class="nc">MyOptions</span> <span class="kd">extends</span> <span class="n">PipelineOptions</span> <span class="o">{</span>
-    <span class="n">String</span> <span class="nf">getMyCustomOption</span><span class="o">();</span>
-    <span class="kt">void</span> <span class="nf">setMyCustomOption</span><span class="o">(</span><span class="n">String</span> <span class="n">myCustomOption</span><span class="o">);</span>
-  <span class="o">}</span>
-</code></pre>
-</div>
-
-<p>You can also specify a description, which appears when a user passes <code class="highlighter-rouge">--help</code> as a command-line argument, and a default value.</p>
-
-<p>You set the description and default value using annotations, as follows:</p>
-
-<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">interface</span> <span class="nc">MyOptions</span> <span class="kd">extends</span> <span class="n">PipelineOptions</span> <span class="o">{</span>
-    <span class="nd">@Description</span><span class="o">(</span><span class="s">"My custom command line argument."</span><span class="o">)</span>
-    <span class="nd">@Default</span><span class="o">.</span><span class="na">String</span><span class="o">(</span><span class="s">"DEFAULT"</span><span class="o">)</span>
-    <span class="n">String</span> <span class="nf">getMyCustomOption</span><span class="o">();</span>
-    <span class="kt">void</span> <span class="nf">setMyCustomOption</span><span class="o">(</span><span class="n">String</span> <span class="n">myCustomOption</span><span class="o">);</span>
-  <span class="o">}</span>
-</code></pre>
-</div>
-
-<p>It’s recommended that you register your interface with <code class="highlighter-rouge">PipelineOptionsFactory</code> and then pass the interface when creating the <code class="highlighter-rouge">PipelineOptions</code> object. When you register your interface with <code class="highlighter-rouge">PipelineOptionsFactory</code>, the <code class="highlighter-rouge">--help</code> can find your custom options interface and add it to the output of the <code class="highlighter-rouge">--help</code> command. <code class="highlighter-rouge">PipelineOptionsFactory</code> will also validate that your custom options are compatible with all other registered options.</p>
-
-<p>The following example code shows how to register your custom options interface with <code class="highlighter-rouge">PipelineOptionsFactory</code>:</p>
-
-<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="n">PipelineOptionsFactory</span><span class="o">.</span><span class="na">register</span><span class="o">(</span><span class="n">MyOptions</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>
-<span class="n">MyOptions</span> <span class="n">options</span> <span class="o">=</span> <span class="n">PipelineOptionsFactory</span><span class="o">.</span><span class="na">fromArgs</span><span class="o">(</span><span class="n">args</span><span class="o">)</span>
-                                                <span class="o">.</span><span class="na">withValidation</span><span class="o">()</span>
-                                                <span class="o">.</span><span class="na">as</span><span class="o">(</span><span class="n">MyOptions</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>
-</code></pre>
-</div>
-
-<p>Now your pipeline can accept <code class="highlighter-rouge">--myCustomOption=value</code> as a command-line argument.</p>
-
 <h2 id="reading-data-into-your-pipeline">Reading Data Into Your Pipeline</h2>
 
 <p>To create your pipeline’s initial <code class="highlighter-rouge">PCollection</code>, you apply a root transform to your pipeline object. A root transform creates a <code class="highlighter-rouge">PCollection</code> from either an external data source or some local data you specify.</p>
@@ -279,13 +203,7 @@
 
 <h2 id="applying-transforms-to-process-pipeline-data">Applying Transforms to Process Pipeline Data</h2>
 
-<p>To use transforms in your pipeline, you <strong>apply</strong> them to the <code class="highlighter-rouge">PCollection</code> that you want to transform.</p>
-
-<p>To apply a transform, you call the <code class="highlighter-rouge">apply</code> method on each <code class="highlighter-rouge">PCollection</code> that you want to process, passing the desired transform object as an argument.</p>
-
-<p>The Beam SDKs contain a number of different transforms that you can apply to your pipeline’s <code class="highlighter-rouge">PCollection</code>s. These include general-purpose core transforms, such as <a href="/documentation/programming-guide/#transforms-pardo">ParDo</a> or <a href="/documentation/programming-guide/#transforms-combine">Combine</a>. There are also pre-written <a href="/documentation/programming-guide/#transforms-composite">composite transforms</a> included in the SDKs, which combine one or more of the core transforms in a useful processing pattern, such as counting or combining elements in a collection. You can also define your own more complex composite transforms to fit your pipeline’s exact use case.</p>
-
-<p>In the Beam Java SDK, each transform is a subclass of the base class <code class="highlighter-rouge">PTransform</code>. When you call <code class="highlighter-rouge">apply</code> on a <code class="highlighter-rouge">PCollection</code>, you pass the <code class="highlighter-rouge">PTransform</code> you want to use as an argument.</p>
+<p>You can manipulate your data using the various <a href="/documentation/programming-guide/#transforms">transforms</a> provided in the Beam SDKs. To do this, you <strong>apply</strong> the transforms to your pipeline’s <code class="highlighter-rouge">PCollection</code> by calling the <code class="highlighter-rouge">apply</code> method on each <code class="highlighter-rouge">PCollection</code> that you want to process and passing the desired transform object as an argument.</p>
 
 <p>The following code shows how to <code class="highlighter-rouge">apply</code> a transform to a <code class="highlighter-rouge">PCollection</code> of strings. The transform is a user-defined custom transform that reverses the contents of each string and outputs a new <code class="highlighter-rouge">PCollection</code> containing the reversed strings.</p>
 
@@ -326,6 +244,7 @@
 <h2 id="whats-next">What’s next</h2>
 
 <ul>
+  <li><a href="/documentation/programming-guide">Programming Guide</a> - Learn the details of creating your pipeline, configuring pipeline options, and applying transforms.</li>
   <li><a href="/documentation/pipelines/test-your-pipeline">Test your pipeline</a>.</li>
 </ul>
 

http://git-wip-us.apache.org/repos/asf/beam-site/blob/722bdfb7/content/documentation/programming-guide/index.html
----------------------------------------------------------------------
diff --git a/content/documentation/programming-guide/index.html b/content/documentation/programming-guide/index.html
index 5ad4bea..bc71346 100644
--- a/content/documentation/programming-guide/index.html
+++ b/content/documentation/programming-guide/index.html
@@ -167,19 +167,15 @@
 
 <ul>
   <li><a href="#overview">Overview</a></li>
-  <li><a href="#pipeline">Creating the Pipeline</a></li>
+  <li><a href="#pipeline">Creating the Pipeline</a>
+    <ul>
+      <li><a href="#options">Configuring Pipeline Options</a></li>
+    </ul>
+  </li>
   <li><a href="#pcollection">Working with PCollections</a>
     <ul>
       <li><a href="#pccreate">Creating a PCollection</a></li>
-      <li><a href="#pccharacteristics">PCollection Characteristics</a>
-        <ul>
-          <li><a href="#pcelementtype">Element Type</a></li>
-          <li><a href="#pcimmutability">Immutability</a></li>
-          <li><a href="#pcrandomaccess">Random Access</a></li>
-          <li><a href="#pcsizebound">Size and Boundedness</a></li>
-          <li><a href="#pctimestamps">Element Timestamps</a></li>
-        </ul>
-      </li>
+      <li><a href="#pccharacteristics">PCollection Characteristics</a></li>
     </ul>
   </li>
   <li><a href="#transforms">Applying Transforms</a>
@@ -195,7 +191,6 @@
   </li>
   <li><a href="#transforms-composite">Composite Transforms</a></li>
   <li><a href="#io">Pipeline I/O</a></li>
-  <li><a href="#running">Running the Pipeline</a></li>
   <li><a href="#coders">Data Encoding and Type Safety</a></li>
   <li><a href="#windowing">Working with Windowing</a></li>
   <li><a href="#triggers">Working with Triggers</a></li>
@@ -240,25 +235,39 @@
 
 <p>To use Beam, your driver program must first create an instance of the Beam SDK class <code class="highlighter-rouge">Pipeline</code> (typically in the <code class="highlighter-rouge">main()</code> function). When you create your <code class="highlighter-rouge">Pipeline</code>, you’ll also need to set some <strong>configuration options</strong>. You can set your pipeline’s configuration options programmatically, but it’s often easier to set the options ahead of time (or read them from the command line) and pass them to the <code class="highlighter-rouge">Pipeline</code> object when you create the object.</p>
 
-<p>The pipeline configuration options determine, among other things, the <code class="highlighter-rouge">PipelineRunner</code> that determines where the pipeline gets executed: locally, or using a distributed back-end of your choice. Depending on where your pipeline gets executed and what your specifed Runner requires, the options can also help you specify other aspects of execution.</p>
+<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="c1">// Start by defining the options for the pipeline.</span>
+<span class="n">PipelineOptions</span> <span class="n">options</span> <span class="o">=</span> <span class="n">PipelineOptionsFactory</span><span class="o">.</span><span class="na">create</span><span class="o">();</span>
+
+<span class="c1">// Then create the pipeline.</span>
+<span class="n">Pipeline</span> <span class="n">p</span> <span class="o">=</span> <span class="n">Pipeline</span><span class="o">.</span><span class="na">create</span><span class="o">(</span><span class="n">options</span><span class="o">);</span>
+</code></pre>
+</div>
 
-<p>To set your pipeline’s configuration options and create the pipeline, create an object of type <span class="language-java"><a href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/sdk/options/PipelineOptions.html">PipelineOptions</a></span><span class="language-py"><a href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/pipeline_options.py">PipelineOptions</a></span> and pass it to <code class="highlighter-rouge">Pipeline.Create()</code>. The most common way to do this is by parsing arguments from the command-line:</p>
+<div class="language-py highlighter-rouge"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">apache_beam</span> <span class="kn">as</span> <span class="nn">beam</span>
+<span class="kn">from</span> <span class="nn">apache_beam.utils.pipeline_options</span> <span class="kn">import</span> <span class="n">PipelineOptions</span>
 
-<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="n">String</span><span class="o">[]</span> <span class="n">args</span><span class="o">)</span> <span class="o">{</span>
-   <span class="c1">// Will parse the arguments passed into the application and construct a PipelineOptions</span>
-   <span class="c1">// Note that --help will print registered options, and --help=PipelineOptionsClassName</span>
-   <span class="c1">// will print out usage for the specific class.</span>
-   <span class="n">PipelineOptions</span> <span class="n">options</span> <span class="o">=</span>
-       <span class="n">PipelineOptionsFactory</span><span class="o">.</span><span class="na">fromArgs</span><span class="o">(</span><span class="n">args</span><span class="o">).</span><span class="na">create</span><span class="o">();</span>
+<span class="n">p</span> <span class="o">=</span> <span class="n">beam</span><span class="o">.</span><span class="n">Pipeline</span><span class="p">(</span><span class="n">options</span><span class="o">=</span><span class="n">PipelineOptions</span><span class="p">())</span>
 
-   <span class="n">Pipeline</span> <span class="n">p</span> <span class="o">=</span> <span class="n">Pipeline</span><span class="o">.</span><span class="na">create</span><span class="o">(</span><span class="n">options</span><span class="o">);</span>
 </code></pre>
 </div>
 
-<div class="language-py highlighter-rouge"><pre class="highlight"><code><span class="c"># Will parse the arguments passed into the application and construct a PipelineOptions object.</span>
-<span class="c"># Note that --help will print registered options.</span>
+<h3 id="a-nameoptionsaconfiguring-pipeline-options"><a name="options"></a>Configuring Pipeline Options</h3>
 
-<span class="kn">import</span> <span class="nn">apache_beam</span> <span class="kn">as</span> <span class="nn">beam</span>
+<p>Use the pipeline options to configure different aspects of your pipeline, such as the pipeline runner that will execute your pipeline and any runner-specific configuration required by the chosen runner. Your pipeline options will potentially include information such as your project ID or a location for storing files.</p>
+
+<p>When you run the pipeline on a runner of your choice, a copy of the PipelineOptions will be available to your code. For example, you can read PipelineOptions from a DoFn’s Context.</p>
+
+<h4 id="setting-pipelineoptions-from-command-line-arguments">Setting PipelineOptions from Command-Line Arguments</h4>
+
+<p>While you can configure your pipeline by creating a <code class="highlighter-rouge">PipelineOptions</code> object and setting the fields directly, the Beam SDKs include a command-line parser that you can use to set fields in <code class="highlighter-rouge">PipelineOptions</code> using command-line arguments.</p>
+
+<p>To read options from the command-line, construct your <code class="highlighter-rouge">PipelineOptions</code> object as demonstrated in the following example code:</p>
+
+<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="n">PipelineOptions</span> <span class="n">options</span> <span class="o">=</span> <span class="n">PipelineOptionsFactory</span><span class="o">.</span><span class="na">fromArgs</span><span class="o">(</span><span class="n">args</span><span class="o">).</span><span class="na">withValidation</span><span class="o">().</span><span class="na">create</span><span class="o">();</span>
+</code></pre>
+</div>
+
+<div class="language-py highlighter-rouge"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">apache_beam</span> <span class="kn">as</span> <span class="nn">beam</span>
 <span class="kn">from</span> <span class="nn">apache_beam.utils.pipeline_options</span> <span class="kn">import</span> <span class="n">PipelineOptions</span>
 
 <span class="n">p</span> <span class="o">=</span> <span class="n">beam</span><span class="o">.</span><span class="n">Pipeline</span><span class="p">(</span><span class="n">options</span><span class="o">=</span><span class="n">PipelineOptions</span><span class="p">())</span>
@@ -266,7 +275,82 @@
 </code></pre>
 </div>
 
-<p>The Beam SDKs contain various subclasses of <code class="highlighter-rouge">PipelineOptions</code> that correspond to different Runners. For example, <code class="highlighter-rouge">DirectPipelineOptions</code> contains options for the Direct (local) pipeline runner, while <code class="highlighter-rouge">DataflowPipelineOptions</code> contains options for using the runner for Google Cloud Dataflow. You can also define your own custom <code class="highlighter-rouge">PipelineOptions</code> by creating an interface that extends the Beam SDKs’ <code class="highlighter-rouge">PipelineOptions</code> class.</p>
+<p>This interprets command-line arguments that follow the format:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>--&lt;option&gt;=&lt;value&gt;
+</code></pre>
+</div>
+
+<blockquote>
+  <p><strong>Note:</strong> Appending the method <code class="highlighter-rouge">.withValidation</code> will check for required command-line arguments and validate argument values.</p>
+</blockquote>
+
+<p>Building your <code class="highlighter-rouge">PipelineOptions</code> this way lets you specify any of the options as a command-line argument.</p>
+
+<blockquote>
+  <p><strong>Note:</strong> The <a href="/get-started/wordcount-example">WordCount example pipeline</a> demonstrates how to set pipeline options at runtime by using command-line options.</p>
+</blockquote>
+
+<h4 id="creating-custom-options">Creating Custom Options</h4>
+
+<p>You can add your own custom options in addition to the standard <code class="highlighter-rouge">PipelineOptions</code>. To add your own options, define an interface with getter and setter methods for each option, as in the following example:</p>
+
+<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">interface</span> <span class="nc">MyOptions</span> <span class="kd">extends</span> <span class="n">PipelineOptions</span> <span class="o">{</span>
+    <span class="n">String</span> <span class="nf">getMyCustomOption</span><span class="o">();</span>
+    <span class="kt">void</span> <span class="nf">setMyCustomOption</span><span class="o">(</span><span class="n">String</span> <span class="n">myCustomOption</span><span class="o">);</span>
+  <span class="o">}</span>
+</code></pre>
+</div>
+
+<div class="language-py highlighter-rouge"><pre class="highlight"><code><span class="k">class</span> <span class="nc">MyOptions</span><span class="p">(</span><span class="n">PipelineOptions</span><span class="p">):</span>
+
+  <span class="nd">@classmethod</span>
+  <span class="k">def</span> <span class="nf">_add_argparse_args</span><span class="p">(</span><span class="n">cls</span><span class="p">,</span> <span class="n">parser</span><span class="p">):</span>
+    <span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s">'--input'</span><span class="p">)</span>
+    <span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s">'--output'</span><span class="p">)</span>
+
+</code></pre>
+</div>
+
+<p>You can also specify a description, which appears when a user passes <code class="highlighter-rouge">--help</code> as a command-line argument, and a default value.</p>
+
+<p>You set the description and default value using annotations, as follows:</p>
+
+<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">interface</span> <span class="nc">MyOptions</span> <span class="kd">extends</span> <span class="n">PipelineOptions</span> <span class="o">{</span>
+    <span class="nd">@Description</span><span class="o">(</span><span class="s">"My custom command line argument."</span><span class="o">)</span>
+    <span class="nd">@Default</span><span class="o">.</span><span class="na">String</span><span class="o">(</span><span class="s">"DEFAULT"</span><span class="o">)</span>
+    <span class="n">String</span> <span class="nf">getMyCustomOption</span><span class="o">();</span>
+    <span class="kt">void</span> <span class="nf">setMyCustomOption</span><span class="o">(</span><span class="n">String</span> <span class="n">myCustomOption</span><span class="o">);</span>
+  <span class="o">}</span>
+</code></pre>
+</div>
+
+<div class="language-py highlighter-rouge"><pre class="highlight"><code><span class="k">class</span> <span class="nc">MyOptions</span><span class="p">(</span><span class="n">PipelineOptions</span><span class="p">):</span>
+
+  <span class="nd">@classmethod</span>
+  <span class="k">def</span> <span class="nf">_add_argparse_args</span><span class="p">(</span><span class="n">cls</span><span class="p">,</span> <span class="n">parser</span><span class="p">):</span>
+    <span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s">'--input'</span><span class="p">,</span>
+                        <span class="n">help</span><span class="o">=</span><span class="s">'Input for the pipeline'</span><span class="p">,</span>
+                        <span class="n">default</span><span class="o">=</span><span class="s">'gs://my-bucket/input'</span><span class="p">)</span>
+    <span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s">'--output'</span><span class="p">,</span>
+                        <span class="n">help</span><span class="o">=</span><span class="s">'Output for the pipeline'</span><span class="p">,</span>
+                        <span class="n">default</span><span class="o">=</span><span class="s">'gs://my-bucket/output'</span><span class="p">)</span>
+
+</code></pre>
+</div>
+
+<p class="language-java">It’s recommended that you register your interface with <code class="highlighter-rouge">PipelineOptionsFactory</code> and then pass the interface when creating the <code class="highlighter-rouge">PipelineOptions</code> object. When you register your interface with <code class="highlighter-rouge">PipelineOptionsFactory</code>, the <code class="highlighter-rouge">--help</code> command can find your custom options interface and add it to its output. <code class="highlighter-rouge">PipelineOptionsFactory</code> will also validate that your custom options are compatible with all other registered options.</p>
+
+<p class="language-java">The following example code shows how to register your custom options interface with <code class="highlighter-rouge">PipelineOptionsFactory</code>:</p>
+
+<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="n">PipelineOptionsFactory</span><span class="o">.</span><span class="na">register</span><span class="o">(</span><span class="n">MyOptions</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>
+<span class="n">MyOptions</span> <span class="n">options</span> <span class="o">=</span> <span class="n">PipelineOptionsFactory</span><span class="o">.</span><span class="na">fromArgs</span><span class="o">(</span><span class="n">args</span><span class="o">)</span>
+                                                <span class="o">.</span><span class="na">withValidation</span><span class="o">()</span>
+                                                <span class="o">.</span><span class="na">as</span><span class="o">(</span><span class="n">MyOptions</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>
+</code></pre>
+</div>
+
+<p>Now your pipeline can accept <code class="highlighter-rouge">--myCustomOption=value</code> as a command-line argument.</p>
 
 <h2 id="a-namepcollectionaworking-with-pcollections"><a name="pcollection"></a>Working with PCollections</h2>
 
@@ -290,6 +374,7 @@
         <span class="n">PipelineOptionsFactory</span><span class="o">.</span><span class="na">fromArgs</span><span class="o">(</span><span class="n">args</span><span class="o">).</span><span class="na">create</span><span class="o">();</span>
     <span class="n">Pipeline</span> <span class="n">p</span> <span class="o">=</span> <span class="n">Pipeline</span><span class="o">.</span><span class="na">create</span><span class="o">(</span><span class="n">options</span><span class="o">);</span>
 
+    <span class="c1">// Create the PCollection 'lines' by applying a 'Read' transform.</span>
     <span class="n">PCollection</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="n">lines</span> <span class="o">=</span> <span class="n">p</span><span class="o">.</span><span class="na">apply</span><span class="o">(</span>
       <span class="s">"ReadMyFile"</span><span class="o">,</span> <span class="n">TextIO</span><span class="o">.</span><span class="na">Read</span><span class="o">.</span><span class="na">from</span><span class="o">(</span><span class="s">"protocol://path/to/some/inputData.txt"</span><span class="o">));</span>
 <span class="o">}</span>
@@ -386,7 +471,9 @@
 
 <p>In the Beam SDKs, <strong>transforms</strong> are the operations in your pipeline. A transform takes a <code class="highlighter-rouge">PCollection</code> (or more than one <code class="highlighter-rouge">PCollection</code>) as input, performs an operation that you specify on each element in that collection, and produces a new output <code class="highlighter-rouge">PCollection</code>. To invoke a transform, you must <strong>apply</strong> it to the input <code class="highlighter-rouge">PCollection</code>.</p>
 
-<p>In Beam SDK each transform has a generic <code class="highlighter-rouge">apply</code> method <span class="language-py">(or pipe operator <code class="highlighter-rouge">|</code>)</span>. Invoking multiple Beam transforms is similar to <em>method chaining</em>, but with one slight difference: You apply the transform to the input <code class="highlighter-rouge">PCollection</code>, passing the transform itself as an argument, and the operation returns the output <code class="highlighter-rouge">PCollection</code>. This takes the general form:</p>
+<p>The Beam SDKs contain a number of different transforms that you can apply to your pipeline’s <code class="highlighter-rouge">PCollection</code>s. These include general-purpose core transforms, such as <a href="/documentation/programming-guide/#transforms-pardo">ParDo</a> or <a href="/documentation/programming-guide/#transforms-combine">Combine</a>. There are also pre-written <a href="/documentation/programming-guide/#transforms-composite">composite transforms</a> included in the SDKs, which combine one or more of the core transforms in a useful processing pattern, such as counting or combining elements in a collection. You can also define your own more complex composite transforms to fit your pipeline’s exact use case.</p>
+
+<p>Each transform in the Beam SDKs has a generic <code class="highlighter-rouge">apply</code> method <span class="language-py">(or pipe operator <code class="highlighter-rouge">|</code>)</span>. Invoking multiple Beam transforms is similar to <em>method chaining</em>, but with one slight difference: You apply the transform to the input <code class="highlighter-rouge">PCollection</code>, passing the transform itself as an argument, and the operation returns the output <code class="highlighter-rouge">PCollection</code>. This takes the general form:</p>
 
 <div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="o">[</span><span class="n">Output</span> <span class="n">PCollection</span><span class="o">]</span> <span class="o">=</span> <span class="o">[</span><span class="n">Input</span> <span class="n">PCollection</span><span class="o">].</span><span class="na">apply</span><span class="o">([</span><span class="n">Transform</span><span class="o">])</span>
 </code></pre>
@@ -1406,28 +1493,6 @@ guest, [[], [order4]]
 <h3 id="beam-provided-io-transforms">Beam-provided I/O Transforms</h3>
 <p>See the  <a href="/documentation/io/built-in/">Beam-provided I/O Transforms</a> page for a list of the currently available I/O transforms.</p>
 
-<h2 id="a-namerunningarunning-the-pipeline"><a name="running"></a>Running the pipeline</h2>
-
-<p>To run your pipeline, use the <code class="highlighter-rouge">run</code> method. The program you create sends a specification for your pipeline to a pipeline runner, which then constructs and runs the actual series of pipeline operations. Pipelines are executed asynchronously by default.</p>
-
-<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="n">pipeline</span><span class="o">.</span><span class="na">run</span><span class="o">();</span>
-</code></pre>
-</div>
-
-<div class="language-py highlighter-rouge"><pre class="highlight"><code><span class="n">pipeline</span><span class="o">.</span><span class="n">run</span><span class="p">()</span>
-</code></pre>
-</div>
-
-<p>For blocking execution, append the <span class="language-java"><code class="highlighter-rouge">waitUntilFinish</code></span> <span class="language-py"><code class="highlighter-rouge">wait_until_finish</code></span> method:</p>
-
-<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="n">pipeline</span><span class="o">.</span><span class="na">run</span><span class="o">().</span><span class="na">waitUntilFinish</span><span class="o">();</span>
-</code></pre>
-</div>
-
-<div class="language-py highlighter-rouge"><pre class="highlight"><code><span class="n">pipeline</span><span class="o">.</span><span class="n">run</span><span class="p">()</span><span class="o">.</span><span class="n">wait_until_finish</span><span class="p">()</span>
-</code></pre>
-</div>
-
 <h2 id="a-namecodersadata-encoding-and-type-safety"><a name="coders"></a>Data encoding and type safety</h2>
 
 <p>When you create or output pipeline data, you’ll need to specify how the elements in your <code class="highlighter-rouge">PCollection</code>s are encoded and decoded to and from byte strings. Byte strings are used for intermediate storage as well as reading from sources and writing to sinks. The Beam SDKs use objects called coders to describe how the elements of a given <code class="highlighter-rouge">PCollection</code> should be encoded and decoded.</p>
@@ -2097,7 +2162,6 @@ Subsequent transforms, however, are applied to the result of the <code class="hi
 </code></pre>
 </div>
 
-
       </div>
 
 

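Note on the Python MyOptions example added in the diff above: judging by the method name _add_argparse_args and the add_argument calls, the parser follows the conventions of the standard library's argparse. The following is a minimal sketch of how the --<option>=<value> format, default values, and help strings behave. It assumes plain argparse as a stand-in and does not import Beam; build_parser and parse_options are hypothetical helper names, not part of any SDK.

```python
import argparse


def build_parser():
    """Build a parser with the same two options as the MyOptions example."""
    parser = argparse.ArgumentParser(description='Stand-in for MyOptions')
    parser.add_argument('--input',
                        help='Input for the pipeline',
                        default='gs://my-bucket/input')
    parser.add_argument('--output',
                        help='Output for the pipeline',
                        default='gs://my-bucket/output')
    return parser


def parse_options(argv):
    """Parse known options and return (namespace, unrecognized arguments)."""
    # parse_known_args leaves flags it does not recognize untouched, so
    # options meant for another consumer are not treated as errors.
    return build_parser().parse_known_args(argv)


if __name__ == '__main__':
    options, unknown = parse_options(
        ['--input=gs://other-bucket/in', '--runner=DirectRunner'])
    print(options.input)    # value taken from the command line
    print(options.output)   # falls back to the declared default
    print(unknown)          # flags this parser does not define
```

An option omitted from the command line takes its declared default, and the help strings are what argparse prints for --help. Whether a given SDK version tolerates unrecognized flags in the same way is not something this sketch establishes.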

[3/3] beam-site git commit: This closes #229

Posted by da...@apache.org.
This closes #229


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/8c9a89eb
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/8c9a89eb
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/8c9a89eb

Branch: refs/heads/asf-site
Commit: 8c9a89ebf631ab5d446f21704f21506839e7875f
Parents: 7b3e24f 722bdfb
Author: Davor Bonaci <da...@google.com>
Authored: Thu May 4 00:38:29 2017 -0700
Committer: Davor Bonaci <da...@google.com>
Committed: Thu May 4 00:38:29 2017 -0700

----------------------------------------------------------------------
 .../pipelines/create-your-pipeline/index.html   |  89 +----------
 .../documentation/programming-guide/index.html  | 160 +++++++++++++------
 .../pipelines/create-your-pipeline.md           |  76 +--------
 src/documentation/programming-guide.md          | 133 +++++++++------
 4 files changed, 207 insertions(+), 251 deletions(-)
----------------------------------------------------------------------