Posted to commits@beam.apache.org by fr...@apache.org on 2016/08/09 23:10:29 UTC
[1/3] incubator-beam-site git commit: Revise Beam programming guide for new DoFn
Repository: incubator-beam-site
Updated Branches:
refs/heads/asf-site 4f1473477 -> e2430eb4d
Revise Beam programming guide for new DoFn
Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/303864a3
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/303864a3
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/303864a3
Branch: refs/heads/asf-site
Commit: 303864a311abca93170ee1693a42a5e265e37a35
Parents: 4f14734
Author: Kenneth Knowles <kl...@google.com>
Authored: Mon Aug 8 10:09:43 2016 -0700
Committer: Kenneth Knowles <kl...@google.com>
Committed: Mon Aug 8 10:09:43 2016 -0700
----------------------------------------------------------------------
learn/programming-guide.md | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/303864a3/learn/programming-guide.md
----------------------------------------------------------------------
diff --git a/learn/programming-guide.md b/learn/programming-guide.md
index 92cf17c..ac18ba6 100644
--- a/learn/programming-guide.md
+++ b/learn/programming-guide.md
@@ -271,11 +271,11 @@ A `DoFn` processes one element at a time from the input `PCollection`. When you
static class ComputeWordLengthFn extends DoFn<String, Integer> { ... }
```
-Inside your `DoFn` subclass, you'll need to override the method `processElement`, where you provide the actual processing logic. You don't need to manually extract the elements from the input collection; the Beam SDKs handle that for you. Your override of `processElement` should accept an object of type `ProcessContext`. The `ProcessContext` object gives you access to an input element and a method for emitting an output element:
+Inside your `DoFn` subclass, you'll write a method annotated with `@ProcessElement` where you provide the actual processing logic. You don't need to manually extract the elements from the input collection; the Beam SDKs handle that for you. Your `@ProcessElement` method should accept an object of type `ProcessContext`. The `ProcessContext` object gives you access to an input element and a method for emitting an output element:
```java
static class ComputeWordLengthFn extends DoFn<String, Integer> {
- @Override
+ @ProcessElement
public void processElement(ProcessContext c) {
// Get the input element from ProcessContext.
String word = c.element();
@@ -287,9 +287,9 @@ static class ComputeWordLengthFn extends DoFn<String, Integer> {
> **Note:** If the elements in your input `PCollection` are key/value pairs, you can access the key or value by using `ProcessContext.element().getKey()` or `ProcessContext.element().getValue()`, respectively.
-A given `DoFn` instance generally gets invoked one or more times to process some arbitrary bundle of elements. However, Beam doesn't guarantee an exact number of invocations; it may be invoked multiple times on a given worker node to account for failures and retries. As such, you can cache information across multiple calls to `processElement`, but if you do so, make sure the implementation **does not depend on the number of invocations**.
+A given `DoFn` instance generally gets invoked one or more times to process some arbitrary bundle of elements. However, Beam doesn't guarantee an exact number of invocations; it may be invoked multiple times on a given worker node to account for failures and retries. As such, you can cache information across multiple calls to your `@ProcessElement` method, but if you do so, make sure the implementation **does not depend on the number of invocations**.
-When you override `processElement`, you'll need to meet some immutability requirements to ensure that Beam and the processing back-end can safely serialize and cache the values in your pipeline. Your method should meet the following requirements:
+In your `@ProcessElement` method, you'll also need to meet some immutability requirements to ensure that Beam and the processing back-end can safely serialize and cache the values in your pipeline. Your method should meet the following requirements:
* You should not in any way modify an element returned by `ProcessContext.element()` or `ProcessContext.sideInput()` (the incoming elements from the input collection).
* Once you output a value using `ProcessContext.output()` or `ProcessContext.sideOutput()`, you should not modify that value in any way.
@@ -310,7 +310,7 @@ PCollection<Integer> wordLengths = words.apply(
ParDo
.named("ComputeWordLengths") // the transform name
.of(new DoFn<String, Integer>() { // a DoFn as an anonymous inner class instance
- @Override
+ @ProcessElement
public void processElement(ProcessContext c) {
c.output(c.element().length());
}
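Note on the change above: with the new `DoFn`, the processing method is identified by the `@ProcessElement` annotation rather than by overriding a base-class method. The following is a minimal, self-contained sketch of that idea using plain Java reflection. It does not use Beam and makes no claim about how Beam dispatches internally; the `ProcessElement` annotation, the `Context` class, and the `runAll` helper defined here are hypothetical stand-ins introduced only for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProcessElementSketch {

  // Hypothetical stand-in for Beam's annotation (not the real one).
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  public @interface ProcessElement {}

  // Hypothetical, much-reduced stand-in for ProcessContext:
  // exposes one input element and collects emitted outputs.
  public static class Context<I, O> {
    private final I element;
    private final List<O> outputs;

    Context(I element, List<O> outputs) {
      this.element = element;
      this.outputs = outputs;
    }

    public I element() {
      return element;
    }

    public void output(O value) {
      outputs.add(value);
    }
  }

  // No base class and no @Override: the method is found by its annotation.
  public static class ComputeWordLengthFn {
    @ProcessElement
    public void processElement(Context<String, Integer> c) {
      // Read the input without modifying it, then emit its length.
      c.output(c.element().length());
    }
  }

  // Toy driver (an assumption of this sketch): locate the annotated
  // method and invoke it once per input element.
  public static <I, O> List<O> runAll(Object fn, List<I> inputs) {
    Method target = null;
    for (Method m : fn.getClass().getMethods()) {
      if (m.isAnnotationPresent(ProcessElement.class)) {
        target = m;
        break;
      }
    }
    if (target == null) {
      throw new IllegalArgumentException("no @ProcessElement method on " + fn.getClass());
    }
    List<O> outputs = new ArrayList<>();
    try {
      for (I input : inputs) {
        target.invoke(fn, new Context<I, O>(input, outputs));
      }
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("invocation failed", e);
    }
    return outputs;
  }

  public static void main(String[] args) {
    List<Integer> lengths = runAll(new ComputeWordLengthFn(), Arrays.asList("beam", "pipeline"));
    System.out.println(lengths); // prints [4, 8]
  }
}
```

Because the driver looks the method up by annotation, the method name `processElement` is a convention here rather than a requirement, which is consistent with the guide's switch from "override `processElement`" to "your `@ProcessElement` method".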
[2/3] incubator-beam-site git commit: regenerated after merge
Posted by fr...@apache.org.
regenerated after merge
Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/ddbed01e
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/ddbed01e
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/ddbed01e
Branch: refs/heads/asf-site
Commit: ddbed01ee859bde2976693b85042aabdcee64083
Parents: 303864a
Author: Frances Perry <fj...@google.com>
Authored: Tue Aug 9 15:02:55 2016 -0700
Committer: Frances Perry <fj...@google.com>
Committed: Tue Aug 9 15:02:55 2016 -0700
----------------------------------------------------------------------
content/feed.xml | 4 ++--
content/learn/programming-guide/index.html | 10 +++++-----
content/learn/runners/capability-matrix/index.html | 2 +-
3 files changed, 8 insertions(+), 8 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/ddbed01e/content/feed.xml
----------------------------------------------------------------------
diff --git a/content/feed.xml b/content/feed.xml
index 2b5a914..53aed33 100644
--- a/content/feed.xml
+++ b/content/feed.xml
@@ -6,8 +6,8 @@
</description>
<link>http://beam.incubator.apache.org/</link>
<atom:link href="http://beam.incubator.apache.org/feed.xml" rel="self" type="application/rss+xml"/>
- <pubDate>Thu, 04 Aug 2016 09:41:39 -0700</pubDate>
- <lastBuildDate>Thu, 04 Aug 2016 09:41:39 -0700</lastBuildDate>
+ <pubDate>Tue, 09 Aug 2016 15:00:40 -0700</pubDate>
+ <lastBuildDate>Tue, 09 Aug 2016 15:00:40 -0700</lastBuildDate>
<generator>Jekyll v3.2.0</generator>
<item>
http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/ddbed01e/content/learn/programming-guide/index.html
----------------------------------------------------------------------
diff --git a/content/learn/programming-guide/index.html b/content/learn/programming-guide/index.html
index 1b721de..d4d1127 100644
--- a/content/learn/programming-guide/index.html
+++ b/content/learn/programming-guide/index.html
@@ -435,10 +435,10 @@
</code></pre>
</div>
-<p>Inside your <code class="highlighter-rouge">DoFn</code> subclass, you’ll need to override the method <code class="highlighter-rouge">processElement</code>, where you provide the actual processing logic. You don’t need to manually extract the elements from the input collection; the Beam SDKs handle that for you. Your override of <code class="highlighter-rouge">processElement</code> should accept an object of type <code class="highlighter-rouge">ProcessContext</code>. The <code class="highlighter-rouge">ProcessContext</code> object gives you access to an input element and a method for emitting an output element:</p>
+<p>Inside your <code class="highlighter-rouge">DoFn</code> subclass, you’ll write a method annotated with <code class="highlighter-rouge">@ProcessElement</code> where you provide the actual processing logic. You don’t need to manually extract the elements from the input collection; the Beam SDKs handle that for you. Your <code class="highlighter-rouge">@ProcessElement</code> method should accept an object of type <code class="highlighter-rouge">ProcessContext</code>. The <code class="highlighter-rouge">ProcessContext</code> object gives you access to an input element and a method for emitting an output element:</p>
<div class="language-java highlighter-rouge"><pre class="highlight"><code><span class="kd">static</span> <span class="kd">class</span> <span class="nc">ComputeWordLengthFn</span> <span class="kd">extends</span> <span class="n">DoFn</span><span class="o"><</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">></span> <span class="o">{</span>
- <span class="nd">@Override</span>
+ <span class="nd">@ProcessElement</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">processElement</span><span class="o">(</span><span class="n">ProcessContext</span> <span class="n">c</span><span class="o">)</span> <span class="o">{</span>
<span class="c1">// Get the input element from ProcessContext.</span>
<span class="n">String</span> <span class="n">word</span> <span class="o">=</span> <span class="n">c</span><span class="o">.</span><span class="na">element</span><span class="o">();</span>
@@ -453,9 +453,9 @@
<p><strong>Note:</strong> If the elements in your input <code class="highlighter-rouge">PCollection</code> are key/value pairs, you can access the key or value by using <code class="highlighter-rouge">ProcessContext.element().getKey()</code> or <code class="highlighter-rouge">ProcessContext.element().getValue()</code>, respectively.</p>
</blockquote>
-<p>A given <code class="highlighter-rouge">DoFn</code> instance generally gets invoked one or more times to process some arbitrary bundle of elements. However, Beam doesn’t guarantee an exact number of invocations; it may be invoked multiple times on a given worker node to account for failures and retries. As such, you can cache information across multiple calls to <code class="highlighter-rouge">processElement</code>, but if you do so, make sure the implementation <strong>does not depend on the number of invocations</strong>.</p>
+<p>A given <code class="highlighter-rouge">DoFn</code> instance generally gets invoked one or more times to process some arbitrary bundle of elements. However, Beam doesn’t guarantee an exact number of invocations; it may be invoked multiple times on a given worker node to account for failures and retries. As such, you can cache information across multiple calls to your <code class="highlighter-rouge">@ProcessElement</code> method, but if you do so, make sure the implementation <strong>does not depend on the number of invocations</strong>.</p>
-<p>When you override <code class="highlighter-rouge">processElement</code>, you’ll need to meet some immutability requirements to ensure that Beam and the processing back-end can safely serialize and cache the values in your pipeline. Your method should meet the following requirements:</p>
+<p>In your <code class="highlighter-rouge">@ProcessElement</code> method, you’ll also need to meet some immutability requirements to ensure that Beam and the processing back-end can safely serialize and cache the values in your pipeline. Your method should meet the following requirements:</p>
<ul>
<li>You should not in any way modify an element returned by <code class="highlighter-rouge">ProcessContext.element()</code> or <code class="highlighter-rouge">ProcessContext.sideInput()</code> (the incoming elements from the input collection).</li>
@@ -477,7 +477,7 @@
<span class="n">ParDo</span>
<span class="o">.</span><span class="na">named</span><span class="o">(</span><span class="s">"ComputeWordLengths"</span><span class="o">)</span> <span class="c1">// the transform name</span>
<span class="o">.</span><span class="na">of</span><span class="o">(</span><span class="k">new</span> <span class="n">DoFn</span><span class="o"><</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">>()</span> <span class="o">{</span> <span class="c1">// a DoFn as an anonymous inner class instance</span>
- <span class="nd">@Override</span>
+ <span class="nd">@ProcessElement</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">processElement</span><span class="o">(</span><span class="n">ProcessContext</span> <span class="n">c</span><span class="o">)</span> <span class="o">{</span>
<span class="n">c</span><span class="o">.</span><span class="na">output</span><span class="o">(</span><span class="n">c</span><span class="o">.</span><span class="na">element</span><span class="o">().</span><span class="na">length</span><span class="o">());</span>
<span class="o">}</span>
http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/ddbed01e/content/learn/runners/capability-matrix/index.html
----------------------------------------------------------------------
diff --git a/content/learn/runners/capability-matrix/index.html b/content/learn/runners/capability-matrix/index.html
index 2357778..ae72470 100644
--- a/content/learn/runners/capability-matrix/index.html
+++ b/content/learn/runners/capability-matrix/index.html
@@ -140,7 +140,7 @@
<div class="row">
<h1 id="beam-capability-matrix">Beam Capability Matrix</h1>
-<p><span style="font-size:11px;float:none">Last updated: 2016-08-04 09:41 PDT</span></p>
+<p><span style="font-size:11px;float:none">Last updated: 2016-08-09 15:00 PDT</span></p>
<p>Apache Beam (incubating) provides a portable API layer for building sophisticated data-parallel processing engines that may be executed across a diversity of exeuction engines, or <i>runners</i>. The core concepts of this layer are based upon the Beam Model (formerly referred to as the <a href="http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf">Dataflow Model</a>), and implemented to varying degrees in each Beam runner. To help clarify the capabilities of individual runners, we’ve created the capability matrix below.</p>
[3/3] incubator-beam-site git commit: This closes #36
Posted by fr...@apache.org.
This closes #36
Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/e2430eb4
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/e2430eb4
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/e2430eb4
Branch: refs/heads/asf-site
Commit: e2430eb4de503640ea32a7de35d1506cb85c0965
Parents: 4f14734 ddbed01
Author: Frances Perry <fj...@google.com>
Authored: Tue Aug 9 15:04:38 2016 -0700
Committer: Frances Perry <fj...@google.com>
Committed: Tue Aug 9 15:04:38 2016 -0700
----------------------------------------------------------------------
content/feed.xml | 4 ++--
content/learn/programming-guide/index.html | 10 +++++-----
content/learn/runners/capability-matrix/index.html | 2 +-
learn/programming-guide.md | 10 +++++-----
4 files changed, 13 insertions(+), 13 deletions(-)
----------------------------------------------------------------------