You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@beam.apache.org by gi...@apache.org on 2019/09/24 00:17:07 UTC

[beam] branch asf-site updated: Publishing website 2019/09/24 00:16:54 at commit 113461a

This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new e18160c  Publishing website 2019/09/24 00:16:54 at commit 113461a
e18160c is described below

commit e18160cfccadba882c97e3277d87f2547df4862a
Author: jenkins <bu...@apache.org>
AuthorDate: Tue Sep 24 00:16:54 2019 +0000

    Publishing website 2019/09/24 00:16:54 at commit 113461a
---
 .../documentation/runners/direct/index.html        | 76 ++++++++++++++++++++++
 .../documentation/sdks/python-streaming/index.html |  7 +-
 2 files changed, 77 insertions(+), 6 deletions(-)

diff --git a/website/generated-content/documentation/runners/direct/index.html b/website/generated-content/documentation/runners/direct/index.html
index 994d5df..8187175 100644
--- a/website/generated-content/documentation/runners/direct/index.html
+++ b/website/generated-content/documentation/runners/direct/index.html
@@ -208,6 +208,7 @@
     <ul>
       <li><a href="#memory-considerations">Memory considerations</a></li>
       <li><a href="#streaming-execution">Streaming execution</a></li>
+      <li><a href="#execution-mode">Execution Mode</a></li>
     </ul>
   </li>
 </ul>
@@ -295,6 +296,81 @@ interface for defaults and additional pipeline configuration options.</p>
 
 <p>If your pipeline uses an unbounded data source or sink, you must set the <code class="highlighter-rouge">streaming</code> option to <code class="highlighter-rouge">true</code>.</p>
 
+<h3 id="execution-mode">Execution Mode</h3>
+
+<p>Python <a href="https://beam.apache.org/contribute/runner-guide/#the-fn-api">FnApiRunner</a> supports multi-threading and multi-processing mode.</p>
+
+<h4 id="setting-parallelism">Setting parallelism</h4>
+
+<p>Number of threads or subprocesses is defined by setting the <code class="highlighter-rouge">direct_num_workers</code> option. There are several ways to set this option.</p>
+
+<ul>
+  <li>Passing through CLI when executing a pipeline.
+    <div class="highlighter-rouge"><pre class="highlight"><code>python wordcount.py --input xx --output xx --direct_num_workers 2
+</code></pre>
+    </div>
+  </li>
+  <li>Setting with <code class="highlighter-rouge">PipelineOptions</code>.
+    <div class="highlighter-rouge"><pre class="highlight"><code>from apache_beam.options.pipeline_options import PipelineOptions
+pipeline_options = PipelineOptions(['--direct_num_workers', '2'])
+</code></pre>
+    </div>
+  </li>
+  <li>Adding to existing <code class="highlighter-rouge">PipelineOptions</code>.
+    <div class="highlighter-rouge"><pre class="highlight"><code>from apache_beam.options.pipeline_options import DirectOptions
+pipeline_options = PipelineOptions(xxx)
+pipeline_options.view_as(DirectOptions).direct_num_workers = 2
+</code></pre>
+    </div>
+  </li>
+</ul>
+
+<h4 id="running-with-multi-threading-mode">Running with multi-threading mode</h4>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>import argparse
+
+import apache_beam as beam
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.runners.portability import fn_api_runner
+from apache_beam.portability.api import beam_runner_api_pb2
+from apache_beam.portability import python_urns
+
+parser = argparse.ArgumentParser()
+parser.add_argument(...)
+known_args, pipeline_args = parser.parse_known_args(argv)
+pipeline_options = PipelineOptions(pipeline_args)
+
+p = beam.Pipeline(options=pipeline_options,
+      runner=fn_api_runner.FnApiRunner(
+          default_environment=beam_runner_api_pb2.Environment(
+          urn=python_urns.EMBEDDED_PYTHON_GRPC)))
+</code></pre>
+</div>
+
+<h4 id="running-with-multi-processing-mode">Running with multi-processing mode</h4>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>import argparse
+import sys
+
+import apache_beam as beam
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.runners.portability import fn_api_runner
+from apache_beam.portability.api import beam_runner_api_pb2
+from apache_beam.portability import python_urns
+
+parser = argparse.ArgumentParser()
+parser.add_argument(...)
+known_args, pipeline_args = parser.parse_known_args(argv)
+pipeline_options = PipelineOptions(pipeline_args)
+
+p = beam.Pipeline(options=pipeline_options,
+      runner=fn_api_runner.FnApiRunner(
+          default_environment=beam_runner_api_pb2.Environment(
+              urn=python_urns.SUBPROCESS_SDK,
+              payload=b'%s -m apache_beam.runners.worker.sdk_worker_main'
+                        % sys.executable.encode('ascii'))))
+</code></pre>
+</div>
 
       </div>
     </div>
diff --git a/website/generated-content/documentation/sdks/python-streaming/index.html b/website/generated-content/documentation/sdks/python-streaming/index.html
index 4531074..342cd83 100644
--- a/website/generated-content/documentation/sdks/python-streaming/index.html
+++ b/website/generated-content/documentation/sdks/python-streaming/index.html
@@ -451,7 +451,7 @@ about executing streaming pipelines:</p>
   <li>Custom source API</li>
   <li>Splittable <code class="highlighter-rouge">DoFn</code> API</li>
   <li>Handling of late data</li>
-  <li>User-defined custom <code class="highlighter-rouge">WindowFn</code></li>
+  <li>User-defined custom merging <code class="highlighter-rouge">WindowFn</code> (with fnapi)</li>
 </ul>
 
 <h3 id="dataflowrunner-specific-features">DataflowRunner specific features</h3>
@@ -460,12 +460,7 @@ about executing streaming pipelines:</p>
 Dataflow specific features with Python streaming execution.</p>
 
 <ul>
-  <li>Streaming autoscaling</li>
-  <li>Updating existing pipelines</li>
   <li>Cloud Dataflow Templates</li>
-  <li>Some monitoring features, such as msec counters, display data, metrics, and
-element counts for transforms. However, logging, watermarks, and element
-counts for sources are supported.</li>
 </ul>