Posted to commits@beam.apache.org by gi...@apache.org on 2022/10/21 04:15:56 UTC

[beam] branch asf-site updated: Publishing website 2022/10/21 04:15:49 at commit 69fe1cc

This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 6fb9ec387a1 Publishing website 2022/10/21 04:15:49 at commit 69fe1cc
6fb9ec387a1 is described below

commit 6fb9ec387a13467c965a01602d1bbf579e43b42c
Author: jenkins <bu...@apache.org>
AuthorDate: Fri Oct 21 04:15:49 2022 +0000

    Publishing website 2022/10/21 04:15:49 at commit 69fe1cc
---
 .../sdks/java-multi-language-pipelines/index.html  | 57 ++++++++++++++++------
 website/generated-content/sitemap.xml              |  2 +-
 2 files changed, 44 insertions(+), 15 deletions(-)

diff --git a/website/generated-content/documentation/sdks/java-multi-language-pipelines/index.html b/website/generated-content/documentation/sdks/java-multi-language-pipelines/index.html
index 8dee467d6de..92b81bb1db1 100644
--- a/website/generated-content/documentation/sdks/java-multi-language-pipelines/index.html
+++ b/website/generated-content/documentation/sdks/java-multi-language-pipelines/index.html
@@ -19,7 +19,7 @@
 function addPlaceholder(){$('input:text').attr('placeholder',"What are you looking for?");}
 function endSearch(){var search=document.querySelector(".searchBar");search.classList.add("disappear");var icons=document.querySelector("#iconsBar");icons.classList.remove("disappear");}
 function blockScroll(){$("body").toggleClass("fixedPosition");}
-function openMenu(){addPlaceholder();blockScroll();}</script><div class="clearfix container-main-content"><div class="section-nav closed" data-offset-top=90 data-offset-bottom=500><span class="section-nav-back glyphicon glyphicon-menu-left"></span><nav><ul class=section-nav-list data-section-nav><li><span class=section-nav-list-main-title>Languages</span></li><li><span class=section-nav-list-title>Java</span><ul class=section-nav-list><li><a href=/documentation/sdks/java/>Java SDK overvi [...]
+function openMenu(){addPlaceholder();blockScroll();}</script><div class="clearfix container-main-content"><div class="section-nav closed" data-offset-top=90 data-offset-bottom=500><span class="section-nav-back glyphicon glyphicon-menu-left"></span><nav><ul class=section-nav-list data-section-nav><li><span class=section-nav-list-main-title>Languages</span></li><li><span class=section-nav-list-title>Java</span><ul class=section-nav-list><li><a href=/documentation/sdks/java/>Java SDK overvi [...]
 with the Apache Beam SDK for Java. For a more complete discussion of the topic,
 see
 <a href=/documentation/programming-guide/#multi-language-pipelines>Multi-language pipelines</a>.</p><p>A <em>multi-language pipeline</em> is a pipeline that’s built in one Beam SDK language
@@ -96,19 +96,20 @@ a function to <code>DataframeTransform</code>, see
 <a href=/documentation/dsls/dataframes/overview/#embedding-dataframes-in-a-pipeline>Embedding DataFrames in a pipeline</a>.</p><h2 id=run-the-java-pipeline>Run the Java pipeline</h2><p>If you want to customize the environment or use transforms not available in the
 default Beam SDK, you might need to run your own expansion service. In such
 cases, <a href=#advanced-start-an-expansion-service>start the expansion service</a>
-before running your pipeline.</p><p>Here we&rsquo;ve provided commands for running the example pipeline using
-Gradle on a <a href=https://github.com/apache/beam>Beam HEAD Git clone</a>.
-If you need a more stable environment, please
-<a href=/get-started/quickstart-java/>setup a Java project</a> that uses the latest
-released Beam version and include the necessary dependencies.</p><h3 id=run-with-dataflow-runner>Run with Dataflow runner</h3><p>The following script runs the example multi-language pipeline on Dataflow, using
+before running your pipeline.</p><h3 id=run-with-dataflow-runner-at-head-beam-2410-and-later>Run with Dataflow runner at HEAD (Beam 2.41.0 and later)</h3><blockquote><p><strong>Note:</strong> Due to <a href=https://github.com/apache/beam/issues/23717>issue#23717</a>,
+Beam 2.42.0 requires manually starting up an expansion service (see
+<a href=https://beam.apache.org/documentation/sdks/java-multi-language-pipelines/#advanced-start-an-expansion-service>these instructions</a>)
+and using the additional pipeline option <code>--expansionService=localhost:&lt;PORT></code>
+when executing the pipeline.</p></blockquote><p>The following script runs the example multi-language pipeline on Dataflow, using
 example text from a Cloud Storage bucket. You’ll need to adapt the script to
-your environment.</p><pre><code>export OUTPUT_BUCKET=&lt;bucket&gt;
+your environment.</p><pre><code>export GCP_PROJECT=&lt;project&gt;
+export OUTPUT_BUCKET=&lt;bucket&gt;
 export GCP_REGION=&lt;region&gt;
 export TEMP_LOCATION=gs://$OUTPUT_BUCKET/tmp
-export PYTHON_VERSION=&lt;version&gt;
 
 ./gradlew :examples:multi-language:pythonDataframeWordCount --args=&quot; \
 --runner=DataflowRunner \
+--project=$GCP_PROJECT \
 --output=gs://${OUTPUT_BUCKET}/count \
 --region=${GCP_REGION}&quot;
 </code></pre><p>The pipeline outputs a file with the results to
@@ -120,9 +121,12 @@ Please see <a href=/get-started/quickstart-py/>here</a> for instructions.</li><l
 python -m apache_beam.runners.portability.local_job_service_main -p $JOB_SERVER_PORT
 </code></pre><ol start=3><li><p>In a different shell, go to a <a href=https://github.com/apache/beam>Beam HEAD Git clone</a>.</p></li><li><p>Build the Beam Java SDK container for a local pipeline execution
 (this guide requires that your JAVA_HOME is set to Java 11).</p></li></ol><pre><code>./gradlew :sdks:java:container:java11:docker
-</code></pre><ol start=5><li>Run the pipeline.</li></ol><pre><code>export JOB_SERVER_PORT=&lt;port&gt;  # Same port as before
+</code></pre><ol start=5><li>Run the pipeline.</li></ol><blockquote><p><strong>Note:</strong> Due to <a href=https://github.com/apache/beam/issues/23717>issue#23717</a>,
+Beam 2.42.0 requires manually starting up an expansion service (see
+<a href=https://beam.apache.org/documentation/sdks/java-multi-language-pipelines/#advanced-start-an-expansion-service>these instructions</a>)
+and using the additional pipeline option <code>--expansionService=localhost:&lt;PORT></code>
+when executing the pipeline.</p></blockquote><pre><code>export JOB_SERVER_PORT=&lt;port&gt;  # Same port as before
 export OUTPUT_FILE=&lt;local relative path&gt;
-export PYTHON_VERSION=&lt;version&gt;
 
 ./gradlew :examples:multi-language:pythonDataframeWordCount --args=&quot; \
 --runner=PortableRunner \
@@ -141,13 +145,38 @@ starting up the expansion service. But if you want to customize the environment
 or use transforms not available in the default Beam SDK, you might need to run
 your own expansion service.</p><p>For example, to start the standard expansion service for a Python transform,
 <a href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/expansion_service.py>ExpansionServiceServicer</a>,
-follow these steps:</p><ol><li><p>Activate a Python virtual environment and install Apache Beam, as described
-in the <a href=/get-started/quickstart-py/>Python quick start</a>.</p></li><li><p>In the <strong>beam/sdks/python</strong> directory of the Beam source code, run the
-following command:</p><pre><code>python apache_beam/runners/portability/expansion_service_main.py -p 18089 --fully_qualified_name_glob &quot;*&quot;
+follow these steps:</p><ol><li><p>Activate a new virtual environment following
+<a href=https://beam.apache.org/get-started/quickstart-py/#create-and-activate-a-virtual-environment>these instructions</a>.</p></li><li><p>Install Apache Beam with <code>gcp</code> and <code>dataframe</code> packages.</p></li></ol><pre><code>pip install apache-beam[gcp,dataframe]
+</code></pre><ol start=3><li><p>Run the following command:</p><pre><code>python -m apache_beam.runners.portability.expansion_service_main -p &lt;PORT&gt; --fully_qualified_name_glob &quot;*&quot;
 </code></pre></li></ol><p>The command runs
 <a href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/expansion_service_main.py>expansion_service_main.py</a>, which starts the standard expansion service. When you use
 Gradle to run your Java pipeline, you can specify the expansion service with the
-<code>expansionService</code> option. For example: <code>--expansionService=localhost:18089</code>.</p><h2 id=next-steps>Next steps</h2><p>To learn more about Beam support for cross-language pipelines, see
+<code>expansionService</code> option. For example: <code>--expansionService=localhost:&lt;PORT></code>.</p><h3 id=run-with-dataflow-runner-using-a-beam-release-beam-2430-and-later>Run with Dataflow runner using a Beam release (Beam 2.43.0 and later)</h3><blockquote><p><strong>Note:</strong> Due to <a href=https://github.com/apache/beam/issues/23717>issue#23717</a>,
+Beam 2.42.0 requires manually starting up an expansion service (see
+<a href=https://beam.apache.org/documentation/sdks/java-multi-language-pipelines/#advanced-start-an-expansion-service>these instructions</a>)
+and using the additional pipeline option <code>--expansionService=localhost:&lt;PORT></code>
+when executing the pipeline.</p></blockquote><ul><li>Check out the Beam examples Maven archetype for the relevant Beam version.</li></ul><pre><code>export BEAM_VERSION=&lt;Beam version&gt;
+
+mvn archetype:generate \
+    -DarchetypeGroupId=org.apache.beam \
+    -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
+    -DarchetypeVersion=$BEAM_VERSION \
+    -DgroupId=org.example \
+    -DartifactId=multi-language-beam \
+    -Dversion=&quot;0.1&quot; \
+    -Dpackage=org.apache.beam.examples \
+    -DinteractiveMode=false
+</code></pre><ul><li>Run the pipeline.</li></ul><pre><code>export GCP_PROJECT=&lt;GCP project&gt;
+export GCP_BUCKET=&lt;GCP bucket&gt;
+export GCP_REGION=&lt;GCP region&gt;
+
+mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.multilanguage.PythonDataframeWordCount \
+    -Dexec.args=&quot;--runner=DataflowRunner --project=$GCP_PROJECT \
+                 --region=$GCP_REGION \
+                 --gcpTempLocation=gs://$GCP_BUCKET/multi-language-beam/tmp \
+                 --output=gs://$GCP_BUCKET/multi-language-beam/output&quot; \
+    -Pdataflow-runner
+</code></pre><h2 id=next-steps>Next steps</h2><p>To learn more about Beam support for cross-language pipelines, see
 <a href=/documentation/programming-guide/#multi-language-pipelines>Multi-language pipelines</a>.
 To learn more about the Beam DataFrame API, see
 <a href=/documentation/dsls/dataframes/overview/>Beam DataFrames overview</a>.</p></div></div><footer class=footer><div class=footer__contained><div class=footer__cols><div class="footer__cols__col footer__cols__col__logos"><div class=footer__cols__col__logo><img src=/images/beam_logo_circle.svg class=footer__logo alt="Beam logo"></div><div class=footer__cols__col__logo><img src=/images/apache_logo_circle.svg class=footer__logo alt="Apache logo"></div></div><div class=footer-wrapper><div [...]
diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml
index e4ea3823738..39e84cb2573 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.42.0/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/blog/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/categories/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/catego [...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.42.0/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/blog/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/categories/</loc><lastmod>2022-10-17T09:50:38-07:00</lastmod></url><url><loc>/catego [...]
\ No newline at end of file