Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/08/29 14:54:58 UTC

[GitHub] [beam] pcoet commented on a diff in pull request #22927: Updates to multi-lang Java quickstart

pcoet commented on code in PR #22927:
URL: https://github.com/apache/beam/pull/22927#discussion_r957445346


##########
website/www/site/content/en/documentation/sdks/java-multi-language-pipelines.md:
##########
@@ -155,6 +158,50 @@ export PYTHON_VERSION=<version>
 The pipeline outputs a file with the results to
 **gs://$OUTPUT_BUCKET/count-00000-of-00001**.
 
+### Run with DirectRunner
+
+> **Note:** Multi-language Pipelines need to use [portable](/roadmap/portability/)
+> runners. Portable DirectRunner is still experimental and does not support all
+> Beam features.
+
+1. Create a Python virtual environment with the latest version of Beam Python SDK installed.
+   Please see [here](/get-started/quickstart-py/) for instructions.
+2. Run the job server for portable DirectRunner (implemented in Python)
+
+```
+export JOB_SERVER_PORT=<port>
+
+python -m apache_beam.runners.portability.local_job_service_main -p $JOB_SERVER_PORT
+```
+
+3. In a different shell, go to a [Beam Git clone](https://github.com/apache/beam).
+
+4. Build the Beam Java SDK container for a local pipeline execution
+   (this guide requires that your JAVA_HOME is set to Java 11)

Review Comment:
   Add period: "11)" -> "11)."



##########
website/www/site/content/en/documentation/sdks/java-multi-language-pipelines.md:
##########
@@ -155,6 +158,50 @@ export PYTHON_VERSION=<version>
 The pipeline outputs a file with the results to
 **gs://$OUTPUT_BUCKET/count-00000-of-00001**.
 
+### Run with DirectRunner
+
+> **Note:** Multi-language Pipelines need to use [portable](/roadmap/portability/)
+> runners. Portable DirectRunner is still experimental and does not support all
+> Beam features.
+
+1. Create a Python virtual environment with the latest version of Beam Python SDK installed.
+   Please see [here](/get-started/quickstart-py/) for instructions.
+2. Run the job server for portable DirectRunner (implemented in Python)

Review Comment:
   Nit: add period: "Python)" -> "Python)."



##########
website/www/site/content/en/documentation/sdks/java-multi-language-pipelines.md:
##########
@@ -155,6 +158,50 @@ export PYTHON_VERSION=<version>
 The pipeline outputs a file with the results to
 **gs://$OUTPUT_BUCKET/count-00000-of-00001**.
 
+### Run with DirectRunner
+
+> **Note:** Multi-language Pipelines need to use [portable](/roadmap/portability/)
+> runners. Portable DirectRunner is still experimental and does not support all
+> Beam features.
+
+1. Create a Python virtual environment with the latest version of Beam Python SDK installed.
+   Please see [here](/get-started/quickstart-py/) for instructions.
+2. Run the job server for portable DirectRunner (implemented in Python)
+
+```
+export JOB_SERVER_PORT=<port>
+
+python -m apache_beam.runners.portability.local_job_service_main -p $JOB_SERVER_PORT
+```
+
+3. In a different shell, go to a [Beam Git clone](https://github.com/apache/beam).
+
+4. Build the Beam Java SDK container for a local pipeline execution
+   (this guide requires that your JAVA_HOME is set to Java 11)
+
+```
+./gradlew :sdks:java:container:java11:docker
+```
+
+5. Run the pipeline.
+
+```
+export JOB_SERVER_PORT=<port>  # Same port as before
+export OUTPUT_FILE=<local relative path>
+export PYTHON_VERSION=<version>
+
+./gradlew :examples:multi-language:pythonDataframeWordCount --args=" \
+--runner=PortableRunner \
+--jobEndpoint=localhost:$JOB_SERVER_PORT \
+--output=$OUTPUT_FILE"
+```
+
+> **Note** This output gets written to local file system of a Python Docker

Review Comment:
   "to local" -> "to the local"


