Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/08/27 02:38:17 UTC

[GitHub] [beam] chamikaramj opened a new pull request, #22927: Updates to multi-lang Java quickstart

chamikaramj opened a new pull request, #22927:
URL: https://github.com/apache/beam/pull/22927

   Updates the supported version to be consistent for all runners.
   Adds instructions for running with the portable DirectRunner.
   
   This fixes #22916 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] bvolpato commented on a diff in pull request #22927: Updates to multi-lang Java quickstart

Posted by GitBox <gi...@apache.org>.
bvolpato commented on code in PR #22927:
URL: https://github.com/apache/beam/pull/22927#discussion_r956530845


##########
website/www/site/content/en/documentation/sdks/java-multi-language-pipelines.md:
##########
@@ -155,6 +158,50 @@ export PYTHON_VERSION=<version>
 The pipeline outputs a file with the results to
 **gs://$OUTPUT_BUCKET/count-00000-of-00001**.
 
+### Run with DirectRunner
+
+> **Note:** Multi-language Pipelines need to use [portable](/roadmap/portability/)
+> runners. Portable DirectRunner is still experimental and does not support all
+> Beam features.
+
+1. Create a Python virtual environemnt with the latest version of Beam Python SDK installed.
+   Please see [here](/get-started/quickstart-py/) for instructions.
+2. Run the job server for portable DirectRunner (implemented in Python)
+
+```
+export JOB_SERVER_PORT=<port>
+
+python -m apache_beam.runners.portability.local_job_service_main -p $JOB_SERVER_PORT
+```
+
+3. In a different shell, go to a [Beam Git clone](https://github.com/apache/beam).
+
+4. Build the Beam Java SDK container for a local pipeline execution
+   (this guide that your JAVA_HOME is set to Java 11)

Review Comment:
   ```suggestion
      (this guide requires that your JAVA_HOME is set to Java 11)
   ```





[GitHub] [beam] chamikaramj merged pull request #22927: Updates to multi-lang Java quickstart

Posted by GitBox <gi...@apache.org>.
chamikaramj merged PR #22927:
URL: https://github.com/apache/beam/pull/22927




[GitHub] [beam] bvolpato commented on a diff in pull request #22927: Updates to multi-lang Java quickstart

Posted by GitBox <gi...@apache.org>.
bvolpato commented on code in PR #22927:
URL: https://github.com/apache/beam/pull/22927#discussion_r956530633


##########
website/www/site/content/en/documentation/sdks/java-multi-language-pipelines.md:
##########
@@ -155,6 +158,50 @@ export PYTHON_VERSION=<version>
 The pipeline outputs a file with the results to
 **gs://$OUTPUT_BUCKET/count-00000-of-00001**.
 
+### Run with DirectRunner
+
+> **Note:** Multi-language Pipelines need to use [portable](/roadmap/portability/)
+> runners. Portable DirectRunner is still experimental and does not support all
+> Beam features.
+
+1. Create a Python virtual environemnt with the latest version of Beam Python SDK installed.

Review Comment:
   typo
   ```suggestion
   1. Create a Python virtual environment with the latest version of Beam Python SDK installed.
   ```





[GitHub] [beam] bvolpato commented on pull request #22927: Updates to multi-lang Java quickstart

Posted by GitBox <gi...@apache.org>.
bvolpato commented on PR #22927:
URL: https://github.com/apache/beam/pull/22927#issuecomment-1229114518

   @chamikaramj Thanks! The new details look good. I'll give it another run and keep you posted.




[GitHub] [beam] chamikaramj commented on pull request #22927: Updates to multi-lang Java quickstart

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on PR #22927:
URL: https://github.com/apache/beam/pull/22927#issuecomment-1229105420

   R: @bvolpato 
   
   cc: @pcoet 




[GitHub] [beam] github-actions[bot] commented on pull request #22927: Updates to multi-lang Java quickstart

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #22927:
URL: https://github.com/apache/beam/pull/22927#issuecomment-1229105603

   Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control




[GitHub] [beam] chamikaramj commented on a diff in pull request #22927: Updates to multi-lang Java quickstart

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on code in PR #22927:
URL: https://github.com/apache/beam/pull/22927#discussion_r957523657


##########
website/www/site/content/en/documentation/sdks/java-multi-language-pipelines.md:
##########
@@ -155,6 +158,50 @@ export PYTHON_VERSION=<version>
 The pipeline outputs a file with the results to
 **gs://$OUTPUT_BUCKET/count-00000-of-00001**.
 
+### Run with DirectRunner
+
+> **Note:** Multi-language Pipelines need to use [portable](/roadmap/portability/)
+> runners. Portable DirectRunner is still experimental and does not support all
+> Beam features.
+
+1. Create a Python virtual environment with the latest version of Beam Python SDK installed.
+   Please see [here](/get-started/quickstart-py/) for instructions.
+2. Run the job server for portable DirectRunner (implemented in Python)
+
+```
+export JOB_SERVER_PORT=<port>
+
+python -m apache_beam.runners.portability.local_job_service_main -p $JOB_SERVER_PORT

Review Comment:
   Did you create a virtual environment and install the Apache Beam Python SDK (`pip install apache-beam`)?





[GitHub] [beam] bvolpato commented on a diff in pull request #22927: Updates to multi-lang Java quickstart

Posted by GitBox <gi...@apache.org>.
bvolpato commented on code in PR #22927:
URL: https://github.com/apache/beam/pull/22927#discussion_r956530674


##########
website/www/site/content/en/documentation/sdks/java-multi-language-pipelines.md:
##########
@@ -42,11 +42,14 @@ can clone or download the Beam repository and build the example from the source
 code.
 
 To build and run the example, you need a Java environment with the Beam Java SDK
-version 2.40.0 or later installed, and a Python environment. If you don’t
+version 2.41.0 or later installed, and a Python environment. If you don’t
 already have these environments set up, first complete the
 [Apache Beam Java SDK Quickstart](/get-started/quickstart-java/) and the
 [Apache Beam Python SDK Quickstart](/get-started/quickstart-py/).
 
+For running with portable DirectRunner, you need to have Docker installed
+locally and the Docker deamon should be running. This is not needed for Dataflow.

Review Comment:
   typo
   ```suggestion
   locally and the Docker daemon should be running. This is not needed for Dataflow.
   ```





[GitHub] [beam] chamikaramj commented on pull request #22927: Updates to multi-lang Java quickstart

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on PR #22927:
URL: https://github.com/apache/beam/pull/22927#issuecomment-1230984387

   Run Whitespace PreCommit




[GitHub] [beam] bvolpato commented on a diff in pull request #22927: Updates to multi-lang Java quickstart

Posted by GitBox <gi...@apache.org>.
bvolpato commented on code in PR #22927:
URL: https://github.com/apache/beam/pull/22927#discussion_r956533127


##########
website/www/site/content/en/documentation/sdks/java-multi-language-pipelines.md:
##########
@@ -155,6 +158,50 @@ export PYTHON_VERSION=<version>
 The pipeline outputs a file with the results to
 **gs://$OUTPUT_BUCKET/count-00000-of-00001**.
 
+### Run with DirectRunner
+
+> **Note:** Multi-language Pipelines need to use [portable](/roadmap/portability/)
+> runners. Portable DirectRunner is still experimental and does not support all
+> Beam features.
+
+1. Create a Python virtual environment with the latest version of Beam Python SDK installed.
+   Please see [here](/get-started/quickstart-py/) for instructions.
+2. Run the job server for portable DirectRunner (implemented in Python)
+
+```
+export JOB_SERVER_PORT=<port>
+
+python -m apache_beam.runners.portability.local_job_service_main -p $JOB_SERVER_PORT

Review Comment:
   I don't know exactly why, but this yields:
   ```
   ModuleNotFoundError: No module named 'apache_beam.portability.api'
   ```
   
   Running the file directly does start the server, though:
   ```
   python python/apache_beam/runners/portability/local_job_service_main.py -p $JOB_SERVER_PORT
   ```
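
   Note: whichever way the job server is launched, it can help to confirm that something is listening on `$JOB_SERVER_PORT` before moving on to the Gradle step. The following is a minimal sketch and not part of the quickstart. `port_is_open` is a hypothetical helper name, and it only checks that a TCP connection succeeds, not that the listener is actually a Beam job service.

```python
import os
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # JOB_SERVER_PORT is the variable exported in the quickstart steps.
    port = os.environ.get("JOB_SERVER_PORT")
    if port is None:
        print("JOB_SERVER_PORT is not set")
    else:
        state = "open" if port_is_open("localhost", int(port)) else "closed"
        print(f"localhost:{port} is {state}")
```

   Run it in the second shell, after exporting the same `JOB_SERVER_PORT` value that was passed to the job server.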
   





[GitHub] [beam] pcoet commented on a diff in pull request #22927: Updates to multi-lang Java quickstart

Posted by GitBox <gi...@apache.org>.
pcoet commented on code in PR #22927:
URL: https://github.com/apache/beam/pull/22927#discussion_r957445346


##########
website/www/site/content/en/documentation/sdks/java-multi-language-pipelines.md:
##########
@@ -155,6 +158,50 @@ export PYTHON_VERSION=<version>
 The pipeline outputs a file with the results to
 **gs://$OUTPUT_BUCKET/count-00000-of-00001**.
 
+### Run with DirectRunner
+
+> **Note:** Multi-language Pipelines need to use [portable](/roadmap/portability/)
+> runners. Portable DirectRunner is still experimental and does not support all
+> Beam features.
+
+1. Create a Python virtual environment with the latest version of Beam Python SDK installed.
+   Please see [here](/get-started/quickstart-py/) for instructions.
+2. Run the job server for portable DirectRunner (implemented in Python)
+
+```
+export JOB_SERVER_PORT=<port>
+
+python -m apache_beam.runners.portability.local_job_service_main -p $JOB_SERVER_PORT
+```
+
+3. In a different shell, go to a [Beam Git clone](https://github.com/apache/beam).
+
+4. Build the Beam Java SDK container for a local pipeline execution
+   (this guide requires that your JAVA_HOME is set to Java 11)

Review Comment:
   Add period: "11)" -> "11)."



##########
website/www/site/content/en/documentation/sdks/java-multi-language-pipelines.md:
##########
@@ -155,6 +158,50 @@ export PYTHON_VERSION=<version>
 The pipeline outputs a file with the results to
 **gs://$OUTPUT_BUCKET/count-00000-of-00001**.
 
+### Run with DirectRunner
+
+> **Note:** Multi-language Pipelines need to use [portable](/roadmap/portability/)
+> runners. Portable DirectRunner is still experimental and does not support all
+> Beam features.
+
+1. Create a Python virtual environment with the latest version of Beam Python SDK installed.
+   Please see [here](/get-started/quickstart-py/) for instructions.
+2. Run the job server for portable DirectRunner (implemented in Python)

Review Comment:
   Nit: add period: "Python)" -> "Python)."



##########
website/www/site/content/en/documentation/sdks/java-multi-language-pipelines.md:
##########
@@ -155,6 +158,50 @@ export PYTHON_VERSION=<version>
 The pipeline outputs a file with the results to
 **gs://$OUTPUT_BUCKET/count-00000-of-00001**.
 
+### Run with DirectRunner
+
+> **Note:** Multi-language Pipelines need to use [portable](/roadmap/portability/)
+> runners. Portable DirectRunner is still experimental and does not support all
+> Beam features.
+
+1. Create a Python virtual environment with the latest version of Beam Python SDK installed.
+   Please see [here](/get-started/quickstart-py/) for instructions.
+2. Run the job server for portable DirectRunner (implemented in Python)
+
+```
+export JOB_SERVER_PORT=<port>
+
+python -m apache_beam.runners.portability.local_job_service_main -p $JOB_SERVER_PORT
+```
+
+3. In a different shell, go to a [Beam Git clone](https://github.com/apache/beam).
+
+4. Build the Beam Java SDK container for a local pipeline execution
+   (this guide requires that your JAVA_HOME is set to Java 11)
+
+```
+./gradlew :sdks:java:container:java11:docker
+```
+
+5. Run the pipeline.
+
+```
+export JOB_SERVER_PORT=<port>  # Same port as before
+export OUTPUT_FILE=<local relative path>
+export PYTHON_VERSION=<version>
+
+./gradlew :examples:multi-language:pythonDataframeWordCount --args=" \
+--runner=PortableRunner \
+--jobEndpoint=localhost:$JOB_SERVER_PORT \
+--output=$OUTPUT_FILE"
+```
+
+> **Note** This output gets written to local file system of a Python Docker

Review Comment:
   "to local" -> "to the local"





[GitHub] [beam] bvolpato commented on a diff in pull request #22927: Updates to multi-lang Java quickstart

Posted by GitBox <gi...@apache.org>.
bvolpato commented on code in PR #22927:
URL: https://github.com/apache/beam/pull/22927#discussion_r957641021


##########
website/www/site/content/en/documentation/sdks/java-multi-language-pipelines.md:
##########
@@ -155,6 +158,50 @@ export PYTHON_VERSION=<version>
 The pipeline outputs a file with the results to
 **gs://$OUTPUT_BUCKET/count-00000-of-00001**.
 
+### Run with DirectRunner
+
+> **Note:** Multi-language Pipelines need to use [portable](/roadmap/portability/)
+> runners. Portable DirectRunner is still experimental and does not support all
+> Beam features.
+
+1. Create a Python virtual environment with the latest version of Beam Python SDK installed.
+   Please see [here](/get-started/quickstart-py/) for instructions.
+2. Run the job server for portable DirectRunner (implemented in Python)
+
+```
+export JOB_SERVER_PORT=<port>
+
+python -m apache_beam.runners.portability.local_job_service_main -p $JOB_SERVER_PORT

Review Comment:
   I had recycled an old virtualenv, so that was likely the cause. Running `pip install --force-reinstall apache-beam` fixed it. Thanks!
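
   Note: a stale or partially installed environment like this can be spotted before starting the job server by asking the active interpreter where it resolves the Beam modules from. The following is a minimal sketch using only the standard library. `module_status` is a hypothetical helper name, and the module names checked are the ones from the error message earlier in this thread.

```python
import importlib.util
import sys


def module_status(name):
    """Return where `name` would be loaded from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except ImportError:
        # find_spec imports parent packages first; a missing or broken
        # parent (for example apache_beam itself) surfaces here.
        return None
    if spec is None:
        return None
    return spec.origin or "(namespace package)"


if __name__ == "__main__":
    # Shows which interpreter (and therefore which virtualenv) is active.
    print("interpreter:", sys.executable)
    for name in ("apache_beam", "apache_beam.portability.api"):
        print(name, "->", module_status(name) or "NOT FOUND")
```

   If the second module is reported as not found while the first resolves, the installation is probably incomplete, and a reinstall as described above is a reasonable next step.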





[GitHub] [beam] chamikaramj commented on pull request #22927: Updates to multi-lang Java quickstart

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on PR #22927:
URL: https://github.com/apache/beam/pull/22927#issuecomment-1231019943

   Run Whitespace PreCommit

