Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/03/07 17:58:29 UTC

[GitHub] [beam] y1chi commented on a change in pull request #16938: [BEAM-13314]Revise recommendations to manage Python pipeline dependencies.

y1chi commented on a change in pull request #16938:
URL: https://github.com/apache/beam/pull/16938#discussion_r820958659



##########
File path: website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md
##########
@@ -123,3 +136,19 @@ If your pipeline uses non-Python packages (e.g. packages that require installati
         --setup_file /path/to/setup.py
 
 **Note:** Because custom commands execute after the dependencies for your workflow are installed (by `pip`), you should omit the PyPI package dependency from the pipeline's `requirements.txt` file and from the `install_requires` parameter in the `setuptools.setup()` call of your `setup.py` file.
+
+## Pre-building SDK container image
+
+In the pre-building step, we install pipeline dependencies on the container image prior to the job submission. This would speed up the pipeline execution.\
+To use pre-building the dependencies from `requirements.txt` on the container image. Follow the steps below.
+1. Provide the container engine. We support `docker` and `cloud_build`(requires a GCP project with Cloud Build API enabled).

Review comment:
       local_docker?

##########
File path: website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md
##########
@@ -123,3 +136,19 @@ If your pipeline uses non-Python packages (e.g. packages that require installati
         --setup_file /path/to/setup.py
 
 **Note:** Because custom commands execute after the dependencies for your workflow are installed (by `pip`), you should omit the PyPI package dependency from the pipeline's `requirements.txt` file and from the `install_requires` parameter in the `setuptools.setup()` call of your `setup.py` file.
+
+## Pre-building SDK container image
+
+In the pre-building step, we install pipeline dependencies on the container image prior to the job submission. This would speed up the pipeline execution.\
+To use pre-building the dependencies from `requirements.txt` on the container image. Follow the steps below.
+1. Provide the container engine. We support `docker` and `cloud_build`(requires a GCP project with Cloud Build API enabled).
+
+       --prebuild_sdk_container_enginer <execution_environment>
+2. To pass a base image for pre-building dependencies, enable this flag. If not, apache beam's base image would be used.
+
+       --prebuild_sdk_container_base_image <location_to_base_image>
+3. To push the container image, pre-built locally with `Docker` , to a remote repository(eg: docker registry), provide URL to the docker registry by passing

Review comment:
       local_docker

##########
File path: website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md
##########
@@ -123,3 +136,19 @@ If your pipeline uses non-Python packages (e.g. packages that require installati
         --setup_file /path/to/setup.py
 
 **Note:** Because custom commands execute after the dependencies for your workflow are installed (by `pip`), you should omit the PyPI package dependency from the pipeline's `requirements.txt` file and from the `install_requires` parameter in the `setuptools.setup()` call of your `setup.py` file.
+
+## Pre-building SDK container image
+
+In the pre-building step, we install pipeline dependencies on the container image prior to the job submission. This would speed up the pipeline execution.\
+To use pre-building the dependencies from `requirements.txt` on the container image. Follow the steps below.
+1. Provide the container engine. We support `docker` and `cloud_build`(requires a GCP project with Cloud Build API enabled).
+
+       --prebuild_sdk_container_enginer <execution_environment>
+2. To pass a base image for pre-building dependencies, enable this flag. If not, apache beam's base image would be used.

Review comment:
       Note that this may not work with an arbitrary base image. The base image should follow the same contract as Apache Beam's base image for installing dependencies in setup_only mode: https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L49
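
To make the naming point in these comments concrete, here is a minimal sketch of how the pre-build flags might be assembled before being handed to a pipeline. It rests on assumptions: the engine value is `local_docker` (as suggested in the comments above) rather than `docker`, and the first flag is spelled `--prebuild_sdk_container_engine` (the diff shows `enginer`, which looks like a typo). `prebuild_flags` and `SUPPORTED_ENGINES` are hypothetical helper names and not part of Beam.

```python
from typing import List, Optional

# Engine names as suggested in this review thread; treat them as assumptions,
# not as a verified list of what the SDK accepts.
SUPPORTED_ENGINES = ("local_docker", "cloud_build")


def prebuild_flags(engine: str, base_image: Optional[str] = None) -> List[str]:
    """Return command-line flags for pre-building the SDK container (hypothetical helper)."""
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported engine: {engine!r}")
    # Assumed flag spelling; the diff under review has `enginer`.
    flags = ["--prebuild_sdk_container_engine", engine]
    if base_image is not None:
        # Only meaningful if the base image honours the same setup-only
        # contract as Beam's own image, per the comment above.
        flags += ["--prebuild_sdk_container_base_image", base_image]
    return flags
```

The resulting list could then be appended to the rest of the pipeline arguments. Rejecting unknown engine names early would surface a `docker` vs. `local_docker` mix-up before job submission.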



