Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/03 23:03:05 UTC

[GitHub] [beam] kennknowles opened a new issue, #19240: Add option to mount a directory inside SDK harness containers

kennknowles opened a new issue, #19240:
URL: https://github.com/apache/beam/issues/19240

   While experimenting with the Python SDK locally, I found it inconvenient that I can't mount a host directory into the Docker containers: the input must already be in the container, and the results of a Write remain inside the container. For local testing, users may want to mount a host directory.
   
   Since BEAM-5288, the `Environment` carries explicit environment information, so we could a) add volume args to the `DockerPayload`, or b) provide a general Docker arguments field.
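   
   To make option a) concrete, here is a minimal sketch of how a list of volume pairs might be turned into `docker run` flags. The `volume_args` helper and the `host:container` pair format are assumptions made for illustration only; no such field or function exists in Beam.
   
   ```shell
   # Hypothetical helper (not part of Beam): turn host:container pairs
   # into `docker run` bind-mount flags.
   volume_args() {
     out=""
     for pair in "$@"; do
       src=${pair%%:*}   # text before the first colon (host path)
       dst=${pair#*:}    # text after the first colon (container path)
       out="$out --mount type=bind,src=$src,dst=$dst"
     done
     # Drop the leading space before printing
     printf '%s\n' "${out# }"
   }
   
   # Example: mount an input and an output directory
   volume_args /tmp/beam-in:/data/in /tmp/beam-out:/data/out
   ```
   
   Option b) would skip this translation and pass user-supplied arguments through unchanged, which is more flexible but leaves validation to the user.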
   
   Imported from Jira [BEAM-5440](https://issues.apache.org/jira/browse/BEAM-5440). Original Jira may contain additional context.
   Reported by: mxm.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] brentmjohnson commented on issue #19240: Add option to mount a directory inside SDK harness containers

Posted by "brentmjohnson (via GitHub)" <gi...@apache.org>.
brentmjohnson commented on issue #19240:
URL: https://github.com/apache/beam/issues/19240#issuecomment-1483528346

   Great. I should also note that there is a somewhat hacky workaround that intercepts the `docker run` command and injects additional arguments into it:
   
   1. Override the PATH environment variable to prepend a directory in which you will create a shell script called `docker`.
   2. Create something like the following `docker` script in that directory:
   ```
   #!/bin/sh
   
   # Wrapper that injects extra arguments into `docker run`.
   # Assumes the real Docker binary is at /usr/bin/docker.
   
   if [ "$1" = "run" ]; then
     # Drop "run" so that "$@" holds only the caller's arguments
     shift
   
     # Invoke the real executable with the injected arguments first;
     # "$@" preserves the quoting of the original arguments
     exec /usr/bin/docker run \
       --device /dev/dxg \
       --mount type=bind,src=/usr/lib/wsl,dst=/usr/lib/wsl \
       "$@"
   else
     # Pass every other command through unchanged
     exec /usr/bin/docker "$@"
   fi
   ```
   3. Depending on your Beam runner, you may need to build custom images that include the above.
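   
   Steps 1 and 2 can be sketched as follows. The directory name is a hypothetical choice (any writable directory works), and the wrapper written here is only a pass-through placeholder standing in for the script from step 2.
   
   ```shell
   # Assumption: a hypothetical wrapper directory under the temp dir
   wrapper_dir="${TMPDIR:-/tmp}/beam-docker-wrapper"
   mkdir -p "$wrapper_dir"
   
   # Write a placeholder wrapper; replace its body with the script from step 2
   printf '%s\n' '#!/bin/sh' 'exec /usr/bin/docker "$@"' > "$wrapper_dir/docker"
   chmod +x "$wrapper_dir/docker"
   
   # Prepend the directory so the wrapper shadows the real binary
   PATH="$wrapper_dir:$PATH"
   export PATH
   
   # Confirm which `docker` the shell now resolves
   command -v docker
   ```
   
   The PATH change only affects processes started from this shell, so the runner that launches the SDK harness containers has to be started from it (or have the same PATH set in its own environment).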




[GitHub] [beam] brentmjohnson commented on issue #19240: Add option to mount a directory inside SDK harness containers

Posted by "brentmjohnson (via GitHub)" <gi...@apache.org>.
brentmjohnson commented on issue #19240:
URL: https://github.com/apache/beam/issues/19240#issuecomment-1478495127

   Can we expand the scope of this issue to cover passing any supported `docker run` option, as listed here: https://docs.docker.com/engine/reference/commandline/run/#options
   
   Personally, I want to do something like this to give the Docker container access to a GPU device for hardware-accelerated workloads (PyTorch):
   ```
   docker run \
     --device /dev/dxg \
     --mount type=bind,src=/usr/lib/wsl,dst=/usr/lib/wsl \
     -e LD_LIBRARY_PATH=/usr/lib/wsl/lib \
     k8s-lb:5000/beam_gpu_test:0.0.1
   ```
   This example seems particularly relevant given the Beam team's focus on enhancing the API for AI/ML workloads: https://beam.apache.org/documentation/ml/about-ml/
   
   Unless I am missing something, this lack of support for `docker run` options means that GPU-accelerated workloads cannot benefit from the flexibility that --environment_type="DOCKER" otherwise affords.




[GitHub] [beam] kennknowles commented on issue #19240: Add option to mount a directory inside SDK harness containers

Posted by "kennknowles (via GitHub)" <gi...@apache.org>.
kennknowles commented on issue #19240:
URL: https://github.com/apache/beam/issues/19240#issuecomment-1483487147

   I think this is a great discussion for dev@beam.apache.org

