Posted to issues@spark.apache.org by "Adam Antal (Jira)" <ji...@apache.org> on 2019/10/15 10:16:00 UTC

[jira] [Created] (SPARK-29474) CLI support for Spark-on-Docker-on-Yarn

Adam Antal created SPARK-29474:
----------------------------------

             Summary: CLI support for Spark-on-Docker-on-Yarn
                 Key: SPARK-29474
                 URL: https://issues.apache.org/jira/browse/SPARK-29474
             Project: Spark
          Issue Type: Improvement
          Components: Spark Shell, YARN
    Affects Versions: 2.4.4
            Reporter: Adam Antal


The Docker-on-Yarn feature has been stable in Hadoop for a while now.
One can run Spark in Docker containers with this feature by passing the container runtime environment variables to the Spark AM and executor containers, similar to this:
{noformat}
--conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE=docker
--conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=repo/image:tag
--conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS="/etc/passwd:/etc/passwd:ro,/etc/hadoop:/etc/hadoop:ro"
--conf spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker
--conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=repo/image:tag
--conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS="/etc/passwd:/etc/passwd:ro,/etc/hadoop:/etc/hadoop:ro"
{noformat}

This is not very user-friendly. I suggest adding CLI options to specify:
- whether a Docker image should be used ({{--docker}})
- which Docker image should be used ({{--docker-image}})
- which Docker mounts should be used ({{--docker-mounts}})
for the AM and executor containers separately.
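
To make the proposal more concrete, here is a minimal sketch of how such options could be expanded into the existing {{--conf}} settings. It is only an illustration: the helper name {{docker_conf_args}} and its parameters are hypothetical, and the options themselves are the ones proposed above, not existing spark-submit flags. The configuration keys and environment variable names are the ones shown in the example above.

```python
from typing import List, Optional

# Prefixes Spark uses to pass environment variables to the AM and the executors
# (taken from the --conf examples in this issue).
AM_PREFIX = "spark.yarn.appMasterEnv."
EXECUTOR_PREFIX = "spark.executorEnv."


def docker_conf_args(prefix: str,
                     image: Optional[str],
                     mounts: Optional[str] = None) -> List[str]:
    """Hypothetical helper: expand the proposed --docker-image and
    --docker-mounts values into the --conf arguments used today.

    Returns an empty list when no image is given (Docker not requested).
    """
    if image is None:
        return []
    env = {
        "YARN_CONTAINER_RUNTIME_TYPE": "docker",
        "YARN_CONTAINER_RUNTIME_DOCKER_IMAGE": image,
    }
    if mounts:
        env["YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS"] = mounts
    args: List[str] = []
    for name, value in env.items():
        args += ["--conf", f"{prefix}{name}={value}"]
    return args


# AM and executor containers are configured separately, as proposed.
mounts = "/etc/passwd:/etc/passwd:ro,/etc/hadoop:/etc/hadoop:ro"
submit_args = (docker_conf_args(AM_PREFIX, "repo/image:tag", mounts)
               + docker_conf_args(EXECUTOR_PREFIX, "repo/image:tag", mounts))
```

Keeping the prefix as a parameter means the same expansion serves both container types, so the AM and the executors could use different images or mounts.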

Let's discuss!


