Posted to issues@spark.apache.org by "Thomas Graves (Jira)" <ji...@apache.org> on 2020/02/06 19:20:00 UTC

[jira] [Commented] (SPARK-24655) [K8S] Custom Docker Image Expectations and Documentation

    [ https://issues.apache.org/jira/browse/SPARK-24655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031870#comment-17031870 ] 

Thomas Graves commented on SPARK-24655:
---------------------------------------

There was some related discussion of this in https://github.com/apache/spark/pull/23347

> [K8S] Custom Docker Image Expectations and Documentation
> --------------------------------------------------------
>
>                 Key: SPARK-24655
>                 URL: https://issues.apache.org/jira/browse/SPARK-24655
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 2.3.1
>            Reporter: Matt Cheah
>            Priority: Major
>
> A common use case we want to support with Kubernetes is the usage of custom Docker images. Some examples include:
>  * A user builds an application with Gradle or Maven, using Spark as a compile-time dependency. The application's jars (both the custom-written jars and their dependencies) need to be packaged in a Docker image that can be run via spark-submit (a sketch of this follows the list).
>  * A user builds a PySpark or R application and wants to include custom dependencies.
>  * A user wants to switch the base image from Alpine to CentOS while using either built-in or custom jars.
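> As a rough illustration of the first use case, a minimal Dockerfile sketch might look like the following; the base image coordinates and jar paths here are assumptions for illustration, not a documented contract:
> {code}
> # Hypothetical base image name/tag -- defining the real coordinates is
> # part of this ticket's scope.
> FROM spark-base:2.3.1
>
> # Layer the application jar and its Maven/Gradle-resolved dependencies
> # on top of the Spark distribution already present in the image.
> COPY build/libs/my-app.jar /opt/spark/jars/
> COPY build/deps/*.jar      /opt/spark/jars/
> {code}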
> We currently do not document how these custom Docker images are supposed to be built, nor do we guarantee that these images remain compatible across spark-submit versions. To illustrate how this can break down, suppose we decide to rename the environment variables that carry the driver/executor extra JVM options specified by {{spark.[driver|executor].extraJavaOptions}}. If we change the environment variable names spark-submit provides, the user must update their custom Dockerfile and rebuild their images.
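> To make that failure mode concrete, suppose a custom image's entrypoint reads a variable like the one below; the name SPARK_DRIVER_JAVA_OPTS is illustrative, not Spark's actual contract:
> {code}
> # Hypothetical excerpt from a custom image's entrypoint script. If
> # spark-submit stops exporting the variable under this name, the user's
> # JVM options are silently dropped until the image is rebuilt.
> exec "$JAVA_HOME/bin/java" $SPARK_DRIVER_JAVA_OPTS -cp "$SPARK_CLASSPATH" "$@"
> {code}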
> Rather than jumping to an implementation immediately though, it's worth taking a step back and considering these matters from the perspective of the end user. Towards that end, this ticket will serve as a forum where we can answer at least the following questions, and any others pertaining to the matter:
>  # What steps would a user need to take to build a custom Docker image, given that they want to customize the dependencies and the content (OS or otherwise) of the image?
>  # How can we ensure the user does not need to rebuild the image when only the spark-submit version changes? (One possible direction is sketched below.)
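> One direction worth evaluating for question 2 (a sketch under assumed conventions, not a proposal): keep user content in a thin layer over a versioned Spark base image, so that moving to a new spark-submit version is just a build-argument change, provided the environment-variable contract between spark-submit and the image stays stable:
> {code}
> # Sketch: parameterize the base image so only the tag changes across
> # Spark versions. SPARK_IMAGE is an assumed build argument, not an
> # existing convention.
> ARG SPARK_IMAGE=spark-base:2.3.1
> FROM ${SPARK_IMAGE}
> COPY build/libs/my-app.jar /opt/spark/jars/
> {code}
> For example, {{docker build --build-arg SPARK_IMAGE=spark-base:2.4.0 -t my-app .}} would then pick up a newer Spark without editing the Dockerfile itself.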
> The end deliverable for this ticket is a design document, and then we'll create sub-issues for the technical implementation and documentation of the contract.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org