Posted to issues@spark.apache.org by "Marcelo Vanzin (JIRA)" <ji...@apache.org> on 2019/02/15 20:50:00 UTC

[jira] [Resolved] (SPARK-26398) Support building GPU docker images

     [ https://issues.apache.org/jira/browse/SPARK-26398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Vanzin resolved SPARK-26398.
------------------------------------
    Resolution: Duplicate

I'm consolidating all docker image-related bugs under SPARK-24655. All discussion about requirements that people have around docker images should go there.

> Support building GPU docker images
> ----------------------------------
>
>                 Key: SPARK-26398
>                 URL: https://issues.apache.org/jira/browse/SPARK-26398
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 2.4.0
>            Reporter: Rong Ou
>            Priority: Minor
>
> To run Spark on Kubernetes, a user first needs to build docker images using the `bin/docker-image-tool.sh` script. However, this script only supports building images for running on CPUs. As parts of Spark and related libraries (e.g. XGBoost) get accelerated on GPUs, it's desirable to build base images that can take advantage of GPU acceleration.
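> For reference, the current CPU-only workflow looks like this (per the Spark on Kubernetes docs; repo and tag are placeholders):
>
>     # Build and push the JVM/Python/R images from a Spark distribution
>     ./bin/docker-image-tool.sh -r docker.io/myrepo -t v2.4.0 build
>     ./bin/docker-image-tool.sh -r docker.io/myrepo -t v2.4.0 push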
> This issue only addresses building docker images with CUDA support. Actually accelerating Spark on GPUs is outside the scope of this issue, as is supporting other types of GPUs.
> Today if anyone wants to experiment with running Spark on Kubernetes with GPU support, they have to write their own custom `Dockerfile`. By providing an "official" way to build GPU-enabled docker images, we can make it easier to get started.
> For now probably not that many people care about this, but it's a necessary first step towards GPU acceleration for Spark on Kubernetes.
> The risks are minimal as we only need to make minor changes to `bin/docker-image-tool.sh`. The PR is already done and will be attached. Success means anyone can easily build Spark docker images with GPU support.
> Proposed API changes: add an optional `-g` flag to `bin/docker-image-tool.sh` for building GPU versions of the JVM/Python/R docker images. When the `-g` flag is omitted, existing behavior is preserved.
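> For example, the GPU variants would be built with the same invocation plus the new flag (hypothetical usage; the resulting image names follow the `-gpu` suffix convention in the design sketch below):
>
>     # Hypothetical usage of the proposed -g flag: produces spark-gpu, spark-py-gpu, spark-r-gpu images
>     ./bin/docker-image-tool.sh -r docker.io/myrepo -t v2.4.0 -g build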
> Design sketch: when the `-g` flag is specified, we append `-gpu` to the docker image names and switch to Dockerfiles based on the official CUDA images. Since the CUDA images are based on Ubuntu while the existing Spark Dockerfiles are based on Alpine, the steps for installing additional packages differ, so a parallel set of `Dockerfile.gpu` files is needed.
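> A minimal sketch of how the script might branch on the flag (variable names and paths are illustrative, not the actual patch):
>
>     # Sketch: append "-gpu" to image names and switch to the CUDA-based Dockerfile when -g is given
>     GPU_SUFFIX=""
>     BASEDOCKERFILE="kubernetes/dockerfiles/spark/Dockerfile"
>     if [ -n "$GPU_BUILD" ]; then                                       # set by the new -g option
>       GPU_SUFFIX="-gpu"
>       BASEDOCKERFILE="kubernetes/dockerfiles/spark/Dockerfile.gpu"     # Ubuntu/CUDA-based variant
>     fi
>     docker build -t "$REPO/spark$GPU_SUFFIX:$TAG" -f "$BASEDOCKERFILE" .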
> Alternative: if we are willing to forgo Alpine and switch to Ubuntu for the CPU-only images, the two sets of Dockerfiles can be unified, and we can simply pass in a different base image depending on whether the `-g` flag is present.
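> With that approach, a single Dockerfile could take its base image as a build argument; roughly (the ARG name and image tags are only examples):
>
>     # CPU image: plain Ubuntu base; GPU image: an official CUDA base from nvidia/cuda
>     docker build --build-arg base_img=ubuntu:18.04 \
>       -t "$REPO/spark:$TAG" -f "$BASEDOCKERFILE" .
>     docker build --build-arg base_img=nvidia/cuda:10.0-runtime-ubuntu18.04 \
>       -t "$REPO/spark-gpu:$TAG" -f "$BASEDOCKERFILE" .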



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org