Posted to dev@spark.apache.org by ayan guha <gu...@gmail.com> on 2021/07/17 20:40:48 UTC

Spark 3: Resource Discovery

Hi

As I was going through the Spark 3 config params, I noticed the following
group of params. I could not understand what they are for. Can anyone please
point me in the right direction?

spark.driver.resource.{resourceName}.amount (default: 0, since 3.0.0)
  Amount of a particular resource type to use on the driver. If this is
  used, you must also specify
  spark.driver.resource.{resourceName}.discoveryScript for the driver to
  find the resource on startup.

spark.driver.resource.{resourceName}.discoveryScript (default: None, since 3.0.0)
  A script for the driver to run to discover a particular resource type.
  It should write to STDOUT a JSON string in the format of the
  ResourceInformation class, which has a name and an array of addresses.
  For a client-submitted driver, the discovery script must assign resource
  addresses to this driver that differ from those assigned to other
  drivers on the same host.

spark.driver.resource.{resourceName}.vendor (default: None, since 3.0.0)
  Vendor of the resources to use for the driver. This option is currently
  only supported on Kubernetes and is actually both the vendor and domain,
  following the Kubernetes device plugin naming convention (e.g. for GPUs
  on Kubernetes this config would be set to nvidia.com or amd.com).

spark.resources.discoveryPlugin (default:
org.apache.spark.resource.ResourceDiscoveryScriptPlugin, since 3.0.0)
  Comma-separated list of class names implementing
  org.apache.spark.api.resource.ResourceDiscoveryPlugin to load into the
  application. This is for advanced users who want to replace the resource
  discovery class with a custom implementation. Spark tries each class
  specified until one of them returns the resource information for that
  resource, and falls back to the discovery script last if none of the
  plugins return information for that resource.
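
For concreteness, here is a minimal sketch of what such a discovery script
could look like. A real script would probe the hardware (e.g. via
nvidia-smi); the two addresses below are hard-coded purely for
illustration:

```shell
#!/usr/bin/env bash
# Hypothetical discovery script for spark.driver.resource.gpu.discoveryScript.
# Spark expects a ResourceInformation-style JSON object on STDOUT:
# a resource name plus an array of address strings.
# The addresses are hard-coded here for illustration only; a real script
# would detect the devices present on the host.
echo '{"name": "gpu", "addresses": ["0", "1"]}'
```

Spark runs this script at startup and parses its STDOUT to learn which
device addresses it may hand out.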

-- 
Best Regards,
Ayan Guha

Re: Spark 3: Resource Discovery

Posted by Sean Owen <sr...@gmail.com>.
At the moment this is really about discovering GPUs, so that the scheduler
can schedule tasks that need to allocate whole GPUs.
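
For context, a GPU-aware submission might wire these settings together
roughly as follows. This is a config sketch only: the script path,
amounts, and application name are made up for illustration, and the
executor/task variants shown belong to the same config family as the
driver ones listed above:

```shell
# Hypothetical spark-submit showing the resource configs in action.
# Each executor claims 2 GPUs; each task needs 1 whole GPU, so the
# scheduler runs at most 2 such tasks per executor concurrently.
spark-submit \
  --master yarn \
  --conf spark.executor.resource.gpu.amount=2 \
  --conf spark.task.resource.gpu.amount=1 \
  --conf spark.executor.resource.gpu.discoveryScript=/opt/spark/scripts/getGpusResources.sh \
  my_app.py
```

The discovery script tells Spark which device addresses exist; the amount
settings tell the scheduler how to apportion them to tasks.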

On Sat, Jul 17, 2021 at 5:14 PM ayan guha <gu...@gmail.com> wrote:

> Hi
>
> As I was going through the Spark 3 config params, I noticed the following
> group of params. I could not understand what they are for. Can anyone
> please point me in the right direction?