Posted to issues@spark.apache.org by "Thomas Graves (Jira)" <ji...@apache.org> on 2019/10/23 15:38:00 UTC

[jira] [Commented] (SPARK-29415) Stage Level Sched: Add base ResourceProfile and Request classes

    [ https://issues.apache.org/jira/browse/SPARK-29415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957974#comment-16957974 ] 

Thomas Graves commented on SPARK-29415:
---------------------------------------

From a high-level design point of view, these are the base classes needed by the other jiras/components to be implemented. You can see the design doc attached to SPARK-27495 for the entire overview; for this jira specifically, this is what we are looking to add.  These will start out private until we have the other parts implemented, and then be made public, in case the feature isn't fully implemented for a release.

 

ResourceProfile:

The user will have to build up a ResourceProfile to pass into an RDD withResources call. This profile will have a limited set of resources the user is allowed to specify, covering both task and executor resources. It will be a builder-type interface where the main function called is ResourceProfile.require.  Adding the ResourceProfile API class leaves it open to do more advanced things in the future. For instance, perhaps you want a ResourceProfile.prefer option where it would run on a node with some resources if available but then fall back if they aren't.   The config names supported correspond to the regular spark configs with the prefix removed. For instance, overhead memory in this api is memoryOverhead, which is spark.executor.memoryOverhead with the spark.executor. prefix removed.  Resources like GPUs are resource.gpu (spark configs spark.executor.resource.gpu.*).
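As a rough illustration of that naming convention (the ResourceNames object and toSparkConf helper here are hypothetical, not part of the proposal):

```scala
// Hypothetical illustration of the profile-key naming convention described
// above; none of these names exist in the actual proposal.
object ResourceNames {
  private val ExecutorPrefix = "spark.executor."

  // "memoryOverhead" maps back to "spark.executor.memoryOverhead";
  // "resource.gpu" maps back to the "spark.executor.resource.gpu.*" configs.
  def toSparkConf(profileKey: String): String = ExecutorPrefix + profileKey
}
```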


def require(request: TaskResourceRequest): this.type

def require(request: ExecutorResourceRequest): this.type

It will also have functions to get the resources out for both Scala and Java.
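For instance, the accessors might look something like the following (method names here are illustrative only; nothing is settled by this proposal):

```scala
// Sketch only: the kinds of accessors the profile might expose for Scala
// and Java callers.
def executorResources: Map[String, ExecutorResourceRequest]
def taskResources: Map[String, TaskResourceRequest]

def executorResourcesJMap: java.util.Map[String, ExecutorResourceRequest]
def taskResourcesJMap: java.util.Map[String, TaskResourceRequest]
```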

 

Resource Requests:

class ExecutorResourceRequest(
    val resourceName: String,
    val amount: Int, // potentially make this handle fractional resources
    val units: String = "", // to handle memory unit types
    val discoveryScript: Option[String] = None,
    val vendor: Option[String] = None)

 

class TaskResourceRequest(
    val resourceName: String,
    val amount: Double) // Double to handle fractional resources (e.g., 2 tasks sharing 1 resource)

 

This will allow the user to programmatically set the resources, versus only using the configs as they can in Spark 3.0 now.  The first implementation would support cpu, memory (overhead, pyspark, on heap, off heap), and the generic resources.


An example of the way this might work is:


val rp = new ResourceProfile()

rp.require(new ExecutorResourceRequest("memory", 2048))

rp.require(new ExecutorResourceRequest("cores", 2))

rp.require(new ExecutorResourceRequest("gpu", 1, discoveryScript = Some("/opt/gpuScripts/getGpus")))

rp.require(new TaskResourceRequest("gpu", 1))

 

Internally we will also create a default profile, based on the normal spark configs passed in. This default profile can be used everywhere the user hasn't explicitly set a ResourceProfile.
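A hedged sketch of how that default profile could be derived from the regular configs (getDefaultProfile is a hypothetical name; the real implementation may differ):

```scala
import org.apache.spark.SparkConf

// Hypothetical sketch: build the default profile from the normal spark
// configs, falling back to the usual defaults when unset.
def getDefaultProfile(conf: SparkConf): ResourceProfile = {
  val rp = new ResourceProfile()
  rp.require(new ExecutorResourceRequest("cores", conf.get("spark.executor.cores", "1").toInt))
  rp.require(new ExecutorResourceRequest("memory", conf.getSizeAsMb("spark.executor.memory", "1g").toInt, "m"))
  rp.require(new TaskResourceRequest("cpus", conf.get("spark.task.cpus", "1").toDouble))
  rp
}
```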

> Stage Level Sched: Add base ResourceProfile and Request classes
> ---------------------------------------------------------------
>
>                 Key: SPARK-29415
>                 URL: https://issues.apache.org/jira/browse/SPARK-29415
>             Project: Spark
>          Issue Type: Story
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>            Priority: Major
>
> this is just to add the initial ResourceProfile, ExecutorResourceRequest and TaskResourceRequest classes that are used by the other parts of the code.
> Initially we will have them private until we have other pieces in place.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
