Posted to yarn-issues@hadoop.apache.org by "Carlo Curino (JIRA)" <ji...@apache.org> on 2013/08/09 18:38:49 UTC

[jira] [Commented] (YARN-1051) YARN Admission Control/Planner: enhancing the resource allocation model with time.

    [ https://issues.apache.org/jira/browse/YARN-1051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734973#comment-13734973 ] 

Carlo Curino commented on YARN-1051:
------------------------------------

This umbrella JIRA proposes an extension of the YARN RM to allow for richer admission-control semantics (besides the existing ACL checks).
This allows jobs/users to negotiate with the RM at admission-control time for a time-bounded, guaranteed allocation of cluster resources (e.g., I need 100 containers for 2 hours at any time before 3pm today). Such a request can be per-job or per-user (maybe we can call this a "session").
It provides the RM with an understanding of future resource demand and exposes each job's time and resource constraints, hence enabling the RM to look ahead and plan resource allocation over time (e.g., a job submitted now, but with lots of time before its deadline, might be run after a job that shows up later but is in a rush to complete).

This is an important step towards SLAs on the resources received by a job/user over time, which seems useful for long-running services and workflows, and can help ameliorate some of the gang-scheduling concerns (admission control will guarantee that the resources are available, hence hoarding is not likely to produce deadlocks).

This will require:
* additive modifications to the job-submission API (to capture a job's resource demands)
* an internal API between the admission control / planner (working on the planning aspects) and the scheduler (enforcing the plan, handling containers, etc.)
* changes to the underlying scheduler (we started with the CapacityScheduler) to support queue addition/removal/resizing and cross-queue job migration, but this should ideally be pushed to the YarnScheduler API and be cross-scheduler (from various conversations, this seems to be needed/useful independently).
* changes to the RM tracking data structures to maintain metering of how many resources have been allocated to a job until now (this also enables billing and accounting on the RM side, and other history-aware planning and scheduling).
* implementation of a (simple at first) admission control mechanism that verifies whether a job with a certain Contract can be admitted, and performs basic planning (knapsack-like to start, can be extended to sophisticated economic models).
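
To make the last item a bit more concrete, here is a rough sketch of what a very simple admission check could look like. It is illustrative only and not the proposed implementation: the names `Contract`, `Plan` and `admit` are hypothetical, time is cut into discrete slots, and the placement is a plain earliest-fit scan rather than the knapsack-like planning mentioned above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Contract:
    """Hypothetical time-bounded request (not an actual YARN type)."""
    containers: int  # containers needed in every slot of the allocation
    duration: int    # number of consecutive time slots needed
    deadline: int    # allocation must end at or before this slot index


class Plan:
    """Toy plan: free containers per discrete time slot (e.g. 1 slot = 1 hour)."""

    def __init__(self, capacity: int, horizon: int) -> None:
        self.free: List[int] = [capacity] * horizon

    def admit(self, c: Contract) -> Optional[int]:
        """Return the start slot and commit the contract if it fits, else None."""
        # Latest start that still finishes by the deadline and within the horizon.
        last_start = min(c.deadline, len(self.free)) - c.duration
        for start in range(0, last_start + 1):
            window = range(start, start + c.duration)
            if all(self.free[t] >= c.containers for t in window):
                for t in window:
                    self.free[t] -= c.containers  # reserve the capacity
                return start
        return None  # reject: no feasible window before the deadline
```

With a cluster of 100 containers, the request from the example above ("100 containers for 2 hours at any time before 3pm") would map to something like `Plan(100, horizon).admit(Contract(100, 2, deadline))`, with the deadline expressed as a slot index. A later request that cannot find enough free capacity before its own deadline is rejected at admission time instead of being queued.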

We will track this in Sub-JIRAs. 


                
> YARN Admission Control/Planner: enhancing the resource allocation model with time.
> ----------------------------------------------------------------------------------
>
>                 Key: YARN-1051
>                 URL: https://issues.apache.org/jira/browse/YARN-1051
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler, resourcemanager, scheduler
>            Reporter: Carlo Curino
>            Assignee: Carlo Curino
>
> In this umbrella JIRA we propose to extend the YARN RM to handle time explicitly, allowing users to "reserve" capacity over time. This is an important step towards SLAs, long-running services, workflows, and helps for gang scheduling.
