Posted to issues@yunikorn.apache.org by "Wilfred Spiegelenburg (Jira)" <ji...@apache.org> on 2020/05/13 14:47:00 UTC

[jira] [Commented] (YUNIKORN-99) Enhanced FIFO scheduling for batch workloads

    [ https://issues.apache.org/jira/browse/YUNIKORN-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106362#comment-17106362 ] 

Wilfred Spiegelenburg commented on YUNIKORN-99:
-----------------------------------------------

PR opened with the base changes for the state aware scheduling of applications. 

Two new states have been added to the app: waiting and starting. An app now moves from the new state to the accepted state when the first ask is added to the application. As soon as that ask is allocated the app moves from accepted to starting. From starting the app moves to running: either after a maximum of 5 minutes in the starting state, or earlier if more allocations are added to the application. The rest of the time the app will spend in the running state.
It can leave the running state and move to waiting if there are no outstanding asks and no allocations for that app. That means the app is not done but there is nothing scheduled for the app. If a new ask gets added the app moves back to running and gets scheduled as normal.
Applications can be killed or marked as completed by the RM if it can determine the state. The scheduler does not move the app to those states itself.
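
The transitions described above can be sketched as a small state machine. This is only an illustration of the description in this comment, not the code in the PR: the names State, App, add_ask, allocate, release and tick are hypothetical, time is passed in as plain seconds, and the killed and completed states are left out because they are set by the RM rather than by the scheduler.

```python
from dataclasses import dataclass
from enum import Enum

# "maximum of 5 minutes" in the starting state, taken from the comment.
STARTING_TIMEOUT_SECONDS = 5 * 60


class State(Enum):
    NEW = "new"
    ACCEPTED = "accepted"
    STARTING = "starting"
    RUNNING = "running"
    WAITING = "waiting"


@dataclass
class App:
    """Hypothetical application model, for illustration only."""
    state: State = State.NEW
    pending_asks: int = 0
    allocations: int = 0
    starting_since: float = 0.0

    def add_ask(self) -> None:
        self.pending_asks += 1
        if self.state is State.NEW:
            # First ask added: new -> accepted.
            self.state = State.ACCEPTED
        elif self.state is State.WAITING:
            # New work arrived for an idle app: waiting -> running.
            self.state = State.RUNNING

    def allocate(self, now: float) -> None:
        if self.pending_asks == 0:
            raise ValueError("no pending ask to allocate")
        self.pending_asks -= 1
        self.allocations += 1
        if self.state is State.ACCEPTED:
            # First allocation: accepted -> starting, start the timer.
            self.state = State.STARTING
            self.starting_since = now
        elif self.state is State.STARTING:
            # A further allocation ends the starting phase early.
            self.state = State.RUNNING

    def release(self) -> None:
        if self.allocations == 0:
            raise ValueError("no allocation to release")
        self.allocations -= 1
        if (self.state is State.RUNNING
                and self.pending_asks == 0 and self.allocations == 0):
            # Nothing scheduled and nothing outstanding: running -> waiting.
            self.state = State.WAITING

    def tick(self, now: float) -> None:
        # Time-based transition: starting -> running after the timeout.
        if (self.state is State.STARTING
                and now - self.starting_since >= STARTING_TIMEOUT_SECONDS):
            self.state = State.RUNNING
```

In this sketch the timeout is checked by an explicit tick call so that the behaviour does not depend on wall-clock time. How the PR itself triggers the timed transition is not stated in this comment.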

The state aware policy for applications in a queue leverages the new states to sort the applications. The logic for sorting applications in a queue is as follows:
- only apps with pending resources are scheduled
- apps are sorted based on submission time, oldest app first
- all running applications are candidates
- a maximum of one app in the starting state will be added to the list of running apps
- if the queue contains no apps in the starting state (counting starting apps with or without pending resources), the oldest app in the accepted state will be added

The queue will then use that list of apps to schedule.
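
As a rough sketch of the selection rules listed above (not the code in the PR), the candidate list could be built like this. QueuedApp and state_aware_candidates are hypothetical names, states are plain strings, and pending resources are reduced to a single integer. The comment does not say which starting app is picked when several have pending resources; the sketch assumes the oldest one.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class QueuedApp:
    """Hypothetical view of an app in a queue, for illustration only."""
    name: str
    state: str        # "accepted", "starting", "running" or "waiting"
    submitted: float  # submission time, smaller means older
    pending: int      # pending resources, simplified to one number


def state_aware_candidates(apps: List[QueuedApp]) -> List[QueuedApp]:
    # Oldest app first.
    ordered = sorted(apps, key=lambda a: a.submitted)
    # Starting apps block accepted apps even if they have nothing pending.
    queue_has_starting = any(a.state == "starting" for a in ordered)

    candidates = []
    starting_added = False
    accepted_added = False
    for app in ordered:
        if app.pending <= 0:
            # Only apps with pending resources are scheduled.
            continue
        if app.state == "running":
            candidates.append(app)
        elif app.state == "starting" and not starting_added:
            # At most one starting app (assumed: the oldest one).
            candidates.append(app)
            starting_added = True
        elif (app.state == "accepted" and not queue_has_starting
              and not accepted_added):
            # No starting app in the queue: admit the oldest accepted app.
            candidates.append(app)
            accepted_added = True
    return candidates
```

The effect is that at most one app per queue is in the process of starting up at any time, while apps that are already running keep being scheduled in submission order.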

On recovery, apps that have existing allocations are considered to be in the running state.

> Enhanced FIFO scheduling for batch workloads
> --------------------------------------------
>
>                 Key: YUNIKORN-99
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-99
>             Project: Apache YuniKorn
>          Issue Type: New Feature
>          Components: core - scheduler
>            Reporter: Weiwei Yang
>            Assignee: Wilfred Spiegelenburg
>            Priority: Major
>              Labels: pull-request-available
>
> An enhanced version of FIFO scheduling for batch workloads



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
