Posted to issues@yunikorn.apache.org by "Craig Condit (Jira)" <ji...@apache.org> on 2021/10/26 17:57:00 UTC

[jira] [Updated] (YUNIKORN-921) Applications should be able to opt-out of state-aware scheduling

     [ https://issues.apache.org/jira/browse/YUNIKORN-921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Craig Condit updated YUNIKORN-921:
----------------------------------
    Description: 
When pods are submitted to YuniKorn without an associated applicationId label, the admission controller assigns a generated applicationId to each such pod. However, if this pod is the first (or only) pod submitted, the generated application blocks other applications from executing until a second pod is submitted or until 5 minutes have elapsed. Since we have no way to know whether another pod will be scheduled for the generated application, we should have a way to skip state-aware scheduling in this case and avoid the 5-minute delay.

To fix this:

1) Scheduler core: Add a new tag, "application.stateaware.disable", which, if present, prevents the core from waiting for a second task to transition to the Running state.

2) Admission controller: Add a new label, "disableStateAware: true", to a pod if neither applicationId nor spark-app-selector is provided.

3) K8S Shim: When creating a New application in the core, if the "disableStateAware" label exists and is true, set the "application.stateaware.disable" tag.
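
The label and tag handling in steps 2 and 3 might be sketched in Go roughly as follows. This is only an illustration under stated assumptions, not the actual YuniKorn code: the helper names markGeneratedApp and appTags are hypothetical, the shim is assumed to check the "disableStateAware: true" label from step 2, and the tag value "true" is an assumption, since the description only requires the tag to be present.

```go
package main

import "fmt"

// Label and tag names come from the description above. The helper
// functions below are hypothetical and are not part of YuniKorn.
const (
	labelApplicationID     = "applicationId"
	labelSparkAppSelector  = "spark-app-selector"
	labelDisableStateAware = "disableStateAware"
	tagDisableStateAware   = "application.stateaware.disable"
)

// markGeneratedApp sketches step 2: it returns a copy of the pod labels
// and adds "disableStateAware: true" when neither applicationId nor
// spark-app-selector is set.
func markGeneratedApp(labels map[string]string) map[string]string {
	out := make(map[string]string, len(labels)+1)
	for k, v := range labels {
		out[k] = v
	}
	_, hasAppID := out[labelApplicationID]
	_, hasSparkID := out[labelSparkAppSelector]
	if !hasAppID && !hasSparkID {
		out[labelDisableStateAware] = "true"
	}
	return out
}

// appTags sketches step 3: it derives the application tags sent to the
// core from the pod labels.
func appTags(labels map[string]string) map[string]string {
	tags := make(map[string]string)
	if labels[labelDisableStateAware] == "true" {
		// The value "true" is an assumption; only presence matters in step 1.
		tags[tagDisableStateAware] = "true"
	}
	return tags
}

func main() {
	// A pod submitted without any application ID label.
	pod := markGeneratedApp(map[string]string{"app": "sleep"})
	fmt.Println(appTags(pod))
}
```

A pod that already carries an applicationId label passes through markGeneratedApp unchanged, so it produces no opt-out tag unless the user sets the label explicitly.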

In addition to bypassing state-aware scheduling for generated apps, this label / tag also gives users a way to opt out of state-aware scheduling on a per-application basis when necessary, for example for an application that contains only a single pod, or one where a second pod may not be launched in a timely manner.

  was:
When pods are submitted to YuniKorn without an associated applicationId label, the admission controller assigns a generated applicationId for that pod. However, if this pod is the first (or only) pod submitted, the generated application will block other applications from executing until a second pod is submitted, or until 5 minutes have elapsed. Since we have no way to know if another pod will be scheduled for the generated application, we should have a way to skip state-aware scheduling in this case and avoid the 5 minute delay.

To fix this:

1) Scheduler core: Add a new tag "application.stateaware.disable" which if present, will prevent waiting for a second task to transition to Running state.

2) Admission controller: Add new label "stateAware: false" to a pod if neither applicationId nor spark-app-selector is provided.

3) K8S Shim: When creating a New application in the core, if the stateAware label exists and is false, set the "application.stateaware.disable" tag.

In addition to bypassing state-aware scheduling for generated apps, the addition of this label / tag also gives users a mechanism to opt out on a per-application basis from state-aware scheduling if necessary, such as an application containing only a single pod, or one where a second pod may not be launched in a timely manner.


> Applications should be able to opt-out of state-aware scheduling
> ----------------------------------------------------------------
>
>                 Key: YUNIKORN-921
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-921
>             Project: Apache YuniKorn
>          Issue Type: Improvement
>          Components: core - scheduler, shim - kubernetes
>            Reporter: Craig Condit
>            Assignee: Craig Condit
>            Priority: Major
>             Fix For: 1.0.0
