Posted to dev@yunikorn.apache.org by "Si Latt (Jira)" <ji...@apache.org> on 2022/02/10 22:00:00 UTC

[jira] [Created] (YUNIKORN-1076) Robust handling of invalid Task Group annotation

Si Latt created YUNIKORN-1076:
---------------------------------

             Summary: Robust handling of invalid Task Group annotation
                 Key: YUNIKORN-1076
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1076
             Project: Apache YuniKorn
          Issue Type: Bug
          Components: core - scheduler
            Reporter: Si Latt


For gang scheduling, task group information has to be defined in the annotation section. If the provided YAML for the task group info is invalid, for example because double quotes are missing around keys, parsing fails with an exception that is written only to the YK log. The Kubernetes event log gives no indication that an exception occurred during gang scheduling. Other gang scheduling events are still logged, which gives users the false impression that the pods were gang scheduled without any issue.
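To illustrate the failure mode described above, here is a minimal Go sketch. It assumes the task group annotation value is a JSON array; the TaskGroup struct, its two fields, and the parseTaskGroups helper are simplified, hypothetical stand-ins and not the actual YuniKorn code. It only shows that a key without double quotes makes the whole annotation unparseable, so the caller has to decide what to do with the error.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TaskGroup is a simplified, hypothetical stand-in for a task group definition.
type TaskGroup struct {
	Name      string `json:"name"`
	MinMember int32  `json:"minMember"`
}

// parseTaskGroups decodes the annotation value and hands any parse error
// back to the caller instead of only logging it.
func parseTaskGroups(annotation string) ([]TaskGroup, error) {
	var groups []TaskGroup
	if err := json.Unmarshal([]byte(annotation), &groups); err != nil {
		return nil, fmt.Errorf("invalid task group annotation: %w", err)
	}
	return groups, nil
}

func main() {
	valid := `[{"name": "workers", "minMember": 3}]`
	// The key name is missing its double quotes, so this is not valid JSON.
	invalid := `[{name: "workers", "minMember": 3}]`

	for _, annotation := range []string{valid, invalid} {
		groups, err := parseTaskGroups(annotation)
		if err != nil {
			fmt.Println("parse failed:", err)
			continue
		}
		fmt.Printf("parsed %d task group(s)\n", len(groups))
	}
}
```

If the error returned here is only written to a log, the rest of the scheduling flow carries on as if no task groups had been requested, which is the silent failure this issue reports.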

The current behavior is dangerous because it causes gang scheduling to silently not work in production, without users noticing any problem. We should probably take the following actions:
 # Reject / fail any app with an invalid task group annotation.
 # Emit such exceptions as Kubernetes events, so that critical failures happening in the cluster are surfaced to users.
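
The two proposed actions could be combined roughly as in the following Go sketch. Everything in it is an assumption made for illustration: the annotation is treated as a JSON array, and the taskGroup and event types, the admit function, the "InvalidTaskGroups" reason string, and the empty-name check are hypothetical and do not come from the YuniKorn or Kubernetes APIs. A real implementation would publish the event through the Kubernetes event recorder instead of returning a struct.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// taskGroup is a simplified, hypothetical task group definition.
type taskGroup struct {
	Name      string `json:"name"`
	MinMember int32  `json:"minMember"`
}

// event is a hypothetical stand-in for a Kubernetes event.
type event struct {
	Type    string
	Reason  string
	Message string
}

// validateTaskGroups returns an error if the annotation cannot be parsed
// or fails an illustrative sanity check.
func validateTaskGroups(annotation string) error {
	var groups []taskGroup
	if err := json.Unmarshal([]byte(annotation), &groups); err != nil {
		return fmt.Errorf("cannot parse task groups: %w", err)
	}
	for i, g := range groups {
		if g.Name == "" {
			return errors.New(fmt.Sprintf("task group %d has no name", i))
		}
	}
	return nil
}

// admit rejects the app when the annotation is invalid and returns the
// warning event that should be surfaced to the user.
func admit(appID, annotation string) (bool, *event) {
	if err := validateTaskGroups(annotation); err != nil {
		return false, &event{
			Type:    "Warning",
			Reason:  "InvalidTaskGroups",
			Message: fmt.Sprintf("application %s rejected: %v", appID, err),
		}
	}
	return true, nil
}

func main() {
	accepted, ev := admit("app-1", `[{name: "workers", "minMember": 3}]`)
	fmt.Println("accepted:", accepted)
	if ev != nil {
		fmt.Printf("%s %s: %s\n", ev.Type, ev.Reason, ev.Message)
	}
}
```

Returning the rejection decision together with the event keeps both actions tied to the same parse error, so an app cannot be accepted while its annotation failure goes unreported.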


