Posted to issues@yunikorn.apache.org by "Peter Bacsko (Jira)" <ji...@apache.org> on 2022/04/26 10:49:00 UTC

[jira] [Updated] (YUNIKORN-1169) Fix ApplicationMetadata restoration during recovery

     [ https://issues.apache.org/jira/browse/YUNIKORN-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YUNIKORN-1169:
-----------------------------------
        Parent: YUNIKORN-1187
    Issue Type: Sub-task  (was: Bug)

> Fix ApplicationMetadata restoration during recovery
> ---------------------------------------------------
>
>                 Key: YUNIKORN-1169
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1169
>             Project: Apache YuniKorn
>          Issue Type: Sub-task
>          Components: shim - kubernetes
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>
> The following code in {{general.go}} handles the recovery part:
> {noformat}
> 	for _, pod := range appPods {
> 		log.Logger().Debug("Looking at pod for recovery candidates", zap.String("podNamespace", pod.Namespace), zap.String("podName", pod.Name))
> 		// general filter passes, and pod is assigned
> 		// this means the pod is already scheduled by scheduler for an existing app
> 		if utils.GeneralPodFilter(pod) && utils.IsAssignedPod(pod) {
> 			if meta, ok := os.getAppMetadata(pod); ok {
> 				podsRecovered++
> 				log.Logger().Debug("Adding appID as recovery candidate", zap.String("appID", meta.ApplicationID))
> 				if _, exist := existingApps[meta.ApplicationID]; !exist {
> 					existingApps[meta.ApplicationID] = meta
> 				}
> ...
> {noformat}
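> The crucial part is the handling of the {{existingApps}} map. Each application ID is populated only once, from the first matching pod; however, there's no guarantee that all pods of an application have the same tags or ownerReferences, so metadata carried only by later pods is silently dropped.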
> The scope of this JIRA is to analyze the possible side-effects of this code and come up with a better solution. A bug was already identified because of this (see YUNIKORN-1161).
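
To make the concern concrete, here is a minimal, self-contained Go sketch. It is not the shim's code: the type {{appMeta}} and the functions {{firstWins}}, {{mergeAll}}, {{contains}} and {{samplePods}} are hypothetical names introduced only for illustration, and the merge strategy is just one possible direction, not the fix chosen for this issue. {{firstWins}} mirrors the "populate once" pattern from the quoted loop; {{mergeAll}} combines tags and owner references from every pod that belongs to the same application.

```go
package main

import "fmt"

// appMeta is a hypothetical, simplified stand-in for application metadata.
// It is NOT the type used by the YuniKorn Kubernetes shim.
type appMeta struct {
	ApplicationID string
	Tags          map[string]string
	Owners        []string // stand-in for ownerReferences
}

// firstWins mirrors the quoted loop: only the first pod seen for an
// application ID contributes metadata; later pods are ignored.
func firstWins(pods []appMeta) map[string]appMeta {
	apps := make(map[string]appMeta)
	for _, m := range pods {
		if _, exist := apps[m.ApplicationID]; !exist {
			apps[m.ApplicationID] = m
		}
	}
	return apps
}

// mergeAll is one possible alternative (an assumption, not the actual fix):
// tags and owners from every pod of an application are merged. For a tag
// key that appears on several pods, the first value seen is kept.
func mergeAll(pods []appMeta) map[string]appMeta {
	apps := make(map[string]appMeta)
	for _, m := range pods {
		cur, exist := apps[m.ApplicationID]
		if !exist {
			cur = appMeta{ApplicationID: m.ApplicationID, Tags: map[string]string{}}
		}
		for k, v := range m.Tags {
			if _, set := cur.Tags[k]; !set {
				cur.Tags[k] = v
			}
		}
		for _, o := range m.Owners {
			if !contains(cur.Owners, o) {
				cur.Owners = append(cur.Owners, o)
			}
		}
		apps[m.ApplicationID] = cur
	}
	return apps
}

// contains reports whether s is present in list.
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// samplePods returns two pods of the same application whose metadata differs.
func samplePods() []appMeta {
	return []appMeta{
		{ApplicationID: "app-1", Tags: map[string]string{"namespace": "dev"}},
		{ApplicationID: "app-1", Tags: map[string]string{"namespace": "dev", "queue": "root.q"}, Owners: []string{"job-1"}},
	}
}

func main() {
	pods := samplePods()
	// Number of tags recovered for app-1 by each strategy.
	fmt.Println(len(firstWins(pods)["app-1"].Tags), len(mergeAll(pods)["app-1"].Tags))
}
```

With the sample input, the first-wins variant recovers only the tags of the first pod, while the merging variant also picks up the {{queue}} tag and the owner from the second pod. Whether merging is the right behaviour (as opposed to, for example, preferring a specific pod) is exactly the kind of question the analysis in this JIRA would need to answer.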


