Posted to issues@yunikorn.apache.org by "Peter Bacsko (Jira)" <ji...@apache.org> on 2022/04/04 17:52:00 UTC

[jira] [Created] (YUNIKORN-1169) Fix ApplicationMetadata restoration during recovery

Peter Bacsko created YUNIKORN-1169:
--------------------------------------

             Summary: Fix ApplicationMetadata restoration during recovery
                 Key: YUNIKORN-1169
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1169
             Project: Apache YuniKorn
          Issue Type: Bug
          Components: shim - kubernetes
            Reporter: Peter Bacsko


The following code in {{general.go}} handles the recovery part:

{noformat}
	for _, pod := range appPods {
		log.Logger().Debug("Looking at pod for recovery candidates", zap.String("podNamespace", pod.Namespace), zap.String("podName", pod.Name))
		// general filter passes, and pod is assigned
		// this means the pod is already scheduled by scheduler for an existing app
		if utils.GeneralPodFilter(pod) && utils.IsAssignedPod(pod) {
			if meta, ok := os.getAppMetadata(pod); ok {
				podsRecovered++
				log.Logger().Debug("Adding appID as recovery candidate", zap.String("appID", meta.ApplicationID))
				if _, exist := existingApps[meta.ApplicationID]; !exist {
					existingApps[meta.ApplicationID] = meta
				}
...
{noformat}

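The crucial part is the handling of the {{existingApps}} map. The entry for each application ID is populated only once, from the first matching pod; metadata from every later pod of the same application is ignored. However, there's no guarantee that all pods of an application have the same tags or ownerReferences.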
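
To make the concern concrete, here is a minimal, self-contained sketch. It is not the shim's code: {{appMetadata}}, {{firstWins}} and {{mergeTags}} are hypothetical names introduced only for illustration, and the tag keys and values in the usage note are made up. {{firstWins}} mirrors the "populate once" pattern from the snippet above; {{mergeTags}} shows one possible direction (keep the first value per tag key, but still collect keys that only appear on later pods). Whether that is the right policy is exactly what this JIRA should analyze.

```go
package main

// appMetadata is a simplified, hypothetical stand-in for the shim's
// ApplicationMetadata; only the fields needed for the illustration are kept.
type appMetadata struct {
	ApplicationID string
	Tags          map[string]string
}

// firstWins mirrors the pattern in the snippet above: only the metadata of
// the first pod seen for an application ID is stored, later ones are dropped.
func firstWins(metas []appMetadata) map[string]appMetadata {
	existingApps := make(map[string]appMetadata)
	for _, meta := range metas {
		if _, exist := existingApps[meta.ApplicationID]; !exist {
			existingApps[meta.ApplicationID] = meta
		}
	}
	return existingApps
}

// mergeTags is one hypothetical alternative: for each application ID the
// tags of all pods are combined. On a key conflict the first value is kept;
// this conflict policy is an assumption, not a recommendation.
func mergeTags(metas []appMetadata) map[string]appMetadata {
	existingApps := make(map[string]appMetadata)
	for _, meta := range metas {
		current, exist := existingApps[meta.ApplicationID]
		if !exist {
			// Start from a fresh map so the caller's Tags are never mutated.
			current = appMetadata{
				ApplicationID: meta.ApplicationID,
				Tags:          make(map[string]string),
			}
		}
		for key, value := range meta.Tags {
			if _, seen := current.Tags[key]; !seen {
				current.Tags[key] = value
			}
		}
		existingApps[meta.ApplicationID] = current
	}
	return existingApps
}
```

For example, given two pods of the same application where only the second one carries an "owner" tag, {{firstWins}} returns metadata without that tag, while {{mergeTags}} returns metadata containing the tags of both pods.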

The scope of this JIRA is to analyze the possible side-effects of this code and come up with a better solution. A bug was already identified because of this (see YUNIKORN-1161).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
