Posted to issues@yunikorn.apache.org by "Craig Condit (Jira)" <ji...@apache.org> on 2022/03/29 18:44:00 UTC

[jira] [Updated] (YUNIKORN-1162) Recovery code does not update occupied resources on node

     [ https://issues.apache.org/jira/browse/YUNIKORN-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Craig Condit updated YUNIKORN-1162:
-----------------------------------
    Description: In YUNIKORN-1159 we discovered and fixed an issue with occupied resources for existing pods not getting processed properly. This was fixed by adding handling for pod additions in the node coordinator. However, it has been pointed out that this code: [https://github.com/apache/yunikorn-k8shim/blob/76ddb316bd099fc783a698d3a8fb3d5543cdaf21/pkg/cache/context_recovery.go#L96...L128] should account for the occupied resources during recovery. We need to determine why that is no longer the case.  (was: In YUNIKORN-1159 we discovered and fixed an issue with occupied resources for existing pods not getting processed properly. This was fixed by adding handling for pod additions in the node coordinator. However, it has been pointed out that this code: [https://github.infra.cloudera.com/yunikorn/k8s-shim/blob/ba536fdf7335fa90bb9a6ea6e36ae426e007e9cc/pkg/cache/context_recovery.go#L97-L128] should account for the occupied resources during recovery. We need to determine why that is no longer the case.)

> Recovery code does not update occupied resources on node
> --------------------------------------------------------
>
>                 Key: YUNIKORN-1162
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1162
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: shim - kubernetes
>            Reporter: Craig Condit
>            Priority: Blocker
>
> In YUNIKORN-1159 we discovered and fixed an issue with occupied resources for existing pods not getting processed properly. This was fixed by adding handling for pod additions in the node coordinator. However, it has been pointed out that this code: [https://github.com/apache/yunikorn-k8shim/blob/76ddb316bd099fc783a698d3a8fb3d5543cdaf21/pkg/cache/context_recovery.go#L96...L128] should account for the occupied resources during recovery. We need to determine why that is no longer the case.
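
To make the expected behaviour concrete, here is a minimal sketch of what "accounting for occupied resources during recovery" could look like. It assumes that occupied resources are the requests of pods that are already bound to a node but were not scheduled by YuniKorn. The Pod type, its fields, the scheduler name constant, and occupiedByNode are hypothetical illustrations and are not taken from context_recovery.go or any other yunikorn-k8shim code.

```go
package main

import "fmt"

// Pod is a hypothetical, simplified stand-in for a Kubernetes pod.
// It is NOT the type used by yunikorn-k8shim.
type Pod struct {
	Name          string
	NodeName      string           // empty when the pod is not bound to a node
	SchedulerName string           // scheduler responsible for the pod
	Requests      map[string]int64 // resource name -> requested quantity
}

// yunikornScheduler is an assumed scheduler name used only for this sketch.
const yunikornScheduler = "yunikorn"

// occupiedByNode sums, per node, the requests of pods that are bound to a
// node but are not managed by YuniKorn. Unbound pods and YuniKorn-managed
// pods are skipped.
func occupiedByNode(pods []Pod) map[string]map[string]int64 {
	occupied := make(map[string]map[string]int64)
	for _, p := range pods {
		if p.NodeName == "" || p.SchedulerName == yunikornScheduler {
			continue
		}
		if occupied[p.NodeName] == nil {
			occupied[p.NodeName] = make(map[string]int64)
		}
		for name, qty := range p.Requests {
			occupied[p.NodeName][name] += qty
		}
	}
	return occupied
}

func main() {
	// Pods that already exist when the scheduler starts up (recovery case).
	existing := []Pod{
		{Name: "kube-proxy", NodeName: "node-1", SchedulerName: "default-scheduler",
			Requests: map[string]int64{"vcore": 100, "memory": 256}},
		{Name: "app-1", NodeName: "node-1", SchedulerName: yunikornScheduler,
			Requests: map[string]int64{"vcore": 1000, "memory": 2048}},
	}
	fmt.Println(occupiedByNode(existing))
}
```

Under that assumption, a recovery pass that skips this aggregation would report a node's occupied resources as empty until a later pod event triggers an update, which matches the symptom described in YUNIKORN-1159.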



