Posted to issues@yunikorn.apache.org by "Craig Condit (Jira)" <ji...@apache.org> on 2022/03/29 18:45:00 UTC

[jira] [Comment Edited] (YUNIKORN-1159) Node occupied resources are not detected on YuniKorn startup

    [ https://issues.apache.org/jira/browse/YUNIKORN-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514272#comment-17514272 ] 

Craig Condit edited comment on YUNIKORN-1159 at 3/29/22, 6:44 PM:
------------------------------------------------------------------

We may need a follow-up for this one. It has been pointed out that this code: [https://github.com/apache/yunikorn-k8shim/blob/76ddb316bd099fc783a698d3a8fb3d5543cdaf21/pkg/cache/context_recovery.go#L96...L128] should account for the occupied resources during recovery. We need to determine why that is no longer the case.
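
To make the expectation concrete, here is a minimal sketch of what "accounting for occupied resources during recovery" could look like. It is not the code at the link above. The Resource and Pod types, the occupiedByNode helper, and the "yunikorn" scheduler-name check are all simplified assumptions made for illustration. The idea is that pods already bound to a node, but not placed by YuniKorn, should be summed per node and subtracted from what the scheduler considers available.

```go
package main

import "fmt"

// Resource is a simplified resource vector (hypothetical, not YuniKorn's type).
type Resource struct {
	MilliCPU int64
	MemoryMi int64
}

// Add returns the element-wise sum of two resources.
func (r Resource) Add(o Resource) Resource {
	return Resource{MilliCPU: r.MilliCPU + o.MilliCPU, MemoryMi: r.MemoryMi + o.MemoryMi}
}

// Sub returns the element-wise difference of two resources.
func (r Resource) Sub(o Resource) Resource {
	return Resource{MilliCPU: r.MilliCPU - o.MilliCPU, MemoryMi: r.MemoryMi - o.MemoryMi}
}

// Pod is a simplified stand-in for a Kubernetes pod (hypothetical).
type Pod struct {
	Name          string
	NodeName      string // empty if the pod is not yet bound to a node
	SchedulerName string
	Terminated    bool
	Request       Resource
}

// Assumption: pods placed by YuniKorn carry this scheduler name.
const yunikornScheduler = "yunikorn"

// occupiedByNode sums the requests of live pods that are bound to a node
// but were not placed by YuniKorn. Pods placed by YuniKorn are skipped
// because they would be recovered as allocations instead.
func occupiedByNode(pods []Pod) map[string]Resource {
	occupied := make(map[string]Resource)
	for _, p := range pods {
		if p.NodeName == "" || p.Terminated || p.SchedulerName == yunikornScheduler {
			continue
		}
		occupied[p.NodeName] = occupied[p.NodeName].Add(p.Request)
	}
	return occupied
}

func main() {
	pods := []Pod{
		{Name: "kube-proxy", NodeName: "node-1", SchedulerName: "default-scheduler", Request: Resource{MilliCPU: 500, MemoryMi: 256}},
		{Name: "log-agent", NodeName: "node-1", SchedulerName: "default-scheduler", Request: Resource{MilliCPU: 1000, MemoryMi: 512}},
		{Name: "spark-driver", NodeName: "node-1", SchedulerName: yunikornScheduler, Request: Resource{MilliCPU: 2000, MemoryMi: 4096}},
		{Name: "unbound", NodeName: "", SchedulerName: "default-scheduler", Request: Resource{MilliCPU: 1000, MemoryMi: 1024}},
	}
	capacity := Resource{MilliCPU: 4000, MemoryMi: 8192}

	occupied := occupiedByNode(pods)["node-1"]
	schedulable := capacity.Sub(occupied)
	fmt.Printf("occupied=%+v schedulable=%+v\n", occupied, schedulable)
}
```

If the occupied total is left at zero on startup, the scheduler in this sketch would treat the full 4000m CPU as schedulable on node-1 rather than 2500m, which is the kind of over-allocation described in the issue.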


was (Author: ccondit):
We may need a follow-up for this one. It has been pointed out that this code: [https://github.infra.cloudera.com/yunikorn/k8s-shim/blob/ba536fdf7335fa90bb9a6ea6e36ae426e007e9cc/pkg/cache/context_recovery.go#L97-L128] should account for the occupied resources during recovery. We need to determine why that is no longer the case.

> Node occupied resources are not detected on YuniKorn startup
> ------------------------------------------------------------
>
>                 Key: YUNIKORN-1159
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1159
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: shim - kubernetes
>            Reporter: Craig Condit
>            Assignee: Craig Condit
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.0.0
>
>
> When YuniKorn starts up on a Kubernetes cluster, the occupied resources of existing nodes are not properly detected. As a result, YuniKorn can over-allocate resources on a node, and pods placed there can then fail to schedule.
> Restarting a pod on a node will cause the occupied values of the node to refresh properly.
>  



