Posted to dev@yunikorn.apache.org by "Craig Condit (Jira)" <ji...@apache.org> on 2024/03/05 22:11:00 UTC

[jira] [Resolved] (YUNIKORN-2465) Remove Task objects from the shim upon pod completion

     [ https://issues.apache.org/jira/browse/YUNIKORN-2465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Craig Condit resolved YUNIKORN-2465.
------------------------------------
    Fix Version/s: 1.5.0
       Resolution: Fixed

> Remove Task objects from the shim upon pod completion
> -----------------------------------------------------
>
>                 Key: YUNIKORN-2465
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2465
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: shim - kubernetes
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.5.0
>
>         Attachments: remove_ask_alloc.poc
>
>
> We don't remove Task objects from the shim when a pod completes. This affects long-running workloads that keep generating new pods with the same applicationID, such as Spark Streaming: memory usage grows steadily and eventually results in an OOM and the termination of YuniKorn. Tasks are only removed when the application reaches the Completed state in the scheduler-core.
> A restart fixes the situation because completed pods are not restored and added to the Context/Application. We should remove the tasks during the lifetime of the application unless there's a good reason not to.
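
The behaviour described above can be illustrated with a minimal sketch. This is not the shim's actual code: the Application and Task types, the map keyed by pod UID, and the AddTask/RemoveTask/TaskCount methods are all hypothetical names chosen for illustration. The sketch only shows the idea of dropping a task as soon as its pod completes, so that the per-application task map stays bounded for a long-lived application.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a hypothetical stand-in for the per-pod object tracked by the shim.
type Task struct {
	podUID string
}

// Application is a hypothetical stand-in for the per-application state.
// It is an assumption for illustration, not the real YuniKorn type.
type Application struct {
	mu    sync.Mutex
	id    string
	tasks map[string]*Task // keyed by pod UID
}

// NewApplication creates an application with an empty task map.
func NewApplication(id string) *Application {
	return &Application{id: id, tasks: make(map[string]*Task)}
}

// AddTask registers a task for a newly created pod.
func (a *Application) AddTask(podUID string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.tasks[podUID] = &Task{podUID: podUID}
}

// RemoveTask drops the task of a completed pod and reports whether
// a task with that pod UID was actually present.
func (a *Application) RemoveTask(podUID string) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	if _, ok := a.tasks[podUID]; !ok {
		return false
	}
	delete(a.tasks, podUID)
	return true
}

// TaskCount returns the number of tasks currently tracked.
func (a *Application) TaskCount() int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return len(a.tasks)
}

func main() {
	// Simulate a streaming-style application that keeps creating pods
	// under the same application ID.
	app := NewApplication("spark-streaming-app")
	for i := 0; i < 1000; i++ {
		uid := fmt.Sprintf("pod-%d", i)
		app.AddTask(uid)
		// Pod completed: remove its task immediately instead of
		// waiting for the whole application to complete.
		app.RemoveTask(uid)
	}
	fmt.Println("tracked tasks:", app.TaskCount())
}
```

Without the RemoveTask call in the loop, the map in this sketch would hold one entry per pod ever created, which mirrors the unbounded growth reported in the issue.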



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
