Posted to issues@yunikorn.apache.org by "Weiwei Yang (Jira)" <ji...@apache.org> on 2021/05/20 03:52:00 UTC

[jira] [Created] (YUNIKORN-677) Potential resource leak when complete and allocate pod happens simultaneously

Weiwei Yang created YUNIKORN-677:
------------------------------------

             Summary: Potential resource leak when complete and allocate pod happens simultaneously
                 Key: YUNIKORN-677
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-677
             Project: Apache YuniKorn
          Issue Type: Bug
            Reporter: Weiwei Yang


Let's say we have an app with 1 pod that needs scheduling. The shim submits the app to the core and starts scheduling the pod. On the shim side, this is a task in the Scheduling state. We then have a race if the following things happen simultaneously:
# The user deletes the pod. This triggers a CompleteTask event on the shim side, and the shim sends a ReleaseAllocationAskRequest to the core.
# Before handling the ReleaseAllocationAskRequest from the shim, the core makes an allocation for the given pod and sends an Allocation to the shim.

The core then generates an allocation on a node, receives the release request, and deletes the pending ask. The shim side receives the new allocation, but since the pod has already been deleted, the shim ignores this allocation. In this case, the allocation is left over, causing a resource leak.
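
The interleaving described above can be replayed deterministically with a toy model. This is only a sketch to illustrate the ordering problem: the `core` and `shim` types, their methods, and the `pod-1` / `node-1` names are hypothetical stand-ins, not the actual YuniKorn core or shim code. It assumes that releasing an ask removes only the ask and never touches an allocation that already exists.

```go
package main

import "fmt"

// core is a toy stand-in for the scheduler core (hypothetical, not the YuniKorn API).
type core struct {
	asks        map[string]bool   // ask key -> ask is known to the core
	allocations map[string]string // ask key -> node holding the allocation
}

// allocate places a known ask on a node and reports whether it did so.
func (c *core) allocate(askKey, node string) bool {
	if !c.asks[askKey] {
		return false
	}
	c.allocations[askKey] = node
	return true
}

// releaseAsk models handling a ReleaseAllocationAskRequest: it deletes the
// ask only. Assumption: an existing allocation is left untouched.
func (c *core) releaseAsk(askKey string) {
	delete(c.asks, askKey)
}

// shim is a toy stand-in for the Kubernetes shim (hypothetical).
type shim struct {
	tasks map[string]bool // task key -> task still exists in the shim
}

// completeTask models the CompleteTask event: the shim forgets the task.
func (s *shim) completeTask(key string) {
	delete(s.tasks, key)
}

// onAllocation reports whether the shim accepts an allocation.
// An allocation for an unknown task is ignored.
func (s *shim) onAllocation(key string) bool {
	return s.tasks[key]
}

// simulateRace replays the problematic ordering and returns the number of
// allocations left in the core and whether the shim accepted the allocation.
func simulateRace() (int, bool) {
	c := &core{asks: map[string]bool{"pod-1": true}, allocations: map[string]string{}}
	s := &shim{tasks: map[string]bool{"pod-1": true}}

	// 1. The user deletes the pod; the release request is now in flight.
	s.completeTask("pod-1")
	// 2. The core allocates before it has handled the release request.
	allocated := c.allocate("pod-1", "node-1")
	// 3. The core handles the release request and drops the ask.
	c.releaseAsk("pod-1")
	// 4. The shim receives the allocation but no longer knows the task.
	accepted := allocated && s.onAllocation("pod-1")

	return len(c.allocations), accepted
}

func main() {
	leaked, accepted := simulateRace()
	fmt.Println("leaked allocations:", leaked, "accepted by shim:", accepted)
}
```

In this model the core ends up holding one allocation that no task on the shim side owns, which is the left-over allocation the report describes.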


