Posted to dev@yunikorn.apache.org by "Weiwei Yang (Jira)" <ji...@apache.org> on 2021/05/20 03:52:00 UTC
[jira] [Created] (YUNIKORN-677) Potential resource leak when
complete and allocate pod happens simultaneously
Weiwei Yang created YUNIKORN-677:
------------------------------------
Summary: Potential resource leak when complete and allocate pod happens simultaneously
Key: YUNIKORN-677
URL: https://issues.apache.org/jira/browse/YUNIKORN-677
Project: Apache YuniKorn
Issue Type: Bug
Reporter: Weiwei Yang
Let's say we have an app with 1 pod that needs scheduling. The shim submits the app to the core and starts scheduling the pod. On the shim side, this is a task in the Scheduling state. We then have a race if the following things happen simultaneously:
# The user deletes the pod. This triggers a CompleteTask event on the shim side, and the shim sends a ReleaseAllocationAskRequest to the core.
# Before handling the ReleaseAllocationAskRequest from the shim, the core makes an allocation for the given pod and sends an Allocation to the shim.
The sequence then plays out as follows: the core generates an allocation on a node; the core receives the release request and deletes the pending ask; the shim receives the new allocation, but because the pod has already been deleted, the shim ignores this allocation. In this case, the allocation is left over, causing a resource leak.
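The interleaving described above can be replayed deterministically with a small toy model. This is only a sketch under stated assumptions, not YuniKorn code: the types toyCore and toyShim, their methods, and the keys "pod-1" and "node-1" are hypothetical names introduced for illustration. In particular, the sketch assumes that releasing an ask only removes a pending ask and never touches an allocation that has already been made, which is the behaviour this issue describes.

```go
package main

import "fmt"

// toyCore is a hypothetical stand-in for the core's bookkeeping.
type toyCore struct {
	pendingAsks map[string]bool   // ask key -> still pending
	allocations map[string]string // ask key -> node name
}

func newToyCore() *toyCore {
	return &toyCore{
		pendingAsks: map[string]bool{},
		allocations: map[string]string{},
	}
}

func (c *toyCore) addAsk(key string) { c.pendingAsks[key] = true }

// allocate turns a pending ask into an allocation on the given node.
func (c *toyCore) allocate(key, node string) bool {
	if !c.pendingAsks[key] {
		return false
	}
	delete(c.pendingAsks, key)
	c.allocations[key] = node
	return true
}

// releaseAsk removes a pending ask only. Assumption: it does not look at
// existing allocations, mirroring the behaviour described in the issue.
func (c *toyCore) releaseAsk(key string) { delete(c.pendingAsks, key) }

// toyShim is a hypothetical stand-in for the shim's view of live pods.
type toyShim struct {
	livePods map[string]bool
}

// onAllocation reports whether the shim accepts the allocation.
// An allocation for a pod that no longer exists is ignored.
func (s *toyShim) onAllocation(key string) bool { return s.livePods[key] }

// simulateRace replays the problematic ordering and returns the number
// of allocations left in the core that the shim never accepted.
func simulateRace() int {
	core := newToyCore()
	shim := &toyShim{livePods: map[string]bool{"pod-1": true}}
	core.addAsk("pod-1")

	// 1. The user deletes the pod; the shim's release request is in flight.
	delete(shim.livePods, "pod-1")
	// 2. The core allocates before it handles the release request.
	core.allocate("pod-1", "node-1")
	// 3. The core handles the release; there is no pending ask left to drop.
	core.releaseAsk("pod-1")
	// 4. The shim receives the allocation and ignores it.
	if shim.onAllocation("pod-1") {
		return 0
	}
	return len(core.allocations)
}

func main() {
	fmt.Println("leaked allocations:", simulateRace())
}
```

In this toy model the allocation for "pod-1" stays in the core's allocation map after both sides have finished processing, which is the left-over allocation described above.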