Posted to issues@yunikorn.apache.org by "Weiwei Yang (Jira)" <ji...@apache.org> on 2021/03/15 21:42:00 UTC

[jira] [Resolved] (YUNIKORN-572) terminated app move causes deadlock

     [ https://issues.apache.org/jira/browse/YUNIKORN-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang resolved YUNIKORN-572.
----------------------------------
    Fix Version/s: 0.10
       Resolution: Fixed

> terminated app move causes deadlock
> -----------------------------------
>
>                 Key: YUNIKORN-572
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-572
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.10
>
>
> PR #250 for the placeholder cleanup introduced a possible deadlock.
> When a terminated app is moved out of the queue, the queue is unlinked from the app. Before that happens, we make sure that the queue tracking is up to date, which requires the application lock.
> The move used to take a lock on the partition and do all its work in one go. With the change to remove the queue from the app, we now take the app lock while holding the partition lock. Since the app stays in the active list until the move is done, scheduling might check the app at the same time. This could lead to a deadlock.
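
The lock nesting described above can be illustrated with a minimal Go sketch. This is not the YuniKorn core code and not necessarily how the issue was fixed: the types and the functions moveTerminated, unlinkQueue and scheduleOnce are hypothetical names made up for the example. It assumes the scheduling side takes the app lock before the partition lock, and shows one common way to avoid the inversion: the move never holds the partition lock and the app lock at the same time.

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical stand-ins for the scheduler objects; not the YuniKorn API.
type application struct {
	sync.RWMutex
	id    string
	queue string // name of the linked queue, "" once unlinked
}

type partition struct {
	sync.RWMutex
	active    map[string]*application
	completed map[string]*application
}

func newPartition() *partition {
	return &partition{
		active:    make(map[string]*application),
		completed: make(map[string]*application),
	}
}

// unlinkQueue takes only the application lock.
func (a *application) unlinkQueue() {
	a.Lock()
	defer a.Unlock()
	a.queue = ""
}

func (a *application) queueName() string {
	a.RLock()
	defer a.RUnlock()
	return a.queue
}

// moveTerminated moves an app from the active to the completed list.
// The partition lock and the app lock are taken one after the other,
// never nested, so no lock order between the two is imposed.
func (p *partition) moveTerminated(appID string) bool {
	// Step 1: look up the app under the partition read lock only.
	p.RLock()
	app, ok := p.active[appID]
	p.RUnlock()
	if !ok {
		return false
	}
	// Step 2: unlink the queue under the app lock only.
	app.unlinkQueue()
	// Step 3: update the lists under the partition lock only.
	p.Lock()
	defer p.Unlock()
	if _, still := p.active[appID]; !still {
		return false // another caller already moved the app
	}
	delete(p.active, appID)
	p.completed[appID] = app
	return true
}

// scheduleOnce simulates a scheduling check that locks the app first and
// the partition second (assumed order, opposite to a nested move).
func (p *partition) scheduleOnce(app *application) int {
	app.RLock()
	defer app.RUnlock()
	p.RLock()
	defer p.RUnlock()
	return len(p.active)
}

func main() {
	p := newPartition()
	app := &application{id: "app-1", queue: "root.default"}
	p.active[app.id] = app

	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = p.scheduleOnce(app)
		}
	}()
	moved := false
	go func() {
		defer wg.Done()
		moved = p.moveTerminated(app.id)
	}()
	wg.Wait()
	fmt.Printf("moved=%v queue=%q completed=%d\n", moved, app.queueName(), len(p.completed))
}
```

If moveTerminated instead called unlinkQueue while still holding the partition lock, it could wait for the app lock held by scheduleOnce, while scheduleOnce waits for the partition lock: the classic two-lock ordering deadlock. The trade-off in the sketch is that the app is briefly unlinked from its queue while still in the active list, so readers have to tolerate an empty queue name.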



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
