Posted to issues@yunikorn.apache.org by "Wilfred Spiegelenburg (Jira)" <ji...@apache.org> on 2022/03/18 02:36:00 UTC

[jira] [Assigned] (YUNIKORN-1077) Negative Container Count

     [ https://issues.apache.org/jira/browse/YUNIKORN-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg reassigned YUNIKORN-1077:
-----------------------------------------------

    Assignee: Peter Bacsko

This has been reproduced, and we have a root cause (RC) that leads to this case:

Based on the log analysis, a placeholder release with a TIMEOUT causes the issue. The release is first counted when the placeholder actually times out. It is counted a second time after the shim has confirmed the removal: the core processes the release to update the nodes etc. and then sends the release to the shim again, so it is counted twice.

The core should not send the release to the shim a second time; it should just handle the accounting and leave it at that. This is similar to the processing performed for placeholder replacement.
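
The double counting described above can be sketched as follows. This is only an illustration of the accounting, not the actual core or shim code: the type containerMetrics, the function simulateTimeout, and the resendToShim flag are all hypothetical names, and the two counters simply mirror the "allocatedContainers" and "releasedContainers" values in the warning.

```go
package main

import "fmt"

// containerMetrics is a hypothetical stand-in for the counters behind the
// "allocatedContainers" and "releasedContainers" values in the warning.
type containerMetrics struct {
	allocated int
	released  int
}

// running derives the running container count. It goes negative as soon as
// releases have been counted more often than allocations.
func (m *containerMetrics) running() int {
	return m.allocated - m.released
}

// simulateTimeout models one placeholder timing out (hypothetical flow).
// resendToShim selects the reported behaviour (true) or the proposed one (false).
func simulateTimeout(m *containerMetrics, resendToShim bool) {
	// Step 1: the placeholder times out and the release is counted.
	m.released++

	// Step 2: the shim confirms the removal and the core updates its node
	// accounting (not modelled here).

	// Step 3 (reported behaviour only): the core sends the release to the
	// shim again, which counts the same release a second time.
	if resendToShim {
		m.released++
	}
}

func main() {
	buggy := &containerMetrics{allocated: 1}
	simulateTimeout(buggy, true)

	fixed := &containerMetrics{allocated: 1}
	simulateTimeout(fixed, false)

	fmt.Println("running with second release sent:", buggy.running())
	fmt.Println("running with accounting only:", fixed.running())
}
```

With the second release sent, a single allocation ends up with two counted releases, which is the kind of imbalance the warning reports.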

[~pbacsko] is working on a fix.

> Negative Container Count
> ------------------------
>
>                 Key: YUNIKORN-1077
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1077
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - common
>            Reporter: Si Latt
>            Assignee: Peter Bacsko
>            Priority: Major
>
> For some unknown reason, YuniKorn sometimes thinks the container count is negative. This info gets displayed on the dashboard. Also, in the YK log, I can see the following log lines:
> {code:java}
> 2022-02-08T09:08:17.878Z        WARN    metrics/metrics_collector.go:85 Could not calculate the totalContainersRunning. {"allocatedContainers": 23, "releasedContainers": 27}
> {code}
> YK team mentioned it's possibly a metrics bug and hence I am filing the report. I haven't been able to repro the issue yet.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
