Posted to dev@yunikorn.apache.org by "Wilfred Spiegelenburg (Jira)" <ji...@apache.org> on 2021/01/29 00:44:00 UTC

[jira] [Resolved] (YUNIKORN-230) Incorrect application state returned for "ws/v1/apps" REST call

     [ https://issues.apache.org/jira/browse/YUNIKORN-230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg resolved YUNIKORN-230.
--------------------------------------------
    Fix Version/s: 0.10
       Resolution: Fixed

The state transitions were racing between the cache and the scheduler. Since the cache was removed in 0.10, this can no longer happen. Before that, mitigations were already in place via YUNIKORN-222.

Closing this, as it cannot happen anymore after YUNIKORN-317.

> Incorrect application state returned for "ws/v1/apps" REST call
> ---------------------------------------------------------------
>
>                 Key: YUNIKORN-230
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-230
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>            Reporter: Ayub Pathan
>            Assignee: Wilfred Spiegelenburg
>            Priority: Major
>             Fix For: 0.10
>
>
>  YuniKorn image: latest
> {noformat}
>   yunikorn-scheduler-web:
>     Container ID:   docker://e649fb4db6a3b822bb2a6bc5e4bf607e5848548df7eb57f24dbe37550ec87ec3
>     Image:          apache/yunikorn:web-0.9.0
>     Image ID:       docker-pullable://apache/yunikorn@sha256:52e5cfc8823e38d50249f2c3fcd50b0f2755ffb79534482e0d9d67f8b8e604f3 {noformat}
> *Steps to reproduce:*
>  * Deploy the job
>  * Check the status
> {noformat}
> kubectl get pods -n development
> NAME       READY   STATUS    RESTARTS   AGE
> sleepjob   1/1     Running   0          26s {noformat}
>  * Verify the API response; it still shows the application state as Starting
> {noformat}
> [
>     {
>         "allocations": [
>             {
>                 "allocationKey": "11fc645b-b0c7-11ea-aeee-0e65480c53e2",
>                 "allocationTags": null,
>                 "applicationId": "abcd",
>                 "nodeId": "ip-10-192-172-176.ca-central-1.compute.internal",
>                 "partition": "default",
>                 "priority": "<nil>",
>                 "queueName": "root.development",
>                 "resource": "[memory:50 vcore:100]",
>                 "uuid": "5a35de0b-b0d4-4434-9f17-3618faa0e247"
>             }
>         ],
>         "applicationID": "abcd",
>         "applicationState": "Starting",
>         "partition": "[mycluster]default",
>         "queueName": "root.development",
>         "submissionTime": 1592417964439019878,
>         "usedResource": "[memory:50 vcore:100]"
>     }
> ] {noformat}
>  * Job completed
> {noformat}
> kubectl get pods -n development
> NAME       READY   STATUS      RESTARTS   AGE
> sleepjob   0/1     Completed   0          60s {noformat}
>  * The API response still shows the application state as Starting
> {noformat}
> [
>     {
>         "allocations": null,
>         "applicationID": "abcd",
>         "applicationState": "Starting",
>         "partition": "[mycluster]default",
>         "queueName": "root.development",
>         "submissionTime": 1592417964439019878,
>         "usedResource": "[memory:0 vcore:0]"
>     }
> ] {noformat}
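
The symptom in the description above can be checked mechanically. The following is a minimal sketch, not part of YuniKorn: it parses a "ws/v1/apps" response body and lists applications that are still reported as Starting while holding no allocations. The function name find_stale_apps and the "Starting with no allocations" heuristic are assumptions made for illustration; the sample payload is copied from the second response in the report.

```python
import json

# Sample payload copied from the report (response after the pod completed).
SAMPLE = """
[
    {
        "allocations": null,
        "applicationID": "abcd",
        "applicationState": "Starting",
        "partition": "[mycluster]default",
        "queueName": "root.development",
        "submissionTime": 1592417964439019878,
        "usedResource": "[memory:0 vcore:0]"
    }
]
"""


def find_stale_apps(payload):
    """Return IDs of apps reported as Starting that hold no allocations.

    Hypothetical helper: the heuristic is an assumption for illustration,
    not a rule defined by YuniKorn.
    """
    stale = []
    for app in json.loads(payload):
        state = app.get("applicationState")
        allocations = app.get("allocations")  # null in JSON becomes None
        if state == "Starting" and not allocations:
            stale.append(app["applicationID"])
    return stale


if __name__ == "__main__":
    print(find_stale_apps(SAMPLE))
```

Against a live scheduler the payload would come from the REST endpoint rather than a string constant; the host and port depend on the deployment and are not stated in this report.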



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
