Posted to issues@yunikorn.apache.org by "Ayub Pathan (Jira)" <ji...@apache.org> on 2021/03/16 04:45:00 UTC

[jira] [Created] (YUNIKORN-576) YK unable to schedule post rejecting an app

Ayub Pathan created YUNIKORN-576:
------------------------------------

             Summary: YK unable to schedule post rejecting an app
                 Key: YUNIKORN-576
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-576
             Project: Apache YuniKorn
          Issue Type: Bug
          Components: core - scheduler
            Reporter: Ayub Pathan
             Fix For: 0.10
         Attachments: gang-app-timeout-no-gang.yaml, stack, yk.log

* Submitted an app ([^gang-app-timeout-no-gang.yaml]) with min member == parallelism. The scheduler rejects the app, and after that no subsequently submitted app gets scheduled.
* The app is rejected with the error below after the placeholder pods time out.
{noformat}
2021-03-16T03:12:41.214Z	INFO	scheduler/context.go:674	Invalid ask add requested by shim	{"partition": "[mycluster]default", "applicationID": "gang-app-timeout-1009", "askKey": "cf58523b-9750-40b8-b148-b3319bdf3edf", "error": "failed to find application gang-app-timeout-1009, for allocation ask cf58523b-9750-40b8-b148-b3319bdf3edf"}
2021-03-16T03:12:41.214Z	WARN	cache/task.go:415	task allocation UUID is empty, sending this release request to yunikorn-core could cause all allocations of this app get released. skip this request, this may cause some resource leak. check the logs for more info!	{"applicationID": "gang-app-timeout-1009", "taskID": "cf58523b-9750-40b8-b148-b3319bdf3edf", "taskAlias": "fifo/gang-app-timeout-1009-h5qlh", "allocationUUID": "", "task": "Failed"}
2021-03-16T03:12:41.214Z	ERROR	cache/task.go:243	task failed	{"appID": "gang-app-timeout-1009", "taskID": "cf58523b-9750-40b8-b148-b3319bdf3edf", "reason": "task fifo/gang-app-timeout-1009-h5qlh failed because it is rejected by scheduler"}
github.com/apache/incubator-yunikorn-k8shim/pkg/cache.(*Task).handleFailEvent
	/grid/0/jenkins/workspace/workspace/App_builds/SOURCES/yunikorn-k8shim/pkg/cache/task.go:243
github.com/looplab/fsm.(*FSM).afterEventCallbacks
	/grid/0/jenkins/go/pkg/mod/github.com/looplab/fsm@v0.1.0/fsm.go:414
github.com/looplab/fsm.(*FSM).Event.func1
	/grid/0/jenkins/go/pkg/mod/github.com/looplab/fsm@v0.1.0/fsm.go:309
github.com/looplab/fsm.transitionerStruct.transition
	/grid/0/jenkins/go/pkg/mod/github.com/looplab/fsm@v0.1.0/fsm.go:354
github.com/looplab/fsm.(*FSM).doTransition
	/grid/0/jenkins/go/pkg/mod/github.com/looplab/fsm@v0.1.0/fsm.go:339
github.com/looplab/fsm.(*FSM).Event
	/grid/0/jenkins/go/pkg/mod/github.com/looplab/fsm@v0.1.0/fsm.go:321
github.com/apache/incubator-yunikorn-k8shim/pkg/cache.(*Task).handle
	/grid/0/jenkins/workspace/workspace/App_builds/SOURCES/yunikorn-k8shim/pkg/cache/task.go:152
github.com/apache/incubator-yunikorn-k8shim/pkg/cache.(*Context).TaskEventHandler.func1
	/grid/0/jenkins/workspace/workspace/App_builds/SOURCES/yunikorn-k8shim/pkg/cache/context.go:770
github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.Start.func1
	/grid/0/jenkins/workspace/workspace/App_builds/SOURCES/yunikorn-k8shim/pkg/dispatcher/dispatcher.go:194
2021-03-16T03:12:41.896Z	INFO	general/general.go:221	task completes	{"appType": "general", "namespace": "fifo", "podName": "tg-timeout-1009-gang-app-timeout-1009-0", "podUID": "11c4a9dd-7ec4-4dee-8e36-eb0dc74bb6d1", "podStatus": "Failed"}
{noformat}
* After this error, no newly submitted app is scheduled; its pods stay in Pending:
{noformat}
gang-app-timeout-1010-dph4q               0/1     Pending     0          11m
gang-app-timeout-1010-f7zmp               0/1     Pending     0          11m
gang-app-timeout-1010-xmzfk               0/1     Pending     0          11m
tg-timeout-1010-gang-app-timeout-1010-0   0/1     Pending     0          11m
tg-timeout-1010-gang-app-timeout-1010-1   0/1     Pending     0          11m
tg-timeout-1010-gang-app-timeout-1010-2   0/1     Pending     0          11m
{noformat}
Complete logs are attached: [^yk.log].
The stack trace is attached: [^stack].




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
