Posted to dev@yunikorn.apache.org by "Peter Bacsko (Jira)" <ji...@apache.org> on 2023/05/03 10:57:00 UTC

[jira] [Created] (YUNIKORN-1714) Fatal error: concurrent write/read when calling Queue.RemoveApplication()

Peter Bacsko created YUNIKORN-1714:
--------------------------------------

             Summary: Fatal error: concurrent write/read when calling Queue.RemoveApplication()
                 Key: YUNIKORN-1714
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1714
             Project: Apache YuniKorn
          Issue Type: Bug
          Components: core - scheduler
            Reporter: Peter Bacsko


Encountered this problem while doing some local testing with a lot of running applications:

{noformat}
fatal error: concurrent map read and map write

goroutine 8785 [running]:
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Queue).RemoveApplication(0xc0002e0840, 0xc004a1cc40)
	/home/bacskop/repos/incubator-yunikorn-core/pkg/scheduler/objects/queue.go:697 +0x65
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).UnSetQueue(0xc004a1cc40)
	/home/bacskop/repos/incubator-yunikorn-core/pkg/scheduler/objects/application.go:1493 +0x45
github.com/apache/yunikorn-core/pkg/scheduler.(*PartitionContext).moveTerminatedApp(0xc0002aa600, {0xc00372e4e0, 0x16})
	/home/bacskop/repos/incubator-yunikorn-core/pkg/scheduler/partition.go:1409 +0x73
created by github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).executeTerminatedCallback
	/home/bacskop/repos/incubator-yunikorn-core/pkg/scheduler/objects/application.go:1831 +0xaa

...

goroutine 8782 [runnable]:
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).timeoutStateTimer.func1()
	/home/bacskop/repos/incubator-yunikorn-core/pkg/scheduler/objects/application.go:298
created by time.goFunc
	/snap/go/current/src/time/sleep.go:176 +0x32

goroutine 8623 [runnable]:
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).executeTerminatedCallback.func1()
	/home/bacskop/repos/incubator-yunikorn-core/pkg/scheduler/objects/application.go:1831
runtime.goexit()
	/snap/go/current/src/runtime/asm_amd64.s:1598 +0x1
created by github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).executeTerminatedCallback
	/home/bacskop/repos/incubator-yunikorn-core/pkg/scheduler/objects/application.go:1831 +0xaa

goroutine 8786 [runnable]:
go.uber.org/zap.(*stacktrace).Next(...)
	/home/bacskop/go/pkg/mod/go.uber.org/zap@v1.24.0/stacktrace.go:127
go.uber.org/zap.(*Logger).check(0xc0003bb650, 0x0, {0x1e6c20c, 0x2c})
	/home/bacskop/go/pkg/mod/go.uber.org/zap@v1.24.0/logger.go:372 +0x7e5
go.uber.org/zap.(*Logger).Info(0xc0002e0420?, {0x1e6c20c?, 0x1?}, {0xc005745680, 0x2, 0x2})
	/home/bacskop/go/pkg/mod/go.uber.org/zap@v1.24.0/logger.go:219 +0x3b
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Queue).RemoveApplication(0xc0002e0840, 0xc004aa0380)
	/home/bacskop/repos/incubator-yunikorn-core/pkg/scheduler/objects/queue.go:742 +0xcc6
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).UnSetQueue(0xc004aa0380)
	/home/bacskop/repos/incubator-yunikorn-core/pkg/scheduler/objects/application.go:1493 +0x45
github.com/apache/yunikorn-core/pkg/scheduler.(*PartitionContext).moveTerminatedApp(0xc0002aa600, {0xc00372e498, 0x16})
	/home/bacskop/repos/incubator-yunikorn-core/pkg/scheduler/partition.go:1409 +0x73
created by github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).executeTerminatedCallback
	/home/bacskop/repos/incubator-yunikorn-core/pkg/scheduler/objects/application.go:1831 +0xaa
{noformat}

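There is an unprotected access to {{sq.applications[]}}: the code checks whether an application exists without holding the queue lock. This can fail because another goroutine can modify the map at the same time. The Go runtime detects the concurrent read and write and aborts the process with a fatal error, as the trace above shows.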
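
To illustrate the kind of change that would avoid this, here is a minimal sketch and not the actual YuniKorn code. The {{queue}} and {{application}} types and their methods are hypothetical, simplified stand-ins for the real scheduler objects. The only point is that the existence check and the delete both happen while the lock is held:

```go
package main

import (
	"fmt"
	"sync"
)

// application is a hypothetical stand-in for the scheduler's Application type.
type application struct {
	id string
}

// queue is a hypothetical, simplified stand-in for the scheduler's Queue type.
// The embedded RWMutex guards every access to the applications map.
type queue struct {
	sync.RWMutex
	applications map[string]*application
}

func newQueue() *queue {
	return &queue{applications: make(map[string]*application)}
}

// addApplication registers an application under the write lock.
func (q *queue) addApplication(app *application) {
	q.Lock()
	defer q.Unlock()
	q.applications[app.id] = app
}

// removeApplication performs the existence check and the delete under the
// same write lock, so no other goroutine can touch the map in between.
// It reports whether the application was present.
func (q *queue) removeApplication(app *application) bool {
	q.Lock()
	defer q.Unlock()
	if _, ok := q.applications[app.id]; !ok {
		return false
	}
	delete(q.applications, app.id)
	return true
}

// appCount reads the map size under the read lock.
func (q *queue) appCount() int {
	q.RLock()
	defer q.RUnlock()
	return len(q.applications)
}

func main() {
	q := newQueue()
	apps := make([]*application, 100)
	for i := range apps {
		apps[i] = &application{id: fmt.Sprintf("app-%d", i)}
		q.addApplication(apps[i])
	}

	// Remove all applications concurrently, similar to many terminated
	// callbacks firing at once.
	var wg sync.WaitGroup
	for _, app := range apps {
		wg.Add(1)
		go func(a *application) {
			defer wg.Done()
			q.removeApplication(a)
		}(app)
	}
	wg.Wait()

	fmt.Println("remaining applications:", q.appCount())
}
```

Running a sketch like this with {{go run -race}} is a quick way to confirm that no unlocked map access remains. The real fix may need to keep the locked section smaller, for example by doing the logging outside the lock.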



--
This message was sent by Atlassian Jira
(v8.20.10#820010)