Posted to issues@yunikorn.apache.org by "Wilfred Spiegelenburg (Jira)" <ji...@apache.org> on 2021/02/01 00:36:00 UTC
[jira] [Comment Edited] (YUNIKORN-528) Nil pointer exception while getting both termination and delete pod event
[ https://issues.apache.org/jira/browse/YUNIKORN-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276017#comment-17276017 ]
Wilfred Spiegelenburg edited comment on YUNIKORN-528 at 2/1/21, 12:35 AM:
--------------------------------------------------------------------------
We need to be defensive on the core side and not throw an NPE, but this also clearly shows a major bug in the shim. This is already logged as YUNIKORN-229, and there is now enough detail to show that the shim is doing the wrong thing.
Even when the shim has no knowledge of the pod, it still passes a termination or delete event on to the core. This needs to be fixed in the shim as well: a pod that does not exist, or is marked as deleted, in the shim's cache should never trigger a second delete message from the shim to the core. Currently the shim passes pods through even when they are no longer in its cache and thus not known to the system.
From the attached log first trigger point:
{code}
2021-01-29T16:40:08.495-0800 INFO general/general.go:218 ###### pod is terminated {"name": "tg-tg-batch-sleep-job-1-0"}
2021-01-29T16:40:08.495-0800 DEBUG cache/context.go:223 removing pod from cache {"podName": "tg-tg-batch-sleep-job-1-0"}
2021-01-29T16:40:08.496-0800 INFO general/general.go:219 task completes {"appType": "general", "namespace": "default", "podName": "tg-tg-batch-sleep-job-1-0", "podUID": "a7d59740-7b56-4b81-931c-3cc00ed12042", "podStatus": "Failed"}
2021-01-29T16:40:08.496-0800 DEBUG cache/context.go:424 NotifyTaskComplete {"appID": "batch-sleep-job-1", "taskID": "a7d59740-7b56-4b81-931c-3cc00ed12042"}
2021-01-29T16:40:08.496-0800 DEBUG cache/context.go:428 release allocation {"appID": "batch-sleep-job-1", "taskID": "a7d59740-7b56-4b81-931c-3cc00ed12042"}
2021-01-29T16:40:08.496-0800 DEBUG cache/task.go:392 prepare to send release request {"applicationID": "batch-sleep-job-1", "taskID": "a7d59740-7b56-4b81-931c-3cc00ed12042", "taskAlias": "default/tg-tg-batch-sleep-job-1-0", "allocationUUID": "27ea8874-261f-4448-bc0f-5ea14d23f9e8", "task": "Bound", "terminationType": "PLACEHOLDER_REPLACED"}
2021-01-29T16:40:08.496-0800 INFO cache/task.go:430 releasing allocations {"numOfAsksToRelease": 0, "numOfAllocationsToRelease": 1}
2021-01-29T16:40:08.496-0800 DEBUG cache/task.go:467 shim task state transition {"app": "batch-sleep-job-1", "task": "a7d59740-7b56-4b81-931c-3cc00ed12042", "taskAlias": "default/tg-tg-batch-sleep-job-1-0", "source": "Bound", "destination": "Completed", "event": "CompleteTask"}
2021-01-29T16:40:08.496-0800 DEBUG scheduler/partition.go:1134 replacing placeholder allocation {"appID": "batch-sleep-job-1", "allocationId": "27ea8874-261f-4448-bc0f-5ea14d23f9e8"}
2021-01-29T16:40:08.496-0800 DEBUG scheduler/scheduler.go:97 enqueued event {"eventType": "*rmevent.RMUpdateRequestEvent", "event": {"Request":{"releases":{"allocationsToRelease":[{"partitionName":"[mycluster]default","applicationID":"batch-sleep-job-1","UUID":"27ea8874-261f-4448-bc0f-5ea14d23f9e8","terminationType":3,"message":"task completed"}]},"rmID":"mycluster"}}, "currentQueueSize": 0}
{code}
That means that after this point in time the pod (UUID) no longer exists in the shim cache. The task has been marked as completed. That is the last time the UUID *must* be communicated to the core.
Second trigger point in the log:
{code}
2021-01-29T16:40:12.022-0800 INFO general/general.go:257 ###### pod is deleted {"name": "tg-tg-batch-sleep-job-1-0"}
2021-01-29T16:40:12.022-0800 INFO general/general.go:259 delete pod {"appType": "general", "namespace": "default", "podName": "tg-tg-batch-sleep-job-1-0", "podUID": "a7d59740-7b56-4b81-931c-3cc00ed12042"}
2021-01-29T16:40:12.022-0800 DEBUG cache/context.go:424 NotifyTaskComplete {"appID": "batch-sleep-job-1", "taskID": "a7d59740-7b56-4b81-931c-3cc00ed12042"}
2021-01-29T16:40:12.022-0800 DEBUG cache/context.go:428 release allocation {"appID": "batch-sleep-job-1", "taskID": "a7d59740-7b56-4b81-931c-3cc00ed12042"}
2021-01-29T16:40:12.022-0800 DEBUG cache/task.go:392 prepare to send release request {"applicationID": "batch-sleep-job-1", "taskID": "a7d59740-7b56-4b81-931c-3cc00ed12042", "taskAlias": "default/tg-tg-batch-sleep-job-1-0", "allocationUUID": "27ea8874-261f-4448-bc0f-5ea14d23f9e8", "task": "Completed", "terminationType": "PLACEHOLDER_REPLACED"}
2021-01-29T16:40:12.022-0800 INFO cache/task.go:430 releasing allocations {"numOfAsksToRelease": 0, "numOfAllocationsToRelease": 1}
2021-01-29T16:40:12.022-0800 DEBUG scheduler/partition.go:1134 replacing placeholder allocation {"appID": "batch-sleep-job-1", "allocationId": "27ea8874-261f-4448-bc0f-5ea14d23f9e8"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x190cf5d]
{code}
That shows the shim has no idea that the pod or task has already been terminated; or, if it does know, it ignores that state and sends the same request again. This is exactly the issue described in YUNIKORN-229. The pod is removed from the cache, but the pod and task tracking are clearly not linked correctly to the cache, as the task hangs around. That *must* be fixed.
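As an illustration of the kind of fix the shim needs, here is a minimal Go sketch, not the actual YuniKorn code: the Task type, the state constants, and the releaseOnce helper are all hypothetical. The idea is simply that the first terminating event moves the task to Completed and sends the release, and any later event for the same pod is ignored instead of being forwarded to the core again.

```go
package main

import (
	"fmt"
	"sync"
)

// TaskState is a hypothetical stand-in for the shim's task state machine.
type TaskState int

const (
	Bound TaskState = iota
	Completed
)

type Task struct {
	sync.Mutex
	ID    string
	state TaskState
}

// releaseOnce sends the release request only on the first transition to
// Completed; duplicate termination/delete events become no-ops. It returns
// true if the release was actually sent.
func (t *Task) releaseOnce(send func(taskID string)) bool {
	t.Lock()
	defer t.Unlock()
	if t.state == Completed {
		return false // already released, ignore the duplicate event
	}
	t.state = Completed
	send(t.ID)
	return true
}

func main() {
	calls := 0
	task := &Task{ID: "a7d59740"}
	task.releaseOnce(func(string) { calls++ }) // pod terminated event
	task.releaseOnce(func(string) { calls++ }) // pod delete event: suppressed
	fmt.Println(calls)                         // prints 1
}
```

With this guard, the second trigger point in the log above would never produce a second release request, regardless of what the core does.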
> Nil pointer exception while getting both termination and delete pod event
> -------------------------------------------------------------------------
>
> Key: YUNIKORN-528
> URL: https://issues.apache.org/jira/browse/YUNIKORN-528
> Project: Apache YuniKorn
> Issue Type: Sub-task
> Components: shim - kubernetes
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Blocker
> Labels: pull-request-available
> Attachments: nil.log
>
>
> During the test, I observed on some occasions the scheduler could run into Nil pointer exception like below:
> {code}
> 4-261f-4448-bc0f-5ea14d23f9e8"}
> panic: runtime error: invalid memory address or nil pointer dereference
> [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x190cf5d]
> goroutine 114 [running]:
> github.com/apache/incubator-yunikorn-core/pkg/scheduler/objects.(*Application).ReplaceAllocation(0xc004250000, 0xc0038e01b0, 0x24, 0x0)
> /Users/wyang/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20210126213806-78bf4f684709/pkg/scheduler/objects/application.go:1026 +0xcd
> github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*PartitionContext).removeAllocation(0xc0026de600, 0xc0003c0a10, 0x0, 0x0, 0x0, 0x0)
> /Users/wyang/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20210126213806-78bf4f684709/pkg/scheduler/partition.go:1137 +0x14b5
> github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*ClusterContext).processAllocationReleases(0xc0001400f0, 0xc0066400c0, 0x1, 0x1, 0x7ffeefbff80f, 0x9)
> /Users/wyang/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20210126213806-78bf4f684709/pkg/scheduler/context.go:683 +0x150
> github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*ClusterContext).processAllocations(0xc0001400f0, 0xc006730000)
> /Users/wyang/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20210126213806-78bf4f684709/pkg/scheduler/context.go:606 +0x185
> github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*ClusterContext).processRMUpdateEvent(0xc0001400f0, 0xc0066ee0b8)
> /Users/wyang/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20210126213806-78bf4f684709/pkg/scheduler/context.go:213 +0x77
> github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*Scheduler).handleRMEvent(0xc00000e3c0)
> /Users/wyang/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20210126213806-78bf4f684709/pkg/scheduler/scheduler.go:112 +0x416
> created by github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*Scheduler).StartService
> /Users/wyang/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20210126213806-78bf4f684709/pkg/scheduler/scheduler.go:54 +0xa2
> make: *** [run] Error 2
> {code}
> the root cause is that when the shim deletes a placeholder, it can sometimes trigger 2 events:
> * Pod Update
> * Pod Delete
> Both when a pod is updated to the TERMINATED state and when a pod gets DELETED, the shim sends a release request to the core. When the second release request arrives, the previous one has already removed the allocation, so we hit the Nil pointer. We need to avoid sending the release a second time if the pod has already been released.
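On the core side, the defensive half of the fix can be sketched as follows. This is a hypothetical simplification, not the real ReplaceAllocation in application.go: the point is only that looking up the allocation for a UUID must tolerate the case where an earlier release already removed it, returning nil for the caller to check instead of dereferencing a missing entry.

```go
package main

import "fmt"

// Allocation and Application are hypothetical stand-ins for the core's
// scheduler objects.
type Allocation struct {
	UUID string
}

type Application struct {
	allocations map[string]*Allocation
}

// ReplaceAllocation removes and returns the allocation for uuid. A second
// call with the same uuid finds nothing and returns nil, so the caller can
// log and skip the duplicate release instead of panicking.
func (a *Application) ReplaceAllocation(uuid string) *Allocation {
	alloc, ok := a.allocations[uuid]
	if !ok {
		return nil // already released: nothing to replace
	}
	delete(a.allocations, uuid)
	return alloc
}

func main() {
	app := &Application{allocations: map[string]*Allocation{
		"27ea8874": {UUID: "27ea8874"},
	}}
	first := app.ReplaceAllocation("27ea8874")
	second := app.ReplaceAllocation("27ea8874")
	fmt.Println(first != nil, second == nil) // prints: true true
}
```

Together with the shim-side guard, this makes the duplicate release harmless even if one side of the fix regresses.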
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@yunikorn.apache.org
For additional commands, e-mail: issues-help@yunikorn.apache.org