Posted to issues@yunikorn.apache.org by "Wilfred Spiegelenburg (Jira)" <ji...@apache.org> on 2021/02/01 00:36:00 UTC
[jira] [Comment Edited] (YUNIKORN-528) Nil pointer exception while getting both termination and delete pod event
[ https://issues.apache.org/jira/browse/YUNIKORN-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276017#comment-17276017 ]
Wilfred Spiegelenburg edited comment on YUNIKORN-528 at 2/1/21, 12:35 AM:
--------------------------------------------------------------------------
We need to be defensive on the core side and not throw an NPE, but this also clearly shows a major bug in the shim. This is already logged as YUNIKORN-229, and there is now enough detail to show that the shim is doing the wrong thing.
Even when the shim has no knowledge of the pod, it still passes a termination or delete event on to the core. This needs to be fixed in the shim as well: a pod that does not exist, or is marked as deleted, in the shim's cache should never trigger a second delete message from the shim to the core. Currently the shim passes pods through even when they are no longer in its cache and thus not known to the system.
From the attached log first trigger point:
{code}
2021-01-29T16:40:08.495-0800 INFO general/general.go:218 ###### pod is terminated {"name": "tg-tg-batch-sleep-job-1-0"}
2021-01-29T16:40:08.495-0800 DEBUG cache/context.go:223 removing pod from cache {"podName": "tg-tg-batch-sleep-job-1-0"}
2021-01-29T16:40:08.496-0800 INFO general/general.go:219 task completes {"appType": "general", "namespace": "default", "podName": "tg-tg-batch-sleep-job-1-0", "podUID": "a7d59740-7b56-4b81-931c-3cc00ed12042", "podStatus": "Failed"}
2021-01-29T16:40:08.496-0800 DEBUG cache/context.go:424 NotifyTaskComplete {"appID": "batch-sleep-job-1", "taskID": "a7d59740-7b56-4b81-931c-3cc00ed12042"}
2021-01-29T16:40:08.496-0800 DEBUG cache/context.go:428 release allocation {"appID": "batch-sleep-job-1", "taskID": "a7d59740-7b56-4b81-931c-3cc00ed12042"}
2021-01-29T16:40:08.496-0800 DEBUG cache/task.go:392 prepare to send release request {"applicationID": "batch-sleep-job-1", "taskID": "a7d59740-7b56-4b81-931c-3cc00ed12042", "taskAlias": "default/tg-tg-batch-sleep-job-1-0", "allocationUUID": "27ea8874-261f-4448-bc0f-5ea14d23f9e8", "task": "Bound", "terminationType": "PLACEHOLDER_REPLACED"}
2021-01-29T16:40:08.496-0800 INFO cache/task.go:430 releasing allocations {"numOfAsksToRelease": 0, "numOfAllocationsToRelease": 1}
2021-01-29T16:40:08.496-0800 DEBUG cache/task.go:467 shim task state transition {"app": "batch-sleep-job-1", "task": "a7d59740-7b56-4b81-931c-3cc00ed12042", "taskAlias": "default/tg-tg-batch-sleep-job-1-0", "source": "Bound", "destination": "Completed", "event": "CompleteTask"}
2021-01-29T16:40:08.496-0800 DEBUG scheduler/partition.go:1134 replacing placeholder allocation {"appID": "batch-sleep-job-1", "allocationId": "27ea8874-261f-4448-bc0f-5ea14d23f9e8"}
2021-01-29T16:40:08.496-0800 DEBUG scheduler/scheduler.go:97 enqueued event {"eventType": "*rmevent.RMUpdateRequestEvent", "event": {"Request":{"releases":{"allocationsToRelease":[{"partitionName":"[mycluster]default","applicationID":"batch-sleep-job-1","UUID":"27ea8874-261f-4448-bc0f-5ea14d23f9e8","terminationType":3,"message":"task completed"}]},"rmID":"mycluster"}}, "currentQueueSize": 0}
{code}
That means that after this point in time the pod (UUID) no longer exists in the shim cache. The task has been marked as completed. That is the last time the UUID *must* be communicated to the core.
Second trigger point in the log:
{code}
2021-01-29T16:40:12.022-0800 INFO general/general.go:257 ###### pod is deleted {"name": "tg-tg-batch-sleep-job-1-0"}
2021-01-29T16:40:12.022-0800 INFO general/general.go:259 delete pod {"appType": "general", "namespace": "default", "podName": "tg-tg-batch-sleep-job-1-0", "podUID": "a7d59740-7b56-4b81-931c-3cc00ed12042"}
2021-01-29T16:40:12.022-0800 DEBUG cache/context.go:424 NotifyTaskComplete {"appID": "batch-sleep-job-1", "taskID": "a7d59740-7b56-4b81-931c-3cc00ed12042"}
2021-01-29T16:40:12.022-0800 DEBUG cache/context.go:428 release allocation {"appID": "batch-sleep-job-1", "taskID": "a7d59740-7b56-4b81-931c-3cc00ed12042"}
2021-01-29T16:40:12.022-0800 DEBUG cache/task.go:392 prepare to send release request {"applicationID": "batch-sleep-job-1", "taskID": "a7d59740-7b56-4b81-931c-3cc00ed12042", "taskAlias": "default/tg-tg-batch-sleep-job-1-0", "allocationUUID": "27ea8874-261f-4448-bc0f-5ea14d23f9e8", "task": "Completed", "terminationType": "PLACEHOLDER_REPLACED"}
2021-01-29T16:40:12.022-0800 INFO cache/task.go:430 releasing allocations {"numOfAsksToRelease": 0, "numOfAllocationsToRelease": 1}
2021-01-29T16:40:12.022-0800 DEBUG scheduler/partition.go:1134 replacing placeholder allocation {"appID": "batch-sleep-job-1", "allocationId": "27ea8874-261f-4448-bc0f-5ea14d23f9e8"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x190cf5d]
{code}
That shows the shim has no idea that the pod or task has already been terminated; or, if it does know, it ignores that state and sends the same request again. This is exactly the issue described in YUNIKORN-229. The pod is removed from the cache, but the pod and task tracking are clearly not linked correctly to the cache, as the task hangs around. That *must* be fixed.
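As an illustration of the kind of fix the shim needs, here is a minimal Go sketch, not the actual YuniKorn code: the Task type, the state constants, and the releaseOnce helper are all hypothetical. The idea is simply that the first terminating event moves the task to Completed and sends the release, and any later event for the same pod is ignored instead of being forwarded to the core again.

```go
package main

import (
	"fmt"
	"sync"
)

// TaskState is a hypothetical stand-in for the shim's task state machine.
type TaskState int

const (
	Bound TaskState = iota
	Completed
)

type Task struct {
	sync.Mutex
	ID    string
	state TaskState
}

// releaseOnce sends the release request only on the first transition to
// Completed; duplicate termination/delete events become no-ops. It returns
// true if the release was actually sent.
func (t *Task) releaseOnce(send func(taskID string)) bool {
	t.Lock()
	defer t.Unlock()
	if t.state == Completed {
		return false // already released, ignore the duplicate event
	}
	t.state = Completed
	send(t.ID)
	return true
}

func main() {
	calls := 0
	task := &Task{ID: "a7d59740"}
	task.releaseOnce(func(string) { calls++ }) // pod terminated event
	task.releaseOnce(func(string) { calls++ }) // pod delete event: suppressed
	fmt.Println(calls)                         // prints 1
}
```

With this guard, the second trigger point in the log above would never produce a second release request, regardless of what the core does.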
> Nil pointer exception while getting both termination and delete pod event
> -------------------------------------------------------------------------
>
> Key: YUNIKORN-528
> URL: https://issues.apache.org/jira/browse/YUNIKORN-528
> Project: Apache YuniKorn
> Issue Type: Sub-task
> Components: shim - kubernetes
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Blocker
> Labels: pull-request-available
> Attachments: nil.log
>
>
> During the test, I observed on some occasions the scheduler could run into Nil pointer exception like below:
> {code}
> 4-261f-4448-bc0f-5ea14d23f9e8"}
> panic: runtime error: invalid memory address or nil pointer dereference
> [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x190cf5d]
> goroutine 114 [running]:
> github.com/apache/incubator-yunikorn-core/pkg/scheduler/objects.(*Application).ReplaceAllocation(0xc004250000, 0xc0038e01b0, 0x24, 0x0)
> /Users/wyang/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20210126213806-78bf4f684709/pkg/scheduler/objects/application.go:1026 +0xcd
> github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*PartitionContext).removeAllocation(0xc0026de600, 0xc0003c0a10, 0x0, 0x0, 0x0, 0x0)
> /Users/wyang/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20210126213806-78bf4f684709/pkg/scheduler/partition.go:1137 +0x14b5
> github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*ClusterContext).processAllocationReleases(0xc0001400f0, 0xc0066400c0, 0x1, 0x1, 0x7ffeefbff80f, 0x9)
> /Users/wyang/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20210126213806-78bf4f684709/pkg/scheduler/context.go:683 +0x150
> github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*ClusterContext).processAllocations(0xc0001400f0, 0xc006730000)
> /Users/wyang/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20210126213806-78bf4f684709/pkg/scheduler/context.go:606 +0x185
> github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*ClusterContext).processRMUpdateEvent(0xc0001400f0, 0xc0066ee0b8)
> /Users/wyang/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20210126213806-78bf4f684709/pkg/scheduler/context.go:213 +0x77
> github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*Scheduler).handleRMEvent(0xc00000e3c0)
> /Users/wyang/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20210126213806-78bf4f684709/pkg/scheduler/scheduler.go:112 +0x416
> created by github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*Scheduler).StartService
> /Users/wyang/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20210126213806-78bf4f684709/pkg/scheduler/scheduler.go:54 +0xa2
> make: *** [run] Error 2
> {code}
> the root cause is that when the shim deletes a placeholder, it can sometimes trigger 2 events:
> * Pod Update
> * Pod Delete
> Both when a pod is updated to the TERMINATED state and when a pod gets DELETED, the shim sends a release request to the core. When the second release request arrives, the previous one has already removed the allocation, so we hit the Nil pointer. We need to avoid sending the release a second time if the pod has already been released.
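On the core side, the defensive half of the fix can be sketched as follows. This is a hypothetical simplification, not the real ReplaceAllocation in application.go: the point is only that looking up the allocation for a UUID must tolerate the case where an earlier release already removed it, returning nil for the caller to check instead of dereferencing a missing entry.

```go
package main

import "fmt"

// Allocation and Application are hypothetical stand-ins for the core's
// scheduler objects.
type Allocation struct {
	UUID string
}

type Application struct {
	allocations map[string]*Allocation
}

// ReplaceAllocation removes and returns the allocation for uuid. A second
// call with the same uuid finds nothing and returns nil, so the caller can
// log and skip the duplicate release instead of panicking.
func (a *Application) ReplaceAllocation(uuid string) *Allocation {
	alloc, ok := a.allocations[uuid]
	if !ok {
		return nil // already released: nothing to replace
	}
	delete(a.allocations, uuid)
	return alloc
}

func main() {
	app := &Application{allocations: map[string]*Allocation{
		"27ea8874": {UUID: "27ea8874"},
	}}
	first := app.ReplaceAllocation("27ea8874")
	second := app.ReplaceAllocation("27ea8874")
	fmt.Println(first != nil, second == nil) // prints: true true
}
```

Together with the shim-side guard, this makes the duplicate release harmless even if one side of the fix regresses.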
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@yunikorn.apache.org
For additional commands, e-mail: issues-help@yunikorn.apache.org