Posted to dev@yunikorn.apache.org by "Manikandan R (Jira)" <ji...@apache.org> on 2023/02/27 04:34:00 UTC

[jira] [Created] (YUNIKORN-1605) unit tests for preempted placeholder in the placeholder data

Manikandan R created YUNIKORN-1605:
--------------------------------------

             Summary: unit tests for preempted placeholder in the placeholder data
                 Key: YUNIKORN-1605
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1605
             Project: Apache YuniKorn
          Issue Type: Test
          Components: core - scheduler
            Reporter: Manikandan R
            Assignee: Manikandan R


First test is generic placeholder pre-emption:

Create a job with placeholders:
small enough to fit within the queue quota
larger than the current free space of the queue.
This will leave the placeholders running for a long time, as they need to time out.
Ideally this produces a node that is fully loaded with gang placeholders.
Then create a daemon set that must run on that node.
The daemon set pod must be large enough that it does not fit on the node.
That should trigger the placeholder pre-emption.
Before the fix: the placeholder data for the app does not show that the placeholder was removed
After the fix: the placeholder data for the app shows a removed placeholder
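
The trigger condition for this first test can be sketched as follows. This is a minimal illustration only, not YuniKorn's actual pre-emption code: the `alloc` and `node` types and the `victimsFor` helper are hypothetical names, and resources are reduced to a single integer (GB) for brevity.

```go
package main

import "fmt"

// alloc is a hypothetical, simplified allocation on a node.
type alloc struct {
	size        int  // resource units, e.g. GB
	placeholder bool // true if this allocation is a gang placeholder
}

// node is a hypothetical, simplified node with a fixed capacity.
type node struct {
	capacity int
	allocs   []alloc
}

// free returns the unallocated capacity of the node.
func (n node) free() int {
	used := 0
	for _, a := range n.allocs {
		used += a.size
	}
	return n.capacity - used
}

// victimsFor returns how many placeholders would have to be released
// before an ask of the given size fits, and whether it can fit at all.
func victimsFor(n node, ask int) (int, bool) {
	free := n.free()
	victims := 0
	for _, a := range n.allocs {
		if free >= ask {
			break
		}
		if a.placeholder {
			free += a.size
			victims++
		}
	}
	return victims, free >= ask
}

func main() {
	// An 8GB node fully loaded with four 2GB placeholders.
	n := node{capacity: 8, allocs: []alloc{{2, true}, {2, true}, {2, true}, {2, true}}}
	victims, ok := victimsFor(n, 1) // 1GB daemon set pod
	fmt.Println(n.free(), victims, ok)
}
```

With the node fully loaded, the 1GB pod does not fit, and releasing a single placeholder is enough to make room. That released placeholder is the one the test expects to see recorded in the placeholder data.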

Second test is node removal with placeholders:

Create a job with placeholders:
small enough to fit within the queue quota
larger than the current free space of the queue.
This will leave the placeholders running for a long time, as they need to time out.
Remove a node that holds at least 1 placeholder.
Before the fix: the placeholder data for the app does not show that the placeholder was removed
After the fix: the placeholder data for the app shows a removed placeholder

Third test is killing a placeholder:

Create a job with placeholders:
small enough to fit within the queue quota
larger than the current free space of the queue.
This will leave the placeholders running for a long time, as they need to time out.
Mimic a removal of the placeholder via kubectl by creating an allocation release request and sending it to the partition with the termination type STOPPED_BY_RM.
Before the fix: the placeholder data for the app does not show that the placeholder was removed
After the fix: the placeholder data for the app shows a removed placeholder
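
The expected bookkeeping for this third test can be sketched as below. Everything here is a hypothetical stand-in rather than the real scheduler interface: `releaseRequest`, `placeholderData`, `app`, and the `release` method are illustrative names, and only the termination type name STOPPED_BY_RM comes from this issue. The sketch models the "after the fix" behaviour.

```go
package main

import "fmt"

// terminationType is a hypothetical local stand-in for the real enum.
type terminationType string

// stoppedByRM mirrors the STOPPED_BY_RM termination type named in the issue.
const stoppedByRM terminationType = "STOPPED_BY_RM"

// releaseRequest is a hypothetical, simplified allocation release request.
type releaseRequest struct {
	applicationID string
	allocationKey string
	termination   terminationType
}

// placeholderData is a hypothetical set of per-application counters.
type placeholderData struct {
	count   int // placeholders created
	removed int // placeholders removed by the RM
}

// app is a hypothetical application that tracks its live placeholders.
type app struct {
	id           string
	placeholders map[string]bool
	data         placeholderData
}

// release drops the placeholder named in the request and, for a
// STOPPED_BY_RM release, records it as removed. It returns false if the
// request does not match a live placeholder of this application.
func (a *app) release(r releaseRequest) bool {
	if r.applicationID != a.id || !a.placeholders[r.allocationKey] {
		return false
	}
	delete(a.placeholders, r.allocationKey)
	if r.termination == stoppedByRM {
		a.data.removed++
	}
	return true
}

func main() {
	a := &app{
		id:           "app-1",
		placeholders: map[string]bool{"ph-1": true, "ph-2": true},
		data:         placeholderData{count: 2},
	}
	ok := a.release(releaseRequest{"app-1", "ph-1", stoppedByRM})
	fmt.Println(ok, a.data.count, a.data.removed)
}
```

A unit test along these lines would assert that the removed counter is zero before the release and one afterwards, and that releasing the same placeholder a second time changes nothing.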

Working config:

queue quota max size: 16GB / 16 cpu
nodes: 2 * 8GB / 8 cpu
create an application with allocation: 4GB / 4 cpu
create a gang application requesting: 7 * 2GB / 2 cpu
create a daemon set pod for one of the nodes asking for 1GB / 1 cpu
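
The arithmetic behind this config can be checked with a short sketch. The `Resource` type and the `FitCount` helper are hypothetical names introduced for illustration; only the numbers come from the config above.

```go
package main

import "fmt"

// Resource is a hypothetical, simplified resource pair
// (memory in GB, CPU in cores).
type Resource struct {
	MemGB int
	CPU   int
}

// Sub returns r minus o.
func (r Resource) Sub(o Resource) Resource {
	return Resource{MemGB: r.MemGB - o.MemGB, CPU: r.CPU - o.CPU}
}

// FitCount returns how many allocations of size unit fit into free,
// limited by whichever dimension runs out first.
func FitCount(free, unit Resource) int {
	byMem := free.MemGB / unit.MemGB
	byCPU := free.CPU / unit.CPU
	if byCPU < byMem {
		return byCPU
	}
	return byMem
}

func main() {
	queueMax := Resource{MemGB: 16, CPU: 16}
	appAlloc := Resource{MemGB: 4, CPU: 4}
	placeholder := Resource{MemGB: 2, CPU: 2}
	requested := 7

	// Free space left in the queue after the first application.
	free := queueMax.Sub(appAlloc)
	running := FitCount(free, placeholder)
	if running > requested {
		running = requested
	}
	fmt.Printf("queue free: %dGB / %d cpu\n", free.MemGB, free.CPU)
	fmt.Printf("placeholders running: %d, pending: %d\n", running, requested-running)
}
```

Six of the seven placeholders fit in the 12GB / 12 cpu that remain, so one stays pending and the gang cannot complete. Assuming the scheduler packs the allocations onto both nodes, the 4GB application plus six 2GB placeholders consume all 16GB, which means the 1GB daemon set pod only fits if a placeholder is pre-empted.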


