Posted to dev@yunikorn.apache.org by "Wilfred Spiegelenburg (Jira)" <ji...@apache.org> on 2023/05/16 04:23:00 UTC
[jira] [Resolved] (YUNIKORN-1708) Filtered owner references for placeholder pods.
[ https://issues.apache.org/jira/browse/YUNIKORN-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wilfred Spiegelenburg resolved YUNIKORN-1708.
---------------------------------------------
Fix Version/s: 1.3.0
Resolution: Fixed
All placeholders now get the originator pod as their owner, rather than whichever owner the originator pod happens to have.
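
The difference between the two behaviours can be sketched roughly as follows. This is an illustrative model only, not the shim's actual code: the `OwnerReference` and `Pod` classes, both function names, the driver pod name and UID used in the demo, and the `controller` / `block_owner_deletion` defaults are all assumptions made for the example. Only the ConfigMap reference is taken from the placeholder spec quoted in the issue.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OwnerReference:
    """Hypothetical subset of the Kubernetes OwnerReference fields."""
    api_version: str
    kind: str
    name: str
    uid: str
    controller: bool = False          # assumed default for this sketch
    block_owner_deletion: bool = True  # assumed default for this sketch


@dataclass
class Pod:
    """Hypothetical, trimmed-down pod model for illustration only."""
    name: str
    uid: str
    owner_references: List[OwnerReference]


def inherited_owner_refs(originator: Pod) -> List[OwnerReference]:
    # Behaviour described in the issue: the placeholder ends up with the
    # same owner as the originator pod (here, the driver ConfigMap).
    return list(originator.owner_references)


def originator_owner_refs(originator: Pod) -> List[OwnerReference]:
    # Behaviour described in the resolution: the originator pod itself
    # becomes the single owner of every placeholder.
    return [
        OwnerReference(
            api_version="v1",
            kind="Pod",
            name=originator.name,
            uid=originator.uid,
        )
    ]


if __name__ == "__main__":
    config_map = OwnerReference(
        api_version="batch/v1",
        kind="ConfigMap",
        name="000000031tu35ohgkc6-spark-defaults",
        uid="a3044750-c8b5-47b4-9efa-81bd4b064798",
    )
    # The driver pod name and UID below are made up for the demo.
    driver = Pod("spark-driver", "11111111-2222-3333-4444-555555555555", [config_map])
    print([ref.kind for ref in inherited_owner_refs(driver)])
    print([ref.kind for ref in originator_owner_refs(driver)])
```

With the second approach, deleting the originator pod leaves the placeholder without a live owner, so Kubernetes garbage collection can remove it even when the ConfigMap is still present.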
> Filtered owner references for placeholder pods.
> -----------------------------------------------
>
> Key: YUNIKORN-1708
> URL: https://issues.apache.org/jira/browse/YUNIKORN-1708
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: shim - kubernetes
> Reporter: Junyoung Park
> Assignee: Qi Zhu
> Priority: Major
> Labels: AWS, pull-request-available
> Fix For: 1.3.0
>
>
> In the AWS EMR on EKS service, the real driver pod's ownerReference is a ConfigMap,
> and the placeholder's ownerReference is the same driver ConfigMap.
> When a user cancels an emr-containers job, the job-submitter is terminated,
> but the placeholder remains in the Pending state.
> [https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/emr-eks.html]
>
> *Environment*
> * EKS 1.22
> * EMR 6.9 release (Spark 3.3.0)
> * Yunikorn 1.2
> * gang scheduling enabled
>
> *placeholders event log*
> {code:java}
> Events:
>   Type     Reason              Age   From       Message
>   ----     ------              ----  ----       -------
>   Normal   Scheduling          19m   yunikorn   namespace/tg-driver-spark-000000031ttjn13iom3-0 is queued and waiting for allocation
>   Normal   PodUnschedulable    19m   yunikorn   Task namespace/tg-driver-spark-000000031ttjn13iom3-0 is pending for the requested resources become available
>   Warning  FailedProvisioning  19m   karpenter  Failed to provision new node
> {code}
>
> *placeholders spec*
> {code:java}
> apiVersion: v1
> kind: Pod
> metadata:
>   name: tg-driver-spark-000000031tu35ohgkc6-0
>   namespace: namespace
>   uid: 80601a03-565c-4d0e-88c7-8c66b590871e
>   resourceVersion: '546358515'
>   creationTimestamp: '2023-04-26T15:06:06Z'
>   labels:
>     applicationId: spark-000000031tu35ohgkc6
>     placeholder: 'true'
>     queue: root.beta
>   annotations:
>     yunikorn.apache.org/placeholder: 'true'
>     yunikorn.apache.org/schedulingPolicyParameters: placeholderTimeoutSeconds=300
>     yunikorn.apache.org/task-group-name: driver
>     yunikorn.apache.org/task-groups: >-
>       [{"name": "driver","minResource":{"cpu":
>       "1","memory":"2Gi"},"minMember":1,"nodeSelector":{"karpenter.sh/provisioner-name":"test"}},{"name":
>       "executor","minResource":{"cpu":
>       "1","memory":"5Gi"},"minMember":1,"nodeSelector":{"karpenter.sh/provisioner-name":"test"}}]
>   ownerReferences:
>     - apiVersion: batch/v1
>       kind: ConfigMap
>       name: 000000031tu35ohgkc6-spark-defaults
>       uid: a3044750-c8b5-47b4-9efa-81bd4b064798
>       controller: false
>       blockOwnerDeletion: true
>   managedFields:
>     - manager: k8s_yunikorn_scheduler
>       operation: Update
>       apiVersion: v1
>       time: '2023-04-26T15:06:08Z'
>       fieldsType: FieldsV1
>       fieldsV1:
>         f:status:
>           f:conditions:
>             .: {}
>             k:{"type":"PodScheduled"}:
>               .: {}
>               f:lastProbeTime: {}
>               f:lastTransitionTime: {}
>               f:message: {}
>               f:reason: {}
>               f:status: {}
>               f:type: {}
>       subresource: status
>   selfLink: >-
>     /api/v1/namespaces/namespace/pods/tg-driver-spark-000000031tu35ohgkc6-0
> status:
>   phase: Pending
>   conditions:
>     - type: PodScheduled
>       status: 'False'
>       lastProbeTime: null
>       lastTransitionTime: '2023-04-26T15:06:08Z'
>       reason: Unschedulable
>       message: request is waiting for cluster resources become available
>   qosClass: Burstable
> spec:
>   volumes:
>     - name: kube-api-access-gvxxk
>       projected:
>         sources:
>           - serviceAccountToken:
>               expirationSeconds: 3607
>               path: token
>           - configMap:
>               name: kube-root-ca.crt
>               items:
>                 - key: ca.crt
>                   path: ca.crt
>           - downwardAPI:
>               items:
>                 - path: namespace
>                   fieldRef:
>                     apiVersion: v1
>                     fieldPath: metadata.namespace
>         defaultMode: 420
>   containers:
>     - name: pause
>       image: registry.k8s.io/pause:3.7
>       resources:
>         requests:
>           cpu: '1'
>           memory: 2Gi
>       volumeMounts:
>         - name: kube-api-access-gvxxk
>           readOnly: true
>           mountPath: /var/run/secrets/kubernetes.io/serviceaccount
>       terminationMessagePath: /dev/termination-log
>       terminationMessagePolicy: File
>       imagePullPolicy: IfNotPresent
>   restartPolicy: Never
>   terminationGracePeriodSeconds: 30
>   nodeSelector:
>     karpenter.sh/provisioner-name: test
>   serviceAccountName: default
>   serviceAccount: default
>   securityContext:
>     runAsUser: 1000
>     runAsGroup: 3000
>   schedulerName: yunikorn
>   priority: 0
>   preemptionPolicy: PreemptLowerPriority
> {code}
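
The gang scheduling annotations in the placeholder spec above are compact, so a small sketch for unpacking them may help when reading similar pods. The two annotation values are copied from the spec; the function names are hypothetical, and treating the policy parameters as space-separated key=value pairs is an assumption of this example.

```python
import json

# Value of yunikorn.apache.org/task-groups from the spec, with the folded
# YAML lines joined by single spaces.
TASK_GROUPS = (
    '[{"name": "driver","minResource":{"cpu": "1","memory":"2Gi"},"minMember":1,'
    '"nodeSelector":{"karpenter.sh/provisioner-name":"test"}},'
    '{"name": "executor","minResource":{"cpu": "1","memory":"5Gi"},"minMember":1,'
    '"nodeSelector":{"karpenter.sh/provisioner-name":"test"}}]'
)

# Value of yunikorn.apache.org/schedulingPolicyParameters from the spec.
POLICY_PARAMS = "placeholderTimeoutSeconds=300"


def parse_task_groups(raw: str) -> dict:
    """Index the task group definitions by their name."""
    return {group["name"]: group for group in json.loads(raw)}


def parse_policy_params(raw: str) -> dict:
    """Split key=value pairs (assumed to be space-separated) into a dict."""
    params = {}
    for item in raw.split():
        key, _, value = item.partition("=")
        params[key] = value
    return params


if __name__ == "__main__":
    for name, group in parse_task_groups(TASK_GROUPS).items():
        print(name, group["minMember"], group["minResource"])
    print(parse_policy_params(POLICY_PARAMS))
```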
--
This message was sent by Atlassian Jira
(v8.20.10#820010)