Posted to issues@yunikorn.apache.org by "Weiwei Yang (Jira)" <ji...@apache.org> on 2020/08/28 05:24:00 UTC

[jira] [Comment Edited] (YUNIKORN-386) Pass applicationID for spark pods

    [ https://issues.apache.org/jira/browse/YUNIKORN-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186261#comment-17186261 ] 

Weiwei Yang edited comment on YUNIKORN-386 at 8/28/20, 5:23 AM:
----------------------------------------------------------------

hi [~kmarton]

Looks like we can solve this by adding a label to the spark driver/executor pods. There are 2 cases.
1. For spark jobs submitted by spark-submit
We need to pass the conf {{spark.kubernetes.driver.label.[LabelName]}}, something like {{spark.kubernetes.driver.label.applicationId}} (and the same for the executor via {{spark.kubernetes.executor.label.[LabelName]}}). Then in [our code|https://github.com/apache/incubator-yunikorn-k8shim/blob/ac345b38432b416aede47e3d51d5e120011c56fd/pkg/common/utils/utils.go#L84-L105] we can identify the appID first by that label, and fall back to {{spark-app-selector}}.
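For example, a spark-submit invocation could look roughly like this (the {{applicationId}} label name, the app ID value, and the image/jar paths are all placeholders, not settled names):

```shell
spark-submit \
  --master k8s://https://<api-server>:6443 \
  --deploy-mode cluster \
  --name my-spark-app \
  --conf spark.kubernetes.driver.label.applicationId=my-spark-app-001 \
  --conf spark.kubernetes.executor.label.applicationId=my-spark-app-001 \
  --conf spark.kubernetes.container.image=<spark-image> \
  local:///opt/spark/examples/jars/spark-examples.jar
```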

2. For spark jobs submitted by the spark-k8s-operator
We can do something like [this|https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/c19d2b8660d80ce561be28d76046ed60393a83d9/pkg/batchscheduler/volcano/volcano_scheduler.go#L105-L106]. I believe options set there will be picked up by the spark pods; see the logic [here|https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/c19d2b8660d80ce561be28d76046ed60393a83d9/pkg/controller/sparkapplication/submission.go#L286-L289].
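In either case, the shim-side lookup in utils.go could then prefer the new label and fall back to {{spark-app-selector}}. A rough sketch of that order (the {{applicationId}} label name and the function shape here are illustrative, not the shim's actual constants or code):

```go
package main

import "fmt"

// getApplicationID resolves the app ID for a spark pod from its labels:
// first from the proposed applicationId label, then falling back to the
// spark-app-selector label that Spark already sets on driver/executor pods.
// (Label names are illustrative placeholders, not the shim's real constants.)
func getApplicationID(labels map[string]string) (string, bool) {
	if id, ok := labels["applicationId"]; ok && id != "" {
		return id, true
	}
	if id, ok := labels["spark-app-selector"]; ok && id != "" {
		return id, true
	}
	return "", false
}

func main() {
	podLabels := map[string]string{
		"spark-app-selector": "spark-abc123",
		"applicationId":      "my-app-001",
	}
	id, _ := getApplicationID(podLabels)
	fmt.Println(id) // prints my-app-001: the new label wins over spark-app-selector
}
```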



> Pass applicationID for spark pods
> ---------------------------------
>
>                 Key: YUNIKORN-386
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-386
>             Project: Apache YuniKorn
>          Issue Type: Sub-task
>            Reporter: Kinga Marton
>            Priority: Major
>
> Right now we use {{spark-app-selector}} as the label for the applicationID for Spark pods: https://github.com/apache/incubator-yunikorn-k8shim/blob/master/pkg/common/utils/utils.go#L87
> When linking the Spark pod group to the CRD we use the application ID, but for the CRD we use the {{namespace-name}} convention as the applicationID, while for the spark pod group we use {{spark-app-selector}}. This results in two different applications internally: one for the CRD and one for the Spark pod group.
> I think we should change this label to something else that we can modify easily, without any side effects.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
