Posted to issues@spark.apache.org by "James Yu (Jira)" <ji...@apache.org> on 2020/10/05 17:58:00 UTC

[jira] [Commented] (SPARK-32067) Use unique ConfigMap name for executor pod template

    [ https://issues.apache.org/jira/browse/SPARK-32067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208224#comment-17208224 ] 

James Yu commented on SPARK-32067:
----------------------------------

Hey [~dongjoon], I noticed that you added 3.1.0 to the `Affects Version/s` of this JIRA, but 3.1.0 has not been released yet. Did you mean to set the `Fix Version/s` to 3.1.0, and this was just a typo?

> Use unique ConfigMap name for executor pod template
> ---------------------------------------------------
>
>                 Key: SPARK-32067
>                 URL: https://issues.apache.org/jira/browse/SPARK-32067
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Kubernetes
>    Affects Versions: 2.4.7, 3.0.1, 3.1.0
>            Reporter: James Yu
>            Priority: Major
>
> THE BUG:
> The bug can be reproduced by using spark-submit to submit two different apps (app1 and app2) with different executor pod templates (e.g., different labels) to K8s sequentially, with app2 launching while app1 is still in the middle of ramping up all its executor pods. The unwanted result is that some of app1's launched executor pods end up having app2's executor pod template applied to them.
> The root cause appears to be that app1's podspec-configmap is overwritten by app2 during the overlapping launch periods, because both apps use the same ConfigMap name. As a result, app1 executor pods that ramp up after app2 is launched are inadvertently launched with app2's pod template. The issue can be seen as follows:
> First, after submitting app1, you get these configmaps:
> {code:java}
> NAMESPACE    NAME                                       DATA    AGE
> default      app1-1111111111111111-driver-conf-map      1       9m46s
> default      podspec-configmap                          1       12m{code}
> Then submit app2 while app1 is still ramping up its executors. The podspec-configmap is modified by app2.
> {code:java}
> NAMESPACE    NAME                                       DATA    AGE
> default      app1-1111111111111111-driver-conf-map      1       11m43s
> default      app2-2222222222222222-driver-conf-map      1       10s
> default      podspec-configmap                          1       13m57s{code}
>  
> PROPOSED SOLUTION:
> Properly prefix the podspec-configmap for each submitted app, ideally the same way as the driver configmap:
> {code:java}
> NAMESPACE    NAME                                       DATA    AGE
> default      app1-1111111111111111-driver-conf-map      1       11m43s
> default      app1-1111111111111111-podspec-configmap    1       13m57s
> default      app2-2222222222222222-driver-conf-map      1       10s 
> default      app2-2222222222222222-podspec-configmap    1       3m{code}
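
To make the collision and the proposed fix concrete, here is a minimal toy sketch in Python. It is not Spark's implementation and does not call the Kubernetes API: the namespace is modelled as a plain dict, `apply_configmap` assumes create-or-replace behaviour (as the overwrite described above suggests), and `prefixed_name` is a hypothetical helper that mirrors how the driver conf map appears to be named in the listings.

```python
from typing import Dict

# Fixed name taken from the listings in the issue description.
SHARED_NAME = "podspec-configmap"


def apply_configmap(store: Dict[str, str], name: str, template: str) -> None:
    """Toy create-or-replace (assumed semantics): a later writer with the same name wins."""
    store[name] = template


def prefixed_name(resource_prefix: str) -> str:
    """Hypothetical helper: build a per-app name, e.g. '<prefix>-podspec-configmap'."""
    return f"{resource_prefix}-{SHARED_NAME}"


# Current behaviour: both apps write to the same name in the same namespace.
shared: Dict[str, str] = {}
apply_configmap(shared, SHARED_NAME, "app1-template")
apply_configmap(shared, SHARED_NAME, "app2-template")  # replaces app1's entry
# An app1 executor created at this point would read app2's template.
print(shared[SHARED_NAME])

# Proposed behaviour: each app writes to its own prefixed name.
unique: Dict[str, str] = {}
app1 = prefixed_name("app1-1111111111111111")
app2 = prefixed_name("app2-2222222222222222")
apply_configmap(unique, app1, "app1-template")
apply_configmap(unique, app2, "app2-template")
# app1's entry is untouched by app2's submission.
print(unique[app1])
```

With the shared name, the second write replaces the first, so only one template survives. With prefixed names, the two entries coexist, which matches the listing in the proposed solution.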



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
