Posted to issues@flink.apache.org by "Biao Geng (Jira)" <ji...@apache.org> on 2022/05/11 15:42:00 UTC

[jira] [Comment Edited] (FLINK-27329) Add default value of replica of JM pod and not declare it in example yamls

    [ https://issues.apache.org/jira/browse/FLINK-27329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534972#comment-17534972 ] 

Biao Geng edited comment on FLINK-27329 at 5/11/22 3:41 PM:
------------------------------------------------------------

Hi [~wangyang0918] [~gyfora], I revisited the default values in our *Spec* classes and summarized them in the table below. 
IMO, most of them work well with a `null` default value, but besides JobManagerSpec#replicas, there are some fields that I believe we can improve:
 # *JobManagerSpec#resource#cpu &* *TaskManagerSpec#resource#cpu:* the current default value is 0, which is not consistent with upstream Flink. I think changing the default value to 1.0 is better because: a) for the JM, if users do not specify it explicitly, Flink uses 1.0; b) for the TM, if users do not specify it explicitly, Flink uses NUM_TASK_SLOTS, whose default value is also 1 in Flink's default flink-conf.yaml.
 # *JobSpec#parallelism:* the current default value is 0, which is illegal. But I am not sure whether it is possible/proper for us to read the value of parallelism.default from flink-conf.yaml in the *JobSpec* constructor. I tend to leave it as it is or use `1` as the default.
 # *FlinkDeploymentSpec#serviceAccount:* the current default value is null and, as a result, if we do not specify it, Flink uses the `default` service account. This can be problematic because we use `flink` as the default in our helm chart's values.yaml. I am not sure why we expose it as a first-class field, but `flink` may be a good candidate default value.
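
To make the three proposals above concrete, here is a minimal sketch of how such fallbacks could be resolved. The class `SpecDefaults` and its methods are hypothetical and are not part of the operator; the plain `Map` only stands in for the parsed flink-conf.yaml.

```java
import java.util.Map;

// Hypothetical helper for illustration only; none of these names exist in the operator.
class SpecDefaults {
    static final String PARALLELISM_KEY = "parallelism.default";

    // Proposal 1: treat an unset (non-positive) cpu value as 1.0.
    static double resolveCpu(double specCpu) {
        return specCpu > 0 ? specCpu : 1.0;
    }

    // Proposal 2: prefer the spec value, then parallelism.default, then 1.
    // Assumes the configured value, if present, is a well-formed integer.
    static int resolveParallelism(int specParallelism, Map<String, String> flinkConf) {
        if (specParallelism > 0) {
            return specParallelism;
        }
        String configured = flinkConf.get(PARALLELISM_KEY);
        if (configured != null) {
            int value = Integer.parseInt(configured.trim());
            if (value > 0) {
                return value;
            }
        }
        return 1;
    }

    // Proposal 3: fall back to the helm chart's service account name.
    static String resolveServiceAccount(String specServiceAccount) {
        return specServiceAccount != null ? specServiceAccount : "flink";
    }

    public static void main(String[] args) {
        System.out.println(resolveCpu(0));
        System.out.println(resolveParallelism(0, Map.of(PARALLELISM_KEY, "4")));
        System.out.println(resolveServiceAccount(null));
    }
}
```

Whether the lookup of parallelism.default belongs in the *JobSpec* constructor or in a later resolution step like this one is exactly the open question in item 2.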


| Field                                  | Default Value in Upstream Flink                              | Current Default Value in k8s Operator |
| FlinkDeploymentSpec#imagePullPolicy    | KubernetesConfigOptions.ImagePullPolicy.IfNotPresent         | null                                  |
| FlinkDeploymentSpec#image              | KubernetesConfigOptions#getDefaultFlinkImage()               | null                                  |
| **FlinkDeploymentSpec#serviceAccount** | "default"                                                    | null                                  |
| FlinkDeploymentSpec#flinkVersion       | does not exist                                               | null                                  |
| FlinkDeploymentSpec#IngressSpec        | does not exist                                               | null                                  |
| FlinkDeploymentSpec#podTemplate        | no default value                                             | null                                  |
| **JobManagerSpec#replicas**            | does not exist                                               | 0                                     |
| JobManagerSpec#resource#memory         | 1600M (defined in flink-conf.yaml)                           | null                                  |
| **JobManagerSpec#resource#cpu**        | 1.0                                                          | 0                                     |
| JobManagerSpec#podTemplate             | no default value                                             | null                                  |
| TaskManagerSpec#resource#memory        | 1728m (defined in flink-conf.yaml)                           | null                                  |
| **TaskManagerSpec#resource#cpu**       | NUM_TASK_SLOTS (whose default value is 1 in flink-conf.yaml) | 0                                     |
| TaskManagerSpec#podTemplate            | no default value                                             | null                                  |
| JobSpec#jarURI                         | no default value                                             | null                                  |
| **JobSpec#parallelism**                | parallelism.default in flink-conf.yaml                       | 0                                     |
| JobSpec#entryClass                     | no default value                                             | null                                  |
| JobSpec#args                           | no default value                                             | String[0]                             |
| JobSpec#state                          | does not exist                                               | JobState.RUNNING                      |
| JobSpec#savepointTriggerNonce          | does not exist                                               | null                                  |
| JobSpec#initialSavepointPath           | does not exist                                               | null                                  |
| JobSpec#upgradeMode                    | does not exist                                               | UpgradeMode.STATELESS                 |
| JobSpec#allowNonRestoredState          | does not exist                                               | null                                  |





was (Author: bgeng777):
Hi [~wangyang0918] [~gyfora], I revisited the default values in our *Spec* classes and summarized them in the table below. 
IMO, most of them work well with a `null` default value, but besides JobManagerSpec#replicas, there are some fields that I believe we can improve:
 # *JobManagerSpec#resource#cpu &* *TaskManagerSpec#resource#cpu:* the current default value is 0, which is not consistent with upstream Flink. I think changing the default value to 1.0 is better because: a) for the JM, if users do not specify it explicitly, Flink uses 1.0; b) for the TM, if users do not specify it explicitly, Flink uses NUM_TASK_SLOTS, whose default value is also 1 in Flink's default flink-conf.yaml.
 # *JobSpec#parallelism:* the current default value is 0, which is illegal. But I am not sure whether it is possible/proper for us to read the value of parallelism.default from flink-conf.yaml in the *JobSpec* constructor. I tend to leave it as it is or use `1` as the default.
 # *FlinkDeploymentSpec#serviceAccount:* the current default value is null and, as a result, if we do not specify it, Flink uses the `default` service account. This can be problematic because we use `flink` as the default in our helm chart's values.yaml. I am not sure why we expose it as a first-class field, but `flink` may be a good candidate default value.

 
|Field|Default Value in Upstream Flink Native K8s|Current Default Value in K8s Operator|
|FlinkDeploymentSpec#imagePullPolicy|KubernetesConfigOptions.ImagePullPolicy.IfNotPresent|null|
|FlinkDeploymentSpec#image|KubernetesConfigOptions#getDefaultFlinkImage()|null|
|*FlinkDeploymentSpec#serviceAccount*|"default"|null|
|FlinkDeploymentSpec#flinkVersion|\|null|
|FlinkDeploymentSpec#IngressSpec|\|null|
|FlinkDeploymentSpec#podTemplate|no default value|null|
|*JobManagerSpec#replicas*|1|0|
|JobManagerSpec#resource#memory|1600M (defined in flink-conf.yaml)|null|
|*JobManagerSpec#resource#cpu*|1.0|0|
|JobManagerSpec#podTemplate|no default value|null|
|TaskManagerSpec#resource#memory|1728m (defined in flink-conf.yaml)|null|
|*TaskManagerSpec#resource#cpu*|NUM_TASK_SLOTS (whose default value is 1 in flink-conf.yaml)|0|
|TaskManagerSpec#podTemplate|no default value|null|
|JobSpec#jarURI|no default value|null|
|*JobSpec#parallelism*|parallelism.default in flink-conf.yaml|0|
|JobSpec#entryClass|no default value|null|
|JobSpec#args|no default value|String[0]|
|JobSpec#state|\|JobState.RUNNING|
|JobSpec#savepointTriggerNonce|\|null|
|JobSpec#initialSavepointPath|\|null|
|JobSpec#upgradeMode|\|UpgradeMode.STATELESS|
|JobSpec#allowNonRestoredState|\|null|

> Add default value of replica of JM pod and not declare it in example yamls
> --------------------------------------------------------------------------
>
>                 Key: FLINK-27329
>                 URL: https://issues.apache.org/jira/browse/FLINK-27329
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kubernetes Operator
>            Reporter: Biao Geng
>            Assignee: Biao Geng
>            Priority: Critical
>             Fix For: kubernetes-operator-1.0.0
>
>
> Currently, we do not explicitly set the default value of `replica` in `JobManagerSpec`. As a result, Java sets the default value to zero. 
> Besides, in our examples, we explicitly declare `replica` in `JobManagerSpec` to be 1. 
> After a deeper look while debugging the exception thrown in FLINK-27310, we find it would be better to set the default value of the `replica` field to 1 and remove the declaration from the examples, for the following reasons:
> 1. A normal Session or Application cluster should have at least one JM. The current default value, zero, does not match the common case.
> 2. One JM works for k8s HA mode as well, and if users really want to launch a standby JM for faster recovery, they can declare the value of the `replica` field in the yaml file. In the examples, we just use the new default value (i.e. 1), which should be fine.
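
The behaviour described above follows from Java's rule that an unassigned `int` field is initialized to 0. A minimal sketch of the difference, using hypothetical stand-in classes rather than the operator's actual `JobManagerSpec`:

```java
// Hypothetical stand-ins for illustration; not the operator's real classes.
class JobManagerSpecSketch {

    // No initializer: Java sets the int field to 0 when the object is created.
    static class WithoutDefault {
        int replicas;
    }

    // Explicit initializer: a spec that omits the field still gets one JM.
    static class WithDefault {
        int replicas = 1;
    }

    public static void main(String[] args) {
        System.out.println("without default: " + new WithoutDefault().replicas);
        System.out.println("with default:    " + new WithDefault().replicas);
    }
}
```

With the explicit initializer, a yaml file that leaves the field out would presumably deserialize to one JM, while users who want a standby JM can still set a larger value.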



--
This message was sent by Atlassian Jira
(v8.20.7#820007)