Posted to issues@flink.apache.org by "Biao Geng (Jira)" <ji...@apache.org> on 2022/05/11 15:42:00 UTC
[jira] [Comment Edited] (FLINK-27329) Add default value of replica of JM pod and not declare it in example yamls
[ https://issues.apache.org/jira/browse/FLINK-27329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534972#comment-17534972 ]
Biao Geng edited comment on FLINK-27329 at 5/11/22 3:41 PM:
------------------------------------------------------------
Hi [~wangyang0918] [~gyfora], I revisited the default values in our *Spec* classes and summarized them in the table below.
IMO, most of them work well with a `null` default value, but besides JobManagerSpec#replicas, there are some fields that I believe we can improve:
# *JobManagerSpec#resource#cpu &* *TaskManagerSpec#resource#cpu:* the current default value is 0, which is not consistent with upstream Flink. In my mind, changing the default value to 1.0 is better because: a) for the JM, if users do not specify it explicitly, Flink will use 1.0; b) for the TM, if users do not specify it explicitly, Flink will use NUM_TASK_SLOTS, whose default value is also 1 in Flink's default flink-conf.yaml.
# *JobSpec#parallelism:* the current default value is 0, which is illegal. But I am not sure if it is possible/proper for us to read the value of parallelism.default in flink-conf.yaml in the *JobSpec* constructor. I tend to leave it as it is or use `1` as the default.
# *FlinkDeploymentSpec#serviceAccount:* the current default value is null and, as a result, if we do not specify it, Flink will use the `default` service account. This can be problematic as we use `flink` as the default in our Helm chart's values.yaml. I am not sure why we expose it as a first-class field, but `flink` may be a good candidate default value.
| Field | Default Value in Upstream Flink | Current Default Value in k8s Operator |
| FlinkDeploymentSpec#imagePullPolicy | KubernetesConfigOptions.ImagePullPolicy.IfNotPresent | null |
| FlinkDeploymentSpec#image | KubernetesConfigOptions#getDefaultFlinkImage() | null |
| **FlinkDeploymentSpec#serviceAccount** | "default" | null |
| FlinkDeploymentSpec#flinkVersion | not exist | null |
| FlinkDeploymentSpec#IngressSpec | not exist | null |
| FlinkDeploymentSpec#podTemplate | no default value | null |
| **JobManagerSpec#replicas** | not exist | 0 |
| JobManagerSpec#resource#memory | 1600M (defined in flink-conf.yaml) | null |
| **JobManagerSpec#resource#cpu** | 1.0 | 0 |
| JobManagerSpec#podTemplate | no default value | null |
| TaskManagerSpec#resource#memory | 1728m (defined in flink-conf.yaml) | null |
| **TaskManagerSpec#resource#cpu** | NUM_TASK_SLOTS (whose default value is 1 in flink-conf.yaml) | 0 |
| TaskManagerSpec#podTemplate | no default value | null |
| JobSpec#jarURI | no default value | null |
| **JobSpec#parallelism** | parallelism.default in flink-conf.yaml | 0 |
| JobSpec#entryClass | no default value | null |
| JobSpec#args | no default value | String[0] |
| JobSpec#state | not exist | JobState.RUNNING |
| JobSpec#savepointTriggerNonce | not exist | null |
| JobSpec#initialSavepointPath | not exist | null |
| JobSpec#upgradeMode | not exist | UpgradeMode.STATELESS |
| JobSpec#allowNonRestoredState | not exist | null |
| | | |
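The three proposals above could be expressed as field initializers, roughly as in the following sketch. The classes here are simplified, hypothetical stand-ins written for illustration only; they are not the operator's actual spec classes, and the values are just the candidates discussed in this comment.

```java
// Hypothetical sketch: simplified stand-ins for the operator's spec classes,
// not the actual Flink Kubernetes Operator source.
public class SpecDefaultsSketch {

    static class Resource {
        // Proposed default: 1.0 instead of the implicit Java default of 0.0.
        double cpu = 1.0;
        // Left null so that the value configured in flink-conf.yaml applies.
        String memory;
    }

    static class JobSpec {
        // One of the two options discussed: use 1 instead of the illegal 0.
        int parallelism = 1;
    }

    static class FlinkDeploymentSpec {
        // Candidate default matching the Helm chart's values.yaml.
        String serviceAccount = "flink";
    }

    public static void main(String[] args) {
        System.out.println("cpu=" + new Resource().cpu);
        System.out.println("memory=" + new Resource().memory);
        System.out.println("parallelism=" + new JobSpec().parallelism);
        System.out.println("serviceAccount=" + new FlinkDeploymentSpec().serviceAccount);
    }
}
```

Fields that should keep following upstream configuration (such as memory) stay `null` in this sketch, so only the values that are currently inconsistent or illegal are changed.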
was (Author: bgeng777):
Hi [~wangyang0918] [~gyfora], I revisited the default values in our *Spec* classes and summarized them in the table below.
IMO, most of them work well with a `null` default value, but besides JobManagerSpec#replicas, there are some fields that I believe we can improve:
# *JobManagerSpec#resource#cpu &* *TaskManagerSpec#resource#cpu:* the current default value is 0, which is not consistent with upstream Flink. In my mind, changing the default value to 1.0 is better because: a) for the JM, if users do not specify it explicitly, Flink will use 1.0; b) for the TM, if users do not specify it explicitly, Flink will use NUM_TASK_SLOTS, whose default value is also 1 in Flink's default flink-conf.yaml.
# *JobSpec#parallelism:* the current default value is 0, which is illegal. But I am not sure if it is possible/proper for us to read the value of parallelism.default in flink-conf.yaml in the *JobSpec* constructor. I tend to leave it as it is or use `1` as the default.
# *FlinkDeploymentSpec#serviceAccount:* the current default value is null and, as a result, if we do not specify it, Flink will use the `default` service account. This can be problematic as we use `flink` as the default in our Helm chart's values.yaml. I am not sure why we expose it as a first-class field, but `flink` may be a good candidate default value.
|Field|Default Value in Upstream Flink Native K8s|Current Default Value in K8s Operator|
|FlinkDeploymentSpec#imagePullPolicy|KubernetesConfigOptions.ImagePullPolicy.IfNotPresent|null|
|FlinkDeploymentSpec#image|KubernetesConfigOptions#getDefaultFlinkImage()|null|
|*FlinkDeploymentSpec#serviceAccount*|"default"|null|
|FlinkDeploymentSpec#flinkVersion|\|null|
|FlinkDeploymentSpec#IngressSpec|\|null|
|FlinkDeploymentSpec#podTemplate|no default value|null|
|*JobManagerSpec#replicas*|1|0|
|JobManagerSpec#resource#memory|1600M (defined in flink-conf.yaml)|null|
|*JobManagerSpec#resource#cpu*|1.0|0|
|JobManagerSpec#podTemplate|no default value|null|
|TaskManagerSpec#resource#memory|1728m (defined in flink-conf.yaml)|null|
|*TaskManagerSpec#resource#cpu*|NUM_TASK_SLOTS (whose default value is 1 in flink-conf.yaml)|0|
|TaskManagerSpec#podTemplate|no default value|null|
|JobSpec#jarURI|no default value|null|
|*JobSpec#parallelism*|parallelism.default in flink-conf.yaml|0|
|JobSpec#entryClass|no default value|null|
|JobSpec#args|no default value|String[0]|
|JobSpec#state|\|JobState.RUNNING|
|JobSpec#savepointTriggerNonce|\|null|
|JobSpec#initialSavepointPath|\|null|
|JobSpec#upgradeMode|\|UpgradeMode.STATELESS|
|JobSpec#allowNonRestoredState|\|null|
| | | |
> Add default value of replica of JM pod and not declare it in example yamls
> --------------------------------------------------------------------------
>
> Key: FLINK-27329
> URL: https://issues.apache.org/jira/browse/FLINK-27329
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Reporter: Biao Geng
> Assignee: Biao Geng
> Priority: Critical
> Fix For: kubernetes-operator-1.0.0
>
>
> Currently, we do not explicitly set the default value of `replicas` in `JobManagerSpec`. As a result, Java sets the default value to zero.
> Besides, in our examples, we explicitly declare `replicas` in `JobManagerSpec` to be 1.
> After a deeper look while debugging the exception thrown in FLINK-27310, we found it would be better to set the default value of the `replicas` field to 1 and remove the declaration from the examples, for the following reasons:
> 1. A normal Session or Application cluster should have at least one JM. The current default value, zero, does not match the common case.
> 2. One JM works in k8s HA mode as well, and if users really want to launch a standby JM for faster recovery, they can declare the value of the `replicas` field in the yaml file. In the examples, we just use the new default value (i.e. 1), which should be fine.
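
The behaviour described in the issue can be illustrated with a minimal sketch. Both classes below are hypothetical simplifications made up for this example, not the operator's real `JobManagerSpec`; the point is only that an `int` field without an initializer is zero in Java, while an explicit initializer supplies the proposed default.

```java
// Hypothetical, simplified classes for illustration only.
public class ReplicasDefaultSketch {

    static class JobManagerSpecWithoutDefault {
        // No initializer: Java initializes an int field to 0.
        int replicas;
    }

    static class JobManagerSpecWithDefault {
        // Explicit default: a single JobManager unless the user overrides it.
        int replicas = 1;
    }

    public static void main(String[] args) {
        System.out.println(new JobManagerSpecWithoutDefault().replicas); // prints 0
        System.out.println(new JobManagerSpecWithDefault().replicas);    // prints 1
    }
}
```

With such a default in place, an example yaml can omit the field, and users who want a standby JM can still set it explicitly.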
--
This message was sent by Atlassian Jira
(v8.20.7#820007)