Posted to issues@flink.apache.org by "Márton Balassi (Jira)" <ji...@apache.org> on 2022/12/07 14:48:00 UTC

[jira] [Closed] (FLINK-30313) Flink Kubernetes Operator seems to rely on 1.15 k8s HA code but 1.16 config code

     [ https://issues.apache.org/jira/browse/FLINK-30313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Márton Balassi closed FLINK-30313.
----------------------------------
    Resolution: Cannot Reproduce

We jumped on a call with [~gyfora] to try to reproduce this.

We could not reproduce the issue using the Flink 1.16 image with the latest operator build and the [basic checkpointing example|https://github.com/apache/flink-kubernetes-operator/blob/main/examples/basic-checkpoint-ha.yaml]. We tried both the new and the old HA [configs|https://github.com/apache/flink-kubernetes-operator/blob/main/examples/basic-checkpoint-ha.yaml#L30].

Our educated guess is that the user packages flink-core 1.16 into their user jar, and that is what causes this.
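If that is the case, the usual fix is to keep the Flink runtime dependencies out of the fat jar by marking them as {{provided}} in the user project. A sketch of the relevant pom.xml fragment follows; the artifact id and the 1.15.x version below are assumptions about the user's build, not something confirmed in this ticket:

{code}
<!-- Hypothetical user pom.xml fragment: the Flink runtime classes are
     supplied by the base image, so they must not be bundled into the
     user jar. "provided" keeps flink-core and friends out of the
     shaded/fat jar. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java</artifactId>
  <version>1.15.2</version>
  <scope>provided</scope>
</dependency>
{code}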

 

> Flink Kubernetes Operator seems to rely on 1.15 k8s HA code but 1.16 config code
> --------------------------------------------------------------------------------
>
>                 Key: FLINK-30313
>                 URL: https://issues.apache.org/jira/browse/FLINK-30313
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.2.0
>            Reporter: Matthias Pohl
>            Assignee: Márton Balassi
>            Priority: Critical
>             Fix For: kubernetes-operator-1.3.0
>
>
> Based on [this SO post|https://stackoverflow.com/questions/74599009/flink1-16-high-availability-job-manager], it looks like there can be a setup where we use different versions of Flink:
> {code}
> 2022-11-28 08:57:56.032 [main] INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint  - Shutting KubernetesApplicationClusterEntrypoint down with application status FAILED. Diagnostics java.lang.NoSuchFieldError: USE_OLD_HA_SERVICES
>         at org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.createHAServices(KubernetesHaServicesFactory.java:37)
>         at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:296)
>         at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:139)
>         at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:439)
>         at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:382)
>         at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:282)
>         at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:232)
>         at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>         at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:229)
>         at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:729)
>         at org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.main(KubernetesApplicationClusterEntrypoint.java:86)
> {code}
> This field was removed in FLINK-25806 with Flink 1.16. I'd guess that it's caused by the Flink operator still depending on Flink 1.15.x while there is already an option to use Flink 1.16 for deployments.
> Another SO user mentioned:
> {quote}
> I have the same issue with flink-kubernetes-operator. This field NoSuchFieldError: USE_OLD_HA_SERVICES was removed in flink 1.16, but error occurs with setting: flinkVersion: v1_15
> {quote}
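The {{NoSuchFieldError}} in the stack trace above can be probed for directly. The following is a hypothetical diagnostic class (the class name is invented; the field location is inferred from the stack trace, where {{KubernetesHaServicesFactory}} reads it from {{HighAvailabilityOptions}}) that looks the field up reflectively, so it reports a stale pre-1.16 flink-core on the classpath instead of crashing the entrypoint:

{code}
// Hypothetical diagnostic, not part of Flink: probes the classpath for the
// HighAvailabilityOptions field that FLINK-25806 removed in Flink 1.16.
public class HaFieldCheck {

    /**
     * Returns true if a pre-1.16 flink-core (one still declaring
     * USE_OLD_HA_SERVICES) is visible on the classpath.
     */
    public static boolean hasOldHaField() {
        try {
            Class<?> options =
                Class.forName("org.apache.flink.configuration.HighAvailabilityOptions");
            options.getField("USE_OLD_HA_SERVICES");
            return true;  // field present: flink-core is 1.15.x or older
        } catch (ClassNotFoundException | NoSuchFieldException e) {
            return false; // no flink-core on the classpath, or 1.16+
        }
    }

    public static void main(String[] args) {
        System.out.println("pre-1.16 flink-core on classpath: " + hasOldHaField());
    }
}
{code}

Running this inside the user's image or against the shaded user jar would confirm (or rule out) the bundled-flink-core theory without redeploying the job.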



--
This message was sent by Atlassian Jira
(v8.20.10#820010)