Posted to user@flink.apache.org by Tamir Sagi <Ta...@niceactimize.com> on 2022/09/04 10:07:55 UTC

[Flink 1.15.1 - Application mode native k8s Exception] - Exception occurred while acquiring lock 'ConfigMapLock

Hey All,

We recently upgraded to Flink 1.15.1. We deploy a streaming cluster in Application mode on native Kubernetes (running on Amazon EKS). The cluster is configured with the Kubernetes HA service, a minimum of 3 JobManager replicas, and a pod template that uses topologySpreadConstraints to spread the pods across availability zones.
The HA storage directory is on S3.

The cluster deploys and runs properly; however, after a while we noticed the following exception in a JobManager instance (the log file is enclosed):

2022-09-04T02:05:33,097][Error] {} [i.f.k.c.e.l.LeaderElector]: Exception occurred while acquiring lock 'ConfigMapLock: dev-0-flink-jobs - data-agg-events-insertion-cluster-config-map (b6da2ae2-ad2b-471c-801e-ea460a348fab)'
io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get]  for kind: [ConfigMap]  with name: [data-agg-events-insertion-cluster-config-map]  in namespace: [dev-0-flink-jobs]  failed.
Caused by: java.io.FileNotFoundException: /opt/flink/.kube/config (No such file or directory)
      at java.io.FileInputStream.open0(Native Method) ~[?:?]
      at java.io.FileInputStream.open(Unknown Source) ~[?:?]
      at java.io.FileInputStream.<init>(Unknown Source) ~[?:?]
      at org.apache.flink.kubernetes.shaded.com.fasterxml.jackson.dataformat.yaml.YAMLFactory.createParser(YAMLFactory.java:354) ~[flink-dist-1.15.1.jar:1.15.1]
      at org.apache.flink.kubernetes.shaded.com.fasterxml.jackson.dataformat.yaml.YAMLFactory.createParser(YAMLFactory.java:15) ~[flink-dist-1.15.1.jar:1.15.1]
      at org.apache.flink.kubernetes.shaded.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3494) ~[flink-dist-1.15.1.jar:1.15.1]
      at io.fabric8.kubernetes.client.internal.KubeConfigUtils.parseConfig(KubeConfigUtils.java:42) ~[flink-dist-1.15.1.jar:1.15.1]
      at io.fabric8.kubernetes.client.utils.TokenRefreshInterceptor.intercept(TokenRefreshInterceptor.java:44) ~[flink-dist-1.15.1.jar:1.15.1]
      at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-dist-1.15.1.jar:1.15.1]
      at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-dist-1.15.1.jar:1.15.1]
      at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68) ~[flink-dist-1.15.1.jar:1.15.1]
      at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-dist-1.15.1.jar:1.15.1]
      at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-dist-1.15.1.jar:1.15.1]
      at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createApplicableInterceptors$6(HttpClientUtils.java:290) ~[flink-dist-1.15.1.jar:1.15.1]
      at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-dist-1.15.1.jar:1.15.1]
      at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-dist-1.15.1.jar:1.15.1]
      at org.apache.flink.kubernetes.shaded.okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:229) ~[flink-dist-1.15.1.jar:1.15.1]
      at org.apache.flink.kubernetes.shaded.okhttp3.RealCall.execute(RealCall.java:81) ~[flink-dist-1.15.1.jar:1.15.1]
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.retryWithExponentialBackoff(OperationSupport.java:585) ~[flink-dist-1.15.1.jar:1.15.1]
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:558) ~[flink-dist-1.15.1.jar:1.15.1]
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:521) ~[flink-dist-1.15.1.jar:1.15.1]
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:488) ~[flink-dist-1.15.1.jar:1.15.1]
      at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:470) ~[flink-dist-1.15.1.jar:1.15.1]
      at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:830) ~[flink-dist-1.15.1.jar:1.15.1]
      at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:200) ~[flink-dist-1.15.1.jar:1.15.1]
      ... 12 more

Why is a kubeconfig file (.kube/config) needed in native Kubernetes mode? Shouldn't the service account be used instead?

Are we missing something?

Thanks,
Tamir.


Confidentiality: This communication and any attachments are intended for the above-named persons only and may be confidential and/or legally privileged. Any opinions expressed in this communication are not necessarily those of NICE Actimize. If this communication has come to you in error you must take no action based on it, nor must you copy or show it to anyone; please delete/destroy and inform the sender by e-mail immediately.
Monitoring: NICE Actimize may monitor incoming and outgoing e-mails.
Viruses: Although we have taken steps toward ensuring that this e-mail and attachments are free from any virus, we advise that in keeping with good computing practice the recipient should ensure they are actually virus free.

Re: [Flink 1.15.1 - Application mode native k8s Exception] - Exception occurred while acquiring lock 'ConfigMapLock

Posted by Yang Wang <da...@gmail.com>.
You are right. Running multiple JobManagers helps when the pod is deleted
and the cluster does not have enough resources to start a new one.
In most cases, though, the JobManager container is restarted locally
without scheduling a new Kubernetes pod [1].

The "already exists" error comes from the fabric8 Kubernetes client. It is
somewhat reasonable, because a ConfigMap with the same name might have been
created manually beforehand.
In the Flink use case, we can simply ignore this error.
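
To illustrate why a lost create race is harmless, here is a minimal sketch
that uses an in-memory map as a stand-in for the API server. It is not the
fabric8 client or Flink's leader election code; the class, method, and
ConfigMap names are hypothetical. The only assumption carried over from
Kubernetes is that creating a resource whose name is already taken is
rejected (HTTP 409, "already exists").

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for the API server; not the fabric8 client. */
public class AlreadyExistsSketch {

    /** Thrown by the simulated store when a create hits an existing name. */
    static class AlreadyExistsException extends RuntimeException {
        AlreadyExistsException(String name) {
            super("configmaps \"" + name + "\" already exists");
        }
    }

    /** Maps a lock ConfigMap name to the identity that created it. */
    static final Map<String, String> STORE = new ConcurrentHashMap<>();

    /** Strict create: fails if the name is already taken. */
    static void create(String name, String holder) {
        if (STORE.putIfAbsent(name, holder) != null) {
            throw new AlreadyExistsException(name);
        }
    }

    /** Contender logic: losing the create race is not fatal, just read the existing record. */
    static String createOrGetHolder(String name, String contender) {
        try {
            create(name, contender);
        } catch (AlreadyExistsException e) {
            // Another contender created the lock first; fall through and read it.
        }
        return STORE.get(name);
    }

    public static void main(String[] args) {
        System.out.println(createOrGetHolder("cluster-config-map", "jm-1"));
        System.out.println(createOrGetHolder("cluster-config-map", "jm-2"));
    }
}
```

Both calls end up observing the same holder, which is the behaviour a
leader election contender needs; the exception only signals that someone
else got there first.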

For the first exception, "Caused by: java.io.FileNotFoundException:
/opt/flink/.kube/config (No such file or directory)", I think you need to
share the full log files of all the JobManagers.
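
While collecting the logs, it may also help to check from inside the
JobManager container which credential sources actually exist. The sketch
below is only a diagnostic aid under stated assumptions: the kubeconfig
path is the one from the FileNotFoundException, the token path is the
standard location where Kubernetes mounts a pod's service account token,
and the class and method names are hypothetical. It does not reproduce the
fabric8 client's own lookup logic.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical diagnostic helper; run it inside the JobManager container. */
public class CredentialSourceCheck {

    // Path taken from the FileNotFoundException in the reported log.
    static final Path KUBECONFIG = Paths.get("/opt/flink/.kube/config");

    // Standard mount location of a pod's service account token.
    static final Path SA_TOKEN =
            Paths.get("/var/run/secrets/kubernetes.io/serviceaccount/token");

    /** Reports which of the two credential sources is readable. */
    static String describe(Path kubeconfig, Path saToken) {
        boolean hasKubeconfig = Files.isReadable(kubeconfig);
        boolean hasToken = Files.isReadable(saToken);
        if (hasKubeconfig && hasToken) {
            return "both";
        }
        if (hasKubeconfig) {
            return "kubeconfig-only";
        }
        if (hasToken) {
            return "service-account-only";
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println(describe(KUBECONFIG, SA_TOKEN));
    }
}
```

If this reports "service-account-only", the pod has the expected in-cluster
credentials, and the open question is why the client tried to read a
kubeconfig file at all.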

[1].
https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy


Best,
Yang



Re: [Flink 1.15.1 - Application mode native k8s Exception] - Exception occurred while acquiring lock 'ConfigMapLock

Posted by Tamir Sagi <Ta...@niceactimize.com>.
Hey Yang,

Thank you for the fast response.

I get your point, but assuming 3 JobManagers are up, if the leader fails, one of the other 2 should become the new leader, no?

If the cluster fails, the new leader should handle that.

Another scenario is that the JobManager stops (it gets killed by Kubernetes due to memory or CPU limits, bugs, etc.) while the TaskManagers are still running and the cluster is active. In some cases, due to resource limitations, Kubernetes cannot start a new instance right away, and the pod remains in the Pending state until autoscaling takes place. So it seems we do gain resilience from having HA enabled in native Kubernetes mode.

What do you think?

> Given that you are running multiple JobManagers, it does not matter for the "already exists" exception during leader election.

Should we ignore this error? If so, it should be logged as a warning instead.

What about the first error we encountered, the kube/config file exception?


Thank you so much,
Best,
Tamir



Re: [Flink 1.15.1 - Application mode native k8s Exception] - Exception occurred while acquiring lock 'ConfigMapLock

Posted by Yang Wang <da...@gmail.com>.
Given that you are running multiple JobManagers, the "already exists"
exception during leader election does not matter.

BTW, I think running multiple JobManagers does not bring much benefit
when deploying Flink on Kubernetes, because a new JobManager will be
started immediately once the old one crashes.
Also, the Flink JobManager always needs to recover the job from the latest
checkpoint, no matter how many JobManagers are running.

Best,
Yang

Tamir Sagi <Ta...@niceactimize.com> wrote on Mon, Sep 5, 2022 at 21:48:

> Hey Yang,
>
> The flink-conf.yaml submitted to the cluster does not contain
> "kubernetes.config.file" at all.
> In addition, I verified that the Flink ConfigMaps in the cluster's
> namespace do not contain "kubernetes.config.file".
>
> We also noticed the following exception (it appears to happen
> sporadically):
>
> 2022-09-04T21:06:35,231][Error] {} [i.f.k.c.e.l.LeaderElector]: Exception
> occurred while acquiring lock 'ConfigMapLock: dev-0-flink-jobs -
> data-agg-events-insertion-cluster-config-map
> (fa3dbbc5-1753-46cd-afaf-0baf8ff0947f)'
> io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.LockException:
> Unable to create ConfigMapLock
>
> Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure
> executing: POST at:
> https://172.20.0.1/api/v1/namespaces/dev-0-flink-jobs/configmaps.
> Message: configmaps "data-agg-events-insertion-cluster-config-map" already
> exists.
>
> Log file is enclosed.
>
> Thanks,
> Tamir.
>
> ------------------------------
> *From:* Yang Wang <da...@gmail.com>
> *Sent:* Monday, September 5, 2022 3:03 PM
> *To:* Tamir Sagi <Ta...@niceactimize.com>
> *Cc:* user@flink.apache.org <us...@flink.apache.org>; Lihi Peretz <
> Lihi.Peretz@niceactimize.com>
> *Subject:* Re: [Flink 1.15.1 - Application mode native k8s Exception] -
> Exception occurred while acquiring lock 'ConfigMapLock
>
>
> Could you please check whether "kubernetes.config.file" is set to
> /opt/flink/.kube/config in the Flink ConfigMap?
> If it is, it should be removed before the Flink ConfigMap is created.
>
> Best,
> Yang
>
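
The check suggested above can also be scripted. This is a minimal sketch,
assuming the configuration is available as plain "key: value" lines (as in
flink-conf.yaml); the class and method names are hypothetical, and the
sample lines in main are illustrative, not the poster's real configuration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper that looks for kubernetes.config.file in flink-conf.yaml lines. */
public class KubeConfigKeyCheck {

    static final String KEY = "kubernetes.config.file";

    /** Returns the configured value if a non-comment "key: value" line sets KEY. */
    static Optional<String> findKubeConfigFile(List<String> confLines) {
        for (String raw : confLines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a "key: value" line
            }
            if (line.substring(0, colon).trim().equals(KEY)) {
                return Optional.of(line.substring(colon + 1).trim());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Illustrative sample only.
        List<String> sample = Arrays.asList(
                "# sample configuration",
                "kubernetes.namespace: dev-0-flink-jobs",
                "kubernetes.config.file: /opt/flink/.kube/config");
        System.out.println(findKubeConfigFile(sample).orElse("<not set>"));
    }
}
```

In practice the lines would come from the flink-conf.yaml that is submitted
with the job, or from the data of the Flink ConfigMap in the cluster's
namespace.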
> Tamir Sagi <Ta...@niceactimize.com> 于2022年9月4日周日 18:08写道:
>
> Hey All,
>
> We recently updated to Flink 1.15.1. We deploy stream cluster in
> Application mode in Native K8S.(Deployed on Amazon EKS).  The cluster is
> configured with Kubernetes HA Service, Minimum 3 replicas of Job manager
> and pod-template which is configured with topologySpreadConstraints to
> enable distribution across different availability zones.
> HA storage directory is on S3.
>
> The cluster is deployed and running properly, however, after a while we
> noticed the following exception in Job manager instance(the log file is
> enclosed)
>
> 2022-09-04T02:05:33,097][Error] {} [i.f.k.c.e.l.LeaderElector]: Exception occurred while acquiring lock 'ConfigMapLock: dev-0-flink-jobs - data-agg-events-insertion-cluster-config-map (b6da2ae2-ad2b-471c-801e-ea460a348fab)'
> io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get]  for kind: [ConfigMap]  with name: [data-agg-events-insertion-cluster-config-map]  in namespace: [dev-0-flink-jobs]  failed.
> Caused by: java.io.FileNotFoundException: /opt/flink/.kube/config (No such file or directory)
>       at java.io.FileInputStream.open0(Native Method) ~[?:?]
>       at java.io.FileInputStream.open(Unknown Source) ~[?:?]
>       at java.io.FileInputStream.<init>(Unknown Source) ~[?:?]
>       at org.apache.flink.kubernetes.shaded.com.fasterxml.jackson.dataformat.yaml.YAMLFactory.createParser(YAMLFactory.java:354) ~[flink-dist-1.15.1.jar:1.15.1]
>       at org.apache.flink.kubernetes.shaded.com.fasterxml.jackson.dataformat.yaml.YAMLFactory.createParser(YAMLFactory.java:15) ~[flink-dist-1.15.1.jar:1.15.1]
>       at org.apache.flink.kubernetes.shaded.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3494) ~[flink-dist-1.15.1.jar:1.15.1]
>       at io.fabric8.kubernetes.client.internal.KubeConfigUtils.parseConfig(KubeConfigUtils.java:42) ~[flink-dist-1.15.1.jar:1.15.1]
>       at io.fabric8.kubernetes.client.utils.TokenRefreshInterceptor.intercept(TokenRefreshInterceptor.java:44) ~[flink-dist-1.15.1.jar:1.15.1]
>       at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-dist-1.15.1.jar:1.15.1]
>       at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-dist-1.15.1.jar:1.15.1]
>       at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68) ~[flink-dist-1.15.1.jar:1.15.1]
>       at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-dist-1.15.1.jar:1.15.1]
>       at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-dist-1.15.1.jar:1.15.1]
>       at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createApplicableInterceptors$6(HttpClientUtils.java:290) ~[flink-dist-1.15.1.jar:1.15.1]
>       at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[flink-dist-1.15.1.jar:1.15.1]
>       at org.apache.flink.kubernetes.shaded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[flink-dist-1.15.1.jar:1.15.1]
>       at org.apache.flink.kubernetes.shaded.okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:229) ~[flink-dist-1.15.1.jar:1.15.1]
>       at org.apache.flink.kubernetes.shaded.okhttp3.RealCall.execute(RealCall.java:81) ~[flink-dist-1.15.1.jar:1.15.1]
>       at io.fabric8.kubernetes.client.dsl.base.OperationSupport.retryWithExponentialBackoff(OperationSupport.java:585) ~[flink-dist-1.15.1.jar:1.15.1]
>       at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:558) ~[flink-dist-1.15.1.jar:1.15.1]
>       at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:521) ~[flink-dist-1.15.1.jar:1.15.1]
>       at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:488) ~[flink-dist-1.15.1.jar:1.15.1]
>       at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:470) ~[flink-dist-1.15.1.jar:1.15.1]
>       at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:830) ~[flink-dist-1.15.1.jar:1.15.1]
>       at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:200) ~[flink-dist-1.15.1.jar:1.15.1]
>       ... 12 more
>
> Why is .kube/config needed in native K8s? Shouldn't the service account be
> checked instead?
>
> Are we missing something?
>
> Thanks,
> Tamir.
>
>
> Confidentiality: This communication and any attachments are intended for
> the above-named persons only and may be confidential and/or legally
> privileged. Any opinions expressed in this communication are not
> necessarily those of NICE Actimize. If this communication has come to you
> in error you must take no action based on it, nor must you copy or show it
> to anyone; please delete/destroy and inform the sender by e-mail
> immediately.
> Monitoring: NICE Actimize may monitor incoming and outgoing e-mails.
> Viruses: Although we have taken steps toward ensuring that this e-mail and
> attachments are free from any virus, we advise that in keeping with good
> computing practice the recipient should ensure they are actually virus free.
>

Re: [Flink 1.15.1 - Application mode native k8s Exception] - Exception occurred while acquiring lock 'ConfigMapLock

Posted by Tamir Sagi <Ta...@niceactimize.com>.
Hey Yang,

The flink-conf.yaml submitted to the cluster does not contain "kubernetes.config.file" at all.
I also verified that the Flink config maps in the cluster's namespace do not contain "kubernetes.config.file".

We also noticed the following exception, which appears to happen sporadically:

2022-09-04T21:06:35,231][Error] {} [i.f.k.c.e.l.LeaderElector]: Exception occurred while acquiring lock 'ConfigMapLock: dev-0-flink-jobs - data-agg-events-insertion-cluster-config-map (fa3dbbc5-1753-46cd-afaf-0baf8ff0947f)'
io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.LockException: Unable to create ConfigMapLock

Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://172.20.0.1/api/v1/namespaces/dev-0-flink-jobs/configmaps. Message: configmaps "data-agg-events-insertion-cluster-config-map" already exists.

Log file is enclosed.

Thanks,
Tamir.


________________________________
From: Yang Wang <da...@gmail.com>
Sent: Monday, September 5, 2022 3:03 PM
To: Tamir Sagi <Ta...@niceactimize.com>
Cc: user@flink.apache.org <us...@flink.apache.org>; Lihi Peretz <Li...@niceactimize.com>
Subject: Re: [Flink 1.15.1 - Application mode native k8s Exception] - Exception occurred while acquiring lock 'ConfigMapLock


Could you please check whether the "kubernetes.config.file" is configured to /opt/flink/.kube/config in the Flink configmap?
It should be removed before creating the Flink configmap.

Best,
Yang


Re: [Flink 1.15.1 - Application mode native k8s Exception] - Exception occurred while acquiring lock 'ConfigMapLock

Posted by Yang Wang <da...@gmail.com>.
Could you please check whether the "kubernetes.config.file" is configured
to /opt/flink/.kube/config in the Flink configmap?
It should be removed before creating the Flink configmap.
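
A minimal sketch of such a check, assuming the flink-conf.yaml content stored in the Flink configmap has already been exported as plain text. The helper names (`parse_flink_conf`, `find_kubeconfig_setting`) and the sample content are hypothetical, and the parsing is a simplification of the flat "key: value" layout, not Flink's own loader:

```python
from typing import Dict, Optional

# The option under suspicion in this thread.
KEY = "kubernetes.config.file"


def parse_flink_conf(text: str) -> Dict[str, str]:
    """Parse flat 'key: value' lines; a simplification, not Flink's parser."""
    conf: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or ":" not in line:
            continue  # skip blank and malformed lines
        key, value = line.split(":", 1)
        conf[key.strip()] = value.strip()
    return conf


def find_kubeconfig_setting(text: str) -> Optional[str]:
    """Return the configured kubeconfig path, or None if the key is absent."""
    return parse_flink_conf(text).get(KEY)


# Hypothetical configmap content, for illustration only.
sample = """\
kubernetes.namespace: dev-0-flink-jobs
kubernetes.config.file: /opt/flink/.kube/config
"""

path = find_kubeconfig_setting(sample)
if path is not None:
    print(f"{KEY} is set to {path}; remove it before creating the configmap")
else:
    print(f"{KEY} is not set")
```

If the key turns out to be absent, as in the follow-up report in this thread, the check only rules out this one explanation.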

Best,
Yang
