Posted to user@flink.apache.org by mejri houssem <me...@gmail.com> on 2021/08/27 12:31:30 UTC
k8S HA mode
Hello, I am deploying a Flink application cluster with Kubernetes HA mode, but I keep running into this recurring problem and I don't know how to solve it.
Any help would be appreciated.
This is from the JobManager log:
{"@timestamp":"2021-08-27T14:19:42.447+02:00","@version":"1","message":"Exception occurred while renewing lock: Unable to update ConfigMapLock","logger_name":"io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector","thread_name":"pool-4092-thread-1","level":"DEBUG","level_value":10000,"stack_trace":"io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.LockException: Unable to update ConfigMapLock\n\tat io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.update(ConfigMapLock.java:108)\n\tat io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.tryAcquireOrRenew(LeaderElector.java:156)\n\tat io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.renew(LeaderElector.java:120)\n\tat io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.lambda$null$1(LeaderElector.java:104)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PUT at: https://172.31.64.1/api/v1/namespaces/flink-pushavoo-flink-rec/configmaps/elifibre-00000000000000000000000000000000-jobmanager-leader. Message: Operation cannot be fulfilled on configmaps \"elifibre-00000000000000000000000000000000-jobmanager-leader\": the object has been modified; please apply your changes to the latest version and try again. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=configmaps, name=elifibre-00000000000000000000000000000000-jobmanager-leader, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Operation cannot be fulfilled on configmaps \"elifibre-00000000000000000000000000000000-jobmanager-leader\": the object has been modified; please apply your changes to the latest version and try again, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Conflict, status=Failure, additionalProperties={}).\n\tat io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:568)\n\tat io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:507)\n\tat io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:471)\n\tat io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:430)\n\tat io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleReplace(OperationSupport.java:289)\n\tat io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleReplace(OperationSupport.java:269)\n\tat io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleReplace(BaseOperation.java:820)\n\tat io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$replace$1(HasMetadataOperation.java:86)\n\tat io.fabric8.kubernetes.api.model.DoneableConfigMap.done(DoneableConfigMap.java:26)\n\tat io.fabric8.kubernetes.api.model.DoneableConfigMap.done(DoneableConfigMap.java:5)\n\tat io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:92)\n\tat io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:36)\n\tat io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.update(ConfigMapLock.java:106)\n\t... 10 common frames omitted\n"}
Re: k8S HA mode
Posted by Yang Wang <da...@gmail.com>.
Could you please share the full JobManager logs?
AFAIK, the exceptions you attached are normal log output when the JobManager is
trying to acquire the ConfigMap lock.
Best,
Yang
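For readers wondering why a 409 "the object has been modified" response can be harmless: the Kubernetes API server rejects a write that is based on an outdated resourceVersion, and the usual reaction is to re-read the object and try again. The following is only a toy model of that check, not Flink or fabric8 code; the class, the Store type, and the method names are all made up for illustration.

```java
/**
 * Toy model of the optimistic-concurrency check behind the HTTP 409 in the log.
 * Nothing here is Flink or fabric8 code; every name is hypothetical.
 */
final class OptimisticLockSketch {

    /** Stand-in for the API server holding one object with a version counter. */
    static final class Store {
        private long resourceVersion = 1;
        private String data = "initial";

        synchronized long version() { return resourceVersion; }
        synchronized String data() { return data; }

        /** Returns 200 if the write is based on the latest version, otherwise 409 (Conflict). */
        synchronized int replace(long basedOnVersion, String newData) {
            if (basedOnVersion != resourceVersion) {
                return 409; // corresponds to "the object has been modified"
            }
            data = newData;
            resourceVersion++;
            return 200;
        }
    }

    /** Re-reads the latest version before each attempt; returns the number of attempts used. */
    static int updateWithRetry(Store store, String newData, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            long latest = store.version();
            if (store.replace(latest, newData) == 200) {
                return attempt;
            }
        }
        throw new IllegalStateException("still conflicting after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        Store store = new Store();
        long stale = store.version();                     // writer A reads version 1
        store.replace(store.version(), "written-by-B");   // writer B updates first, version becomes 2
        System.out.println(store.replace(stale, "written-by-A"));      // 409: A's copy is stale
        System.out.println(updateWithRetry(store, "written-by-A", 3)); // 1: succeeds after re-reading
    }
}
```

In this model a conflict only means that some other write reached the object first. It says nothing about who that writer was in the cluster from this thread.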
On Tue, Aug 31, 2021 at 4:36 AM, houssem <me...@gmail.com> wrote:
> Hello, thanks for the response.
>
> I am using Kubernetes standalone application mode, not the native one.
>
> This error happens randomly at some point while the job is running.
>
> Also, I am running just one JobManager replica.
>
Re: k8S HA mode
Posted by houssem <me...@gmail.com>.
Hello, thanks for the response.
I am using Kubernetes standalone application mode, not the native one.
This error happens randomly at some point while the job is running.
Also, I am running just one JobManager replica.
Here are some more logs:
{"@timestamp":"2021-08-30T15:43:44.970+02:00","@version":"1","message":"Exception occurred while renewing lock: Unable to update ConfigMapLock","logger_name":"io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector",
"thread_name":"pool-685-thread-1","level":"DEBUG","level_value":10000,"stack_trace":"io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.LockException: Unable to update ConfigMapLock
io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.update(ConfigMapLock.java:108)
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.tryAcquireOrRenew(LeaderElector.java:156)
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.renew(LeaderElector.java:120)
io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.lambda$null$1(LeaderElector.java:104)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PUT at:
https://172.31.64.1/api/v1/namespaces/flink-pushavoo-flink-rec/configmaps/elifibre-00000000000000000000000000000000-jobmanager-leader.
Message: Operation cannot be fulfilled on configmaps \"elifibre-00000000000000000000000000000000-jobmanager-leader\": the object has been modified; please apply your changes to the latest version and try again.
Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=configmaps, name=elifibre-00000000000000000000000000000000-jobmanager-leader, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Operation cannot be fulfilled on configmaps
\"elifibre-00000000000000000000000000000000-jobmanager-leader\": the object has been modified;
please apply your changes to the latest version and try again, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Conflict, status=Failure, additionalProperties={}).
io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:568)
io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:507)
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:471)
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:430)
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleReplace(OperationSupport.java:289)
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleReplace(OperationSupport.java:269)
io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleReplace(BaseOperation.java:820)
io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$replace$1(HasMetadataOperation.java:86)
io.fabric8.kubernetes.api.model.DoneableConfigMap.done(DoneableConfigMap.java:26)
io.fabric8.kubernetes.api.model.DoneableConfigMap.done(DoneableConfigMap.java:5)
io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:92)
io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:36)
io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.update(ConfigMapLock.java:106)
... 10 common frames omitted\n"}
**********************************************************************************************************
On 2021/08/30 10:53:10, Roman Khachatryan <ro...@apache.org> wrote:
> Hello,
>
> Do I understand correctly that you are using native Kubernetes
> deployment in application mode;
> and the issue *only* happens if you set kubernetes-jobmanager-replicas
> [1] to a value greater than 1?
>
> Does it happen during deployment or at some point while running the job?
>
> Could you share Flink and Kubernetes versions and HA configuration
> [2]? (I'm assuming you're using Kubernetes for HA, not ZK).
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/config/#kubernetes-jobmanager-replicas
> [2]
> https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/ha/kubernetes_ha/
>
> Regards,
> Roman
>
Re: k8S HA mode
Posted by Roman Khachatryan <ro...@apache.org>.
Hello,
Do I understand correctly that you are using native Kubernetes
deployment in application mode;
and the issue *only* happens if you set kubernetes-jobmanager-replicas
[1] to a value greater than 1?
Does it happen during deployment or at some point while running the job?
Could you share Flink and Kubernetes versions and HA configuration
[2]? (I'm assuming you're using Kubernetes for HA, not ZK).
[1]
https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/config/#kubernetes-jobmanager-replicas
[2]
https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/ha/kubernetes_ha/
Regards,
Roman
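As a rough illustration of what "HA configuration" refers to here: in the Flink releases current at the time of this thread, the Kubernetes HA page linked as [2] asks for three entries in flink-conf.yaml: kubernetes.cluster-id, high-availability, and high-availability.storageDir. The sketch below checks a flat "key: value" configuration for those keys. The helper class and all values are hypothetical, and parsing with java.util.Properties is a shortcut that only works for flat entries, not for general YAML.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper, not part of Flink: lists HA-related keys missing from a flat flink-conf.yaml. */
final class HaConfigCheck {

    // Keys taken from the Kubernetes HA documentation of that period; newer releases may differ.
    static final String[] REQUIRED = {
        "kubernetes.cluster-id",
        "high-availability",
        "high-availability.storageDir"
    };

    /** Parses flat "key: value" lines and returns the required keys that are absent or empty. */
    static List<String> missingKeys(String flinkConf) {
        Properties props = new Properties();
        try {
            // java.util.Properties accepts ':' as a key/value separator, which is enough for flat entries.
            props.load(new StringReader(flinkConf));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // All values below are placeholders, not taken from the thread.
        String conf = String.join("\n",
            "kubernetes.cluster-id: my-flink-cluster",
            "high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory",
            "high-availability.storageDir: s3://my-bucket/flink/recovery");
        System.out.println(missingKeys(conf)); // prints []
    }
}
```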