Posted to user@flink.apache.org by Andrew Otto <ot...@wikimedia.org> on 2023/05/08 20:20:02 UTC

flink-kubernetes-operator HA k8s RoleBinding for Leases?

Hi,

I'm trying to enable HA for flink-kubernetes-operator
<https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/configuration/#leader-election-and-high-availability>
with Helm.  We are using namespaced RBAC via watchNamespaces.

I've followed the instructions and set
kubernetes.operator.leader-election.enabled and
kubernetes.operator.leader-election.lease-name, and increased replicas to
2.  When I deploy, the second replica comes online, but errors with:

Exception occurred while acquiring lock 'LeaseLock: flink-operator -
flink-operator-lease (flink-kubernetes-operator-86b888d6b6-8cxjs
Failure executing: GET at:
https://x.x.x.x/apis/coordination.k8s.io/v1/namespaces/flink-operator/leases/flink-operator-lease.
Message: Forbidden!Configured service account doesn't have access. Service
account may have been revoked. leases.coordination.k8s.io
"flink-operator-lease" is forbidden: User
"system:serviceaccount:flink-operator:flink-operator" cannot get resource
"leases" in API group "coordination.k8s.io" in the namespace
"flink-operator".

Looking at the rbac.yaml helm template
<https://github.com/apache/flink-kubernetes-operator/blob/main/helm/flink-kubernetes-operator/templates/rbac.yaml>,
it looks like the Role and RoleBindings that grant access to the leases
resource are created for the configured watchNamespaces, but not for the
namespace in which the flink-kubernetes-operator is deployed.  I think that
for HA, the flink-kubernetes-operator is going to be asking k8s for Leases
in its own namespace, right?
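
One way to test the hypothesis above is to grant the lease permissions by
hand in the operator's own namespace. The following is a minimal sketch, not
the chart's actual template: the namespace and service account (both
"flink-operator") are taken from the error message, the name
"flink-operator-leases" is hypothetical, and the verb list is an assumption
about what leader election needs (the error only shows "get" being denied).

```yaml
# Sketch only: grants the operator's service account access to Leases
# in the namespace where the operator itself runs.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: flink-operator-leases        # hypothetical name
  namespace: flink-operator          # namespace from the error message
rules:
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    # Assumed verb set; only "get" is confirmed as missing by the error.
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: flink-operator-leases        # hypothetical name
  namespace: flink-operator
subjects:
  - kind: ServiceAccount
    name: flink-operator             # service account from the error message
    namespace: flink-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: flink-operator-leases
```

If the second replica stops logging the Forbidden error once something like
this is applied, that would suggest the chart is missing the equivalent Role
and RoleBinding for the operator namespace.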

Is this a bug, or am I doing something wrong?  I'd file a JIRA, but I
betcha I'm just doing something wrong (unless I'm the first person who's
tried to use HA + namespaced RBAC with the helm charts?).

Thanks!
-Andrew Otto
 Wikimedia Foundation

Re: flink-kubernetes-operator HA k8s RoleBinding for Leases?

Posted by Andrew Otto <ot...@wikimedia.org>.
> Could you please open a jira

Done: https://issues.apache.org/jira/browse/FLINK-32041

> PR (in case you fixed this already)
Haven't fixed it yet! But if I find time to do it I will!

Thanks!


On Tue, May 9, 2023 at 4:49 AM Tamir Sagi <Ta...@niceactimize.com>
wrote:

> Hey,
>
> I also encountered something similar with a different error.  I enabled HA
> with RBAC.
>
> org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException","message":"Failure
> executing: GET at: https://172.20.0.1/api/v1/nodes. Message:
> Forbidden!Configured service account doesn't have access. Service account
> may have been revoked. nodes is forbidden: User
> "system:serviceaccount:dev-0-flink-clusters:dev-0-xsight-flink-operator-sa"
> cannot list resource "nodes" in API group "" at the cluster scope."
>
> I checked the rolebinding between the service account `dev-0-flink-clusters:dev-0-xsight-flink-operator-sa`
> and the corresponding role (flink-operator), which was created by the
> operator using rbac.nodesRule.create=true.
>
> role binding
>
> role: flink-operator
>
> Am I missing something?

Re: flink-kubernetes-operator HA k8s RoleBinding for Leases?

Posted by Tamir Sagi <Ta...@niceactimize.com>.
Hey,

I also encountered something similar with a different error.  I enabled HA with RBAC.

org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException","message":"Failure executing: GET at: https://172.20.0.1/api/v1/nodes. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. nodes is forbidden: User "system:serviceaccount:dev-0-flink-clusters:dev-0-xsight-flink-operator-sa" cannot list resource "nodes" in API group "" at the cluster scope."

I checked the rolebinding between the service account `dev-0-flink-clusters:dev-0-xsight-flink-operator-sa` and the corresponding role (flink-operator), which was created by the operator using rbac.nodesRule.create=true.

role binding
[inline image not preserved in the plain text archive]

role: flink-operator
[inline image not preserved in the plain text archive]

Am I missing something?
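
For context on the error above: nodes are a cluster-scoped resource in Kubernetes, so a namespaced Role and RoleBinding cannot grant "list" on them; that takes a ClusterRole bound with a ClusterRoleBinding. Whether this is what went wrong here cannot be confirmed, since the screenshots are not in the archive. The following is a minimal sketch only: the service account and namespace come from the error message, and the name "flink-operator-nodes" is hypothetical.

```yaml
# Sketch only: cluster-scoped permission to list nodes.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: flink-operator-nodes         # hypothetical name
rules:
  - apiGroups: [""]                  # core API group, as in the error message
    resources: ["nodes"]
    verbs: ["list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: flink-operator-nodes         # hypothetical name
subjects:
  - kind: ServiceAccount
    name: dev-0-xsight-flink-operator-sa   # service account from the error
    namespace: dev-0-flink-clusters        # namespace from the error
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flink-operator-nodes
```

If the existing binding turns out to be a RoleBinding to a Role, that alone would be consistent with the "at the cluster scope" part of the message.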


________________________________
From: Gyula Fóra <gy...@gmail.com>
Sent: Tuesday, May 9, 2023 7:43 AM
To: Andrew Otto <ot...@wikimedia.org>
Cc: User <us...@flink.apache.org>
Subject: Re: flink-kubernetes-operator HA k8s RoleBinding for Leases?


Hey!

Sounds like a bug :) Could you please open a jira / PR (in case you fixed this already)?

Thanks
Gyula

Confidentiality: This communication and any attachments are intended for the above-named persons only and may be confidential and/or legally privileged. Any opinions expressed in this communication are not necessarily those of NICE Actimize. If this communication has come to you in error you must take no action based on it, nor must you copy or show it to anyone; please delete/destroy and inform the sender by e-mail immediately.
Monitoring: NICE Actimize may monitor incoming and outgoing e-mails.
Viruses: Although we have taken steps toward ensuring that this e-mail and attachments are free from any virus, we advise that in keeping with good computing practice the recipient should ensure they are actually virus free.

Re: flink-kubernetes-operator HA k8s RoleBinding for Leases?

Posted by Gyula Fóra <gy...@gmail.com>.
Hey!

Sounds like a bug :) Could you please open a jira / PR (in case you fixed
this already)?

Thanks
Gyula
