Posted to dev@flink.apache.org by "Barisa (Jira)" <ji...@apache.org> on 2022/09/21 13:36:00 UTC

[jira] [Created] (FLINK-29382) Flink fails to start when created using quick guide for flink operator

Barisa created FLINK-29382:
------------------------------

             Summary: Flink fails to start when created using quick guide for flink operator
                 Key: FLINK-29382
                 URL: https://issues.apache.org/jira/browse/FLINK-29382
             Project: Flink
          Issue Type: Bug
          Components: Kubernetes Operator
    Affects Versions: 1.15.2
            Reporter: Barisa


I followed the quick start guide [https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/try-flink-kubernetes-operator/quick-start/] to deploy the Flink Kubernetes operator and then a Flink job.

At the step
 {{kubectl create -f https://raw.githubusercontent.com/apache/flink-kubernetes-operator/release-1.1/examples/basic.yaml}}
the pod starts, but then keeps crashing with the following exception:

{noformat}
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: pods is forbidden: User "system:anonymous" cannot watch resource "pods" in API group "" in the namespace "zonda"
	at io.fabric8.kubernetes.client.dsl.internal.WatcherWebSocketListener.onFailure(WatcherWebSocketListener.java:74) ~[flink-dist-1.15.2.jar:1.15.2]
	at org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:570) ~[flink-dist-1.15.2.jar:1.15.2]
	at org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket$1.onResponse(RealWebSocket.java:199) ~[flink-dist-1.15.2.jar:1.15.2]
	at org.apache.flink.kubernetes.shaded.okhttp3.RealCall$AsyncCall.execute(RealCall.java:174) ~[flink-dist-1.15.2.jar:1.15.2]
	at org.apache.flink.kubernetes.shaded.okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[flink-dist-1.15.2.jar:1.15.2]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
{noformat}

I also noticed the following log lines:
{noformat}
2022-09-21 13:32:05,715 WARN  io.fabric8.kubernetes.client.Config                          [] - Error reading service account token from: [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
2022-09-21 13:32:05,719 WARN  io.fabric8.kubernetes.client.Config                          [] - Error reading service account token from: [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
{noformat}

I think the problem is that the container starts as root and later uses gosu to become the flink user, while the service account token is only accessible to the container's main user, which is root:

{noformat}
root@basic-example-658578895d-qwlb2:/opt/flink# ls -hltr /var/run/secrets/kubernetes.io/serviceaccount/token
lrwxrwxrwx. 1 root 1337 12 Sep 21 08:57 /var/run/secrets/kubernetes.io/serviceaccount/token -> ..data/token
{noformat}
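
To check this hypothesis from inside the pod, a small script along the following lines could be run as the flink user. This is only a sketch: it assumes a Python 3 interpreter is available in the container (the Flink image may not ship one), and the function name {{describe_token_access}} is made up for illustration. The token path is the one from the log lines above.

```python
import os
import stat

# Path taken from the WARN log lines in this report.
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def describe_token_access(path=TOKEN_PATH):
    """Report who owns `path` and whether the current process can read it.

    Hypothetical helper for illustration; not part of Flink or fabric8.
    """
    info = {
        "path": path,
        "process_uid": os.geteuid(),
        "process_gids": sorted(set(os.getgroups()) | {os.getegid()}),
        "readable": False,
    }
    try:
        st = os.stat(path)  # follows the ..data/token symlink
        info["owner_uid"] = st.st_uid
        info["owner_gid"] = st.st_gid
        info["mode"] = stat.filemode(st.st_mode)
    except OSError as exc:
        info["stat_error"] = exc.strerror
    try:
        # Actually opening the file is the most direct readability test.
        with open(path, "rb") as fh:
            fh.read(1)
        info["readable"] = True
    except OSError as exc:
        info["open_error"] = exc.strerror
    return info


if __name__ == "__main__":
    for key, value in describe_token_access().items():
        print(f"{key}: {value}")
```

If {{readable}} is False for the flink user but True for root, that would support the explanation above; comparing {{owner_gid}} and {{mode}} with {{process_gids}} shows which permission bit is missing.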



