Posted to issues@flink.apache.org by "Sergio Sainz (Jira)" <ji...@apache.org> on 2023/04/11 19:17:00 UTC

[jira] [Created] (FLINK-31775) High-Availability not supported in kubernetes when istio enabled

Sergio Sainz created FLINK-31775:
------------------------------------

             Summary: High-Availability not supported in kubernetes when istio enabled
                 Key: FLINK-31775
                 URL: https://issues.apache.org/jira/browse/FLINK-31775
             Project: Flink
          Issue Type: Bug
          Components: Deployment / Kubernetes
    Affects Versions: 1.16.1
            Reporter: Sergio Sainz


When using native Kubernetes deployment mode with HA enabled, each new TaskManager that is started to process a job attempts to register itself with the ResourceManager (JobManager). The TaskManager looks up the ResourceManager by IP address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1).

 

However, when Istio is enabled, resolution by IP address is blocked, so the job cannot start because the TaskManager cannot register with the ResourceManager:

2023-04-10 23:24:19,752 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Could not resolve ResourceManager address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1.

 

Note that when HA is disabled, the ResourceManager is resolved by service name, so it can be found:

 

2023-04-11 00:49:34,162 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Successful registration at resource manager akka.tcp://flink@local-mci-ar32a-dev-flink-cluster.mstr-env-mci-ar32a-dev:6123/user/rpc/resourcemanager_* under registration id 83ad942597f86aa880ee96f1c2b8b923.
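
The difference between the two log lines above is only in the host part of the RPC address: an IP literal in the HA case, a service name in the non-HA case. As a minimal sketch for checking which form a given TaskManager log line contains, the following Python snippet classifies the host of an akka.tcp address. The helper name rpc_host_kind is hypothetical and is not part of Flink; it only parses the address string and makes no statement about how Flink or Istio route the connection.

```python
import ipaddress
from urllib.parse import urlsplit


def rpc_host_kind(address: str) -> str:
    """Return "ip" if the host of an RPC address is an IP literal, else "hostname".

    Hypothetical helper for reading log lines; not a Flink API.
    """
    # urlsplit accepts arbitrary schemes such as "akka.tcp" and strips
    # the "flink@" user part and the ":6123" port from the host.
    host = urlsplit(address).hostname
    if host is None:
        raise ValueError(f"no host found in {address!r}")
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # Not parseable as IPv4/IPv6, so treat it as a DNS or service name.
        return "hostname"
    return "ip"


if __name__ == "__main__":
    # Addresses copied from the two log lines in this report.
    ha_address = "akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1"
    non_ha_address = (
        "akka.tcp://flink@local-mci-ar32a-dev-flink-cluster."
        "mstr-env-mci-ar32a-dev:6123/user/rpc/resourcemanager_*"
    )
    print(rpc_host_kind(ha_address))
    print(rpc_host_kind(non_ha_address))
```

Running this against the addresses from a failing deployment shows whether the TaskManager was handed an IP-based or a name-based ResourceManager address.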

 

Note that it is not possible to disable Istio (as explained here: https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html).

 

Although this is similar to https://issues.apache.org/jira/browse/FLINK-28171, I am logging it as a separate defect because I believe the fix for FLINK-28171 will not cover this case: FLINK-28171 is about the Flink Kubernetes Operator.


