Posted to dev@flink.apache.org by "Weike Dong (Jira)" <ji...@apache.org> on 2020/10/16 11:28:00 UTC

[jira] [Created] (FLINK-19677) TaskManager takes abnormally long time to register with JobManager on Kubernetes

Weike Dong created FLINK-19677:
----------------------------------

             Summary: TaskManager takes abnormally long time to register with JobManager on Kubernetes
                 Key: FLINK-19677
                 URL: https://issues.apache.org/jira/browse/FLINK-19677
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Task
    Affects Versions: 1.11.2, 1.11.1, 1.11.0
            Reporter: Weike Dong


During TaskManager registration, the JobManager creates a _TaskManagerLocation_ instance, which tries to obtain the TaskManager's hostname via a reverse DNS lookup.

However, this always fails in a Kubernetes environment: the IPs of pods that are not exposed by Services cannot be resolved to domain names by CoreDNS, so _InetAddress#getCanonicalHostName()_ takes ~5 seconds to return and blocks the whole registration process.
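
To check the delay on a given cluster, a small timing helper such as the one below may be useful. It is only a diagnostic sketch: the class and method names are made up for illustration and are not part of Flink. It relies solely on the standard _InetAddress#getCanonicalHostName()_ call, which falls back to the textual IP address when the reverse lookup fails.

```java
import java.net.InetAddress;
import java.util.concurrent.TimeUnit;

/** Hypothetical diagnostic helper; not part of Flink. */
final class ReverseLookupTimer {

    /** Times one reverse DNS lookup for the given address and returns the elapsed milliseconds. */
    static long timeReverseLookupMillis(InetAddress address) {
        long start = System.nanoTime();
        // Blocks until the resolver answers or gives up; on failure the textual IP is returned.
        String name = address.getCanonicalHostName();
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(address.getHostAddress() + " -> " + name + " in " + elapsedMillis + " ms");
        return elapsedMillis;
    }
}
```

Running this from inside the JobManager pod against the IP of a TaskManager pod should show whether the lookup takes the roughly 5 seconds described above.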

Therefore, Flink should provide a configuration parameter to turn off the reverse DNS lookup. Also, even when the hostname is actually needed, the lookup could be done lazily to avoid blocking the registration of other TaskManagers.
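
A minimal sketch of what the two proposals could look like together is shown below. It is not Flink's actual _TaskManagerLocation_ code: the class _LazyHostName_, the _reverseLookupEnabled_ flag, and the injectable resolver are all hypothetical names introduced for illustration. The idea is that nothing is resolved at construction time (i.e. during registration), the lookup runs at most once on first use, and it is skipped entirely when the flag is off.

```java
import java.net.InetAddress;
import java.util.function.Function;

/** Hypothetical sketch of a lazily resolved hostname; not Flink's actual implementation. */
final class LazyHostName {

    private final InetAddress address;
    private final boolean reverseLookupEnabled; // assumed to come from a (yet to be defined) config option
    private final Function<InetAddress, String> resolver;
    private volatile String cached; // null until the first call to get()

    LazyHostName(InetAddress address, boolean reverseLookupEnabled,
                 Function<InetAddress, String> resolver) {
        // No DNS call here, so constructing the object never blocks registration.
        this.address = address;
        this.reverseLookupEnabled = reverseLookupEnabled;
        this.resolver = resolver;
    }

    /** Convenience factory that uses the standard JDK reverse lookup. */
    static LazyHostName of(InetAddress address, boolean reverseLookupEnabled) {
        return new LazyHostName(address, reverseLookupEnabled, InetAddress::getCanonicalHostName);
    }

    /** Returns the hostname, resolving it on first use; falls back to the IP string when disabled. */
    String get() {
        String result = cached;
        if (result == null) {
            synchronized (this) {
                result = cached;
                if (result == null) {
                    result = reverseLookupEnabled
                            ? resolver.apply(address)    // potentially slow reverse DNS lookup
                            : address.getHostAddress();  // no DNS traffic at all
                    cached = result;
                }
            }
        }
        return result;
    }
}
```

With this shape, the cost of the lookup is paid only by the first caller that actually needs the hostname, rather than by every TaskManager registration.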



--
This message was sent by Atlassian Jira
(v8.3.4#803005)