You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Xintong Song (Jira)" <ji...@apache.org> on 2020/10/12 06:40:00 UTC

[jira] [Created] (FLINK-19568) Offload creating TM launch contexts to the IO executor

Xintong Song created FLINK-19568:
------------------------------------

             Summary: Offload creating TM launch contexts to the IO executor
                 Key: FLINK-19568
                 URL: https://issues.apache.org/jira/browse/FLINK-19568
             Project: Flink
          Issue Type: Improvement
          Components: Deployment / YARN
            Reporter: Xintong Song
             Fix For: 1.12.0


Currently, for launching each TM container on Yarn, Flink creates a container launch context in RM's PRC main thread. This includes accessing file status from remote file systems, which may blocks the RM's main thread, especially when remote file system is slow. See [this thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/TM-heartbeat-timeout-due-to-ResourceManager-being-busy-td38626.html].

The creating of TM context does not access nor change any RM's internal states. Therefore, I propose to offload the work to the IO executor. To be specific, I think the entire {{YarnResourceManagerDriver#createTaskExecutorLaunchContext}} can be invoked on the IO executor.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)