
Posted to issues@flink.apache.org by "Xintong Song (Jira)" <ji...@apache.org> on 2020/10/13 01:48:00 UTC

[jira] [Closed] (FLINK-19568) Offload creating TM launch contexts to the IO executor

     [ https://issues.apache.org/jira/browse/FLINK-19568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xintong Song closed FLINK-19568.
--------------------------------
    Resolution: Done

Done via
* master: b7d70e9b31df93478fef6f6bec5bea3239eefeea

> Offload creating TM launch contexts to the IO executor
> ------------------------------------------------------
>
>                 Key: FLINK-19568
>                 URL: https://issues.apache.org/jira/browse/FLINK-19568
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / YARN
>            Reporter: Xintong Song
>            Assignee: Xintong Song
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.12.0
>
>
> Currently, to launch each TM container on Yarn, Flink creates a container launch context in the RM's RPC main thread. This includes accessing file status from remote file systems, which may block the RM's main thread, especially when the remote file system is slow. See [this thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/TM-heartbeat-timeout-due-to-ResourceManager-being-busy-td38626.html].
> Creating the TM launch context neither accesses nor changes any of the RM's internal state. Therefore, I propose to offload the work to the IO executor. Specifically, I think the entire {{YarnResourceManagerDriver#createTaskExecutorLaunchContext}} can be invoked on the IO executor.
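
The offloading pattern proposed above can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions, not the code from the actual commit: the class LaunchContextOffloadSketch, the LaunchContext type, the helper createLaunchContext, and the thread names are all hypothetical stand-ins, and only java.util.concurrent is used. The blocking work runs on an IO executor, and the result is handed back to a single-threaded "main thread" executor for the state-touching step.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class LaunchContextOffloadSketch {

    /** Hypothetical stand-in for a container launch context. */
    static final class LaunchContext {
        final String containerId;
        final String builtOnThread;

        LaunchContext(String containerId, String builtOnThread) {
            this.containerId = containerId;
            this.builtOnThread = builtOnThread;
        }
    }

    /**
     * Hypothetical stand-in for the blocking work (for example, fetching file
     * status from a remote file system). It reads no shared state, so it is
     * safe to run off the main thread.
     */
    static LaunchContext createLaunchContext(String containerId) {
        return new LaunchContext(containerId, Thread.currentThread().getName());
    }

    /**
     * Builds the launch context on the IO executor, then continues on the
     * main-thread executor, where touching internal state would be safe.
     */
    static CompletableFuture<String> startContainerAsync(
            String containerId, Executor ioExecutor, Executor mainThreadExecutor) {
        return CompletableFuture
                .supplyAsync(() -> createLaunchContext(containerId), ioExecutor)
                .thenApplyAsync(
                        ctx -> Thread.currentThread().getName()
                                + " starts " + ctx.containerId
                                + " (context built on " + ctx.builtOnThread + ")",
                        mainThreadExecutor);
    }

    public static void main(String[] args) {
        // Assumed executors: a small IO pool and a single-threaded "main thread".
        ExecutorService io =
                Executors.newFixedThreadPool(2, r -> new Thread(r, "io-executor"));
        ExecutorService main =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "rm-main-thread"));
        try {
            System.out.println(startContainerAsync("container_01", io, main).join());
        } finally {
            io.shutdown();
            main.shutdown();
        }
    }
}
```

The point of the split is that the main thread never waits on the remote file system: it only schedules the work and later handles the finished context, so a slow file system delays individual container launches instead of stalling everything else the main thread serves.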


