Posted to issues@flink.apache.org by "Matthias (Jira)" <ji...@apache.org> on 2021/01/07 07:13:00 UTC

[jira] [Updated] (FLINK-20863) Exclude network memory from ResourceProfile

     [ https://issues.apache.org/jira/browse/FLINK-20863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias updated FLINK-20863:
-----------------------------
    Component/s: Runtime / Network

> Exclude network memory from ResourceProfile
> -------------------------------------------
>
>                 Key: FLINK-20863
>                 URL: https://issues.apache.org/jira/browse/FLINK-20863
>             Project: Flink
>          Issue Type: Task
>          Components: Runtime / Network
>            Reporter: Yangze Guo
>            Priority: Major
>             Fix For: 1.13.0
>
>
> The current ResourceProfile implementation includes network memory, with the expectation that fine-grained resource management will not deploy a set of tasks onto a TM if together they require more network memory than the TM has.
> However, how much network memory each task needs depends heavily on the shuffle service implementation and may change when switching to another shuffle service. Therefore, neither users nor the Flink runtime can easily specify network memory requirements for a task/slot at the moment.
> A concrete solution for controlling network memory is beyond the scope of this FLIP. However, we are aware of a few potential directions for solving this problem:
> - Make shuffle services adaptively control the amount of memory assigned to each task/slot, with respect to the given memory pool size. In this way, there should be no need to rely on fine-grained resource management to control the network memory consumption.
> - Make shuffle services expose interfaces for calculating network memory requirements for given SSGs. In this way, the Flink runtime can specify the calculated network memory requirements for slots, without having to understand the internal details of different shuffle service implementations.
> For now, we propose to exclude network memory from ResourceProfile, so that the fine-grained resource management feature is not blocked by the network memory control issue. If needed, it can be added back in the future, as long as there is a good way to specify the requirement.
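
Illustrative sketch of the second direction above (shuffle services exposing an interface for calculating network memory requirements per SSG). This is not Flink code and not part of the issue: every name here (SlotSharingGroupInfo, NetworkMemoryCalculator, BufferPerChannelCalculator, slot_network_memory) is hypothetical, and the per-channel buffer formula is an assumption chosen only to make the example concrete. The point it tries to show is that the runtime would call a single method and never look at shuffle-service internals.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class SlotSharingGroupInfo:
    """Hypothetical summary of one slot sharing group (SSG)."""
    name: str
    num_input_channels: int   # total input channels of all tasks in the SSG
    num_subpartitions: int    # total result subpartitions of all tasks in the SSG


class NetworkMemoryCalculator(ABC):
    """Hypothetical interface that a shuffle service could expose."""

    @abstractmethod
    def required_network_memory_bytes(self, ssg: SlotSharingGroupInfo) -> int:
        """Return the network memory this shuffle service needs for the SSG."""


class BufferPerChannelCalculator(NetworkMemoryCalculator):
    """Toy implementation: a fixed number of buffers per channel or subpartition.

    The formula is an assumption for illustration, not any real service's rule.
    """

    def __init__(self, buffer_size_bytes: int, buffers_per_channel: int) -> None:
        self.buffer_size_bytes = buffer_size_bytes
        self.buffers_per_channel = buffers_per_channel

    def required_network_memory_bytes(self, ssg: SlotSharingGroupInfo) -> int:
        endpoints = ssg.num_input_channels + ssg.num_subpartitions
        return endpoints * self.buffers_per_channel * self.buffer_size_bytes


def slot_network_memory(
    calculator: NetworkMemoryCalculator,
    ssgs: Iterable[SlotSharingGroupInfo],
) -> Dict[str, int]:
    """Runtime side: ask the shuffle service for each SSG's requirement."""
    return {ssg.name: calculator.required_network_memory_bytes(ssg) for ssg in ssgs}
```

Under this sketch, switching to another shuffle service would mean supplying a different NetworkMemoryCalculator; the code that assigns requirements to slots would stay the same.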


