Posted to issues@flink.apache.org by "Till Rohrmann (Jira)" <ji...@apache.org> on 2020/01/27 15:00:17 UTC

[jira] [Commented] (FLINK-15758) Investigate potential out-of-memory problems due to managed unsafe memory allocation

    [ https://issues.apache.org/jira/browse/FLINK-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024400#comment-17024400 ] 

Till Rohrmann commented on FLINK-15758:
---------------------------------------

As https://issues.apache.org/jira/browse/FLINK-14894?focusedCommentId=17024394&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17024394 says, there is indeed a problem with higher memory pressure if the fix for FLINK-14894 is applied. We have reverted this fix for now and need to fix the problem properly by monitoring the unsafe memory usage and triggering a cleanup if required.

> Investigate potential out-of-memory problems due to managed unsafe memory allocation
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-15758
>                 URL: https://issues.apache.org/jira/browse/FLINK-15758
>             Project: Flink
>          Issue Type: Task
>          Components: Runtime / Task
>            Reporter: Andrey Zagrebin
>            Priority: Major
>             Fix For: 1.11.0
>
>
> After FLINK-13985, managed memory is allocated from UNSAFE, not as direct nio buffers as before 1.10.
> After FLINK-14894, this memory is released only when all Java handles of the unsafe memory are about to be GC'ed. This is similar to how direct nio buffers behaved before 1.10, but the unsafe memory is not tracked by the direct memory limit (-XX:MaxDirectMemorySize). The potential downside is that over-allocation of unsafe memory will not hit the direct limit and therefore will not trigger a GC immediately, even though GC is the only way to release it. In this case, it can cause out-of-memory failures without triggering a GC that would release a lot of potentially already unused memory.
> If we verify that the delayed release is a problem, we can investigate further optimisations, such as:
>  * directly monitoring the phantom reference queue of the cleaner (if the JVM detects quickly that there are no more references to the memory) and explicitly releasing memory that is ready for GC as soon as possible, e.g. after Task exit
>  * monitoring the amount of allocated memory and blocking allocation until GC releases the occupied memory, instead of failing with out-of-memory immediately
> cc [~sewen] [~trohrmann]
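
The second option above (monitor the allocated amount and wait for GC instead of failing immediately) could be sketched roughly as follows. This is only an illustration under stated assumptions, not Flink's actual implementation: the class name UnsafeMemoryBudget, its methods, and the retry count are hypothetical, and release() is assumed to be called by whatever cleaner frees the unsafe memory. Note that System.gc() is only a hint to the JVM and has no effect when -XX:+DisableExplicitGC is set.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical budget tracker for unsafe memory; all names here are assumptions. */
final class UnsafeMemoryBudget {
    private static final int MAX_RETRIES = 5; // assumed value, tune as needed

    private final long totalBytes;
    private final AtomicLong availableBytes;

    UnsafeMemoryBudget(long totalBytes) {
        if (totalBytes < 0) {
            throw new IllegalArgumentException("totalBytes must be non-negative");
        }
        this.totalBytes = totalBytes;
        this.availableBytes = new AtomicLong(totalBytes);
    }

    long available() {
        return availableBytes.get();
    }

    /** Assumed to be called by the cleaner once the memory has actually been freed. */
    void release(long bytes) {
        availableBytes.addAndGet(bytes);
    }

    /**
     * Reserves bytes before the actual unsafe allocation. If the budget is
     * exhausted, suggests a GC and retries with exponential backoff so that
     * pending cleaners get a chance to call release().
     */
    void reserve(long bytes) {
        if (bytes < 0 || bytes > totalBytes) {
            throw new IllegalArgumentException("invalid reservation size: " + bytes);
        }
        long sleepMillis = 1;
        for (int attempt = 0; attempt <= MAX_RETRIES; attempt++) {
            if (tryReserve(bytes)) {
                return;
            }
            if (attempt == MAX_RETRIES) {
                break; // give up without a further GC request
            }
            System.gc(); // a hint only; the JVM may ignore it
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                break;
            }
            sleepMillis *= 2;
        }
        throw new OutOfMemoryError(
            "Could not reserve " + bytes + " bytes of unsafe memory, available: " + available());
    }

    /** Lock-free attempt to subtract bytes from the remaining budget. */
    private boolean tryReserve(long bytes) {
        while (true) {
            long current = availableBytes.get();
            if (current < bytes) {
                return false;
            }
            if (availableBytes.compareAndSet(current, current - bytes)) {
                return true;
            }
        }
    }
}
```

The design mirrors, in spirit, how the JDK accounts for direct buffers against -XX:MaxDirectMemorySize: the reservation is checked against an explicit budget first, so exhaustion shows up as a GC request followed by a bounded wait rather than as an immediate failure.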



--
This message was sent by Atlassian Jira
(v8.3.4#803005)