Posted to issues@flink.apache.org by "Eron Wright (JIRA)" <ji...@apache.org> on 2018/05/23 23:47:00 UTC

[jira] [Updated] (FLINK-8622) flink-mesos: High memory usage of scheduler + job manager. GC never kicks in.

     [ https://issues.apache.org/jira/browse/FLINK-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eron Wright  updated FLINK-8622:
--------------------------------
    Fix Version/s:     (was: 1.5.0)

> flink-mesos: High memory usage of scheduler + job manager. GC never kicks in.
> -----------------------------------------------------------------------------
>
>                 Key: FLINK-8622
>                 URL: https://issues.apache.org/jira/browse/FLINK-8622
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination, Mesos, ResourceManager
>    Affects Versions: 1.4.0, 1.3.2
>            Reporter: Bhumika Bayani
>            Priority: Major
>         Attachments: flink-mem-usage-graph-for-jira.png
>
>
> We are deploying a Flink cluster with 1 jobmanager and 6 taskmanagers on Mesos.
> We have observed that the memory usage of the jobmanager is high. Even as we allocate more and more memory to it, it hits the limit within minutes.
> We started with 1.5 GB RAM and 1 GB heap. Currently we have allocated 4 GB RAM and 3 GB heap to the jobmanager, which also acts as the scheduler. We also tried allocating 8 GB RAM with the same 3 GB heap. In that case too, the memory graph was identical.
> As the graph below shows, the scheduler almost always runs at its maximum memory allocation.
> !flink-mem-usage-graph-for-jira.png!
>  
> Throughout the run of the scheduler, we do not see memory usage go down unless the process is killed due to OOM. From this we infer that garbage collection is never happening.
> We have tried both Flink 1.4 and 1.3 and see the same issue on both versions.
>  
> Is there any way we can find out where and how memory is being used? 
> Are there any Flink config options for the jobmanager, or JVM parameters, that can help us restrict the memory usage, force garbage collection, and prevent it from crashing?
> Please let us know if there are any resource recommendations from Flink for running Flink on Mesos at scale.
>  
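
On the question of finding out where and how memory is being used: a minimal sketch, assuming code can be run inside the JVM in question (or the same MXBeans can be read remotely over JMX). It uses only the standard java.lang.management API. The class name JvmMemoryReport and its methods are hypothetical and not part of Flink. It prints heap and non-heap usage plus per-collector GC counts, which may help check whether garbage collection is really not running; a flat process-level memory graph does not necessarily show that, since the JVM can keep freed heap committed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of Flink: summarizes memory and GC state of the current JVM.
public class JvmMemoryReport {

    private static final long MB = 1024L * 1024L;

    // Sums collection counts over all collectors.
    // A collector reports -1 when its count is undefined, so negatives are treated as 0.
    public static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0L, gc.getCollectionCount());
        }
        return total;
    }

    // Formats one memory pool summary; getMax() returns -1 when no maximum is defined.
    private static String format(String label, MemoryUsage u) {
        long max = u.getMax() < 0 ? -1L : u.getMax() / MB;
        return String.format("%s: used=%d MB, committed=%d MB, max=%d MB%n",
                label, u.getUsed() / MB, u.getCommitted() / MB, max);
    }

    // Builds a text report of heap usage, non-heap usage, and per-collector GC activity.
    public static String report() {
        StringBuilder sb = new StringBuilder();
        sb.append(format("heap", ManagementFactory.getMemoryMXBean().getHeapMemoryUsage()));
        sb.append(format("non-heap", ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage()));
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(String.format("gc %s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

If the collection counts keep increasing while used heap stays near the maximum, GC is running but freeing little, which would point toward retained objects rather than a missing GC. A heap dump would then be the next thing to look at.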



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)