Posted to issues@flink.apache.org by "Jun Qin (Jira)" <ji...@apache.org> on 2020/07/27 08:19:00 UTC

[jira] [Comment Edited] (FLINK-18702) Flink elasticsearch connector leaks threads and classloaders thereof

    [ https://issues.apache.org/jira/browse/FLINK-18702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165530#comment-17165530 ] 

Jun Qin edited comment on FLINK-18702 at 7/27/20, 8:18 AM:
-----------------------------------------------------------

Thanks Yangze for the update.

Increasing Metaspace will only delay the OOM.

I've checked the heap dump: there were as many ChildFirstClassLoader instances in the heap as there were restarts. Those ChildFirstClassLoader instances could not be cleaned up because some threads (e.g., I/O dispatcher n, pool-n-thread-m) were still running. This likely has the same cause as [FLINK-13689|https://issues.apache.org/jira/browse/FLINK-13689]. I will give it another try and verify.
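
For reference, a minimal sketch of how one might look for such threads from inside the affected JVM instead of from a heap dump. The class LeakedThreadFinder and its method threadsPinning are hypothetical names, not part of Flink or the Elasticsearch client. The sketch only inspects each thread's context class loader, which is just one of the ways a running thread can keep a class loader reachable, and it matches on the class name suffix because the fully qualified name of ChildFirstClassLoader is not assumed here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic helper; not part of Flink or the Elasticsearch client. */
final class LeakedThreadFinder {

    /**
     * Returns the live threads whose context class loader has a class name
     * ending with the given suffix (for example "ChildFirstClassLoader").
     */
    static List<Thread> threadsPinning(String loaderNameSuffix) {
        List<Thread> result = new ArrayList<>();
        // getAllStackTraces() returns a snapshot of all live threads in this JVM.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            ClassLoader cl = t.getContextClassLoader();
            if (t.isAlive() && cl != null
                    && cl.getClass().getName().endsWith(loaderNameSuffix)) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumption: the suffix is passed on the command line or defaults
        // to the class name seen in the heap dump.
        String suffix = args.length > 0 ? args[0] : "ChildFirstClassLoader";
        for (Thread t : threadsPinning(suffix)) {
            System.out.println(t.getName() + " -> " + t.getContextClassLoader());
        }
    }
}
```

If threads such as I/O dispatcher n or pool-n-thread-m were reported after the job had already failed and restarted, that would be consistent with the heap dump observation above. It would not by itself prove which component started them.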


> Flink elasticsearch connector leaks threads and classloaders thereof
> --------------------------------------------------------------------
>
>                 Key: FLINK-18702
>                 URL: https://issues.apache.org/jira/browse/FLINK-18702
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / ElasticSearch
>    Affects Versions: 1.10.0, 1.10.1
>            Reporter: Jun Qin
>            Assignee: Jun Qin
>            Priority: Major
>
> The Flink Elasticsearch connector leaks threads and, with them, their classloaders. This results in a Metaspace OOM when the ES sink fails and is restarted many times.
> This issue is visible in Flink 1.10 but not in 1.11, because Flink 1.11 does not create new class loaders on recovery (FLINK-16408).
>  
> Reproduction:
>  * Start a job with ES sink in a Flink 1.10 cluster, without starting the ES cluster.