Posted to issues@flink.apache.org by "Zhu Zhu (Jira)" <ji...@apache.org> on 2020/05/21 02:25:00 UTC

[jira] [Closed] (FLINK-17645) REAPER_THREAD.start() in SafetyNetCloseableRegistry failed, causing the repeated failover.

     [ https://issues.apache.org/jira/browse/FLINK-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu closed FLINK-17645.
---------------------------
    Resolution: Fixed

Fixed via

master
9f3a71183ea4b14a396ecf66e4377da07b06a689

release-1.11
d40826d23c8993f46270922a40fe23379b3e166e

> REAPER_THREAD.start() in SafetyNetCloseableRegistry failed, causing the repeated failover.
> ------------------------------------------------------------------------------------------
>
>                 Key: FLINK-17645
>                 URL: https://issues.apache.org/jira/browse/FLINK-17645
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Task
>    Affects Versions: 1.10.1, 1.11.0, 1.12.0
>            Reporter: Zakelly Lan
>            Assignee: Lijie Wang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.11.0, 1.12.0
>
>
> I'm running a modified version of Flink and encountered the exception below when a task starts:
> {code:java}
> 2020-05-12 00:46:19,037 ERROR [***] org.apache.flink.runtime.taskmanager.Task   - Encountered an unexpected exception
> java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:802)
>         at org.apache.flink.core.fs.SafetyNetCloseableRegistry.<init>(SafetyNetCloseableRegistry.java:73)
>         at org.apache.flink.core.fs.FileSystemSafetyNet.initializeSafetyNetForThread(FileSystemSafetyNet.java:89)
>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:586)
>         at java.lang.Thread.run(Thread.java:834)
> 2020-05-12 00:46:19,038 INFO  [***] org.apache.flink.runtime.taskmanager.Task 
> java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:802)
>         at org.apache.flink.core.fs.SafetyNetCloseableRegistry.<init>(SafetyNetCloseableRegistry.java:73)
>         at org.apache.flink.core.fs.FileSystemSafetyNet.initializeSafetyNetForThread(FileSystemSafetyNet.java:89)
>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:586)
>         at java.lang.Thread.run(Thread.java:834)
> {code}
> REAPER_THREAD.start() fails with the OutOfMemoryError, but REAPER_THREAD has already been assigned and is never reset to null. From then on, every SafetyNetCloseableRegistry initialization in this JVM fails with an IllegalStateException:
> {code:java}
> java.lang.IllegalStateException
> 	at org.apache.flink.util.Preconditions.checkState(Preconditions.java:179)
> 	at org.apache.flink.core.fs.SafetyNetCloseableRegistry.<init>(SafetyNetCloseableRegistry.java:71)
> 	at org.apache.flink.core.fs.FileSystemSafetyNet.initializeSafetyNetForThread(FileSystemSafetyNet.java:89)
> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:586)
> 	at java.lang.Thread.run(Thread.java:834){code}
> This may happen in very old versions of Flink as well.
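
The failure mode described above can be reproduced in isolation. The following is only a minimal sketch under stated assumptions, not Flink's actual code: the class ReaperStartSketch, its methods openUnsafe/openSafe/reset/attempt, and the FailingThread helper are all hypothetical names made up for illustration, and the OutOfMemoryError is simulated by overriding Thread.start(). openUnsafe mimics the reported pattern (the static field is assigned, then start() throws, and the field stays non-null). openSafe shows one possible remedy, clearing the field when start() fails; whether the committed fix takes exactly this shape is not stated in this message.

```java
import java.util.function.Supplier;

/** Hypothetical, simplified stand-in for the reaper bookkeeping; not Flink's actual code. */
public class ReaperStartSketch {

    private static final Object LOCK = new Object();
    private static Thread reaperThread;   // plays the role of REAPER_THREAD
    private static int registryCount;     // number of open registries

    /** Simulates "unable to create new native thread" without exhausting real threads. */
    static final class FailingThread extends Thread {
        @Override
        public void start() {
            throw new OutOfMemoryError("unable to create new native thread (simulated)");
        }
    }

    /** Reported pattern: the field is assigned before start() and never cleared on failure. */
    static void openUnsafe(Supplier<Thread> threadSupplier) {
        synchronized (LOCK) {
            if (registryCount == 0) {
                if (reaperThread != null) {
                    throw new IllegalStateException();
                }
                reaperThread = threadSupplier.get();
                reaperThread.start(); // if this throws, reaperThread stays non-null
            }
            registryCount++;
        }
    }

    /** One possible remedy: reset the field if start() fails, so a later attempt can retry. */
    static void openSafe(Supplier<Thread> threadSupplier) {
        synchronized (LOCK) {
            if (registryCount == 0) {
                if (reaperThread != null) {
                    throw new IllegalStateException();
                }
                try {
                    reaperThread = threadSupplier.get();
                    reaperThread.start();
                } catch (RuntimeException | Error e) {
                    reaperThread = null; // undo the assignment before propagating
                    throw e;
                }
            }
            registryCount++;
        }
    }

    /** Clears the static state between experiments. */
    static void reset() {
        synchronized (LOCK) {
            reaperThread = null;
            registryCount = 0;
        }
    }

    /** Runs the action and reports "ok" or the simple class name of what it threw. */
    static String attempt(Runnable action) {
        try {
            action.run();
            return "ok";
        } catch (RuntimeException | Error e) {
            return e.getClass().getSimpleName();
        }
    }

    /** A thread that starts normally and exits immediately. */
    static Thread workingThread() {
        Thread t = new Thread(() -> { });
        t.setDaemon(true);
        return t;
    }

    public static void main(String[] args) {
        reset();
        System.out.println("unsafe, failing start: " + attempt(() -> openUnsafe(FailingThread::new)));
        System.out.println("unsafe, next attempt:  " + attempt(() -> openUnsafe(ReaperStartSketch::workingThread)));

        reset();
        System.out.println("safe, failing start:   " + attempt(() -> openSafe(FailingThread::new)));
        System.out.println("safe, next attempt:    " + attempt(() -> openSafe(ReaperStartSketch::workingThread)));
    }
}
```

In the unsafe variant the second attempt hits the state check even though thread creation would now succeed, which matches the repeated failover in the report. In the safe variant the second attempt can start the thread normally.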



--
This message was sent by Atlassian Jira
(v8.3.4#803005)