Posted to issues@flink.apache.org by "Zakelly Lan (Jira)" <ji...@apache.org> on 2020/05/13 02:59:00 UTC

[jira] [Updated] (FLINK-17645) REAPER_THREAD in SafetyNetCloseableRegistry start() failed, causing the repeated failover.

     [ https://issues.apache.org/jira/browse/FLINK-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zakelly Lan updated FLINK-17645:
--------------------------------
    Description: 
I'm running a modified version of Flink and encountered the exception below when a task starts:
{code:java}
2020-05-12 00:46:19,037 ERROR [***] org.apache.flink.runtime.taskmanager.Task   - Encountered an unexpected exception
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:802)
        at org.apache.flink.core.fs.SafetyNetCloseableRegistry.<init>(SafetyNetCloseableRegistry.java:73)
        at org.apache.flink.core.fs.FileSystemSafetyNet.initializeSafetyNetForThread(FileSystemSafetyNet.java:89)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:586)
        at java.lang.Thread.run(Thread.java:834)
2020-05-12 00:46:19,038 INFO  [***] org.apache.flink.runtime.taskmanager.Task 
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:802)
        at org.apache.flink.core.fs.SafetyNetCloseableRegistry.<init>(SafetyNetCloseableRegistry.java:73)
        at org.apache.flink.core.fs.FileSystemSafetyNet.initializeSafetyNetForThread(FileSystemSafetyNet.java:89)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:586)
        at java.lang.Thread.run(Thread.java:834)
{code}
REAPER_THREAD.start() fails with the OutOfMemoryError above, but REAPER_THREAD has already been assigned by then and is never reset to null. From then on, every SafetyNetCloseableRegistry initialization in this JVM throws an IllegalStateException:
{code:java}
java.lang.IllegalStateException
	at org.apache.flink.util.Preconditions.checkState(Preconditions.java:179)
	at org.apache.flink.core.fs.SafetyNetCloseableRegistry.<init>(SafetyNetCloseableRegistry.java:71)
	at org.apache.flink.core.fs.FileSystemSafetyNet.initializeSafetyNetForThread(FileSystemSafetyNet.java:89)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:586)
	at java.lang.Thread.run(Thread.java:834){code}
This may happen in very old versions of Flink as well.
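
To make the failure mode concrete, here is a minimal sketch of the pattern described above. It is not Flink's actual code: the class ReaperStartSketch, its methods, and the injected "starter" (used only to simulate the failed Thread.start()) are all hypothetical names chosen for illustration. The "safe" variant shows one possible remedy, assuming the shared field can simply be assigned after start() returns; it is not necessarily the fix adopted upstream.

```java
import java.util.function.Consumer;

/** Simplified, hypothetical model of registries sharing one lazily started reaper thread. */
final class ReaperStartSketch {
    private static final Object LOCK = new Object();
    private static Thread reaper;      // stands in for REAPER_THREAD
    private static int registryCount;  // number of successfully opened registries

    /** Simulates Thread.start() failing because no native thread can be created. */
    static final Consumer<Thread> FAILING_START = t -> {
        throw new OutOfMemoryError("simulated: unable to create new native thread");
    };

    /** Fragile variant: the shared field is set before start() is known to succeed. */
    static void openFragile(Consumer<Thread> starter) {
        synchronized (LOCK) {
            if (registryCount == 0) {
                if (reaper != null) {
                    throw new IllegalStateException("reaper thread already set");
                }
                reaper = new Thread(() -> { });
                starter.accept(reaper); // if this throws, reaper stays non-null
            }
            registryCount++;
        }
    }

    /** Safer variant: the shared field is set only after start() returned normally. */
    static void openSafe(Consumer<Thread> starter) {
        synchronized (LOCK) {
            if (registryCount == 0) {
                if (reaper != null) {
                    throw new IllegalStateException("reaper thread already set");
                }
                Thread candidate = new Thread(() -> { });
                starter.accept(candidate); // if this throws, reaper stays null
                reaper = candidate;
            }
            registryCount++;
        }
    }

    /** Clears the shared state so the two variants can be compared independently. */
    static void reset() {
        synchronized (LOCK) {
            reaper = null;
            registryCount = 0;
        }
    }

    /** Runs the action and reports "ok" or the simple class name of what it threw. */
    static String attempt(Runnable action) {
        try {
            action.run();
            return "ok";
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(attempt(() -> openFragile(FAILING_START))); // OutOfMemoryError
        System.out.println(attempt(() -> openFragile(Thread::start))); // IllegalStateException
        reset();
        System.out.println(attempt(() -> openSafe(FAILING_START)));    // OutOfMemoryError
        System.out.println(attempt(() -> openSafe(Thread::start)));    // ok
    }
}
```

In the fragile variant a single failed start leaves the shared state inconsistent (thread set, count still zero), so every later attempt hits the state check, which matches the repeated IllegalStateException above. In the safer variant a later attempt can retry once threads are available again.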


> REAPER_THREAD in SafetyNetCloseableRegistry start() failed, causing the repeated failover.
> ------------------------------------------------------------------------------------------
>
>                 Key: FLINK-17645
>                 URL: https://issues.apache.org/jira/browse/FLINK-17645
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Task
>    Affects Versions: 1.6.3
>            Reporter: Zakelly Lan
>            Priority: Major


