Posted to issues@flink.apache.org by "Stephan Ewen (JIRA)" <ji...@apache.org> on 2017/04/09 20:13:41 UTC

[jira] [Commented] (FLINK-4674) File Descriptors not being released after Completion of Flink Job Run via Flink Web Portal

    [ https://issues.apache.org/jira/browse/FLINK-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962252#comment-15962252 ] 

Stephan Ewen commented on FLINK-4674:
-------------------------------------

Sorry for the delayed reaction. Could you post some more information, like a listing of open file descriptors that you see?
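
One way to collect such a listing on Linux, as a minimal sketch: the class name FdListing is hypothetical and not part of Flink. It reads /proc/self/fd, which exists only on Linux, so it reports the descriptors of the JVM it runs in. For an already running TaskManager, `lsof -p <pid>` or `ls -l /proc/<pid>/fd` gives the same information from outside the process.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class FdListing {

    /** Returns one "fd -> target" line per open descriptor of this JVM (Linux only). */
    static List<String> listOpenFds() throws IOException {
        List<String> result = new ArrayList<>();
        // Assumption: a Linux /proc filesystem is mounted.
        Path fdDir = Paths.get("/proc/self/fd");
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(fdDir)) {
            for (Path fd : fds) {
                String target;
                try {
                    target = Files.readSymbolicLink(fd).toString();
                } catch (IOException e) {
                    // The descriptor may have been closed while we were iterating.
                    target = "(unreadable)";
                }
                result.add(fd.getFileName() + " -> " + target);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        listOpenFds().forEach(System.out::println);
    }
}
```

Comparing the output before and after a job run should show which files or sockets stay open.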

Most parts of the code take great care to close file descriptors, remove temp files, etc. In addition, the JVM releases the FDs when the file stream object is garbage collected.
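
To illustrate the difference between the two release paths mentioned above, here is a minimal sketch (class and method names are hypothetical, this is not Flink code). try-with-resources closes the descriptor deterministically, while a stream that is merely dropped keeps its descriptor open until the garbage collector gets to it, which can take a long time when the heap is under little pressure.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseSketch {

    /** Reads the first byte and releases the descriptor before returning. */
    static int firstByteClosed(Path file) throws IOException {
        try (InputStream in = new FileInputStream(file.toFile())) {
            return in.read();
        } // in.close() runs here, even if read() throws
    }

    /** Anti-pattern: the descriptor stays open until the stream object is collected. */
    static int firstByteLeaky(Path file) throws IOException {
        InputStream in = new FileInputStream(file.toFile());
        return in.read(); // no close(); release depends on GC timing
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("fd-sketch", ".txt");
        Files.write(tmp, new byte[] {42});
        System.out.println(firstByteClosed(tmp));
        System.out.println(firstByteLeaky(tmp));
        Files.delete(tmp);
    }
}
```

If a listing of open descriptors shows many entries for input files that were already fully read, a code path resembling the second method would be a candidate to look for.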

Would be good to understand what is happening here in detail.



> File Descriptors not being released after Completion of Flink Job Run via Flink Web Portal
> ------------------------------------------------------------------------------------------
>
>                 Key: FLINK-4674
>                 URL: https://issues.apache.org/jira/browse/FLINK-4674
>             Project: Flink
>          Issue Type: Bug
>          Components: Client, Distributed Coordination, JobManager
>    Affects Versions: 1.1.0, 1.0.1, 1.0.2, 1.0.3, 1.1.1, 1.1.2
>         Environment: RHEL6,7, UBUNTU
>            Reporter: Abey Sam Alex
>
> File descriptors used by the Flink TaskManager are not released even after a job completes.
> To release all file descriptors, we need to reboot the Flink cluster. As a result, jobs run successfully until the OS limit is hit, after which they keep failing - 
> Error on Flink - 
> java.io.IOException: Error opening the Input Split file:/data/Temp/RUN10_1000.csv [84950,1699]: /data/Temp/RUN10_1000.csv (Too many open files)
> 	at org.apache.flink.api.common.io.FileInputFormat.open(FileInputFormat.java:682)
> 	at org.apache.flink.api.common.io.DelimitedInputFormat.open(DelimitedInputFormat.java:411)
> 	at org.apache.flink.api.common.io.DelimitedInputFormat.open(DelimitedInputFormat.java:45)
> 	at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:147)
> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.FileNotFoundException: /data/Temp/RUN10_1000.csv (Too many open files)
> 	at java.io.FileInputStream.open0(Native Method)
> 	at java.io.FileInputStream.open(FileInputStream.java:195)
> 	at java.io.FileInputStream.<init>(FileInputStream.java:138)
> 	at org.apache.flink.core.fs.local.LocalDataInputStream.<init>(LocalDataInputStream.java:52)
> 	at org.apache.flink.core.fs.local.LocalFileSystem.open(LocalFileSystem.java:143)
> 	at org.apache.flink.api.common.io.FileInputFormat$InputSplitOpenThread.run(FileInputFormat.java:842)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)