Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/07/16 15:41:00 UTC
[jira] [Updated] (FLINK-13245) Network stack is leaking files
[ https://issues.apache.org/jira/browse/FLINK-13245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated FLINK-13245:
-----------------------------------
Labels: pull-request-available (was: )
> Network stack is leaking files
> ------------------------------
>
> Key: FLINK-13245
> URL: https://issues.apache.org/jira/browse/FLINK-13245
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Network
> Affects Versions: 1.9.0
> Reporter: Chesnay Schepler
> Assignee: zhijiang
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 1.9.0
>
>
> There's a file leak in the network stack / shuffle service.
> When running the {{SlotCountExceedingParallelismTest}} on Windows, a large number of {{.channel}} files remain in a {{flink-netty-shuffle-XXX}} directory.
> From what I've gathered so far, these files are still being used by a {{BoundedBlockingSubpartition}}. The cleanup logic in this class uses ref-counting to ensure we don't release data while a reader is still present. However, at the end of the job this count has not reached 0, and thus nothing is being released.
> The same issue is also present on the {{ResultPartition}} level; {{ReleaseOnConsumptionResultPartition}} instances are also being released while their ref-count is greater than 0.
> Overall, it appears that there is some issue with the notifications for partitions being consumed.
> It is possible that this issue has recently caused problems on Travis, where builds were failing due to a lack of disk space.
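
The ref-counted cleanup described above can be sketched as follows. This is a hypothetical illustration only, not the actual code of {{BoundedBlockingSubpartition}}: the class name RefCountedChannelFile and its methods are invented for this example. It shows why a file survives when a "reader finished" notification never arrives: deletion only happens once the owner has released the file and the reader count is back at 0.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Hypothetical sketch of ref-counted file cleanup.
 * Not Flink's implementation; all names here are assumptions.
 */
final class RefCountedChannelFile {
    private final Path file;
    private int readers;      // number of readers currently attached
    private boolean released; // true once the owner has requested release

    RefCountedChannelFile(Path file) {
        this.file = file;
    }

    /** Registers a reader; the file must not be deleted while it is attached. */
    synchronized void addReader() {
        if (released) {
            throw new IllegalStateException("already released");
        }
        readers++;
    }

    /** Must be called once per reader; a missed call keeps the file alive. */
    synchronized void readerFinished() throws IOException {
        if (readers == 0) {
            throw new IllegalStateException("no reader registered");
        }
        readers--;
        deleteIfUnused();
    }

    /** Owner-side release; the file is deleted only if no reader remains. */
    synchronized void release() throws IOException {
        released = true;
        deleteIfUnused();
    }

    private void deleteIfUnused() throws IOException {
        if (released && readers == 0) {
            Files.deleteIfExists(file);
        }
    }
}
```

In this sketch, calling release() while a reader is still registered leaves the file on disk until readerFinished() is called. If that notification is lost, the count never reaches 0 and the file is never deleted, which matches the symptom reported here.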
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)