Posted to issues@flink.apache.org by "Yun Gao (Jira)" <ji...@apache.org> on 2020/01/01 16:35:00 UTC

[jira] [Commented] (FLINK-15010) Temp directories flink-netty-shuffle-* are not cleaned up

    [ https://issues.apache.org/jira/browse/FLINK-15010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006448#comment-17006448 ] 

Yun Gao commented on FLINK-15010:
---------------------------------

The cause of this issue appears to be that, in standalone mode, TaskManagers are shut down by a SIGTERM signal and the cleanup of directories relies on shutdown hooks, but no shutdown hook is registered for the netty shuffle environment.

An intuitive fix is to add a shutdown hook directly to _NettyShuffleEnvironment_. However, this cannot ensure that the directories get cleaned up in all cases, since the directories are created in the constructor of _FileChannelManagerImpl_, which runs before the shutdown hook would be registered in the constructor of _NettyShuffleEnvironment_. If a TaskManager receives SIGTERM between the two actions, the directories will not be cleaned up. Therefore, the current PR enhances _FileChannelManagerImpl_ by allowing callers to specify whether to register a shutdown hook for the manager, and the hook is registered before the directories are created.
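
The ordering described above can be sketched roughly as follows. This is only a minimal illustration, not the code in the PR: the class _TempDirManager_, its constructor parameters, and its methods are hypothetical names and do not reflect the actual _FileChannelManagerImpl_ API. The point it shows is that the hook is registered first, and each path is recorded before the directory is created, so a SIGTERM at any point after registration still finds every directory that exists.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.stream.Stream;

/** Hypothetical sketch; NOT Flink's FileChannelManagerImpl. */
public class TempDirManager implements AutoCloseable {

    // Thread-safe list, because the shutdown hook reads it from another thread.
    private final List<Path> dirs = new CopyOnWriteArrayList<>();
    private final Thread shutdownHook;

    public TempDirManager(Path parent, String prefix, int count,
                          boolean registerShutdownHook) throws IOException {
        // Step 1: register the hook BEFORE any directory exists.
        if (registerShutdownHook) {
            shutdownHook = new Thread(this::deleteAll, "temp-dir-cleanup");
            Runtime.getRuntime().addShutdownHook(shutdownHook);
        } else {
            shutdownHook = null;
        }
        // Step 2: record each path first, then create it on disk.
        for (int i = 0; i < count; i++) {
            Path dir = parent.resolve(prefix + UUID.randomUUID());
            dirs.add(dir);
            Files.createDirectories(dir);
        }
    }

    public List<Path> getDirs() {
        return dirs;
    }

    // Best-effort recursive delete; tolerates paths that were never created.
    private void deleteAll() {
        for (Path dir : dirs) {
            if (!Files.exists(dir)) {
                continue;
            }
            try (Stream<Path> walk = Files.walk(dir)) {
                // Reverse order deletes children before their parents.
                walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                    try {
                        Files.deleteIfExists(p);
                    } catch (IOException e) {
                        // Ignored: cleanup is best effort.
                    }
                });
            } catch (IOException | UncheckedIOException e) {
                // Ignored: cleanup is best effort.
            }
        }
    }

    @Override
    public void close() {
        deleteAll();
        if (shutdownHook != null) {
            try {
                Runtime.getRuntime().removeShutdownHook(shutdownHook);
            } catch (IllegalStateException e) {
                // The JVM is already shutting down; the hook cannot be removed.
            }
        }
    }
}
```

In this sketch a caller such as the shuffle environment would pass _true_ for the hook flag, while a caller that already manages its own cleanup could pass _false_. Note that shutdown hooks run on SIGTERM but not on SIGKILL, so this covers only the graceful-stop case discussed here.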

Besides, the same issue also exists for the existing _FileChannelManagerImpl_ usage in _IOManager_. If the current fix is acceptable, we might also fix the _IOManager_ case in a similar way.

> Temp directories flink-netty-shuffle-* are not cleaned up
> ---------------------------------------------------------
>
>                 Key: FLINK-15010
>                 URL: https://issues.apache.org/jira/browse/FLINK-15010
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>    Affects Versions: 1.9.1
>            Reporter: Nico Kruber
>            Assignee: Yun Gao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Starting a Flink cluster with 2 TMs and stopping it again will leave 2 temporary directories (and not delete them): flink-netty-shuffle-<uid>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)