Posted to issues@flink.apache.org by "Prabhu Joseph (Jira)" <ji...@apache.org> on 2024/01/18 08:53:00 UTC
[jira] [Created] (FLINK-34142) TaskManager WorkingDirectory is not removed during shutdown
Prabhu Joseph created FLINK-34142:
-------------------------------------
Summary: TaskManager WorkingDirectory is not removed during shutdown
Key: FLINK-34142
URL: https://issues.apache.org/jira/browse/FLINK-34142
Project: Flink
Issue Type: Bug
Components: Deployment / YARN
Affects Versions: 1.17.1, 1.16.0
Reporter: Prabhu Joseph
TaskManager WorkingDirectory is not removed during shutdown.
*Repro*
1. Execute a Flink batch job within a Flink on YARN session:
{code:bash}
flink-yarn-session -d
flink run -d /usr/lib/flink/examples/batch/WordCount.jar --input s3://prabhuflinks3/INPUT --output s3://prabhuflinks3/OUT
{code}
The batch job completes successfully, but the TaskManager working directory is not removed:
{code:java}
[root@ip-1-2-3-4 container_1705470896818_0017_01_000002]# ls -R -lrt /mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002
/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002:
total 0
drwxr-xr-x 2 yarn yarn 6 Jan 18 08:34 tmp
drwxr-xr-x 4 yarn yarn 66 Jan 18 08:34 blobStorage
drwxr-xr-x 2 yarn yarn 6 Jan 18 08:34 slotAllocationSnapshots
drwxr-xr-x 2 yarn yarn 6 Jan 18 08:34 localState
/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002/tmp:
total 0
/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002/blobStorage:
total 0
drwxr-xr-x 2 yarn yarn 94 Jan 18 08:34 job_d11f7085314ef1fb04c4e12fe292185a
drwxr-xr-x 2 yarn yarn 6 Jan 18 08:34 incoming
/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002/blobStorage/job_d11f7085314ef1fb04c4e12fe292185a:
total 12
-rw-r--r-- 1 yarn yarn 10323 Jan 18 08:34 blob_p-cdd441a64b3ea6eed0058df02c6c10fd208c94a8-86d84864273dad1e8084d8ef0f5aad52
/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002/blobStorage/incoming:
total 0
/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002/slotAllocationSnapshots:
total 0
/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002/localState:
total 0
{code}
*Analysis*
1. TaskManagerRunner removes the working directory only in its close() method, and close() is never called during shutdown.
{code:java}
public void close() throws Exception {
    try {
        closeAsync().get();
    } catch (ExecutionException e) {
        ExceptionUtils.rethrowException(ExceptionUtils.stripExecutionException(e));
    }
}

public CompletableFuture<Result> closeAsync() {
    return closeAsync(Result.SUCCESS);
}
{code}
{code}
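To illustrate the failure mode described in the analysis, here is a minimal standalone sketch, not Flink code. WorkingDirOwner and WorkingDirCleanupSketch are hypothetical names, and the directory layout only loosely mimics the listing above. It shows that when directory cleanup lives solely in close(), the directory is left behind on any code path that never calls close().

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical stand-in for a component that owns a working directory.
// This is NOT Flink's TaskManagerRunner; all names are illustrative.
final class WorkingDirOwner implements AutoCloseable {
    private final Path workingDir;

    WorkingDirOwner(Path workingDir) throws IOException {
        this.workingDir = workingDir;
        // Create a couple of subdirectories, loosely mimicking the listing above.
        Files.createDirectories(workingDir.resolve("tmp"));
        Files.createDirectories(workingDir.resolve("blobStorage").resolve("incoming"));
    }

    Path workingDir() {
        return workingDir;
    }

    // Cleanup lives only here, so it runs only if someone calls close().
    @Override
    public void close() throws IOException {
        deleteRecursively(workingDir);
    }

    // Deletes files first, then each directory once its contents are gone.
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;
                }
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }
}

final class WorkingDirCleanupSketch {
    // Path A: the owner is dropped without close(); the directory survives.
    static boolean survivesWithoutClose(Path base) throws IOException {
        WorkingDirOwner owner = new WorkingDirOwner(base.resolve("tm_no_close"));
        return Files.exists(owner.workingDir());
    }

    // Path B: try-with-resources guarantees close(); the directory is removed.
    static boolean survivesWithClose(Path base) throws IOException {
        Path dir;
        try (WorkingDirOwner owner = new WorkingDirOwner(base.resolve("tm_with_close"))) {
            dir = owner.workingDir();
        }
        return Files.exists(dir);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("wd-sketch");
        System.out.println("without close(): exists = " + survivesWithoutClose(base));
        System.out.println("with close():    exists = " + survivesWithClose(base));
        WorkingDirOwner.deleteRecursively(base);
    }
}
```

Running main reports that the directory still exists on the path that skips close() and is gone on the path that calls it. The sketch only demonstrates the general pattern. Which shutdown path in Flink should invoke close() is a separate question that it does not answer.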