Posted to issues@flink.apache.org by "Prabhu Joseph (Jira)" <ji...@apache.org> on 2024/01/18 08:53:00 UTC

[jira] [Created] (FLINK-34142) TaskManager WorkingDirectory is not removed during shutdown

Prabhu Joseph created FLINK-34142:
-------------------------------------

             Summary: TaskManager WorkingDirectory is not removed during shutdown 
                 Key: FLINK-34142
                 URL: https://issues.apache.org/jira/browse/FLINK-34142
             Project: Flink
          Issue Type: Bug
          Components: Deployment / YARN
    Affects Versions: 1.17.1, 1.16.0
            Reporter: Prabhu Joseph


The TaskManager working directory is not removed when the TaskManager shuts down.

*Repro*

 
1. Execute a Flink batch job within a Flink on YARN session:
{code:java}
flink-yarn-session -d

flink run -d /usr/lib/flink/examples/batch/WordCount.jar --input s3://prabhuflinks3/INPUT --output s3://prabhuflinks3/OUT
{code}
The batch job completes successfully, but the TaskManager working directory is not removed:
{code:java}
[root@ip-1-2-3-4 container_1705470896818_0017_01_000002]# ls -R -lrt /mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002
/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002:
total 0
drwxr-xr-x 2 yarn yarn  6 Jan 18 08:34 tmp
drwxr-xr-x 4 yarn yarn 66 Jan 18 08:34 blobStorage
drwxr-xr-x 2 yarn yarn  6 Jan 18 08:34 slotAllocationSnapshots
drwxr-xr-x 2 yarn yarn  6 Jan 18 08:34 localState

/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002/tmp:
total 0

/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002/blobStorage:
total 0
drwxr-xr-x 2 yarn yarn 94 Jan 18 08:34 job_d11f7085314ef1fb04c4e12fe292185a
drwxr-xr-x 2 yarn yarn  6 Jan 18 08:34 incoming

/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002/blobStorage/job_d11f7085314ef1fb04c4e12fe292185a:
total 12
-rw-r--r-- 1 yarn yarn 10323 Jan 18 08:34 blob_p-cdd441a64b3ea6eed0058df02c6c10fd208c94a8-86d84864273dad1e8084d8ef0f5aad52

/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002/blobStorage/incoming:
total 0

/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002/slotAllocationSnapshots:
total 0

/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_000002/localState:
total 0


{code}
*Analysis*

1. The TaskManagerRunner removes the working directory only when its 'close' method is called, and that call never happens.
{code:java}
    public void close() throws Exception {
        try {
            closeAsync().get();
        } catch (ExecutionException e) {
            ExceptionUtils.rethrowException(ExceptionUtils.stripExecutionException(e));
        }
    }

    public CompletableFuture<Result> closeAsync() {
        return closeAsync(Result.SUCCESS);
    }
{code}
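To illustrate the pattern described above, here is a minimal, self-contained sketch. It is not Flink code: WorkingDirDemo, Owner, and the "tm_container_demo" prefix are hypothetical names chosen for illustration. It only shows that when directory cleanup lives solely in close(), any shutdown path that skips close() leaves the directory on disk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Illustrative only: hypothetical names, not Flink's TaskManagerRunner.
final class WorkingDirDemo {

    /** Owns a working directory and deletes it only inside close(). */
    static final class Owner implements AutoCloseable {
        final Path workingDir;

        Owner() throws IOException {
            workingDir = Files.createTempDirectory("tm_container_demo");
            Files.createDirectories(workingDir.resolve("blobStorage"));
            Files.createDirectories(workingDir.resolve("localState"));
        }

        @Override
        public void close() throws IOException {
            // Walk in reverse order so children are deleted before parents.
            try (Stream<Path> paths = Files.walk(workingDir)) {
                paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                    try {
                        Files.delete(p);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
            }
        }
    }

    /** Shutdown path that skips close(): reports whether the directory is left behind. */
    static boolean leftBehindWithoutClose() {
        try {
            Owner owner = new Owner();
            // close() has not been called at this point, mirroring the reported behavior.
            boolean leftBehind = Files.exists(owner.workingDir);
            owner.close(); // tidy up so the demo itself leaves nothing on disk
            return leftBehind;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Shutdown path that calls close(): reports whether the directory is left behind. */
    static boolean leftBehindWithClose() {
        try {
            Owner owner = new Owner();
            Path dir = owner.workingDir;
            owner.close();
            return Files.exists(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("left behind without close(): " + leftBehindWithoutClose());
        System.out.println("left behind with close():    " + leftBehindWithClose());
    }
}
```

Whether the actual fix should invoke close() from the shutdown path or remove the directory elsewhere is left open here; the sketch only demonstrates the dependency between cleanup and close().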
 


