Posted to user@flink.apache.org by Derek VerLee <de...@gmail.com> on 2019/03/07 21:45:37 UTC

local disk cleanup after crash

I believe effort has been put into having task managers clean up their
folders. However, I have noticed that in some cases local folders are not
cleaned up and can build up, eventually causing problems due to a full disk.
As far as I know, this only happens after crashes and in other
out-of-happy-path scenarios.

I am thinking of writing a script that cleans up the local folders and runs
before the task manager is started again after a crash.

Assuming local recovery is not configured, what should I delete and what
should I leave around?  

What should I keep if local recovery is configured?

  

Under the directories configured in "taskmanager.tmp.dirs" I see:

blobStore-*  
flink-dist-cache-*  
flink-io-*  
localState/*  
rocksdb-lib-*

  

Thanks  


Re: local disk cleanup after crash

Posted by Gary Yao <ga...@ververica.com>.
Hi,

If no other TaskManager (TM) is running, you can delete everything. If
multiple TMs share the same host, as far as I know, you will have to parse TM
logs to know what directories you can delete [1]. As for local recovery, tasks
that were running on a crashed TM are lost. From the documentation [2]:

    If a task manager is lost, the local state from all its task is lost.

Therefore, assuming that only one TM is running on each host, you can delete
everything.
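
For illustration, a pre-start cleanup along these lines could be sketched in
Python as below. This is only a minimal sketch, not an official Flink tool.
It assumes that exactly one TM uses the directory, that the TM is stopped
while the script runs, and that leftovers sit directly under the configured
temp directory. The function name clean_tmp_dir and the dry_run flag are
made up for this example. The entry names are the ones listed in the
question; restricting deletion to them is a conservative choice for the
sketch, not something Flink requires.

```python
import fnmatch
import shutil
from pathlib import Path

# Entry names taken from the listing in the original question.
# Assumption: leftovers appear directly under the temp directory.
PATTERNS = [
    "blobStore-*",
    "flink-dist-cache-*",
    "flink-io-*",
    "localState",
    "rocksdb-lib-*",
]


def clean_tmp_dir(tmp_dir, patterns=PATTERNS, dry_run=True):
    """Delete matching entries under tmp_dir and return the matched paths.

    Hypothetical helper. Only call it while no TaskManager on this host
    is using tmp_dir. With dry_run=True nothing is deleted.
    """
    root = Path(tmp_dir)
    matched = []
    if not root.is_dir():
        return matched
    for entry in sorted(root.iterdir()):
        if not any(fnmatch.fnmatchcase(entry.name, p) for p in patterns):
            continue  # leave unrelated files and folders alone
        matched.append(entry)
        if dry_run:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry, ignore_errors=True)
        else:
            entry.unlink()
    return matched
```

Calling clean_tmp_dir(path) with the default dry_run=True only reports what
would be removed; passing dry_run=False performs the deletion. The path
should be whatever "taskmanager.tmp.dirs" points to on that host. If several
TMs share the host, this sketch is not safe to use as is, for the reason
given above.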

Best,
Gary

[1]
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/What-are-blobstore-files-and-why-do-they-keep-filling-up-tmp-directory-td26323.html
[2]
https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/large_state_tuning.html#task-local-recovery
