Posted to issues@flink.apache.org by "Yun Gao (Jira)" <ji...@apache.org> on 2022/04/13 06:28:05 UTC

[jira] [Updated] (FLINK-24122) Add support to do clean in history server

     [ https://issues.apache.org/jira/browse/FLINK-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yun Gao updated FLINK-24122:
----------------------------
    Fix Version/s: 1.16.0

> Add support to do clean in history server
> -----------------------------------------
>
>                 Key: FLINK-24122
>                 URL: https://issues.apache.org/jira/browse/FLINK-24122
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / REST
>            Reporter: zlzhang0122
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.0, 1.16.0
>
>
> Currently, the history server can clean up archived jobs in two ways:
>  # If users have configured
> {code:java}
> historyserver.archive.clean-expired-jobs: true{code}
> then the history server compares the archive files in HDFS between two consecutive cleanup intervals, finds the ones that were deleted, and removes the corresponding local cache files.
>  # If users have set
> {code:java}
> historyserver.archive.retained-jobs:{code}
> to a positive number, then the history server deletes the oldest files, both in HDFS and in the local cache.
> But the retained-jobs number is difficult to determine.
> For example, users may want to check yesterday's history jobs, but if many jobs fail today and exceed the retained-jobs number, yesterday's history jobs will be deleted. So what about adding a configuration option that specifies a retention time, i.e. the maximum time a history job is retained?
> Also, the history server can't clean up job history files that are no longer in HDFS but are still cached in the local filesystem. These files are stored forever unless users delete them manually. Maybe we can add an option and perform this cleanup when the option is set to true.
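
The two proposals in the description could be sketched roughly as follows. This is only an illustration of the selection logic, not Flink's implementation: the class `HistoryCleanupSketch`, both method names, and the idea of passing archive modification times as a map are assumptions made for this example, and no new configuration key is implied.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch; none of these names exist in Flink. */
final class HistoryCleanupSketch {

    private HistoryCleanupSketch() {}

    /**
     * Time-based retention: returns the IDs of all archives whose
     * modification time is older than (now - retention), sorted by ID.
     */
    static List<String> expiredArchives(
            Map<String, Instant> archiveModificationTimes,
            Duration retention,
            Instant now) {
        Instant cutoff = now.minus(retention);
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Instant> entry : archiveModificationTimes.entrySet()) {
            if (entry.getValue().isBefore(cutoff)) {
                expired.add(entry.getKey());
            }
        }
        Collections.sort(expired);
        return expired;
    }

    /**
     * Orphan cleanup: returns the job IDs that are still cached locally
     * but no longer have an archive in the remote directory.
     */
    static Set<String> orphanedLocalJobs(
            Set<String> remoteJobIds, Set<String> locallyCachedJobIds) {
        Set<String> orphaned = new TreeSet<>(locallyCachedJobIds);
        orphaned.removeAll(remoteJobIds);
        return orphaned;
    }
}
```

With a retention time of one day, an archive last modified three days ago would be selected for deletion regardless of how many jobs were archived since, which is the case the retained-jobs count handles poorly.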



--
This message was sent by Atlassian Jira
(v8.20.1#820001)