Posted to issues@spark.apache.org by "muhong (Jira)" <ji...@apache.org> on 2021/12/14 07:09:00 UTC

[jira] [Updated] (SPARK-37639) spark history server clean event log directory with out check status file

     [ https://issues.apache.org/jira/browse/SPARK-37639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

muhong updated SPARK-37639:
---------------------------
    Description: 
I found a problem. The Thrift server creates its event log at startup (the .inprogress file is created at init), and the history server cleans application event logs according to size and modification time. This combination can cause a problem in the following situation:

*If the Thrift server receives no requests for a long time (longer than the period configured by spark.history.fs.cleaner.maxAge), the history server cleans the application's event log directory, including the .inprogress status file. If the Thrift server then receives many requests, it generates a new event log directory without the .inprogress status file, and the history server never cleans that directory because it does not contain a status file. This leads to a space leak.*

I think that whenever a new log file is created, the writer should check whether the status file exists and create it if it does not.
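The check proposed above could look roughly like the following sketch. It is not Spark's writer code: the function name `open_next_event_log` is hypothetical, and the file names (`appstatus_<appId>.inprogress`, `events_<index>_<appId>`) are assumptions modeled on the rolling event log layout. The only point is that the status file is re-created whenever it is missing, before the next log file is opened.

```python
from pathlib import Path


def open_next_event_log(log_dir: Path, app_id: str, index: int) -> Path:
    """Create the next event log file, restoring the status file if needed.

    Hypothetical helper: names and layout are assumptions, not Spark's API.
    """
    # The cleaner may have deleted the whole directory while the app was idle.
    log_dir.mkdir(parents=True, exist_ok=True)

    # Re-create the in-progress status file if it is missing, so the
    # directory is still recognizable as an event log directory.
    status_file = log_dir / f"appstatus_{app_id}.inprogress"
    if not status_file.exists():
        status_file.touch()

    # Only then create the new event log file itself.
    event_file = log_dir / f"events_{index}_{app_id}"
    event_file.touch()
    return event_file
```

With a check like this, a directory re-created after cleaning would again contain a status file, so a later cleaner run could recognize it.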

Finally, I think an extra feature is needed: as with log4j, the compacted files should still be cleaned up after a user-configured period, so that the event log space of long-running Spark services such as the Thrift server can be limited to a configured size.
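As an illustration of the retention idea above, here is a minimal sketch that removes compacted files older than a configured age and then trims the oldest remaining ones until their total size fits a cap. Everything in it is an assumption for illustration: the function `prune_compacted_logs`, its parameters, and the `*.compact` file suffix are hypothetical and do not correspond to an existing Spark setting.

```python
import time
from pathlib import Path
from typing import List, Optional


def prune_compacted_logs(
    log_dir: Path,
    max_age_seconds: float,
    max_total_bytes: int,
    now: Optional[float] = None,
) -> List[Path]:
    """Delete compacted files that are too old or exceed the size cap.

    Hypothetical helper; returns the removed paths, oldest first.
    """
    now = time.time() if now is None else now
    # Oldest files first, so both passes remove the oldest data first.
    compact_files = sorted(
        log_dir.glob("*.compact"), key=lambda p: p.stat().st_mtime
    )

    removed: List[Path] = []
    kept: List[Path] = []

    # Pass 1: drop files older than the configured period.
    for path in compact_files:
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
        else:
            kept.append(path)

    # Pass 2: drop the oldest remaining files until the size cap is met.
    total = sum(p.stat().st_size for p in kept)
    for path in kept:
        if total <= max_total_bytes:
            break
        total -= path.stat().st_size
        path.unlink()
        removed.append(path)

    return removed
```

The size cap in this sketch covers only the compacted files; a real implementation would also have to account for the live, uncompacted event log files.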

> spark history server clean event log directory with out check status file
> -------------------------------------------------------------------------
>
>                 Key: SPARK-37639
>                 URL: https://issues.apache.org/jira/browse/SPARK-37639
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.1.1
>            Reporter: muhong
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org