Posted to issues@spark.apache.org by "Bjorn Jonsson (JIRA)" <ji...@apache.org> on 2016/08/18 00:28:20 UTC

[jira] [Created] (SPARK-17119) Add configuration property to allow the history server to delete .inprogress files

Bjorn Jonsson created SPARK-17119:
-------------------------------------

             Summary: Add configuration property to allow the history server to delete .inprogress files
                 Key: SPARK-17119
                 URL: https://issues.apache.org/jira/browse/SPARK-17119
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 2.0.0
            Reporter: Bjorn Jonsson
            Priority: Minor


The History Server (HS) currently considers only completed applications when deleting event logs from spark.history.fs.logDirectory (since SPARK-6879). As a result, .inprogress files (from failed jobs, jobs where the SparkContext is never closed, spark-shell exits, etc.) can accumulate over time and impact the HS.

Instead of having to delete these files manually, maybe users could have the option of telling the HS to delete all files where (now - attempt.lastUpdated) > spark.history.fs.cleaner.maxAge, or to delete only .inprogress files whose lastUpdated is older than 7d?

https://github.com/apache/spark/blob/d6dc12ef0146ae409834c78737c116050961f350/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala#L467
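
To make the proposed rule concrete, here is a minimal sketch of the selection logic in Python. It is not the FsHistoryProvider code and not a proposed patch: EventLog, select_expired and the include_in_progress flag are hypothetical names used only for illustration, and the sketch assumes each log carries a lastUpdated timestamp in milliseconds.

```python
from dataclasses import dataclass
from typing import List

# Suffix carried by event logs of applications that never finished cleanly.
IN_PROGRESS_SUFFIX = ".inprogress"


@dataclass
class EventLog:
    """Hypothetical stand-in for one event log entry seen by the cleaner."""
    path: str
    last_updated_ms: int

    @property
    def in_progress(self) -> bool:
        return self.path.endswith(IN_PROGRESS_SUFFIX)


def select_expired(logs: List[EventLog],
                   now_ms: int,
                   max_age_ms: int,
                   include_in_progress: bool) -> List[EventLog]:
    """Return the logs the cleaner would delete.

    include_in_progress is a hypothetical switch standing in for the
    configuration property requested in this issue. When it is False,
    .inprogress files are skipped, which mirrors the behaviour
    described above.
    """
    expired = []
    for log in logs:
        if log.in_progress and not include_in_progress:
            continue
        # The rule from the issue: (now - lastUpdated) > maxAge
        if now_ms - log.last_updated_ms > max_age_ms:
            expired.append(log)
    return expired
```

With the switch off, only completed logs older than the maximum age are selected; with it on, stale .inprogress files are selected by the same age check, while recently updated ones (which may belong to running applications) are left alone.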




