Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2015/06/19 12:17:00 UTC

[jira] [Commented] (SPARK-5836) Highlight in Spark documentation that by default Spark does not delete its temporary files

    [ https://issues.apache.org/jira/browse/SPARK-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593293#comment-14593293 ] 

Apache Spark commented on SPARK-5836:
-------------------------------------

User 'srowen' has created a pull request for this issue:
https://github.com/apache/spark/pull/6901

> Highlight in Spark documentation that by default Spark does not delete its temporary files
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-5836
>                 URL: https://issues.apache.org/jira/browse/SPARK-5836
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation
>            Reporter: Tomasz Dudziak
>            Assignee: Ilya Ganelin
>            Priority: Minor
>             Fix For: 1.3.1, 1.4.0
>
>
> We recently learnt the hard way (in a prod system) that by default Spark does not delete its temporary files until it is stopped. Within a relatively short span of heavy Spark use, the disk of our prod machine filled up completely with shuffle files. There should be better documentation of the fact that a finished job leaves its temporary files behind, so that this does not come as a surprise.
> A good place to highlight this would be the documentation of the {{spark.local.dir}} property, which controls where Spark temporary files are written.
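For reference, a sketch of the relevant configuration. The property names below come from the Spark configuration and standalone-mode documentation; the paths and values are illustrative assumptions, not recommendations:

```
# spark-defaults.conf -- illustrative values only
# Scratch space for shuffle output and spilled data (a comma-separated
# list of directories is allowed). Note: on YARN and Mesos this setting
# is superseded by the cluster manager's own local-directory settings.
spark.local.dir                   /mnt/fast-disk/spark-tmp

# Standalone mode only: have workers periodically sweep old
# application work directories instead of keeping them forever.
spark.worker.cleanup.enabled      true
spark.worker.cleanup.interval     1800      # seconds between sweeps
spark.worker.cleanup.appDataTtl   604800    # keep app data for 7 days
```

Independently of these settings, applications should call SparkContext.stop() when finished, since Spark removes its temporary files on a clean shutdown.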



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org