Posted to user@spark.apache.org by Jens Rantil <je...@tink.se> on 2015/10/08 12:43:06 UTC

Best practices to clean up RDDs for old applications

Hi,

I have a couple of old application RDD directories under /var/lib/spark/rdd
whose applications haven't properly cleaned up after themselves. Example:

# du -shx /var/lib/spark/rdd/*
44K /var/lib/spark/rdd/liblz4-java1011984124691611873.so
48K /var/lib/spark/rdd/snappy-1.0.5-libsnappyjava.so
2.3G /var/lib/spark/rdd/spark-local-20150903112858-a72d
23M /var/lib/spark/rdd/spark-local-20150929141201-143f

The applications (such as "20150903112858") aren't running anymore. What
is the best practice for cleaning these up? A cron job? Enabling some kind
of cleaner in Spark? I'm currently running Spark 1.1, but will eventually
move to 1.2 and then 1.4.
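
For the cron option, would a daily entry along these lines be a reasonable
approach? This is just a sketch: the path and the 7-day cutoff are my own
assumptions and would need to match spark.local.dir and the longest-running
application on the node.

# Hypothetical crontab entry: every night at 03:00, remove spark-local-*
# directories under /var/lib/spark/rdd not modified in the last 7 days.
# Assumes no live application leaves its local dir untouched that long.
0 3 * * * find /var/lib/spark/rdd -maxdepth 1 -type d -name 'spark-local-*' -mtime +7 -exec rm -rf {} +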

Thanks,
Jens

-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.rantil@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se
