Posted to issues@spark.apache.org by "R (JIRA)" <ji...@apache.org> on 2017/02/07 22:41:42 UTC
[jira] [Created] (SPARK-19504) clearCache fails to delete orphan RDDs, especially in pyspark
R created SPARK-19504:
-------------------------
Summary: clearCache fails to delete orphan RDDs, especially in pyspark
Key: SPARK-19504
URL: https://issues.apache.org/jira/browse/SPARK-19504
Project: Spark
Issue Type: Bug
Components: Optimizer
Affects Versions: 2.1.0
Environment: Both pyspark and Scala Spark, although Scala Spark does uncache some orphan RDD types.
Reporter: R
Priority: Minor
x = sc.parallelize([1, 3, 10, 9]).cache()
x.count()
x = sc.parallelize([1, 3, 10, 9]).cache()  # rebinding x orphans the first cached RDD
x.count()
sqlContext.clearCache()  # the orphaned RDD is still cached
Overwriting x creates an orphan RDD, which cannot be deleted with clearCache(). This happens in both Scala and pyspark.
The same thing happens for RDDs created from a DataFrame in Python:
spark.read.csv(....).rdd
However, in Scala clearCache can get rid of some orphan RDD types.
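The orphaning pattern above can be sketched in plain Python. This is a hypothetical analogy, not Spark code: FakeRDD and cache_registry stand in for the driver-side bookkeeping that (on this assumption) holds a strong reference to every persisted RDD, so rebinding the Python variable never releases the old cached entry.

```python
# Hypothetical sketch: a driver-side registry pins cached RDDs with strong
# references, so an RDD orphaned on the Python side stays alive in the cache.
class FakeRDD:
    _next_id = 0

    def __init__(self, data):
        self.id = FakeRDD._next_id  # unique id, like an RDD id
        FakeRDD._next_id += 1
        self.data = data

cache_registry = {}  # stands in for the driver's map of persisted RDDs

def cache(rdd):
    cache_registry[rdd.id] = rdd  # strong reference keeps the RDD pinned
    return rdd

x = cache(FakeRDD([1, 3, 10, 9]))
x = cache(FakeRDD([1, 3, 10, 9]))  # rebinding x orphans the first FakeRDD

# Both entries survive: the orphan is unreachable from user code but still
# pinned by the registry, mirroring the leak reported above.
print(len(cache_registry))  # → 2
```

Under this reading, the usual way to avoid the leak is to release the old entry explicitly (in Spark terms, call unpersist() on the old RDD) before rebinding the variable.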
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)