Posted to issues@spark.apache.org by "R (JIRA)" <ji...@apache.org> on 2017/02/07 22:41:42 UTC

[jira] [Created] (SPARK-19504) clearCache fails to delete orphan RDDs, especially in pyspark

R created SPARK-19504:
-------------------------

             Summary: clearCache fails to delete orphan RDDs, especially in pyspark
                 Key: SPARK-19504
                 URL: https://issues.apache.org/jira/browse/SPARK-19504
             Project: Spark
          Issue Type: Bug
          Components: Optimizer
    Affects Versions: 2.1.0
         Environment: Both pyspark and Scala Spark (although Scala Spark does uncache some RDD types even when orphaned)
            Reporter: R
            Priority: Minor


x=sc.parallelize([1,3,10,9]).cache()
x.count()
x=sc.parallelize([1,3,10,9]).cache()
x.count()
sqlContext.clearCache()

Rebinding x orphans the first cached RDD, which clearCache() then fails to remove. This happens in both Scala and pyspark.
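The mechanism can be sketched outside of Spark: cached blocks are registered under an internal RDD id, not under the Python variable name, so rebinding the variable leaves the old entry in the cache with no handle left to it. A minimal stdlib-only analogy (the FakeRDD class and the cache dict below are illustrative stand-ins, not Spark's API):

```python
# Stdlib-only sketch of the orphan-RDD problem. "cache" stands in for
# the block store; entries are keyed by an internal id, not by the
# Python variable that happens to reference the RDD.

cache = {}

class FakeRDD:
    _next_id = 0

    def __init__(self, data):
        self.id = FakeRDD._next_id      # internal id, like RDD.id()
        FakeRDD._next_id += 1
        self.data = data

    def cache(self):
        cache[self.id] = self.data      # register under the internal id
        return self

    def unpersist(self):
        cache.pop(self.id, None)        # explicit removal still works
        return self

x = FakeRDD([1, 3, 10, 9]).cache()      # entry for id 0
x = FakeRDD([1, 3, 10, 9]).cache()      # rebinding x: id 0 is orphaned

# No variable references id 0 anymore, yet its entry is still cached:
print(sorted(cache))                    # -> [0, 1]
```

A workaround along the same lines in real pyspark would be to call x.unpersist() before rebinding x, so the cached blocks are dropped while a reference to them still exists.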

A similar thing happens for RDDs created from a DataFrame in Python:
spark.read.csv(....).rdd
However, in Scala clearCache can get rid of some types of orphan RDDs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org