Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:03:09 UTC

[jira] [Updated] (SPARK-22616) df.cache() / df.persist() should have an option blocking like df.unpersist()

     [ https://issues.apache.org/jira/browse/SPARK-22616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-22616:
---------------------------------
    Labels: bulk-closed  (was: )

> df.cache() / df.persist() should have an option blocking like df.unpersist()
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-22616
>                 URL: https://issues.apache.org/jira/browse/SPARK-22616
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Andreas Maier
>            Priority: Minor
>              Labels: bulk-closed
>
> The method dataframe.unpersist() has a blocking option, which allows a dataframe to be unpersisted eagerly. In contrast, the methods dataframe.cache() and dataframe.persist() have no comparable option. An (undocumented) workaround is to call dataframe.count() directly after cache() or persist(). For API consistency and convenience, it would make sense to give cache() and persist() a blocking option as well.
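The workaround described above can be sketched as a small helper. Note that `eager_cache` is a hypothetical name introduced here for illustration, not a Spark API; the pattern relies only on the fact that cache()/persist() are lazy and count() is an action that forces evaluation.

```python
# Sketch of the undocumented workaround: cache() only *marks* the
# DataFrame for caching; running an action afterwards (count()) forces
# Spark to materialize the data into the cache immediately.
# `eager_cache` is a hypothetical helper, not part of the Spark API.

def eager_cache(df):
    """Cache a DataFrame and force materialization via an action."""
    df.cache()   # lazy: marks the DataFrame's storage level, nothing runs yet
    df.count()   # action: triggers a job, populating the cache eagerly
    return df
```

In PySpark this would be used as `df = eager_cache(spark.read.parquet(...))`, after which downstream queries read from the already-populated cache rather than racing its first materialization.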



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org