Posted to issues@spark.apache.org by "Josh Rosen (JIRA)" <ji...@apache.org> on 2016/03/05 20:29:40 UTC

[jira] [Updated] (SPARK-1762) Add functionality to pin RDDs in cache

     [ https://issues.apache.org/jira/browse/SPARK-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Rosen updated SPARK-1762:
------------------------------
    Component/s: Block Manager

> Add functionality to pin RDDs in cache
> --------------------------------------
>
>                 Key: SPARK-1762
>                 URL: https://issues.apache.org/jira/browse/SPARK-1762
>             Project: Spark
>          Issue Type: Improvement
>          Components: Block Manager, Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Andrew Or
>
> Right now, all RDDs are created equal, and there is no mechanism to mark one RDD as more important than the rest. This is a problem when the storage fraction of memory is small, because caching just a few RDDs can evict more important ones.
> A side effect of this feature is that we can more safely allocate a smaller spark.storage.memoryFraction if we know how large our important RDDs are, without having to worry about them being evicted. This frees up more memory for shuffles, for instance, and helps avoid disk spills.
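To make the trade-off concrete: in Spark 1.x, cached RDD blocks and shuffle buffers draw from separate, statically sized fractions of the executor heap. A minimal spark-defaults.conf sketch of the rebalancing the description suggests (the 0.3/0.5 values are illustrative assumptions, not recommendations; the Spark 1.0 defaults were 0.6 for storage and 0.2 for shuffle):

```
# Shrink the cache region once the footprint of the important RDDs is known...
spark.storage.memoryFraction   0.3
# ...and give the reclaimed headroom to shuffle buffers to reduce disk spills.
spark.shuffle.memoryFraction   0.5
```

Without a way to pin RDDs, shrinking spark.storage.memoryFraction like this raises the risk that the important cached blocks are themselves evicted, which is the gap this issue proposes to close.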



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org