Posted to user@spark.apache.org by Han JU <ju...@gmail.com> on 2014/04/29 10:24:02 UTC

Spark RDD cache memory usage

Hi,

By default, a fraction of the executor memory (60%) is reserved for RDD
caching. So if there's no explicit caching in the code (e.g. rdd.cache()),
or if we persist an RDD with StorageLevel.DISK_ONLY, is this part of
memory wasted? Does Spark allocate the RDD cache memory dynamically? Or
does Spark automatically cache RDDs when it can?
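
To make the question concrete, here is a minimal sketch (Spark 1.x-era
API; the input path is just a placeholder) of the two situations I mean:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// spark.storage.memoryFraction (0.6 by default in Spark 1.x) is the slice
// of executor memory set aside for cached RDD blocks.
val conf = new SparkConf()
  .setAppName("cache-example")
  .set("spark.storage.memoryFraction", "0.6")
val sc = new SparkContext(conf)

// Case 1: explicit in-memory caching -- blocks land in the storage region.
// cache() is shorthand for persist(StorageLevel.MEMORY_ONLY).
val cached = sc.textFile("hdfs:///some/input").cache()  // placeholder path

// Case 2: DISK_ONLY persistence -- blocks are written to local disk and
// presumably should not occupy the in-memory storage region.
val onDisk = sc.textFile("hdfs:///some/input").persist(StorageLevel.DISK_ONLY)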

Thanks.
-- 
*JU Han*

Data Engineer @ Botify.com

+33 0619608888

Fwd: Spark RDD cache memory usage

Posted by Han JU <ju...@gmail.com>.
Hi,

As I understand it, by default in Spark a fraction of the executor memory
(60%) is reserved for RDD caching. So if there's no explicit caching in the
code (e.g. rdd.cache()), or if we persist an RDD with
StorageLevel.DISK_ONLY, is this part of memory wasted? Does Spark allocate
the RDD cache memory dynamically? Or does Spark automatically cache RDDs
when it can?

I've posted this question in the user list but got no response there, so
I'm trying the dev list. Sorry for the spam.

Thanks.

-- 
*JU Han*

Data Engineer @ Botify.com

+33 0619608888
