Posted to user@spark.apache.org by Srikanth <sr...@gmail.com> on 2015/07/13 17:54:47 UTC

cache() VS cacheTable()

Hello,

I was reading the "Learning Spark" book and saw a tip in chapter 9 that reads:
   "In Spark 1.2, the regular cache() method on RDDs also results in a
cacheTable()"

Is that true? When I cache an RDD and then cache the same data as a
DataFrame, I see that memory usage for the DataFrame cache is far lower
than for the RDD cache. I thought this difference was due to the columnar
format used by DataFrames. According to the statement in the book, though,
the two cache sizes should be similar.

Srikanth