Posted to issues@spark.apache.org by "Patrick Wendell (JIRA)" <ji...@apache.org> on 2014/10/07 01:23:34 UTC

[jira] [Created] (SPARK-3824) Spark SQL should cache in MEMORY_AND_DISK by default

Patrick Wendell created SPARK-3824:
--------------------------------------

             Summary: Spark SQL should cache in MEMORY_AND_DISK by default
                 Key: SPARK-3824
                 URL: https://issues.apache.org/jira/browse/SPARK-3824
             Project: Spark
          Issue Type: Bug
          Components: SQL
            Reporter: Patrick Wendell
            Assignee: Cheng Lian
            Priority: Blocker


Spark SQL currently uses MEMORY_ONLY as the default storage level. Because of the use of column buffers, however, recomputing an evicted block is much more expensive than in Spark core. And since we are now more conservative about caching and will sometimes decline to cache blocks we think might exceed available memory, it seems better to keep persisted blocks on disk by default.
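
For illustration, here is a minimal spark-shell sketch of the workaround available today: persisting a query result with MEMORY_AND_DISK explicitly, which is the behavior this issue proposes making the default. The table name "events" and the ambient SparkContext "sc" are assumptions for the example, not part of this issue.

  import org.apache.spark.sql.SQLContext
  import org.apache.spark.storage.StorageLevel

  // Sketch only: assumes a spark-shell session where sc is the ambient
  // SparkContext and "events" is an already-registered table (hypothetical).
  val sqlContext = new SQLContext(sc)
  val result = sqlContext.sql("SELECT * FROM events")

  // MEMORY_AND_DISK spills partitions that do not fit in memory to local
  // disk instead of dropping them, so on the next access they are re-read
  // from disk rather than recomputed -- which, for Spark SQL's column
  // buffers, is the especially expensive path described above.
  result.persist(StorageLevel.MEMORY_AND_DISK)
  result.count() // materializes the cache

With the proposed change, the same spill-to-disk behavior would apply without the explicit persist call.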



