
Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2014/10/07 04:10:33 UTC

[jira] [Commented] (SPARK-3824) Spark SQL should cache in MEMORY_AND_DISK by default

    [ https://issues.apache.org/jira/browse/SPARK-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161366#comment-14161366 ] 

Apache Spark commented on SPARK-3824:
-------------------------------------

User 'liancheng' has created a pull request for this issue:
https://github.com/apache/spark/pull/2686

> Spark SQL should cache in MEMORY_AND_DISK by default
> ----------------------------------------------------
>
>                 Key: SPARK-3824
>                 URL: https://issues.apache.org/jira/browse/SPARK-3824
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Patrick Wendell
>            Assignee: Cheng Lian
>            Priority: Blocker
>
> Spark SQL currently uses MEMORY_ONLY as the default storage level. Because it caches data in column buffers, however, recomputing a block is very expensive, much more so than in Spark core. Since we are now more conservative about caching and sometimes won't cache blocks we think might exceed memory, it seems better to keep persisted blocks on disk by default.
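
The trade-off described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's actual block manager: `ToyBlockCache`, `scan_twice`, the LRU eviction policy, and the slot counts are all hypothetical names and choices made for illustration. It counts how often a block has to be built from scratch when evicted blocks are dropped (similar in spirit to MEMORY_ONLY) versus kept on disk (similar in spirit to MEMORY_AND_DISK).

```python
from collections import OrderedDict


class ToyBlockCache:
    """Toy LRU block cache. Hypothetical model, NOT Spark's BlockManager."""

    def __init__(self, memory_slots, spill_to_disk):
        self.memory_slots = memory_slots    # how many blocks fit in memory
        self.spill_to_disk = spill_to_disk  # True mimics MEMORY_AND_DISK
        self.memory = OrderedDict()         # block_id -> data, in LRU order
        self.disk = {}                      # evicted blocks, if spilling
        self.computes = 0                   # times a block was built from scratch
        self.disk_reads = 0                 # times a block was read back from disk

    def _compute(self, block_id):
        # Stand-in for the expensive step (building column buffers).
        self.computes += 1
        return f"columns-for-{block_id}"

    def _put_in_memory(self, block_id, value):
        self.memory[block_id] = value
        self.memory.move_to_end(block_id)
        # Evict least recently used blocks until we fit again.
        while len(self.memory) > self.memory_slots:
            evicted_id, evicted = self.memory.popitem(last=False)
            if self.spill_to_disk:
                self.disk[evicted_id] = evicted
            # Without spilling, the evicted block is simply lost.

    def get(self, block_id):
        if block_id in self.memory:
            self.memory.move_to_end(block_id)
            return self.memory[block_id]
        if block_id in self.disk:
            self.disk_reads += 1
            value = self.disk[block_id]
        else:
            value = self._compute(block_id)
        self._put_in_memory(block_id, value)
        return value


def scan_twice(cache, num_blocks):
    """Read every block twice, as two queries over one cached table would."""
    for _ in range(2):
        for block_id in range(num_blocks):
            cache.get(block_id)
    return cache.computes, cache.disk_reads


memory_only = scan_twice(ToyBlockCache(memory_slots=2, spill_to_disk=False), 4)
memory_and_disk = scan_twice(ToyBlockCache(memory_slots=2, spill_to_disk=True), 4)
print("memory only     (computes, disk reads):", memory_only)
print("memory and disk (computes, disk reads):", memory_and_disk)
```

In this toy setup, with four blocks and room for two in memory, the non-spilling cache rebuilds every block on the second scan, while the spilling cache builds each block once and serves the second scan from disk. Whether that is a win in practice depends on how expensive rebuilding a block is relative to a disk read, which is the point the issue makes about column buffers.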



