Posted to issues@spark.apache.org by "Rares Mirica (JIRA)" <ji...@apache.org> on 2015/12/04 17:06:10 UTC

[jira] [Commented] (SPARK-12147) Off heap storage and dynamicAllocation operation

    [ https://issues.apache.org/jira/browse/SPARK-12147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041679#comment-15041679 ] 

Rares Mirica commented on SPARK-12147:
--------------------------------------

Sorry, I wasn't specific enough about the use-case and how to trigger/take advantage of this.

There is no need to cache the data in the traditional sense (by calling .cache() on the RDD), so no on-heap space is required. One only needs to append .persist(OFF_HEAP) after the computation to take advantage of this; all of the data should then reside in off-heap storage (which, for the time being, means Tachyon). There is no other off-heap implementation, so Tachyon is required here. The only workaround would be to serialise the result of the expensive computation to disk (through a .saveX call) and then re-load the RDD through sparkContext.textFile (or an equivalent, e.g. using parquet or Java-serialised objects).
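
A minimal sketch of the two approaches, assuming Spark 1.5.x with Tachyon configured as the external block store; the paths and the computation itself are placeholders, not taken from the issue:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    object OffHeapPersistSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("off-heap-persist-sketch"))

        // Some expensive computation whose result we want to keep for the whole application.
        val expensive = sc.textFile("hdfs:///data/input")      // placeholder path
          .map(line => (line.split('\t')(0), 1L))
          .reduceByKey(_ + _)

        // Preferred: keep the result entirely in off-heap storage (Tachyon in 1.5.x),
        // so no executor heap is consumed by caching.
        expensive.persist(StorageLevel.OFF_HEAP)
        expensive.count()                                       // materialise the off-heap blocks

        // Workaround without Tachyon: serialise to HDFS and re-load later.
        expensive.saveAsObjectFile("hdfs:///data/expensive")    // placeholder path
        val reloaded = sc.objectFile[(String, Long)]("hdfs:///data/expensive")
        println(reloaded.count())

        sc.stop()
      }
    }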

The data should live in only one place, Tachyon, and should be considered persisted (just as it would be if serialised and saved to HDFS) for the lifetime of the application. If that were the case, the death or decommission of an executor would be completely decoupled from the data originating in that executor and "cached" in Tachyon.
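
For concreteness, this is roughly the kind of spark-defaults.conf the setup assumes (a hedged sketch, not the attached file; the Tachyon URL, executor counts and timeouts are placeholders):

    # Dynamic allocation on YARN (requires the external shuffle service)
    spark.dynamicAllocation.enabled              true
    spark.shuffle.service.enabled                true
    spark.dynamicAllocation.minExecutors         2
    spark.dynamicAllocation.maxExecutors         50
    spark.dynamicAllocation.executorIdleTimeout  60s

    # External block store (Tachyon) used by persist(OFF_HEAP) in Spark 1.5.x
    spark.externalBlockStore.url                 tachyon://tachyon-master:19998
    spark.externalBlockStore.baseDir             /spark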

> Off heap storage and dynamicAllocation operation
> ------------------------------------------------
>
>                 Key: SPARK-12147
>                 URL: https://issues.apache.org/jira/browse/SPARK-12147
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.5.2
>         Environment: Cloudera Hadoop 2.6.0-cdh5.4.8
> Tachyon 0.7.1
> Yarn
>            Reporter: Rares Mirica
>            Priority: Minor
>         Attachments: spark-defaults.conf
>
>
> For the purpose of increasing computation density and efficiency, I set out to test off-heap storage (using Tachyon) with dynamicAllocation enabled.
> Following the available documentation (the programming guide for Spark 1.5.2), I was expecting data to be cached in Tachyon for the lifetime of the application (driver instance) or until unpersist() is called. This belief was supported by the doc: "Cached data is not lost if individual executors crash.", where I take "crash" to also cover graceful decommission. Furthermore, the graceful-decommission description in the job-scheduling document also hints at preserving cached data through off-heap storage.
> Seeing how Tachyon is now in a state where these promises are well within reach, I consider it a bug that, upon graceful decommission of an executor, the off-heap data is deleted (presumably as part of the cleanup phase).
> Needless to say, preserving off-heap persisted data after graceful decommission under dynamic allocation would yield significant improvements in resource allocation, especially on YARN, where executors occupy compute "slots" even when idle. After a long, expensive computation during which we take advantage of the dynamically scaled executors, the remaining Spark jobs can use the cached data while the compute resources are released for other cluster tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org