Posted to user@spark.apache.org by Boris Litvak <bo...@skf.com> on 2020/06/04 07:11:11 UTC
[Spark RDD] Persisting Spark RDDs across spark contexts/applications - options
I would like to cache Apache Spark RDDs and share them between Spark applications.
Alluxio (Tachyon), Redis & Ignite all offer such capabilities.
For instance, see Ignite's proposal:
[inline image: image003.png]
Are there any comparison studies on performance/maintenance burden/installation experience of the above frameworks?
If you have had any experience using Spark with any of these technologies, please share.
Thanks, Boris
Re: [Spark RDD] Persisting Spark RDDs across spark contexts/applications - options
Posted by Bin Fan <fa...@gmail.com>.
Hi Boris,
This is actually why Alluxio (then called Tachyon) was originally created at
AMPLab.
Check out the documentation at
https://docs.alluxio.io/os/user/stable/en/compute/Spark.html on persisting
RDDs/DataFrames to Alluxio.
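As a rough illustration of the approach that documentation describes, one application writes an RDD or DataFrame to an alluxio:// path and another application reads it back. This is a minimal PySpark sketch, not a definitive setup. The host name, the /shared/demo path, and the helper names (alluxio_uri, write_shared, read_shared) are assumptions made for illustration; 19998 is Alluxio's default master port. It also assumes the Alluxio client jar is already on the Spark driver and executor classpath.

```python
# Assumed placeholders: adjust to the actual Alluxio master of the cluster.
ALLUXIO_HOST = "localhost"
ALLUXIO_PORT = 19998  # Alluxio's default master RPC port


def alluxio_uri(path, host=ALLUXIO_HOST, port=ALLUXIO_PORT):
    """Build an alluxio:// URI for a path in the Alluxio namespace (hypothetical helper)."""
    return f"alluxio://{host}:{port}/{path.lstrip('/')}"


def write_shared(spark, base_path):
    """Producer application: persist an RDD and a DataFrame to Alluxio.

    `spark` is an existing pyspark.sql.SparkSession.
    """
    rdd = spark.sparkContext.parallelize(range(100))
    # saveAsTextFile fails if the target directory already exists.
    rdd.saveAsTextFile(alluxio_uri(base_path + "/rdd"))

    df = spark.range(100)
    df.write.mode("overwrite").parquet(alluxio_uri(base_path + "/df.parquet"))


def read_shared(spark, base_path):
    """Consumer application (a different SparkContext): read the same data back."""
    rdd = spark.sparkContext.textFile(alluxio_uri(base_path + "/rdd"))
    df = spark.read.parquet(alluxio_uri(base_path + "/df.parquet"))
    return rdd, df
```

The producer would call write_shared(spark, "/shared/demo") and a separately launched application would call read_shared(spark, "/shared/demo"). Note that textFile returns lines of strings, so the consumer has to parse them back into the original element type.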
Some examples:
https://www.alluxio.io/resources/case-studies/making-the-impossible-possible-with-alluxio-accelerate-spark-jobs-from-hours-to-seconds/
https://www.alluxio.io/blog/tencent-case-study-delivering-customized-news-to-over-100-million-users-per-month-with-alluxio/
Happy to provide more info.
- Bin
On Thu, Jun 4, 2020 at 12:26 AM Boris Litvak <bo...@skf.com> wrote:
> I would like to cache Apache Spark RDDs and share them between Spark
> applications.
>
> Alluxio (Tachyon), Redis & Ignite all offer such capabilities.
>
> For instance, see Ignite's proposal:
>
> Are there any comparison studies on performance/maintenance
> burden/installation experience of the above frameworks?
>
> If you have had any experience using Spark with any of these
> technologies, please share.
>
> Thanks, Boris