Posted to user@spark.apache.org by Jack Kolokasis <ko...@ics.forth.gr> on 2021/08/20 12:18:03 UTC

Re: Is memory-only no-disk Spark possible? [Marketing Mail]

Hello Jacek,

On 20/8/21 2:49 PM, Jacek Laskowski wrote:
> Hi,
>
> I've been exploring BlockManager and the stores for a while now and am 
> tempted to say that a memory-only Spark setup would be possible 
> (except shuffle blocks). Is this correct?
Correct.
>
> What about shuffle blocks? Do they have to be stored on disk (in 
> DiskStore)?
Well, by default Spark stores shuffle blocks on disk.
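One workaround that might keep shuffle files off physical disks is to point
Spark's scratch directory at a RAM-backed tmpfs mount. This is only a sketch
under assumptions: the path /dev/shm/spark is hypothetical, the mount must
exist and be large enough on every worker, and shuffle blocks still go through
the disk-based code path (only the backing storage changes).

```properties
# spark-defaults.conf (fragment)
# spark.local.dir is the scratch space Spark uses for map output (shuffle)
# files and for RDD blocks that are stored on disk.
# ASSUMPTION: /dev/shm/spark is a hypothetical tmpfs-backed directory.
spark.local.dir    /dev/shm/spark
```

Note that the cluster manager can override this setting (SPARK_LOCAL_DIRS in
standalone mode, LOCAL_DIRS on YARN), and that shuffle data placed in tmpfs
counts against the machine's memory, so a large shuffle can still fail.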
>
> I think broadcast variables are in-memory first, so unless an on-disk 
> storage level is used explicitly (by Spark devs), there's no reason not 
> to have Spark in-memory only.
>
> (I was told that one of the differences between Trino/Presto and Spark 
> SQL is that Trino keeps all processing in-memory only and will blow up, 
> while Spark uses disk to avoid OOMEs).
>
> Regards,
> Jacek Laskowski
> ----
> https://about.me/JacekLaskowski
> "The Internals Of" Online Books <https://books.japila.pl/>
> Follow me on https://twitter.com/jaceklaskowski
>
Best,
Iacovos