Posted to user@spark.apache.org by Michael Segel <ms...@hotmail.com> on 2016/11/07 17:28:59 UTC

How sensitive is Spark to Swap?

This may seem like a silly question, but it really isn’t.
In terms of Map/Reduce, it’s possible to over-subscribe the cluster because the framework is not very sensitive to servers swapping memory to disk.

In terms of HBase, which is very sensitive, swap doesn’t just kill performance; it can kill HBase itself. (I’m sure one can tune it to be less sensitive…)

But I have to ask: how sensitive is Spark?
Considering we can cache to disk (local disk), that would imply it is less sensitive.
Yet we see some posters facing over-subscription and hitting OOMEs.

Thoughts? 


---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: How sensitive is Spark to Swap?

Posted by Sean Owen <so...@cloudera.com>.
Swapping is pretty bad here, especially because a JVM-based process won't
even feel the memory pressure and try to GC or shrink its heap when the OS
faces memory pressure. It's probably relatively worse than in M/R because
Spark uses memory more heavily. Enough grinding in swap will cause tasks to
fail due to timeouts, and because these failures are strongly correlated, it
will cause jobs to die, messily. For that reason I think you always want to
disable swap, all the more so because disk I/O tends to be a bottleneck.
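(As a concrete sketch of "disable swap" on a typical Linux node -- the
exact commands depend on your distribution, so treat this as illustrative
rather than a definitive recipe:

```shell
# Turn off all swap devices for the current session (requires root)
sudo swapoff -a

# To make this permanent, comment out the swap entries in /etc/fstab.
# If swap must remain enabled for some reason, at least minimize the
# kernel's eagerness to use it:
sudo sysctl -w vm.swappiness=1
echo 'vm.swappiness=1' | sudo tee -a /etc/sysctl.conf   # persist across reboots
```

Setting vm.swappiness=1 rather than 0 keeps swap as a last resort instead
of disabling it outright, which some operators prefer.)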

If you're using YARN, I do find its design encourages, kind of on purpose,
under-subscription of resources. You can probably safely over-subscribe
YARN memory, without resorting to swap.
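(For example, the NodeManager's advertised memory can be set independently
of physical RAM -- the values below are purely illustrative, and whether
relaxing the physical-memory check is safe depends on your workload:

```xml
<!-- yarn-site.xml (illustrative values only) -->
<property>
  <!-- Total memory YARN may allocate to containers on this node.
       Setting this above (RAM minus OS/daemon overhead) over-subscribes
       memory at the YARN level, without the OS touching swap. -->
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>98304</value>
</property>
<property>
  <!-- Disable strict physical-memory enforcement, so containers that
       briefly exceed their request are not killed by the NodeManager. -->
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>false</value>
</property>
```

)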

On Mon, Nov 7, 2016 at 5:29 PM Michael Segel <ms...@hotmail.com>
wrote:

> This may seem like a silly question, but it really isn’t.
> In terms of Map/Reduce, its possible to over subscribe the cluster because
> there is a lack of sensitivity if the servers swap memory to disk.
>
> In terms of HBase, which is very sensitive, swap doesn’t just kill
> performance, but also can kill HBase. (I’m sure one can tune it to be less
> sensitive…)
>
> But I have to ask how sensitive is Spark?
> Considering we can cache to disk (local disk) it would imply that it would
> less sensitive.
> Yet we see some posters facing over subscription and hitting OOME.
>
> Thoughts?
>
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>