Posted to user@mesos.apache.org by Jeff Kubina <je...@gmail.com> on 2018/03/07 21:04:39 UTC

How to get the Mesos Spark Framework to use multiple disks

From the Spark documentation
<https://spark.apache.org/docs/latest/configuration.html#application-properties>
on the spark.local.dir and SPARK_LOCAL_DIRS options, it looks like the Mesos
Spark Framework can be configured to use multiple disks:

spark.local.dir: Directory to use for "scratch" space in Spark, including
map output files and RDDs that get stored on disk. This should be on a
fast, local disk in your system. *It can also be a comma-separated list of
multiple directories on different disks.* NOTE: *In Spark 1.0 and later
this will be overridden by SPARK_LOCAL_DIRS (Standalone, Mesos)* or
LOCAL_DIRS (YARN) environment variables set by the cluster manager.
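
If I am reading that correctly, something like the sketch below is what I had
in mind. It assumes the standard spark.executorEnv.* mechanism is enough to
propagate SPARK_LOCAL_DIRS to the Mesos executors, and the /mnt/disk* paths
are hypothetical placeholders for the actual mount points on the agents:

    import org.apache.spark.sql.SparkSession

    // Hypothetical mount points; replace with the real disks on the agents.
    val scratchDirs = "/mnt/disk1/spark,/mnt/disk2/spark"

    val spark = SparkSession.builder()
      .appName("multi-disk-scratch")
      // Comma-separated scratch directories, one per disk.
      .config("spark.local.dir", scratchDirs)
      // Since SPARK_LOCAL_DIRS set by the cluster manager overrides
      // spark.local.dir on Mesos, also set it as an executor env variable.
      .config("spark.executorEnv.SPARK_LOCAL_DIRS", scratchDirs)
      .getOrCreate()

The same properties could presumably be passed to spark-submit with --conf
flags instead of being set in code.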

Does anyone have experience doing this, or guidance on how to configure it?
Would I need to configure the disks as persistent volumes for the Spark
Framework to use them?