Posted to user@spark.apache.org by Timothee Besset <tt...@ttimo.net> on 2014/01/31 23:42:10 UTC

Single application using all the cores - preventing other applications from running

Hello,

What are my options to balance resources between multiple applications
running against a Spark cluster?

I am using the standalone cluster [1] setup on my local machine, and
starting a single application uses all the available cores. As long as that
first application is running, no other application does any processing.

I tried to run more workers using fewer cores with SPARK_WORKER_CORES, but
the single application still takes everything (see
https://dl.dropboxusercontent.com/u/1529870/spark%20-%20multiple%20applications.png).

Is there any strategy to reallocate resources based on number of
applications running against the cluster, or is the design mostly geared
towards having a single application running at a time?

Thank you,
TTimo

[1] http://spark.incubator.apache.org/docs/latest/spark-standalone.html

Re: Single application using all the cores - preventing other applications from running

Posted by Timothee Besset <tt...@ttimo.net>.
Thank you!

TTimo


On Fri, Jan 31, 2014 at 4:48 PM, Matei Zaharia <ma...@gmail.com> wrote:

> You can set the spark.cores.max property in your application to limit the
> maximum number of cores it will take. Check out
> http://spark.incubator.apache.org/docs/latest/spark-standalone.html#resource-scheduling.
> It's also possible to control scheduling in more detail within a Spark
> application, or if you run on other cluster managers, like Mesos. That's
> described in more detail here:
> http://spark.incubator.apache.org/docs/latest/job-scheduling.html.
>
> Matei
>
> On Jan 31, 2014, at 2:42 PM, Timothee Besset <tt...@ttimo.net> wrote:
>
> Hello,
>
> What are my options to balance resources between multiple applications
> running against a Spark cluster?
>
> I am using the standalone cluster [1] setup on my local machine, and
> starting a single application uses all the available cores. As long as that
> first application is running, no other application does any processing.
>
> I tried to run more workers using fewer cores with SPARK_WORKER_CORES, but
> the single application still takes everything (see
> https://dl.dropboxusercontent.com/u/1529870/spark%20-%20multiple%20applications.png).
>
> Is there any strategy to reallocate resources based on number of
> applications running against the cluster, or is the design mostly geared
> towards having a single application running at a time?
>
> Thank you,
> TTimo
>
> [1] http://spark.incubator.apache.org/docs/latest/spark-standalone.html
>
>
>

Re: Single application using all the cores - preventing other applications from running

Posted by Matei Zaharia <ma...@gmail.com>.
You can set the spark.cores.max property in your application to limit the maximum number of cores it will take. Check out http://spark.incubator.apache.org/docs/latest/spark-standalone.html#resource-scheduling. It’s also possible to control scheduling in more detail within a Spark application, or if you run on other cluster managers, like Mesos. That’s described in more detail here: http://spark.incubator.apache.org/docs/latest/job-scheduling.html.
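
For example, a minimal sketch of capping an application at 4 cores when building the context (the master URL, app name, and the 4-core limit are placeholder values; on Spark 0.8.x you would set the property with System.setProperty before constructing the SparkContext instead of using SparkConf):

    // Sketch assuming the Spark 0.9 SparkConf API; values below are examples only.
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setMaster("spark://master:7077")    // standalone master URL for your cluster
      .setAppName("CappedApp")
      .set("spark.cores.max", "4")         // this app never claims more than 4 cores cluster-wide
    val sc = new SparkContext(conf)

    // Spark 0.8.x equivalent (no SparkConf):
    //   System.setProperty("spark.cores.max", "4")
    //   val sc = new SparkContext("spark://master:7077", "CappedApp")

With spark.cores.max left unset, a standalone-mode application grabs every core the cluster offers, which is the behavior described in the original question.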

Matei

On Jan 31, 2014, at 2:42 PM, Timothee Besset <tt...@ttimo.net> wrote:

> Hello,
> 
> What are my options to balance resources between multiple applications running against a Spark cluster?
> 
> I am using the standalone cluster [1] setup on my local machine, and starting a single application uses all the available cores. As long as that first application is running, no other application does any processing.
> 
> I tried to run more workers using fewer cores with SPARK_WORKER_CORES, but the single application still takes everything (see https://dl.dropboxusercontent.com/u/1529870/spark%20-%20multiple%20applications.png).
> 
> Is there any strategy to reallocate resources based on number of applications running against the cluster, or is the design mostly geared towards having a single application running at a time?
> 
> Thank you,
> TTimo
> 
> [1] http://spark.incubator.apache.org/docs/latest/spark-standalone.html
> 


Re: Single application using all the cores - preventing other applications from running

Posted by Mayur Rustagi <ma...@gmail.com>.
Go for the Fair scheduler and different weights; the default is FIFO. If you are
feeling adventurous, try out the Sparrow scheduler.
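
As a rough illustration of the fair-scheduler setup (the pool name, weights, and allocation file path below are placeholder values; the job-scheduling doc linked earlier in the thread describes the full XML format):

    // Sketch: fair scheduling of jobs within one application, with weighted pools.
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setMaster("spark://master:7077")
      .setAppName("FairScheduledApp")
      .set("spark.scheduler.mode", "FAIR")                                    // FIFO is the default
      .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")   // defines pools and weights
    val sc = new SparkContext(conf)

    // Jobs submitted from this thread go into the "production" pool and share
    // resources with other pools in proportion to the weights in the XML file.
    sc.setLocalProperty("spark.scheduler.pool", "production")
    // ... run jobs here ...
    sc.setLocalProperty("spark.scheduler.pool", null)                         // back to the default pool

Note that FAIR mode arbitrates between jobs inside a single application; for sharing a standalone cluster across separate applications, the spark.cores.max cap from the earlier reply remains the main lever.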
Regards
Mayur
On Feb 1, 2014 4:12 AM, "Timothee Besset" <tt...@ttimo.net> wrote:

> Hello,
>
> What are my options to balance resources between multiple applications
> running against a Spark cluster?
>
> I am using the standalone cluster [1] setup on my local machine, and
> starting a single application uses all the available cores. As long as that
> first application is running, no other application does any processing.
>
> I tried to run more workers using fewer cores with SPARK_WORKER_CORES, but
> the single application still takes everything (see
> https://dl.dropboxusercontent.com/u/1529870/spark%20-%20multiple%20applications.png).
>
> Is there any strategy to reallocate resources based on number of
> applications running against the cluster, or is the design mostly geared
> towards having a single application running at a time?
>
> Thank you,
> TTimo
>
> [1] http://spark.incubator.apache.org/docs/latest/spark-standalone.html
>
>