Posted to user@spark.apache.org by Haripriya Ayyalasomayajula <ah...@gmail.com> on 2015/08/11 08:38:13 UTC

Re: Controlling number of executors on Mesos vs YARN

Hi Tim,

Spark on YARN lets us do this with the --num-executors and --executor-cores
command-line arguments. I just got a chance to look at a similar Spark user
list thread, but there is no answer yet. So does Mesos allow setting the
number of executors and cores? Is there a default it assumes?
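
For reference, a minimal sketch of the two setups I am comparing (assuming
Spark 1.x property names: spark.executor.instances and spark.executor.cores
mirror the two YARN flags, and spark.cores.max is the closest Mesos-side knob
I know of):

    // Sketch only: on YARN both the executor count and cores per executor are fixed.
    import org.apache.spark.{SparkConf, SparkContext}

    val yarnConf = new SparkConf()
      .setAppName("executor-sizing")
      .set("spark.executor.instances", "6")   // same as --num-executors 6
      .set("spark.executor.cores", "4")       // same as --executor-cores 4

    // Mesos coarse-grained mode: only the application's total core usage can
    // be capped; there is no per-node executor count.
    val mesosConf = new SparkConf()
      .setAppName("executor-sizing")
      .set("spark.mesos.coarse", "true")
      .set("spark.cores.max", "24")

    val sc = new SparkContext(yarnConf)       // or mesosConf, depending on the cluster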

On Mon, Jan 5, 2015 at 5:07 PM, Tim Chen <ti...@mesosphere.io> wrote:

> Forgot to hit reply-all.
>
> ---------- Forwarded message ----------
> From: Tim Chen <ti...@mesosphere.io>
> Date: Sun, Jan 4, 2015 at 10:46 PM
> Subject: Re: Controlling number of executors on Mesos vs YARN
> To: mvle <mv...@us.ibm.com>
>
>
> Hi Mike,
>
> You're correct: there is no such setting for Mesos coarse-grain mode,
> since the assumption is that each node is launched with one container and
> Spark launches multiple tasks in that container.
>
> In fine-grain mode there isn't a setting like that either, as it currently
> will launch an executor on any node that satisfies the minimum container
> resource requirement.
>
> I've created a JIRA earlier about capping the number of executors, or better
> distributing the # of executors launched on each node. Since the decision of
> which node to launch containers on is entirely on the Spark scheduler side,
> it's easy to modify.
>
> Btw, what's the configuration to set the # of executors on YARN side?
>
> Thanks,
>
> Tim
>
>
>
> On Sun, Jan 4, 2015 at 9:37 PM, mvle <mv...@us.ibm.com> wrote:
>
>> I'm trying to compare the performance of Spark running on Mesos vs YARN.
>> However, I am having trouble configuring the Spark workload to run in a
>> similar way on Mesos and YARN.
>>
>> When running Spark on YARN, you can specify the number of executors per
>> node. So if I have a node with 4 CPUs, I can specify 6 executors on that
>> node. When running Spark on Mesos, there doesn't seem to be an equivalent
>> way to specify this. In Mesos, you can somewhat force this by specifying
>> the
>> number of CPU resources to be 6 when running the slave daemon. However, this
>> seems to be a static configuration of the Mesos cluster rather than something
>> that can be configured in the Spark framework.
>>
>> So here is my question:
>>
>> For Spark on Mesos, am I correct that there is no way to control the
>> number
>> of executors per node (assuming an idle cluster)? For Spark on Mesos
>> coarse-grained mode, there is a way to specify max_cores but that is still
>> not equivalent to specifying the number of executors per node as when
>> Spark
>> is run on YARN.
>>
>> If I am correct, then it seems Spark might be at a disadvantage running on
>> Mesos compared to YARN (since it lacks the fine-tuning ability provided by
>> YARN).
>>
>> Thanks,
>> Mike
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Controlling-number-of-executors-on-Mesos-vs-YARN-tp20966.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
>> For additional commands, e-mail: user-help@spark.apache.org
>>
>>
>
>


-- 
Regards,
Haripriya Ayyalasomayajula

Re: Controlling number of executors on Mesos vs YARN

Posted by Ajay Singal <as...@gmail.com>.
Tim,

The ability to specify fine-grained configuration could be useful for many
reasons.  Let's take the example of a node with 32 cores.  First of all, as
per my understanding, having 5 executors with 6 cores each will almost
always perform better than having a single executor with 30 cores.  Also,
these 5 executors could be a) used by the same application, or b) shared
amongst multiple applications.  In the case of a single executor with 30
cores, some of the slots/cores could be wasted if there are fewer tasks
(from a single application) to be executed.
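
On YARN, for instance, that 5-by-6 split can be expressed directly (a rough
sketch using the standard YARN-side properties; Mesos has no equivalent today):

    // Sketch: five 6-core executors = 30 cores, leaving 2 of the node's 32
    // cores for the OS and daemons (YARN sizes executors per application).
    val conf = new org.apache.spark.SparkConf()
      .set("spark.executor.instances", "5")   // --num-executors 5
      .set("spark.executor.cores", "6")       // --executor-cores 6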

As I said, applications could specify a desired number of executors.  If
that many are not available, Mesos (in a simple implementation) can offer
whatever is available.  In a slightly more complex implementation, we could
build a simple protocol to negotiate.

Regards,
Ajay


Re: Controlling number of executors on Mesos vs YARN

Posted by Tim Chen <ti...@mesosphere.io>.
You're referring to both fine-grain and coarse-grain mode?

A desirable number of executors per node could be interesting, but it can't
be guaranteed (or we could try, and abort the job when that fails).

How would you imagine this new option actually working?


Tim


Re: Controlling number of executors on Mesos vs YARN

Posted by Ajay Singal <as...@gmail.com>.
Hi Tim,

An option like spark.mesos.executor.max to cap the number of executors per
node/application would be very useful.  However, having an option like
spark.mesos.executor.num to specify a desired number of executors per node
would provide even better control.
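
To make the proposal concrete, the two options might be set like this (a
sketch only: neither property exists in Spark today, and both names are just
the proposal in this thread):

    // Hypothetical, proposed options; not implemented in Spark.
    val conf = new org.apache.spark.SparkConf()
      .set("spark.mesos.executor.max", "4")   // proposed cap on executors per node/application
      .set("spark.mesos.executor.num", "2")   // proposed desired executors per node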

Thanks,
Ajay


Re: Controlling number of executors on Mesos vs YARN

Posted by Jerry Lam <ch...@gmail.com>.
Great stuff, Tim. This will definitely make Mesos users' lives easier.

Sent from my iPad

On 2015-08-12, at 11:52, Haripriya Ayyalasomayajula <ah...@gmail.com> wrote:

> Thanks Tim, Jerry.

Re: Controlling number of executors on Mesos vs YARN

Posted by Tim Chen <ti...@mesosphere.io>.
Yes, the options are not that configurable yet, but I think it's not hard to
change.

I actually have a patch out specifically to make the number of cpus per
executor configurable in coarse-grain mode, hopefully merged next release.

I think the open question now is whether, for fine-grain mode, we can limit
the maximum number of concurrent executors, and I think we can definitely
just add a new option like spark.mesos.executor.max to cap it.

I'll file a JIRA and hopefully get this change in soon too.
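
Very roughly, the fine-grain cap could behave like this sketch (illustrative
only: the class and method names are made up, and spark.mesos.executor.max is
still just the proposed name, not an existing option):

    // Illustrative sketch, not the actual Mesos scheduler backend code.
    class FineGrainedExecutorCap(conf: org.apache.spark.SparkConf) {
      private val maxExecutors =
        conf.getInt("spark.mesos.executor.max", Int.MaxValue)   // proposed option
      private var slavesWithExecutor = Set.empty[String]

      /** Can an offer from this slave start (or reuse) an executor? */
      def canUseOffer(slaveId: String): Boolean =
        slavesWithExecutor.contains(slaveId) ||        // executor already running there
          slavesWithExecutor.size < maxExecutors       // otherwise only while under the cap

      def executorLaunched(slaveId: String): Unit = slavesWithExecutor += slaveId
      def executorLost(slaveId: String): Unit     = slavesWithExecutor -= slaveId
    }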

Tim




Re: Controlling number of executors on Mesos vs YARN

Posted by Haripriya Ayyalasomayajula <ah...@gmail.com>.
Spark evolved as an example framework for Mesos - that's how I know it. It
is surprising to see that the options provided by Mesos in this case are
fewer. As for tweaking the source code, I haven't done it yet, but I would
love to see what options could be there!



-- 
Regards,
Haripriya Ayyalasomayajula

Re: Controlling number of executors on Mesos vs YARN

Posted by Jerry Lam <ch...@gmail.com>.
My experience with Mesos + Spark is not great. I saw one executor with 30 CPUs and another executor with 6. So I don't think you can easily configure it without some tweaking of the source code.
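
That lopsided split matches how coarse-grain mode works: each executor takes
whatever the offer for that node contains, and the only application-side limit
today is the cap on total cores, not a per-executor size. A sketch of that cap
(assuming the standard coarse-grain properties):

    // Coarse-grain Mesos: caps the application's total cores, but individual
    // executors still get whatever each offer happens to contain (e.g. 30 + 6).
    val conf = new org.apache.spark.SparkConf()
      .set("spark.mesos.coarse", "true")
      .set("spark.cores.max", "36")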

Sent from my iPad
