Posted to user@spark.apache.org by Petr Novak <os...@gmail.com> on 2016/02/26 12:40:45 UTC

Standalone vs. Mesos for production installation on a smallish cluster

Hi all,
I believe the documentation used to say that Standalone mode is not meant
for production. Either I'm wrong or that statement has since been removed.

For a small cluster of 5-10 nodes, is Standalone recommended for
production? I would like to go with Mesos, but the question is whether it
adds real value in production, mainly from a stability perspective.

Can I expect that adding Mesos will improve stability over Standalone
enough to justify the somewhat increased complexity?

I know this is hard to answer because the Mesos layer itself is going to
add some bugs of its own.

Are there unique Spark-specific features that Mesos enables? E.g. adaptive
resources for jobs or something similar?

In the future, once the cluster grows and more services run on Mesos, we
plan to use Mesos. The question is whether it is worth going with it
immediately, even though its utility may not be directly needed at this
point.

Many thanks,
Petr

Re: Standalone vs. Mesos for production installation on a smallish cluster

Posted by Tim Chen <ti...@mesosphere.io>.
Mesos does provide some benefits and features, such as the ability to
launch all the Spark pieces in Docker and the Mesos resource scheduling
features (weights, roles). If you also plan to use HDFS/Cassandra, there
are existing frameworks that are actively maintained by us.

That said, when there are just 5 nodes and you only want to run Spark,
without any other frameworks and without adding complexity, I would also
suggest using Standalone.
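
A minimal sketch of how the choice shows up at submit time, assuming the
usual spark-submit command line. The host name master1, the role name
"analytics", the image name and app.jar are all made up for illustration;
spark.cores.max, spark.mesos.role and spark.mesos.executor.docker.image are
the Spark property names as far as I know, so check them against the docs
for your Spark version. Weights are set on the Mesos side and are not shown.

```python
# Sketch only: master1, "analytics", the image name and app.jar are hypothetical.
import shlex


def submit_args(master, conf, app="app.jar"):
    """Build a spark-submit argument list for a master URL and a dict of settings."""
    args = ["spark-submit", "--master", master]
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args + [app]


# Standalone: the master URL plus a cap on the cores the application may take.
standalone = submit_args("spark://master1:7077", {"spark.cores.max": "16"})

# Mesos: same application, plus a Mesos role and a Docker image for the executors.
mesos = submit_args(
    "mesos://master1:5050",
    {
        "spark.cores.max": "16",
        "spark.mesos.role": "analytics",
        "spark.mesos.executor.docker.image": "example/spark:latest",
    },
)

# Print both command lines so they can be compared side by side.
print(" ".join(shlex.quote(a) for a in standalone))
print(" ".join(shlex.quote(a) for a in mesos))
```

The application itself does not change between the two; only the master URL
and a few properties do, which is why starting on Standalone and moving to
Mesos later should be a fairly small step.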

Tim

On Fri, Feb 26, 2016 at 3:51 AM, Igor Berman <ig...@gmail.com> wrote:

> IMHO most production clusters are Standalone.
> There was a presentation from Spark Summit with some stats in it (I can't
> find it right now), and Standalone was in 1st place.
> It was from Matei:
> https://databricks.com/resources/slides

RE: Standalone vs. Mesos for production installation on a smallish cluster

Posted by Mohammed Guller <mo...@glassbeam.com>.
I think you may be referring to Spark Survey 2015. According to that survey, 48% use standalone, 40% use YARN and only 11% use Mesos (the numbers don’t add up to 100 – probably because of rounding error).

Mohammed
Author: Big Data Analytics with Spark <http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>


Re: Standalone vs. Mesos for production installation on a smallish cluster

Posted by Igor Berman <ig...@gmail.com>.
IMHO most production clusters are Standalone.
There was a presentation from Spark Summit with some stats in it (I can't
find it right now), and Standalone was in 1st place.
It was from Matei:
https://databricks.com/resources/slides