Posted to user@spark.apache.org by Dogtail Ray <sp...@gmail.com> on 2015/07/23 01:56:49 UTC

Comparison between Standalone mode and YARN mode

Hi,

I am very curious about the differences between Standalone mode and YARN
mode. According to
http://blog.cloudera.com/blog/2014/05/apache-spark-resource-management-and-yarn-app-models/,
it seems that YARN mode is always better than Standalone mode. Is that the
case? Or should I choose different modes according to my specific
requirements? Thanks!

Re: Comparison between Standalone mode and YARN mode

Posted by Dean Wampler <de...@gmail.com>.
YARN and Mesos are better for production clusters of "non-trivial" size
that have mixed job kinds and multiple users, as they manage resources more
intelligently and dynamically. They also support other services you
probably need, like HDFS, databases, workflow tools, etc.

Standalone is fine, though, if you have a limited number of jobs competing
for resources, for example a small cluster dedicated to ingesting or
processing a specific kind of data, or for a dev/QA cluster. Standalone
mode has much lower overhead, but you have to manage the daemon services
yourself, including configuration of ZooKeeper if you need master failover.
Hence, you don't see it often in production scenarios.
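
To give a feel for the ZooKeeper part, here is a minimal sketch in Python of what you end up putting in spark-env.sh on each standalone master. The spark.deploy.* property names are the ones the standalone docs describe for ZooKeeper recovery; the helper functions and the host names (zk1, m1, ...) are hypothetical, made up for illustration only.

```python
def zookeeper_recovery_opts(zk_hosts, zk_dir="/spark"):
    """Build the SPARK_DAEMON_JAVA_OPTS line for spark-env.sh on each master.

    zk_hosts: list of "host:port" strings for the ZooKeeper ensemble
              (hypothetical hosts; substitute your own).
    zk_dir:   znode under which the masters store recovery state.
    """
    if not zk_hosts:
        raise ValueError("at least one ZooKeeper host:port is required")
    props = {
        "spark.deploy.recoveryMode": "ZOOKEEPER",
        "spark.deploy.zookeeper.url": ",".join(zk_hosts),
        "spark.deploy.zookeeper.dir": zk_dir,
    }
    # Each property becomes a -D JVM flag for the master daemon.
    flags = " ".join(f"-D{key}={value}" for key, value in props.items())
    return f'export SPARK_DAEMON_JAVA_OPTS="{flags}"'


def ha_master_url(masters):
    """Master URL that lists every master, so clients can find the current leader."""
    if not masters:
        raise ValueError("at least one master host:port is required")
    return "spark://" + ",".join(masters)


if __name__ == "__main__":
    print(zookeeper_recovery_opts(["zk1:2181", "zk2:2181", "zk3:2181"]))
    print(ha_master_url(["m1:7077", "m2:7077"]))
```

You still have to run and monitor the ZooKeeper ensemble and the master daemons yourself, which is the operational cost mentioned above.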

The Spark page on cluster deployments has more details:
http://spark.apache.org/docs/latest/cluster-overview.html
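
From the application's side, the choice mostly comes down to the master URL you hand to spark-submit. The sketch below assembles the argument list for either mode. The --master, --deploy-mode and --class flags are real spark-submit options; the function, the jar name and the class name are hypothetical. It assumes a release that accepts "--master yarn" together with --deploy-mode (releases from around the time of this thread also used the yarn-client and yarn-cluster master strings), and for YARN it assumes HADOOP_CONF_DIR or YARN_CONF_DIR points at your cluster configuration.

```python
def spark_submit_args(cluster_manager, app_jar, main_class,
                      standalone_master=None, deploy_mode="client"):
    """Return the spark-submit argument list for a standalone or YARN cluster.

    cluster_manager:   "standalone" or "yarn"
    standalone_master: "host:port" of the standalone master (standalone only)
    deploy_mode:       "client" or "cluster"
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError(f"unknown deploy mode: {deploy_mode}")

    if cluster_manager == "standalone":
        if not standalone_master:
            raise ValueError("standalone mode needs the master's host:port")
        master = f"spark://{standalone_master}"
    elif cluster_manager == "yarn":
        # No host here: the ResourceManager address comes from the Hadoop
        # configuration found via HADOOP_CONF_DIR or YARN_CONF_DIR.
        master = "yarn"
    else:
        raise ValueError(f"unsupported cluster manager: {cluster_manager}")

    return ["spark-submit",
            "--master", master,
            "--deploy-mode", deploy_mode,
            "--class", main_class,
            app_jar]


if __name__ == "__main__":
    # Hypothetical application jar and main class.
    print(" ".join(spark_submit_args("standalone", "app.jar", "com.example.Main",
                                     standalone_master="m1:7077")))
    print(" ".join(spark_submit_args("yarn", "app.jar", "com.example.Main",
                                     deploy_mode="cluster")))
```

The application code itself stays the same in both cases, so it is cheap to start on Standalone and move to YARN later if the cluster grows.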

dean

Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition
<http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
Typesafe <http://typesafe.com>
@deanwampler <http://twitter.com/deanwampler>
http://polyglotprogramming.com
