Posted to user@spark.apache.org by Ashok Kumar <as...@yahoo.com.INVALID> on 2017/02/05 09:11:15 UTC

High Availability/DR options for Spark applications

Hello,
What are the established High Availability/DR practices for a Spark cluster at the moment? I am especially interested in the case where YARN is used as the resource manager.
Thanks

Re: High Availability/DR options for Spark applications

Posted by Ashok Kumar <as...@yahoo.com.INVALID>.
Hi,
High Availability means that the system, including Spark, will carry on with minimal disruption if an active component fails. DR, or disaster recovery, means total fail-over to another location with its own nodes, HDFS and Spark cluster.
Thanks

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org



   

Re: High Availability/DR options for Spark applications

Posted by Jacek Laskowski <ja...@japila.pl>.
Hi,

I'm not very familiar with "High Availability/DR operations". Could
you explain what it is? My very limited understanding of the phrase
suggests that with YARN and cluster deploy mode you get failure
recovery for free: when your driver dies, YARN will attempt to
restart it a few times. The other "components", i.e. map shuffle
stages and partitions/tasks, are handled by Spark itself.
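
As a rough illustration of the retry behaviour described above, here is a minimal Python sketch that assembles a spark-submit command line for YARN cluster mode. The helper build_spark_submit, the jar name, the class name and the chosen values are all hypothetical; the configuration keys (spark.yarn.maxAppAttempts, spark.yarn.am.attemptFailuresValidityInterval, spark.task.maxFailures) are standard Spark settings, but suitable values depend on the cluster.

```python
from typing import Dict, List


def build_spark_submit(app_jar: str, main_class: str, conf: Dict[str, str]) -> List[str]:
    """Hypothetical helper: build a spark-submit argument list for YARN cluster mode."""
    cmd = [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",  # the driver runs inside the YARN application master
        "--class", main_class,
    ]
    # Emit settings in sorted order so the resulting command is deterministic.
    for key, value in sorted(conf.items()):
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app_jar)
    return cmd


# Illustrative values only (assumptions, not recommendations).
retry_conf = {
    # How many times YARN may start the application (and so the driver).
    # Effectively capped by the cluster-wide yarn.resourcemanager.am.max-attempts.
    "spark.yarn.maxAppAttempts": "4",
    # Attempt failures older than this interval no longer count towards the limit.
    "spark.yarn.am.attemptFailuresValidityInterval": "1h",
    # How many times Spark retries a single task before failing the job.
    "spark.task.maxFailures": "8",
}

command = build_spark_submit("my-app.jar", "com.example.MyApp", retry_conf)
print(" ".join(command))
```

Note that these settings only cover recovery within one cluster. They do nothing for fail-over to another site, which is the DR half of the question.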

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

