Posted to user@spark.apache.org by KhajaAsmath Mohammed <md...@gmail.com> on 2016/07/16 16:13:41 UTC
High availability with Spark
Hi,
Could you please share your thoughts if anyone has ideas on the topics below.
- How do we achieve high availability with a Spark cluster? I have referred
to https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/exercises/spark-exercise-standalone-master-ha.html
. Is there any other way to do this in cluster mode?
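For context, the standalone-master HA setup from that link boils down to a spark-env.sh fragment like the one below (a sketch; the zk1/zk2/zk3 hostnames and the /spark znode path are placeholders for our ZooKeeper ensemble):

```
# spark-env.sh on every master node -- enables ZooKeeper-based master recovery
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
  -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 \
  -Dspark.deploy.zookeeper.dir=/spark"
```

With this on all masters, a standby master takes over when the active one dies, and workers/applications reconnect automatically.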
- How do we achieve high availability of the Spark driver? I have gone through
the documentation, which says it is achieved through a checkpointing
directory. Is there any other way?
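What I understood from the docs on the driver side is the getOrCreate pattern with a checkpoint directory, combined with submitting via spark-submit --deploy-mode cluster --supervise so the driver is restarted on failure. A minimal sketch of what I tried (the checkpoint path, app name, and batch interval are placeholders):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Placeholder checkpoint path -- should be on HDFS/S3 in a real cluster
val checkpointDir = "hdfs:///spark/checkpoints/my-app"

def createContext(): StreamingContext = {
  val conf = new SparkConf().setAppName("ha-driver-example")
  val ssc = new StreamingContext(conf, Seconds(10))
  ssc.checkpoint(checkpointDir) // metadata checkpointing for driver recovery
  // ... define the streaming computation here ...
  ssc
}

// On a fresh start this calls createContext(); after a driver restart
// it rebuilds the context from the checkpoint data instead
val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
ssc.start()
ssc.awaitTermination()
```

Is the checkpoint-plus-supervise combination the recommended approach, or is there something else?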
- What is the procedure to find out the number of messages that have been
consumed by the consumer? Is there any way to track the number of messages
consumed in Spark Streaming?
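What I have tried so far for counting is per-batch counting with foreachRDD plus an accumulator (a sketch; assumes ssc is the StreamingContext and stream is an already-defined DStream of messages):

```scala
// Running total across batches; accumulator(initialValue, name) is the
// 1.x API (Spark 2.x renames this to longAccumulator)
val totalConsumed = ssc.sparkContext.accumulator(0L, "messagesConsumed")

stream.foreachRDD { rdd =>
  val batchCount = rdd.count()   // messages in this micro-batch
  totalConsumed += batchCount
  println(s"batch=$batchCount total=${totalConsumed.value}")
}
```

Is this the right approach, or is it better to read the offset ranges from the Kafka direct stream to get exact consumed counts?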
- I also want to save data from Spark Streaming periodically and do
aggregations on it. Let's say, save data every hour/day etc. and run
aggregations on that.
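For the periodic save/aggregate part, what I had in mind is a windowed aggregation (a sketch, assuming pairs is a DStream[(String, Long)] of keyed counts, checkpointing is enabled, and an hourly window; the output path is a placeholder):

```scala
import org.apache.spark.streaming.Minutes

// Aggregate one hour of data and emit once per hour
val hourly = pairs.reduceByKeyAndWindow(
  (a: Long, b: Long) => a + b, // combine values for the same key in the window
  Minutes(60),                 // window length: one hour of data
  Minutes(60))                 // slide interval: emit once per hour

// Each emitted RDD covers one hour; files get a timestamp suffix per batch
hourly.saveAsTextFiles("hdfs:///spark/output/hourly")
```

Would this be the recommended way, or is it better to save the raw stream and run a separate batch job for the daily aggregation?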
Thanks,
Asmath.