Posted to user@spark.apache.org by Matt Narrell <ma...@gmail.com> on 2015/03/31 02:23:34 UTC

Spark Streaming on YARN with loss of application master

I’m looking at various HA scenarios with Spark Streaming.  We’re currently running a Spark Streaming job that is intended to be long-lived, 24/7.  We see that if we kill a node manager hosting Spark executors, the work it was running is picked up by executors on the remaining node managers.  However, if we stop the node manager hosting the application master, the job is marked as FINISHED.  Is this expected behavior?  I assumed the resource manager would resubmit or migrate the application master to another available node.  We’re in an isolated test environment, so we’re sure the cluster has enough resources to sustain outages and resubmissions.
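
For context, the driver is wired up roughly like the sketch below (the object name, app name, and checkpoint path are placeholders from our test setup, and the actual DStream logic is elided), so that a restarted driver could in principle recover from the checkpoint rather than starting cold:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object ResilientStream {
      // Example checkpoint path from our test cluster; any location
      // visible to every application attempt (we use HDFS) should do.
      val checkpointDir = "hdfs:///tmp/streaming-checkpoint"

      // Builds a fresh context; only called when no checkpoint exists yet.
      def createContext(): StreamingContext = {
        val conf = new SparkConf().setAppName("long-lived-stream")
        val ssc = new StreamingContext(conf, Seconds(10))
        ssc.checkpoint(checkpointDir)
        // ... DStream definitions go here, before returning ssc ...
        ssc
      }

      def main(args: Array[String]): Unit = {
        // First attempt calls createContext(); a restarted attempt
        // rebuilds the context and pending batches from the checkpoint.
        val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
        ssc.start()
        ssc.awaitTermination()
      }
    }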

I’ve read that Hadoop MapReduce has this functionality built in, so I would expect there to be an equivalent Spark driver configuration that lets the YARN resource manager reschedule the Spark application master if it stops responding within some threshold.
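
Concretely, these are the knobs I’ve come across so far; I may be misreading the docs, so treat the property names and values below as my reading rather than a confirmed recipe.  In yarn-site.xml:

    <property>
      <!-- cluster-wide cap on attempts per application master (default 2) -->
      <name>yarn.resourcemanager.am.max-attempts</name>
      <value>4</value>
    </property>
    <property>
      <!-- RM considers an AM failed after this long without a heartbeat -->
      <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
      <value>600000</value>
    </property>

and on submission (spark.yarn.maxAppAttempts seems to exist in recent Spark versions and should be no larger than the YARN setting above):

    spark-submit \
      --master yarn-cluster \
      --conf spark.yarn.maxAppAttempts=4 \
      --class com.example.StreamApp app.jar   # hypothetical class and jar

Is this the right direction, or does Spark Streaming need something more for the AM to come back in a usable state?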

Thanks,
Matt
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org