Posted to issues@flink.apache.org by "Till Rohrmann (JIRA)" <ji...@apache.org> on 2017/06/29 12:47:00 UTC
[jira] [Resolved] (FLINK-4356) new JobManager HA
[ https://issues.apache.org/jira/browse/FLINK-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Till Rohrmann resolved FLINK-4356.
----------------------------------
Resolution: Later
> new JobManager HA
> -----------------
>
> Key: FLINK-4356
> URL: https://issues.apache.org/jira/browse/FLINK-4356
> Project: Flink
> Issue Type: Sub-task
> Components: Cluster Management
> Reporter: jingzhang
>
> 1. For standalone mode, the LocalDispatcher watches the JobMaster:
> The LocalDispatcher detects the failure of the JobMaster, recovers the JobGraph and libraries from persistent storage, and spawns a new JobManager.
> The new JobMaster competes for leadership and saves its address to ZooKeeper storage.
> The new JobMaster registers at the ResourceManager.
> The new JobMaster recovers the execution of its job (execution graph) from the latest completed checkpoint.
> 2. For YARN mode, the YarnApplicationMasterRunner creates a ProcessReaper for the JobMaster:
> The ProcessReaper monitors the JobMaster and kills the JVM upon JobMaster termination.
> YARN will create a new AppMaster, which contains a new JobManager; the JobGraph and libraries are retrieved as startup artifacts.
> The new JobMaster competes for leadership and saves its address to ZooKeeper storage.
> The new JobMaster registers at the ResourceManager.
> The new JobMaster recovers the execution of its job (execution graph) from the latest completed checkpoint.
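
The recovery sequence shared by both modes (retrieve the job artifacts, compete for leadership, register at the ResourceManager, restore from the latest completed checkpoint) can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not Flink code: every class and function name below (PersistentStore, LeaderRegistry, ResourceManager, recover_job_master) is hypothetical, the leader registry merely imitates ZooKeeper storage with a dictionary, and the addresses are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PersistentStore:
    """Hypothetical stand-in for persistent storage (job graphs, libraries, checkpoints)."""
    job_graphs: Dict[str, str] = field(default_factory=dict)
    libraries: Dict[str, List[str]] = field(default_factory=dict)
    completed_checkpoints: Dict[str, List[int]] = field(default_factory=dict)

    def latest_completed_checkpoint(self, job_id: str) -> Optional[int]:
        checkpoints = self.completed_checkpoints.get(job_id, [])
        return max(checkpoints) if checkpoints else None


class LeaderRegistry:
    """Hypothetical stand-in for the ZooKeeper-backed leader address storage."""

    def __init__(self) -> None:
        self._leaders: Dict[str, str] = {}

    def compete(self, job_id: str, address: str) -> bool:
        # The first contender wins; others lose until the entry is released.
        if job_id in self._leaders:
            return False
        self._leaders[job_id] = address
        return True

    def release(self, job_id: str) -> None:
        # Assumption: a failed leader's entry disappears from the registry.
        self._leaders.pop(job_id, None)

    def leader_of(self, job_id: str) -> Optional[str]:
        return self._leaders.get(job_id)


class ResourceManager:
    """Hypothetical stand-in that only records which JobMaster registered."""

    def __init__(self) -> None:
        self.registered: Dict[str, str] = {}

    def register(self, job_id: str, address: str) -> None:
        self.registered[job_id] = address


@dataclass
class RecoveredJobMaster:
    job_id: str
    address: str
    job_graph: str
    libraries: List[str]
    restored_checkpoint: Optional[int]


def recover_job_master(
    job_id: str,
    new_address: str,
    store: PersistentStore,
    registry: LeaderRegistry,
    resource_manager: ResourceManager,
) -> Optional[RecoveredJobMaster]:
    # Step 1: retrieve the job graph and libraries.
    job_graph = store.job_graphs[job_id]
    libraries = store.libraries.get(job_id, [])
    # Step 2: compete for leadership and publish the new address.
    if not registry.compete(job_id, new_address):
        return None  # another JobMaster still holds leadership
    # Step 3: register at the ResourceManager.
    resource_manager.register(job_id, new_address)
    # Step 4: restore execution from the latest completed checkpoint.
    checkpoint = store.latest_completed_checkpoint(job_id)
    return RecoveredJobMaster(job_id, new_address, job_graph, libraries, checkpoint)
```

In this sketch a replacement JobMaster fails the leadership step while the old leader's entry is still present, and succeeds only after that entry has been released. How the real implementation detects the old leader's loss is not specified in the issue description.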
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)