Posted to commits@flink.apache.org by uc...@apache.org on 2015/11/19 12:25:45 UTC
flink git commit: [docs] Add note about Znode root config for HA setups
Repository: flink
Updated Branches:
refs/heads/release-0.10 821bcf5b6 -> 0e80b0540
[docs] Add note about Znode root config for HA setups
Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/0e80b054
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/0e80b054
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/0e80b054
Branch: refs/heads/release-0.10
Commit: 0e80b0540957dd7a020786ffedbf1feb2d42fc5a
Parents: 821bcf5
Author: Ufuk Celebi <uc...@apache.org>
Authored: Thu Nov 19 12:25:24 2015 +0100
Committer: Ufuk Celebi <uc...@apache.org>
Committed: Thu Nov 19 12:25:24 2015 +0100
----------------------------------------------------------------------
docs/setup/jobmanager_high_availability.md | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/flink/blob/0e80b054/docs/setup/jobmanager_high_availability.md
----------------------------------------------------------------------
diff --git a/docs/setup/jobmanager_high_availability.md b/docs/setup/jobmanager_high_availability.md
index 39d904f..125f163 100644
--- a/docs/setup/jobmanager_high_availability.md
+++ b/docs/setup/jobmanager_high_availability.md
@@ -68,6 +68,12 @@ In order to start an HA-cluster add the following configuration keys to `conf/fl
<pre>recovery.zookeeper.quorum: address1:2181[,...],addressX:2181</pre>
Each *addressX:port* refers to a ZooKeeper server, which is reachable by Flink at the given address and port.
+
+- **ZooKeeper root** (recommended): The *root ZooKeeper node*, under which all required coordination data is placed.
+
+ <pre>recovery.zookeeper.path.root: /flink # important: customize per cluster</pre>
+
+ **Important**: If you are running multiple Flink HA clusters, you have to manually configure separate root nodes for each cluster.
- **State backend and storage directory** (required): JobManager meta data is persisted in the *state backend* and only a pointer to this state is stored in ZooKeeper. Currently, only the file system state backend is supported in HA mode.
@@ -78,7 +84,7 @@ recovery.zookeeper.storageDir: hdfs:///flink/recovery/</pre>
The `storageDir` stores all metadata needed to recover from a JobManager failure.
-After configuring the masters and the ZooKeeper quorum, you can use the provided cluster startup scripts as usual. They will start a HA-cluster. **Keep in mind that the ZooKeeper quorum has to be running when you call the scripts**.
+After configuring the masters and the ZooKeeper quorum, you can use the provided cluster startup scripts as usual. They will start an HA-cluster. Keep in mind that the **ZooKeeper quorum has to be running** when you call the scripts, and make sure to **configure a separate ZooKeeper root path** for each HA cluster you are starting.
#### Example: Standalone Cluster with 2 JobManagers
@@ -87,6 +93,7 @@ After configuring the masters and the ZooKeeper quorum, you can use the provided
<pre>
recovery.mode: zookeeper
recovery.zookeeper.quorum: localhost:2181
+recovery.zookeeper.path.root: /flink # important: customize per cluster
state.backend: filesystem
state.backend.fs.checkpointdir: hdfs:///flink/checkpoints
recovery.zookeeper.storageDir: hdfs:///flink/recovery/</pre>
@@ -128,7 +135,7 @@ Stopping zookeeper daemon (pid: 7101) on host localhost.</pre>
## YARN Cluster High Availability
-When running a highly available YARN cluster, **we don't run multiple JobManager (ApplicationMaster) instances**, but only one, which is restarted by YARN. The exact behaviour depends on on the specific YARN version you are using.
+When running a highly available YARN cluster, **we don't run multiple JobManager (ApplicationMaster) instances**, but only one, which is restarted by YARN on failures. The exact behaviour depends on the specific YARN version you are using.
### Configuration
@@ -169,6 +176,7 @@ This means that the application can be restarted 10 times before YARN fails the
<pre>
recovery.mode: zookeeper
recovery.zookeeper.quorum: localhost:2181
+recovery.zookeeper.path.root: /flink # important: customize per cluster
state.backend: filesystem
state.backend.fs.checkpointdir: hdfs:///flink/checkpoints
recovery.zookeeper.storageDir: hdfs:///flink/recovery/
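
The point of the new `recovery.zookeeper.path.root` note is that each HA cluster needs its own ZooKeeper namespace. As an illustrative sketch (the cluster-specific path names below are made up for this example, not taken from the commit), two clusters sharing one ZooKeeper quorum would each set a distinct root in their own `conf/flink-conf.yaml`:

<pre>
# conf/flink-conf.yaml of the first HA cluster
recovery.mode: zookeeper
recovery.zookeeper.quorum: localhost:2181
recovery.zookeeper.path.root: /flink-cluster-one

# conf/flink-conf.yaml of the second HA cluster
recovery.mode: zookeeper
recovery.zookeeper.quorum: localhost:2181
recovery.zookeeper.path.root: /flink-cluster-two
</pre>

With distinct roots, each cluster's coordination data is placed under its own znode, so the clusters cannot interfere with each other's leader election or recovery state even though they talk to the same quorum.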