Posted to commits@zookeeper.apache.org by an...@apache.org on 2018/11/09 16:41:42 UTC
[05/18] zookeeper git commit: ZOOKEEPER-3155: Remove Forrest XMLs and their build process from the …
http://git-wip-us.apache.org/repos/asf/zookeeper/blob/ab59048a/zookeeper-docs/src/documentation/content/xdocs/zookeeperAdmin.xml
----------------------------------------------------------------------
diff --git a/zookeeper-docs/src/documentation/content/xdocs/zookeeperAdmin.xml b/zookeeper-docs/src/documentation/content/xdocs/zookeeperAdmin.xml
deleted file mode 100644
index d82e234..0000000
--- a/zookeeper-docs/src/documentation/content/xdocs/zookeeperAdmin.xml
+++ /dev/null
@@ -1,2315 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-<!DOCTYPE article PUBLIC "-//OASIS//DTD Simplified DocBook XML V1.0//EN"
-"http://www.oasis-open.org/docbook/xml/simple/1.0/sdocbook.dtd">
-<article id="bk_Admin">
- <title>ZooKeeper Administrator's Guide</title>
-
- <subtitle>A Guide to Deployment and Administration</subtitle>
-
- <articleinfo>
- <legalnotice>
- <para>Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. You may
- obtain a copy of the License at <ulink
- url="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</ulink>.</para>
-
- <para>Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an "AS IS"
- BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied. See the License for the specific language governing permissions
- and limitations under the License.</para>
- </legalnotice>
-
- <abstract>
- <para>This document contains information about deploying, administering
- and maintaining ZooKeeper. It also discusses best practices and common
- problems.</para>
- </abstract>
- </articleinfo>
-
- <section id="ch_deployment">
- <title>Deployment</title>
-
- <para>This section contains information about deploying ZooKeeper and
- covers these topics:</para>
-
- <itemizedlist>
- <listitem>
- <para><xref linkend="sc_systemReq" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_zkMulitServerSetup" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_singleAndDevSetup" /></para>
- </listitem>
- </itemizedlist>
-
- <para>The first two sections assume you are interested in installing
- ZooKeeper in a production environment such as a datacenter. The final
- section covers situations in which you are setting up ZooKeeper on a
- limited basis - for evaluation, testing, or development - but not in a
- production environment.</para>
-
- <section id="sc_systemReq">
- <title>System Requirements</title>
-
- <section id="sc_supportedPlatforms">
- <title>Supported Platforms</title>
-
- <para>ZooKeeper consists of multiple components. Some components are
- supported broadly, and other components are supported only on a smaller
- set of platforms.</para>
-
- <itemizedlist>
- <listitem>
- <para><emphasis role="bold">Client</emphasis> is the Java client
- library, used by applications to connect to a ZooKeeper ensemble.
- </para>
- </listitem>
- <listitem>
- <para><emphasis role="bold">Server</emphasis> is the Java server
- that runs on the ZooKeeper ensemble nodes.</para>
- </listitem>
- <listitem>
- <para><emphasis role="bold">Native Client</emphasis> is a client
- implemented in C, similar to the Java client, used by applications
- to connect to a ZooKeeper ensemble.</para>
- </listitem>
- <listitem>
- <para><emphasis role="bold">Contrib</emphasis> refers to multiple
- optional add-on components.</para>
- </listitem>
- </itemizedlist>
-
- <para>The following matrix describes the level of support committed for
- running each component on different operating system platforms.</para>
-
- <table>
- <title>Support Matrix</title>
- <tgroup cols="5" align="left" colsep="1" rowsep="1">
- <thead>
- <row>
- <entry>Operating System</entry>
- <entry>Client</entry>
- <entry>Server</entry>
- <entry>Native Client</entry>
- <entry>Contrib</entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry>GNU/Linux</entry>
- <entry>Development and Production</entry>
- <entry>Development and Production</entry>
- <entry>Development and Production</entry>
- <entry>Development and Production</entry>
- </row>
- <row>
- <entry>Solaris</entry>
- <entry>Development and Production</entry>
- <entry>Development and Production</entry>
- <entry>Not Supported</entry>
- <entry>Not Supported</entry>
- </row>
- <row>
- <entry>FreeBSD</entry>
- <entry>Development and Production</entry>
- <entry>Development and Production</entry>
- <entry>Not Supported</entry>
- <entry>Not Supported</entry>
- </row>
- <row>
- <entry>Windows</entry>
- <entry>Development and Production</entry>
- <entry>Development and Production</entry>
- <entry>Not Supported</entry>
- <entry>Not Supported</entry>
- </row>
- <row>
- <entry>Mac OS X</entry>
- <entry>Development Only</entry>
- <entry>Development Only</entry>
- <entry>Not Supported</entry>
- <entry>Not Supported</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
-
- <para>For any operating system not explicitly mentioned as supported in
- the matrix, components may or may not work. The ZooKeeper community
- will fix obvious bugs that are reported for other platforms, but there
- is no full support.</para>
- </section>
-
- <section id="sc_requiredSoftware">
- <title>Required Software </title>
-
- <para>ZooKeeper runs in Java, release 1.8 or greater (JDK 8 or
- greater; FreeBSD support requires openjdk8). It runs as an
- <emphasis>ensemble</emphasis> of ZooKeeper servers. Three
- ZooKeeper servers is the minimum recommended size for an
- ensemble, and we also recommend that they run on separate
- machines. At Yahoo!, ZooKeeper is usually deployed on
- dedicated RHEL boxes, with dual-core processors, 2GB of RAM,
- and 80GB IDE hard drives.</para>
- </section>
-
- </section>
-
- <section id="sc_zkMulitServerSetup">
- <title>Clustered (Multi-Server) Setup</title>
-
- <para>For reliable ZooKeeper service, you should deploy ZooKeeper in a
- cluster known as an <emphasis>ensemble</emphasis>. As long as a majority
- of the ensemble are up, the service will be available. Because ZooKeeper
- requires a majority, it is best to use an
- odd number of machines. For example, with four machines ZooKeeper can
- only handle the failure of a single machine; if two machines fail, the
- remaining two machines do not constitute a majority. However, with five
- machines ZooKeeper can handle the failure of two machines. </para>
- <note>
- <para>
- As mentioned in the
- <ulink url="zookeeperStarted.html">ZooKeeper Getting Started Guide</ulink>,
- a minimum of three servers is required for a fault tolerant
- clustered setup, and it is strongly recommended that you have an
- odd number of servers.
- </para>
- <para>Usually three servers is more than enough for a production
- install, but for maximum reliability during maintenance, you may
- wish to install five servers. With three servers, if you perform
- maintenance on one of them, you are vulnerable to a failure on one
- of the other two servers during that maintenance. If you have five
- of them running, you can take one down for maintenance, and know
- that you're still OK if one of the other four suddenly fails.
- </para>
- <para>Your redundancy considerations should include all aspects of
- your environment. If you have three ZooKeeper servers, but their
- network cables are all plugged into the same network switch, then
- the failure of that switch will take down your entire ensemble.
- </para>
- </note>
- <para>Here are the steps to set up a server that will be part of an
- ensemble. These steps should be performed on every host in the
- ensemble:</para>
-
- <orderedlist>
- <listitem>
- <para>Install the Java JDK. You can use the native packaging system
- for your system, or download the JDK from:</para>
-
- <para><ulink
- url="http://java.sun.com/javase/downloads/index.jsp">http://java.sun.com/javase/downloads/index.jsp</ulink></para>
- </listitem>
-
- <listitem>
- <para>Set the Java heap size. This is very important to avoid
- swapping, which will seriously degrade ZooKeeper performance. To
- determine the correct value, use load tests, and make sure you are
- well below the usage limit that would cause you to swap. Be
- conservative - use a maximum heap size of 3GB for a 4GB
- machine.</para>
- </listitem>
-
- <listitem>
- <para>Install the ZooKeeper Server Package. It can be downloaded
- from:
- </para>
- <para>
- <ulink url="http://zookeeper.apache.org/releases.html">
- http://zookeeper.apache.org/releases.html
- </ulink>
- </para>
- </listitem>
-
- <listitem>
- <para>Create a configuration file. This file can be called anything.
- Use the following settings as a starting point:</para>
-
- <programlisting>
-tickTime=2000
-dataDir=/var/lib/zookeeper/
-clientPort=2181
-initLimit=5
-syncLimit=2
-server.1=zoo1:2888:3888
-server.2=zoo2:2888:3888
-server.3=zoo3:2888:3888</programlisting>
-
- <para>You can find the meanings of these and other configuration
- settings in the section <xref linkend="sc_configuration" />. A few of
- them deserve a brief explanation here:</para>
-
- <para>Every machine that is part of the ZooKeeper ensemble should know
- about every other machine in the ensemble. You accomplish this with
- the series of lines of the form <emphasis
- role="bold">server.id=host:port:port</emphasis>. The parameters <emphasis
- role="bold">host</emphasis> and <emphasis
- role="bold">port</emphasis> are straightforward. You assign a
- server id to each machine by creating a file named
- <filename>myid</filename>, one for each server, which resides in
- that server's data directory, as specified by the configuration file
- parameter <emphasis role="bold">dataDir</emphasis>.</para></listitem>
-
- <listitem><para>The myid file
- consists of a single line containing only the text of that machine's
- id. So <filename>myid</filename> of server 1 would contain the text
- "1" and nothing else. The id must be unique within the
- ensemble and should have a value between 1 and 255. <emphasis role="bold">IMPORTANT:</emphasis> if you
- enable extended features such as TTL Nodes (see below) the id must be
- between 1 and 254 due to internal limitations.</para>
- </listitem>
-
- <listitem>
- <para>If your configuration file is set up, you can start a
- ZooKeeper server:</para>
-
- <para><computeroutput>$ java -cp zookeeper.jar:lib/slf4j-api-1.7.5.jar:lib/slf4j-log4j12-1.7.5.jar:lib/log4j-1.2.17.jar:conf \
- org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.cfg
- </computeroutput></para>
-
- <para>QuorumPeerMain starts a ZooKeeper server.
- <ulink url="http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/">JMX</ulink>
- management beans are also registered, which allows
- management through a JMX management console.
- The <ulink url="zookeeperJMX.html">ZooKeeper JMX
- document</ulink> contains details on managing ZooKeeper with JMX.
- </para>
-
- <para>See the script <emphasis>bin/zkServer.sh</emphasis>,
- which is included in the release, for an example
- of starting server instances.</para>
-
- </listitem>
-
- <listitem>
- <para>Test your deployment by connecting to the hosts:</para>
-
- <para>In Java, you can run the following command to execute
- simple operations:</para>
-
- <para><computeroutput>$ bin/zkCli.sh -server 127.0.0.1:2181</computeroutput></para>
- </listitem>
- </orderedlist>
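The configuration and myid steps above can be sketched in a few lines of Python. This is only an illustration, not part of ZooKeeper: the function names render_ensemble_config and myid_contents are hypothetical, and the settings are simply the starting-point values from the sample configuration. Ids are assigned 1..N in list order, which is one possible convention and not a requirement.

```python
from typing import Dict, List


def render_ensemble_config(hosts: List[str], data_dir: str = "/var/lib/zookeeper/") -> str:
    """Render a zoo.cfg-style file for the given hosts (ids assigned 1..N)."""
    # The guide asks for ids between 1 and 255, so cap the ensemble size accordingly.
    if len(hosts) not in range(1, 256):
        raise ValueError("expected between 1 and 255 hosts")
    lines = [
        "tickTime=2000",
        f"dataDir={data_dir}",
        "clientPort=2181",
        "initLimit=5",
        "syncLimit=2",
    ]
    # One server.id=host:port:port line per member; every member gets the same list.
    for server_id, host in enumerate(hosts, start=1):
        lines.append(f"server.{server_id}={host}:2888:3888")
    return "\n".join(lines) + "\n"


def myid_contents(hosts: List[str]) -> Dict[str, str]:
    """Map each host to the single line its dataDir/myid file should contain."""
    return {host: f"{server_id}\n" for server_id, host in enumerate(hosts, start=1)}


if __name__ == "__main__":
    ensemble = ["zoo1", "zoo2", "zoo3"]
    print(render_ensemble_config(ensemble), end="")
    print(myid_contents(ensemble))
```

Generating both outputs from one host list keeps the server.N lines and the myid files consistent, which is the property the steps above depend on.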
- </section>
-
- <section id="sc_singleAndDevSetup">
- <title>Single Server and Developer Setup</title>
-
- <para>If you want to set up ZooKeeper for development purposes, you will
- probably want to set up a single server instance of ZooKeeper, and then
- install either the Java or C client-side libraries and bindings on your
- development machine.</para>
-
- <para>The steps for setting up a single server instance are similar
- to the above, except the configuration file is simpler. You can find the
- complete instructions in the <ulink
- url="zookeeperStarted.html#sc_InstallingSingleMode">Installing and
- Running ZooKeeper in Single Server Mode</ulink> section of the <ulink
- url="zookeeperStarted.html">ZooKeeper Getting Started
- Guide</ulink>.</para>
-
- <para>For information on installing the client side libraries, refer to
- the <ulink url="zookeeperProgrammers.html#Bindings">Bindings</ulink>
- section of the <ulink url="zookeeperProgrammers.html">ZooKeeper
- Programmer's Guide</ulink>.</para>
- </section>
- </section>
-
- <section id="ch_administration">
- <title>Administration</title>
-
- <para>This section contains information about running and maintaining
- ZooKeeper and covers these topics: </para>
- <itemizedlist>
- <listitem>
- <para><xref linkend="sc_designing" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_provisioning" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_strengthsAndLimitations" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_administering" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_maintenance" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_supervision" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_monitoring" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_logging" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_troubleshooting" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_configuration" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_zkCommands" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_dataFileManagement" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_commonProblems" /></para>
- </listitem>
-
- <listitem>
- <para><xref linkend="sc_bestPractices" /></para>
- </listitem>
- </itemizedlist>
-
- <section id="sc_designing">
- <title>Designing a ZooKeeper Deployment</title>
-
- <para>The reliability of ZooKeeper rests on two basic assumptions.</para>
- <orderedlist>
- <listitem><para> Only a minority of servers in a deployment
- will fail. <emphasis>Failure</emphasis> in this context
- means a machine crash, or some error in the network that
- partitions a server off from the majority.</para>
- </listitem>
- <listitem><para> Deployed machines operate correctly. To
- operate correctly means to execute code correctly, to have
- clocks that work properly, and to have storage and network
- components that perform consistently.</para>
- </listitem>
- </orderedlist>
-
- <para>The sections below contain considerations for ZooKeeper
- administrators to maximize the probability for these assumptions
- to hold true. Some of these are cross-machine considerations,
- and others are things you should consider for each and every
- machine in your deployment.</para>
-
- <section id="sc_CrossMachineRequirements">
- <title>Cross Machine Requirements</title>
-
- <para>For the ZooKeeper service to be active, there must be a
- majority of non-failing machines that can communicate with
- each other. To create a deployment that can tolerate the
- failure of F machines, you should count on deploying 2xF+1
- machines. Thus, a deployment that consists of three machines
- can handle one failure, and a deployment of five machines can
- handle two failures. Note that a deployment of six machines
- can only handle two failures since three machines is not a
- majority. For this reason, ZooKeeper deployments are usually
- made up of an odd number of machines.</para>
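The majority arithmetic in the paragraph above can be checked with a short sketch. This is illustrative Python, not ZooKeeper code, and the function names are hypothetical.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail while a majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, quorum_size(n), tolerated_failures(n))
```

For three, four, five and six machines this gives one, one, two and two tolerated failures, so an even-sized ensemble tolerates no more failures than the odd-sized ensemble one machine smaller.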
-
- <para>To achieve the highest probability of tolerating a failure
- you should try to make machine failures independent. For
- example, if most of the machines share the same switch,
- failure of that switch could cause a correlated failure and
- bring down the service. The same holds true of shared power
- circuits, cooling systems, etc.</para>
- </section>
-
- <section>
- <title>Single Machine Requirements</title>
-
- <para>If ZooKeeper has to contend with other applications for
- access to resources like storage media, CPU, network, or
- memory, its performance will suffer markedly. ZooKeeper has
- strong durability guarantees, which means it uses storage
- media to log changes before the operation responsible for the
- change is allowed to complete. You should be aware of this
- dependency, and take great care to ensure
- that ZooKeeper operations aren’t held up by your media. Here
- are some things you can do to minimize that sort of
- degradation:
- </para>
-
- <itemizedlist>
- <listitem>
- <para>ZooKeeper's transaction log must be on a dedicated
- device. (A dedicated partition is not enough.) ZooKeeper
- writes the log sequentially, without seeking. Sharing your
- log device with other processes can cause seeks and
- contention, which in turn can cause multi-second
- delays.</para>
- </listitem>
-
- <listitem>
- <para>Do not put ZooKeeper in a situation that can cause a
- swap. In order for ZooKeeper to function with any sort of
- timeliness, it simply cannot be allowed to swap.
- Therefore, make certain that the maximum heap size given
- to ZooKeeper is not bigger than the amount of real memory
- available to ZooKeeper. For more on this, see
- <xref linkend="sc_commonProblems"/>
- below. </para>
- </listitem>
- </itemizedlist>
- </section>
- </section>
-
- <section id="sc_provisioning">
- <title>Provisioning</title>
-
- <para></para>
- </section>
-
- <section id="sc_strengthsAndLimitations">
- <title>Things to Consider: ZooKeeper Strengths and Limitations</title>
-
- <para></para>
- </section>
-
- <section id="sc_administering">
- <title>Administering</title>
-
- <para></para>
- </section>
-
- <section id="sc_maintenance">
- <title>Maintenance</title>
-
- <para>Little long-term maintenance is required for a ZooKeeper
- cluster; however, you must be aware of the following:</para>
-
- <section>
- <title>Ongoing Data Directory Cleanup</title>
-
- <para>The ZooKeeper <ulink url="#var_datadir">Data
- Directory</ulink> contains files which are a persistent copy
- of the znodes stored by a particular serving ensemble. These
- are the snapshot and transactional log files. As changes are
- made to the znodes these changes are appended to a
- transaction log. Occasionally, when a log grows large, a
- snapshot of the current state of all znodes will be written
- to the filesystem and a new transaction log file is created
- for future transactions. During snapshotting, ZooKeeper may
- continue appending incoming transactions to the old log file.
- Therefore, some transactions which are newer than a snapshot
- may be found in the last transaction log preceding the
- snapshot.
- </para>
-
- <para>A ZooKeeper server <emphasis role="bold">will not remove
- old snapshots and log files</emphasis> when using the default
- configuration (see autopurge below); this is the
- responsibility of the operator. Every serving environment is
- different and therefore the requirements of managing these
- files may differ from install to install (backup for example).
- </para>
-
- <para>The PurgeTxnLog utility implements a simple retention
- policy that administrators can use. The <ulink
- url="ext:api/index">API docs</ulink> contain details on
- calling conventions (arguments, etc.).
- </para>
-
- <para>In the following example, the last &lt;count&gt; snapshots and
- their corresponding logs are retained and the others are
- deleted. The value of &lt;count&gt; should typically be
- greater than 3 (although not required, this provides 3 backups
- in the unlikely event a recent log has become corrupted). This
- can be run as a cron job on the ZooKeeper server machines to
- clean up the logs daily.</para>
-
- <programlisting> java -cp zookeeper.jar:lib/slf4j-api-1.7.5.jar:lib/slf4j-log4j12-1.7.5.jar:lib/log4j-1.2.17.jar:conf org.apache.zookeeper.server.PurgeTxnLog &lt;dataDir&gt; &lt;snapDir&gt; -n &lt;count&gt;</programlisting>
-
- <para>Automatic purging of the snapshots and corresponding
- transaction logs was introduced in version 3.4.0 and can be
- enabled via the configuration parameters <emphasis
- role="bold">autopurge.snapRetainCount</emphasis> and <emphasis
- role="bold">autopurge.purgeInterval</emphasis>. For more on
- this, see <xref linkend="sc_advancedConfiguration"/>
- below.</para>
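The retention policy described in this section can be sketched as follows. This is a simplified Python illustration, not the PurgeTxnLog implementation: the function names are hypothetical, and it assumes the usual snapshot.ZXID and log.ZXID file names with a hexadecimal zxid suffix, which this section does not spell out. It keeps the newest count snapshots, every log that starts at or after the oldest retained snapshot, and the one log preceding that snapshot, because that log may still hold transactions newer than the snapshot.

```python
from typing import Iterable, List, Set


def zxid_of(name: str) -> int:
    """Parse the hexadecimal zxid suffix of 'snapshot.ZXID' or 'log.ZXID'."""
    return int(name.rsplit(".", 1)[1], 16)


def files_to_purge(names: Iterable[str], count: int) -> List[str]:
    """Return the files a keep-the-last-`count`-snapshots policy would delete."""
    if count < 1:
        raise ValueError("count must be at least 1")
    names = list(names)
    snapshots = sorted((n for n in names if n.startswith("snapshot.")), key=zxid_of)
    logs = sorted((n for n in names if n.startswith("log.")), key=zxid_of)
    retained = snapshots[-count:]
    if not retained:
        return []  # no snapshot to anchor the policy on, so delete nothing
    oldest_kept = zxid_of(retained[0])
    keep: Set[str] = set(retained)
    # Logs that start at or after the oldest retained snapshot are still needed.
    keep.update(log for log in logs if zxid_of(log) >= oldest_kept)
    # So is the last log that starts before it (see the snapshotting note above).
    older = [log for log in logs if zxid_of(log) < oldest_kept]
    if older:
        keep.add(older[-1])
    return [n for n in snapshots + logs if n not in keep]
```

The function only computes a list of names; actually deleting files is left to the caller, so the policy can be reviewed before anything is removed.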
- </section>
-
- <section>
- <title>Debug Log Cleanup (log4j)</title>
-
- <para>See the section on <ulink
- url="#sc_logging">logging</ulink> in this document. It is
- expected that you will set up a rolling file appender using the
- built-in log4j feature. The sample configuration file in the
- release tar's conf/log4j.properties provides an example of
- this.
- </para>
- </section>
-
- </section>
-
- <section id="sc_supervision">
- <title>Supervision</title>
-
- <para>You will want to have a supervisory process that manages
- each of your ZooKeeper server processes (JVM). The ZK server is
- designed to be "fail fast", meaning that it will shut down
- (process exit) if an error occurs that it cannot recover
- from. As a ZooKeeper serving cluster is highly reliable, this
- means that while the server may go down, the cluster as a whole
- is still active and serving requests. Additionally, as the
- cluster is "self healing", the failed server, once restarted, will
- automatically rejoin the ensemble without any manual
- interaction.</para>
-
- <para>Having a supervisory process such as <ulink
- url="http://cr.yp.to/daemontools.html">daemontools</ulink> or
- <ulink
- url="http://en.wikipedia.org/wiki/Service_Management_Facility">SMF</ulink>
- (these are just two examples; other supervisory processes are
- available, and the choice is up to you)
- managing your ZooKeeper server ensures that if the
- process does exit abnormally it will automatically be restarted
- and will quickly rejoin the cluster.</para>
-
- <para>It is also recommended to configure the ZooKeeper server process to
- terminate and dump its heap if an
- <computeroutput>OutOfMemoryError</computeroutput> occurs. This is achieved
- by launching the JVM with the following arguments on Linux and Windows
- respectively. The <filename>zkServer.sh</filename> and
- <filename>zkServer.cmd</filename> scripts that ship with ZooKeeper set
- these options.
- </para>
-
- <programlisting>-XX:+HeapDumpOnOutOfMemoryError -XX:OnOutOfMemoryError='kill -9 %p'</programlisting>
- <programlisting>"-XX:+HeapDumpOnOutOfMemoryError" "-XX:OnOutOfMemoryError=cmd /c taskkill /pid %%%%p /t /f"</programlisting>
- </section>
-
- <section id="sc_monitoring">
- <title>Monitoring</title>
-
- <para>The ZooKeeper service can be monitored in one of two
- primary ways: 1) the command port through the use of <ulink
- url="#sc_zkCommands">4 letter words</ulink> and 2) <ulink
- url="zookeeperJMX.html">JMX</ulink>. See the appropriate section for
- your environment/requirements.</para>
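As a rough illustration of the first option, the sketch below sends a four-letter command such as "stat" to the client port and returns whatever the server writes back. It is a minimal Python sketch under stated assumptions: the function name four_letter_word is hypothetical, the host and port are whatever your deployment uses, and the reply format is not interpreted here.

```python
import socket


def four_letter_word(host: str, port: int, command: str, timeout: float = 5.0) -> str:
    """Send a four-letter command to the client port and return the reply text."""
    if len(command) != 4:
        raise ValueError("expected a four-letter command such as 'stat'")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        # Read until the server closes the connection after writing its reply.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

A monitoring job could call four_letter_word("127.0.0.1", 2181, "stat") against each ensemble member and alert when the connection fails or times out.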
- </section>
-
- <section id="sc_logging">
- <title>Logging</title>
-
- <para>
- ZooKeeper uses <emphasis role="bold"><ulink url="http://www.slf4j.org">SLF4J</ulink></emphasis>
- version 1.7.5 as its logging infrastructure. For backward compatibility it is bound to
- <emphasis role="bold">LOG4J</emphasis> but you can use
- <emphasis role="bold"><ulink url="http://logback.qos.ch/">LOGBack</ulink></emphasis>
- or any other supported logging framework of your choice.
- </para>
- <para>
- The ZooKeeper default <filename>log4j.properties</filename>
- file resides in the <filename>conf</filename> directory. Log4j requires that
- <filename>log4j.properties</filename> either be in the working directory
- (the directory from which ZooKeeper is run) or be accessible from the classpath.
- </para>
-
- <para>For more information about SLF4J, see
- <ulink url="http://www.slf4j.org/manual.html">its manual</ulink>.</para>
-
- <para>For more information about LOG4J, see
- <ulink url="http://logging.apache.org/log4j/1.2/manual.html#defaultInit">Log4j Default Initialization Procedure</ulink>
- of the log4j manual.</para>
-
- </section>
-
- <section id="sc_troubleshooting">
- <title>Troubleshooting</title>
- <variablelist>
- <varlistentry>
- <term> Server not coming up because of file corruption</term>
- <listitem>
- <para>A server might not be able to read its database and fail to come up because of
- some file corruption in the transaction logs of the ZooKeeper server. You will
- see an IOException when the ZooKeeper database is loaded. In such a case,
- make sure all the other servers in your ensemble are up and working. Use the "stat"
- command on the command port to see if they are in good health. After you have verified that
- all the other servers of the ensemble are up, you can go ahead and clean the database
- of the corrupt server. Delete all the files in datadir/version-2 and datalogdir/version-2/.
- Restart the server.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
-
- <section id="sc_configuration">
- <title>Configuration Parameters</title>
-
- <para>ZooKeeper's behavior is governed by the ZooKeeper configuration
- file. This file is designed so that the exact same file can be used by
- all the servers that make up a ZooKeeper ensemble, assuming the disk
- layouts are the same. If servers use different configuration files, care
- must be taken to ensure that the list of servers in all of the different
- configuration files matches.</para>
-
- <note>
- <para>In 3.5.0 and later, some of these parameters should be placed in
- a dynamic configuration file. If they are placed in the static
- configuration file, ZooKeeper will automatically move them over to the
- dynamic configuration file. See <ulink url="zookeeperReconfig.html">
- Dynamic Reconfiguration</ulink> for more information.</para>
- </note>
-
- <section id="sc_minimumConfiguration">
- <title>Minimum Configuration</title>
-
- <para>Here are the minimum configuration keywords that must be defined
- in the configuration file:</para>
-
- <variablelist>
- <varlistentry>
- <term>clientPort</term>
-
- <listitem>
- <para>the port to listen on for client connections; that is, the
- port that clients attempt to connect to.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>secureClientPort</term>
-
- <listitem>
- <para>the port to listen on for secure client connections using SSL.
-
- <emphasis role="bold">clientPort</emphasis> specifies
- the port for plaintext connections while <emphasis role="bold">
- secureClientPort</emphasis> specifies the port for SSL
- connections. Specifying both enables mixed-mode while omitting
- either will disable that mode.</para>
- <para>Note that the SSL feature is enabled when the user configures
- zookeeper.serverCnxnFactory and zookeeper.clientCnxnSocket to use Netty.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry id="var_datadir">
- <term>dataDir</term>
-
- <listitem>
- <para>the location where ZooKeeper will store the in-memory
- database snapshots and, unless specified otherwise, the
- transaction log of updates to the database.</para>
-
- <note>
- <para>Be careful where you put the transaction log. A
- dedicated transaction log device is key to consistent good
- performance. Putting the log on a busy device will adversely
- affect performance.</para>
- </note>
- </listitem>
- </varlistentry>
-
- <varlistentry id="id_tickTime">
- <term>tickTime</term>
-
- <listitem>
- <para>the length of a single tick, which is the basic time unit
- used by ZooKeeper, as measured in milliseconds. It is used to
- regulate heartbeats, and timeouts. For example, the minimum
- session timeout will be two ticks.</para>
- </listitem>
- </varlistentry>
- </variablelist>
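To make the role of tickTime in session timeouts concrete, the sketch below clamps a client's requested timeout into the range the server allows. It is a hedged Python illustration, not ZooKeeper code: the function name is hypothetical, the lower bound of two ticks comes from the tickTime entry above, and the upper bound of twenty ticks is the maxSessionTimeout default described under Advanced Configuration.

```python
from typing import Optional


def negotiated_session_timeout(
    requested_ms: int,
    tick_time_ms: int = 2000,
    min_ms: Optional[int] = None,
    max_ms: Optional[int] = None,
) -> int:
    """Clamp a requested session timeout into the server's allowed range."""
    # Defaults follow the guide: 2 ticks minimum, 20 ticks maximum.
    low = 2 * tick_time_ms if min_ms is None else min_ms
    high = 20 * tick_time_ms if max_ms is None else max_ms
    return max(low, min(requested_ms, high))


if __name__ == "__main__":
    for requested in (1000, 30000, 120000):
        print(requested, negotiated_session_timeout(requested))
```

With tickTime=2000, a request for 1000 ms is raised to 4000 ms, 30000 ms is accepted unchanged, and 120000 ms is lowered to 40000 ms.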
- </section>
-
- <section id="sc_advancedConfiguration">
- <title>Advanced Configuration</title>
-
- <para>The configuration settings in this section are optional. You can
- use them to further fine tune the behaviour of your ZooKeeper servers.
- Some can also be set using Java system properties, generally of the
- form <emphasis>zookeeper.keyword</emphasis>. The exact system
- property, when available, is noted below.</para>
-
- <variablelist>
- <varlistentry>
- <term>dataLogDir</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para>This option will direct the machine to write the
- transaction log to the <emphasis
- role="bold">dataLogDir</emphasis> rather than the <emphasis
- role="bold">dataDir</emphasis>. This allows a dedicated log
- device to be used, and helps avoid competition between logging
- and snapshots.</para>
-
- <note>
- <para>Having a dedicated log device has a large impact on
- throughput and stable latencies. It is highly recommended to
- dedicate a log device and set <emphasis
- role="bold">dataLogDir</emphasis> to point to a directory on
- that device, and then make sure to point <emphasis
- role="bold">dataDir</emphasis> to a directory
- <emphasis>not</emphasis> residing on that device.</para>
- </note>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>globalOutstandingLimit</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.globalOutstandingLimit</emphasis>)</para>
-
- <para>Clients can submit requests faster than ZooKeeper can
- process them, especially if there are a lot of clients. To
- prevent ZooKeeper from running out of memory due to queued
- requests, ZooKeeper will throttle clients so that there are no
- more than globalOutstandingLimit outstanding requests in the
- system. The default limit is 1,000.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>preAllocSize</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.preAllocSize</emphasis>)</para>
-
- <para>To avoid seeks ZooKeeper allocates space in the
- transaction log file in blocks of preAllocSize kilobytes. The
- default block size is 64M. One reason for changing the size of
- the blocks is to reduce the block size if snapshots are taken
- more often. (Also, see <emphasis
- role="bold">snapCount</emphasis>).</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>snapCount</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.snapCount</emphasis>)</para>
-
- <para>ZooKeeper records its transactions using snapshots and
- a transaction log (think write-ahead log). The number of
- transactions recorded in the transaction log before a snapshot
- can be taken (and the transaction log rolled) is determined
- by snapCount. In order to prevent all of the machines in the quorum
- from taking a snapshot at the same time, each ZooKeeper server
- will take a snapshot when the number of transactions in the transaction log
- reaches a runtime generated random value in the [snapCount/2+1, snapCount]
- range. The default snapCount is 100,000.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>maxClientCnxns</term>
- <listitem>
- <para>(No Java system property)</para>
-
- <para>Limits the number of concurrent connections (at the socket
- level) that a single client, identified by IP address, may make
- to a single member of the ZooKeeper ensemble. This is used to
- prevent certain classes of DoS attacks, including file
- descriptor exhaustion. The default is 60. Setting this to 0
- entirely removes the limit on concurrent connections.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>clientPortAddress</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> the
- address (IPv4, IPv6 or hostname) to listen for client
- connections; that is, the address that clients attempt
- to connect to. This is optional. By default, the server binds in
- such a way that any connection to the <emphasis
- role="bold">clientPort</emphasis> for any
- address/interface/NIC on the server will be
- accepted.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>minSessionTimeout</term>
- <listitem>
- <para>(No Java system property)</para>
-
- <para><emphasis role="bold">New in 3.3.0:</emphasis> the
- minimum session timeout in milliseconds that the server
- will allow the client to negotiate. Defaults to 2 times
- the <emphasis role="bold">tickTime</emphasis>.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>maxSessionTimeout</term>
- <listitem>
- <para>(No Java system property)</para>
-
- <para><emphasis role="bold">New in 3.3.0:</emphasis> the
- maximum session timeout in milliseconds that the server
- will allow the client to negotiate. Defaults to 20 times
- the <emphasis role="bold">tickTime</emphasis>.</para>
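-
- <para>As a rough illustration of how the two session timeout defaults
- follow from tickTime, the fragment below assumes a tickTime of 2000
- milliseconds (a value chosen only for this example). The explicit
- minSessionTimeout and maxSessionTimeout lines simply spell out what the
- defaults of 2 and 20 ticks would work out to:</para>
- <programlisting>
- # assumed for this example: one tick is 2000 ms
- tickTime=2000
- # 2 x tickTime, the same value the default would give
- minSessionTimeout=4000
- # 20 x tickTime, the same value the default would give
- maxSessionTimeout=40000
- </programlisting>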
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>fsync.warningthresholdms</term>
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.fsync.warningthresholdms</emphasis>)</para>
-
- <para><emphasis role="bold">New in 3.3.4:</emphasis> A
- warning message will be output to the log whenever an
- fsync in the transaction log (WAL) takes longer than
- this value. The value is specified in milliseconds and
- defaults to 1000. This value can only be set as a
- system property.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>autopurge.snapRetainCount</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para><emphasis role="bold">New in 3.4.0:</emphasis>
- When enabled, ZooKeeper auto purge feature retains
- the <emphasis role="bold">autopurge.snapRetainCount</emphasis> most
- recent snapshots and the corresponding transaction logs in the
- <emphasis role="bold">dataDir</emphasis> and <emphasis
- role="bold">dataLogDir</emphasis> respectively and deletes the rest.
- Defaults to 3. Minimum value is 3.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>autopurge.purgeInterval</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para><emphasis role="bold">New in 3.4.0:</emphasis> The
- time interval, in hours, at which the purge task is
- triggered. Set to a positive integer (1 and above)
- to enable auto purging. Defaults to 0.</para>
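-
- <para>A minimal sketch of a configuration that turns auto purging on,
- using the two autopurge options described above. The one-hour interval
- is only an example value, not a recommendation:</para>
- <programlisting>
- # keep the 3 most recent snapshots and their transaction logs (the minimum)
- autopurge.snapRetainCount=3
- # run the purge task every hour; 0 (the default) leaves purging disabled
- autopurge.purgeInterval=1
- </programlisting>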
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>syncEnabled</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.observer.syncEnabled</emphasis>)</para>
-
- <para><emphasis role="bold">New in 3.4.6, 3.5.0:</emphasis>
- The observers now log transactions and write snapshots to disk
- by default, like the participants. This reduces the recovery time
- of the observers on restart. Set to "false" to disable this
- feature. Default is "true".</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>zookeeper.extendedTypesEnabled</term>
-
- <listitem>
- <para>(Java system property only: <emphasis
- role="bold">zookeeper.extendedTypesEnabled</emphasis>)</para>
-
- <para><emphasis role="bold">New in 3.5.4, 3.6.0:</emphasis> Define to "true" to enable
- extended features such as the creation of <ulink url="zookeeperProgrammers.html#TTL+Nodes">TTL Nodes</ulink>.
- They are disabled by default. IMPORTANT: when enabled server IDs must
- be less than 255 due to internal limitations.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>zookeeper.emulate353TTLNodes</term>
-
- <listitem>
- <para>(Java system property only: <emphasis
- role="bold">zookeeper.emulate353TTLNodes</emphasis>)</para>
-
- <para><emphasis role="bold">New in 3.5.4, 3.6.0:</emphasis> Due to
- <ulink url="https://issues.apache.org/jira/browse/ZOOKEEPER-2901">ZOOKEEPER-2901</ulink> TTL nodes
- created in version 3.5.3 are not supported in 3.5.4/3.6.0. However, a workaround is provided via the
- zookeeper.emulate353TTLNodes system property. If you used TTL nodes in ZooKeeper 3.5.3 and need to maintain
- compatibility set <emphasis role="bold">zookeeper.emulate353TTLNodes</emphasis> to "true" in addition to
- <emphasis role="bold">zookeeper.extendedTypesEnabled</emphasis>. NOTE: due to the bug, server IDs
- must be 127 or less. Additionally, the maximum supported TTL value is 1099511627775, which is smaller
- than what was allowed in 3.5.3 (1152921504606846975).</para>
- </listitem>
- </varlistentry>
-
- </variablelist>
- </section>
-
- <section id="sc_clusterOptions">
- <title>Cluster Options</title>
-
- <para>The options in this section are designed for use with an ensemble
- of servers -- that is, when deploying clusters of servers.</para>
-
- <variablelist>
- <varlistentry>
- <term>electionAlg</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para>Election implementation to use. A value of "0" corresponds
- to the original UDP-based version, "1" corresponds to the
- non-authenticated UDP-based version of fast leader election, "2"
- corresponds to the authenticated UDP-based version of fast
- leader election, and "3" corresponds to the TCP-based version of
- fast leader election. Currently, algorithm 3 is the default.</para>
-
- <note>
- <para> The implementations of leader election 0, 1, and 2 are now
- <emphasis role="bold"> deprecated </emphasis>. We have the intention
- of removing them in the next release, at which point only the
- FastLeaderElection will be available.
- </para>
- </note>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>initLimit</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para>Amount of time, in ticks (see <ulink
- url="#id_tickTime">tickTime</ulink>), to allow followers to
- connect and sync to a leader. Increase this value as needed, if
- the amount of data managed by ZooKeeper is large.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>leaderServes</term>
-
- <listitem>
- <para>(Java system property: zookeeper.<emphasis
- role="bold">leaderServes</emphasis>)</para>
-
- <para>Leader accepts client connections. Default value is "yes".
- The leader machine coordinates updates. For higher update
- throughput at the slight expense of read throughput, the leader
- can be configured to not accept clients and focus on
- coordination.</para>
-
- <note>
- <para>Turning on leader selection is highly recommended when
- you have more than three ZooKeeper servers in an ensemble.</para>
- </note>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>server.x=[hostname]:nnnnn[:nnnnn], etc</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para>The servers making up the ZooKeeper ensemble. When the server
- starts up, it determines which server it is by looking for the
- file <filename>myid</filename> in the data directory. That file
- contains the server number, in ASCII, and it should match
- <emphasis role="bold">x</emphasis> in <emphasis
- role="bold">server.x</emphasis> in the left hand side of this
- setting.</para>
-
- <para>The list of ZooKeeper servers used by the clients must match
- the list of ZooKeeper servers that each ZooKeeper server has.</para>
-
- <para>There are two port numbers <emphasis role="bold">nnnnn</emphasis>.
- Followers use the first to connect to the leader, and the second is for
- leader election. The leader election port is only necessary if electionAlg
- is 1, 2, or 3 (default). If electionAlg is 0, then the second port is not
- necessary. If you want to test multiple servers on a single machine, then
- different ports can be used for each server.</para>
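-
- <para>For illustration, a three-server ensemble could be declared as
- shown below. The hostnames zoo1, zoo2 and zoo3 and the ports 2888 and
- 3888 are placeholders assumed for this sketch. The same lines would
- appear in every server's config file, and each server's
- <filename>myid</filename> file would contain 1, 2 or 3 to match:</para>
- <programlisting>
- # server.x=hostname:port followers use to reach the leader:leader election port
- server.1=zoo1:2888:3888
- server.2=zoo2:2888:3888
- server.3=zoo3:2888:3888
- </programlisting>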
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>syncLimit</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para>Amount of time, in ticks (see <ulink
- url="#id_tickTime">tickTime</ulink>), to allow followers to sync
- with ZooKeeper. If followers fall too far behind a leader, they
- will be dropped.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>group.x=nnnnn[:nnnnn]</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para>Enables a hierarchical quorum construction. "x" is a group identifier
- and the numbers following the "=" sign correspond to server identifiers.
- The right-hand side of the assignment is a colon-separated list of server
- identifiers. Note that groups must be disjoint and the union of all groups
- must be the ZooKeeper ensemble.</para>
-
- <para> You will find an example <ulink url="zookeeperHierarchicalQuorums.html">here</ulink>
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>weight.x=nnnnn</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para>Used along with "group", it assigns a weight to a server when
- forming quorums. Such a value corresponds to the weight of a server
- when voting. There are a few parts of ZooKeeper that require voting
- such as leader election and the atomic broadcast protocol. By default
- the weight of a server is 1. If the configuration defines groups, but not
- weights, then a value of 1 will be assigned to all servers.
- </para>
-
- <para> You will find an example <ulink url="zookeeperHierarchicalQuorums.html">here</ulink>
- </para>
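-
- <para>As a sketch of how "group" and "weight" fit together, the
- fragment below assumes an ensemble of nine servers with identifiers 1
- through 9, split into three disjoint groups. The grouping is made up
- for this example. The weight lines restate the default of 1 and could
- be left out:</para>
- <programlisting>
- # three disjoint groups whose union is the whole ensemble
- group.1=1:2:3
- group.2=4:5:6
- group.3=7:8:9
- # explicit weights, equal to the default of 1 (two shown; the rest default to 1)
- weight.1=1
- weight.2=1
- </programlisting>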
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>cnxTimeout</term>
-
- <listitem>
- <para>(Java system property: zookeeper.<emphasis
- role="bold">cnxTimeout</emphasis>)</para>
-
- <para>Sets the timeout value for opening connections for leader election notifications.
- Only applicable if you are using electionAlg 3.
- </para>
-
- <note>
- <para>Default value is 5 seconds.</para>
- </note>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>standaloneEnabled</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para><emphasis role="bold">New in 3.5.0:</emphasis>
- When set to false, a single server can be started in replicated
- mode, a lone participant can run with observers, and a cluster
- can reconfigure down to one node, and up from one node. The
- default is true for backwards compatibility. It can be set
- using QuorumPeerConfig's setStandaloneEnabled method or by
- adding "standaloneEnabled=false" or "standaloneEnabled=true"
- to a server's config file.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>reconfigEnabled</term>
-
- <listitem>
- <para>(No Java system property)</para>
-
- <para><emphasis role="bold">New in 3.5.3:</emphasis>
- This controls the enabling or disabling of the
- <ulink url="zookeeperReconfig.html">
- Dynamic Reconfiguration</ulink> feature. When the feature
- is enabled, users can perform reconfigure operations through
- the ZooKeeper client API or through ZooKeeper command line tools
- assuming users are authorized to perform such operations.
- When the feature is disabled, no user, including the super user,
- can perform a reconfiguration. Any attempt to reconfigure will return an error.
- The <emphasis role="bold">"reconfigEnabled"</emphasis> option can be set by adding
- <emphasis role="bold">"reconfigEnabled=false"</emphasis> or
- <emphasis role="bold">"reconfigEnabled=true"</emphasis>
- to a server's config file, or by using QuorumPeerConfig's
- setReconfigEnabled method. The default value is false.
-
- If present, the value should be consistent across every server in
- the entire ensemble. Setting the value as true on some servers and false
- on other servers will cause inconsistent behavior depending on which server
- is elected as leader. If the leader has a setting of
- <emphasis role="bold">"reconfigEnabled=true"</emphasis>, then the ensemble
- will have reconfig feature enabled. If the leader has a setting of
- <emphasis role="bold">"reconfigEnabled=false"</emphasis>, then the ensemble
- will have reconfig feature disabled. It is thus recommended to have a consistent
- value for <emphasis role="bold">"reconfigEnabled"</emphasis> across servers
- in the ensemble.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>4lw.commands.whitelist</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.4lw.commands.whitelist</emphasis>)</para>
-
- <para><emphasis role="bold">New in 3.5.3:</emphasis>
- A comma-separated list of the <ulink url="#sc_4lw">Four Letter Words</ulink>
- commands that the user wants to enable. A valid Four Letter Words
- command must be in this list; otherwise the ZooKeeper server will
- not enable the command.
- By default the whitelist contains only the "srvr" command,
- which zkServer.sh uses. The rest of the Four Letter Words commands are disabled
- by default.
- </para>
-
- <para>Here's an example of the configuration that enables the stat, ruok, conf, and isro
- commands while disabling the rest of the Four Letter Words commands:</para>
- <programlisting>
- 4lw.commands.whitelist=stat, ruok, conf, isro
- </programlisting>
-
- <para>If you really need to enable all four letter word commands by default, you can use
- the asterisk option so you don't have to include every command one by one in the list.
- As an example, this will enable all four letter word commands:
- </para>
- <programlisting>
- 4lw.commands.whitelist=*
- </programlisting>
-
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>tcpKeepAlive</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.tcpKeepAlive</emphasis>)</para>
-
- <para><emphasis role="bold">New in 3.5.4:</emphasis>
- Setting this to true sets the TCP keepAlive flag on the
- sockets used by quorum members to perform elections.
- This will allow for connections between quorum members to
- remain up when there is network infrastructure that may
- otherwise break them. Some NATs and firewalls may terminate
- or lose state for long running or idle connections.</para>
-
- <para> Enabling this option relies on OS level settings to work
- properly. Check your operating system's options regarding TCP
- keepalive for more information. Defaults to
- <emphasis role="bold">false</emphasis>.
- </para>
- </listitem>
- </varlistentry>
-
- </variablelist>
- <para></para>
- </section>
-
- <section id="sc_authOptions">
- <title>Encryption, Authentication, Authorization Options</title>
-
- <para>The options in this section allow control over
- encryption/authentication/authorization performed by the service.</para>
-
- <variablelist>
- <varlistentry>
- <term>DigestAuthenticationProvider.superDigest</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.DigestAuthenticationProvider.superDigest</emphasis>)</para>
-
- <para>By default this feature is <emphasis
- role="bold">disabled</emphasis></para>
-
- <para><emphasis role="bold">New in 3.2:</emphasis>
- Enables a ZooKeeper ensemble administrator to access the
- znode hierarchy as a "super" user. In particular no ACL
- checking occurs for a user authenticated as
- super.</para>
-
- <para>org.apache.zookeeper.server.auth.DigestAuthenticationProvider
- can be used to generate the superDigest. Call it with
- one parameter of "super:&lt;password&gt;". Provide the
- generated "super:&lt;data&gt;" as the system property value
- when starting each server of the ensemble.</para>
-
- <para>When authenticating to a ZooKeeper server (from a
- ZooKeeper client) pass a scheme of "digest" and authdata
- of "super:&lt;password&gt;". Note that digest auth passes
- the authdata in plaintext to the server, so it would be
- prudent to use this authentication method only on
- localhost (not over the network) or over an encrypted
- connection.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>X509AuthenticationProvider.superUser</term>
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.X509AuthenticationProvider.superUser</emphasis>)</para>
-
- <para>The SSL-backed way to enable a ZooKeeper ensemble
- administrator to access the znode hierarchy as a "super" user.
- When this parameter is set to an X500 principal name, only an
- authenticated client with that principal will be able to bypass
- ACL checking and have full privileges to all znodes.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>zookeeper.superUser</term>
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.superUser</emphasis>)</para>
-
- <para>Similar to <emphasis role="bold">zookeeper.X509AuthenticationProvider.superUser</emphasis>
- but is generic for SASL based logins. It stores the name of
- a user that can access the znode hierarchy as a "super" user.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>ssl.keyStore.location and ssl.keyStore.password</term>
- <listitem>
- <para>(Java system properties: <emphasis role="bold">
- zookeeper.ssl.keyStore.location</emphasis> and <emphasis
- role="bold">zookeeper.ssl.keyStore.password</emphasis>)</para>
-
- <para>Specifies the file path to a JKS containing the local
- credentials to be used for SSL connections, and the
- password to unlock the file.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>ssl.trustStore.location and ssl.trustStore.password</term>
- <listitem>
- <para>(Java system properties: <emphasis role="bold">
- zookeeper.ssl.trustStore.location</emphasis> and <emphasis
- role="bold">zookeeper.ssl.trustStore.password</emphasis>)</para>
-
- <para>Specifies the file path to a JKS containing the remote
- credentials to be used for SSL connections, and the
- password to unlock the file.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>ssl.authProvider</term>
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.ssl.authProvider</emphasis>)</para>
-
- <para>Specifies a subclass of <emphasis role="bold">
- org.apache.zookeeper.server.auth.X509AuthenticationProvider</emphasis>
- to use for secure client authentication. This is useful in
- certificate key infrastructures that do not use JKS. It may be
- necessary to extend <emphasis role="bold">javax.net.ssl.X509KeyManager
- </emphasis> and <emphasis role="bold">javax.net.ssl.X509TrustManager</emphasis>
- to get the desired behavior from the SSL stack. To configure the
- ZooKeeper server to use the custom provider for authentication,
- choose a scheme name for the custom AuthenticationProvider and
- set the property <emphasis role="bold">zookeeper.authProvider.[scheme]
- </emphasis> to the fully-qualified class name of the custom
- implementation. This will load the provider into the ProviderRegistry.
- Then set this property <emphasis role="bold">
- zookeeper.ssl.authProvider=[scheme]</emphasis> and that provider
- will be used for secure authentication.</para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
-
- <section>
- <title>Experimental Options/Features</title>
-
- <para>New features that are currently considered experimental.</para>
-
- <variablelist>
- <varlistentry>
- <term>Read Only Mode Server</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">readonlymode.enabled</emphasis>)</para>
-
- <para><emphasis role="bold">New in 3.4.0:</emphasis>
- Setting this value to true enables Read Only Mode server
- support (disabled by default). ROM allows client
- sessions which requested ROM support to connect to the
- server even when the server might be partitioned from
- the quorum. In this mode ROM clients can still read
- values from the ZK service, but will be unable to write
- values or see changes from other clients. See
- ZOOKEEPER-784 for more details.
- </para>
- </listitem>
- </varlistentry>
-
- </variablelist>
- </section>
-
- <section>
- <title>Unsafe Options</title>
-
- <para>The following options can be useful, but be careful when you use
- them. The risk of each is explained along with the explanation of what
- the variable does.</para>
-
- <variablelist>
- <varlistentry>
- <term>forceSync</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.forceSync</emphasis>)</para>
-
- <para>Requires updates to be synced to media of the transaction
- log before finishing processing the update. If this option is
- set to no, ZooKeeper will not require updates to be synced to
- the media.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>jute.maxbuffer:</term>
-
- <listitem>
- <para>(Java system property:<emphasis role="bold">
- jute.maxbuffer</emphasis>)</para>
-
- <para>This option can only be set as a Java system property.
- There is no zookeeper prefix on it. It specifies the maximum
- size of the data that can be stored in a znode. The default is
- 0xfffff, or just under 1M. If this option is changed, the system
- property must be set on all servers and clients; otherwise
- problems will arise. This is really a sanity check. ZooKeeper is
- designed to store data on the order of kilobytes in size.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>skipACL</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.skipACL</emphasis>)</para>
-
- <para>Skips ACL checks. This results in a boost in throughput,
- but opens up full access to the data tree to everyone.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>quorumListenOnAllIPs</term>
-
- <listitem>
- <para>When set to true the ZooKeeper server will listen
- for connections from its peers on all available IP addresses,
- and not only the address configured in the server list of the
- configuration file. It affects the connections handling the
- ZAB protocol and the Fast Leader Election protocol. Default
- value is <emphasis role="bold">false</emphasis>.</para>
- </listitem>
- </varlistentry>
-
- </variablelist>
- </section>
-
- <section>
- <title>Disabling data directory autocreation</title>
-
- <para><emphasis role="bold">New in 3.5:</emphasis> The default
- behavior of a ZooKeeper server is to automatically create the
- data directory (specified in the configuration file) when
- started if that directory does not already exist. This can be
- inconvenient and even dangerous in some cases. Take the case
- where a configuration change is made to a running server,
- wherein the <emphasis role="bold">dataDir</emphasis> parameter
- is accidentally changed. When the ZooKeeper server is
- restarted it will create this non-existent directory and begin
- serving - with an empty znode namespace. This scenario can
- result in an effective "split brain" situation (i.e. data in
- both the new invalid directory and the original valid data
- store). As such it would be good to have an option to turn off
- this autocreate behavior. In general this should be done for production
- environments. Unfortunately, the
- default legacy behavior cannot be changed at this point, and
- therefore this must be done on a case-by-case basis. This is
- left to users and to packagers of ZooKeeper distributions.
- </para>
-
- <para>When running <emphasis
- role="bold">zkServer.sh</emphasis> autocreate can be disabled
- by setting the environment variable <emphasis
- role="bold">ZOO_DATADIR_AUTOCREATE_DISABLE</emphasis> to 1.
- When running ZooKeeper servers directly from class files this
- can be accomplished by setting <emphasis
- role="bold">zookeeper.datadir.autocreate=false</emphasis> on
- the java command line, i.e. <emphasis
- role="bold">-Dzookeeper.datadir.autocreate=false</emphasis>
- </para>
-
- <para>When this feature is disabled, and the ZooKeeper server
- determines that the required directories do not exist it will
- generate an error and refuse to start.
- </para>
-
- <para>A new script <emphasis
- role="bold">zkServer-initialize.sh</emphasis> is provided to
- support this new feature. If autocreate is disabled it is
- necessary for the user to first install ZooKeeper, then create
- the data directory (and potentially txnlog directory), and
- then start the server. Otherwise as mentioned in the previous
- paragraph the server will not start. Running <emphasis
- role="bold">zkServer-initialize.sh</emphasis> will create the
- required directories, and optionally set up the myid file
- (optional command line parameter). This script can be used
- even if the autocreate feature itself is not used, and will
- likely be of use to users as this (setup, including creation
- of the myid file) has been an issue for users in the past.
- Note that this script only ensures that the data directories exist.
- It does not create a config file, but rather requires a config
- file to be available in order to execute.
- </para>
- </section>
-
- <section id="sc_performance_options">
- <title>Performance Tuning Options</title>
-
- <para><emphasis role="bold">New in 3.5.0:</emphasis> Several subsystems have been reworked
- to improve read throughput. This includes multi-threading of the NIO communication subsystem and
- request processing pipeline (Commit Processor). NIO is the default client/server communication
- subsystem. Its threading model comprises 1 acceptor thread, 1-N selector threads and 0-M
- socket I/O worker threads. In the request processing pipeline the system can be configured
- to process multiple read requests at once while maintaining the same consistency guarantee
- (same-session read-after-write). The Commit Processor threading model comprises 1 main
- thread and 0-N worker threads.
- </para>
-
- <para>
- The default values are aimed at maximizing read throughput on a dedicated ZooKeeper machine.
- Both subsystems need to have a sufficient number of threads to achieve peak read throughput.
- </para>
-
- <variablelist>
-
- <varlistentry>
- <term>zookeeper.nio.numSelectorThreads</term>
- <listitem>
- <para>(Java system property only: <emphasis
- role="bold">zookeeper.nio.numSelectorThreads</emphasis>)
- </para>
- <para><emphasis role="bold">New in 3.5.0:</emphasis>
- Number of NIO selector threads. At least 1 selector thread is required.
- It is recommended to use more than one selector for large numbers
- of client connections. The default value is sqrt( number of cpu cores / 2 ).
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>zookeeper.nio.numWorkerThreads</term>
- <listitem>
- <para>(Java system property only: <emphasis
- role="bold">zookeeper.nio.numWorkerThreads</emphasis>)
- </para>
- <para><emphasis role="bold">New in 3.5.0:</emphasis>
- Number of NIO worker threads. If configured with 0 worker threads, the selector threads
- do the socket I/O directly. The default value is 2 times the number of cpu cores.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>zookeeper.commitProcessor.numWorkerThreads</term>
- <listitem>
- <para>(Java system property only: <emphasis
- role="bold">zookeeper.commitProcessor.numWorkerThreads</emphasis>)
- </para>
- <para><emphasis role="bold">New in 3.5.0:</emphasis>
- Number of Commit Processor worker threads. If configured with 0 worker threads, the main thread
- will process the request directly. The default value is the number of cpu cores.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>znode.container.checkIntervalMs</term>
-
- <listitem>
- <para>(Java system property only)</para>
-
- <para><emphasis role="bold">New in 3.5.1:</emphasis> The
- time interval in milliseconds for each check of candidate container
- and ttl nodes. Default is "60000".</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>znode.container.maxPerMinute</term>
-
- <listitem>
- <para>(Java system property only)</para>
-
- <para><emphasis role="bold">New in 3.5.1:</emphasis> The
- maximum number of container nodes that can be deleted per
- minute. This prevents herding during container deletion.
- Default is "10000".</para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
-
- <section>
- <title>Communication using the Netty framework</title>
-
- <para><ulink url="http://netty.io">Netty</ulink>
- is an NIO based client/server communication framework. It
- simplifies (over NIO being used directly) many of the
- complexities of network level communication for Java
- applications. Additionally the Netty framework has built
- in support for encryption (SSL) and authentication
- (certificates). These are optional features and can be
- turned on or off individually.
- </para>
- <para>In versions 3.5+, a ZooKeeper server can use Netty
- instead of NIO (default option) by setting the Java system
- property <emphasis role="bold">zookeeper.serverCnxnFactory</emphasis>
- to <emphasis role="bold">org.apache.zookeeper.server.NettyServerCnxnFactory</emphasis>;
- for the client, set <emphasis role="bold">zookeeper.clientCnxnSocket</emphasis>
- to <emphasis role="bold">org.apache.zookeeper.ClientCnxnSocketNetty</emphasis>.
- </para>
-
- <para>
- TBD - tuning options for netty - currently there are none that are netty specific but we should add some. Esp around max bound on the number of reader worker threads netty creates.
- </para>
- <para>
- TBD - how to manage encryption
- </para>
- <para>
- TBD - how to manage certificates
- </para>
-
- </section>
-
- <section id="sc_adminserver_config">
- <title>AdminServer configuration</title>
- <para><emphasis role="bold">New in 3.5.0:</emphasis> The following
- options are used to configure the <ulink
- url="#sc_adminserver">AdminServer</ulink>.</para>
-
- <variablelist>
- <varlistentry>
- <term>admin.enableServer</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.admin.enableServer</emphasis>)</para>
-
- <para>Set to "false" to disable the AdminServer. By default the
- AdminServer is enabled.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>admin.serverAddress</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.admin.serverAddress</emphasis>)</para>
-
- <para>The address the embedded Jetty server listens on. Defaults to 0.0.0.0.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>admin.serverPort</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.admin.serverPort</emphasis>)</para>
-
- <para>The port the embedded Jetty server listens on. Defaults to 8080.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>admin.idleTimeout</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.admin.idleTimeout</emphasis>)</para>
-
- <para>Set the maximum idle time in milliseconds that a connection can wait
- before sending or receiving data. Defaults to 30000 ms.</para>
- </listitem>
- </varlistentry>
-
-
- <varlistentry>
- <term>admin.commandURL</term>
-
- <listitem>
- <para>(Java system property: <emphasis
- role="bold">zookeeper.admin.commandURL</emphasis>)</para>
-
- <para>The URL for listing and issuing commands relative to the
- root URL. Defaults to "/commands".</para>
- </listitem>
- </varlistentry>
- </variablelist>
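-
- <para>A sketch of how these options might look together in a
- server's config file. It assumes the option names above are used
- without the "zookeeper." prefix there. Port 9090 and the loopback
- address are example values only; the other values are the documented
- defaults:</para>
- <programlisting>
- admin.enableServer=true
- # example values: listen only on the loopback interface, on port 9090
- admin.serverAddress=127.0.0.1
- admin.serverPort=9090
- # documented defaults, written out explicitly
- admin.idleTimeout=30000
- admin.commandURL=/commands
- </programlisting>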
- </section>
-
- </section>
-
- <section id="sc_zkCommands">
- <title>ZooKeeper Commands</title>
-
- <section id="sc_4lw">
- <title>The Four Letter Words</title>
- <para>ZooKeeper responds to a small set of commands. Each command is
- composed of four letters. You issue the commands to ZooKeeper via telnet
- or nc, at the client port.</para>
-
- <para>Three of the more interesting commands: "stat" gives some
- general information about the server and connected clients,
- while "srvr" and "cons" give extended details on server and
- connections respectively.</para>
-
- <para><emphasis role="bold">New in 3.5.3:</emphasis>
- Four Letter Words need to be explicitly whitelisted before use.
- Please refer to <emphasis role="bold">4lw.commands.whitelist</emphasis>,
- described in the <ulink url="#sc_clusterOptions">
- cluster configuration section</ulink>, for details.
- Moving forward, Four Letter Words will be deprecated. Please use
- <ulink url="#sc_adminserver">AdminServer</ulink> instead.
- </para>
-
- <variablelist>
- <varlistentry>
- <term>conf</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> Print
- details about the serving configuration.</para>
- </listitem>
-
- </varlistentry>
-
- <varlistentry>
- <term>cons</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> List
- full connection/session details for all clients connected
- to this server. Includes the number of packets
- received/sent, session id, operation latencies, last
- operation performed, etc.</para>
- </listitem>
-
- </varlistentry>
-
- <varlistentry>
- <term>crst</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> Reset
- connection/session statistics for all connections.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>dump</term>
-
- <listitem>
- <para>Lists the outstanding sessions and ephemeral nodes. This
- only works on the leader.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>envi</term>
-
- <listitem>
- <para>Print details about the serving environment.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>ruok</term>
-
- <listitem>
- <para>Tests if server is running in a non-error state. The server
- will respond with imok if it is running. Otherwise it will not
- respond at all.</para>
-
- <para>A response of "imok" does not necessarily indicate that the
- server has joined the quorum, only that the server process is active
- and bound to the specified client port. Use "stat" for details on
- state with respect to quorum and client connection information.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>srst</term>
-
- <listitem>
- <para>Reset server statistics.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>srvr</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
- full details for the server.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>stat</term>
-
- <listitem>
- <para>Lists brief details for the server and connected
- clients.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>wchs</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
- brief information on watches for the server.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>wchc</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
- detailed information on watches for the server, by
- session. This outputs a list of sessions (connections)
- with associated watches (paths). Note that, depending on the
- number of watches, this operation may be expensive (i.e.,
- impact server performance); use it carefully.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>dirs</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.5.1:</emphasis>
- Shows the total size of snapshot and log files in bytes.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>wchp</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.3.0:</emphasis> Lists
- detailed information on watches for the server, by path.
- This outputs a list of paths (znodes) with associated
- sessions. Note that, depending on the number of watches, this
- operation may be expensive (i.e., impact server performance);
- use it carefully.</para>
- </listitem>
- </varlistentry>
-
-
- <varlistentry>
- <term>mntr</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.4.0:</emphasis> Outputs a list
- of variables that could be used for monitoring the health of the cluster.</para>
-
- <programlisting>$ echo mntr | nc localhost 2185
-
- zk_version 3.4.0
- zk_avg_latency 0
- zk_max_latency 0
- zk_min_latency 0
- zk_packets_received 70
- zk_packets_sent 69
- zk_num_alive_connections 1
- zk_outstanding_requests 0
- zk_server_state leader
- zk_znode_count 4
- zk_watch_count 0
- zk_ephemerals_count 0
- zk_approximate_data_size 27
- zk_followers 4 - only exposed by the Leader
- zk_synced_followers 4 - only exposed by the Leader
- zk_pending_syncs 0 - only exposed by the Leader
- zk_open_file_descriptor_count 23 - only available on Unix platforms
- zk_max_file_descriptor_count 1024 - only available on Unix platforms
- zk_last_proposal_size 23
- zk_min_proposal_size 23
- zk_max_proposal_size 64
- </programlisting>
-
- <para>The output is compatible with the Java properties format, and the content
- may change over time (new keys may be added). Your scripts should expect changes.</para>
-
- <para>ATTENTION: Some of the keys are platform specific, and some are only exported by the Leader.</para>
-
- <para>The output contains multiple lines with the following format:</para>
- <programlisting>key \t value</programlisting>
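- <para>A monitoring script can split each line at the first tab. The
- Python sketch below is one way to do this; the function name
- <command>parse_mntr</command> is hypothetical. Values are kept as strings
- and lines without a tab are skipped, so the parser tolerates new keys and
- minor format changes.</para>

```python
def parse_mntr(text):
    """Parse 'key TAB value' lines into a dict, skipping lines without a tab."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # blank or unexpected line; tolerate format changes
        metrics[key.strip()] = value.strip()
    return metrics


# Hand-written sample in the documented format, not captured from a server.
sample = "zk_version\t3.4.0\nzk_server_state\tleader\nzk_znode_count\t4\n"
metrics = parse_mntr(sample)
print(metrics["zk_server_state"])
# Leader-only and platform-specific keys may be absent, so read them with a default.
print(metrics.get("zk_followers", "n/a"))
```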
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>isro</term>
-
- <listitem>
- <para><emphasis role="bold">New in 3.4.0:</emphasis> Tests if
- server is running in read-only mode. The server will respond with
- "ro" if in read-only mode or "rw" if not in read-only mode.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>gtmk</term>
-
- <listitem>
- <para>Gets the current trace mask as a 64-bit signed long value in
- decimal format. See <command>stmk</command> for an explanation of
- the possible values.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>stmk</term>
-
- <listitem>
- <para>Sets the current trace mask. The trace mask is 64 bits,
- where each bit enables or disables a specific category of trace
- logging on the server. Log4J must be configured to enable
- <command>TRACE</command> level first in order to see trace logging
- messages. The bits of the trace mask correspond to the following
- trace logging categories.</para>
-
- <table>
- <title>Trace Mask Bit Values</title>
- <tgroup cols="2" align="left" colsep="1" rowsep="1">
- <tbody>
- <row>
- <entry>0b0000000001</entry>
- <entry>Unused, reserved for future use.</entry>
- </row>
- <row>
- <entry>0b0000000010</entry>
- <entry>Logs client requests, excluding ping
- requests.</entry>
- </row>
- <row>
- <entry>0b0000000100</entry>
- <entry>Unused, reserved for future use.</entry>
- </row>
- <row>
- <entry>0b0000001000</entry>
- <entry>Logs client ping requests.</entry>
- </row>
- <row>
- <entry>0b0000010000</entry>
- <entry>Logs packets received from the quorum peer that is
- the current leader, excluding ping requests.</entry>
- </row>
- <row>
- <entry>0b0000100000</entry>
- <entry>Logs addition, removal and validation of client
- sessions.</entry>
- </row>
- <row>
- <entry>0b0001000000</entry>
- <entry>Logs delivery of watch events to client
- sessions.</entry>
- </row>
- <row>
- <entry>0b0010000000</entry>
- <entry>Logs ping packets received from the quorum peer
- that is the current leader.</entry>
- </row>
- <row>
- <entry>0b0100000000</entry>
- <entry>Unused, reserved for future use.</entry>
- </row>
- <row>
- <entry>0b1000000000</entry>
- <entry>Unused, reserved for future use.</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
-
- <para>All remaining bits in the 64-bit value are unused and
- reserved for future use. Multiple trace logging categories are
- specified by calculating the bitwise OR of the documented values.
- The default trace mask is 0b0100110010. Thus, by default, trace
- logging includes client requests, packets received from the
- leader and sessions.</para>
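- <para>The bitwise OR described above can be worked through in a few
- lines of Python. This is only a sketch: the constant names are
- illustrative and are not identifiers from the ZooKeeper code base; the
- bit values are copied from the table.</para>

```python
# Bit values copied from the trace mask table above. The constant names are
# illustrative only; they are not ZooKeeper identifiers.
CLIENT_REQUESTS = 0b0000000010
CLIENT_PINGS = 0b0000001000
LEADER_PACKETS = 0b0000010000
SESSIONS = 0b0000100000
WATCH_EVENTS = 0b0001000000
LEADER_PINGS = 0b0010000000

DEFAULT_MASK = 0b0100110010


def enabled(mask, category):
    """Return True if every bit of category is also set in mask."""
    return (mask | category) == mask


# Combine categories with bitwise OR.
custom_mask = CLIENT_REQUESTS | SESSIONS | WATCH_EVENTS
print(custom_mask, format(custom_mask, "#012b"))

# Check which documented categories the default mask turns on.
for name, bit in [
    ("client requests", CLIENT_REQUESTS),
    ("client pings", CLIENT_PINGS),
    ("leader packets", LEADER_PACKETS),
    ("sessions", SESSIONS),
]:
    print(name, enabled(DEFAULT_MASK, bit))
```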
-
- <para>To set a different trace mask, send a request containing the
- <command>stmk</command> four-letter word followed by the trace
- mask represented as a 64-bit signed long value. This example uses
- the Perl <command>pack</command> function to construct a trace
- mask that enables all trace logging categories described above and
- convert it to a 64-bit signed long value with big-endian byte
- order. The result is appended to <command>stmk</command> and sent
- to the server using netcat. The server responds with the new
- trace mask in decimal format.</para>
-
- <programlisting>$ perl -e "print 'stmk', pack('q>', 0b0011111010)" | nc localhost 2181
-250
- </programlisting>
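- <para>The same request body can be built without Perl. The Python
- sketch below uses <command>struct.pack</command> with the format
- <command>">q"</command>, which encodes a signed 64-bit integer in
- big-endian byte order. The function name
- <command>stmk_request</command> is hypothetical, and sending the bytes to
- the server is left to whichever client you prefer.</para>

```python
import struct


def stmk_request(mask):
    """Return 'stmk' followed by the mask as a big-endian signed 64-bit integer."""
    return b"stmk" + struct.pack(">q", mask)


# Same mask as the Perl example: all categories described in the table.
payload = stmk_request(0b0011111010)
print(len(payload), payload.hex())
```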
- </listitem>
- </varlistentry>
- </variablelist>
-
- <para>Here's an example of the <emphasis role="bold">ruok</emphasis>
- command:</para>
-
- <programlisting>$ echo ruok | nc 127.0.0.1 5111
- imok
- </programlisting>
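- <para>Where telnet or nc is not available, the exchange can be
- reproduced with a plain TCP socket. The following Python sketch is a
- minimal client, not part of ZooKeeper: the function name
- <command>four_letter_word</command> and its defaults are assumptions, and
- it relies on the server closing the connection once the reply has been
- sent.</para>

```python
import socket


def four_letter_word(command, host="127.0.0.1", port=2181, timeout=5.0):
    """Send a four letter word and return the server's full reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # assumed: the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Assumes a server is listening on this host and port.
    try:
        print(four_letter_word("ruok", port=5111))
    except OSError as exc:
        print("no reply:", exc)
```

- <para>The timeout matters for <command>ruok</command> in particular,
- because a server in an error state does not respond at all.</para>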
-
- </section>
- <section id="sc_adminserver">
- <title>The AdminServer</title>
- <para><emphasis role="bold">New in
<TRUNCATED>