Posted to commits@hbase.apache.org by ni...@apache.org on 2009/07/15 19:40:39 UTC

svn commit: r794331 - /hadoop/hbase/trunk/src/java/overview.html

Author: nitay
Date: Wed Jul 15 17:40:39 2009
New Revision: 794331

URL: http://svn.apache.org/viewvc?rev=794331&view=rev
Log:
HBASE-1632 Write documentation for configuring/managing ZooKeeper with HBase

Modified:
    hadoop/hbase/trunk/src/java/overview.html

Modified: hadoop/hbase/trunk/src/java/overview.html
URL: http://svn.apache.org/viewvc/hadoop/hbase/trunk/src/java/overview.html?rev=794331&r1=794330&r2=794331&view=diff
==============================================================================
--- hadoop/hbase/trunk/src/java/overview.html (original)
+++ hadoop/hbase/trunk/src/java/overview.html Wed Jul 15 17:40:39 2009
@@ -174,51 +174,99 @@
 </p>
 <p>
 A distributed HBase depends on a running ZooKeeper cluster.
-The ZooKeeper configuration file for HBase is stored at <code>${HBASE_HOME}/conf/zoo.cfg</code>.
-See the ZooKeeper <a href="http://hadoop.apache.org/zookeeper/docs/current/zookeeperStarted.html"> Getting Started Guide</a>
-for information about the format and options of that file.  Specifically, look at the 
-<a href="http://hadoop.apache.org/zookeeper/docs/current/zookeeperStarted.html#sc_RunningReplicatedZooKeeper">Running Replicated ZooKeeper</a> section.
-</p>
-
-
-<p>
-Though not recommended, it can be convenient having HBase continue to manage
-ZooKeeper even when in distributed mode (It can be good when testing or taking
-hbase for a testdrive).  Change <code>${HBASE_HOME}/conf/zoo.cfg</code> and
-set the server.0 property to the IP of the node that will be running ZooKeeper
-(Leaving the default value of "localhost" will make it impossible to start HBase).
-<pre>
-  ...
-server.0=example.org:2888:3888
-<blockquote>
-</pre>
-Then on the example.org server do the following <i>before</i> running HBase. 
-<pre>
-${HBASE_HOME}/bin/hbase-daemon.sh start zookeeper
-</pre>
-</blockquote>
-<p>To stop ZooKeeper, after you've shut down hbase, do:
-<blockquote>
-<pre>
-${HBASE_HOME}/bin/hbase-daemon.sh stop zookeeper
-</pre>
-</blockquote>
-Be aware that this option is only recommanded for testing purposes as a failure
-on that node would render HBase <b>unusable</b>.
-</p>
-
-<p>
-To tell HBase to stop managing a ZooKeeper instance, after configuring
-<code>zoo.cfg</code> to point at the ZooKeeper Quorum you'd like HBase to
-use, in <code>${HBASE_HOME}/conf/hbase-env.sh</code>,
-set the following to tell HBase to STOP managing its instance of ZooKeeper.
-<blockquote>
-<pre>
-  ...
-# Tell HBase whether it should manage it's own instance of Zookeeper or not.
-export HBASE_MANAGES_ZK=false
-</pre>
-</blockquote>
+HBase can manage a ZooKeeper cluster for you, or you can manage it on your own
+and point HBase to it.
+To toggle this option, use the <code>HBASE_MANAGES_ZK</code> variable in <code>
+${HBASE_HOME}/conf/hbase-env.sh</code>.
+This variable, which defaults to <code>true</code>, tells HBase whether to
+start/stop the ZooKeeper quorum servers alongside the rest of the servers.
+</p>
+<p>
+To point HBase at an existing ZooKeeper cluster, add your <code>zoo.cfg</code>
+to the <code>CLASSPATH</code>.
+HBase will see this file and use it to figure out where ZooKeeper is.
+Additionally, set <code>HBASE_MANAGES_ZK</code> in <code>${HBASE_HOME}/conf/hbase-env.sh</code>
+to <code>false</code> so that HBase does not start or stop your ZooKeeper servers:
+<pre>
+   ...
+  # Tell HBase whether it should manage its own instance of ZooKeeper or not.
+  export HBASE_MANAGES_ZK=false
+</pre>
+For more information about setting up a ZooKeeper cluster on your own, see
+the ZooKeeper <a href="http://hadoop.apache.org/zookeeper/docs/current/zookeeperStarted.html">Getting Started Guide</a>.
+HBase currently uses ZooKeeper version 3.2.0, so any cluster set up with a 3.x.x
+version of ZooKeeper should work.
+</p>
+<p>
+To have HBase manage the ZooKeeper cluster, you can use a <code>zoo.cfg</code>
+file as above, or set the options directly in <code>${HBASE_HOME}/conf/hbase-site.xml</code>.
+Every option from the <code>zoo.cfg</code> has a corresponding property in the
+XML configuration file named <code>hbase.zookeeper.property.OPTION</code>.
+For example, the <code>clientPort</code> setting in ZooKeeper can be changed by
+setting the <code>hbase.zookeeper.property.clientPort</code> property.
+For the full list of available properties, see ZooKeeper's <code>zoo.cfg</code>.
+For the default values used by HBase, see <code>${HBASE_HOME}/conf/hbase-default.xml</code>.
+</p>
+<p>
+At minimum, you should set the list of servers that you want ZooKeeper to run
+on using the <code>hbase.zookeeper.quorum</code> property.
+This property defaults to <code>localhost</code>, which is not suitable for a
+fully distributed HBase.
+It is recommended to run a ZooKeeper quorum of 5 or 7 machines, and give each
+server around 1GB of memory so that it does not swap.
+It is also recommended to run the ZooKeeper servers on dedicated machines with
+their own disks, separate from the Region Servers.
+If this is not easily doable for you, choose 5 of your region servers to run the
+ZooKeeper servers on.
+</p>
+<p>
+As an example, to have HBase manage a ZooKeeper quorum on nodes
+rs{1,2,3,4,5}.example.com, bound to port 2222 (the default is 2181), use:
+<pre>
+  ${HBASE_HOME}/conf/hbase-env.sh:
+
+       ...
+      # Tell HBase whether it should manage its own instance of ZooKeeper or not.
+      export HBASE_MANAGES_ZK=true
+
+  ${HBASE_HOME}/conf/hbase-site.xml:
+
+  &lt;configuration&gt;
+    ...
+    &lt;property&gt;
+      &lt;name&gt;hbase.zookeeper.property.clientPort&lt;/name&gt;
+      &lt;value&gt;2222&lt;/value&gt;
+      &lt;description&gt;Property from ZooKeeper's config zoo.cfg.
+      The port at which the clients will connect.
+      &lt;/description&gt;
+    &lt;/property&gt;
+    ...
+    &lt;property&gt;
+      &lt;name&gt;hbase.zookeeper.quorum&lt;/name&gt;
+      &lt;value&gt;rs1.example.com,rs2.example.com,rs3.example.com,rs4.example.com,rs5.example.com&lt;/value&gt;
+      &lt;description&gt;Comma separated list of servers in the ZooKeeper Quorum.
+      For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
+      By default this is set to localhost for local and pseudo-distributed modes
+      of operation. For a fully-distributed setup, this should be set to a full
+      list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
+      this is the list of servers which we will start/stop ZooKeeper on.
+      &lt;/description&gt;
+    &lt;/property&gt;
+    ...
+  &lt;/configuration&gt;
+</pre>
+</p>
+<p>
+When HBase manages ZooKeeper, it will start/stop the ZooKeeper servers as a part
+of the regular start/stop scripts. To start or stop the ZooKeeper servers
+yourself, run:
+<pre>
+  ${HBASE_HOME}/bin/hbase-daemons.sh {start,stop} zookeeper
+</pre>
+Note that you can use HBase in this manner to spin up a ZooKeeper cluster
+that is unrelated to HBase. If you want that cluster to stay up, set
+<code>HBASE_MANAGES_ZK</code> to <code>false</code> so that HBase does not
+take ZooKeeper down with it when it shuts down.
 </p>
 
 <p>Of note, if you have made <i>HDFS client configuration</i> on your hadoop cluster, HBase will not