Posted to commits@hbase.apache.org by st...@apache.org on 2011/04/30 23:05:39 UTC
svn commit: r1098158 - in /hbase/trunk/src/docbkx: book.xml
configuration.xml getting_started.xml performance.xml troubleshooting.xml
Author: stack
Date: Sat Apr 30 21:05:39 2011
New Revision: 1098158
URL: http://svn.apache.org/viewvc?rev=1098158&view=rev
Log:
HBASE-3831 docbook xml files - standardized RegionServer, DataNode, and ZooKeeper in several xml docs
Modified:
hbase/trunk/src/docbkx/book.xml
hbase/trunk/src/docbkx/configuration.xml
hbase/trunk/src/docbkx/getting_started.xml
hbase/trunk/src/docbkx/performance.xml
hbase/trunk/src/docbkx/troubleshooting.xml
Modified: hbase/trunk/src/docbkx/book.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/book.xml?rev=1098158&r1=1098157&r2=1098158&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/book.xml (original)
+++ hbase/trunk/src/docbkx/book.xml Sat Apr 30 21:05:39 2011
@@ -231,7 +231,7 @@ throws InterruptedException, IOException
</para>
</section>
<section xml:id="rs_metrics">
- <title>Region Server Metrics</title>
+ <title>RegionServer Metrics</title>
<section xml:id="hbase.regionserver.blockCacheCount"><title><varname>hbase.regionserver.blockCacheCount</varname></title>
<para>Block cache item count in memory. This is the number of blocks of storefiles (HFiles) in the cache.</para>
</section>
@@ -266,22 +266,22 @@ throws InterruptedException, IOException
<para>TODO</para>
</section>
<section xml:id="hbase.regionserver.memstoreSizeMB"><title><varname>hbase.regionserver.memstoreSizeMB</varname></title>
- <para>Sum of all the memstore sizes in this regionserver (MB)</para>
+ <para>Sum of all the memstore sizes in this RegionServer (MB)</para>
</section>
<section xml:id="hbase.regionserver.regions"><title><varname>hbase.regionserver.regions</varname></title>
- <para>Number of regions served by the regionserver</para>
+ <para>Number of regions served by the RegionServer</para>
</section>
<section xml:id="hbase.regionserver.requests"><title><varname>hbase.regionserver.requests</varname></title>
- <para>Total number of read and write requests. Requests correspond to regionserver RPC calls, thus a single Get will result in 1 request, but a Scan with caching set to 1000 will result in 1 request for each 'next' call (i.e., not each row). A bulk-load request will constitute 1 request per HFile.</para>
+ <para>Total number of read and write requests. Requests correspond to RegionServer RPC calls, thus a single Get will result in 1 request, but a Scan with caching set to 1000 will result in 1 request for each 'next' call (i.e., not each row). A bulk-load request will constitute 1 request per HFile.</para>
</section>
<section xml:id="hbase.regionserver.storeFileIndexSizeMB"><title><varname>hbase.regionserver.storeFileIndexSizeMB</varname></title>
- <para>Sum of all the storefile index sizes in this regionserver (MB)</para>
+ <para>Sum of all the storefile index sizes in this RegionServer (MB)</para>
</section>
<section xml:id="hbase.regionserver.stores"><title><varname>hbase.regionserver.stores</varname></title>
- <para>Number of stores open on the regionserver. A store corresponds to a column family. For example, if a table (which contains the column family) has 3 regions on a regionserver, there will be 3 stores open for that column family. </para>
+ <para>Number of stores open on the RegionServer. A store corresponds to a column family. For example, if a table (which contains the column family) has 3 regions on a RegionServer, there will be 3 stores open for that column family. </para>
</section>
<section xml:id="hbase.regionserver.storeFiles"><title><varname>hbase.regionserver.storeFiles</varname></title>
- <para>Number of store filles open on the regionserver. A store may have more than one storefile (HFile).</para>
+ <para>Number of store filles open on the RegionServer. A store may have more than one storefile (HFile).</para>
</section>
</section>
</chapter>
@@ -712,7 +712,7 @@ throws InterruptedException, IOException
</para>
<para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html">HTable</link>
instances are not thread-safe. When creating HTable instances, it is advisable to use the same <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HBaseConfiguration">HBaseConfiguration</link>
-instance. This will ensure sharing of zookeeper and socket instances to the region servers
+instance. This will ensure sharing of ZooKeeper and socket instances to the RegionServers
which is usually what you want. For example, this is preferred:
<programlisting>HBaseConfiguration conf = HBaseConfiguration.create();
HTable table1 = new HTable(conf, "myTable");
@@ -729,7 +729,7 @@ HTable table2 = new HTable(conf2, "myTab
<section xml:id="client.writebuffer"><title>WriteBuffer and Batch Methods</title>
<para>If <xref linkend="perf.hbase.client.autoflush" /> is turned off on
<link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html">HTable</link>,
- <classname>Put</classname>s are sent to region servers when the writebuffer
+ <classname>Put</classname>s are sent to RegionServers when the writebuffer
is filled. The writebuffer is 2MB by default. Before an HTable instance is
discarded, either <methodname>close()</methodname> or
<methodname>flushCommits()</methodname> should be invoked so Puts
@@ -742,7 +742,7 @@ HTable table2 = new HTable(conf2, "myTab
</section>
<section xml:id="client.filter"><title>Filters</title>
<para><link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html">Get</link> and <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</link> instances can be
- optionally configured with <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/Filter.html">filters</link> which are applied on the region server.
+ optionally configured with <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/Filter.html">filters</link> which are applied on the RegionServer.
</para>
</section>
</section>
@@ -796,7 +796,7 @@ HTable table2 = new HTable(conf2, "myTab
<listitem>
<para>There is not much memory footprint difference between 1 region
- and 10 in terms of indexes, etc, held by the regionserver.</para>
+ and 10 in terms of indexes, etc, held by the RegionServer.</para>
</listitem>
</itemizedlist>
@@ -1118,27 +1118,27 @@ HTable table2 = new HTable(conf2, "myTab
<para>See <xref linkend="compression.tool" />.</para>
</section>
<section xml:id="decommission"><title>Node Decommission</title>
- <para>You can stop an individual regionserver by running the following
+ <para>You can stop an individual RegionServer by running the following
script in the HBase directory on the particular node:
<programlisting>$ ./bin/hbase-daemon.sh stop regionserver</programlisting>
- The regionserver will first close all regions and then shut itself down.
- On shutdown, the regionserver's ephemeral node in ZooKeeper will expire.
- The master will notice the regionserver gone and will treat it as
- a 'crashed' server; it will reassign the nodes the regionserver was carrying.
+ The RegionServer will first close all regions and then shut itself down.
+ On shutdown, the RegionServer's ephemeral node in ZooKeeper will expire.
+ The master will notice the RegionServer gone and will treat it as
+ a 'crashed' server; it will reassign the nodes the RegionServer was carrying.
<note><title>Disable the Load Balancer before Decommissioning a node</title>
<para>If the load balancer runs while a node is shutting down, then
there could be contention between the Load Balancer and the
- Master's recovery of the just decommissioned regionserver.
+ Master's recovery of the just decommissioned RegionServer.
Avoid any problems by disabling the balancer first.
See <xref linkend="lb" /> below.
</para>
</note>
</para>
<para>
- A downside to the above stop of a regionserver is that regions could be offline for
+ A downside to the above stop of a RegionServer is that regions could be offline for
a good period of time. Regions are closed in order. If many regions on the server, the
first region to close may not be back online until all regions close and after the master
- notices the regionserver's znode gone. In HBase 0.90.2, we added facility for having
+ notices the RegionServer's znode gone. In HBase 0.90.2, we added facility for having
a node gradually shed its load and then shutdown itself down. HBase 0.90.2 added the
<filename>graceful_stop.sh</filename> script. Here is its usage:
<programlisting>$ ./bin/graceful_stop.sh
@@ -1151,14 +1151,14 @@ Usage: graceful_stop.sh [--config &c
hostname Hostname of server we are to stop</programlisting>
</para>
<para>
- To decommission a loaded regionserver, run the following:
+ To decommission a loaded RegionServer, run the following:
<programlisting>$ ./bin/graceful_stop.sh HOSTNAME</programlisting>
where <varname>HOSTNAME</varname> is the host carrying the RegionServer
you would decommission.
<note><title>On <varname>HOSTNAME</varname></title>
<para>The <varname>HOSTNAME</varname> passed to <filename>graceful_stop.sh</filename>
- must match the hostname that hbase is using to identify regionservers.
- Check the list of regionservers in the master UI for how HBase is
+ must match the hostname that hbase is using to identify RegionServers.
+ Check the list of RegionServers in the master UI for how HBase is
referring to servers. Its usually hostname but can also be FQDN.
Whatever HBase is using, this is what you should pass the
<filename>graceful_stop.sh</filename> decommission
@@ -1167,7 +1167,7 @@ Usage: graceful_stop.sh [--config &c
currently running; the graceful unloading of regions will not run.
</para>
</note> The <filename>graceful_stop.sh</filename> script will move the regions off the
- decommissioned regionserver one at a time to minimize region churn.
+ decommissioned RegionServer one at a time to minimize region churn.
It will verify the region deployed in the new location before it
will moves the next region and so on until the decommissioned server
is carrying zero regions. At this point, the <filename>graceful_stop.sh</filename>
@@ -1201,7 +1201,7 @@ false
<programlisting>$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt &
</programlisting>
Tail the output of <filename>/tmp/log.txt</filename> to follow the scripts
- progress. The above does regionservers only. Be sure to disable the
+ progress. The above does RegionServers only. Be sure to disable the
load balancer before doing the above. You'd need to do the master
update separately. Do it before you run the above script.
Here is a pseudo-script for how you might craft a rolling restart script:
@@ -1227,10 +1227,10 @@ false
</para>
</listitem>
<listitem>
- <para>Run the <filename>graceful_stop.sh</filename> script per regionserver. For example:
+ <para>Run the <filename>graceful_stop.sh</filename> script per RegionServer. For example:
<programlisting>$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt &
</programlisting>
- If you are running thrift or rest servers on the regionserver, pass --thrift or --rest options (See usage
+ If you are running thrift or rest servers on the RegionServer, pass --thrift or --rest options (See usage
for <filename>graceful_stop.sh</filename> script).
</para>
</listitem>
Modified: hbase/trunk/src/docbkx/configuration.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/configuration.xml?rev=1098158&r1=1098157&r2=1098158&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/configuration.xml (original)
+++ hbase/trunk/src/docbkx/configuration.xml Sat Apr 30 21:05:39 2011
@@ -114,7 +114,7 @@ to ensure well-formedness of your docume
a minute or even less so the Master notices failures the sooner.
Before changing this value, be sure you have your JVM garbage collection
configuration under control otherwise, a long garbage collection that lasts
- beyond the zookeeper session timeout will take out
+ beyond the ZooKeeper session timeout will take out
your RegionServer (You might be fine with this -- you probably want recovery to start
on the server if a RegionServer has been in GC for a long period of time).</para>
@@ -274,7 +274,7 @@ of all regions.
</para>
<para>
Minimally, a client of HBase needs the hbase, hadoop, log4j, commons-logging, commons-lang,
- and zookeeper jars in its <varname>CLASSPATH</varname> connecting to a cluster.
+ and ZooKeeper jars in its <varname>CLASSPATH</varname> connecting to a cluster.
</para>
<para>
An example basic <filename>hbase-site.xml</filename> for client only
@@ -307,7 +307,7 @@ of all regions.
ensemble for the cluster programmatically do as follows:
<programlisting>Configuration config = HBaseConfiguration.create();
config.set("hbase.zookeeper.quorum", "localhost"); // Here we are running zookeeper locally</programlisting>
- If multiple ZooKeeper instances make up your zookeeper ensemble,
+ If multiple ZooKeeper instances make up your ZooKeeper ensemble,
they may be specified in a comma-separated list (just as in the <filename>hbase-site.xml</filename> file).
This populated <classname>Configuration</classname> instance can then be passed to an
<link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html">HTable</link>,
Modified: hbase/trunk/src/docbkx/getting_started.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/getting_started.xml?rev=1098158&r1=1098157&r2=1098158&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/getting_started.xml (original)
+++ hbase/trunk/src/docbkx/getting_started.xml Sat Apr 30 21:05:39 2011
@@ -453,7 +453,7 @@ stopping hbase...............</programli
in the <xref linkend="quickstart" /> section. In
standalone mode, HBase does not use HDFS -- it uses the local
filesystem instead -- and it runs all HBase daemons and a local
- zookeeper all up in the same JVM. Zookeeper binds to a well known port
+ ZooKeeper all up in the same JVM. Zookeeper binds to a well known port
so clients may talk to HBase.</para>
</section>
@@ -508,7 +508,7 @@ stopping hbase...............</programli
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
- <description>The directory shared by region servers.
+ <description>The directory shared by RegionServers.
</description>
</property>
<property>
@@ -539,7 +539,7 @@ stopping hbase...............</programli
<para>See <link
xlink:href="http://hbase.apache.org/pseudo-distributed.html">Pseudo-distributed
mode extras</link> for notes on how to start extra Masters and
- regionservers when running pseudo-distributed.</para>
+ RegionServers when running pseudo-distributed.</para>
</footnote></para>
</section>
@@ -564,7 +564,7 @@ stopping hbase...............</programli
<property>
<name>hbase.rootdir</name>
<value>hdfs://namenode.example.org:9000/hbase</value>
- <description>The directory shared by region servers.
+ <description>The directory shared by RegionServers.
</description>
</property>
<property>
@@ -873,7 +873,7 @@ stopping hbase...............</programli
<property>
<name>hbase.zookeeper.quorum</name>
<value>example1,example2,example3</value>
- <description>The directory shared by region servers.
+ <description>The directory shared by RegionServers.
</description>
</property>
<property>
@@ -886,7 +886,7 @@ stopping hbase...............</programli
<property>
<name>hbase.rootdir</name>
<value>hdfs://example0:9000/hbase</value>
- <description>The directory shared by region servers.
+ <description>The directory shared by RegionServers.
</description>
</property>
<property>
@@ -905,8 +905,8 @@ stopping hbase...............</programli
<section xml:id="regionservers">
<title><filename>regionservers</filename></title>
- <para>In this file you list the nodes that will run regionservers.
- In our case we run regionservers on all but the head node
+ <para>In this file you list the nodes that will run RegionServers.
+ In our case we run RegionServers on all but the head node
<varname>example1</varname> which is carrying the HBase Master and
the HDFS namenode</para>
Modified: hbase/trunk/src/docbkx/performance.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/performance.xml?rev=1098158&r1=1098157&r2=1098158&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/performance.xml (original)
+++ hbase/trunk/src/docbkx/performance.xml Sat Apr 30 21:05:39 2011
@@ -16,14 +16,14 @@
here for more pointers.</para>
<note xml:id="rpc.logging"><title>Enabling RPC-level logging</title>
- <para>Enabling the RPC-level logging on a regionserver can often given
+ <para>Enabling the RPC-level logging on a RegionServer can often given
insight on timings at the server. Once enabled, the amount of log
spewed is voluminous. It is not recommended that you leave this
logging on for more than short bursts of time. To enable RPC-level
- logging, browse to the regionserver UI and click on
+ logging, browse to the RegionServer UI and click on
<emphasis>Log Level</emphasis>. Set the log level to <varname>DEBUG</varname> for the package
<classname>org.apache.hadoop.ipc</classname> (Thats right, for
- hadoop.ipc, NOT, hbase.ipc). Then tail the regionservers log.
+ hadoop.ipc, NOT, hbase.ipc). Then tail the RegionServers log.
Analyze.</para>
<para>To disable, set the logging level back to <varname>INFO</varname> level.
</para>
@@ -87,13 +87,13 @@
<section xml:id="perf.handlers">
<title><varname>hbase.regionserver.handler.count</varname></title>
<para>This setting is in essence sets how many requests are
- concurrently being processed inside the regionserver at any
+ concurrently being processed inside the RegionServer at any
one time. If set too high, then throughput may suffer as
the concurrent requests contend; if set too low, requests will
be stuck waiting to get into the machine. You can get a
sense of whether you have too little or too many handlers by
<xref linkend="rpc.logging" />
- on an individual regionserver then tailing its logs.</para>
+ on an individual RegionServer then tailing its logs.</para>
</section>
</section>
@@ -167,7 +167,7 @@ public static byte[][] getHexSplits(Stri
to false on your <link
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html">HTable</link>
instance. Otherwise, the Puts will be sent one at a time to the
- regionserver. Puts added via <code> htable.add(Put)</code> and <code> htable.add( <List> Put)</code>
+ RegionServer. Puts added via <code> htable.add(Put)</code> and <code> htable.add( <List> Put)</code>
wind up in the same write buffer. If <code>autoFlush = false</code>,
these messages are not sent until the write-buffer is filled. To
explicitly flush the messages, call <methodname>flushCommits</methodname>.
@@ -187,7 +187,7 @@ public static byte[][] getHexSplits(Stri
processed. Setting this value to 500, for example, will transfer 500
rows at a time to the client to be processed. There is a cost/benefit to
have the cache value be large because it costs more in memory for both
- client and regionserver, so bigger isn't always better.</para>
+ client and RegionServer, so bigger isn't always better.</para>
</section>
<section xml:id="perf.hbase.client.scannerclose">
@@ -197,7 +197,7 @@ public static byte[][] getHexSplits(Stri
<emphasis>avoiding</emphasis> performance problems. If you forget to
close <link
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/ResultScanner.html">ResultScanners</link>
- you can cause problems on the regionservers. Always have ResultScanner
+ you can cause problems on the RegionServers. Always have ResultScanner
processing enclosed in try/catch blocks... <programlisting>
Scan scan = new Scan();
// set attrs...
@@ -216,7 +216,7 @@ htable.close();</programlisting></para>
<para><link
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</link>
- instances can be set to use the block cache in the region server via the
+ instances can be set to use the block cache in the RegionServer via the
<methodname>setCacheBlocks</methodname> method. For input Scans to MapReduce jobs, this should be
<varname>false</varname>. For frequently accessed rows, it is advisable to use the block
cache.</para>
@@ -228,7 +228,7 @@ htable.close();</programlisting></para>
<varname>MUST_PASS_ALL</varname> operator to the scanner using <methodname>setFilter</methodname>. The filter list
should include both a <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html">FirstKeyOnlyFilter</link>
and a <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/KeyOnlyFilter.html">KeyOnlyFilter</link>.
- Using this filter combination will result in a worst case scenario of a region server reading a single value from disk
+ Using this filter combination will result in a worst case scenario of a RegionServer reading a single value from disk
and minimal network traffic to the client for a single row.
</para>
</section>
Modified: hbase/trunk/src/docbkx/troubleshooting.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/troubleshooting.xml?rev=1098158&r1=1098157&r2=1098158&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/troubleshooting.xml (original)
+++ hbase/trunk/src/docbkx/troubleshooting.xml Sat Apr 30 21:05:39 2011
@@ -28,7 +28,7 @@
<para>
RegionServer suicides are “normal”, as this is what they do when something goes wrong.
For example, if ulimit and xcievers (the two most important initial settings, see <xref linkend="ulimit" />)
- aren’t changed, it will make it impossible at some point for datanodes to create new threads
+ aren’t changed, it will make it impossible at some point for DataNodes to create new threads
that from the HBase point of view is seen as if HDFS was gone. Think about what would happen if your
MySQL database was suddenly unable to access files on your local file system, well it’s the same with
HBase and HDFS. Another very common reason to see RegionServers committing seppuku is when they enter
@@ -145,7 +145,7 @@ hadoop@sv4borg12:~$ jps
<listitem>Child, its MapReduce task, cannot tell which type exactly</listitem>
<listitem>Hadoop TaskTracker, manages the local Childs</listitem>
<listitem>Hadoop DataNode, serves blocks</listitem>
- <listitem>HQuorumPeer, a zookeeper ensemble member</listitem>
+ <listitem>HQuorumPeer, a ZooKeeper ensemble member</listitem>
<listitem>Jps, well… it’s the current process</listitem>
<listitem>ThriftServer, it’s a special one will be running only if thrift was started</listitem>
<listitem>jmx, this is a local process that’s part of our monitoring platform ( poorly named maybe). You probably don’t have that.</listitem>
@@ -275,7 +275,7 @@ hadoop 17789 155 35.2 9067824 8604364
</programlisting>
</para>
<para>
- And here is a master trying to recover a lease after a region server died:
+ And here is a master trying to recover a lease after a RegionServer died:
<programlisting>
"LeaseChecker" daemon prio=10 tid=0x00000000407ef800 nid=0x76cd waiting on condition [0x00007f6d0eae2000..0x00007f6d0eae2a70]
--
@@ -370,7 +370,7 @@ java.lang.UnsatisfiedLinkError: no gplco
</para>
</section>
<section xml:id="trouble.rs.runtime.oom-nt">
- <title>System instability, and the presence of "java.lang.OutOfMemoryError: unable to create new native thread in exceptions" HDFS datanode logs or that of any system daemon</title>
+ <title>System instability, and the presence of "java.lang.OutOfMemoryError: unable to create new native thread in exceptions" HDFS DataNode logs or that of any system daemon</title>
<para>
See the Getting Started section on <link linkend="ulimit">ulimit and nproc configuration</link>.
</para>