Posted to commits@chukwa.apache.org by as...@apache.org on 2009/11/10 04:29:02 UTC
svn commit: r834323 [1/3] - in /hadoop/chukwa/site/publish: ./ docs/r0.3.0/

Author: asrabkin
Date: Tue Nov 10 03:29:02 2009
New Revision: 834323

URL: http://svn.apache.org/viewvc?rev=834323&view=rev
Log:
update admin guide as per CHUKWA-413.

Modified:
    hadoop/chukwa/site/publish/docs/r0.3.0/admin.html
    hadoop/chukwa/site/publish/docs/r0.3.0/admin.pdf
    hadoop/chukwa/site/publish/docs/r0.3.0/agent.html
    hadoop/chukwa/site/publish/docs/r0.3.0/agent.pdf
    hadoop/chukwa/site/publish/docs/r0.3.0/changes.html
    hadoop/chukwa/site/publish/docs/r0.3.0/collector.html
    hadoop/chukwa/site/publish/docs/r0.3.0/collector.pdf
    hadoop/chukwa/site/publish/docs/r0.3.0/releasenotes.html
    hadoop/chukwa/site/publish/index.html
    hadoop/chukwa/site/publish/index.pdf
    hadoop/chukwa/site/publish/releases.html
    hadoop/chukwa/site/publish/releases.pdf

Modified: hadoop/chukwa/site/publish/docs/r0.3.0/admin.html
URL: http://svn.apache.org/viewvc/hadoop/chukwa/site/publish/docs/r0.3.0/admin.html?rev=834323&r1=834322&r2=834323&view=diff
==============================================================================
--- hadoop/chukwa/site/publish/docs/r0.3.0/admin.html (original)
+++ hadoop/chukwa/site/publish/docs/r0.3.0/admin.html Tue Nov 10 03:29:02 2009
@@ -165,75 +165,54 @@
 </li>
 <li>
 <a href="#Pre-requisites"> Pre-requisites</a>
-<ul class="minitoc">
-<li>
-<a href="#Supported+Platforms">Supported Platforms</a>
 </li>
 <li>
-<a href="#Required+Software">Required Software</a>
-</li>
-</ul>
-</li>
-<li>
-<a href="#Install+Chukwa">Install Chukwa</a>
+<a href="#Installing+Chukwa">Installing Chukwa</a>
 <ul class="minitoc">
 <li>
-<a href="#General++Install+Procedure">General  Install Procedure </a>
+<a href="#First+Steps">First Steps </a>
 </li>
 <li>
-<a href="#Chukwa+Binary">Chukwa Binary</a>
-</li>
-<li>
-<a href="#Chukwa+Configuration+Files">Chukwa Configuration Files </a>
-</li>
-<li>
-<a href="#Hadoop+Configuration+Files">Hadoop Configuration Files</a>
+<a href="#General+Configuration">General Configuration</a>
 </li>
 </ul>
 </li>
 <li>
-<a href="#Chukwa+Cluster+Deployment">Chukwa Cluster Deployment </a>
+<a href="#Agents">Agents </a>
 <ul class="minitoc">
 <li>
-<a href="#1.+Set+the+Environment+Variables">1. Set the Environment Variables</a>
+<a href="#Configuration">Configuration</a>
 </li>
 <li>
-<a href="#2.+Set+Up+the+Hadoop+jar+File">2. Set Up the Hadoop jar File </a>
+<a href="#Starting%2C+stopping%2C+and+monitoring">Starting, stopping, and monitoring</a>
 </li>
 <li>
-<a href="#3.+Configure+the+Collector"> 3. Configure the Collector  </a>
+<a href="#Configuring+Hadoop+for+monitoring">Configuring Hadoop for monitoring</a>
 </li>
-<li>
-<a href="#4.+Set+Up+the+Database"> 4. Set Up the Database </a>
+</ul>
 </li>
 <li>
-<a href="#5.+Start+the+Chukwa+Processes">5. Start the Chukwa Processes </a>
-</li>
+<a href="#Collectors">Collectors </a>
+<ul class="minitoc">
 <li>
-<a href="#6.+Validate+the+Chukwa+Processes">6. Validate the Chukwa Processes </a>
+<a href="#Configuration-N1013E">Configuration</a>
 </li>
 <li>
-<a href="#7.+Set+Up+HICC">7. Set Up HICC </a>
+<a href="#Starting%2C+stopping%2C+and+monitoring-N1015B">Starting, stopping, and monitoring</a>
 </li>
 </ul>
 </li>
 <li>
-<a href="#Monitored+Source+Node+Deployment">Monitored Source Node Deployment </a>
+<a href="#Demux+and+HICC">Demux and HICC</a>
 <ul class="minitoc">
 <li>
-<a href="#1.+Set+the+Environment+Variables-N10205">1. Set the Environment Variables </a>
-</li>
-<li>
-<a href="#2.+Configure+the+Agent">2. Configure the Agent</a>
+<a href="#Start+the+Chukwa+Processes">Start the Chukwa Processes </a>
 </li>
 <li>
-<a href="#3.+Configure+Adaptors">3. Configure Adaptors</a>
+<a href="#Set+Up+the+Database">Set Up the Database </a>
 </li>
 <li>
-<a href="#4.+Start+the+Chukwa+Processes">4. Start the Chukwa Processes </a>
-</li>
-<li>
-<a href="#5.+Validate+the+Chukwa+Processes">5. Validate the Chukwa Processes </a>
+<a href="#Set+Up+HICC">Set Up HICC </a>
 </li>
 </ul>
 </li>
@@ -267,106 +246,196 @@
 <a name="N1000D"></a><a name="Purpose"></a>
 <h2 class="h3"> Purpose </h2>
 <div class="section">
-<p>The purpose of this document is to help you install and configure Chukwa.</p>
+<p> Chukwa is a system for large-scale reliable log collection and processing
+with Hadoop. The <a href="design.html">Chukwa design overview</a> discusses the overall architecture of Chukwa.
+You should read that document before this one.
+The purpose of this document is to help you install and configure Chukwa.</p>
 </div>
 
 
-<a name="N10017"></a><a name="Pre-requisites"></a>
+<a name="N1001B"></a><a name="Pre-requisites"></a>
 <h2 class="h3"> Pre-requisites</h2>
 <div class="section">
-<a name="N1001D"></a><a name="Supported+Platforms"></a>
-<h3 class="h4">Supported Platforms</h3>
-<p>GNU/Linux is supported as a development and production platform. Chukwa has been demonstrated on Hadoop clusters with 2000 nodes.</p>
-<a name="N10027"></a><a name="Required+Software"></a>
-<h3 class="h4">Required Software</h3>
-<p>Required software for Linux include:</p>
-<ol>
-
-<li> Java 1.6.10, preferably from Sun, installed (see <a href="http://java.sun.com/">http://java.sun.com/</a>)
-</li> 
-<li> MySQL 5.1.30 (see <a href="#4.+Set+Up+the+Database">Set Up the Database)</a>
+<p>Chukwa should work on any POSIX platform, but  GNU/Linux is the only
+ production platform that has been tested extensively. Chukwa has also been used
+ successfully on Mac OS X, which several members of the Chukwa team use for 
+ development. </p>
+<p>
+ The only absolute software requirements are <a href="http://java.sun.com">Java 1.6
+ </a> or better and <a href="http://hadoop.apache.org/">Hadoop 0.18+</a>.
+  
 
-</li> 
-<li> Hadoop cluster, installed (see <a href="http://hadoop.apache.org/">http://hadoop.apache.org/</a>)
-</li> 
-<li> ssh must be installed and sshd must be running to use the Chukwa scripts that manage remote Chukwa daemons 
-</li>
-</ol>
+ HICC, the Chukwa
+ visualization interface, <a href="#Set+Up+the+Database">requires MySQL 5.1.30+.</a>
+</p>
+<p>
+The Chukwa cluster management scripts rely on <span class="codefrag">ssh</span>; these scripts, however,
+are not required if you have some alternate mechanism for starting and stopping
+daemons.
+ </p>
 </div>
 
 
 
-<a name="N1004C"></a><a name="Install+Chukwa"></a>
-<h2 class="h3">Install Chukwa</h2>
+<a name="N10039"></a><a name="Installing+Chukwa"></a>
+<h2 class="h3">Installing Chukwa</h2>
 <div class="section">
-<p>Chukwa is installed on: </p>
+<p>A minimal Chukwa deployment has three components: </p>
 <ul>
 
-<li> A hadoop cluster created specifically for Chukwa (referred to as the Chukwa cluster).</li> 
+<li> A Hadoop cluster on which Chukwa will store data (referred to as the Chukwa cluster).</li> 
 
-<li> The source nodes that Chukwa monitors (referred to as the monitored source nodes).</li>
+<li> A collector process that writes collected data to HDFS, the Hadoop file system.</li>
+
+<li> One or more agent processes that send monitoring data to the collector. 
+The nodes with active agent processes are referred to as the monitored source nodes.</li>
 
 </ul>
-<p></p>
-<p></p>
-<p>Chukwa can also be installed on a single node, in which case the machine must have at least 16 GB of memory. </p>
+<p>In addition, you may wish to run the Chukwa Demux jobs, which parse collected
+data, or HICC, the Chukwa visualization tool.</p>
 <p></p>
 <p></p>
 <p></p>
 <div id="" style="text-align: center;">
 <img id="" class="figure" alt="Chukwa Components" src="images/components.gif"></div>
-<a name="N10070"></a><a name="General++Install+Procedure"></a>
-<h3 class="h4">General  Install Procedure </h3>
-<p>1. Select one of the nodes in the Chukwa cluster: </p>
+<a name="N1005C"></a><a name="First+Steps"></a>
+<h3 class="h4">First Steps </h3>
+<ol>
+
+<li>Obtain a copy of Chukwa. You can find the latest release on the 
+<a href="http://hadoop.apache.org/chukwa/releases.html">Chukwa release page</a>.</li>
+
+<li>Un-tar the release, via <span class="codefrag">tar xzf</span>.</li>
+
+<li>Make sure a copy of Chukwa is available on each node being monitored, and on
+each node that will run a collector.</li>
+
+<li>
+We refer to the directory containing Chukwa as <span class="codefrag">CHUKWA_HOME</span>. It may
+be helpful to set <span class="codefrag">CHUKWA_HOME</span> explicitly in your environment,
+but Chukwa does not require that you do so.</li>
+
+</ol>
+<a name="N10081"></a><a name="General+Configuration"></a>
+<h3 class="h4">General Configuration</h3>
+<p>Agents and collectors are configured differently, but part of the process
+is common to both. </p>
 <ul>
 
-<li> Create a directory for the Chukwa installation (Chukwa will set the  environment variable <strong>CHUKWA_HOME</strong> to point to this directory during the the install).
-</li> 
-<li> Move to the new directory.
-</li> 
-<li> Download and un-tar the Chukwa binary.
-</li> 
-<li> Configure the components for the Chukwa cluster (see <a href="#Chukwa+Cluster+Deployment">Chukwa Cluster Deployment</a>).
-</li> 
-<li> Configure the Hadoop configuration files (see <a href="#Hadoop+Configuration+Files">Hadoop Configuration Files</a>).
-</li> 
-<li> Zip the directory and deploy to all nodes in the Chukwa cluster.
+<li>Make sure that <span class="codefrag">JAVA_HOME</span> is set correctly and points to a Java 1.6 JRE. 
+It's generally best to set this in <span class="codefrag">conf/chukwa-env.sh</span>.</li>
+
+<li>
+In <span class="codefrag">conf/chukwa-env.sh</span>, set <span class="codefrag">CHUKWA_LOG_DIR</span> and
+<span class="codefrag">CHUKWA_PID_DIR</span> to the directories where Chukwa should store its
+console logs and pid files.  The pid directory must not be shared between
+different Chukwa instances: it should be local, not NFS-mounted.
 </li>
+ 
+<li> Optionally, set CHUKWA_IDENT_STRING. This string is
+ used to name Chukwa's own console log files.</li>
+<!--
+<li>Set <b>either</b> <code>HADOOP_HOME</code> or <code>HADOOP_JAR</code></li>
+-->
+
 </ul>
+</div>
+
+<!-- 
+</li> <li> Download and un-tar the Chukwa binary.
+</li> <li> Configure the components for the Chukwa cluster (see <a href="#Chukwa+Cluster+Deployment">Chukwa Cluster Deployment</a>).
+</li> <li> Configure the Hadoop configuration files (see <a href="#Hadoop+Configuration+Files">Hadoop Configuration Files</a>).
+</li> <li> Zip the directory and deploy to all nodes in the Chukwa cluster.
+</li></ul> 
 <p></p>
 <p></p>
 <p>2. Select one of the source nodes to be monitored: </p>
 <ul>
-
 <li> Create a directory for the Chukwa installation (Chukwa will set the environment variable <strong>CHUKWA_HOME</strong> to point to this directory during the install).
-</li> 
-<li> Move to the new directory.
-</li> 
-<li> Download and un-tar the Chukwa binary.
-</li> 
-<li> Configure the components for the source nodes (see <a href="#Monitored+Source+Node+Deployment">Monitored Source Node Deployment</a>).
-</li> 
-<li> Configure the Hadoop configuration files (see <a href="#Hadoop+Configuration+Files">Hadoop Configuration Files</a>).
-</li> 
-<li> Zip the directory and deploy to all source nodes to be monitored.
+</li> <li> Move to the new directory.
+</li> <li> Download and un-tar the Chukwa binary.
+</li> <li> Configure the components for the source nodes (see <a href="#Monitored+Source+Node+Deployment">Monitored Source Node Deployment</a>).
+</li> <li> Configure the Hadoop configuration files (see <a href="#Hadoop+Configuration+Files">Hadoop Configuration Files</a>).
+</li> <li> Zip the directory and deploy to all source nodes to be monitored.
+</li></ul> 
+</section>
+ -->
+
+
+<a name="N100AB"></a><a name="Agents"></a>
+<h2 class="h3">Agents </h2>
+<div class="section">
+<p>Agents are the Chukwa processes that actually produce data. This section
+describes how to configure and run them. More details are available in the
+<a href="agent.html">Agent configuration guide</a>.</p>
+<a name="N100B8"></a><a name="Configuration"></a>
+<h3 class="h4">Configuration</h3>
+<p>This section describes how to set up the agent process on the source nodes. </p>
+<p>The one mandatory configuration step is to set up 
+<span class="codefrag"> $CHUKWA_HOME/conf/collectors</span>. This file should contain a list
+of hosts that will run Chukwa collectors. Agents will pick a random collector
+from this list to try sending to, and will fail-over to another listed collector
+on error.  The file should look something like:</p>
+<pre class="code">
+http://&lt;collector1HostName&gt;:&lt;collector1Port&gt;/
+http://&lt;collector2HostName&gt;:&lt;collector2Port&gt;/
+http://&lt;collector3HostName&gt;:&lt;collector3Port&gt;/
+</pre>
+<p>Edit the CHUKWA_HOME/conf/initial_adaptors configuration file. This is 
+where you tell Chukwa what log files to monitor. See
+<a href="agent.html#Adaptors">the adaptor configuration guide</a> for
+a list of available adaptors.</p>
+<p>There are a number of optional settings in 
+<span class="codefrag">$CHUKWA_HOME/conf/chukwa-agent-conf.xml</span>:</p>
+<ul>
+
+<li>The most important of these is the cluster/group name that identifies the
+monitored source nodes. This value is stored in each Chunk of collected data;
+you can therefore use it to distinguish data coming from different groups of 
+machines.
+<pre class="code">
+ &lt;property&gt;
+    &lt;name&gt;chukwaAgent.tags&lt;/name&gt;
+    &lt;value&gt;cluster="demo"&lt;/value&gt;
+    &lt;description&gt;The cluster's name for this agent&lt;/description&gt;
+  &lt;/property&gt;
+</pre>
+
 </li>
+
+<li>
+Another important option is <span class="codefrag">chukwaAgent.checkpoint.dir</span>.
+This is the directory Chukwa will use for its periodic checkpoints of running adaptors.
+It <strong>must not</strong> be a shared directory; use a local directory, not an NFS mount.
+</li>
+
 </ul>
-<a name="N100BF"></a><a name="Chukwa+Binary"></a>
-<h3 class="h4">Chukwa Binary</h3>
-<p>To get a Chukwa distribution, download a recent stable release of Chukwa from one of the Apache Download Mirrors (see 
- <a href="http://hadoop.apache.org/chukwa/">Hadoop Chukwa Releases</a>.  
+<a name="N100F0"></a><a name="Starting%2C+stopping%2C+and+monitoring"></a>
+<h3 class="h4">Starting, stopping, and monitoring</h3>
+<p>To run an agent process on a single node, use <span class="codefrag">bin/agent.sh</span>.
 </p>
-<a name="N100CD"></a><a name="Chukwa+Configuration+Files"></a>
-<h3 class="h4">Chukwa Configuration Files </h3>
-<p>The Chukwa configuration files are located in the CHUKWA_HOME/conf directory. The configuration files that you modify are named <strong> *.template. </strong>
-To set up your Chukwa installation (configure various components), copy, rename, and modify the *.template files as necessary. 
-For example, copy the chukwa-collector-conf.xml.template file to a file named chukwa-collector-conf.xml and then modify the file to include the cluster/group name for the source nodes.
+<p>
+Typically, agents run as daemons. The script <span class="codefrag">bin/start-agents.sh</span> 
+will ssh to each machine listed in <span class="codefrag">conf/agents</span> and start an agent,
+running in the background. The script <span class="codefrag">bin/stop-agents.sh</span> 
+does the reverse.</p>
+<p>You can, of course, use any other daemon-management system you like. 
+For instance, <span class="codefrag">tools/init.d</span> includes init scripts for running
+Chukwa agents.</p>
+<p>To check if an agent is working properly, you can telnet to the control
+port (9093 by default) and hit "enter". You will get a status message if
+the agent is running normally.
 </p>
-<p>The <strong>default.properties</strong> file contains default parameter settings. To override these default settings use the <strong>build.properties </strong> file. 
-For example, copy the TODO-JAVA-HOME environment variable from the default.properties file to the build.properties file and change the setting.</p>
-<a name="N100E3"></a><a name="Hadoop+Configuration+Files"></a>
-<h3 class="h4">Hadoop Configuration Files</h3>
-<p>The Hadoop configuration files are located in the HADOOP_HOME/conf directory. To setup Chukwa to collect logs from Hadoop, you need to change some of the hadoop configuration files.</p>
+<a name="N10112"></a><a name="Configuring+Hadoop+for+monitoring"></a>
+<h3 class="h4">Configuring Hadoop for monitoring</h3>
+<p>
+One of the key goals for Chukwa is to collect logs from Hadoop clusters. This section
+describes how to configure Hadoop to send its logs to Chukwa. Note that 
+these directions require Hadoop 0.20.0+.  Earlier versions of Hadoop do not have
+the hooks that Chukwa requires in order to grab MapReduce job logs.</p>
+<p>The Hadoop configuration files are located in <span class="codefrag">HADOOP_HOME/conf</span>.
+ To set up Chukwa to collect logs from Hadoop, you need to change some of the 
+ Hadoop configuration files.</p>
 <ol>
 	
 <li>Copy CHUKWA_HOME/conf/hadoop-log4j.properties file to HADOOP_HOME/conf/log4j.properties</li>
@@ -374,44 +443,73 @@
 <li>Copy CHUKWA_HOME/conf/hadoop-metrics.properties file to HADOOP_HOME/conf/hadoop-metrics.properties</li>
 	
 <li>Edit the HADOOP_HOME/conf/hadoop-metrics.properties file and change @CHUKWA_LOG_DIR@ to your actual CHUKWA log directory (for example, CHUKWA_HOME/var/log)</li>	
-	
-<li>ln -s HADOOP_HOME/conf/hadoop-site.xml CHUKWA_HOME/conf/hadoop-site.xml</li>
-
+<!-- <li>ln -s HADOOP_HOME/conf/hadoop-site.xml CHUKWA_HOME/conf/hadoop-site.xml</li>
+ -->	
+ 
 </ol>
 </div>
 
 
 
-<a name="N100FD"></a><a name="Chukwa+Cluster+Deployment"></a>
-<h2 class="h3">Chukwa Cluster Deployment </h2>
+<a name="N10131"></a><a name="Collectors"></a>
+<h2 class="h3">Collectors </h2>
+<div class="section">
+<p>This section describes how to set up the Chukwa collectors.
+For more details, see <a href="collector.html">the collector configuration guide</a>.</p>
+<a name="N1013E"></a><a name="Configuration-N1013E"></a>
+<h3 class="h4">Configuration</h3>
+<p>First, edit <span class="codefrag">$CHUKWA_HOME/conf/chukwa-env.sh</span>. In addition to 
+the general directions given above, you should set <span class="codefrag">
+HADOOP_HOME</span>. This should be the Hadoop deployment Chukwa will use to
+store collected data.
+You will get a version mismatch error if this is configured incorrectly.
+</p>
+<p>Next, edit <span class="codefrag">$CHUKWA_HOME/conf/chukwa-collector-conf.xml</span>.
+The one mandatory configuration parameter is <span class="codefrag">writer.hdfs.filesystem</span>.
+This should be set to the HDFS root URL on which Chukwa will store data.
+Various optional settings are described in <a href="collector.html">the collector configuration guide</a>
+and in the collector configuration file itself.
+</p>
+<a name="N1015B"></a><a name="Starting%2C+stopping%2C+and+monitoring-N1015B"></a>
+<h3 class="h4">Starting, stopping, and monitoring</h3>
+<p>To run a collector process on a single node, use <span class="codefrag">bin/jettyCollector.sh</span>.
+</p>
+<p>
+Typically, collectors run as daemons. The script <span class="codefrag">bin/start-collectors.sh</span> 
+will ssh to each collector listed in <span class="codefrag">conf/collectors</span> and start a
+collector, running in the background. The script <span class="codefrag">bin/stop-collectors.sh
+</span> does the reverse.</p>
+<p>You can, of course, use any other daemon-management system you like. 
+For instance, <span class="codefrag">tools/init.d</span> includes init scripts for running
+Chukwa collectors.</p>
+<p>To check if a collector is working properly, you can simply access
+<span class="codefrag">http://collectorhost:collectorport/chukwa?ping=true</span> with a web browser.
+If the collector is running, you should see a status page with a handful of statistics.</p>
+</div>
+
+
+<a name="N10181"></a><a name="Demux+and+HICC"></a>
+<h2 class="h3">Demux and HICC</h2>
 <div class="section">
-<p>This section describes how to set up the Chukwa cluster and related components.</p>
-<a name="N10106"></a><a name="1.+Set+the+Environment+Variables"></a>
-<h3 class="h4">1. Set the Environment Variables</h3>
-<p>Edit the CHUKWA_HOME/conf/chukwa-env.sh configuration file: </p>
+<a name="N10189"></a><a name="Start+the+Chukwa+Processes"></a>
+<h3 class="h4">Start the Chukwa Processes </h3>
+<p>The Chukwa startup scripts are located in the CHUKWA_HOME/tools/init.d directory.</p>
 <ul>
 
-<li> Set JAVA_HOME to your Java installation.
-</li> 
-<li> Set HADOOP_JAR to $CHUKWA_HOME/hadoopjars/hadoop-0.18.2.jar 
-</li> 
-<li> Set CHUKWA_IDENT_STRING to the Chukwa cluster name. 
+<li> Start the Chukwa data processors script (execute this command only on the data processor node):
 </li>
 </ul>
-<a name="N1011B"></a><a name="2.+Set+Up+the+Hadoop+jar+File"></a>
-<h3 class="h4">2. Set Up the Hadoop jar File </h3>
-<p>Do the following:</p>
-<pre class="code">
-cp $HADOOP_HOME/lib hadoop-*-core.jar file $CHUKWA_HOME/hadoopjars
-</pre>
-<a name="N10129"></a><a name="3.+Configure+the+Collector"></a>
-<h3 class="h4"> 3. Configure the Collector  </h3>
-<p>Edit the CHUKWA_HOME/conf/chukwa-collector-conf.xml configuration file.</p>
-<p>Set the writer.hdfs.filesystem property to the HDFS root URL. </p>
-<a name="N10136"></a><a name="4.+Set+Up+the+Database"></a>
-<h3 class="h4"> 4. Set Up the Database </h3>
+<pre class="code">CHUKWA_HOME/tools/init.d/chukwa-data-processors start </pre>
+<ul>
+
+<li> Create down sampling daily cron job:
+</li>
+</ul>
+<pre class="code">CHUKWA_HOME/bin/downSampling.sh --config &lt;path to chukwa conf&gt; -n add </pre>
+<a name="N101A5"></a><a name="Set+Up+the+Database"></a>
+<h3 class="h4">Set Up the Database </h3>
 <p>Set up and configure the MySQL database.</p>
-<a name="N1013F"></a><a name="Install+MySQL"></a>
+<a name="N101AE"></a><a name="Install+MySQL"></a>
 <h4>Install MySQL</h4>
 <p>Download MySQL 5.1 from the <a href="http://dev.mysql.com/downloads/mysql/5.1.html#downloads">MySQL site</a>. </p>
 <pre class="code">
@@ -434,7 +532,7 @@
 </pre>
 <p>Download the MySQL Connector/J 5.1 from the  <a href="http://dev.mysql.com/downloads/connector/j/5.1.html">MySQL site</a>, 
 and place the jar file in $CHUKWA_HOME/lib.</p>
-<a name="N10169"></a><a name="Set+Up+MySQL+for+Replication"></a>
+<a name="N101D8"></a><a name="Set+Up+MySQL+for+Replication"></a>
 <h4>Set Up MySQL for Replication</h4>
 <p>Start the MySQL shell:</p>
 <pre class="code">
@@ -446,58 +544,8 @@
 GRANT REPLICATION SLAVE ON *.* TO '&lt;username&gt;'@'%' IDENTIFIED BY '&lt;password&gt;';
 FLUSH PRIVILEGES; 
 </pre>
-<a name="N1017E"></a><a name="Migrate+Existing+Data+From+Chukwa+0.1.1"></a>
-<h4>Migrate Existing Data From Chukwa 0.1.1</h4>
-<p>Start the MySQL shell:</p>
-<pre class="code">
-mysql -u root -p
-Enter password:
-</pre>
-<p>From the MySQL shell, enter these commands (replace &lt;database_name&gt; with an actual value):</p>
-<pre class="code">
-use &lt;database_name&gt;
-source /path/to/chukwa/conf/database_create_table.sql
-source /path/to/chukwa/conf/database_upgrade.sql
-</pre>
-<a name="N10194"></a><a name="5.+Start+the+Chukwa+Processes"></a>
-<h3 class="h4">5. Start the Chukwa Processes </h3>
-<p>The Chukwa startup scripts are located in the CHUKWA_HOME/tools/init.d directory.</p>
-<ul>
-
-<li> Start the Chukwa collector  script (execute this command only on those nodes that have the Chukwa Collector installed):
-</li>
-</ul>
-<pre class="code">CHUKWA_HOME/tools/init.d/chukwa-collector start </pre>
-<ul>
-
-<li> Start the Chukwa data processors script (execute this command only on the data processor node):
-</li>
-</ul>
-<pre class="code">CHUKWA_HOME/tools/init.d/chukwa-data-processors start </pre>
-<ul>
-
-<li> Create down sampling daily cron job:
-</li>
-</ul>
-<pre class="code">CHUKWA_HOME/bin/downSampling.sh --config &lt;path to chukwa conf&gt; -n add </pre>
-<a name="N101B9"></a><a name="6.+Validate+the+Chukwa+Processes"></a>
-<h3 class="h4">6. Validate the Chukwa Processes </h3>
-<p>The Chukwa status scripts are located in the CHUKWA_HOME/tools/init.d directory.</p>
-<ul>
-
-<li> To obtain the status for the Chukwa collector, run:</li>
-
-</ul>
-<pre class="code">CHUKWA_HOME/tools/init.d/chukwa-collector status </pre>
-<ul>
-
-<li> To verify that the data processors are functioning correctly: </li>
-
-</ul>
-<pre class="code">Visit the Chukwa hadoop cluster's Job Tracker UI for job status. 
-Refresh to the Chukwa Cluster Configuration page for the Job Tracker URL. </pre>
-<a name="N101D7"></a><a name="7.+Set+Up+HICC"></a>
-<h3 class="h4">7. Set Up HICC </h3>
+<a name="N101F0"></a><a name="Set+Up+HICC"></a>
+<h3 class="h4">Set Up HICC </h3>
 <p>The Hadoop Infrastructure Care Center (HICC) is the Chukwa web user interface. To set up HICC, do the following:</p>
 <ul>
 
@@ -508,7 +556,7 @@
 <li>Start up HICC by running: </li> 
 
 </ul>
-<pre class="code">CHUKWA_HOME/bin/hicc.sh start</pre>
+<pre class="code">$CHUKWA_HOME/bin/hicc.sh start</pre>
 <ul>
 
 <li>Point your favorite browser to: http://&lt;server&gt;:8080/hicc  </li> 
@@ -517,131 +565,13 @@
 </div>
 
 
-<a name="N101FC"></a><a name="Monitored+Source+Node+Deployment"></a>
-<h2 class="h3">Monitored Source Node Deployment </h2>
-<div class="section">
-<p>This section describes how to set up the source nodes. </p>
-<a name="N10205"></a><a name="1.+Set+the+Environment+Variables-N10205"></a>
-<h3 class="h4">1. Set the Environment Variables </h3>
-<p>Edit the CHUKWA_HOME/conf/chukwa-current/chukwa-env.sh configuration file: </p>
-<ul>
-
-<li> Set JAVA_HOME to the root of your Java installation.
-</li>
-<li> Set other environment variables as necessary.
-</li>
-</ul>
-<pre class="code">
-export JAVA_HOME=/path/to/java
-export HADOOP_HOME=/path/to/hadoop
-export chuwaRecordsRepository="/chukwa/repos/"
-export JDBC_DRIVER=com.mysql.jdbc.Driver
-export JDBC_URL_PREFIX=jdbc:mysql://
-</pre>
-<a name="N1021A"></a><a name="2.+Configure+the+Agent"></a>
-<h3 class="h4">2. Configure the Agent</h3>
-<p>Edit the CHUKWA_HOME/conf/chukwa-current/chukwa-agent-conf.xml configuration file. </p>
-<p>Enter the cluster/group name that identifies the monitored source nodes:</p>
-<pre class="code">
- &lt;property&gt;
-    &lt;name&gt;chukwaAgent.tags&lt;/name&gt;
-    &lt;value&gt;cluster="demo"&lt;/value&gt;
-    &lt;description&gt;The cluster's name for this agent&lt;/description&gt;
-  &lt;/property&gt;
-</pre>
-<p>Edit the CHUKWA_HOME/conf/chukwa-current/agents configuration file. </p>
-<p>Create a list of hosts that are running the Chukwa agent:</p>
-<pre class="code">
-localhost
-localhost
-localhost
-</pre>
-<p>Edit the CHUKWA_HOME/conf/collectors configuration file. </p>
-<p>The Chukwa agent needs to know about the Chukwa collectors. Create a list of hosts that are running the Chukwa collector:</p>
-<ul>
-	
-<li>This ...</li>
-
-</ul>
-<pre class="code">
-&lt;collector1HostName&gt;
-&lt;collector2HostName&gt;
-&lt;collector3HostName&gt;
-</pre>
-<ul>
-	
-<li>Or this ...</li>
-
-</ul>
-<pre class="code">
-http://&lt;collector1HostName&gt;:&lt;collector1Port&gt;/
-http://&lt;collector2HostName&gt;:&lt;collector2Port&gt;/
-http://&lt;collector3HostName&gt;:&lt;collector3Port&gt;/
-</pre>
-<a name="N1024F"></a><a name="3.+Configure+Adaptors"></a>
-<h3 class="h4">3. Configure Adaptors</h3>
-<p>Edit the CHUKWA_HOME/conf/initial_adaptors configuration file.</p>
-<p>Define the default adaptors:</p>
-<pre class="code">
-add org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLineEscaped SysLog 0 /var/log/messages 0
-</pre>
-<p>Make sure Chukwa has a Read access to /var/log/messages. </p>
-<a name="N10263"></a><a name="4.+Start+the+Chukwa+Processes"></a>
-<h3 class="h4">4. Start the Chukwa Processes </h3>
-<p>Start the Chukwa agent and system metrics processes on the monitored source nodes.</p>
-<p>The Chukwa startup scripts are located in the CHUKWA_HOME/tools/init.d directory.</p>
-<p>Run both of these commands on all monitored source nodes: </p>
-<ul>
-
-<li> Start the Chukwa agent script:
-</li>
-</ul>
-<pre class="code">CHUKWA_HOME /tools/init.d/chukwa-agent start</pre>
-<ul>
-
-<li> Start the Chukwa system metrics script:
-</li>
-</ul>
-<pre class="code">CHUKWA_HOME /tools/init.d/chukwa-system-metrics start</pre>
-<a name="N10285"></a><a name="5.+Validate+the+Chukwa+Processes"></a>
-<h3 class="h4">5. Validate the Chukwa Processes </h3>
-<p>The Chukwa status scripts are located in the CHUKWA_HOME/tools/init.d directory.</p>
-<p>Verify that that agent and system metrics processes are running on all source nodes: </p>
-<ul>
-
-<li> To obtain the status for the Chukwa agent, run:
-</li>
-</ul>
-<pre class="code">CHUKWA_HOME/tools/init.d/chukwa-agent status </pre>
-<ul>
-
-<li> To obtain the status for the system metrics, run:
-</li>
-</ul>
-<pre class="code">CHUKWA_HOME/tools/init.d/chukwa-system-metrics status </pre>
-</div>
-
 
 
-<a name="N102A5"></a><a name="Troubleshooting+Tips"></a>
+<a name="N10215"></a><a name="Troubleshooting+Tips"></a>
 <h2 class="h3">Troubleshooting Tips</h2>
 <div class="section">
-<a name="N102AB"></a><a name="UNIX+Processes+For+Chukwa+Agents"></a>
+<a name="N1021B"></a><a name="UNIX+Processes+For+Chukwa+Agents"></a>
 <h3 class="h4">UNIX Processes For Chukwa Agents</h3>
-<p>The system metrics data loader process names are uniquely defined by:</p>
-<ul>
-
-<li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec sar -q -r -n ALL 55
-</li> 
-<li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec iostat -x -k 55 2
-</li> 
-<li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec top -b -n 1 -c
-</li> 
-<li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec df -l
-</li> 
-<li> org.apache.hadoop.chukwa.inputtools.plugin.metrics.Exec CHUKWA_HOME/bin/../bin/netstat.sh
-</li>
-</ul>
 <p>The Chukwa agent process name is identified by:</p>
 <ul>
 
@@ -654,7 +584,7 @@
 <li> ps ax | grep org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent
 </li>
 </ul>
-<a name="N102D6"></a><a name="UNIX+Processes+For+Chukwa+Collectors"></a>
+<a name="N10234"></a><a name="UNIX+Processes+For+Chukwa+Collectors"></a>
 <h3 class="h4">UNIX Processes For Chukwa Collectors</h3>
 <p>Chukwa Collector name is identified by:</p>
 <ul>
@@ -664,7 +594,7 @@
 
 </li>
 </ul>
-<a name="N102E8"></a><a name="UNIX+Processes+For+Chukwa+Data+Processes"></a>
+<a name="N10246"></a><a name="UNIX+Processes+For+Chukwa+Data+Processes"></a>
 <h3 class="h4">UNIX Processes For Chukwa Data Processes</h3>
 <p>Chukwa Data Processors are identified by:</p>
 <ul>
@@ -677,7 +607,7 @@
 </li>
 </ul>
 <p>These processes run on a schedule, so they are not always visible in the process list.</p>
-<a name="N10300"></a><a name="Checks+for+MySQL+Replication"></a>
+<a name="N1025E"></a><a name="Checks+for+MySQL+Replication"></a>
 <h3 class="h4">Checks for MySQL Replication </h3>
 <p>At the MySQL prompt on the slave server, run:</p>
 <pre class="code">
@@ -707,7 +637,7 @@
   MASTER_CONNECT_RETRY=10;
 START SLAVE;
 </pre>
-<a name="N1032C"></a><a name="Checks+For+Disk+Full"></a>
+<a name="N1028A"></a><a name="Checks+For+Disk+Full"></a>
 <h3 class="h4"> Checks For Disk Full </h3>
 <p>If anything is wrong, use /etc/init.d/chukwa-agent and CHUKWA_HOME/tools/init.d/chukwa-system-metrics stop to shut down Chukwa.  
 Look at the agent.log and collector.log files to determine the problem. </p>
@@ -716,7 +646,7 @@
  0 12 * * * CHUKWA_HOME/tools/expiration.sh 10 !CHUKWA_HOME/var/log nowait
 </pre>
 <p>This will set up the log file expiration for CHUKWA_HOME/var/log for log files older than 10 days.</p>
-<a name="N10340"></a><a name="Emergency+Shutdown+Procedure"></a>
+<a name="N1029E"></a><a name="Emergency+Shutdown+Procedure"></a>
 <h3 class="h4">Emergency Shutdown Procedure</h3>
 <p>If the system is not functioning properly and you cannot find an answer in the Administration Guide, execute the kill command. 
 The current state of the java process will be written to the log files. You can analyze these files to determine the cause of the problem.</p>