Posted to commits@chukwa.apache.org by as...@apache.org on 2010/03/17 05:28:03 UTC

svn commit: r924149 - in /hadoop/chukwa/trunk: ./ src/docs/src/documentation/content/xdocs/

Author: asrabkin
Date: Wed Mar 17 04:28:03 2010
New Revision: 924149

URL: http://svn.apache.org/viewvc?rev=924149&view=rev
Log:
CHUKWA-458. Doc fixes for 0.4

Modified:
    hadoop/chukwa/trunk/CHANGES.txt
    hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/admin.xml
    hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/agent.xml
    hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/collector.xml
    hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/programming.xml

Modified: hadoop/chukwa/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/chukwa/trunk/CHANGES.txt?rev=924149&r1=924148&r2=924149&view=diff
==============================================================================
--- hadoop/chukwa/trunk/CHANGES.txt (original)
+++ hadoop/chukwa/trunk/CHANGES.txt Wed Mar 17 04:28:03 2010
@@ -30,6 +30,8 @@ Trunk (unreleased changes)
  
    IMPROVEMENTS
 
+    CHUKWA-458. Documentation fixes for 0.4 (asrabkin)
+
     CHUKWA-446. Refactor start/stop scripts. (Eric Yang)
 
     CHUKWA-450. Ability to turn off sort in dumpchunks. (asrabkin)

Modified: hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/admin.xml
URL: http://svn.apache.org/viewvc/hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/admin.xml?rev=924149&r1=924148&r2=924149&view=diff
==============================================================================
--- hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/admin.xml (original)
+++ hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/admin.xml Wed Mar 17 04:28:03 2010
@@ -240,7 +240,7 @@ It <strong>must not</strong> be a shared
 
 <section>
 <title>Starting, stopping, and monitoring</title>
-<p>To run an agent process on a single node, use <code>bin/agent.sh</code>.
+<p>To run an agent process on a single node, use <code>bin/chukwa agent</code>.
 </p>
 
 <p>
@@ -303,7 +303,7 @@ and in the collector configuration file 
 
 <section>
 <title>Starting, stopping, and monitoring</title>
-<p>To run a collector process on a single node, use <code>bin/jettyCollector.sh</code>.
+<p>To run a collector process on a single node, use <code>bin/chukwa collector</code>.
 </p>
 
 <p>

Modified: hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/agent.xml
URL: http://svn.apache.org/viewvc/hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/agent.xml?rev=924149&r1=924148&r2=924149&view=diff
==============================================================================
--- hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/agent.xml (original)
+++ hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/agent.xml Wed Mar 17 04:28:03 2010
@@ -34,7 +34,7 @@ or tailing a file, or listening for inco
 Adaptors are dynamically loadable modules that run inside the Agent process. 
 There is generally one Adaptor for each data source: for each file being watched 
 or for each Unix command being executed. Each adaptor has a unique name. If you 
-do not specify a name, one will be autogenerated by hashing the Adaptor type and
+do not specify a name, one will be auto-generated by hashing the Adaptor type and
 parameters.</p>
 
 <p>There are a number of Adaptors built into Chukwa, and you can also develop
@@ -60,6 +60,7 @@ your own. Chukwa will use them if you ad
 <tr><td><code>list</code>  </td><td> List currently running adaptors</td><td>None</td></tr>
 <tr><td><code>reloadcollectors</code>  </td><td> Re-read list of collectors</td><td>None</td></tr>
 <tr><td><code>stop</code>  </td><td> Stop adaptor, abruptly</td><td>Adaptor name</td></tr>
+<tr><td><code>stopall</code>  </td><td> Stop all adaptors, abruptly</td><td>None</td></tr>
 <tr><td><code>shutdown</code>  </td><td> Stop adaptor, gracefully</td><td>Adaptor name</td></tr>
 <tr><td><code>stopagent</code>  </td><td> Stop agent process</td><td>None</td></tr>
 </table>
@@ -83,12 +84,11 @@ the adaptor parameters.
 followed with an equals sign. It should be a string of printable characters, 
 without whitespace or '='.  Chukwa Adaptor names all start with "adaptor_".
 If you specify an adaptor name which does not start with that prefix, it will
-be added automatically.  This convention is intended to make the Chukwa logs
-easier to parse.
+be added automatically.  
 </p>
 
-<p>Adaptor parameters aren't required by the add command, but adaptor 
-implementations may have both mandatory and optional parameters. See below.</p>
+<p>Adaptor parameters aren't required by the Chukwa agent, but each class of adaptor 
+may itself specify both mandatory and optional parameters. See below.</p>
 </section>
 
 <section>
@@ -102,7 +102,7 @@ that will be used as collector, overridi
 not for production use.</p>
 
 <source>
-bin/agent.sh local
+bin/chukwa agent local
 </source>
 </section>
 
@@ -117,46 +117,71 @@ bin/agent.sh local
 <source>add FileTailer FooData /tmp/foo 0</source>
 This pushes file <code>/tmp/foo</code> as one chunk, with datatype <code>FooData</code>.
 </li>
-<li><strong>filetailer.FileTailingAdaptor</strong>
- Repeatedly tails a file, treating the file as a sequence of bytes, ignoring the
+<li><strong>filetailer.LWFTAdaptor</strong>
+Repeatedly tails a file, treating the file as a sequence of bytes, ignoring the
   content. Chunk boundaries are arbitrary. This is useful for streaming binary 
-  data. Takes one mandatory parameter; a path to the file to tail.
+  data. Takes one mandatory parameter: a path to the file to tail. If the log
+  file is rotated while there is unread data, this adaptor will not attempt to recover it.
+  <source>add filetailer.LWFTAdaptor BarData /foo/bar 0</source>
+This pushes <code>/foo/bar</code> in a sequence of Chunks of type <code>BarData</code>.
+</li>
+
+<li><strong>filetailer.FileTailingAdaptor</strong>
+ Repeatedly tails a file, again ignoring content and with unspecified Chunk
+ boundaries. Takes one mandatory parameter: a path to the file to tail. Keeps a
+  file handle open in order to detect log file rotation.
 <source>add filetailer.FileTailingAdaptor BarData /foo/bar 0</source>
 This pushes <code>/foo/bar</code> in a sequence of Chunks of type <code>BarData</code>
+</li>
 
+
+<li><strong>filetailer.RCheckFTAdaptor</strong>
+ An experimental modification of the above, which avoids the need to keep a file handle
+ open.  Same parameters and usage as the above.
 </li>
+
 <li><strong>filetailer.CharFileTailingAdaptorUTF8</strong>
-The same, except that chunks are guaranteed to end only at carriage returns.
+The same as the base FileTailingAdaptor, except that chunks are guaranteed to end only at carriage returns.
  This is useful for most ASCII log file formats.
 </li>
 
 <li><strong>filetailer.CharFileTailingAdaptorUTF8NewLineEscaped</strong>
  The same, except that chunks are guaranteed to end only at non-escaped carriage
-  returns. This is useful for pushing Chukwa-formatted log files, where exception
-   stack traces stay in a single chunk.
+ returns. This is useful for pushing Chukwa-formatted log files, where exception
+ stack traces stay in a single chunk.
 </li>
 
-<li><strong>DirTailingAdaptor</strong> Takes a directory path and a second
+<li><strong>DirTailingAdaptor</strong> Takes a directory path and an
  adaptor name as mandatory parameters; repeatedly scans that directory and all
  subdirectories, and starts the indicated adaptor running on each file. Since
  the DirTailingAdaptor does not, itself, emit data, the datatype parameter is 
  applied to the newly-spawned adaptors.  Note  that if you try this on a large 
- directory, it is possible to exceed your system's limit on open files.
+ directory with an adaptor that keeps file handles open,
+  it is possible to exceed your system's limit on open files.
+  A file pattern can be specified as an optional second parameter.
 
-<source>add DirTailingAdaptor logs /var/log/ filetailer.CharFileTailingAdaptorUTF8 0</source>
+<source>add DirTailingAdaptor logs /var/log/ [pattern] filetailer.CharFileTailingAdaptorUTF8 0</source>
 
 </li>
-<li><strong>ExecAdaptor</strong> Takes a frequency (in miliseconds) as optional 
+<li><strong>ExecAdaptor</strong> Takes a frequency (in milliseconds) as optional 
 parameter, and then program name as mandatory parameter. Runs that program 
 repeatedly at a rate specified by frequency.
 
 <source>add ExecAdaptor Df 60000 /bin/df -x nfs -x none 0</source>
- This adaptor will run <code>df</code> every minute, labelling output as Df.
+ This adaptor will run <code>df</code> every minute, labeling output as Df.
+</li>
+
+<li><strong>UDPAdaptor</strong> Takes a port number as mandatory parameter.
+Binds to the indicated UDP port, and emits one Chunk for each received packet.
+
+<source>add UDPAdaptor Packets 1234 0</source>
+ This adaptor will listen for incoming traffic on port 1234, labeling output as Packets.
 </li>
 
-<li><strong>edu.berkeley.chukwa_xtrace.XtrAdaptor</strong> (available in contrib)
+
+<li><strong>edu.berkeley.chukwa_xtrace.XtrAdaptor</strong> (available in <code>contrib</code>)
  Takes an <a href="http://www.x-trace.net/wiki/doku.php">Xtrace</a> ReportSource
- classname [without package] as mandatory argument, and no optional parameters.
+ class name [without package] as mandatory argument, and no optional parameters.
  Listens for incoming reports in the same way as that ReportSource would.
 
 <source>add edu.berkeley.chukwa_xtrace.XtrAdaptor Xtrace UdpReportSource 0</source>

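The UDPAdaptor behavior added above (one Chunk per received datagram) can be sketched as a minimal Python round trip. The loopback sender here stands in for whatever process is emitting packets, and the Chukwa Chunk metadata wrapping is omitted:

```python
import socket

def udp_packet_roundtrip(payload):
    """Bind a UDP socket and treat one received datagram as one
    Chunk's worth of data, mirroring what the adaptor does per packet."""
    recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    recv.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
    port = recv.getsockname()[1]
    send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    send.sendto(payload, ("127.0.0.1", port))
    data, _addr = recv.recvfrom(65535)   # one datagram == one Chunk body
    send.close()
    recv.close()
    return data
```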
Modified: hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/collector.xml
URL: http://svn.apache.org/viewvc/hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/collector.xml?rev=924149&r1=924148&r2=924149&view=diff
==============================================================================
--- hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/collector.xml (original)
+++ hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/collector.xml Wed Mar 17 04:28:03 2010
@@ -47,7 +47,7 @@
   	the default port number.
   	</p>
   	 	<source>
-  	  bin/jettyCollector.sh writer=pretend portno=8081
+  	  bin/chukwa collector writer=pretend portno=8081
   	</source>
   	</section>
   	
@@ -109,7 +109,7 @@
 	  	<code>datatype</code> <code>stream name</code> <code>offset</code>, separated by spaces.
 	  	</p>
 	  	<p>
-	  	The filter will be inactivated when the socket is closed.
+	  	The filter will be deactivated when the socket is closed.
 	  	</p>
 
 	  	<source>

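A minimal sketch of parsing the three-field filter request described above (datatype, stream name, offset, separated by spaces). Treating the offset as an integer and splitting on whitespace are assumptions made for illustration:

```python
def parse_filter_line(line):
    """Split a filter request into its three space-separated fields.
    Raises ValueError if the line does not have exactly three fields."""
    datatype, stream, offset = line.split()
    return {"datatype": datatype, "stream": stream, "offset": int(offset)}
```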
Modified: hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/programming.xml
URL: http://svn.apache.org/viewvc/hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/programming.xml?rev=924149&r1=924148&r2=924149&view=diff
==============================================================================
--- hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/programming.xml (original)
+++ hadoop/chukwa/trunk/src/docs/src/documentation/content/xdocs/programming.xml Wed Mar 17 04:28:03 2010
@@ -47,7 +47,7 @@ Chukwa gives you several ways of inspect
 <p>
 It very often happens that you want to retrieve one or more files that have been
 collected with Chukwa. If the total volume of data to be recovered is not too
-great, you can use <code>dump.sh</code>, a command-line tool that does the job.
+great, you can use <code>bin/chukwa dumpArchive</code>, a command-line tool that does the job.
 The <code>dump</code> tool does an in-memory sort of the data, so you'll be 
 constrained by the Java heap size (typically a few hundred MB).
 </p>
@@ -65,11 +65,11 @@ separated by a row of dashes.  
 matches the glob pattern.  Note the use of single quotes to pass glob patterns
 through to the application, preventing the shell from expanding them.</p>
 <source>
-$CHUKWA_HOME/bin/dump.sh 'datatype=.*' 'hdfs://host:9000/chukwa/archive/*.arc'
+$CHUKWA_HOME/bin/chukwa dumpArchive 'datatype=.*' 'hdfs://host:9000/chukwa/archive/*.arc'
 </source>
 
 <p>
-The patterns used by <code>dump.sh</code> are based on normal regular 
+The patterns used by <code>dump</code> are based on normal regular 
 expressions. They are of the form <code>field1=regex&#38;field2=regex</code>.
 That is, they are a sequence of rules, separated by ampersand signs. Each rule
 is of the form <code>metadatafield=regex</code>, where 
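
The rule syntax above can be sketched as a small matcher. The metadata field names used in the tests and the full-match anchoring are assumptions (the tool may use search semantics instead), and regexes that themselves contain '&' are out of scope for this sketch:

```python
import re

def chunk_matches(pattern, metadata):
    """A pattern is a sequence of metadatafield=regex rules joined by
    '&'; a chunk matches only if every rule's regex matches the value
    of that metadata field. Missing fields are treated as empty."""
    for rule in pattern.split("&"):
        field, _, regex = rule.partition("=")
        if not re.fullmatch(regex, metadata.get(field, "")):
            return False
    return True
```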
@@ -107,7 +107,7 @@ You can invoke the tool by running <code
 To specify summarize mode, pass <code>--summarize</code> as the first argument.
 </p>
 <source>
-bin/dumpArchive.sh --summarize 'hdfs://host:9000/chukwa/logs/*.done'
+bin/chukwa dumpArchive --summarize 'hdfs://host:9000/chukwa/logs/*.done'
 </source>
 </section>