Posted to commits@uima.apache.org by sc...@apache.org on 2010/05/06 16:04:11 UTC

svn commit: r941741 [3/4] - in /uima/uimaj/branches/mavenAlign/uima-docbook-tools: ./ src/ src/docbook/ src/docbook/images/ src/docbook/images/tools/ src/docbook/images/tools/tools.annotation_viewer/ src/docbook/images/tools/tools.caseditor/ src/docboo...

Added: uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.cvd.xml
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.cvd.xml?rev=941741&view=auto
==============================================================================
--- uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.cvd.xml (added)
+++ uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.cvd.xml Thu May  6 14:04:08 2010
@@ -0,0 +1,941 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY imgroot "images/tools/tools.cvd/" >
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >  
+%uimaents;
+]>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+<chapter id="ugr.tools.cvd">
+ <title>CAS Visual Debugger</title>
+ <section id="ugr.tools.cvd.introduction">
+ <title>Introduction</title>
+ <para>
+  The CAS Visual Debugger is a tool to run text analysis engines in UIMA
+  and view the results. The tool is implemented as a stand-alone GUI 
+  tool using Java's Swing library.
+ </para>
+ <para>
+  This is a developer's tool.  It is intended to support you in writing
+  text analysis annotators for UIMA (Unstructured Information Management
+  Architecture).  As a development tool, the emphasis is not so much on
+  pretty pictures, but rather on navigability.  It is intended to show
+  you all the information you need, and show it to you quickly (at least
+  on a fast machine ;-).
+ </para>
+ <para>
+  The main purpose of this application is to let you browse all the data
+  that was created when you ran an analysis engine over some text.  The
+  display mimics the access methods you have in the CAS API in terms of
+  indexes, types, feature structures and feature values.
+ </para>
+ <para>
+  As in the CAS, there is special support for annotations.  Clicking on
+  an annotation will select the corresponding text, and conversely, you
+  can display all annotations that cover a given position in the text.
+  This will be explained in more detail in the section on the main
+  display area.
+ </para>
+ <para>
+  As usual, the graphics in this manual are for illustrative purposes
+  and may not look 100% like the actual version of CVD you are running.
+  This depends on your operating system, your version of Java, and a
+  variety of other factors.
+ </para>
+ <section id="ugr.cvd.introduction.running">
+ <title>Running CVD</title>
+ <para>
+  You will usually want to start CVD from the command line, or from Eclipse.  To start CVD from the
+  command line, you minimally need the uima-core and uima-tools jars.  Below is a sample command
+  line for sh and its offspring.
+  <programlisting>java -cp ${UIMA_HOME}/lib/uima-core.jar:${UIMA_HOME}/lib/uima-tools.jar \
+    org.apache.uima.tools.cvd.CVD</programlisting>
+  However, there is no need to type this.  The ${UIMA_HOME}/bin directory contains a cvd.sh and
+  cvd.bat file for Unix/Linux/MacOS and Windows, respectively.
+ </para>
+ <para>
+   In Eclipse, you have a ready-to-use launch configuration available when you have installed the
+   UIMA sample project (see <olink targetdoc="&uima_docs_overview;" 
+   targetptr="ugr.ovv.eclipse_setup.example_code"/>).  Below is a screenshot of the Eclipse Run 
+   dialog with the CVD run configuration selected.
+   <screenshot>
+    <mediaobject>
+     <imageobject>
+      <imagedata scale="85" format="JPG" fileref="&imgroot;eclipse-cvd-launch.jpg"/>
+     </imageobject>
+     <textobject>
+      <phrase>Eclipse run dialog with CVD selected</phrase>
+     </textobject>
+    </mediaobject>
+   </screenshot>
+ </para>
+ </section>
+ 
+ <section id="cvd.introduction.commandline">
+ <title>Command line parameters</title>
+ <para>
+ You can provide some command line parameters to influence the startup behavior of CVD.  For
+ example, if you want to run a certain analysis engine on a certain text over and over again
+ (for debugging, say), you can make CVD load the annotator and text at startup and execute
+ the annotator.  Here's a list of the supported command line options.
+ </para>
+ 
+    <table frame="none" id="cvd.table.commandline">
+    <title>Command line options</title>
+    <tgroup cols="2">
+     <thead>
+      <row>
+       <entry>Option</entry>
+       <entry>Description</entry>
+      </row>
+     </thead>
+     <tbody>
+      <row>
+       <entry>
+        <computeroutput>-text &lt;textFile></computeroutput>
+       </entry>
+       <entry>Loads the text file <computeroutput>&lt;textFile></computeroutput></entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>-desc &lt;descriptorFile></computeroutput>
+       </entry>
+       <entry>Loads the descriptor <computeroutput>&lt;descriptorFile></computeroutput></entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>-exec</computeroutput>
+       </entry>
+       <entry>Runs the pre-loaded annotator; only allowed in conjunction with <computeroutput>-desc</computeroutput> </entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>-datapath &lt;datapath></computeroutput>
+       </entry>
+       <entry>Sets the data path to <computeroutput>&lt;datapath></computeroutput></entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>-ini &lt;iniFile></computeroutput>
+       </entry>
+       <entry>Makes CVD use alternative ini file <computeroutput>&lt;iniFile></computeroutput> (default is ~/annotViewer.pref)</entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>-lookandfeel &lt;lnfClass></computeroutput>
+       </entry>
+       <entry>Uses alternative look-and-feel <computeroutput>&lt;lnfClass></computeroutput></entry>
+      </row>
+      </tbody>
+      </tgroup>
+      </table>
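+ 
+ <para>
+ As an illustration, the options above could be combined with the sample command line
+ from the previous section roughly as sketched below.  The file names
+ <computeroutput>MyAnnotator.xml</computeroutput> and <computeroutput>sample.txt</computeroutput>
+ are hypothetical placeholders, not files that ship with UIMA; substitute your own descriptor and text.
+ <programlisting>java -cp ${UIMA_HOME}/lib/uima-core.jar:${UIMA_HOME}/lib/uima-tools.jar \
+    org.apache.uima.tools.cvd.CVD -desc MyAnnotator.xml -text sample.txt -exec</programlisting>
+ With these options, CVD should load the descriptor and the text at startup and then run the
+ annotator, which is the debugging scenario described above.
+ </para>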
+ 
+ </section>
+ 
+ </section>
+ <section id="cvd.errorHandling">
+  <title>Error Handling</title>
+  <para>
+   On encountering
+   an error, CVD will pop up an error dialog with a short,
+   usually incomprehensible message.  Often, the error message will
+   claim that there is more information available in the log file, and
+   sometimes, this is actually true; so do go and check the log.  You
+   can view the log file by selecting the appropriate item in the
+   &quot;Tools&quot; menu.
+
+   <screenshot>
+    <mediaobject>
+     <imageobject>
+      <imagedata scale="100" format="JPG" fileref="&imgroot;ErrorExample.jpg"/>
+     </imageobject>
+     <textobject>
+      <phrase>Sample error dialog</phrase>
+     </textobject>
+    </mediaobject>
+   </screenshot>
+
+  </para>
+ </section>
+
+ <section id="cvd.preferencesFile">
+  <title>Preferences File</title>
+  <para>
+   The program will attempt to read on startup and save on exit a file
+   called annotViewer.pref in your home directory.  This file contains
+   information about choices you made while running the program:
+   directories (such as where your data files are) and window sizes. 
+   These settings will be used the next time you use the program. There
+   is no user control over this process, but the file format is
+   reasonably transparent, in case you feel like changing it.  Note,
+   however, that the file will be overwritten every time you exit the
+   program.
+  </para>
+  
+  <para>
+  If you use CVD for several projects, it may be convenient to use a different
+  ini file for each project.  You can specify the ini file CVD should use
+  with the <programlisting>-ini &lt;iniFile></programlisting> parameter on the
+  command line.
+  </para>
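+  <para>
+  For example, a per-project ini file might be selected as sketched below.  The path
+  <computeroutput>projectA/cvd.pref</computeroutput> is a hypothetical name chosen for
+  illustration only.
+  <programlisting>java -cp ${UIMA_HOME}/lib/uima-core.jar:${UIMA_HOME}/lib/uima-tools.jar \
+    org.apache.uima.tools.cvd.CVD -ini projectA/cvd.pref</programlisting>
+  Keeping one such file per project means that directories and window sizes saved for one
+  project do not overwrite those of another.
+  </para>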
+ </section>
+
+ <section id="cvd.theMenus">
+  <title>The Menus</title>
+  <para>
+   We give a brief description of the various menus. All menu items come
+   with mnemonics (e.g., Alt-F X will exit the program). In addition,
+   some menu items have their own keyboard accelerators that you can use
+   anywhere in the program. For example, Ctrl-S will save the text
+   you've been editing.
+  </para>
+  <section id="cvd.fileMenu">
+   <title>The File Menu</title>
+   <para>
+    The File menu lets you load, create and save text, load and save
+    color settings, and import and export the XCAS format. Here's a
+    screenshot.
+
+   <screenshot> 
+    <mediaobject>
+      <imageobject>
+       <imagedata scale="100" format="JPG" fileref="&imgroot;FileMenu.jpg"/>
+      </imageobject>
+      <textobject>
+       <phrase>The File menu</phrase>
+      </textobject>
+     </mediaobject>
+    </screenshot>
+   </para>
+
+   <itemizedlist>
+    <para>
+     Below is a list of the menu items, together with an explanation.
+    </para>
+
+    <listitem>
+     <formalpara>
+      <title>New Text...</title>
+      <para>
+       Clears the text area. Text you type is written to an anonymous
+       buffer. You can use &quot;Save Text As...&quot; to save the text
+       you typed to a file. Note: whenever you modify the text, be it
+       through typing, loading a file or using the &quot;New
+       Text...&quot; menu item, previous analysis results will be lost.
+       Since the previous analysis is specific to the text, modifying
+       the text invalidates the analysis.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Open Text File</title>
+      <para>
+       Loads a new text file into the viewer.  The next time you run an
+       analysis engine, it will run on the text you loaded last.  Depending
+       on the annotator you're using, the program may run slowly with very
+       large text files, so you may want to experiment.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Save Text File</title>
+      <para>
+       Saves the currently open text file. If no file is currently
+       loaded (either because you haven't loaded a file, or you've used
+       the &quot;New Text...&quot; menu item), this menu item is
+       disabled (and Ctrl-S will do nothing).
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Save Text As...</title>
+      <para>
+       Save the text to a file of your choosing. This can be an existing
+       file, which is then overwritten, or it can be a new file that
+       you're creating.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Change Code Page</title>
+      <para>
+       Allows you to change the code page that is used to load and save
+       text files. If you're sure the text you're loading is in ASCII or
+       one of the 8-bit extensions such as ISO-8859-1 (ISO Latin1),
+       there is probably nothing you need to do. Just load the text and
+       look at the display. If you see no funny characters or square
+       boxes, chances are your selected code page is compatible with
+       your text file.
+       
+       Note that the code page setting is also in effect when you save
+       files. You can observe the effects with a hex editor or by just
+       looking at the file size. For example, if you save the default
+       text
+       <computeroutput>This is where the text goes.</computeroutput>
+       to a file on Windows using the default code page, the size of the
+       file will be 28 bytes. If you now change the code page to UTF-16
+       and save the file again, the file size will be 58 bytes: two
+       bytes for each character, plus two bytes for the byte-order mark.
+       Now switch the code page back to the default Windows code page
+       and reload the UTF-16 file to see the difference in the editor.
+       
+       CVD will display all code pages that are available in the JVM
+       you're running it on.  The first code page in the list is the
+       default code page of your system.  This is also CVD's default if
+       you don't make a specific choice.
+       
+       Your code page selection will be remembered in CVD's ini file.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Load Color Settings</title>
+      <para>
+       Load previously saved color settings from a file (see
+       Tools/Customize Annotation Display).  It is highly recommended
+       that you only load automatically generated files.  Strange things
+       may happen if you try to load the wrong file format. On startup,
+       the program attempts to load the last color settings file that
+       you loaded or saved during a previous session. If you intend to
+       use the same color settings as the last time you ran the program,
+       there is therefore no need to manually load a color settings
+       file.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Save Color Settings</title>
+      <para>
+       Save your customized color settings (see Tools/Customize
+       Annotation Display).  The file is a Java properties file, and as
+       such, reasonably transparent.  What is not transparent is the
+       encoding of the colors (integer encoding of 24-bit RGB values),
+       so changing the file by hand is not really recommended.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Read Type System File</title>
+      <para>
+       Load a type system file. This allows you to load an XCAS file
+       without having to have access to the corresponding annotator.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Write Type System File</title>
+      <para>
+       Create a type system file from the currently loaded type
+       definitions. In addition, you can save the current CAS as an XCAS
+       file (see below). This allows you to later load the type system
+       and XCAS to view the CAS without having to rerun the annotator.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Read XMI CAS File</title>
+      <para>
+       Read an XMI CAS file. Important: XMI CAS is a serialization format that
+       serializes a CAS without type system and index information. It is
+       therefore impossible to read in a stand-alone XMI CAS file. XMI CAS
+       files can only be interpreted in the context of an existing type
+       system. Consequently, you need to first load the Analysis Engine that was used to
+       create the XMI file, to be able to load that XMI file.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Write XMI CAS File</title>
+      <para>
+       Writes the current analysis out as an XMI CAS file.
+      </para>
+     </formalpara>
+    </listitem>
+
+
+    <listitem>
+     <formalpara>
+      <title>Read XCAS File</title>
+      <para>
+       Read an XCAS file. Important: XCAS is a serialization format that
+       serializes a CAS without type system and index information. It is
+       therefore impossible to read in a stand-alone XCAS file. XCAS
+       files can only be interpreted in the context of an existing type
+       system. Consequently, you need to load the Analysis Engine that was used to
+       create the XCAS file to be able to load it. Loading an XCAS file
+       without loading the Analysis Engine may produce strange errors. You may get
+       syntax errors on loading the XCAS file, or worse, everything may
+       appear to go smoothly but in reality your CAS may be corrupted.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Write XCAS File</title>
+      <para>
+       Writes the current analysis out as an XCAS file.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Exit</title>
+      <para>Exits the program. Your preferences will be saved.</para>
+     </formalpara>
+    </listitem>
+
+   </itemizedlist>
+
+  </section>
+
+  <section id="cvd.editMenu">
+   <title>The Edit Menu</title>
+   <para>
+
+   <screenshot>
+     <mediaobject>
+      <imageobject>  <!-- was 2.15in -->
+       <imagedata scale="100" format="JPG" fileref="&imgroot;EditMenu.jpg" />
+      </imageobject>
+      <textobject>
+       <phrase>The Edit menu</phrase>
+      </textobject>
+     </mediaobject>
+    </screenshot>
+
+    The &quot;Edit&quot; menu provides a standard text editing menu with
+    Cut, Copy and Paste, as well as unlimited Undo.
+   </para>
+   <para>
+    Note that standard keyboard accelerators Ctrl-X, Ctrl-C, Ctrl-V and
+    Ctrl-Z can be used for Cut, Copy, Paste and Undo, respectively. The
+    text area supports other standard keyboard operations such as
+    navigation with HOME, Ctrl-HOME, etc., as well as marking text with
+    Shift-&lt;ArrowKey&gt;.
+   </para>
+  </section>
+
+  <section id="cvd.runMenu">
+   <title>The Run Menu</title>
+   <para>
+
+    <screenshot>
+     <mediaobject>
+      <imageobject> <!-- was width="2.225in" -->
+       <imagedata scale="100" format="JPG" fileref="&imgroot;RunMenu.jpg" />
+      </imageobject>
+      <textobject>
+       <phrase>The Run menu</phrase>
+      </textobject>
+     </mediaobject>
+     </screenshot>
+
+     In the Run menu, you can load and run text analysis engines.
+   </para>
+
+   <itemizedlist>
+
+    <listitem>
+     <formalpara>
+      <title>Load AE</title>
+      <para>
+       Loads and initializes a text analysis engine. Choosing this menu
+       item will display a file open dialog where you should choose an
+       XML descriptor of a Text Analysis Engine to process the current
+       text.  Even if the analysis engine runs fast, this will take a
+       while, since there is a lot of setup work to do when a new TAE is
+       created.  So be patient.
+
+       When you develop a new annotator, you will often need to
+       recompile your code. CVD will not reload your annotator code.
+       When you recompile your code, you need to terminate the GUI and
+       restart it. If you only make changes to the XML descriptor, you
+       don't need to restart the GUI. Simply reload the XML file.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Run AE</title>
+      <para>
+       Before you have (successfully) loaded a TAE, this menu item will
+       be disabled. After you have loaded a TAE, it will be enabled, and
+       the name changes according to the name of the TAE you have
+       loaded. For example, if you've loaded &quot;The World's Fastest
+       Parser&quot;, you will have a menu item called &quot;Run The
+       World's Fastest Parser&quot;. When you choose the item, the TAE
+       is run on whatever text you have currently loaded.
+
+       After a TAE has run successfully, the index window in the upper
+       left-hand corner of the screen should be updated and show the
+       indexes that were created by this run.  We will have more to say
+       about indexes and what to do with them later.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Run AE on CAS</title>
+      <para>
+       This allows you to run an analysis engine on the current CAS.
+       This is useful if you have loaded a CAS from an XCAS file, and
+       would like to run further analysis on it.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Run collectionProcessComplete</title>
+      <para>
+       When you select this item, the analysis engine's 
+       collectionProcessComplete() method is called.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Performance Report</title>
+      <para>
+       After you've run your analysis, you can view a performance report.  It will show
+       you where the time went: which component used how much of the processing time.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Recently used</title>
+      <para>
+       Collects a list of recently used analysis engines as a short-cut
+       for loading.
+      </para>
+     </formalpara>
+    </listitem>
+
+    <listitem>
+     <formalpara>
+      <title>Language</title>
+      <para>
+       Some annotators do language specific processing. For example, if
+       you run lexical analysis, the results may be quite different
+       depending on what the analysis engine thinks the language of the
+       document is. With this menu item, you can manually set the
+       document language. Alternatively, you can use an automatic
+       language identification annotator. If the analysis engines you're
+       working with are language agnostic, there is no need to set the
+       language.
+      </para>
+     </formalpara>
+    </listitem>
+
+   </itemizedlist>
+  </section>
+
+  <section id="cvd.toolsMenu">
+   <title>The Tools Menu</title>
+   <para>
+    The Tools menu contains some assorted utilities, such as the log
+    file viewer. Here you can also set the log level for UIMA.  
+    A more detailed description of some of the menu items
+    follows below.
+   </para>
+   <section id="cvd.viewTypeSystem">
+    <title>View Type System</title>
+    <para>
+
+     <screenshot>
+       <mediaobject>
+        <imageobject>  
+         <imagedata scale="100" format="JPG" fileref="&imgroot;TypeSystemViewer.jpg" />
+        </imageobject>
+       </mediaobject>
+      </screenshot>
+
+     Brings up a new window that displays the type system. This menu
+     item is disabled until the first time you have run an analysis
+     engine, since there is no type system to display until then. An
+     example is shown above.
+    </para>
+    <para>
+     You can view the inheritance tree on the left by expanding and
+     collapsing nodes.  When you select a type, the features defined on
+     that type are displayed in the table on the right.  The feature
+     table has three columns.  The first gives the name of the feature,
+     the second one the type of the feature (i.e., what values it
+     takes), and the third column displays the highest type this feature
+     is defined on.  In this example, the features &quot;begin&quot; and
+     &quot;end&quot; are inherited from the built-in annotation type.
+    </para>
+    <para>
+     In the options menu, you can configure if you want to see inherited
+     features or not (not yet implemented).
+    </para>
+   </section>
+
+   <section id="cvd.showSelectedAnnotations">
+    <title>Show Selected Annotations</title>
+    <para>
+     <figure id="AnnotationViewerFigure">
+      <title>
+       Annotations produced by a statistical named entity tagger
+      </title>
+      <mediaobject>
+       <imageobject> <!-- was width="5.82in" -->
+        <imagedata scale="100" format="JPG" fileref="&imgroot;AnnotationViewer.jpg" />
+       </imageobject>
+      </mediaobject>
+     </figure>
+    </para>
+
+    <para>
+     To enable this menu item, you must have run an analysis engine and
+     selected the &quot;AnnotationIndex&quot; or one of its subnodes in the
+     upper left-hand corner of the screen.  It will bring up a new text
+     window with all selected annotations marked up in the text. 
+    </para>
+    <para>
+     <xref linkend="AnnotationViewerFigure" />
+     shows the results of applying a statistical named entity tagger to
+     a newspaper article.  Some annotation colors have been customized:
+     countries are in reverse video, organizations have a turquoise
+     background, person names are green, and occupations have a maroon
+     background.  The default background color is yellow.  This color is
+     also used if there is more than one annotation spanning a certain
+     text.  Clearly, this display is only useful if you don't have any
+     overlapping annotations, or at least not too many.
+    </para>
+    <para>
+     This menu item is also available as a context menu in the Index
+     Tree area of the main window. To use it, select the annotation
+     index or one of its subnodes, right-click to bring up a popup menu,
+     and select the only item in the popup menu. The popup menu is
+     actually a better way to invoke the annotation display, since it
+     changes according to the selection in the Index Tree area, and will
+     tell you if what you've selected can be displayed or not.
+    </para>
+
+
+   </section>
+
+  </section>
+
+ </section>
+
+ <section id="cvd.mainDisplayArea">
+  <title>The Main Display Area</title>
+  <para>
+   The main display area has three sub-areas.  In the upper left-hand
+   corner is the
+   <emphasis role="bold">index display</emphasis>, which shows the indexes that were defined in the 
+   AE, as well as
+   the types of the indexes and their subtypes.  In the lower left-hand
+   corner, the content of indexes and sub-indexes is displayed 
+   (<emphasis role="bold">FS display</emphasis>).  Clicking on any node in the index display will 
+   show the
+   corresponding feature structures in the FS display.  You can explore
+   those structures by expanding the tree nodes.  Clicking on a
+   node that represents an annotation will cause the
+   corresponding text span to be marked in the
+   <emphasis role="bold">text display</emphasis>.
+  </para>
+  <para>
+   <figure id="Main1Figure">
+    <title>State of GUI after running an analysis engine</title>
+    <mediaobject>
+     <imageobject>
+      <imagedata scale="100" format="JPG" fileref="&imgroot;Main1.jpg" />
+     </imageobject>
+    </mediaobject>
+   </figure>
+  </para>
+  <para>
+   <xref linkend="Main1Figure"></xref>
+   shows the state after running the UIMA_Analysis_Example.xml aggregate from the
+   uimaj-examples project.  There are two indexes in the index display, and the
+   annotation index has been selected.  Note that the number of
+   structures in an index is displayed in square brackets after the
+   index name.
+  </para>
+  <para>
+   Since displaying thousands of sister nodes is both confusing and
+   slow, nodes are grouped in powers of 10.  As soon as there are no
+   more than 100 sister nodes, they are displayed next to each other.
+  </para>
+  <para>
+   In our example, a name annotation has been selected, and the
+   corresponding token text is highlighted in the text area.  We have
+   also expanded the token node to display its structure (not much to see in this simple example).
+  </para>
+  <para>
+   In <xref linkend="Main1Figure"/>, we selected an annotation in the FS display to find the
+   corresponding text.  We can also do the reverse and find out what
+   annotations cover a certain point in the text.  Let's go back to the
+   name recognizer for an example.
+  </para>
+  <para>
+   <figure id="Main2Figure">
+    <title>
+     Finding annotations for a specific location in the text
+    </title>
+    <mediaobject>
+     <imageobject>  <!-- next width was 6.39in -->
+      <imagedata scale="100" format="JPG" fileref="&imgroot;Main2.jpg" />
+     </imageobject>
+    </mediaobject>
+   </figure>
+  </para>
+  <para>
+   We would like to know if Michael Baessler has been
+   recognized as a name.  So we position the cursor somewhere in the
+   corresponding text span, then right-click to bring up the context menu
+   telling us which annotations exist at this point. An example is shown
+   in
+   <xref linkend="Main2Figure" />.
+  </para>
+  <para>
+   <figure id="Main3Figure">
+    <title>
+     Selecting an annotation from the context menu will highlight that
+     annotation in the FS display
+    </title>
+    <mediaobject>
+     <imageobject> <!-- width was 6.39in -->
+      <imagedata scale="100" format="JPG" fileref="&imgroot;Main3.jpg" />
+     </imageobject>
+    </mediaobject>
+   </figure>
+  </para>
+
+  <para>
+   At this point (<xref linkend="Main2Figure" />), 
+   we only know that somewhere around the text cursor position (not
+   visible in the picture), we discovered a name.  When we select the corresponding entry in the
+   context menu, the name annotation is selected in the FS display, and its covered text is
+   highlighted.
+   <xref linkend="Main3Figure" /> shows the display after 
+   the name node has been selected in
+   the popup menu.
+  </para>
+  <para>
+   We're glad to see that, indeed, Michael Baessler is
+   considered to be a name.  Note that in the FS display, the
+   corresponding annotation node has been selected, and the tree has
+   been expanded to make the node visible.
+  </para>
+  <para>
+   Note that the annotations displayed in the popup menu come from the
+   annotations currently displayed in the FS display.  If you didn't
+   select the annotation index or one of its sub-nodes, no annotations
+   can be displayed and the popup menu will be empty.
+  </para>
+
+  <section id="cvd.statusBar">
+   <title>The Status Bar</title>
+   <para>
+    At the bottom of the screen, some useful information is displayed in
+    the
+    <emphasis role="bold">status bar</emphasis>. The left-most area shows the most recent major event, with the
+    time when the event terminated in square brackets. The next area
+    shows the file name of the currently loaded XML descriptor. This
+    area supports a tool tip that will show the full path to the file.
+    The right-most area shows the current cursor position, or the extent
+    of the selection, if a portion of the text has been selected. The
+    numbers correspond to the character offsets that are used for
+    annotations.
+   </para>
+  </section>
+
+  <section id="cvd.keyboardNavigation">
+   <title>Keyboard Navigation and Shortcuts</title>
+   <para>
+    The GUI can be completely navigated and operated through the
+    keyboard. All menus and menu items support keyboard mnemonics, and
+    some common operations are accessible through keyboard accelerators.
+   </para>
+   <para>
+    You can move the focus between the three main areas using
+    <computeroutput>Tab</computeroutput>
+    (clockwise) and
+    <computeroutput>Shift-Tab</computeroutput>
+    (counterclockwise). When the focus is on the text area, the
+    <computeroutput>Tab</computeroutput>
+    key will insert the corresponding character into the text, so you
+    will need to use
+    <computeroutput>Ctrl-Tab</computeroutput>
+    and
+    <computeroutput>Ctrl-Shift-Tab</computeroutput>
+    instead. Alternatively, you can use the following key bindings to
+    jump directly to one of the areas:
+    <computeroutput>Ctrl-T</computeroutput>
+    to focus the text area,
+    <computeroutput>Ctrl-I</computeroutput>
+    for the index repository frame and
+    <computeroutput>Ctrl-F</computeroutput>
+    for the feature structure area.
+   </para>
+   <para>
+    Some additional keyboard shortcuts are available only in the text
+    area, such as
+    <computeroutput>Ctrl-X</computeroutput>
+    for Cut,
+    <computeroutput>Ctrl-C</computeroutput>
+    for Copy,
+    <computeroutput>Ctrl-V</computeroutput>
+    for Paste and
+    <computeroutput>Ctrl-Z</computeroutput>
+    for Undo. The context menu in the text area can be invoked through the
+    <computeroutput>Alt-Enter</computeroutput>
+    shortcut. Text can be selected using the arrow keys while holding
+    the
+    <computeroutput>Shift</computeroutput>
+    key.
+   </para>
+   <para>
+    The following table shows the supported keyboard shortcuts.
+   </para>
+   <table frame="none" id="cvd.table.keyboardShortcuts">
+    <title>Keyboard shortcuts</title>
+    <tgroup cols="3">
+     <thead>
+      <row>
+       <entry>Shortcut</entry>
+       <entry>Action</entry>
+       <entry>Scope</entry>
+      </row>
+     </thead>
+     <tbody>
+      <row>
+       <entry>
+        <computeroutput>Ctrl-O</computeroutput>
+       </entry>
+       <entry>Open text file</entry>
+       <entry>Global</entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>Ctrl-S</computeroutput>
+       </entry>
+       <entry>Save text file</entry>
+       <entry>Global</entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>Ctrl-L</computeroutput>
+       </entry>
+       <entry>Load AE descriptor</entry>
+       <entry>Global</entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>Ctrl-R</computeroutput>
+       </entry>
+       <entry>Run current AE</entry>
+       <entry>Global</entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>Ctrl-I</computeroutput>
+       </entry>
+       <entry>Switch focus to index repository</entry>
+       <entry>Global</entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>Ctrl-T</computeroutput>
+       </entry>
+       <entry>Switch focus to text area</entry>
+       <entry>Global</entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>Ctrl-F</computeroutput>
+       </entry>
+       <entry>Switch focus to FS area</entry>
+       <entry>Global</entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>Ctrl-X</computeroutput>
+       </entry>
+       <entry>Cut selection</entry>
+       <entry>Text</entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>Ctrl-C</computeroutput>
+       </entry>
+       <entry>Copy selection</entry>
+       <entry>Text</entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>Ctrl-V</computeroutput>
+       </entry>
+       <entry>Paste clipboard contents</entry>
+       <entry>Text</entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>Ctrl-Z</computeroutput>
+       </entry>
+       <entry>Undo</entry>
+       <entry>Text</entry>
+      </row>
+      <row>
+       <entry>
+        <computeroutput>Alt-Enter</computeroutput>
+       </entry>
+       <entry>Show context menu</entry>
+       <entry>Text</entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+  </section>
+
+ </section>
+</chapter>
\ No newline at end of file

Added: uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.doc_analyzer.xml
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.doc_analyzer.xml?rev=941741&view=auto
==============================================================================
--- uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.doc_analyzer.xml (added)
+++ uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.doc_analyzer.xml Thu May  6 14:04:08 2010
@@ -0,0 +1,275 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY imgroot "images/tools/tools.doc_analyzer/" >
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >  
+%uimaents;
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<chapter id="ugr.tools.doc_analyzer">
+  <title>Document Analyzer User&apos;s Guide</title>
+ 
+
+<para>The <emphasis>Document Analyzer</emphasis> is a tool provided by the
+UIMA SDK for testing annotators and AEs. It reads text files from your disk, processes them using an AE, and
+allows you to view the results.  The
+Document Analyzer is designed to work with text files and cannot be used with
+Analysis Engines that process other types of data.</para>
+
+<para>For an introduction to developing annotators and Analysis
+Engines, read 
+ <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.aae"/>.  
+  This chapter is a user&apos;s guide for using the Document Analyzer tool, and
+does not describe the process of developing annotators and Analysis Engines.</para>
+
+<section id="ugr.tools.doc_analyzer.starting">
+  <title>Starting the Document Analyzer</title>
+  
+<para>To run the Document Analyzer, execute the <literal>documentAnalyzer</literal> script that is in the <literal>bin</literal> directory of your UIMA SDK installation, or, if you
+are using the example Eclipse project, execute the <quote>UIMA Document Analyzer</quote>
+run configuration supplied with that project.</para>
+
+<para>Note that if you&apos;re planning to run an Analysis Engine
+other than one of the examples included in the UIMA SDK, you&apos;ll first need to
+update your CLASSPATH environment variable to include the classes needed by
+that Analysis Engine.</para>
+
+<para>When you first run the Document Analyzer, you should see a
+screen that looks like this:
+  
+  <screenshot>
+    <mediaobject>
+      <imageobject>
+        <imagedata width="5.8in" format="JPG" fileref="&imgroot;image002.jpg"/>
+      </imageobject>
+      <textobject><phrase>Document Analyzer GUI</phrase>
+      </textobject>
+    </mediaobject>
+  </screenshot></para>
+
+
+  </section>
+  
+  <section id="ugr.tools.doc_analyzer.running_an_ae">
+    <title>Running an AE</title>
+
+
+
+<para>To run an AE, you must first configure the six fields on
+the main screen of the Document Analyzer.</para>
+
+<para><emphasis role="bold">Input Directory:</emphasis>  
+  Browse to or type the path of a directory containing text files that you
+want to analyze.  Some sample documents
+are provided in the UIMA SDK under the <literal>examples/data</literal>
+directory.</para>
+
+<para><emphasis role="bold">Output Directory:</emphasis> Browse to or type the path of a directory where you want
+  output to be written. (As we&apos;ll see later, you won&apos;t normally need to look directly at these files, but the
+  Document Analyzer needs to know where to write them.) The files written to this directory will be an XML
+  representation of the analyzed documents. If this directory doesn&apos;t exist, it will be created. If the
+  directory exists, any files in it will be deleted (but the tool will ask you to confirm this before doing so). If you
+  leave this field blank, your AE will be run but no output will be generated.</para>
+
+<para><emphasis role="bold">Location of AE XML Descriptor:</emphasis>  
+  Browse to or type the path of the descriptor
+for the AE that you want to run.  There
+are some example descriptors provided in the UIMA SDK under the <literal>examples/descriptors/analysis_engine</literal> and <literal>examples/descriptors/tutorial</literal> directories.</para>
+
+<para><emphasis role="bold">XML Tag containing Text:</emphasis>  
+  This is an optional feature.  If you enter a value here, it specifies the
+name of an XML tag, expected to be found within the input documents, that
+contains the text to be analyzed.  For
+example, the value <literal>TEXT</literal> would cause the AE to only
+analyze the portion of the document enclosed within &lt;TEXT&gt;...&lt;/TEXT&gt;
+tags.  Also, any XML tags occurring within that text will be removed prior to analysis.</para>
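+
+<para>As an illustration only, a hypothetical input file might look like the
+following (the tag names <literal>DOC</literal>, <literal>TITLE</literal> and
+<literal>B</literal>, and the content itself, are invented for this sketch).  With this
+field set to <literal>TEXT</literal>, the behavior described above suggests that only the
+content of the <literal>TEXT</literal> element would be passed to the AE, with the nested
+<literal>B</literal> tags removed.</para>
+
+<programlisting>&lt;DOC&gt;
+  &lt;TITLE&gt;Weekly sync&lt;/TITLE&gt;
+  &lt;TEXT&gt;The meeting is in &lt;B&gt;Room 101&lt;/B&gt; at 10:00.&lt;/TEXT&gt;
+&lt;/DOC&gt;</programlisting>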
+
+<para><emphasis role="bold">Language:</emphasis>
+  Specify
+the language in which the documents are written.  Some Analysis Engines, but not all, require
+that this be set correctly in order to do their analysis.  You can select a value from the drop-down
+list or type your own.  The value entered
+here must be an ISO language identifier, the list of which can be found here: 
+  <ulink url="http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt"/>.
+</para>
+
+<para><emphasis role="bold">Character Encoding:</emphasis>  
+  The character encoding of the input files.  The default, UTF-8, also works fine for ASCII
+text files.  If you have a different
+encoding, enter it here.  For more
+information on character sets and their names, see the Javadocs for 
+  <literal>java.nio.charset.Charset</literal>.</para>
+
+<para>Once you&apos;ve filled in the appropriate values, press the
+<quote>Run</quote> button.</para>
+
+<para>If an error occurs, a dialog will appear with the error
+message.  (A stack trace will also be
+printed to the console, which may help you if the error was generated by your
+own annotator code.)  Otherwise, an
+<quote>Analysis Results</quote> window will appear.</para>
+
+
+
+</section>
+  
+  <section id="ugr.tools.doc_analyzer.viewing_results">
+    <title>Viewing the Analysis Results</title>
+
+<para>After a successful analysis, the <quote>Analysis
+Results</quote> window will appear.
+  
+  <screenshot>
+    <mediaobject>
+      <imageobject>
+        <imagedata width="4.2in" format="JPG" fileref="&imgroot;image004.jpg"/>
+      </imageobject>
+      <textobject><phrase>Analysis Results Window</phrase></textobject>
+    </mediaobject>
+  </screenshot></para>
+
+
+<para>The <quote>Results Display Format</quote> options at the
+bottom of this window show the different ways you can view your analysis &ndash; the
+Java Viewer, Java Viewer (JV) with User Colors, HTML, and XML.  
+  The default, Java Viewer, is recommended.</para>
+
+<para>Once you have selected your desired Results Display
+Format, you can double-click on one of the files in the list to view the
+analysis done on that file.</para>
+
+<para>For the Java viewer, the results display looks like this
+(for the AE descriptor <literal>examples/descriptors/tutorial/ex4/MeetingDetectorAE.xml</literal>):
+
+  <screenshot>
+    <mediaobject>
+      <imageobject>
+        <imagedata width="5.8in" format="JPG" fileref="&imgroot;image006.jpg"/>
+      </imageobject>
+      <textobject><phrase>Analysis Results Window showing results from tutorial example 4</phrase></textobject>
+    </mediaobject>
+  </screenshot></para>
+
+
+<para>You can click the mouse on one of the highlighted
+annotations to see a list of all its features in the frame on the right.</para>
+
+<para>If there are multiple annotation types in the view, you
+can control which ones are selected by using the checkboxes in the legend, the
+Select All button, or the Deselect All button.</para>
+
+<para>If you are viewing a CAS that contains multiple subjects
+of analysis, then a selector will appear at the bottom right of the Annotation
+Viewer window.  This will allow you to
+choose the Sofa that you wish to view.  Note that only text Sofas containing a non-null document are available
+for viewing.</para>
+
+</section>
+  
+  <section id="ugr.tools.doc_analyzer.configuring">
+    <title>Configuring the Annotation Viewer</title>
+
+<para>The <quote>JV User Colors</quote> and the HTML viewer allow
+you to specify exactly which colors are used to display each of your annotation
+types.  For the Java Viewer, you can also
+specify which types should be initially selected, and you can hide types
+entirely.</para>
+
+<para>To configure the viewer, click the <quote>Edit Style
+Map</quote> button on the <quote>Analysis Results</quote> dialog.  
+  You should see a dialog that looks like this:
+
+  
+  <screenshot>
+    <mediaobject>
+      <imageobject>
+        <imagedata width="5.8in" format="JPG" fileref="&imgroot;image008.jpg"/>
+      </imageobject>
+      <textobject><phrase>Configuring the Analysis Results Viewer</phrase></textobject>
+    </mediaobject>
+  </screenshot></para>
+
+<para>To change the color assigned to a type, simply click on
+the colored cell in the <quote>Background</quote> column for the type you wish to
+edit.  This will display a dialog that
+allows you to choose the color.  For the
+HTML viewer only, you can also change the foreground color.</para>
+
+<para>If you would like the type to be initially checked
+(selected) in the legend when the viewer is first launched, check the box in
+the <quote>Checked</quote> column.  If you
+would like the type to never be shown in the viewer, check the box in the
+<quote>Hidden</quote> column.  These
+settings only affect the Java Viewer, not the HTML viewer.</para>
+
+<para>When you are done editing, click the <quote>Save</quote>
+button.  This will save your choices to a
+file in the same directory as your AE descriptor.  From now on, when you view analysis results
+produced by this AE using the <quote>JV User Colors</quote> or <quote>HTML</quote>
+options, the viewer will be configured as you have specified.</para>
+
+</section>
+
+<section id="ugr.tools.doc_analyzer.interactive_mode">
+  <title>Interactive Mode</title>
+  
+
+<para>Interactive Mode allows you to analyze text that you type
+or cut-and-paste into the tool, rather than requiring that the documents be
+stored as files.</para>
+
+<para>In the main Document Analyzer window, you can invoke
+Interactive Mode by clicking the <quote>Interactive</quote> button instead of the
+<quote>Run</quote> button.  This will
+display a dialog that looks like this:
+  
+   
+  <screenshot>
+    <mediaobject>
+      <imageobject>
+        <imagedata width="5.5in" format="JPG" fileref="&imgroot;image010.jpg"/>
+      </imageobject>
+      <textobject><phrase>Invoking Interactive Mode</phrase></textobject>
+    </mediaobject>
+  </screenshot></para> 
+
+<para>You can type or cut-and-paste your text into this window,
+then choose your Results Display Format and click the <quote>Analyze</quote>
+button.  Your AE will be run on the text
+that you supplied and the results will be displayed as usual.</para>
+
+
+</section>
+  
+  <section id="ugr.tools.doc_analyzer.view_mode">
+    <title>View Mode</title>
+    
+<para>If you have previously run an AE and saved its analysis
+results, you can use the Document Analyzer&apos;s View mode to view those results,
+without re-running your analysis.  To do
+this, on the main Document Analyzer window simply select the location of your
+analyzed documents in the <quote>Output Directory</quote> field and click the
+<quote>View</quote> button.  You can then
+view your analysis results as described in Section 
+ <xref linkend="ugr.tools.doc_analyzer.viewing_results"/>.</para>
+
+</section>
+  </chapter>
+

Added: uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.jcasgen.xml
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.jcasgen.xml?rev=941741&view=auto
==============================================================================
--- uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.jcasgen.xml (added)
+++ uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.jcasgen.xml Thu May  6 14:04:08 2010
@@ -0,0 +1,181 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY imgroot "images/tools/tools.jcasgen/" >
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >  
+%uimaents;
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<chapter id="ugr.tools.jcasgen">
+  <title>JCasGen User&apos;s Guide</title>
+  
+  <para>JCasGen reads a descriptor for an application (either an Analysis Engine Descriptor, 
+    or a Type System Descriptor), creates the merged type system
+    specification by merging all the type system information from all the components
+    referred to in the descriptor, and then uses this merged type system to create Java source
+    files for classes that enable JCas access to the CAS. Java classes are not produced for the
+    built-in types, since these classes are already provided by the UIMA SDK.  (An exception is
+    the built-in type <literal>uima.tcas.DocumentAnnotation</literal>; see the warning below.) </para>
+  
+  <warning><para>If the components comprising the input to the type merging process 
+    have different definitions for the same type name,
+    JCasGen will show a warning, and in some environments may offer to abort the operation.
+    If you continue past this warning, 
+    JCasGen will produce correct Java source files representing the merged types 
+   (that is, the
+    type definition containing all of the features defined on that type by all of the
+    components).  It is recommended that you do not use this capability (of having 
+    two different definitions for the same type name, with different feature sets) since it can make it 
+    difficult to combine/package your annotator with others. See <olink
+      targetdoc="&uima_docs_ref;"
+      targetptr="ugr.ref.jcas.merging_types_from_other_specs"/> for more information.
+  </para>
+  
+  <para>Also note that if your type system declares a custom version of the 
+    <literal>uima.tcas.DocumentAnnotation</literal> 
+    built-in type, then JCasGen will generate a Java source file for it.  If you do this, you need to be
+    aware of the issues discussed in 
+    <olink
+       targetdoc="&uima_docs_ref;"
+       targetptr="ugr.ref.jcas.documentannotation_issues"/>.</para></warning>
+  
+  <para>There are several versions of JCasGen. The basic version reads an XML descriptor
+    which contains a type system descriptor, and generates the corresponding Java Class
+    Models for those types. Variants exist for the Eclipse environment that allow merging the
+    newly generated Java source code with previously augmented versions; see <olink
+      targetdoc="&uima_docs_ref;"
+      targetptr="ugr.ref.jcas.augmenting_generated_code"/> for a discussion of how the
+    Java Class Models can be augmented by adding additional methods and fields.</para>
+  
+  <para>Input to JCasGen needs to be mostly self-contained. In particular, any type that
+    depends on a user-defined supertype must have that supertype defined, if the
+    supertype is <literal>uima.tcas.Annotation</literal> or a subtype of it. Any features
+    whose range types are subtypes of <literal>uima.cas.String</literal> must have those subtypes
+    included. If these requirements are not met, a warning message is given stating that the resulting
+    generation may be inaccurate.</para>
+  
+  <para>JCasGen is typically invoked automatically when using the Component Descriptor
+    Editor (see <olink targetdoc="&uima_docs_tools;"
+      targetptr="ugr.tools.cde.auto_jcasgen"/>), but can also be run using one of the shell
+    scripts described below. These scripts take 0, 1, or 2 arguments. The first argument is the location of
+    the file containing the input XML descriptor. The second argument specifies where the
+    generated Java source code should go. If it isn&apos;t given, JCasGen writes its
+    output into a subfolder called JCas (or sometimes JCasNew; see below) under the
+    path of the first argument.</para>
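+  
+  <para>As a minimal sketch of what the input XML descriptor might contain, the following
+    type system descriptor declares one annotation type with a single String feature. The
+    descriptor name <literal>ExampleTypeSystem</literal>, the type name
+    <literal>org.example.Meeting</literal> and the feature name <literal>room</literal>
+    are hypothetical and chosen only for illustration; for such an input, JCasGen would be
+    expected to generate Java source in a package directory matching
+    <literal>org.example</literal>.</para>
+  
+  <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
+&lt;typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier"&gt;
+  &lt;name&gt;ExampleTypeSystem&lt;/name&gt;
+  &lt;types&gt;
+    &lt;typeDescription&gt;
+      &lt;!-- hypothetical user-defined annotation type --&gt;
+      &lt;name&gt;org.example.Meeting&lt;/name&gt;
+      &lt;supertypeName&gt;uima.tcas.Annotation&lt;/supertypeName&gt;
+      &lt;features&gt;
+        &lt;featureDescription&gt;
+          &lt;name&gt;room&lt;/name&gt;
+          &lt;rangeTypeName&gt;uima.cas.String&lt;/rangeTypeName&gt;
+        &lt;/featureDescription&gt;
+      &lt;/features&gt;
+    &lt;/typeDescription&gt;
+  &lt;/types&gt;
+&lt;/typeSystemDescription&gt;</programlisting>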
+  
+  <para>If no arguments are given to JCasGen, then it launches a GUI to interact with the user
+    and ask for the same input. The GUI will remember the arguments you previously used.
+    Here&apos;s what it looks like:
+    
+    
+    <screenshot>
+      <mediaobject>
+        <imageobject>
+          <imagedata width="5.8in" format="JPG" fileref="&imgroot;image002.jpg"/>
+        </imageobject>
+        <textobject><phrase>JCasGen tool showing fields for input arguments</phrase>
+        </textobject>
+      </mediaobject>
+    </screenshot></para>
+  
+  <para>When running with automatic merging of the generated Java source with previously
+    augmented versions, the output location is where the merge function obtains the source
+    for the merge operation.</para>
+  
+  <para>As is customary for Java, the generated class source files are placed in the
+    appropriate subdirectory structure according to Java conventions that correspond to
+    the package (name space) name.</para>
+  
+  <para>The Java classes must be compiled and the resulting class files included in the class
+    path of your application; you make these classes available for other annotator writers
+    using your types, perhaps packaged as an xxx.jar file. If the xxx.jar file is made to
+    contain only the Java Class Models for the CAS types, it can be reused by any users of these
+    types.</para>
+  
+  <section id="ugr.tools.jcasgen.running_without_eclipse">
+    <title>Running stand-alone without Eclipse</title>
+    
+    <para>Automatic merging of the generated Java source with previous versions is
+      available only when running with Eclipse. When JCasGen is run without Eclipse,
+      no merging is done, and
+      the output is put in a folder called <quote>JCasNew</quote> unless overridden by
+      specifying a second argument.</para>
+    
+    <para>The distribution includes a shell script/bat file to run the stand-alone version,
+      called jcasgen.</para>
+    
+  </section>
+  
+  <section id="ugr.tools.jcasgen.running_standalone_with_eclipse">
+    <title>Running stand-alone with Eclipse</title>
+    
+    <para>If you have Eclipse (version 3 or later) and EMF (the Eclipse Modeling Framework)
+      installed (both are available from <ulink url="http://www.eclipse.org"/>),
+      JCasGen can merge the Java code it generates with previous versions, picking up
+      changes you might have inserted by hand. The output (and source of the merge input) is in a
+      folder <quote>JCas</quote> under the same path as the input XML file, unless
+      overridden by specifying a second argument.</para>
+    
+    <para>You must install the UIMA plug-ins into Eclipse to enable this function.</para>
+    
+    <para>The distribution includes a shell script/bat file to run the stand-alone with
+      Eclipse version, called jcasgen_merge. This works by starting Eclipse in
+      <quote>headless</quote> mode (no GUI) and invoking JCasGen within Eclipse. You will
+      need to set the ECLIPSE_HOME environment variable or modify the jcasgen_merge shell
+      script to specify where to find Eclipse. The version of Eclipse needed is 3 or higher,
+      with the EMF plug-in and the UIMA runtime plug-in installed. A temporary workspace is
+      used; the name/location of this is customizable in the shell script.</para>
+    
+    <para>Log and error messages are written to the UIMA log. This file is called uima.log, and
+      is located in the default working directory, which, if not overridden, is the startup
+      directory of Eclipse.</para>
+    
+  </section>
+  
+  <section id="ugr.tools.jcasgen.running_within_eclipse">
+    <title>Running within Eclipse</title>
+    
+    <para>There are two ways to run JCasGen within Eclipse. The first way is to configure an
+      Eclipse external tools launcher, and use it to run the stand-alone shell scripts, with
+      the arguments filled in. Here&apos;s a picture of a typical launcher configuration
+      screen (you get here by navigating from the top menu: Run &ndash;&gt; External Tools
+      &ndash;&gt; External tools...).
+      
+      
+      <screenshot>
+      <mediaobject>
+        <imageobject>
+          <imagedata width="5.8in" format="JPG" fileref="&imgroot;image004.jpg"/>
+        </imageobject>
+        <textobject><phrase>Running JCasGen within Eclipse using the external tool launcher</phrase>
+        </textobject>
+      </mediaobject>
+    </screenshot></para>
+    
+    <para>The second way (which is the normal way it's done) to run within Eclipse is to use the
+      Component Descriptor Editor (CDE) (see <olink targetdoc="&uima_docs_tools;"
+        targetptr="ugr.tools.cde"/>). This tool can be configured to automatically
+      launch JCasGen whenever the type system descriptor is modified. In this release, this
+      operation completely regenerates the files, even if just a small thing changed. For
+      very large type systems, you probably don&apos;t want to enable this all the time. The
+      configurator tool has an option to enable/disable this function.</para>
+  </section>
+  
+</chapter>
\ No newline at end of file

Added: uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.installer.xml
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.installer.xml?rev=941741&view=auto
==============================================================================
--- uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.installer.xml (added)
+++ uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.installer.xml Thu May  6 14:04:08 2010
@@ -0,0 +1,110 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY imgroot "images/tools/tools.pear.installer/" >
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >  
+%uimaents;
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<chapter id="ugr.tools.pear.installer">
+  <title>PEAR Installer User&apos;s Guide</title>
+  
+  <para>PEAR (Processing Engine ARchive) is a new standard for packaging UIMA compliant
+    components. This standard defines several service elements that should be included in
+    the archive package to enable automated installation of the encapsulated UIMA
+    component. The major PEAR service element is an XML Installation Descriptor that
+    specifies installation platform, component attributes, custom installation
+    procedures and environment variables. </para>
+  
+  <para>The installation of a UIMA compliant component includes two steps: (1) installation of
+    the component code and resources in a local file system, and (2) verification of the
+    serviceability of the installed component. Installation of the component code and
+    resources involves extracting component files from the archive (PEAR) package in a
+    designated directory and localizing file references in component descriptors and other
+    configuration files. Verification of the component serviceability is accomplished
+    with the help of standard UIMA mechanisms for instantiating analysis engines.
+    
+    
+    <screenshot>
+    <mediaobject>
+      <imageobject>
+        <imagedata width="5.8in" format="JPG" fileref="&imgroot;image002.jpg"/>
+      </imageobject>
+      <textobject><phrase>PEAR Installer GUI</phrase>
+      </textobject>
+    </mediaobject>
+  </screenshot></para>
+  
+  <para>To launch the PEAR Installer, use the script in the UIMA bin directory: 
+  <code>runPearInstaller.bat</code> or <code>runPearInstaller.sh</code>.</para>
+  
+  <para>The PEAR Installer is a simple GUI-based Java application that helps install UIMA
+    compliant components (analysis engines) from PEAR packages in a local file system. To
+    install a UIMA component, select the appropriate PEAR file in the
+    local file system and, optionally, specify the installation directory. If no installation
+    directory is specified, the PEAR file is installed to the current working directory.
+    By default, PEAR packages are not installed directly into the specified installation directory.
+    Instead, for each PEAR, a subdirectory named after the PEAR&apos;s ID is created, and the package is
+    installed there. If that PEAR installation directory already exists, its old content is automatically
+    deleted before the new content is installed. During the
+    component installation you can read messages printed by the installation program in
+    the message area of the application window. If the installation fails, an appropriate error
+    message is printed to help identify and fix the problem.</para>
+  
+  <para>After the desired UIMA component is successfully installed, the PEAR Installer
+    allows testing this component in the CAS Visual Debugger (CVD) application, which is
+    provided with the UIMA package. The CVD application will load your UIMA component using
+    its XML descriptor file. If the component is loaded successfully, you&apos;ll be able to
+    run it either with sample documents provided in the
+    <literal>&lt;UIMA_HOME&gt;/examples/data</literal> directory, or with any other
+    sample documents. See <olink targetdoc="&uima_docs_tools;"
+      targetptr="ugr.tools.cvd"/> for more information about the CVD application.
+    Running your component in the CVD application helps to make sure the component will run in
+    other UIMA applications. If the CVD application fails to load or run your component, or
+    throws an exception, you can find more information about the problem in the uima.log file
+    in the current working directory. The log file can be viewed with the CVD.</para>
+  
+  <para>The PEAR Installer creates a file named <literal>setenv.txt</literal> in the
+    <literal>&lt;component_root&gt;/metadata</literal> directory. This file contains
+    environment variables required to run your component in any UIMA application. 
+    It also creates a PEAR descriptor file named
+    <literal>&lt;componentID&gt;_pear.xml</literal>
+    in the <literal>&lt;component_root&gt;</literal> directory (see also
+    <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.pear.specifier"/>). This descriptor
+    can be used to run the installed PEAR directly in your application.
+  </para>
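+
+  <para>As an illustration only, a generated PEAR descriptor might look roughly like the
+    following sketch. The path shown is a hypothetical installation location, not a value
+    taken from this guide; the PEAR specifier reference remains the authoritative
+    description of the format.</para>
+
+  <para><programlisting><![CDATA[<!-- Sketch of <componentID>_pear.xml; the pearPath value is hypothetical -->
+<pearSpecifier xmlns="http://uima.apache.org/resourceSpecifier">
+  <pearPath>/home/user/installedPears/myComponent</pearPath>
+</pearSpecifier>
+]]></programlisting></para>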
+
+  <para>
+    The <literal>metadata/setenv.txt</literal> file is not read anywhere by the UIMA framework.
+    It exists for non-UIMA application code that wants to set environment variables.
+    It is only a convenience file that duplicates what is in the XML.
+  </para>
+  
+  <para>
+    The <literal>setenv.txt</literal> file has two special variables: CLASSPATH and PATH.
+    CLASSPATH is computed from any supplied CLASSPATH environment variable
+    plus the JARs that are configured in the PEAR structure, including subcomponents.
+    PATH is computed similarly, from any supplied PATH environment variable plus
+    the "bin" subdirectory of the PEAR structure, if it exists.
+  </para>
+  
+    
+  
+</chapter>
\ No newline at end of file

Added: uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.merger.xml
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.merger.xml?rev=941741&view=auto
==============================================================================
--- uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.merger.xml (added)
+++ uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.merger.xml Thu May  6 14:04:08 2010
@@ -0,0 +1,164 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >  
+%uimaents;
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<chapter id="ugr.tools.pear.merger">
+  <title>PEAR Merger User&apos;s Guide</title>
+  
+  <para>The PEAR Merger utility takes two or more PEAR files and merges their contents,
+    creating a new PEAR which has, in turn, a new Aggregate analysis engine whose delegates are
+    the components from the original files being merged. It does this by (1) copying the
+    contents of the input components into the output component, placing each component into a
+    separate subdirectory, (2) generating a UIMA descriptor for the output Aggregate 
+    analysis engine and (3) creating an output PEAR file that encapsulates the output
+    Aggregate.</para>
+  
+  <para>The merge logic is quite simple, and is intended to work for simple cases. More complex
+    merging needs to be done by hand. Please see the Restrictions and Limitations section,
+    below.</para>
+  
+  <para>To run the PearMerger command line utility you can use the runPearMerger scripts (.bat for Windows, and .sh for
+    Unix). The usage of the tooling is shown below:</para>
+  
+  <para><programlisting>runPearMerger 1st_input_pear_file ... nth_input_pear_file 
+  -n output_analysis_engine_name [-f output_pear_file ]</programlisting></para>
+  
+  <para>The first group of parameters are the input PEAR files. No duplicates are allowed
+    here. The <literal>-n</literal> parameter is the name of the generated Aggregate
+    Analysis Engine. The optional <literal>-f</literal> parameter specifies the name of
+    the output file. If it is omitted, the output is written to
+    <literal>output_analysis_engine_name.pear</literal> in the current working directory.</para>
+  
+  <para>During the running of this tool, work files are written to a temporary directory
+    created in the user&apos;s home directory.</para>
+  
+  <section id="ugr.tools.pear.merger.merge_details">
+    <title>Details of the merging process</title>
+    
+    <para>The PEARs are merged using the following steps:</para>
+    
+    <orderedlist><listitem><para>A temporary working directory is created for the
+      output aggregate component.</para></listitem>
+      
+      <listitem><para>Each input PEAR file is extracted into a separate
+        &apos;input_component_name&apos; folder under the working directory.</para>
+        </listitem>
+      
+      <listitem><para>The extracted files are processed to adjust the
+        &apos;$main_root&apos; macros. This operation differs from the PEAR installation
+        operation, because it does not replace the macros with absolute paths.</para>
+        </listitem>
+      
+      <listitem><para>The output PEAR directory structure, consisting of the &apos;metadata&apos; and
+        &apos;desc&apos; folders under the working directory, is created.</para>
+        </listitem>
+      
+      <listitem><para>The UIMA AE descriptor for the output aggregate component is built
+        in the &apos;desc&apos; folder. This aggregate descriptor refers to the input
+        delegate components, specifying &apos;fixed flow&apos; based on the original
+        order of the input components in the command line. The aggregate descriptor&apos;s
+        &apos;capabilities&apos; and
+        &apos;operational properties&apos; sections are built based on the input
+        components&apos; specifications.</para></listitem>
+      
+      <listitem><para>A new PEAR installation descriptor is created in the
+        &apos;metadata&apos; folder, referencing the new output aggregate descriptor
+        built in the previous step. </para></listitem>
+      
+      <listitem><para>The content of the temporary output working directory is zipped to
+        create the output PEAR, and then the temporary working directory is deleted.
+        </para></listitem></orderedlist>
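+    
+    <para>To make step 5 more concrete, the generated aggregate descriptor could resemble the
+      sketch below for two input components. The delegate keys, import locations and
+      aggregate name are hypothetical placeholders chosen for illustration; the descriptor
+      actually written by the tool may differ in detail.</para>
+    
+    <para><programlisting><![CDATA[<!-- Sketch only: keys, locations and the name are hypothetical -->
+<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
+  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
+  <primitive>false</primitive>
+  <delegateAnalysisEngineSpecifiers>
+    <delegateAnalysisEngine key="FirstComponent">
+      <import location="../FirstComponent/desc/FirstComponent.xml"/>
+    </delegateAnalysisEngine>
+    <delegateAnalysisEngine key="SecondComponent">
+      <import location="../SecondComponent/desc/SecondComponent.xml"/>
+    </delegateAnalysisEngine>
+  </delegateAnalysisEngineSpecifiers>
+  <analysisEngineMetaData>
+    <name>MergedAggregate</name>
+    <!-- Fixed flow: delegates run in command-line order -->
+    <flowConstraints>
+      <fixedFlow>
+        <node>FirstComponent</node>
+        <node>SecondComponent</node>
+      </fixedFlow>
+    </flowConstraints>
+  </analysisEngineMetaData>
+</analysisEngineDescription>
+]]></programlisting></para>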
+    
+    <para>The PEAR merger utility logs all the operations both to standard console output and
+      to a log file, pm.log, which is created in the current working directory.</para>
+    
+  </section>
+  
+  <section id="ugr.tools.pear.merger.testing_modifying_resulting_pear">
+    <title>Testing and Modifying the resulting PEAR</title>
+    
+    <para>The output PEAR file can be installed and tested using the PEAR Installer. The
+      output aggregate component can also be tested by using the CVD or DocAnalyzer
+      tools.</para>
+    
+    <para>The PEAR Installer creates Eclipse project files (.classpath and .project) in the
+      root directory of the installed PEAR, so the installed component can be imported into
+      the Eclipse IDE as an external project. Once the component is in the Eclipse IDE,
+      developers may use the Component Descriptor Editor and the PEAR Packager to modify the
+      output aggregate descriptor and re-package the component.</para>
+    
+  </section>
+  <section id="ugr.tools.pear.merger.restrictions_limitations">
+    <title>Restrictions and Limitations</title>
+    
+    <para>The PEAR Merger utility only does basic merging operations, and is limited as
+      follows. You can overcome these by editing the resulting PEAR file or the resulting
+      Aggregate Descriptor.</para>
+    
+    <orderedlist><listitem><para>The Merge operation specifies Fixed Flow sequencing
+      for the Aggregate.</para></listitem>
+      
+      <listitem><para>The merged aggregate does not define any parameters, so the delegate
+        parameters cannot be overridden.</para></listitem>
+      
+      <listitem><para>No External Resource definitions are generated for the
+        aggregate.</para></listitem>
+      
+      <listitem><para>No Sofa Mappings are generated for the aggregate.</para>
+        </listitem>
+      
+      <listitem><para>No check is made for name collisions. Possible name collisions could
+        occur in the fully-qualified class names of the implementing Java classes, the names
+        of JAR files, the names of descriptor files, and the names of resource bindings or
+        resource file paths.</para></listitem>
+      
+      <listitem><para>The input and output capabilities are generated by merging the
+        capabilities of the components (removing duplicates). Multiple capability sets are
+        not supported: only the first set of each component is used in this process, and only one set is created
+        for the generated Aggregate. There is no support for merging Sofa
+        specifications.</para></listitem>
+      
+      <listitem><para>No Indexes or Type Priorities are created for the generated
+        Aggregate. No checking is done to see if the Indexes or Type Priorities of the
+        components conflict or are inconsistent.</para></listitem>
+      
+      <listitem><para>You can only merge Analysis Engines and CAS Consumers. </para>
+        </listitem>
+      
+      <listitem><para>Although PEAR file installation descriptors that are being merged
+        can have specific XML elements describing Collection Reader and CAS Consumer
+        descriptors, these elements are ignored during the merge, in the sense that the
+        installation descriptor that is created by the merge does not set these elements. The
+        merge process does not use these elements; the output PEAR&apos;s new aggregate only
+        references the merged components&apos; main PEAR descriptor element, as
+        identified by the PEAR element:
+        
+        <programlisting><![CDATA[<SUBMITTED_COMPONENT>
+  <DESC>the_component.xml</DESC>... 
+</SUBMITTED_COMPONENT>
+]]></programlisting></para>
+        </listitem></orderedlist>
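+    
+    <para>As one example of such hand editing, the missing parameter overrides (item 2) could
+      be added to the <literal>analysisEngineMetaData</literal> section of the generated
+      Aggregate Descriptor. The following is a minimal sketch, assuming a delegate with the
+      hypothetical key <literal>FirstComponent</literal> that declares a hypothetical String
+      parameter named <literal>Language</literal>; adapt the names to your own
+      components.</para>
+    
+    <para><programlisting><![CDATA[<!-- Sketch only: delegate key and parameter name are hypothetical -->
+<configurationParameters>
+  <configurationParameter>
+    <name>Language</name>
+    <type>String</type>
+    <multiValued>false</multiValued>
+    <mandatory>false</mandatory>
+    <!-- Maps the aggregate parameter onto the delegate parameter -->
+    <overrides>
+      <parameter>FirstComponent/Language</parameter>
+    </overrides>
+  </configurationParameter>
+</configurationParameters>
+]]></programlisting></para>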
+    
+  </section>
+  
+</chapter>