Posted to commits@uima.apache.org by sc...@apache.org on 2010/05/06 16:04:11 UTC
svn commit: r941741 [3/4] - in
/uima/uimaj/branches/mavenAlign/uima-docbook-tools: ./ src/ src/docbook/
src/docbook/images/ src/docbook/images/tools/
src/docbook/images/tools/tools.annotation_viewer/
src/docbook/images/tools/tools.caseditor/ src/docboo...
Added: uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.cvd.xml
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.cvd.xml?rev=941741&view=auto
==============================================================================
--- uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.cvd.xml (added)
+++ uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.cvd.xml Thu May 6 14:04:08 2010
@@ -0,0 +1,941 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY imgroot "images/tools/tools.cvd/" >
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >
+%uimaents;
+]>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+<chapter id="ugr.tools.cvd">
+ <title>CAS Visual Debugger</title>
+ <section id="ugr.tools.cvd.introduction">
+ <title>Introduction</title>
+ <para>
+ The CAS Visual Debugger is a tool to run text analysis engines in UIMA
+ and view the results. The tool is implemented as a stand-alone GUI
+ tool using Java's Swing library.
+ </para>
+ <para>
+ This is a developer's tool. It is intended to support you in writing
+ text analysis annotators for UIMA (Unstructured Information Management
+ Architecture). As a development tool, the emphasis is not so much on
+ pretty pictures, but rather on navigability. It is intended to show
+ you all the information you need, and show it to you quickly (at least
+ on a fast machine ;-).
+ </para>
+ <para>
+ The main purpose of this application is to let you browse all the data
+ that was created when you ran an analysis engine over some text. The
+ display mimics the access methods you have in the CAS API in terms of
+ indexes, types, feature structures and feature values.
+ </para>
+ <para>
+ As in the CAS, there is special support for annotations. Clicking on
+ an annotation will select the corresponding text, and conversely, you
+ can display all annotations that cover a given position in the text.
+ This will be explained in more detail in the section on the main
+ display area.
+ </para>
+ <para>
+ As usual, the graphics in this manual are for illustrative purposes
+ and may not look 100% like the actual version of CVD you are running.
+ This depends on your operating system, your version of Java, and a
+ variety of other factors.
+ </para>
+ <section id="ugr.cvd.introduction.running">
+ <title>Running CVD</title>
+ <para>
+ You will usually want to start CVD from the command line, or from Eclipse. To start CVD from the
+ command line, you minimally need the uima-core and uima-tools jars. Below is a sample command
+ line for sh and its offspring.
+ <programlisting>java -cp ${UIMA_HOME}/lib/uima-core.jar:${UIMA_HOME}/lib/uima-tools.jar
+ org.apache.uima.tools.cvd.CVD</programlisting>
+ However, there is no need to type this. The ${UIMA_HOME}/bin directory contains a cvd.sh and
+ cvd.bat file for Unix/Linux/MacOS and Windows, respectively.
+ </para>
+ <para>
+ In Eclipse, you have a ready-to-use launch configuration available when you have installed the
+ UIMA sample project (see <olink targetdoc="&uima_docs_overview;"
+ targetptr="ugr.ovv.eclipse_setup.example_code"/>). Below is a screenshot of the Eclipse Run
+ dialog with the CVD
+ run configuration selected.
+ <screenshot>
+ <mediaobject>
+ <imageobject>
+ <imagedata scale="85" format="JPG" fileref="&imgroot;eclipse-cvd-launch.jpg"/>
+ </imageobject>
+ <textobject>
+ <phrase>Eclipse run dialog with CVD selected</phrase>
+ </textobject>
+ </mediaobject>
+ </screenshot>
+ </para>
+ </section>
+
+ <section id="cvd.introduction.commandline">
+ <title>Command line parameters</title>
+ <para>
+ You can provide some command line parameters to influence the startup behavior of CVD. For
+ example, if you want to run a certain analysis engine on a certain text over and over again
+ (for debugging, say), you can make CVD load the annotator and text at startup and execute
+ the annotator. Here's a list of the supported command line options.
+ </para>
+
+ <table frame="none" id="cvd.table.commandline">
+ <title>Command line options</title>
+ <tgroup cols="2">
+ <thead>
+ <row>
+ <entry>Option</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>
+ <computeroutput>-text &lt;textFile&gt;</computeroutput>
+ </entry>
+ <entry>Loads the text file <computeroutput>&lt;textFile&gt;</computeroutput></entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>-desc &lt;descriptorFile&gt;</computeroutput>
+ </entry>
+ <entry>Loads the descriptor <computeroutput>&lt;descriptorFile&gt;</computeroutput></entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>-exec</computeroutput>
+ </entry>
+ <entry>Runs the pre-loaded annotator; only allowed in conjunction with <computeroutput>-desc</computeroutput> </entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>-datapath &lt;datapath&gt;</computeroutput>
+ </entry>
+ <entry>Sets the data path to <computeroutput>&lt;datapath&gt;</computeroutput></entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>-ini &lt;iniFile&gt;</computeroutput>
+ </entry>
+ <entry>Makes CVD use the alternative ini file <computeroutput>&lt;iniFile&gt;</computeroutput> (default is ~/annotViewer.pref)</entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>-lookandfeel &lt;lnfClass&gt;</computeroutput>
+ </entry>
+ <entry>Uses alternative look-and-feel <computeroutput>&lt;lnfClass&gt;</computeroutput></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ </section>
+
+ </section>
+ <section id="cvd.errorHandling">
+ <title>Error Handling</title>
+ <para>
+ On encountering
+ an error, CVD will pop up an error dialog with a short,
+ usually incomprehensible message. Often, the error message will
+ claim that there is more information available in the log file, and
+ sometimes, this is actually true; so do go and check the log. You
+ can view the log file by selecting the appropriate item in the
+ "Tools" menu.
+
+ <screenshot>
+ <mediaobject>
+ <imageobject>
+ <imagedata scale="100" format="JPG" fileref="&imgroot;ErrorExample.jpg"/>
+ </imageobject>
+ <textobject>
+ <phrase>Sample error dialog</phrase>
+ </textobject>
+ </mediaobject>
+ </screenshot>
+
+ </para>
+ </section>
+
+ <section id="cvd.preferencesFile">
+ <title>Preferences File</title>
+ <para>
+ The program will attempt to read on startup and save on exit a file
+ called annotViewer.pref in your home directory. This file contains
+ information about choices you made while running the program:
+ directories (such as where your data files are) and window sizes.
+ These settings will be used the next time you use the program. There
+ is no user control over this process, but the file format is
+ reasonably transparent, in case you feel like changing it. Note,
+ however, that the file will be overwritten every time you exit the
+ program.
+ </para>
+
+ <para>
+ If you use CVD for several projects, it may be convenient to use a different
+ ini file for each project. You can specify the ini file CVD should use
+ with the <programlisting>-ini &lt;iniFile&gt;</programlisting> parameter on the
+ command line.
+ </para>
+ </section>
+
+ <section id="cvd.theMenus">
+ <title>The Menus</title>
+ <para>
+ We give a brief description of the various menus. All menu items come
+ with mnemonics (e.g., Alt-F X will exit the program). In addition,
+ some menu items have their own keyboard accelerators that you can use
+ anywhere in the program. For example, Ctrl-S will save the text
+ you've been editing.
+ </para>
+ <section id="cvd.fileMenu">
+ <title>The File Menu</title>
+ <para>
+ The File menu lets you load, create and save text, load and save
+ color settings, and import and export the XCAS format. Here's a
+ screenshot.
+
+ <screenshot>
+ <mediaobject>
+ <imageobject>
+ <imagedata scale="100" format="JPG" fileref="&imgroot;FileMenu.jpg"/>
+ </imageobject>
+ <textobject>
+ <phrase>The File menu</phrase>
+ </textobject>
+ </mediaobject>
+ </screenshot>
+ </para>
+
+ <itemizedlist>
+ <para>
+ Below is a list of the menu items, together with an explanation.
+ </para>
+
+ <listitem>
+ <formalpara>
+ <title>New Text...</title>
+ <para>
+ Clears the text area. Text you type is written to an anonymous
+ buffer. You can use "Save Text As..." to save the text
+ you typed to a file. Note: whenever you modify the text, be it
+ through typing, loading a file or using the "New
+ Text..." menu item, previous analysis results will be lost.
+ Since the previous analysis is specific to the text, modifying
+ the text invalidates the analysis.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Open Text File</title>
+ <para>
+ Loads a new text file into the viewer. The next time you run an
+ analysis engine, it will run on the text you loaded last. Depending
+ on the annotator you're using, the program may run slowly with very
+ large text files, so you may want to experiment.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Save Text File</title>
+ <para>
+ Saves the currently open text file. If no file is currently
+ loaded (either because you haven't loaded a file, or you've used
+ the "New Text..." menu item), this menu item is
+ disabled (and Ctrl-S will do nothing).
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Save Text As...</title>
+ <para>
+ Save the text to a file of your choosing. This can be an existing
+ file, which is then overwritten, or it can be a new file that
+ you're creating.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Change Code Page</title>
+ <para>
+ Allows you to change the code page that is used to load and save
+ text files. If you're sure the text you're loading is in ASCII or
+ one of the 8-bit extensions such as ISO-8859-1 (ISO Latin1),
+ there is probably nothing you need to do. Just load the text and
+ look at the display. If you see no funny characters or square
+ boxes, chances are your selected code page is compatible with
+ your text file.
+
+ Note that the code page setting is also in effect when you save
+ files. You can observe the effects with a hex editor or by just
+ looking at the file size. For example, if you save the default
+ text
+ <computeroutput>This is where the text goes.</computeroutput>
+ to a file on Windows using the default code page, the size of the
+ file will be 28 bytes. If you now change the code page to UTF-16
+ and save the file again, the file size will be 58 bytes: two
+ bytes for each character, plus two bytes for the byte-order mark.
+ Now switch the code page back to the default Windows code page
+ and reload the UTF-16 file to see the difference in the editor.
+
+ CVD will display all code pages that are available in the JVM
+ you're running it on. The first code page in the list is the
+ default code page of your system. This is also CVD's default if
+ you don't make a specific choice.
+
+ Your code page selection will be remembered in CVD's ini file.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Load Color Settings</title>
+ <para>
+ Load previously saved color settings from a file (see
+ Tools/Customize Annotation Display). It is highly recommended
+ that you only load automatically generated files. Strange things
+ may happen if you try to load the wrong file format. On startup,
+ the program attempts to load the last color settings file that
+ you loaded or saved during a previous session. If you intend to
+ use the same color settings as the last time you ran the program,
+ there is therefore no need to manually load a color settings
+ file.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Save Color Settings</title>
+ <para>
+ Save your customized color settings (see Tools/Customize
+ Annotation Display). The file is a Java properties file, and as
+ such, reasonably transparent. What is not transparent is the
+ encoding of the colors (integer encoding of 24-bit RGB values),
+ so changing the file by hand is not really recommended.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Read Type System File</title>
+ <para>
+ Load a type system file. This allows you to load an XCAS file
+ without having to have access to the corresponding annotator.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Write Type System File</title>
+ <para>
+ Create a type system file from the currently loaded type
+ definitions. In addition, you can save the current CAS as an XCAS
+ file (see below). This allows you to later load the type system
+ and XCAS to view the CAS without having to rerun the annotator.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Read XMI CAS File</title>
+ <para>
+ Read an XMI CAS file. Important: XMI CAS is a serialization format that
+ serializes a CAS without type system and index information. It is
+ therefore impossible to read in a stand-alone XMI CAS file. XMI CAS
+ files can only be interpreted in the context of an existing type
+ system. Consequently, you need to first load the Analysis Engine that was used to
+ create the XMI file, to be able to load that XMI file.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Write XMI CAS File</title>
+ <para>
+ Writes the current analysis out as an XMI CAS file.
+ </para>
+ </formalpara>
+ </listitem>
+
+
+ <listitem>
+ <formalpara>
+ <title>Read XCAS File</title>
+ <para>
+ Read an XCAS file. Important: XCAS is a serialization format that
+ serializes a CAS without type system and index information. It is
+ therefore impossible to read in a stand-alone XCAS file. XCAS
+ files can only be interpreted in the context of an existing type
+ system. Consequently, you need to load the Analysis Engine that was used to
+ create the XCAS file to be able to load it. Loading an XCAS file
+ without loading the Analysis Engine may produce strange errors. You may get
+ syntax errors on loading the XCAS file, or worse, everything may
+ appear to go smoothly but in reality your CAS may be corrupted.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Write XCAS File</title>
+ <para>
+ Writes the current analysis out as an XCAS file.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Exit</title>
+ <para>Exits the program. Your preferences will be saved.</para>
+ </formalpara>
+ </listitem>
+
+ </itemizedlist>
+
+ </section>
+
+ <section id="cvd.editMenu">
+ <title>The Edit Menu</title>
+ <para>
+
+ <screenshot>
+ <mediaobject>
+ <imageobject> <!-- was 2.15in -->
+ <imagedata scale="100" format="JPG" fileref="&imgroot;EditMenu.jpg" />
+ </imageobject>
+ <textobject>
+ <phrase>The Edit menu</phrase>
+ </textobject>
+ </mediaobject>
+ </screenshot>
+
+ The "Edit" menu provides a standard text editing menu with
+ Cut, Copy and Paste, as well as unlimited Undo.
+ </para>
+ <para>
+ Note that standard keyboard accelerators Ctrl-X, Ctrl-C, Ctrl-V and
+ Ctrl-Z can be used for Cut, Copy, Paste and Undo, respectively. The
+ text area supports other standard keyboard operations such as
+ navigation with HOME, Ctrl-HOME, etc., as well as marking text with
+ Shift-&lt;ArrowKey&gt;.
+ </para>
+ </section>
+
+ <section id="cvd.runMenu">
+ <title>The Run Menu</title>
+ <para>
+
+ <screenshot>
+ <mediaobject>
+ <imageobject> <!-- was width="2.225in" -->
+ <imagedata scale="100" format="JPG" fileref="&imgroot;RunMenu.jpg" />
+ </imageobject>
+ <textobject>
+ <phrase>The Run menu</phrase>
+ </textobject>
+ </mediaobject>
+ </screenshot>
+
+ In the Run menu, you can load and run text analysis engines.
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <formalpara>
+ <title>Load AE</title>
+ <para>
+ Loads and initializes a text analysis engine. Choosing this menu
+ item will display a file open dialog where you should choose an
+ XML descriptor of a Text Analysis Engine to process the current
+ text. Even if the analysis engine runs fast, this will take a
+ while, since there is a lot of setup work to do when a new TAE is
+ created. So be patient.
+
+ When you develop a new annotator, you will often need to
+ recompile your code. CVD will not reload your annotator code.
+ When you recompile your code, you need to terminate the GUI and
+ restart it. If you only make changes to the XML descriptor, you
+ don't need to restart the GUI. Simply reload the XML file.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Run AE</title>
+ <para>
+ Before you have (successfully) loaded a TAE, this menu item will
+ be disabled. After you have loaded a TAE, it will be enabled, and
+ the name changes according to the name of the TAE you have
+ loaded. For example, if you've loaded "The World's Fastest
+ Parser", you will have a menu item called "Run The
+ World's Fastest Parser". When you choose the item, the TAE
+ is run on whatever text you have currently loaded.
+
+ After a TAE has run successfully, the index window in the upper
+ left-hand corner of the screen should be updated and show the
+ indexes that were created by this run. We will have more to say
+ about indexes and what to do with them later.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Run AE on CAS</title>
+ <para>
+ This allows you to run an analysis engine on the current CAS.
+ This is useful if you have loaded a CAS from an XCAS file, and
+ would like to run further analysis on it.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Run collectionProcessComplete</title>
+ <para>
+ When you select this item, the analysis engine's
+ collectionProcessComplete() method is called.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Performance Report</title>
+ <para>
+ After you've run your analysis, you can view a performance report. It will show
+ you where the time went: which component used how much of the processing time.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Recently used</title>
+ <para>
+ Collects a list of recently used analysis engines as a short-cut
+ for loading.
+ </para>
+ </formalpara>
+ </listitem>
+
+ <listitem>
+ <formalpara>
+ <title>Language</title>
+ <para>
+ Some annotators do language specific processing. For example, if
+ you run lexical analysis, the results may be quite different
+ depending on what the analysis engine thinks the language of the
+ document is. With this menu item, you can manually set the
+ document language. Alternatively, you can use an automatic
+ language identification annotator. If the analysis engines you're
+ working with are language agnostic, there is no need to set the
+ language.
+ </para>
+ </formalpara>
+ </listitem>
+
+ </itemizedlist>
+ </section>
+
+ <section id="cvd.toolsMenu">
+ <title>The Tools Menu</title>
+ <para>
+ The Tools menu contains some assorted utilities, such as the log
+ file viewer. Here you can also set the log level for UIMA.
+ A more detailed description of some of the menu items
+ follows below.
+ </para>
+ <section id="cvd.viewTypeSystem">
+ <title>View Type System</title>
+ <para>
+
+ <screenshot>
+ <mediaobject>
+ <imageobject>
+ <imagedata scale="100" format="JPG" fileref="&imgroot;TypeSystemViewer.jpg" />
+ </imageobject>
+ </mediaobject>
+ </screenshot>
+
+ Brings up a new window that displays the type system. This menu
+ item is disabled until the first time you have run an analysis
+ engine, since there is no type system to display until then. An
+ example is shown above.
+ </para>
+ <para>
+ You can view the inheritance tree on the left by expanding and
+ collapsing nodes. When you select a type, the features defined on
+ that type are displayed in the table on the right. The feature
+ table has three columns. The first gives the name of the feature,
+ the second one the type of the feature (i.e., what values it
+ takes), and the third column displays the highest type this feature
+ is defined on. In this example, the features "begin" and
+ "end" are inherited from the built-in annotation type.
+ </para>
+ <para>
+ In the options menu, you can configure whether you want to see inherited
+ features (not yet implemented).
+ </para>
+ </section>
+
+ <section id="cvd.showSelectedAnnotations">
+ <title>Show Selected Annotations</title>
+ <para>
+ <figure id="AnnotationViewerFigure">
+ <title>
+ Annotations produced by a statistical named entity tagger
+ </title>
+ <mediaobject>
+ <imageobject> <!-- was width="5.82in" -->
+ <imagedata scale="100" format="JPG" fileref="&imgroot;AnnotationViewer.jpg" />
+ </imageobject>
+ </mediaobject>
+ </figure>
+ </para>
+
+ <para>
+ To enable this menu item, you must have run an analysis engine and
+ selected the "AnnotationIndex" or one of its subnodes in the
+ upper left-hand corner of the screen. It will bring up a new text
+ window with all selected annotations marked up in the text.
+ </para>
+ <para>
+ <xref linkend="AnnotationViewerFigure" />
+ shows the results of applying a statistical named entity tagger to
+ a newspaper article. Some annotation colors have been customized:
+ countries are in reverse video, organizations have a turquoise
+ background, person names are green, and occupations have a maroon
+ background. The default background color is yellow. This color is
+ also used if there is more than one annotation spanning a certain
+ text. Clearly, this display is only useful if you don't have any
+ overlapping annotations, or at least not too many.
+ </para>
+ <para>
+ This menu item is also available as a context menu in the Index
+ Tree area of the main window. To use it, select the annotation
+ index or one of its subnodes, right-click to bring up a popup menu,
+ and select the only item in the popup menu. The popup menu is
+ actually a better way to invoke the annotation display, since it
+ changes according to the selection in the Index Tree area, and will
+ tell you if what you've selected can be displayed or not.
+ </para>
+
+
+ </section>
+
+ </section>
+
+ </section>
+
+ <section id="cvd.mainDisplayArea">
+ <title>The Main Display Area</title>
+ <para>
+ The main display area has three sub-areas. In the upper left-hand
+ corner is the
+ <emphasis role="bold">index display</emphasis>, which shows the indexes that were defined in the
+ AE, as well as
+ the types of the indexes and their subtypes. In the lower left-hand
+ corner, the content of indexes and sub-indexes is displayed
+ (<emphasis role="bold">FS display</emphasis>). Clicking on any node in the index display will
+ show the
+ corresponding feature structures in the FS display. You can explore
+ those structures by expanding the tree nodes. Clicking on a
+ node that represents an annotation will cause the
+ corresponding text span to be marked in the
+ <emphasis role="bold">text display</emphasis>.
+ </para>
+ <para>
+ <figure id="Main1Figure">
+ <title>State of GUI after running an analysis engine</title>
+ <mediaobject>
+ <imageobject>
+ <imagedata scale="100" format="JPG" fileref="&imgroot;Main1.jpg" />
+ </imageobject>
+ </mediaobject>
+ </figure>
+ </para>
+ <para>
+ <xref linkend="Main1Figure"></xref>
+ shows the state after running the UIMA_Analysis_Example.xml aggregate from the
+ uimaj-examples project. There are two indexes in the index display, and the
+ annotation index has been selected. Note that the number of
+ structures in an index is displayed in square brackets after the
+ index name.
+ </para>
+ <para>
+ Since displaying thousands of sister nodes is both confusing and
+ slow, nodes are grouped in powers of 10. As soon as there are no
+ more than 100 sister nodes, they are displayed next to each other.
+ </para>
+ <para>
+ In our example, a name annotation has been selected, and the
+ corresponding token text is highlighted in the text area. We have
+ also expanded the token node to display its structure (not much to see in this simple example).
+ </para>
+ <para>
+ In <xref linkend="Main1Figure"/>, we selected an annotation in the FS display to find the
+ corresponding text. We can also do the reverse and find out what
+ annotations cover a certain point in the text. Let's go back to the
+ name recognizer for an example.
+ </para>
+ <para>
+ <figure id="Main2Figure">
+ <title>
+ Finding annotations for a specific location in the text
+ </title>
+ <mediaobject>
+ <imageobject> <!-- next width was 6.39in -->
+ <imagedata scale="100" format="JPG" fileref="&imgroot;Main2.jpg" />
+ </imageobject>
+ </mediaobject>
+ </figure>
+ </para>
+ <para>
+ We would like to know if Michael Baessler has been
+ recognized as a name. So we position the cursor somewhere in the
+ corresponding text span, then right-click to bring up the context menu
+ telling us which annotations exist at this point. An example is shown
+ in
+ <xref linkend="Main2Figure" />.
+ </para>
+ <para>
+ <figure id="Main3Figure">
+ <title>
+ Selecting an annotation from the context menu will highlight that
+ annotation in the FS display
+ </title>
+ <mediaobject>
+ <imageobject> <!-- width was 6.39in -->
+ <imagedata scale="100" format="JPG" fileref="&imgroot;Main3.jpg" />
+ </imageobject>
+ </mediaobject>
+ </figure>
+ </para>
+
+ <para>
+ At this point (<xref linkend="Main2Figure" />),
+ we only know that somewhere around the text cursor position (not
+ visible in the picture), we discovered a name. When we select the corresponding entry in the
+ context menu, the name annotation is selected in the FS display, and its covered text is
+ highlighted.
+ <xref linkend="Main3Figure" /> shows the display after
+ the name node has been selected in
+ the popup menu.
+ </para>
+ <para>
+ We're glad to see that, indeed, Michael Baessler is
+ considered to be a name. Note that in the FS display, the
+ corresponding annotation node has been selected, and the tree has
+ been expanded to make the node visible.
+ </para>
+ <para>
+ Note that the annotations displayed in the popup menu come from the
+ annotations currently displayed in the FS display. If you didn't
+ select the annotation index or one of its sub-nodes, no annotations
+ can be displayed and the popup menu will be empty.
+ </para>
+
+ <section id="cvd.statusBar">
+ <title>The Status Bar</title>
+ <para>
+ At the bottom of the screen, some useful information is displayed in
+ the
+ <emphasis role="bold">status bar</emphasis>. The left-most area shows the most recent major event, with the
+ time when the event terminated in square brackets. The next area
+ shows the file name of the currently loaded XML descriptor. This
+ area supports a tool tip that will show the full path to the file.
+ The right-most area shows the current cursor position, or the extent
+ of the selection, if a portion of the text has been selected. The
+ numbers correspond to the character offsets that are used for
+ annotations.
+ </para>
+ </section>
+
+ <section id="cvd.keyboardNavigation">
+ <title>Keyboard Navigation and Shortcuts</title>
+ <para>
+ The GUI can be completely navigated and operated through the
+ keyboard. All menus and menu items support keyboard mnemonics, and
+ some common operations are accessible through keyboard accelerators.
+ </para>
+ <para>
+ You can move the focus between the three main areas using
+ <computeroutput>Tab</computeroutput>
+ (clockwise) and
+ <computeroutput>Shift-Tab</computeroutput>
+ (counterclockwise). When the focus is on the text area, the
+ <computeroutput>Tab</computeroutput>
+ key will insert the corresponding character into the text, so you
+ will need to use
+ <computeroutput>Ctrl-Tab</computeroutput>
+ and
+ <computeroutput>Ctrl-Shift-Tab</computeroutput>
+ instead. Alternatively, you can use the following key bindings to
+ jump directly to one of the areas:
+ <computeroutput>Ctrl-T</computeroutput>
+ to focus the text area,
+ <computeroutput>Ctrl-I</computeroutput>
+ for the index repository frame and
+ <computeroutput>Ctrl-F</computeroutput>
+ for the feature structure area.
+ </para>
+ <para>
+ Some additional keyboard shortcuts are available only in the text
+ area, such as
+ <computeroutput>Ctrl-X</computeroutput>
+ for Cut,
+ <computeroutput>Ctrl-C</computeroutput>
+ for Copy,
+ <computeroutput>Ctrl-V</computeroutput>
+ for Paste and
+ <computeroutput>Ctrl-Z</computeroutput>
+ for Undo. The context menu in the text area can be invoked through the
+ <computeroutput>Alt-Enter</computeroutput>
+ shortcut. Text can be selected using the arrow keys while holding
+ the
+ <computeroutput>Shift</computeroutput>
+ key.
+ </para>
+ <para>
+ The following table shows the supported keyboard shortcuts.
+ </para>
+ <table frame="none" id="cvd.table.keyboardShortcuts">
+ <title>Keyboard shortcuts</title>
+ <tgroup cols="3">
+ <thead>
+ <row>
+ <entry>Shortcut</entry>
+ <entry>Action</entry>
+ <entry>Scope</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>
+ <computeroutput>Ctrl-O</computeroutput>
+ </entry>
+ <entry>Open text file</entry>
+ <entry>Global</entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>Ctrl-S</computeroutput>
+ </entry>
+ <entry>Save text file</entry>
+ <entry>Global</entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>Ctrl-L</computeroutput>
+ </entry>
+ <entry>Load AE descriptor</entry>
+ <entry>Global</entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>Ctrl-R</computeroutput>
+ </entry>
+ <entry>Run current AE</entry>
+ <entry>Global</entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>Ctrl-I</computeroutput>
+ </entry>
+ <entry>Switch focus to index repository</entry>
+ <entry>Global</entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>Ctrl-T</computeroutput>
+ </entry>
+ <entry>Switch focus to text area</entry>
+ <entry>Global</entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>Ctrl-F</computeroutput>
+ </entry>
+ <entry>Switch focus to FS area</entry>
+ <entry>Global</entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>Ctrl-X</computeroutput>
+ </entry>
+ <entry>Cut selection</entry>
+ <entry>Text</entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>Ctrl-C</computeroutput>
+ </entry>
+ <entry>Copy selection</entry>
+ <entry>Text</entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>Ctrl-V</computeroutput>
+ </entry>
+ <entry>Paste selection</entry>
+ <entry>Text</entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>Ctrl-Z</computeroutput>
+ </entry>
+ <entry>Undo</entry>
+ <entry>Text</entry>
+ </row>
+ <row>
+ <entry>
+ <computeroutput>Alt-Enter</computeroutput>
+ </entry>
+ <entry>Show context menu</entry>
+ <entry>Text</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ </section>
+
+ </section>
+</chapter>
\ No newline at end of file
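
The file sizes quoted under "Change Code Page" in tools.cvd.xml (28 bytes in a single-byte code page, 58 bytes in UTF-16) can be reproduced with a small Java sketch. It assumes that CVD's UTF-16 code page corresponds to Java's standard UTF-16 charset, which writes a two-byte byte-order mark when encoding, and it uses ISO-8859-1 as a stand-in for a single-byte default code page. The class and method names are hypothetical and are not part of CVD.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of CVD: reproduces the byte counts
// described for the "Change Code Page" menu item.
public class CodePageSizes {

    // CVD's default text, as quoted in the chapter (28 characters).
    public static final String DEFAULT_TEXT = "This is where the text goes.";

    // Number of bytes the text occupies when encoded in the given charset.
    public static int encodedSize(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // Single-byte code page (assumed stand-in for the system default):
        // one byte per character.
        System.out.println(encodedSize(DEFAULT_TEXT, StandardCharsets.ISO_8859_1)); // prints 28

        // Java's UTF-16 encoder: two bytes per character plus a two-byte
        // byte-order mark.
        System.out.println(encodedSize(DEFAULT_TEXT, StandardCharsets.UTF_16)); // prints 58
    }
}
```

The actual size on your system depends on the code page selected in CVD; a multi-byte default such as UTF-8 gives the same 28 bytes for this text only because it contains ASCII characters exclusively.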
Added: uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.doc_analyzer.xml
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.doc_analyzer.xml?rev=941741&view=auto
==============================================================================
--- uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.doc_analyzer.xml (added)
+++ uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.doc_analyzer.xml Thu May 6 14:04:08 2010
@@ -0,0 +1,275 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY imgroot "images/tools/tools.doc_analyzer/" >
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >
+%uimaents;
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<chapter id="ugr.tools.doc_analyzer">
+ <title>Document Analyzer User's Guide</title>
+
+
+<para>The <emphasis>Document Analyzer</emphasis> is a tool provided by the
+UIMA SDK for testing annotators and AEs. It reads text files from your disk, processes them using an AE, and
+allows you to view the results. The
+Document Analyzer is designed to work with text files and cannot be used with
+Analysis Engines that process other types of data.</para>
+
+<para>For an introduction to developing annotators and Analysis
+Engines, read
+ <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.aae"/>.
+ This chapter is a user's guide for using the Document Analyzer tool, and
+does not describe the process of developing annotators and Analysis Engines.</para>
+
+<section id="ugr.tools.doc_analyzer.starting">
+ <title>Starting the Document Analyzer</title>
+
+<para>To run the Document Analyzer, execute the <literal>documentAnalyzer</literal> script that is in the <literal>bin</literal> directory of your UIMA SDK installation, or, if you
+are using the example Eclipse project, execute the <quote>UIMA Document Analyzer</quote>
+run configuration supplied with that project.</para>
+
+<para>Note that if you're planning to run an Analysis Engine
+other than one of the examples included in the UIMA SDK, you'll first need to
+update your CLASSPATH environment variable to include the classes needed by
+that Analysis Engine.</para>
+
+<para>When you first run the Document Analyzer, you should see a
+screen that looks like this:
+
+ <screenshot>
+ <mediaobject>
+ <imageobject>
+ <imagedata width="5.8in" format="JPG" fileref="&imgroot;image002.jpg"/>
+ </imageobject>
+ <textobject><phrase>Document Analyzer GUI</phrase>
+ </textobject>
+ </mediaobject>
+ </screenshot></para>
+
+
+ </section>
+
+ <section id="ugr.tools.doc_analyzer.running_an_ae">
+ <title>Running an AE</title>
+
+
+
+<para>To run an AE, you must first configure the six fields on
+the main screen of the Document Analyzer.</para>
+
+<para><emphasis role="bold">Input Directory:</emphasis>
+ Browse to or type the path of a directory containing text files that you
+want to analyze. Some sample documents
+are provided in the UIMA SDK under the <literal>examples/data</literal>
+directory.</para>
+
+<para><emphasis role="bold">Output Directory:</emphasis> Browse to or type the path of a directory where you want
+ output to be written. (As we'll see later, you won't normally need to look directly at these files, but the
+ Document Analyzer needs to know where to write them.) The files written to this directory will be an XML
+ representation of the analyzed documents. If this directory doesn't exist, it will be created. If the
+ directory exists, any files in it will be deleted (but the tool will ask you to confirm this before doing so). If you
+ leave this field blank, your AE will be run but no output will be generated.</para>
+
+<para><emphasis role="bold">Location of AE XML Descriptor:</emphasis>
+ Browse to or type the path of the descriptor
+for the AE that you want to run. There
+are some example descriptors provided in the UIMA SDK under the <literal>examples/descriptors/analysis_engine</literal> and <literal>examples/descriptors/tutorial</literal> directories.</para>
+
+<para><emphasis role="bold">XML Tag containing Text:</emphasis>
+ This is an optional feature. If you enter a value here, it specifies the
+name of an XML tag, expected to be found within the input documents, that
+contains the text to be analyzed. For
+example, the value <literal>TEXT</literal> would cause the AE to only
+analyze the portion of the document enclosed within &lt;TEXT&gt;...&lt;/TEXT&gt;
+tags. Also, any XML tags occurring within that text will be removed prior to analysis.</para>
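+
+<para>As an illustrative sketch (the document below is hypothetical and is not one of the SDK
+sample files), with the value <literal>TEXT</literal> entered in this field, only the sentence inside
+the <literal>TEXT</literal> element of the following input file would be passed to the AE:</para>
+
+<programlisting>&lt;DOC&gt;
+&lt;TITLE&gt;Weekly status&lt;/TITLE&gt;
+&lt;TEXT&gt;The review meeting takes place on Tuesday morning.&lt;/TEXT&gt;
+&lt;/DOC&gt;</programlisting>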
+
+<para><emphasis role="bold">Language:</emphasis>
+ Specify
+the language in which the documents are written. Some Analysis Engines, but not all, require
+that this be set correctly in order to do their analysis. You can select a value from the drop-down
+list or type your own. The value entered
+here must be an ISO language identifier, the list of which can be found here:
+ <ulink url="http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt"/>.
+</para>
+
+<para><emphasis role="bold">Character Encoding:</emphasis>
+ The character encoding of the input files. The default, UTF-8, also works fine for ASCII
+text files. If you have a different
+encoding, enter it here. For more
+information on character sets and their names, see the Javadocs for
+ <literal>java.nio.charset.Charset</literal>.</para>
+
+<para>Once you've filled in the appropriate values, press the
+<quote>Run</quote> button.</para>
+
+<para>If an error occurs, a dialog will appear with the error
+message. (A stack trace will also be
+printed to the console, which may help you if the error was generated by your
+own annotator code.) Otherwise, an
+<quote>Analysis Results</quote> window will appear.</para>
+
+
+
+</section>
+
+ <section id="ugr.tools.doc_analyzer.viewing_results">
+ <title>Viewing the Analysis Results</title>
+
+<para>After a successful analysis, the <quote>Analysis
+Results</quote> window will appear.
+
+ <screenshot>
+ <mediaobject>
+ <imageobject>
+ <imagedata width="4.2in" format="JPG" fileref="&imgroot;image004.jpg"/>
+ </imageobject>
+ <textobject><phrase>Analysis Results Window</phrase></textobject>
+ </mediaobject>
+ </screenshot></para>
+
+
+<para>The <quote>Results Display Format</quote> options at the
+bottom of this window show the different ways you can view your analysis – the
+Java Viewer, Java Viewer (JV) with User Colors, HTML, and XML.
+ The default, Java Viewer, is recommended.</para>
+
+<para>Once you have selected your desired Results Display
+Format, you can double-click on one of the files in the list to view the
+analysis done on that file.</para>
+
+<para>For the Java viewer, the results display looks like this
+(for the AE descriptor <literal>examples/descriptors/tutorial/ex4/MeetingDetectorAE.xml</literal>):
+
+ <screenshot>
+ <mediaobject>
+ <imageobject>
+ <imagedata width="5.8in" format="JPG" fileref="&imgroot;image006.jpg"/>
+ </imageobject>
+ <textobject><phrase>Analysis Results Window showing results from tutorial example 4</phrase></textobject>
+ </mediaobject>
+ </screenshot></para>
+
+
+<para>You can click the mouse on one of the highlighted
+annotations to see a list of all its features in the frame on the right.</para>
+
+<para>If there are multiple annotation types in the view, you
+can control which ones are selected by using the checkboxes in the legend, the
+Select All button, or the Deselect All button.</para>
+
+<para>If you are viewing a CAS that contains multiple subjects
+of analysis, then a selector will appear at the bottom right of the Annotation
+Viewer window. This will allow you to
+choose the Sofa that you wish to view. Note that only text Sofas containing a non-null document are available
+for viewing.</para>
+
+</section>
+
+ <section id="ugr.tools.doc_analyzer.configuring">
+ <title>Configuring the Annotation Viewer</title>
+
+<para>The <quote>JV User Colors</quote> and the HTML viewer allow
+you to specify exactly which colors are used to display each of your annotation
+types. For the Java Viewer, you can also
+specify which types should be initially selected, and you can hide types
+entirely.</para>
+
+<para>To configure the viewer, click the <quote>Edit Style
+Map</quote> button on the <quote>Analysis Results</quote> dialog.
+ You should see a dialog that looks like this:
+
+
+ <screenshot>
+ <mediaobject>
+ <imageobject>
+ <imagedata width="5.8in" format="JPG" fileref="&imgroot;image008.jpg"/>
+ </imageobject>
+ <textobject><phrase>Configuring the Analysis Results Viewer</phrase></textobject>
+ </mediaobject>
+ </screenshot></para>
+
+<para>To change the color assigned to a type, simply click on
+the colored cell in the <quote>Background</quote> column for the type you wish to
+edit. This will display a dialog that
+allows you to choose the color. For the
+HTML viewer only, you can also change the foreground color.</para>
+
+<para>If you would like the type to be initially checked
+(selected) in the legend when the viewer is first launched, check the box in
+the <quote>Checked</quote> column. If you
+would like the type to never be shown in the viewer, click the box in the
+<quote>Hidden</quote> column. These
+settings only affect the Java Viewer, not the HTML view.</para>
+
+<para>When you are done editing, click the <quote>Save</quote>
+button. This will save your choices to a
+file in the same directory as your AE descriptor. From now on, when you view analysis results
+produced by this AE using the <quote>JV User Colors</quote> or <quote>HTML</quote>
+options, the viewer will be configured as you have specified.</para>
+
+</section>
+
+<section id="ugr.tools.doc_analyzer.interactive_mode">
+ <title>Interactive Mode</title>
+
+
+<para>Interactive Mode allows you to analyze text that you type
+or cut-and-paste into the tool, rather than requiring that the documents be
+stored as files.</para>
+
+<para>In the main Document Analyzer window, you can invoke
+Interactive Mode by clicking the <quote>Interactive</quote> button instead of the
+<quote>Run</quote> button. This will
+display a dialog that looks like this:
+
+
+ <screenshot>
+ <mediaobject>
+ <imageobject>
+ <imagedata width="5.5in" format="JPG" fileref="&imgroot;image010.jpg"/>
+ </imageobject>
+ <textobject><phrase>Invoking Interactive Mode</phrase></textobject>
+ </mediaobject>
+ </screenshot></para>
+
+<para>You can type or cut-and-paste your text into this window,
+then choose your Results Display Format and click the <quote>Analyze</quote>
+button. Your AE will be run on the text
+that you supplied and the results will be displayed as usual.</para>
+
+
+</section>
+
+ <section id="ugr.tools.doc_analyzer.view_mode">
+ <title>View Mode</title>
+
+<para>If you have previously run an AE and saved its analysis
+results, you can use the Document Analyzer's View mode to view those results,
+without re-running your analysis. To do
+this, on the main Document Analyzer window simply select the location of your
+analyzed documents in the <quote>Output Directory</quote> field and click the
+<quote>View</quote> button. You can then
+view your analysis results as described in Section
+ <xref linkend="ugr.tools.doc_analyzer.viewing_results"/>.</para>
+
+</section>
+ </chapter>
+
Added: uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.jcasgen.xml
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.jcasgen.xml?rev=941741&view=auto
==============================================================================
--- uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.jcasgen.xml (added)
+++ uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.jcasgen.xml Thu May 6 14:04:08 2010
@@ -0,0 +1,181 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY imgroot "images/tools/tools.jcasgen/" >
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >
+%uimaents;
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<chapter id="ugr.tools.jcasgen">
+ <title>JCasGen User's Guide</title>
+
+ <para>JCasGen reads a descriptor for an application (either an Analysis Engine Descriptor,
+ or a Type System Descriptor), creates the merged type system
+ specification by merging all the type system information from all the components
+ referred to in the descriptor, and then uses this merged type system to create Java source
+ files for classes that enable JCas access to the CAS. Java classes are not produced for the
+ built-in types, since these classes are already provided by the UIMA SDK. (An exception is
+ the built-in type <literal>uima.tcas.DocumentAnnotation</literal>; see the warning below.)</para>
+
+ <warning><para>If the components comprising the input to the type merging process
+ have different definitions for the same type name,
+ JCasGen will show a warning, and in some environments may offer to abort the operation.
+ If you continue past this warning,
+ JCasGen will produce correct Java source files representing the merged types
+ (that is, the
+ type definition containing all of the features defined on that type by all of the
+ components). It is recommended that you do not use this capability (of having
+ two different definitions for the same type name, with different feature sets) since it can make it
+ difficult to combine/package your annotator with others. See <olink
+ targetdoc="&uima_docs_ref;"
+ targetptr="ugr.ref.jcas.merging_types_from_other_specs"/> for more information.
+ </para>
+
+ <para>Also note that if your type system declares a custom version of the
+ <literal>uima.tcas.DocumentAnnotation</literal>
+ built-in type, then JCasGen will generate a Java source file for it. If you do this, you need to be
+ aware of the issues discussed in
+ <olink
+ targetdoc="&uima_docs_ref;"
+ targetptr="ugr.ref.jcas.documentannotation_issues"/>.</para></warning>
+
+ <para>There are several versions of JCasGen. The basic version reads an XML descriptor
+ which contains a type system descriptor, and generates the corresponding Java Class
+ Models for those types. Variants exist for the Eclipse environment that allow merging the
+ newly generated Java source code with previously augmented versions; see <olink
+ targetdoc="&uima_docs_ref;"
+ targetptr="ugr.ref.jcas.augmenting_generated_code"/> for a discussion of how the
+ Java Class Models can be augmented by adding additional methods and fields.</para>
+
+ <para>Input to JCasGen needs to be mostly self-contained. In particular, any types that are
+ defined to depend on user-defined supertypes must have that supertype defined, if the
+ supertype is <literal>uima.tcas.Annotation</literal> or a subtype of it. Any features
+ referencing ranges which are subtypes of <literal>uima.cas.String</literal> must have those subtypes
+ included. If this is not followed, a warning message is given stating that the resulting
+ generation may be inaccurate.</para>
+
+ <para>JCasGen is typically invoked automatically when using the Component Descriptor
+ Editor (see <olink targetdoc="&uima_docs_tools;"
+ targetptr="ugr.tools.cde.auto_jcasgen"/>), but can also be run using a shell
+ script. These scripts can take 0, 1, or 2 arguments. The first argument is the location of
+ the file containing the input XML descriptor. The second argument specifies where the
+ generated Java source code should go. If it isn't given, JCasGen generates its
+ output into a subfolder called JCas (or sometimes JCasNew – see below), of the first
+ argument's path.</para>
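+
+ <para>As a sketch of the two-argument form (both paths are hypothetical placeholders, and the
+ exact script name and extension depend on your platform), an invocation might look like this:</para>
+
+ <programlisting>jcasgen /home/me/project/desc/MyTypeSystem.xml /home/me/project/src</programlisting>
+
+ <para>Here the first argument names the input XML descriptor and the second names the directory
+ that receives the generated Java source.</para>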
+
+ <para>If no arguments are given to JCasGen, then it launches a GUI to interact with the user
+ and ask for the same input. The GUI will remember the arguments you previously used.
+ Here's what it looks like:
+
+
+ <screenshot>
+ <mediaobject>
+ <imageobject>
+ <imagedata width="5.8in" format="JPG" fileref="&imgroot;image002.jpg"/>
+ </imageobject>
+ <textobject><phrase>JCasGen tool showing fields for input arguments</phrase>
+ </textobject>
+ </mediaobject>
+ </screenshot></para>
+
+ <para>When running with automatic merging of the generated Java source with previously
+ augmented versions, the output location is where the merge function obtains the source
+ for the merge operation.</para>
+
+ <para>As is customary for Java, the generated class source files are placed in the
+ appropriate subdirectory structure according to Java conventions that correspond to
+ the package (name space) name.</para>
+
+ <para>The Java classes must be compiled and the resulting class files included in the class
+ path of your application; you make these classes available for other annotator writers
+ using your types, perhaps packaged as an xxx.jar file. If the xxx.jar file is made to
+ contain only the Java Class Models for the CAS types, it can be reused by any users of these
+ types.</para>
+
+ <section id="ugr.tools.jcasgen.running_without_eclipse">
+ <title>Running stand-alone without Eclipse</title>
+
+ <para>Automatic merging of the generated Java source with previous versions is only
+ available when running with Eclipse. When JCasGen is run without Eclipse, no merging
+ is done. In this case,
+ the output is put in a folder called <quote>JCasNew</quote> unless overridden by
+ specifying a second argument.</para>
+
+ <para>The distribution includes a shell script/bat file to run the stand-alone version,
+ called jcasgen.</para>
+
+ </section>
+
+ <section id="ugr.tools.jcasgen.running_standalone_with_eclipse">
+ <title>Running stand-alone with Eclipse</title>
+
+ <para>If you have Eclipse and EMF (EMF = Eclipse Modeling Framework; both of these are
+ available from <ulink url="http://www.eclipse.org"/>) installed (version 3 or
+ later), JCasGen can merge the Java code it generates with previous versions, picking up
+ changes you might have inserted by hand. The output (and source of the merge input) is in a
+ folder <quote>JCas</quote> under the same path as the input XML file, unless
+ overridden by specifying a second argument.</para>
+
+ <para>You must install the UIMA plug-ins into Eclipse to enable this function.</para>
+
+ <para>The distribution includes a shell script/bat file to run the stand-alone with
+ Eclipse version, called jcasgen_merge. This works by starting Eclipse in
+ <quote>headless</quote> mode (no GUI) and invoking JCasGen within Eclipse. You will
+ need to set the ECLIPSE_HOME environment variable or modify the jcasgen_merge shell
+ script to specify where to find Eclipse. The version of Eclipse needed is 3 or higher,
+ with the EMF plug-in and the UIMA runtime plug-in installed. A temporary workspace is
+ used; the name/location of this is customizable in the shell script.</para>
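+
+ <para>As a minimal sketch for a Unix-like shell (the Eclipse location and the descriptor path
+ are hypothetical placeholders, and the script is assumed to accept the same arguments as the
+ stand-alone version), setting the variable and starting the merge might look like this:</para>
+
+ <programlisting>export ECLIPSE_HOME=/opt/eclipse
+jcasgen_merge /home/me/project/desc/MyTypeSystem.xml</programlisting>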
+
+ <para>Log and error messages are written to the UIMA log. This file is called uima.log, and
+ is located in the default working directory, which if not overridden, is the startup
+ directory of Eclipse.</para>
+
+ </section>
+
+ <section id="ugr.tools.jcasgen.running_within_eclipse">
+ <title>Running within Eclipse</title>
+
+ <para>There are two ways to run JCasGen within Eclipse. The first way is to configure an
+ Eclipse external tools launcher, and use it to run the stand-alone shell scripts, with
+ the arguments filled in. Here's a picture of a typical launcher configuration
+ screen (you get here by navigating from the top menu: Run –> External Tools
+ –> External tools...).
+
+
+ <screenshot>
+ <mediaobject>
+ <imageobject>
+ <imagedata width="5.8in" format="JPG" fileref="&imgroot;image004.jpg"/>
+ </imageobject>
+ <textobject><phrase>Running JCasGen within Eclipse using the external tool launcher</phrase>
+ </textobject>
+ </mediaobject>
+ </screenshot></para>
+
+ <para>The second, and more usual, way to run within Eclipse is to use the
+ Component Descriptor Editor (CDE) (see <olink targetdoc="&uima_docs_tools;"
+ targetptr="ugr.tools.cde"/>). This tool can be configured to automatically
+ launch JCasGen whenever the type system descriptor is modified. In this release, this
+ operation completely regenerates the files, even if just a small thing changed. For
+ very large type systems, you probably don't want to enable this all the time. The
+ configurator tool has an option to enable/disable this function.</para>
+ </section>
+
+</chapter>
\ No newline at end of file
Added: uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.installer.xml
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.installer.xml?rev=941741&view=auto
==============================================================================
--- uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.installer.xml (added)
+++ uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.installer.xml Thu May 6 14:04:08 2010
@@ -0,0 +1,110 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY imgroot "images/tools/tools.pear.installer/" >
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >
+%uimaents;
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<chapter id="ugr.tools.pear.installer">
+ <title>PEAR Installer User's Guide</title>
+
+ <para>PEAR (Processing Engine ARchive) is a new standard for packaging UIMA compliant
+ components. This standard defines several service elements that should be included in
+ the archive package to enable automated installation of the encapsulated UIMA
+ component. The major PEAR service element is an XML Installation Descriptor that
+ specifies installation platform, component attributes, custom installation
+ procedures and environment variables. </para>
+
+ <para>The installation of a UIMA compliant component includes 2 steps: (1) installation of
+ the component code and resources in a local file system, and (2) verification of the
+ serviceability of the installed component. Installation of the component code and
+ resources involves extracting component files from the archive (PEAR) package in a
+ designated directory and localizing file references in component descriptors and other
+ configuration files. Verification of the component serviceability is accomplished
+ with the help of standard UIMA mechanisms for instantiating analysis engines.
+
+
+ <screenshot>
+ <mediaobject>
+ <imageobject>
+ <imagedata width="5.8in" format="JPG" fileref="&imgroot;image002.jpg"/>
+ </imageobject>
+ <textobject><phrase>PEAR Installer GUI</phrase>
+ </textobject>
+ </mediaobject>
+ </screenshot></para>
+
+ <para>To launch the PEAR Installer, use the script in the UIMA bin directory:
+ <code>runPearInstaller.bat</code> or <code>runPearInstaller.sh</code>.</para>
+
+ <para>PEAR Installer is a simple GUI-based Java application that helps install UIMA
+ compliant components (analysis engines) from PEAR packages in a local file system. To
+ install a desired UIMA component the user needs to select the appropriate PEAR file in a
+ local file system and specify the installation directory (optional). If no installation
+ directory is specified, the PEAR file is installed to the current working directory.
+ By default, PEAR packages are not installed directly into the specified installation directory.
+ Instead, for each PEAR, a subdirectory named after the PEAR's ID is created, and the PEAR package is
+ installed there. If the PEAR installation directory already exists, the old content is automatically
+ deleted before the new content is installed. During the
+ component installation the user can read messages printed by the installation program in
+ the message area of the application window. If the installation fails, an appropriate error
+ message is printed to help identify and fix the problem.</para>
+
+ <para>After the desired UIMA component is successfully installed, the PEAR Installer
+ allows testing this component in the CAS Visual Debugger (CVD) application, which is
+ provided with the UIMA package. The CVD application will load your UIMA component using
+ its XML descriptor file. If the component is loaded successfully, you'll be able to
+ run it either with sample documents provided in the
+ <literal>&lt;UIMA_HOME&gt;/examples/data</literal> directory, or with any other
+ sample documents. See <olink targetdoc="&uima_docs_tools;"
+ targetptr="ugr.tools.cvd"/> for more information about the CVD application.
+ Running your component in the CVD application helps to make sure the component will run in
+ other UIMA applications. If the CVD application fails to load or run your component, or
+ throws an exception, you can find more information about the problem in the uima.log file
+ in the current working directory. The log file can be viewed with the CVD.</para>
+
+ <para>PEAR Installer creates a file named <literal>setenv.txt</literal> in the
+ <literal>&lt;component_root&gt;/metadata</literal> directory. This file contains
+ environment variables required to run your component in any UIMA application.
+ It also creates a PEAR descriptor (see also <olink targetdoc="&uima_docs_ref;"
+ targetptr="ugr.ref.pear.specifier"/>)
+ file named <literal>&lt;componentID&gt;_pear.xml</literal>
+ in the <literal>&lt;component_root&gt;</literal> directory that can be used to directly run
+ the installed pear file in your application.
+ </para>
+
+ <para>
+ The metadata/setenv.txt is not read by the UIMA framework anywhere.
+ It's there for use by non-UIMA application code if that code wants to set environment variables.
+ The metadata/setenv.txt is just a "convenience" file duplicating what's in the xml.
+ </para>
+
+ <para>
+ The setenv.txt file has 2 special variables: the CLASSPATH and the PATH.
+ The CLASSPATH is computed from any supplied CLASSPATH environment variable,
+ plus the jars that are configured in the PEAR structure, including subcomponents.
+ The PATH is computed similarly, from any supplied PATH environment variable plus
+ the "bin" subdirectory of the PEAR structure, if it exists.
+ </para>
+
+
+
+</chapter>
\ No newline at end of file
Added: uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.merger.xml
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.merger.xml?rev=941741&view=auto
==============================================================================
--- uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.merger.xml (added)
+++ uima/uimaj/branches/mavenAlign/uima-docbook-tools/src/docbook/tools.pear.merger.xml Thu May 6 14:04:08 2010
@@ -0,0 +1,164 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >
+%uimaents;
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<chapter id="ugr.tools.pear.merger">
+ <title>PEAR Merger User's Guide</title>
+
+ <para>The PEAR Merger utility takes two or more PEAR files and merges their contents,
+ creating a new PEAR which has, in turn, a new Aggregate analysis engine whose delegates are
+ the components from the original files being merged. It does this by (1) copying the
+ contents of the input components into the output component, placing each component into a
+ separate subdirectory, (2) generating a UIMA descriptor for the output Aggregate
+ analysis engine and (3) creating an output PEAR file that encapsulates the output
+ Aggregate.</para>
+
+ <para>The merge logic is quite simple, and is intended to work for simple cases. More complex
+ merging needs to be done by hand. Please see the Restrictions and Limitations section,
+ below.</para>
+
+ <para>To run the PearMerger command line utility you can use the runPearMerger scripts (.bat for Windows, and .sh for
+ Unix). The usage of the tooling is shown below:</para>
+
+ <para><programlisting>runPearMerger 1st_input_pear_file ... nth_input_pear_file
+ -n output_analysis_engine_name [-f output_pear_file ]</programlisting></para>
+
+ <para>The first group of parameters are the input PEAR files. No duplicates are allowed
+ here. The <literal>-n</literal> parameter is the name of the generated Aggregate
+ Analysis Engine. The optional <literal>-f</literal> parameter specifies the name of
+ the output file. If it is omitted, the output is written to
+ <literal>output_analysis_engine_name.pear</literal> in the current working directory.</para>
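+
+ <para>As an illustration of this syntax (all file names and the engine name are hypothetical),
+ merging two PEAR files into an aggregate called <literal>TokenizerTaggerAggregate</literal>
+ might look like this:</para>
+
+ <programlisting>runPearMerger tokenizer.pear tagger.pear -n TokenizerTaggerAggregate -f merged.pear</programlisting>
+
+ <para>Omitting <literal>-f merged.pear</literal> in this sketch would write the result to
+ <literal>TokenizerTaggerAggregate.pear</literal> in the current working directory.</para>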
+
+ <para>During the running of this tool, work files are written to a temporary directory
+ created in the user's home directory.</para>
+
+ <section id="ugr.tools.pear.merger.merge_details">
+ <title>Details of the merging process</title>
+
+ <para>The PEARs are merged using the following steps:</para>
+
+ <orderedlist><listitem><para>A temporary working directory is created for the
+ output aggregate component.</para></listitem>
+
+ <listitem><para>Each input PEAR file is extracted into a separate
+ 'input_component_name' folder under the working directory.</para>
+ </listitem>
+
+ <listitem><para>The extracted files are processed to adjust the
+ '$main_root' macros. This operation differs from the PEAR installation
+ operation, because it does not replace the macros with absolute paths.</para>
+ </listitem>
+
+ <listitem><para>The output PEAR directory structure is created, consisting of
+ 'metadata' and 'desc' folders under the working directory.</para>
+ </listitem>
+
+ <listitem><para>The UIMA AE descriptor for the output aggregate component is built
+ in the 'desc' folder. This aggregate descriptor refers to the input
+ delegate components, specifying 'fixed flow' based on the original
+ order of the input components in the command line. The aggregate descriptor's
+ 'capabilities' and
+ 'operational properties' sections are built based on the input
+ components' specifications.</para></listitem>
+
+ <listitem><para>A new PEAR installation descriptor is created in the
+ 'metadata' folder, referencing the new output aggregate descriptor
+ built in the previous step. </para></listitem>
+
+ <listitem><para>The content of the temporary output working directory is zipped to
+ create the output PEAR, and then the temporary working directory is deleted.
+ </para></listitem></orderedlist>
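+
+ <para>As an illustration of step 5, a generated aggregate descriptor might look roughly
+ like the following sketch. The names <literal>ComponentA</literal>,
+ <literal>ComponentB</literal> and <literal>MergedAggregate</literal>, as well as the
+ import locations, are hypothetical placeholders, and the descriptor that the utility
+ actually writes may differ in detail. The sketch is only meant to show the overall
+ shape: two delegates that run in command-line order under a fixed flow.</para>
+
+ <para><programlisting><![CDATA[<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
+  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
+  <primitive>false</primitive>
+  <delegateAnalysisEngineSpecifiers>
+    <!-- One delegate per input PEAR; keys and locations are assumed, not actual tool output -->
+    <delegateAnalysisEngine key="ComponentA">
+      <import location="../ComponentA/desc/ComponentA.xml"/>
+    </delegateAnalysisEngine>
+    <delegateAnalysisEngine key="ComponentB">
+      <import location="../ComponentB/desc/ComponentB.xml"/>
+    </delegateAnalysisEngine>
+  </delegateAnalysisEngineSpecifiers>
+  <analysisEngineMetaData>
+    <name>MergedAggregate</name>
+    <!-- Fixed flow: delegates run in the order given on the command line -->
+    <flowConstraints>
+      <fixedFlow>
+        <node>ComponentA</node>
+        <node>ComponentB</node>
+      </fixedFlow>
+    </flowConstraints>
+    <!-- capabilities and operationalProperties sections omitted from this sketch -->
+  </analysisEngineMetaData>
+</analysisEngineDescription>
+]]></programlisting></para>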
+
+ <para>The PEAR merger utility logs all the operations both to standard console output and
+ to a log file, pm.log, which is created in the current working directory.</para>
+
+ </section>
+
+ <section id="ugr.tools.pear.merger.testing_modifying_resulting_pear">
+ <title>Testing and Modifying the resulting PEAR</title>
+
+ <para>The output PEAR file can be installed and tested using the PEAR Installer. The
+ output aggregate component can also be tested by using the CVD or DocAnalyzer
+ tools.</para>
+
+ <para>The PEAR Installer creates Eclipse project files (.classpath and .project) in the
+ root directory of the installed PEAR, so the installed component can be imported into
+ the Eclipse IDE as an external project. Once the component is in the Eclipse IDE,
+ developers may use the Component Descriptor Editor and the PEAR Packager to modify the
+ output aggregate descriptor and re-package the component.</para>
+
+ </section>
+ <section id="ugr.tools.pear.merger.restrictions_limitations">
+ <title>Restrictions and Limitations</title>
+
+ <para>The PEAR Merger utility performs only basic merging operations and has the
+ limitations listed below. You can work around them by editing the resulting PEAR file
+ or the resulting Aggregate Descriptor.</para>
+
+ <orderedlist><listitem><para>The Merge operation specifies Fixed Flow sequencing
+ for the Aggregate.</para></listitem>
+
+ <listitem><para>The merged aggregate does not define any parameters, so the delegate
+ parameters cannot be overridden.</para></listitem>
+
+ <listitem><para>No External Resource definitions are generated for the
+ aggregate.</para></listitem>
+
+ <listitem><para>No Sofa Mappings are generated for the aggregate.</para>
+ </listitem>
+
+ <listitem><para>The utility does not check for name collisions. Collisions can
+ occur in the fully-qualified class names of the implementing Java classes, the names
+ of JAR files, the names of descriptor files, and the names of resource bindings or
+ resource file paths.</para></listitem>
+
+ <listitem><para>The input and output capabilities are generated based on merging the
+ capabilities from the components (removing duplicates). Multiple capability sets are
+ ignored: only the first set of each component is used, and only one set is created
+ for the generated Aggregate. There is no support for merging Sofa
+ specifications.</para></listitem>
+
+ <listitem><para>No Indexes or Type Priorities are created for the generated
+ Aggregate. No checking is done to see if the Indexes or Type Priorities of the
+ components conflict or are inconsistent.</para></listitem>
+
+ <listitem><para>You can only merge Analysis Engines and CAS Consumers. </para>
+ </listitem>
+
+ <listitem><para>Although the installation descriptors of the PEAR files being merged
+ can have specific XML elements describing Collection Reader and CAS Consumer
+ descriptors, the merge process ignores these elements and does not set them in the
+ installation descriptor that it creates. The output PEAR's new aggregate
+ references only each merged component's main PEAR descriptor element, as
+ identified by the PEAR element:
+
+ <programlisting><![CDATA[<SUBMITTED_COMPONENT>
+ <DESC>the_component.xml</DESC>...
+</SUBMITTED_COMPONENT>
+]]></programlisting></para>
+ </listitem></orderedlist>
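+
+ <para>Some of these limitations can be addressed by hand-editing the generated
+ aggregate descriptor. The fragments below are a minimal sketch of two such edits: an
+ aggregate parameter that overrides a delegate parameter (limitation 2) and a Sofa
+ mapping (limitation 4). All component keys, parameter names and Sofa names shown are
+ hypothetical and must be replaced with the values used by your own components. After
+ editing, the component needs to be re-packaged, for example with the PEAR
+ Packager.</para>
+
+ <para><programlisting><![CDATA[<!-- Inside <analysisEngineMetaData>: declare an aggregate parameter that
+     overrides a parameter of the delegate with key ComponentA (names assumed) -->
+<configurationParameters>
+  <configurationParameter>
+    <name>Threshold</name>
+    <type>Integer</type>
+    <multiValued>false</multiValued>
+    <mandatory>false</mandatory>
+    <overrides>
+      <parameter>ComponentA/Threshold</parameter>
+    </overrides>
+  </configurationParameter>
+</configurationParameters>
+
+<!-- Inside <analysisEngineDescription>, after <analysisEngineMetaData>:
+     map a Sofa name used by a delegate to an aggregate Sofa name (names assumed) -->
+<sofaMappings>
+  <sofaMapping>
+    <componentKey>ComponentB</componentKey>
+    <componentSofaName>inputText</componentSofaName>
+    <aggregateSofaName>translatedText</aggregateSofaName>
+  </sofaMapping>
+</sofaMappings>
+]]></programlisting></para>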
+
+ </section>
+
+</chapter>