Posted to commits@uima.apache.org by sc...@apache.org on 2010/05/06 16:01:57 UTC
svn commit: r941739 [1/5] - in /uima/uimaj/branches/mavenAlign/uima-docbook-references: ./ src/ src/docbook/ src/docbook/images/ src/docbook/images/references/ src/docbook/images/references/ref.cas/ src/docbook/images/references/ref.javadocs/ src/docbo...

Author: schor
Date: Thu May  6 14:01:56 2010
New Revision: 941739

URL: http://svn.apache.org/viewvc?rev=941739&view=rev
Log:
[UIMA-1757] split uima-docbooks, rework for docbkx

Added:
    uima/uimaj/branches/mavenAlign/uima-docbook-references/pom.xml
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.cas/
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.cas/image001.png   (with props)
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.javadocs/
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.javadocs/image002.jpg   (with props)
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.pear/
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.pear/image002.jpg   (with props)
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.xml.cpe_descriptor/
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.xml.cpe_descriptor/image002.png   (with props)
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.cas.xml
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.javadocs.xml
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.jcas.xml
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.pear.xml
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.xmi.xml
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.xml.component_descriptor.xml
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.xml.cpe_descriptor.xml
    uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/references.xml
Modified:
    uima/uimaj/branches/mavenAlign/uima-docbook-references/   (props changed)

Propchange: uima/uimaj/branches/mavenAlign/uima-docbook-references/
------------------------------------------------------------------------------
--- svn:ignore (added)
+++ svn:ignore Thu May  6 14:01:56 2010
@@ -0,0 +1,2 @@
+target
+.project

Added: uima/uimaj/branches/mavenAlign/uima-docbook-references/pom.xml
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-references/pom.xml?rev=941739&view=auto
==============================================================================
--- uima/uimaj/branches/mavenAlign/uima-docbook-references/pom.xml (added)
+++ uima/uimaj/branches/mavenAlign/uima-docbook-references/pom.xml Thu May  6 14:01:56 2010
@@ -0,0 +1,65 @@
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.    
+-->
+
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+	<modelVersion>4.0.0</modelVersion>
+  
+  <parent>
+    <groupId>org.apache.uima</groupId>
+    <artifactId>parent-pom-docbook</artifactId>
+    <version>1-SNAPSHOT</version>
+    <relativePath/>
+  </parent>
+  
+	<artifactId>uima-docbook-references</artifactId>
+	<packaging>pom</packaging>
+	<version>2.3.1-SNAPSHOT</version>
+	<name>Apache UIMA SDK Documentation - references</name>	
+  <url>${uimaWebsiteUrl}</url>
+ 
+   <!-- Special inheritance note
+       even though the <scm> element that follows is exactly the 
+       same as those in super poms, it cannot be inherited because 
+       there is some special code that computes the connection elements
+       from the chain of parent poms, if this is omitted. 
+       
+       Keeping this a bit factored allows cutting/pasting the <scm>
+       element, and just changing the following two properties -->  
+  <scm>
+    <connection>
+      scm:svn:http://svn.apache.org/repos/asf/uima/${uimaScmRoot}/trunk/${uimaScmProject}
+    </connection>
+    <developerConnection>
+      scm:svn:https://svn.apache.org/repos/asf/uima/${uimaScmRoot}/trunk/${uimaScmProject}
+    </developerConnection>
+    <url>
+      http://svn.apache.org/viewvc/uima/${uimaScmRoot}/trunk/${uimaScmProject}
+    </url>
+  </scm>
+  
+  <properties>
+    <uimaScmRoot>uimaj</uimaScmRoot>
+    <uimaScmProject>${project.artifactId}</uimaScmProject>
+    <!-- next property is the name of the top file under src/docbook without trailing .xml -->
+    <bookNameRoot>references</bookNameRoot>
+  </properties>
+ 	
+</project>
\ No newline at end of file

Added: uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.cas/image001.png
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.cas/image001.png?rev=941739&view=auto
==============================================================================
Binary file - no diff available.

Propchange: uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.cas/image001.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.javadocs/image002.jpg
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.javadocs/image002.jpg?rev=941739&view=auto
==============================================================================
Binary file - no diff available.

Propchange: uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.javadocs/image002.jpg
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.pear/image002.jpg
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.pear/image002.jpg?rev=941739&view=auto
==============================================================================
Binary file - no diff available.

Propchange: uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.pear/image002.jpg
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.xml.cpe_descriptor/image002.png
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.xml.cpe_descriptor/image002.png?rev=941739&view=auto
==============================================================================
Binary file - no diff available.

Propchange: uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/images/references/ref.xml.cpe_descriptor/image002.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.cas.xml
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.cas.xml?rev=941739&view=auto
==============================================================================
--- uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.cas.xml (added)
+++ uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.cas.xml Thu May  6 14:01:56 2010
@@ -0,0 +1,962 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY imgroot "images/references/ref.cas/" >
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >  
+%uimaents;
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<chapter id="ugr.ref.cas">
+  <title>CAS Reference</title>
+  
+  <para>The CAS (Common Analysis System) is the part of the Unstructured Information
+    Management Architecture (UIMA) that is concerned with creating and handling the data
+    that annotators manipulate.</para>
+  
+  <para>Java users typically use the JCas (Java interface to the CAS) when manipulating
+    objects in the CAS. This chapter describes an alternative interface to the CAS which
+    allows discovery and specification of types and features at run time. It is recommended
+    for use when the calling code cannot know ahead of time which type system it will be dealing
+    with.</para>
+    
+  <para>Use of the CAS as described here is also recommended (or necessary) when components add
+  to the definitions of types of other components.  This UIMA feature allows users to add features
+  to a type that was already defined elsewhere.  When this feature is used in conjunction with the
+  JCas, it can lead to problems with class loading.  This is because different JCas representations
+  of a single type are generated by the different components, and only one of them is loaded 
+  (unless you are using PEAR descriptors).  Note:
+  we do not recommend that you add features to pre-existing types.  A type should be defined in one
+  place only, and then there is no problem with using the JCas.  However, if you do use this feature,
+  do not use the JCas.  Similarly, if you distribute your components for inclusion in somebody else's
+  UIMA application, and you're not sure that they won't add features to your types, do not use the
+  JCas for the same reasons.
+  </para>
+  
+  <para>CASes passed to Annotator Components are either a base CAS or a regular CAS. Base CASes
+    are only passed to Multi-View components - they are like regular CASes, but do not have user
+    accessible indexes or Sofas. They are used by the component only for switching to other CAS
+    views, which are regular CASes.</para>
+  
+  <section id="ugr.ref.cas.javadocs">
+    <title>Javadocs</title>
+    
+    <para>The subdirectory <literal>docs/api</literal> contains the documentation
+      details of all the classes, methods, and constants for the APIs discussed here. Please
+      refer to this for details on the methods, classes and constants, specifically in the
+      packages <literal>org.apache.uima.cas.*</literal>.</para>
+  </section>
+  
+  <section id="ugr.ref.cas.overview">
+    <title>CAS Overview</title>
+    
+    <para>There are three<footnote><para>A fourth part, the Subject of Analysis,
+      is discussed in <olink targetdoc="&uima_docs_tutorial_guides;"
+        targetptr="ugr.tug.aas"/>.</para></footnote> main parts to the CAS: the type system, data creation and
+      manipulation, and indexing.  We will start with a brief
+      description of these components.</para>
+    <section id="ugr.ref.cas.type_system">
+      <title>The Type System</title>
+      
+      <para>The type system specifies what kind of data you will be able to manipulate in your
+        annotators. The type system defines two kinds of entities, types and features. Types
+        are arranged in a single inheritance tree and define the kinds of entities (objects)
+        you can manipulate in the CAS. Features optionally specify slots or fields within a
+        type. In Java terms, a CAS Type corresponds to a Java class, and its CAS
+        Features to fields within that class. A critical difference is that CAS types have no
+        methods; they are just data structures with named slots (features). These features can
+        have as values primitive things like integers, floating point numbers, and strings,
+        and they also can hold references to other instances of objects in the CAS. We call
+        instances of the data structures declared by the type system <quote>feature
+        structures</quote> (not to be confused with <quote>features</quote>). Feature
+        structures are similar to the many variants of record structures found in computer
+        science.<footnote><para> The name <quote>feature structure</quote> comes from
+        terminology used in linguistics.</para></footnote></para>
+      
+      <para>Each CAS Type defines a supertype; it is a subtype of that supertype. This means
+        that any features that the supertype defines are features of the subtype; in other
+        words, it inherits its supertype&apos;s features. Only single inheritance is
+        supported; a type&apos;s feature set is the union of all of the features in its
+        supertype hierarchy. There is a built-in type called uima.cas.TOP; this is the top,
+        root node of the inheritance tree. It defines no features.</para>
+      
+      <para>The values that can be stored in features are either built-in primitive values or
+        references to other feature structures. The primitive values are
+        <literal>boolean</literal>, <literal>byte</literal>,
+        <literal>short</literal> (16 bit integers), <literal>integer</literal> (32
+        bit), <literal>long</literal> (64 bit), <literal>float</literal> (32 bit),
+        <literal>double</literal> (64 bit floats) and strings; the official names of these
+        are <literal>uima.cas.Boolean</literal>, <literal>uima.cas.Byte</literal>,
+        <literal>uima.cas.Short</literal>, <literal>uima.cas.Integer</literal>,
+        <literal>uima.cas.Long</literal>, <literal>uima.cas.Float</literal>,
+        <literal>uima.cas.Double</literal> and <literal>uima.cas.String</literal>.
+        The strings are Java strings, and characters are Java characters.  Technically, this means
+        that characters are UTF-16 code units, which are not quite the same as Unicode characters:
+        a character outside the Basic Multilingual Plane occupies two code units.
+        This distinction should make no difference for almost all applications.
+        The CAS also defines other basic built-in types for arrays of these, plus arrays of
+        references to other objects, called <literal>uima.cas.IntegerArray</literal>,
+        <literal>uima.cas.FloatArray</literal>,
+        <literal>uima.cas.StringArray</literal>,
+        <literal>uima.cas.FSArray</literal>, etc.</para>
+      
+      <para>The CAS also defines a built-in type called
+        <literal>uima.tcas.Annotation</literal> which inherits from
+        <literal>uima.cas.AnnotationBase</literal> which in turn inherits from
+        <literal>uima.cas.TOP</literal>. There are two features defined by this type,
+        called <literal>begin</literal> and <literal>end</literal>, both of which are
+        integer valued.</para>
+      
+    </section>
+    
+    <section id="ugr.ref.cas.creating_accessing_manipulating_data">
+      <title>Creating, accessing and manipulating data</title>
+      <titleabbrev>Creating/Accessing/Changing data</titleabbrev>
+      
+      <para>
+        Creating and accessing data in the CAS requires knowledge about the types and features 
+        defined in the type system.  The idea is similar to other data access APIs, such as the XML
+        DOM or SAX APIs, or database access APIs such as JDBC.  Contrary to those APIs, however, the
+        CAS does not use the names of type system entities directly in the APIs.  Rather, you use
+        the type system to access type and feature entities by name, then use these entities in the
+        data manipulation APIs.  This can be compared to the Java reflection APIs: the type system
+        is comparable to the Java class loader, and the type and feature objects to the
+        <literal>java.lang.Class</literal> and <literal>java.lang.reflect.Field</literal> classes.
+      </para>
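+      <para>The reflection analogy can be made concrete with a minimal sketch in plain Java.
+        It does not use any UIMA API; the classes <literal>Token</literal> and
+        <literal>ReflectionAnalogy</literal> are hypothetical names invented for this
+        illustration. The point is the two-step pattern: resolve a type and a field by name
+        once, then use the resulting handles for data access.</para>
+      <programlisting><![CDATA[
```java
import java.lang.reflect.Field;

// Hypothetical stand-in for a data type with two slots; not part of UIMA.
class Token {
    public int begin;
    public int end;
}

class ReflectionAnalogy {
    // Resolves a class and one of its int fields by name, then uses the
    // Field handle (not the name) to write and read a value on a new instance.
    static int setAndGet(String className, String fieldName, int value) {
        try {
            Class<?> type = Class.forName(className);      // by-name lookup of the "type"
            Field feature = type.getField(fieldName);      // by-name lookup of the "feature"
            Object instance = type.getDeclaredConstructor().newInstance();
            feature.setInt(instance, value);               // data access goes through the handle
            return feature.getInt(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(setAndGet(Token.class.getName(), "end", 3));
    }
}
```
+]]></programlisting>
+      <para>In the CAS, the by-name lookups are made against the type system rather than the
+        class loader, and the handles are type and feature objects.</para>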
+      
+      <para>
+        Why does it have to be this complicated?  You wouldn&apos;t normally use reflection to create a
+        Java object, either.  As mentioned earlier, the JCas provides the more straightforward
+        method to manipulate CAS data.  The CAS access methods described here need only be used for
+        generic types of applications that need to be able to handle any kind of data (e.g., generic
+        tooling) or when the JCas may not be used for other reasons.  The generic kinds of applications
+        are exactly the ones where you would use the reflection API in Java as well.
+      </para>
+      
+    </section>
+    
+    <section id="ugr.ref.cas.creating_using_indexes">
+      <title>Creating and using indexes</title>
+      
+      <para>Each view of a CAS provides a set of indexes for that view. Instances of feature
+        structures can be added to a view&apos;s indexes. Indexes are the only way for
+        an annotator to locate data that another annotator has created: it retrieves the feature
+        structures either through an index or through the method
+        <literal>getAllIndexedFS</literal> of the <literal>FSIndexRepository</literal> object.
+        If you want the data you create to be visible to other annotators, you must explicitly
+        call methods which add it to the indexes; that is, you must index it.</para>
+      
+      <para>Indexes are named and are associated with a CAS Type; they are used to index
+        instances of that CAS type (including instances of that type&apos;s subtypes). If
+        you are using multiple views (see <olink
+          targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.mvs"/>),
+        each view contains a separate instantiation of all of the indexes.
+        To access an index, you
+        minimally need to know its name. A CAS view provides an index repository which you can
+        query for indexes for that view. Once you have a handle to an index, you can get
+        information about the feature structures in the index, the size of the index, as well
+        as an iterator over the feature structures.</para>
+      
+      <para>Indexes are defined in the XML descriptor metadata for the application. Each CAS
+        View has its own, separate instantiation of indexes based on these definitions, 
+        kept in the view's index repository. When you obtain an index, it is always from a
+        particular CAS view. When you index an item, it is always added to all indexes where it
+        belongs, within just one repository. You can specify different repositories
+        (associated with different CAS views) to use; a given Feature Structure instance 
+        may be indexed in more
+        than one CAS View.</para>
+      
+      <para>Iterators allow you to enumerate the feature structures in an index.  FS iterators
+        provide two kinds of APIs: the regular Java iterator API, and a specific FS iterator API
+        where the usual Java iterator APIs (<literal>hasNext()</literal> and <literal>next()</literal>)
+        are replaced by <literal>isValid()</literal>, <literal>moveToNext()</literal> (which does
+        not return an element) and <literal>get()</literal>.  Which API style you use is up to you,
+        but we do not recommend mixing the styles as the results are sometimes unexpected.  If you
+        just want to iterate over an index from start to finish, either style is equally appropriate.
+        If you also use <literal>moveTo(FeatureStructure fs)</literal> and 
+        <literal>moveToPrevious()</literal>, it is better to use the special FS iterator style.
+      </para>
+      <note><para>The reason not to mix these styles is that you might expect
+        next() followed by moveToPrevious() to always work.  It does not, because
+        next() returns the "current" element and then advances to the next position, which might be
+        beyond the last element.  At that point, the iterator becomes "invalid", and by the iterator
+        contract, moveToNext and moveToPrevious are not allowed on "invalid" iterators;
+        when an iterator is not valid, all bets are off.  You can, however, reset an invalid
+        iterator by calling moveToFirst(), moveToLast(), or moveTo(FS) on it.</para></note>
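+      <para>The pitfall described in the note can be reproduced with a toy model. The sketch
+        below is plain Java and is not the UIMA implementation; <literal>ToyFsIterator</literal>
+        and <literal>IteratorStyleDemo</literal> are hypothetical names. As an assumption of the
+        model, calling a move method on an invalid iterator throws an exception; the real
+        framework only says that such calls are not allowed.</para>
+      <programlisting><![CDATA[
```java
import java.util.Arrays;
import java.util.List;

// Toy model of the two iterator styles over a fixed list; NOT the UIMA implementation.
class ToyFsIterator {
    private final List<String> items;
    private int pos = 0;

    ToyFsIterator(List<String> items) { this.items = items; }

    // FS iterator style.
    boolean isValid() { return pos >= 0 && pos < items.size(); }
    String get() { requireValid(); return items.get(pos); }
    void moveToNext() { requireValid(); pos++; }
    void moveToPrevious() { requireValid(); pos--; }
    void moveToLast() { pos = items.size() - 1; }

    // Java iterator style: returns the current element, then advances.
    boolean hasNext() { return isValid(); }
    String next() { String current = get(); pos++; return current; }

    // Model assumption: moving or reading while invalid is rejected with an exception.
    private void requireValid() {
        if (!isValid()) throw new IllegalStateException("iterator is not valid");
    }
}

class IteratorStyleDemo {
    // True if moveToPrevious() is rejected right after next() returned the last element.
    static boolean mixingFailsAtEnd() {
        ToyFsIterator it = new ToyFsIterator(Arrays.asList("a", "b"));
        it.next();
        it.next(); // returns "b" and steps past the end, so the iterator is now invalid
        try {
            it.moveToPrevious();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // After running off the end, moveToLast() makes the iterator valid again.
    static String recoverWithMoveToLast() {
        ToyFsIterator it = new ToyFsIterator(Arrays.asList("a", "b"));
        it.next();
        it.next();
        it.moveToLast();
        return it.get();
    }

    public static void main(String[] args) {
        System.out.println(mixingFailsAtEnd());
        System.out.println(recoverWithMoveToLast());
    }
}
```
+]]></programlisting>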
+      
+      <para>Indexes are created by specifying them in the annotator&apos;s or
+        aggregate&apos;s resource descriptor. An index specification includes its name,
+        the CAS type being indexed, the kind of index it is, and an (optional) ordering
+        relation on the feature structures to be indexed. At startup time, all index
+        specifications are combined; duplicate definitions (having the same name) are
+        allowed only if their definitions are the same. </para>
+      
+      <para>Feature structure instances need to be explicitly added to the index repository by a
+        method call. Feature structures that are not indexed will not be visible to other
+        annotators (unless they can be reached by following a reference from a feature of
+        another feature structure which is indexed, or through a chain of such references).</para>
+      
+      <para>The framework defines an unnamed bag index which indexes all types.  The
+      only access provided for this index is the getAllIndexedFS(type) method on the
+        index repository, which returns an iterator over all indexed instances of the
+        specified type (including its subtypes) for that CAS View.
+      </para>
+      
+      <para>The framework defines one standard, built-in annotation index, called
+        AnnotationIndex, which indexes the <literal>uima.tcas.Annotation</literal>
+        type: all feature structures of type <literal>uima.tcas.Annotation</literal> or
+        its subtypes are automatically indexed with this built-in index.</para>
+      
+      <para>The ordering relation used by this index is to first order by the value of the
+        <quote>begin</quote> feature (in ascending order) and then by the value of the
+        <quote>end</quote> feature (in descending order). This ordering ensures that
+        longer annotations starting at the same spot come before shorter ones. For Subjects
+        of Analysis other than Text, this may not be an appropriate index.</para>
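+      <para>The begin/end ordering can be sketched as an ordinary Java comparator. This is an
+        illustration only, not how the framework implements the index; <literal>Span</literal>
+        and <literal>AnnotationOrderDemo</literal> are hypothetical names, and the sketch
+        models nothing beyond the two sort keys described here.</para>
+      <programlisting><![CDATA[
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical value class standing in for an annotation's begin and end features.
class Span {
    final int begin;
    final int end;

    Span(int begin, int end) { this.begin = begin; this.end = end; }

    @Override public String toString() { return "[" + begin + "," + end + ")"; }
}

class AnnotationOrderDemo {
    // begin ascending, then end descending: at the same begin, longer spans come first.
    static final Comparator<Span> ORDER =
        Comparator.comparingInt((Span s) -> s.begin)
                  .thenComparing(Comparator.comparingInt((Span s) -> s.end).reversed());

    // Returns a sorted copy and leaves the input untouched.
    static List<Span> sorted(List<Span> spans) {
        List<Span> copy = new ArrayList<>(spans);
        copy.sort(ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<Span> spans = Arrays.asList(new Span(0, 3), new Span(4, 13), new Span(0, 13));
        System.out.println(sorted(spans));
    }
}
```
+]]></programlisting>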
+      
+    </section>
+  </section>
+  
+  <section id="ugr.ref.cas.builtin_types">
+    <title>Built-in CAS Types</title>
+    
+    <para>The CAS has two kinds of built-in types &ndash; primitive and non-primitive. The
+      primitive types are:
+      
+      <itemizedlist spacing="compact">
+        <listitem><para>uima.cas.Boolean</para></listitem>
+        <listitem><para>uima.cas.Byte</para></listitem>
+        <listitem><para>uima.cas.Short</para></listitem>
+        <listitem><para>uima.cas.Integer</para></listitem>
+        <listitem><para>uima.cas.Long</para></listitem>
+        <listitem><para>uima.cas.Float</para></listitem>
+        <listitem><para>uima.cas.Double</para></listitem>
+        <listitem><para>uima.cas.String</para></listitem>
+      </itemizedlist></para>
+    
+    <para><literal>Byte</literal>, <literal>Short</literal>, <literal>Integer</literal>, and
+      <literal>Long</literal> are all signed integer types, of length 8, 16, 32, and 64 bits
+      respectively. The <literal>Float</literal> type is 32 bit floating point, and the
+      <literal>Double</literal> type is 64 bit floating point. The
+      <literal>String</literal> type can be sub-typed to create sets of allowed values; see
+        <olink targetdoc="&uima_docs_ref;"
+        targetptr="ugr.ref.xml.component_descriptor.type_system.string_subtypes"/>.
+      These subtypes can be used to specify the range of a String-valued feature. They act like
+      Strings, but have additional checking to ensure that any value set into them
+      is one of the allowed values. Note that the other primitive types cannot be used
+      as a supertype for another type definition; only
+      <literal>uima.cas.String</literal> can be sub-typed.</para>
+    
+    <para>The non-primitive types exist in a type hierarchy; the top of the hierarchy is the
+      type <literal>uima.cas.TOP</literal>. All other non-primitive types inherit from
+      some supertype.</para>
+    
+    <para>There are 9 built-in array types. These arrays have a size specified when they are
+      created; the size is fixed at creation time. They are named:
+      
+      <itemizedlist spacing="compact">
+        <listitem><para>uima.cas.BooleanArray</para></listitem>
+        <listitem><para>uima.cas.ByteArray</para></listitem>
+        <listitem><para>uima.cas.ShortArray</para></listitem>
+        <listitem><para>uima.cas.IntegerArray</para></listitem>
+        <listitem><para>uima.cas.LongArray</para></listitem>
+        <listitem><para>uima.cas.FloatArray</para></listitem>
+        <listitem><para>uima.cas.DoubleArray</para></listitem>
+        <listitem><para>uima.cas.StringArray</para></listitem>
+        <listitem><para>uima.cas.FSArray</para></listitem>
+      </itemizedlist></para>
+    
+    <para>The <literal>uima.cas.FSArray</literal> type is an array whose elements are
+      arbitrary other feature structures (instances of non-primitive types).</para>
+    
+    <para>There are 3 built-in types associated with the artifact being analyzed:
+      
+      <itemizedlist spacing="compact">
+        <listitem><para>uima.cas.AnnotationBase</para></listitem>
+        <listitem><para>uima.tcas.Annotation</para></listitem>
+        <listitem><para>uima.tcas.DocumentAnnotation</para></listitem>
+      </itemizedlist></para>
+    
+    <para>The <literal>AnnotationBase</literal> type defines one system-used feature
+      which specifies for an annotation the subject of analysis (Sofa) to which it refers. The
+      Annotation type extends from this and defines 2 features, taking
+      <literal>uima.cas.Integer</literal> values, called <literal>begin</literal>
+      and <literal>end</literal>. The <literal>begin</literal> feature typically
+      identifies the start of a span of text the annotation covers; the
+      <literal>end</literal> feature identifies the end. The values refer to character
+      offsets; the starting index is 0. An annotation of the word <quote>CAS</quote> in a text
+      <quote>CAS Reference</quote> would have a start index of 0, and an end index of 3; the
+      difference between end and start is the length of the span the annotation refers
+      to.</para>
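+    <para>The offset convention matches that of Java substrings, as the following minimal
+      sketch shows. It assumes the subject of analysis is a Java <literal>String</literal>;
+      <literal>OffsetDemo</literal> and <literal>coveredText</literal> are hypothetical names
+      used only for this illustration.</para>
+    <programlisting><![CDATA[
```java
class OffsetDemo {
    // Returns the text covered by a pair of character offsets:
    // begin is inclusive, end is exclusive, so end - begin is the length.
    static String coveredText(String document, int begin, int end) {
        return document.substring(begin, end);
    }

    public static void main(String[] args) {
        String document = "CAS Reference";
        System.out.println(coveredText(document, 0, 3));
    }
}
```
+]]></programlisting>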
+    
+    <para>Annotations are always with respect to some Sofa (Subject of Analysis &ndash; see
+        <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.aas"/>).</para>
+    <note><para>Artifacts which are not text strings may have a different interpretation of
+    the meaning of begin and end, or may define their own kind of annotation, extending from
+    <literal>AnnotationBase</literal>. </para></note>
+    
+    <para id="ugr.ref.cas.document_annotation">The <literal>DocumentAnnotation</literal> type has one special instance. It is
+      a subtype of the Annotation type, and the built-in definition defines one feature,
+      <literal>language</literal>, which is a string indicating the language of the
+      document in the CAS. The value of this language feature is used by the system to control
+      flow among annotators when the <quote>CapabilityLanguageFlow</quote> mode is used,
+      allowing the flow to skip over annotators that don&apos;t process particular
+      languages. Users may extend this type by adding additional features to it, using the XML
+      Descriptor element for defining a type.</para>
+      
+    <note><para>
+      We do <emphasis>not</emphasis> recommend extending the <literal>DocumentAnnotation</literal>
+      type.  If you do, you must <emphasis>not</emphasis> use the JCas, for the reasons stated
+      earlier.
+    </para></note>
+    
+    <para>Each CAS view has a different associated instance of the
+      <literal>DocumentAnnotation</literal> type.  On the CAS, use 
+      <literal>getDocumentAnnotation()</literal> to access the
+      <literal>DocumentAnnotation</literal>.</para>
+    
+    <para>There are also built-in types supporting linked lists, similar to the ones available in
+    Java and other programming languages. Their use is
+      constrained by the usual properties of linked lists: not very space efficient, no (efficient)
+      random access, but an easy choice if you don't know how long your list will be ahead of time. The
+      implementation is type specific; there are different list building objects for the
+      primitive types Float, Integer and String, plus one for general feature structures. Here are the type names:
+      <itemizedlist spacing="compact">
+        <listitem><para>uima.cas.FloatList</para></listitem>
+        <listitem><para>uima.cas.IntegerList</para></listitem>
+        <listitem><para>uima.cas.StringList</para></listitem>
+        <listitem><para>uima.cas.FSList</para>
+          <para></para></listitem>
+        <listitem><para>uima.cas.EmptyFloatList</para></listitem>
+        <listitem><para>uima.cas.EmptyIntegerList</para></listitem>
+        <listitem><para>uima.cas.EmptyStringList</para></listitem>
+        <listitem><para>uima.cas.EmptyFSList</para>
+          <para></para></listitem>
+        <listitem><para>uima.cas.NonEmptyFloatList</para></listitem>
+        <listitem><para>uima.cas.NonEmptyIntegerList</para></listitem>
+        <listitem><para>uima.cas.NonEmptyStringList</para></listitem>
+        <listitem><para>uima.cas.NonEmptyFSList</para></listitem>
+        
+      </itemizedlist></para>
+    
+    <para>For each of the element kinds <literal>Float</literal>,
+      <literal>Integer</literal>, <literal>String</literal> and
+      <literal>FeatureStructure</literal>, there is a base type, for instance,
+      <literal>uima.cas.FloatList</literal>. For each of these, there are two subtypes,
+      corresponding to a non-empty element, and a marker that serves to indicate the end of the
+      list, or an empty list. The non-empty types define two features &ndash;
+      <literal>head</literal> and <literal>tail</literal>. The head feature holds the
+      particular value for that part of the list. The tail refers to the next list object
+      (either a non-empty one or the empty version to indicate the end of the list).</para>
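+    
+    <para>The head/tail structure described above can be sketched in plain Java. This is only a
+      model for illustration and does not use the UIMA API: the names <literal>IntListModel</literal>,
+      <literal>Empty</literal> and <literal>NonEmpty</literal> are hypothetical stand-ins for
+      <literal>uima.cas.IntegerList</literal>, <literal>uima.cas.EmptyIntegerList</literal> and
+      <literal>uima.cas.NonEmptyIntegerList</literal>.</para>

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of the head/tail list types; not the UIMA API. */
final class IntListModel {
  /** Base type, playing the role of uima.cas.IntegerList. */
  interface IntList {}

  /** End-of-list marker, playing the role of uima.cas.EmptyIntegerList. */
  static final class Empty implements IntList {}

  /** List cell, playing the role of uima.cas.NonEmptyIntegerList. */
  static final class NonEmpty implements IntList {
    final int head;      // the value held by this cell
    final IntList tail;  // the next cell, or an Empty marker

    NonEmpty(int head, IntList tail) {
      this.head = head;
      this.tail = tail;
    }
  }

  /** Builds a list back to front, starting from the empty marker. */
  static IntList of(int... values) {
    IntList list = new Empty();
    for (int i = values.length - 1; i >= 0; i--) {
      list = new NonEmpty(values[i], list);
    }
    return list;
  }

  /** Follows the tail references until the empty marker is reached. */
  static List<Integer> toJavaList(IntList list) {
    List<Integer> out = new ArrayList<>();
    while (list instanceof NonEmpty) {
      NonEmpty cell = (NonEmpty) list;
      out.add(cell.head);
      list = cell.tail;
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(toJavaList(of(1, 2, 3)));
  }
}
```

+    <para>Reaching the n-th element requires following n tail references, which is the
+      lack of efficient random access mentioned earlier.</para>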
+    
+    <para>There are no other built-in types. Users are free to define their own type systems,
+      building upon these types.</para>
+    
+  </section>
+  
+  <section id="ugr.ref.cas.accessing_the_type_system">
+    <title>Accessing the type system</title>
+    
+    <para>
+      During annotator processing, or outside an annotator, access the type system by calling 
+      <literal>CAS.getTypeSystem()</literal>.
+    </para>
+    
+    <para>However, CAS annotators implement an additional method,
+      <literal>typeSystemInit()</literal>, which is called by the UIMA framework before the
+      annotator&apos;s process method. This method, implemented by the annotator writer,
+      is passed a reference to the CAS&apos;s type system metadata. The method typically uses
+      the type system APIs to obtain type and feature objects corresponding to all the types
+      and features the annotator will be using in its process method. This initialization
+      step should not be done during an annotator&apos;s initialize method since the type
+      system can change after the initialize method is called; it should not be done during the
+      process method, since this is presumably work that is identical for each incoming
+      document, and so should be performed only when the type system changes (which will be a
+      rare event). The UIMA framework guarantees it will call the <literal>typeSystemInit
+      </literal>method of an annotator whenever the type system changes, before calling the
+      annotator&apos;s <literal>process()</literal> method.</para>
+    
+    <para>The initialization done by <literal>typeSystemInit()</literal> is done by the
+      UIMA framework when you use the JCas APIs; you only need to provide a
+      <literal>typeSystemInit()</literal> method, as described here, when you are not using
+      the JCas approach.</para>
+    
+    <section id="ugr.ref.cas.type_system.printer_example">
+      <title>TypeSystemPrinter example</title>
+      
+      <para>Here is a code fragment that, given a CAS Type System, will print a list of all
+        types.</para>
+      
+      
+      <programlisting>// Get all type names from the type system
+// and print them to stdout.
+private void listTypes1(TypeSystem ts) {
+  // Get an iterator over types
+  Iterator typeIterator = ts.getTypeIterator();
+  Type t;
+  System.out.println("Types in the type system:");
+  while (typeIterator.hasNext()) {
+    // Retrieve a type...
+    t = (Type) typeIterator.next();
+    // ...and print its name.
+    System.out.println(t.getName());
+  }
+  System.out.println();
+}</programlisting>
+      
+      <para>This method is passed the type system as a parameter.  From the type system, we can 
+        get an iterator
+        over all known types. If we run this against a CAS created with no additional
+        user-defined types, we should see something like this on the console:</para>
+      
+      <programlisting>Types in the type system: 
+uima.cas.Boolean 
+uima.cas.Byte
+uima.cas.Short 
+uima.cas.Integer 
+uima.cas.Long 
+uima.cas.ArrayBase 
+...
+        </programlisting>
+      
+      <para>If the type system had user-defined types these would show up too. Note that some
+        of these types are not directly creatable &ndash; they are types used by the framework
+        in the type hierarchy (e.g. uima.cas.ArrayBase).</para>
+      
+      <para>CAS type names include a name-space prefix. The components of a type name are
+        separated by the dot (.). A type name component must start with a Unicode letter,
+        followed by an arbitrary sequence of letters, digits and the underscore (_). By
+        convention, the last component of a type name starts with an uppercase letter, the
+        rest start with a lowercase letter.</para>
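+      
+      <para>The naming rules above can be checked mechanically. The following is a minimal sketch
+        in plain Java, assuming that <quote>letter</quote> and <quote>digit</quote> correspond to the
+        Unicode categories L and Nd; the class <literal>TypeNameCheck</literal> and its methods are
+        hypothetical helpers and are not part of the UIMA API.</para>

```java
import java.util.regex.Pattern;

/** Hypothetical helper that checks the type-name rules; not part of UIMA. */
final class TypeNameCheck {
  // One component: a Unicode letter, then letters, digits, or underscores.
  private static final String COMPONENT = "\\p{L}[\\p{L}\\p{Nd}_]*";

  // A full name: one or more components separated by dots.
  private static final Pattern TYPE_NAME =
      Pattern.compile(COMPONENT + "(?:\\." + COMPONENT + ")*");

  /** True if every dot-separated component obeys the character rules. */
  static boolean isWellFormed(String name) {
    return TYPE_NAME.matcher(name).matches();
  }

  /** True if, in addition, only the last component starts with an uppercase letter. */
  static boolean followsCaseConvention(String name) {
    if (!isWellFormed(name)) {
      return false;
    }
    String[] parts = name.split("\\.");
    for (int i = 0; i < parts.length; i++) {
      boolean startsUpper = Character.isUpperCase(parts[i].charAt(0));
      boolean isLast = (i == parts.length - 1);
      if (startsUpper != isLast) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(isWellFormed("uima.cas.Integer"));
    System.out.println(isWellFormed("com. xyz.proj.Person"));
    System.out.println(followsCaseConvention("uima.Cas.Integer"));
  }
}
```

+      <para>A check like this catches a stray space or an empty component in a type name
+        constant before the name is ever looked up in a type system.</para>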
+      
+      <para>Listing the type names is mildly useful, but it would be even better if we could see
+        the inheritance relation between the types. The following code prints the
+        inheritance tree in indented format.</para>
+      
+      
+      <programlisting>private static final int INDENT = 2;
+private void listTypes2(TypeSystem ts) {
+  // Get the root of the inheritance tree.
+  Type top = ts.getTopType();
+  // Recursively print the tree.
+  printInheritanceTree(ts, top, 0);
+}
+
+private void printInheritanceTree(TypeSystem ts, Type type, int level) {
+  indent(level); // Print indentation.
+  System.out.println(type.getName());
+  // Get a vector of the immediate subtypes.
+  Vector subTypes =
+    ts.getDirectlySubsumedTypes(type);
+  ++level; // Increase the indentation level.
+  for (int i = 0; i &lt; subTypes.size(); i++) {
+    // Print the subtypes.
+    printInheritanceTree(ts, (Type) subTypes.get(i), level);
+  }
+}
+  
+// A simple, inefficient indenter
+private void indent(int level) {
+  int spaces = level * INDENT;
+  for (int i = 0; i &lt; spaces; i++) {
+    System.out.print(" ");
+  }
+}</programlisting>
+      
+      <para>This example shows that you can traverse the type hierarchy by starting at the top
+        with <literal>TypeSystem.getTopType()</literal> and by retrieving subtypes with
+        <literal>TypeSystem.getDirectlySubsumedTypes()</literal>.</para>
+      
+      <para>The type system APIs also let you access the features of a type, together with
+        the allowed value type of each feature. Here is sample code which prints out all the
+        features of all the types, together with the allowed value types (the feature
+        <quote>range</quote>). Each feature has a <quote>domain</quote> which is the type
+        where it is defined, as well as a <quote>range</quote>.
+        
+        
+        <programlisting>private void listFeatures2(TypeSystem ts) {
+  Iterator featureIterator = ts.getFeatures();
+  Feature f;
+  System.out.println("Features in the type system:");
+  while (featureIterator.hasNext()) {
+    f = (Feature) featureIterator.next();
+    System.out.println(
+      f.getShortName() + ": " +
+      f.getDomain() + " -&gt; " + f.getRange());
+  }
+  System.out.println();
+}</programlisting></para>
+      
+      <para>We can ask a feature object for its domain (the type it is defined on) and its range
+        (the type of the value of the feature). The terminology derives from the fact that
+        features can be viewed as functions on subspaces of the object space.</para>
+      
+    </section>
+    
+    <section id="ugr.ref.cas.cas_apis_create_modify_feature_structures">
+      <title>Using the CAS APIs to create and modify feature structures</title>
+      <titleabbrev>Using CAS APIs: Feature Structures</titleabbrev>
+      
+      <para>Assume a type system declaration that defines two types: Entity and Person.
+        Entity has no features defined within it but inherits from uima.tcas.Annotation
+        &ndash; so it has the begin and end features. Person is, in turn, a subtype of Entity,
+        and adds firstName and lastName features. CAS type systems are declaratively
+        specified using XML; the format of this XML is described in <olink
+          targetdoc="&uima_docs_ref;"
+          targetptr="ugr.ref.xml.component_descriptor.type_system"/>.
+        
+        
+        <programlisting><![CDATA[<!-- Type System Definition -->
+<typeSystemDescription>
+  <types>
+    <typeDescription>
+      <name>com.xyz.proj.Entity</name>
+      <description />
+      <supertypeName>uima.tcas.Annotation</supertypeName>
+    </typeDescription>
+    <typeDescription>
+      <name>com.xyz.proj.Person</name>
+      <description />
+      <supertypeName>com.xyz.proj.Entity</supertypeName>
+      <features>
+        <featureDescription>
+          <name>firstName</name>
+          <description />
+          <rangeTypeName>uima.cas.String</rangeTypeName>
+        </featureDescription>
+        <featureDescription>
+          <name>lastName</name>
+          <description />
+          <rangeTypeName>uima.cas.String</rangeTypeName>
+        </featureDescription>
+      </features>
+    </typeDescription>
+  </types>
+</typeSystemDescription>]]></programlisting></para>
+      
+  <para>
+    To be able to access types and features, we need to know their names.  The CAS interface defines
+    constants that hold the names of built-in types and features, such as
+    <literal>CAS.TYPE_NAME_INTEGER</literal>.  It is good programming practice to create such
+    constants for the types and features you define, for your own use as well as for others who will
+    be using your annotators.
+  </para>
+      
+      
+      <programlisting>/** Entity type name constant. */
+public static final String ENTITY_TYPE_NAME = "com.xyz.proj.Entity";
+  
+/** Person type name constant. */
+public static final String PERSON_TYPE_NAME = "com.xyz.proj.Person";
+
+/** First name feature name constant. */
+public static final String FIRST_NAME_FEAT_NAME = "firstName";
+
+/** Last name feature name constant. */
+public static final String LAST_NAME_FEAT_NAME = "lastName";</programlisting>
+      
+      <para>Next we define type and feature member variables; these will hold the values of the
+        type and feature objects needed by the CAS APIs, to be assigned during
+        <literal>typeSystemInit()</literal>.</para>
+      
+      
+      <programlisting>// Type system object variables
+private TypeSystem typeSystem;
+private Type entityType;
+private Type personType;
+private Feature firstNameFeature;
+private Feature lastNameFeature;
+private Type stringType;</programlisting>
+      
+      <para>The type system does not throw an exception if we ask for something that is
+        not known; it simply returns null. The code below therefore checks for this and throws a proper
+        exception.  We require all these types and features to be defined for the annotator to
+        work.  One might imagine situations where certain computations are predicated on some type
+        or feature being defined in the type system, but that is not the case here.</para>
+      
+      
+      <programlisting>// Get a type object corresponding to a name.
+// If it doesn&apos;t exist, throw an exception.
+private Type initType(String typeName)
+  throws AnnotatorInitializationException {
+  Type type = this.typeSystem.getType(typeName);
+  if (type == null) {
+    throw new AnnotatorInitializationException(
+      AnnotatorInitializationException.TYPE_NOT_FOUND,
+      new Object[] { this.getClass().getName(), typeName });
+  }
+  return type;
+}
+
+// We add similar code for retrieving feature objects.
+// Get a feature object from a name and a type object.
+// If it doesn&apos;t exist, throw an exception.
+private Feature initFeature(String featName, Type type)
+  throws AnnotatorInitializationException {
+  Feature feat = type.getFeatureByBaseName(featName);
+  if (feat == null) {
+    throw new AnnotatorInitializationException(
+      AnnotatorInitializationException.FEATURE_NOT_FOUND,
+      new Object[] { this.getClass().getName(), featName });
+  }
+  return feat;
+}</programlisting>
+      
+      <para>Using these two functions, code for initializing the type system described
+        above would be:
+        
+        
+        <programlisting>public void typeSystemInit(TypeSystem aTypeSystem)
+    throws AnalysisEngineProcessException {
+  this.typeSystem = aTypeSystem;
+  // Set type system member variables.
+  this.entityType = initType(ENTITY_TYPE_NAME);
+  this.personType = initType(PERSON_TYPE_NAME);
+  this.firstNameFeature =
+    initFeature(FIRST_NAME_FEAT_NAME, personType);
+  this.lastNameFeature =
+    initFeature(LAST_NAME_FEAT_NAME, personType);
+  this.stringType = initType(CAS.TYPE_NAME_STRING);
+}</programlisting></para>
+      
+      <para>Note that we initialize the string type by using a type name constant from the
+        CAS.</para>
+      
+    </section>
+  </section>
+  
+  <section id="ugr.ref.cas.creating_feature_structures">
+    <title>Creating feature structures</title>
+    
+    <para>To create feature structures in JCas, we use the Java <quote>new</quote>
+      operator. In the CAS, we use one of several different API methods on the CAS object,
+      depending on which of the 10 basic kinds of feature structures we are creating (a plain
+      feature structure, or an instance of the built-in primitive type arrays or FSArray).
+      There is also a method to create an instance of a
+      <literal>uima.tcas.Annotation</literal>, setting the begin and end
+      values.</para>
+    
+    <para>Once a feature structure is created, it needs to be added to the CAS indexes (unless
+      it will be accessed via some reference from another accessible feature structure). The
+      CAS provides an API for this. Assuming aCAS holds a reference to a CAS, and token holds a
+      reference to a newly created feature structure, here&apos;s the code to add that
+      feature structure to all the relevant CAS indexes:</para>
+    
+    
+    <programlisting>    // Add the token to the index repository.
+    aCAS.addFsToIndexes(token);</programlisting>
+    
+    <para>There is also a corresponding <literal>removeFsFromIndexes(token)</literal>
+      method on CAS objects.</para>
+    
+    <para>Because some of the indexes (the Sorted and Set types) use comparators defined
+    on particular values of the features of an indexed type, if you change the values of
+    those features being used in the index key, the correct way to do this is to
+    <orderedlist spacing="compact">
+      <listitem><para>remove the item from all indexes where it is indexed, in all views
+      where it is indexed,</para>       
+      </listitem>
+      <listitem><para>update the value of the features being used as keys,</para></listitem>
+      <listitem><para>add the item back to the indexes, in all views.</para></listitem> 
+    </orderedlist></para>
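+    
+    <para>The reason for the remove, update, add sequence can be seen with any sorted collection.
+      The following plain-Java sketch uses a <literal>java.util.TreeSet</literal> as a stand-in for
+      a sorted index; it is an analogy, not the UIMA index implementation, and the names
+      <literal>ReindexModel</literal>, <literal>Item</literal> and <literal>updateBegin</literal>
+      are hypothetical.</para>

```java
import java.util.Comparator;
import java.util.TreeSet;

/** Plain-Java analogy for a sorted index; not the UIMA implementation. */
final class ReindexModel {
  /** An indexed item whose 'begin' field is used as the sort key. */
  static final class Item {
    int begin;
    final String label;

    Item(int begin, String label) {
      this.begin = begin;
      this.label = label;
    }
  }

  /** A sorted "index": ordered by begin, with the label as a tie-breaker. */
  static TreeSet<Item> newIndex() {
    return new TreeSet<>(
        Comparator.comparingInt((Item it) -> it.begin)
            .thenComparing((Item it) -> it.label));
  }

  /** Changes a key value using the remove, update, add sequence. */
  static void updateBegin(TreeSet<Item> index, Item item, int newBegin) {
    // 1. Remove while the old key is still in place, so the item can be found.
    boolean wasIndexed = index.remove(item);
    // 2. Update the feature that is used as a key.
    item.begin = newBegin;
    // 3. Add the item back so it is filed under its new key.
    if (wasIndexed) {
      index.add(item);
    }
  }

  public static void main(String[] args) {
    TreeSet<Item> index = newIndex();
    Item a = new Item(5, "a");
    index.add(a);
    index.add(new Item(10, "b"));
    updateBegin(index, a, 15);
    for (Item it : index) {
      System.out.println(it.label + " " + it.begin);
    }
  }
}
```

+    <para>If the key were changed while the item was still in the collection, the item would stay
+      at its old position and later lookups or removals could fail to find it.</para>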
+  </section>
+  
+  <section id="ugr.ref.cas.accessing_modifying_features_of_feature_structures">
+    <title>Accessing or modifying features of feature structures</title>
+    <titleabbrev>Accessing or modifying Features</titleabbrev>
+    
+    <para>Values of individual features for a feature structure can be set or referenced,
+      using a set of methods that depend on the type of value that feature is declared to have.
+      There are methods on FeatureStructure for this: getBooleanValue, getByteValue,
+      getShortValue, getIntValue, getLongValue, getFloatValue, getDoubleValue,
+      getStringValue, and getFeatureValue (which means to get a value which in turn is a
+      reference to a feature structure). There are corresponding <quote>setter</quote>
+      methods, as well. These methods on the feature structure object take as arguments the
+      feature object retrieved earlier in the typeSystemInit method.</para>
+    
+    <para>Using the previous example, with the type system initialized with type personType
+      and feature lastNameFeature, here&apos;s a sample code fragment that gets and sets
+      that feature:</para>
+    
+    
+    <programlisting>// Assume aPerson is a variable holding an object of type Person
+// get the lastNameFeature value from the feature structure
+String lastName = aPerson.getStringValue(lastNameFeature);
+// set the lastNameFeature value
+aPerson.setStringValue(lastNameFeature, newStringValueForLastName);</programlisting>
+    
+    <para>The getters and setters for each of the primitive types are defined in the Javadocs
+      as methods of the FeatureStructure interface.</para>
+    
+  </section>
+  
+  <section id="ugr.ref.cas.indexes_and_iterators">
+    <title>Indexes and Iterators</title>
+    
+    <para>Each CAS can have many indexes associated with it; each CAS View contains 
+      a complete set of instantiations of the indexes.   Each index is represented by an
+      instance of the type org.apache.uima.cas.FSIndex. You use the object
+      org.apache.uima.cas.FSIndexRepository, accessible via a method on a CAS object, to
+      retrieve instances of indexes. There are methods that let you select the index
+      by name, by type, or by both name and type. Since each index is already associated with a type, 
+      passing both a name and a type is valid only if the type passed in is the same
+      type or a subtype of the one declared in the index specification for the named index. If you
+      pass in a subtype, the returned FSIndex object refers to an index that will return only
+      items belonging to that subtype (or subtypes of that subtype).</para>
+    
+    <para>The returned FSIndex objects are used, in turn, to create iterators. 
+      There is also a method on the Index Repository, <literal>getAllIndexedFS</literal>, 
+      which will return an iterator over all indexed Feature Structures (for that CAS View),
+      in no particular order.  The iterators
+      created can be used like common Java iterators, to sequentially retrieve items
+      indexed. If the index represents a sorted index, the items are returned in a sorted
+      order, where the sort order is specified in the XML index definition. This XML is part of
+      the Component Descriptor, see <olink targetdoc="&uima_docs_ref;"
+        targetptr="ugr.ref.xml.component_descriptor.aes.index"/>.</para>
+       
+    <para>Feature structures should not be added to or removed from indexes while iterating
+      over them; a ConcurrentModificationException is thrown when this is detected.
+      Certain operations are allowed with the iterators after modification, which can
+      <quote>reset</quote> this condition, such as moving to beginning, end, or moving to a
+      particular feature structure. So - if you have to modify the index, you can move it back to
+      the last FS you had retrieved from the iterator, and then continue, if that makes sense in
+      your application.</para>   
+
+    <section id="ugr.ref.cas.index.built_in_indexes">
+      <title>Built-in Indexes</title>
+      
+      <para>An unnamed built-in bag index exists which holds all feature structures which are indexed.
+      The only access to this index is the method getAllIndexedFS(Type) which returns an iterator
+      over all indexed Feature Structures.</para>
+      
+      <para>The CAS also contains a built-in index for the type <literal>uima.tcas.Annotation</literal>, which sorts
+        annotations in the order in which they appear in the document. Annotations are sorted first by increasing
+        <literal>begin</literal> position. Ties are then broken by <emphasis>decreasing</emphasis>
+        <literal>end</literal> position (so that longer annotations come first). Annotations that match in both
+        their <literal>begin</literal> and <literal>end</literal> features are sorted using the Type Priority
+        (see <olink targetdoc="&uima_docs_ref;"
+          targetptr="ugr.ref.xml.component_descriptor.aes.type_priority"/>).</para>
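+      
+      <para>The ordering just described can be written as an ordinary Java comparator. The sketch
+        below models annotations as <literal>{begin, end}</literal> pairs and leaves out the final
+        Type Priority tie-breaker; it is an illustration only, and
+        <literal>AnnotationOrderModel</literal> is a hypothetical name, not a UIMA class.</para>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Plain-Java model of the annotation sort order; not a UIMA class. */
final class AnnotationOrderModel {
  /** Compares {begin, end} pairs: begin ascending, then end descending. */
  static final Comparator<int[]> ORDER = (a, b) -> {
    if (a[0] != b[0]) {
      return Integer.compare(a[0], b[0]); // smaller begin first
    }
    return Integer.compare(b[1], a[1]);   // on a tie, larger end first
  };

  /** Returns the given spans sorted in the modelled index order. */
  static List<int[]> sorted(int[]... spans) {
    List<int[]> list = new ArrayList<>(Arrays.asList(spans));
    list.sort(ORDER);
    return list;
  }

  public static void main(String[] args) {
    List<int[]> result =
        sorted(new int[] {5, 8}, new int[] {0, 3}, new int[] {0, 10});
    for (int[] span : result) {
      System.out.println(span[0] + "-" + span[1]);
    }
  }
}
```

+      <para>With this order, an enclosing annotation such as a sentence is visited before the
+        shorter annotations that start at the same position.</para>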
+    </section>
+
+    
+    <section id="ugr.ref.cas.index.adding_to_indexes">
+      <title>Adding Feature Structures to the Indexes</title>
+      
+      <para>Feature Structures are added to the indexes by calling the
+        <literal>FSIndexRepository.addFS(FeatureStructure)</literal> method or the equivalent convenience
+        method <literal>CAS.addFsToIndexes(FeatureStructure)</literal>. This adds the Feature Structure to
+        <emphasis>all</emphasis> indexes that are defined for the type of that FeatureStructure (or any of its
+        supertypes). Note that you should not add a Feature Structure to the indexes until you have set values for all
+        of the features that may be used as sort keys in an index.</para>
+    </section>
+        
+    <section id="ugr.ref.cas.index.iterators">
+      <title>Iterators</title>
+      
+      <para>Iterators are objects that implement the interface <literal>org.apache.uima.cas.FSIterator</literal>. This interface
+        extends <literal>java.util.Iterator</literal> and provides the normal Java iterator methods, plus
+        additional ones that allow moving both forwards and backwards.</para>  
+    </section>
+    
+    <section id="ugr.ref.cas.index.annotation_index">
+      <title>Special iterators for Annotation types</title>
+      
+      <para>The built-in index over the <literal>uima.tcas.Annotation</literal> type
+        named <quote><literal>AnnotationIndex</literal></quote> has additional
+        capabilities. To use them, you first get a reference to this built-in index using
+        either the <literal>getAnnotationIndex</literal> method on a CAS View object, or
+        by asking the <literal>FSIndexRepository</literal> object for an index having the
+        particular name <quote>AnnotationIndex</quote>, for example:        
+        
+        <programlisting>AnnotationIndex idx = aCAS.getAnnotationIndex(); 
+// or you can iterate over a specific subtype of Annotation:        
+AnnotationIndex idx = aCAS.getAnnotationIndex(aType); </programlisting></para>
+      
+      <para>This object can be used to produce several additional kinds of iterators. It can
+        produce unambiguous iterators; these skip over elements until they find one where the
+        start position of the next annotation is equal to or greater than the end position of
+        the previously returned annotation.</para>
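+      
+      <para>The skipping rule for unambiguous iteration can be modelled in a few lines of plain
+        Java. This sketch assumes the input is already in index order and represents annotations as
+        <literal>{begin, end}</literal> pairs; <literal>UnambiguousModel</literal> is a hypothetical
+        name, and the code does not call the UIMA iterator API.</para>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of unambiguous iteration; does not use the UIMA API. */
final class UnambiguousModel {
  /**
   * Keeps a span only if its begin is equal to or greater than the end of
   * the previously kept span. The input must already be in index order.
   */
  static List<int[]> unambiguous(List<int[]> sortedSpans) {
    List<int[]> kept = new ArrayList<>();
    int lastEnd = Integer.MIN_VALUE; // nothing returned yet
    for (int[] span : sortedSpans) {
      if (span[0] >= lastEnd) {
        kept.add(span);
        lastEnd = span[1];
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<int[]> spans = Arrays.asList(
        new int[] {0, 10}, new int[] {0, 3}, new int[] {5, 8},
        new int[] {10, 12}, new int[] {11, 15});
    for (int[] span : unambiguous(spans)) {
      System.out.println(span[0] + "-" + span[1]);
    }
  }
}
```

+      <para>In this model the spans 0-3 and 5-8 are skipped because they start before the end of
+        0-10, and 11-15 is skipped because it starts before the end of 10-12.</para>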
+      
+      <para>It can also produce several kinds of subiterators; these are iterators whose
+        annotations fall within the span of another annotation. This kind of iterator can
+        also have the unambiguous property, if desired. It also can be
+        <quote>strict</quote> or not; strict means that the returned annotation lies
+        completely within the span of the controlling annotation. Non-strict only implies
+        that the beginning of the returned annotation falls within the span of the
+        controlling annotation.</para>
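+      
+      <para>The difference between strict and non-strict can be stated as two predicates. The
+        following plain-Java sketch is an illustration only: <literal>SubiteratorModel</literal> is
+        a hypothetical name, and the treatment of the exact boundary positions (a half-open span
+        for the non-strict test) is an assumption, so consult the Javadocs for the precise
+        behavior of the framework.</para>

```java
/** Plain-Java model of the strict / non-strict tests; not a UIMA class. */
final class SubiteratorModel {
  /** Strict: the candidate lies completely inside the controlling span. */
  static boolean strictlyCovered(int begin, int end, int ctlBegin, int ctlEnd) {
    return begin >= ctlBegin && end <= ctlEnd;
  }

  /**
   * Non-strict: only the candidate's begin has to fall inside the span.
   * Treating the span as half-open here is an assumption of this sketch.
   */
  static boolean beginsWithin(int begin, int ctlBegin, int ctlEnd) {
    return begin >= ctlBegin && begin < ctlEnd;
  }

  public static void main(String[] args) {
    // Controlling span 10-20; the candidate 15-25 starts inside it
    // but runs past its end.
    System.out.println(strictlyCovered(15, 25, 10, 20)); // prints "false"
    System.out.println(beginsWithin(15, 10, 20));        // prints "true"
  }
}
```

+      <para>A candidate that starts inside the controlling annotation but extends past its end is
+        therefore returned by a non-strict subiterator and skipped by a strict one.</para>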
+      
+      <para>There is also a method which produces an <literal>AnnotationTree</literal>
+        object, which contains nodes representing the results of doing a strict,
+        unambiguous subiterator over the span of some controlling annotation. For more
+        details, please refer to the Javadocs for the
+        <literal>org.apache.uima.cas.text</literal> package.</para>
+      
+    </section>
+    
+    <section id="ugr.ref.cas.index.constraints_and_filtered_iterators">
+      <title>Constraints and Filtered iterators</title>
+      
+      <para>There is a set of API calls that build constraint objects. These objects can be
+        used directly to test if a particular feature structure matches (satisfies) the
+        constraint, or they can be passed to the createFilteredIterator method to create an
+        iterator that skips over instances which fail to satisfy the constraint.</para>
+      
+      <para>It is possible to specify a feature value located by following a chain of
+        references starting from the feature structure being tested. Here&apos;s a
+        scenario to explore this concept. Let&apos;s suppose you have the following type
+        system (namespaces are omitted for clarity):
+        
+        <blockquote>
+          <para><emphasis role="bold">Token</emphasis>, having a feature PartOfSpeech
+            which holds a reference to another type (POS)</para>
+          
+          <para><emphasis role="bold">POS</emphasis> (a type with many subtypes, each
+            representing a different part of speech)</para>
+          
+          <para><emphasis role="bold">Noun</emphasis> (a subtype of POS)</para>
+          
+          <para><emphasis role="bold">ProperName</emphasis> (a subtype of Noun),
+            having a feature Class which holds an integer value encoding some information
+            about the proper noun.</para></blockquote></para>
+      
+      <para>If you want to filter Token instances, such that only those tokens get through
+        which are proper names of class 3 (for example), you would need a test that started with
+        a Token instance, followed its PartOfSpeech reference to another instance (the
+        ProperName instance) and then tested the Class feature of that instance for a value
+        equal to 3.</para>
+      
+      <para>To support this, the filtering approach has components that specify tests, and
+        components that specify <quote>paths</quote>. The tests that can be done include
+        testing references to type instances to see if they are instances of some type or its
+        subtypes; this is done with a FSTypeConstraint constraint. Other tests check for
+        equality or, for numeric values, ranges.</para>
+      
+      <para>Each test may be combined with a path &ndash; to get to the value to test. Tests that
+        start from a feature structure instance can be combined with <literal>and</literal> and <literal>or</literal> connectors.
+        The Javadocs for these are in the package org.apache.uima.cas in the classes that end
+        in Constraint, plus the classes ConstraintFactory, FeaturePath and CAS.
+        Here&apos;s an example; assume the variable cas holds a reference to a CAS instance.
+        
+        
+        <programlisting>// Start by getting the constraint factory from the CAS.
+ConstraintFactory cf = cas.getConstraintFactory();
+
+// To specify a path to an item to test, you start by
+// creating an empty path.
+FeaturePath path = cas.createFeaturePath();
+
+// Add POS feature to path, creating one-element path.
+path.addFeature(posFeat);
+
+// You can extend the chain arbitrarily by adding additional
+// features.
+
+// Create a new type constraint.  
+
+// Type constraints will check that structures
+// they match against have a type at least as specific
+// as the type specified in the constraint.
+FSTypeConstraint nounConstraint = cf.createTypeConstraint();
+
+// Set the type (by default it is TOP).
+// This succeeds if the type being tested by this constraint
+// is nounType or a subtype of nounType.
+nounConstraint.add(nounType);
+
+// Embed the noun constraint under the pos path.
+// This means, associate the test with the path, so it tests the
+// proper value.
+
+// The result is a test which will
+// match a feature structure that has a posFeat defined
+// which has a value which is an instance of a nounType or
+// one of its subtypes.
+FSMatchConstraint embeddedNoun = cf.embedConstraint(path, nounConstraint);
+
+// Create a type constraint for token (or a subtype of it)
+FSTypeConstraint tokenConstraint = cf.createTypeConstraint();
+
+// Set the type.
+tokenConstraint.add(tokenType);
+
+// Create the final constraint by conjoining the two constraints.
+FSMatchConstraint nounTokenCons = cf.and(embeddedNoun, tokenConstraint);
+
+// Create a filtered iterator from some annotation iterator.
+FSIterator it = cas.createFilteredIterator(annotIt, nounTokenCons);</programlisting>
+        </para></section></section>
+  
+  <section id="ugr.ref.cas.guide_to_javadocs">
+    <title>The CAS APIs &ndash; a guide to the Javadocs</title>
+    <titleabbrev>CAS APIs Javadocs</titleabbrev>
+    
+    <para>The CAS APIs are organized into 3 Java packages: cas, cas.impl, and cas.text. Most
+      of the APIs described here are in the cas package. The cas.impl package contains classes
+      used in serializing and deserializing (reading and writing to external strings) the
+      XCAS form of the CAS (XCAS is an XML serialization of the CAS). The XCAS form is used for
+      transporting the CAS among local and remote annotators, or for storing the CAS in
+      permanent storage. The cas.text package contains the APIs that extend the CAS to support
+      artifact (including <quote>text</quote>) analysis.</para>
+    
+    <section id="ugr.ref.cas.javadocs.cas_package">
+      <title>APIs in the CAS package</title>
+      
+      <para>The main objects implementing the APIs discussed here are shown in the diagram
+        below. The hierarchy represents that there is a way to get from an upper object to an
+        instance of the lower object, usually by using a method on the upper object; this is not
+        an inheritance hierarchy.
+        <figure id="ugr.ref.cas.fig.api_hierarchy">
+          <title>CAS Object hierarchy</title>
+          <mediaobject>
+            <imageobject>
+              <imagedata width="5.8in" format="PNG"
+                fileref="&imgroot;image001.png"/>
+            </imageobject>
+            <textobject><phrase>CAS object hierarchy</phrase></textobject>
+          </mediaobject>
+        </figure> </para>
+      
+      <para>The main interface is the CAS interface. This has most of the functionality of the
+        CAS, except for the type system metadata access, and the indexing access. JCas and CAS
+        are alternative representations and API approaches to the CAS; each has a method to
+        get the other. You can mix JCas and CAS APIs in your application as needed. To use the
+        JCas APIs, you have to create the Java classes that correspond to the CAS types, and
+        include them in the Java class path of the application. If you have a CAS object, you can
+        get a JCas object by using the getJCas() method call on the CAS object; likewise, you
+        can get the CAS object from a JCas by using the getCAS() method call on the JCas object.
+        There is also a low level CAS interface that is not part of the official API, and is
+        intended for internal use only &ndash; it is not documented here.</para>
+      
+      <para>The type system metadata APIs are found in the TypeSystem interface. The objects
+        defining each type and feature are defined by the interfaces Type and Feature. The
+        Type interface has methods to see what types subsume other types, to iterate over the
+        types available, and to extract information about the types, including what
+        features it has. The Feature interface has methods that get what type it belongs to,
+        its name, and its range (the kind of values it can hold).</para>
+      
+      <para>The FSIndexRepository gives you access to methods to get instances of indexes, and
+        also provides access to the iterator over all indexed feature structures: 
+        <literal>getAllIndexedFS(aType)</literal>.
+        The FSIndex and AnnotationIndex objects give you methods to create instances of
+        iterators.</para>
+      
+      <para>Iterators and the CAS methods that create new feature structures return
+        FeatureStructure objects. These objects can be used to set and get the values of
+        defined features within them.</para>
+    </section>
+  </section>
+</chapter>
\ No newline at end of file

Added: uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.javadocs.xml
URL: http://svn.apache.org/viewvc/uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.javadocs.xml?rev=941739&view=auto
==============================================================================
--- uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.javadocs.xml (added)
+++ uima/uimaj/branches/mavenAlign/uima-docbook-references/src/docbook/ref.javadocs.xml Thu May  6 14:01:56 2010
@@ -0,0 +1,87 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+<!ENTITY imgroot "images/references/ref.javadocs/">
+<!ENTITY tp "ugr.ref.javadocs.">
+<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >  
+%uimaents;
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<chapter id="ugr.ref.javadocs">
+  <title>Javadocs</title>
+  
+  <para>The details of all the public APIs for UIMA are contained in the API Javadocs. These are located in the docs/api
+    directory; the top level to open in your browser is called <ulink url="api/index.html"/>.</para>
+  
+  <para>Eclipse lets you attach the Javadocs to your project. The Javadoc should already be attached
+    to the <literal>uimaj-examples</literal> project, if you followed the setup instructions in <olink
+      targetdoc="&uima_docs_overview;" targetptr="ugr.ovv.eclipse_setup.example_code"/>. To attach
+    Javadocs to your own Eclipse project, use the following instructions.</para>
+  
+  <note><para>As an alternative, you can add the UIMA source to the UIMA binary distribution. If you
+  do this, you not only have the Javadocs automatically available (so you can skip the following
+  setup), but you can also step through the UIMA framework code while debugging.
+  To add the source, follow the instructions in the setup chapter: 
+  <olink targetdoc="&uima_docs_overview;" targetptr="ugr.ovv.eclipse_setup.adding_source"/>.</para></note>
+  
+  <para>To add the Javadocs, open a project that refers to the UIMA APIs in its class path, and open the project properties. Then pick
+    Java Build Path. Pick the "Libraries" tab and select one of the UIMA library entries (if you don't have, for
+    instance, uima-core.jar in this list, it's unlikely your code will compile). Each library entry has a small "+"
+    sign on its left; click that to expand the view and see the Javadoc location. If you highlight that and press Edit, you
+    can add a reference to the Javadocs in the following dialog:
+    
+    
+    <screenshot>
+    <mediaobject>
+      <imageobject>
+        <imagedata width="5.8in" format="JPG" fileref="&imgroot;image002.jpg"/>
+      </imageobject>
+      <textobject><phrase>Screenshot of attaching Javadoc to source in Eclipse</phrase></textobject>
+    </mediaobject>
+  </screenshot></para>
+  
+  <para>Once you do this, Eclipse can show you Javadocs for UIMA APIs as you work. To see the Javadoc for a UIMA API, you
+    can hover over the API class or method, or select it and press shift-F2, or use the menu Navigate &rarr;
+    Open External Javadoc, or open the Javadoc view (Window &rarr; Show View &rarr; Other
+    &rarr; Java &rarr; Javadoc).</para>
+  
+  <para>In a similar manner, you can attach the source for the UIMA framework, if you download the source
+    distribution. The source corresponding to particular
+    releases is available from the Apache UIMA web site (<ulink url="http://incubator.apache.org/uima"/>) on the
+    downloads page.</para>
+  
+  <section id="ugr.ref.javadocs.libraries">
+    <title>Using named Eclipse User Libraries</title>
+  <para>You can also create a named "user library" in Eclipse containing the UIMA Jars, and attach the Javadocs (or
+  optionally, the sources); this named library is saved in the Eclipse workspace.  Once created, it can be
+  added to the classpath of newly created Eclipse projects.</para> 
+  
+  <para>Use the menu option Project &rarr; Properties
+  &rarr; Java Build Path, and then pick the Libraries tab, and click the Add Library button. Then select
+  User Libraries, click "Next", and pick the library you created for the UIMA Jars.</para> 
+  
+  <para>To create this library in the workspace,
+    use the same menu picks as above, but after you select User Libraries and click "Next", click the "New Library..."
+    button to define your new library.  Use the "Add Jars" button and multi-select all the Jars in the lib directory
+    of the UIMA binary distribution.  Then add the Javadoc attachment for each Jar.  The path to use is
+    file:/ -- insert the path to your install of UIMA -- /docs/api.  After you do this for the first Jar, you can
+    copy this string to the clipboard and paste it into the rest of the Jars.</para>
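+  <para>If you prefer not to type the Javadoc location by hand, the string can be derived from the
+    install directory. The following is only a minimal sketch: the class name
+    <literal>JavadocLocation</literal>, the helper <literal>javadocUrl</literal>, and the fallback to a
+    <literal>UIMA_HOME</literal> environment variable are assumptions made for illustration and are not
+    part of UIMA or Eclipse. It relies only on the standard <literal>java.nio.file</literal> API, and the
+    exact number of slashes after <literal>file:</literal> in the result may differ from the form shown above.</para>

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of UIMA: prints a file: URL for the
// docs/api directory below a given UIMA install directory.
class JavadocLocation {

    // Resolves the docs/api directory under uimaHome and converts it to a file: URL string.
    static String javadocUrl(String uimaHome) {
        Path apiDir = Paths.get(uimaHome, "docs", "api").toAbsolutePath().normalize();
        return apiDir.toUri().toString();
    }

    public static void main(String[] args) {
        // Assumption: the install path is passed as the first argument,
        // or else taken from a UIMA_HOME environment variable.
        String home = args.length > 0 ? args[0] : System.getenv("UIMA_HOME");
        if (home == null) {
            System.err.println("usage: java JavadocLocation PATH_TO_UIMA_INSTALL");
            return;
        }
        System.out.println(javadocUrl(home));
    }
}
```

+  <para>Run it with the install path as the only argument, then paste the printed string into the
+    Javadoc location field for each Jar.</para>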
+    </section>
+</chapter>
\ No newline at end of file