Posted to commits@uima.apache.org by sc...@apache.org on 2008/08/28 23:28:16 UTC

svn commit: r689997 [9/32] - in /incubator/uima/uimaj/trunk/uima-docbooks: ./ src/ src/docbook/overview_and_setup/ src/docbook/references/ src/docbook/tools/ src/docbook/tutorials_and_users_guides/ src/docbook/uima/organization/ src/olink/references/

Modified: incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/references/ref.jcas.xml
URL: http://svn.apache.org/viewvc/incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/references/ref.jcas.xml?rev=689997&r1=689996&r2=689997&view=diff
==============================================================================
--- incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/references/ref.jcas.xml (original)
+++ incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/references/ref.jcas.xml Thu Aug 28 14:28:14 2008
@@ -1,660 +1,660 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
-"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"[
-<!ENTITY % uimaents SYSTEM "../entities.ent" >  
-%uimaents;
-]>
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-   http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-<chapter id="ugr.ref.jcas">
-  <title>JCas Reference</title>
-  
-  <para>The CAS is a system for sharing data among annotators, consisting of data structures
-    (definable at run time), sets of indexes over these data, metadata describing these, subjects of
-    analysis, and a high
-    performance serialization/deserialization mechanism. JCas provides a Java approach to
-    accessing CAS data, and is based on using generated, specific Java classes for each CAS
-    type.</para>
-  
-  <para>Annotators process one CAS per call to their process method. During processing,
-    annotators can retrieve feature structures from the passed in CAS, add new ones, modify
-    existing ones, and use and update CAS indexes. Of course, an annotator can also use plain
-    Java Objects in addition; but the data in the CAS is what is shared among annotators within
-    an application.</para>
-  
-  <para>All the facilities present in the APIs for the CAS are available when using the JCas
-    APIs; indeed, you can use the getCas() method to get the corresponding CAS object from a
-    JCas (and vice-versa). The JCas APIs often have helper methods that make using this
-    interface more convenient for Java developers.</para>
-  
-  <para>The data in the CAS are typed objects having fields. JCas uses a set of generated Java
-    classes (each corresponding to a particular CAS type) with <quote>getter</quote> and
-    <quote>setter</quote> methods for the features, plus a constructor so new instances can
-    be made. The Java classes don&apos;t actually store the data in the class instance;
-    instead, the getters and setters forward to the underlying CAS data representation.
-    Because of this, applications which use the JCas interface can share data with annotators
-    using plain CAS (i.e., not using the JCas approach). </para>
-  
-    <para>Users can modify the JCas generated
-    Java classes by adding fields to them; this allows arbitrary non-CAS data to be
-    represented within the JCas objects as well. However, the non-CAS data stored in the JCas
-    object instances cannot be shared with annotators using the plain CAS.</para>
-  
-  <para>Data in the CAS initially has no corresponding JCas type instances; these are created
-    as needed at the first reference. This means that if your annotator is passed a large CAS having
-    millions of CAS feature structures, but you only reference a few of them, and no
-    Java JCas object instances were previously created by upstream annotators, the only Java
-    objects that will be created will be those that correspond to the CAS feature structures
-    that you reference.</para>
-  
-  <para>The JCas class Java source files are generated from XML type system descriptions. The
-    JCasGen utility does the work of generating the corresponding Java Class Model for the CAS
-    types. There are a variety of ways JCasGen can be run; these are described later. You
-    include the generated classes with your UIMA component, and you can publish these classes
-    for others who might want to use your type system.</para>
-  
-  <para>The specification of the type system in XML can be written using a conventional text
-    editor, an XML editor, or using the Eclipse plug-in that supports editing UIMA
-    descriptors.</para>
-  
-  <para>Changes to the type system are done by changing the XML and regenerating the
-    corresponding Java Class Models. Of course, once you&apos;ve published your type system
-    for others to use, you should be careful that any changes you make don&apos;t adversely
-    impact the users. Additional features can be added to existing types without breaking
-    other code.</para>
-  
-  <para>A separate Java class is generated for each type; this class implements the CAS
-    FeatureStructure interface, as well as having the special getters and setters for the
-    included features. In the current implementation, an additional helper class per type is
-    also generated. The generated Java classes have methods (getters and setters) for the
-    fields as defined in the XML type specification. Descriptor comments are reflected in the
-    generated Java code as Java-doc style comments.</para>
-  
-  
-  <section id="ugr.ref.jcas.name_spaces">
-    <title>Name Spaces</title>
-    
-    <para>Full Type names consist of a <quote>namespace</quote> prefix dotted with a simple
-      name. Namespaces are used like packages to avoid collisions between types that are
-      defined by different people at different times. The namespace is used as the Java
-      package name for generated Java files.</para>
-      
-    <para>Type names used in the CAS correspond to the generated Java classes directly. If the
-      CAS name is com.myCompany.myProject.ExampleClass, the generated Java class is in the
-      package com.myCompany.myProject, and the class is ExampleClass.</para>
-      
-    <para>
-      An exception to this rule is the built-in types
-      starting with <literal>uima.cas</literal> and <literal>uima.tcas</literal>;
-      these names are mapped to Java packages named
-      <literal>org.apache.uima.jcas.cas</literal> and
-      <literal>org.apache.uima.jcas.tcas</literal>.</para>
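-    <para>As a minimal sketch of this naming rule, the following stand-alone Java helper maps a CAS
-      type name to the fully qualified name of its generated class. The class
-      <literal>CasTypeNames</literal> and its method are hypothetical illustrations, not part of
-      the UIMA API, and the sketch ignores the built-in primitive types, which map to Java
-      primitives rather than to classes.</para>

```java
// Sketch only: a hypothetical helper, not part of the UIMA API.
final class CasTypeNames {
    private CasTypeNames() {}

    /** Returns the fully qualified Java class name for a (non-primitive) CAS type name. */
    static String toJavaClassName(String casTypeName) {
        // Built-in types are the exception: their namespaces are remapped.
        if (casTypeName.startsWith("uima.cas.")) {
            return "org.apache.uima.jcas.cas."
                    + casTypeName.substring("uima.cas.".length());
        }
        if (casTypeName.startsWith("uima.tcas.")) {
            return "org.apache.uima.jcas.tcas."
                    + casTypeName.substring("uima.tcas.".length());
        }
        // Every other type uses its namespace directly as the Java package.
        return casTypeName;
    }

    public static void main(String[] args) {
        System.out.println(toJavaClassName("com.myCompany.myProject.ExampleClass"));
        System.out.println(toJavaClassName("uima.tcas.Annotation"));
    }
}
```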
-    
-  </section>
-  
-  <section id="ugr.ref.jcas.use_of_description">
-    <title>XML description element</title>
-    <titleabbrev>Use of XML Description</titleabbrev>
-    
-    <para>Each XML type specification can have &lt;description ...
-      &gt; tags. The description for a type will be copied into the generated Java code, as a
-      Javadoc style comment for the class. When writing these descriptions in the XML type
-      specification file, you might want to use html tags, as allowed in Javadocs.</para>
-    
-    <para>If you use the Component Descriptor Editor, you can write the html tags normally,
-      for instance, <quote>&lt;h1&gt;My Title&lt;/h1&gt;</quote>. The Component
-      Descriptor Editor will take care of converting the actual descriptor source so that it
-      has the leading <quote>&lt;</quote> character written as <quote>&amp;lt;</quote>,
-      to avoid confusing the XML type specification. For example, &lt;p&gt; would be written
-      in the source of the descriptor as &amp;lt;p&gt;. Any characters used in the Javadoc
-      comment must of course be from the character set allowed by the XML type specification.
-      These specifications often start with the line &lt;?xml version=<quote>1.0</quote>
-      encoding=<quote>UTF-8</quote> ?&gt;, which means you can use any of the UTF-8
-      characters.</para>
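-    <para>The escaping described above can be sketched as a small stand-alone Java helper. The
-      class <literal>DescriptionEscaper</literal> is a hypothetical illustration, not the
-      Component Descriptor Editor&apos;s actual implementation; it assumes that only the leading
-      <quote>&lt;</quote> of each tag needs to be rewritten.</para>

```java
// Sketch only: hypothetical helper, not part of UIMA or the Component Descriptor Editor.
final class DescriptionEscaper {
    private DescriptionEscaper() {}

    /** Rewrites every '<' as the entity "&lt;" so html tags can sit inside a description element. */
    static String escapeForDescriptor(String html) {
        return html.replace("<", "&lt;");
    }

    public static void main(String[] args) {
        // Prints the form that would be stored in the descriptor source.
        System.out.println(escapeForDescriptor("<h1>My Title</h1>"));
    }
}
```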
-    
-  </section>
-  
-  <section id="ugr.ref.jcas.mapping_built_ins">
-    <title>Mapping built-in CAS types to Java types</title>
-    
-    <para>The built-in primitive CAS types map to Java types as follows:</para>
-    
-    
-    <programlisting>uima.cas.Boolean &rarr; boolean
-uima.cas.Byte    &rarr; byte
-uima.cas.Short   &rarr; short
-uima.cas.Integer &rarr; int
-uima.cas.Long    &rarr; long
-uima.cas.Float   &rarr; float
-uima.cas.Double  &rarr; double
-uima.cas.String  &rarr; String</programlisting>
-    
-  </section>
-  
-  <section id="ugr.ref.jcas.augmenting_generated_code">
-    <title>Augmenting the generated Java Code</title>
-    
-    <para>The Java Class Models generated for each type can be augmented by the user. Typical
-      augmentations include adding additional (non-CAS) fields and methods, and import
-      statements that might be needed to support these. Commonly added methods include
-      additional constructors (having different parameter signatures), and
-      implementations of toString().</para>
-    
-    <para>To augment the code, just edit the generated Java source code for the class named the
-      same as the CAS type. Here&apos;s an example of an additional method you might add; the
-      various getter methods are retrieving values from the instance:</para>
-    
-    
-    <programlisting>public String toString() { // for debugging
-  return "XsgParse "
-    + getslotName() + ": "
-    + getheadWord().getCoveredText()
-    + " seqNo: " + getseqNo()
-    + ", cAddr: " + id
-    + ", size left mods: " + getlMods().size()
-    + ", size right mods: " + getrMods().size();
-}</programlisting>
- 
-    <section id="ugr.ref.jcas.data_persistence">
-      <title>Persistence of additional data</title>
-      <para>If you add custom instance fields to JCas cover classes, these exist in the JCas cover object instance,
-        but not in the CAS itself. Each time a CAS object is referenced (by an iterator, or by following a Feature
-        Structure reference), a new JCas cover object instance may be created. If you need these values, you can (a)
-        make them CAS values if possible, or (b) hold a reference to the particular JCas cover object instance in
-        your Java code. For some simple cases, setting the performance tuning option JCAS_CACHE_ENABLE (see
-          <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="tug.application.pto"/>)
-         to true
-        will cause the same JCas cover object that was previously used for a particular CAS Feature Structure to be
-        reused. However, this capability won't work when other factors interfere with the ability to reuse the same
-        object.  Pear isolation is an example of this.</para>
-      <para>Because of this, and because the JCas Cache holds on to the JCas cover objects beyond their useful life and
-        prevents them from being garbage collected, it is normally recommended to run with
-        JCAS_CACHE_ENABLE set to "false".</para>
-    </section>   
-    <section id="ugr.ref.jcas.keeping_augmentations_when_regenerating">
-      <title>Keeping hand-coded augmentations when regenerating</title>
-      
-      <para>If the type system specification changes, you have to re-run the JCasGen
-        generator. This will produce updated Java for the Class Models that capture the
-        changed specification. If you have previously augmented the source for these Java
-        Class Models, your changes must be merged with the newly (re)generated Java source
-        code for the Class Models. This can be done by hand, or you can run the version of JCasGen
-        that is integrated with Eclipse, and use automatic merging that is done using Eclipse&apos;s EMF
-        plug-in. You can obtain Eclipse and the needed EMF plug-in from <ulink
-          url="http://www.eclipse.org/"/>.</para>
-      
-      <para>If you run the generator version that works without using Eclipse, it will not
-        merge Java source changes you may have previously made; if you want them retained,
-        you&apos;ll have to do the merging by hand.</para>
-      
-      <para>The Java source merging will keep additional constructors, additional fields,
-        and any changes you may have made to the readObject method (see below). Merging will
-        <emphasis>not</emphasis> delete classes in the target corresponding to deleted CAS types, which no longer
-        are in the source &ndash; you should delete these by hand.</para>
-      
-      <warning><para>The merging supports Java 1.4 syntactic constructs only.  
-        JCasGen generates Java 1.4 code, so as long as any code you change here also sticks to 
-        only Java 1.4 constructs, the merge will work.  If you use Java 5 or later specific syntax or constructs, the merge
-        operation will likely fail to merge properly.</para></warning>
-    </section>
-    
-    <section id="ugr.ref.jcas.additional_constructors">
-      <title>Additional Constructors</title>
-      
-      <para>Any additional constructors that you add must include the JCas argument. The
-        first line of your constructor is required to be</para>
-      
-      
-      <programlisting>this(jcas);        // run the standard constructor</programlisting>
-      
-      <para>where jcas is the passed in JCas reference. If the type you&apos;re defining
-        extends <literal>uima.tcas.Annotation</literal>, JCasGen will automatically
-        add a constructor which takes 2 additional parameters, the begin and end Java
-        int values, and sets the <literal>uima.tcas.Annotation</literal>
-        <literal>begin</literal> and <literal>end</literal> fields.</para>
-      
-      <para>Here&apos;s an example: If you&apos;re defining a type MyType which has a
-        feature parent, you might make an additional constructor which has an additional
-        argument of parent:</para>
-      
-      
-      <programlisting>MyType(JCas jcas, MyType parent) {
-  this(jcas);        // run the standard constructor
-  setParent(parent); // set the parent field from the parameter
-}</programlisting>
-      
-      <section id="ugr.ref.jcas.using_readobject">
-        <title>Using readObject</title>
-        
-        <para>Fields defined by augmenting the Java Class Model to include additional
-          fields represent data that exist for this class in Java, in a local JVM (Java Virtual
-          Machine), but do not exist in the CAS when it is passed to other environments (for
-          example, passing to a remote annotator).</para>
-        
-        <para>A problem can arise when new instances are created, perhaps by the underlying
-          system when it iterates over an index: how to ensure that any additional
-          non-CAS fields are properly initialized. To allow for arbitrary initialization
-          at instance creation time, an initialization method in the Java Class Model,
-          called readObject is used. The generated default for this method is to do nothing,
-          but it is one of the methods that you can modify &ndash; to do whatever
-          initialization might be needed. It is called with 0 parameters, during the
-          constructor for the object, after the basic object fields have been set up. It can
-          refer to fields in the CAS using the getters and setters, and other fields in the Java
-          object instance being initialized.</para>
-        
-        <para>A pre-existing CAS feature structure could exist if a CAS was being passed to
-          this annotator; in this case the JCas system calls the readObject method when
-          creating the corresponding Java instance for the first time for the CAS feature
-          structure. This can happen at two points: when a new object is being returned from an
-          iterator over a CAS index, or a getter method is getting a field for the first time
-          whose value is a feature structure.</para>
-        
-      </section>
-    </section>
-    
-    <section id="ugr.ref.jcas.modifying_generated_items">
-      <title>Modifying generated items</title>
-      
-      <para>The following modifications, if made in generated items, will be preserved when
-        regenerating.</para>
-      
-      <para>The public/private etc. flags associated with methods (getters and setters).
-        You can change the default (<quote>public</quote>) if needed.</para>
-      
-      <para><quote>final</quote> or <quote>abstract</quote> can be added to the type
-        itself, with the usual semantics.</para>
-      
-    </section>
-  </section>
-  
-  <section id="ugr.ref.jcas.merging_types_from_other_specs">
-    <title>Merging types</title>
-    <titleabbrev>Merging Types</titleabbrev>
-    <para>Type definitions are merged by the framework from all the components being run together.</para>
-    
-    <section id="ugr.ref.jcas.merging_types.aggregates_and_cpes">
-      <title>Aggregate AEs and CPEs as sources of types</title>
-      
-      <para>When running aggregate AEs (Analysis Engines), or a set of AEs in a collection processing engine, the
-        UIMA framework will build a merged type system (Note: this <quote>merge</quote> is merging types, not to be
-        confused with merging Java source code, discussed above). This merged type system has all the types of every
-        component used in the application.  In addition, application code can use UIMA Framework APIs to read and merge
-        type descriptions, manually.</para>
-      
-      <para>In most cases, each type system can have its own Java Class Models generated individually, perhaps at an
-        earlier time, and the resulting class files (or .jar files containing these class files) can be put in the
-        class path to enable JCas.</para>
-      
-      <para>However, it is possible that there may be multiple definitions of the same CAS type, each of which might
-        have different features defined. In this case, the UIMA framework will create a merged type by accumulating
-        all the defined features for a particular type into that type&apos;s type definition. However, the JCas
-        classes for these types are not automatically merged, which can create some issues for JCas users, as
-        discussed in the next section.</para>
-
-    </section>
-    
-    <section id="ugr.ref.jcas.merging_types.jcasgen_support">
-      <title>JCasGen support for type merging</title>
-      
-      <para>When there are multiple definitions of the same CAS type with different features defined, then JCasGen
-        can be re-run on the merged type system, to create one set of JCas Class definitions for the merged types,
-        which can then be shared by all the components. 
-        Directions for running JCasGen can be found in <olink
-          targetdoc="&uima_docs_tools;" targetptr="ugr.tools.jcasgen"/>. This is typically done by the person who
-        is assembling the Aggregate Analysis Engine or Collection Processing Engine. The resulting merged Java
-        Class Model will then contain get and set methods for the complete set of features. These Java classes must
-        then be made available in the class path, <emphasis>replacing</emphasis> the pre-merge versions of the
-        classes.</para>
-      
-      <para>If hand-modifications were done to the pre-merge versions of the classes, these must be applied to the
-        merged versions, as described in section <xref
-          linkend="ugr.ref.jcas.keeping_augmentations_when_regenerating"/>, above. If just one of the
-        pre-merge versions had hand-modifications, the source for this hand-modified version can be put into the
-        file system where the generated output will go, and the -merge option for JCasGen will automatically
-        merge the hand-modifications with the generated code. If
-        <emphasis>both</emphasis> pre-merged versions had hand-modifications, then these modifications must
-        be manually merged.</para>
-      
-      <para>An alternative to this is packaging the components as individual PEAR files, each with their own
-      version of the JCas generated Classes.  The Framework (as of release 2.2) can run PEAR files using the 
-      pear file descriptor, and supply each component with its particular version of the JCas generated class.</para>
-      
-    </section>
-    
-    <section id="ugr.ref.jcas.impact_of_type_merging_on_composability">
-      <title>Impact of Type Merging on Composability of Annotators</title>
-      <titleabbrev>Type Merging impacts on Composability</titleabbrev>
-      
-      <para>The recommended approach in UIMA is to build and maintain type systems as separate components, which are
-        imported by Annotators. Using this approach, Type Merging does not occur because the Type System and its JCas
-        classes are centrally managed and shared by the annotators.</para>
-      
-      <para>If you do choose to create a JCas Annotator that relies on Type Merging (meaning that your annotator
-        redefines a Type that is already in use elsewhere, and adds its own features), this can negatively impact the
-        reusability of your annotator, unless your component is used as a PEAR file.</para>
-      
-      <para>If you are not using the PEAR file packaging isolation capability, whenever 
-        anyone wants to combine your annotator with another annotator that uses a different version of
-        the same Type, they will need to be aware of all of the issues described in the previous section. They will need
-        to have the know-how to re-run JCasGen and appropriately set up their classpath to include the merged Java
-        classes and to not include the pre-merge classes. (To enable this, you should package these classes
-        separately from other .jar files for your annotator, so that they can be more easily excluded.) And, if you
-        have done hand-modifications to your JCas classes, the person assembling your annotator will need to
-        properly merge those changes. These issues significantly complicate the task of combining annotators, and
-        will cause your annotator not to be as easily reusable as other UIMA annotators. </para>
-      
-    </section>
-    
-    <section id="ugr.ref.jcas.documentannotation_issues">
-      <title>Adding Features to DocumentAnnotation</title>
-      
-      <para>There is one built-in type, <literal>uima.tcas.DocumentAnnotation</literal>, 
-        to which applications can add additional features.  (All other built-in types
-        are "feature-final" and you cannot add additional features to them.)  Frequently,
-        additional features are added to <literal>uima.tcas.DocumentAnnotation</literal> 
-        to provide a place to store document-level metadata.</para>
-      
-      <para>For the same reasons mentioned in the previous section, adding features to 
-        DocumentAnnotation is not recommended if you are using JCas.  Instead, it is recommended
-        that you define your own type for storing your document-level metadata.  You can create 
-        an instance of this type and add it to the indexes in the usual way.  You can then
-        retrieve this instance using the iterator returned from the method <literal>getAllIndexedFS(type)</literal>
-        on an instance of a JFSIndexRepository object.
-        (As of UIMA v2.1, you do not have to declare a custom index in your descriptor to
-        get this to work).</para>
-      
-      <para>If you do choose to add features to DocumentAnnotation, there are additional issues to
-        be aware of.  The UIMA SDK provides the JCas cover class for the built-in definition of
-        DocumentAnnotation, in the separate jar file <literal>uima-document-annotation.jar</literal>.
-        If you add additional features to DocumentAnnotation, you must remove this jar file
-        from your classpath, because you will not want to use the default JCas cover class.
-        You will need to re-run JCasGen as described in <xref
-          linkend="ugr.ref.jcas.merging_types.jcasgen_support"/>.  JCasGen will generate a new cover
-        class for DocumentAnnotation, which you must place in your classpath in lieu of the version
-        in <literal>uima-document-annotation.jar</literal>.</para>
-        
-      <para>Also, this is the reason why the method <literal>JCas.getDocumentAnnotationFs()</literal> returns
-        type <literal>TOP</literal>, rather than type <literal>DocumentAnnotation</literal>.  Because the
-        <literal>DocumentAnnotation</literal> class can be replaced by users, it is not part of
-        <literal>uima-core.jar</literal> and so the core UIMA framework cannot have any references
-        to it.  In your code, you may <quote>cast</quote> the result of <literal>JCas.getDocumentAnnotationFs()</literal> 
-        to type <literal>DocumentAnnotation</literal>, which must be available on the classpath either via 
-        <literal>uima-document-annotation.jar</literal> or by including a custom version that you have generated using JCasGen.</para>
-    </section>
-    
-  </section>
-  
-  <section id="ugr.ref.jcas.using_within_an_annotator">
-    <title>Using JCas within an Annotator</title>
-    
-    <para>To use JCas within an annotator, you must include the generated Java classes output
-      from JCasGen in the class path.</para>
-    
-    <para>An annotator written using JCas is built by defining a class for the annotator that
-      extends JCasAnnotator_ImplBase. The process method for this annotator is
-      written as follows:</para>
-    
-    <programlisting>public void process(JCas jcas)
-     throws AnalysisEngineProcessException {
-  ... // body of annotator goes here
-}</programlisting>
-    
-    <para>The process method is passed the JCas instance to use as a parameter.</para>
-    
-    <para>The JCas reference is used throughout the annotator to refer to the particular JCas
-      instance being worked on. In pooled or multi-threaded implementations, there will be a
-      separate JCas for each thread that is running simultaneously.</para>
-    
-    <para>You can do several kinds of operations using the JCas APIs: create new feature
-      structures (instances of CAS types) (using the new operator), access existing feature
-      structures passed to your annotator in the JCas (for example, by using the next method of
-      an iterator over the feature structures), get and set the fields of a particular
-      instance of a feature structure, and add and remove feature structure instances from
-      the CAS indexes. To support iteration, there are also functions to get and use indexes
-      and iterators over the instances in a JCas.</para>
-    
-    <section id="ugr.ref.jcas.new_instances">
-      <title>Creating new instances using the Java <quote>new</quote> operator</title>
-      <titleabbrev>Creating new instances</titleabbrev>
-      
-      <para>The new operator creates new instances of JCas types. It takes at least one
-        parameter, the JCas instance in which the type is to be created. For example, if a
-        type Meeting is defined, you can create a new instance of it using:
-        
-        <programlisting>Meeting m = new Meeting(jcas);</programlisting></para>
-      
-      <para>Other variations of constructors can be added in custom code; the single
-        parameter version is the one automatically generated by JCasGen. For types that are
-        subtypes of Annotation, JCasGen also generates an additional constructor with
-        additional <quote>begin</quote> and <quote>end</quote> arguments.</para>
-      
-    </section>
-    <section id="ugr.ref.jcas.getters_and_setters">
-      <title>Getters and Setters</title>
-      
-      <para>If the CAS type Meeting had fields location and time, you could get or set these by
-        using getter or setter methods. These methods have names formed by splicing together
-        the word <quote>get</quote> or <quote>set</quote> followed by the field name, with
-        the first letter of the field name capitalized. For instance
-        
-        <programlisting>getLocation()</programlisting></para>
-      
-      <para>The getter forms take no parameters and return the value of the field; the setter
-        forms take one parameter, the value to set into the field, and return void.</para>
-      
-      <para>There are built-in CAS types for arrays of integers, strings, floats, and
-        feature structures. For fields whose values are these types of arrays, there is an
-        alternate form of getters and setters that take an additional parameter, written as
-        the first parameter, which is the index in the array of an item to get or set.</para>
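-      <para>The naming rule for getters and setters described in this section can be sketched as
-        follows. The class <literal>AccessorNames</literal> is a hypothetical stand-alone helper,
-        not part of the UIMA API or of JCasGen, and it assumes a non-empty feature name.</para>

```java
// Sketch only: hypothetical helper, not part of the UIMA API or of JCasGen.
final class AccessorNames {
    private AccessorNames() {}

    /** Splices "get" onto the feature name, capitalizing the name's first letter. */
    static String getterName(String featureName) {
        return "get" + capitalizeFirst(featureName);
    }

    /** Splices "set" onto the feature name, capitalizing the name's first letter. */
    static String setterName(String featureName) {
        return "set" + capitalizeFirst(featureName);
    }

    // Assumes featureName is non-empty.
    private static String capitalizeFirst(String featureName) {
        return Character.toUpperCase(featureName.charAt(0)) + featureName.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(getterName("location"));
        System.out.println(setterName("time"));
    }
}
```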
-      
-    </section>
-    
-    <section id="ugr.ref.jcas.obtaining_refs_to_indexes">
-      <title>Obtaining references to Indexes</title>
-      
-      <para>The only way to access instances (not otherwise referenced from other
-        instances) passed in to your annotator in its JCas is to use an iterator over some
-        index. Indexes in the CAS are specified in the annotator descriptor. Indexes have a
-        name; text annotators have a built-in, standard index over all annotations.</para>
-      
-      <para>To get an index, first get the JFSIndexRepository from the JCas using the method
-        jcas.getJFSIndexRepository(). Here are the calls to get indexes:</para>
-      
-      
-      <programlisting>JFSIndexRepository ir = jcas.getJFSIndexRepository();
-
-ir.getIndex(name-of-index) // get the index by its name, a string
-ir.getIndex(name-of-index, Foo.type) // filtered by specific type
-
-ir.getAnnotationIndex()      // get AnnotationIndex
-ir.getAnnotationIndex(Foo.type)      // filtered by specific type</programlisting>
-      
-      <para>For convenience, the getAnnotationIndex method is available directly on the JCas object
-      instance; the implementation merely forwards to the associated index repository.</para>
-      
-      <para>A filtering type must be a subtype of the type specified for this index in its
-        index specification. It can be written either as Foo.type or, if you have an instance
-        of Foo, as</para>
-      
-      <programlisting>fooInstance.jcasType.casType</programlisting>
-      
-      <para>Foo is (of course) an example of the name of the type.</para>
-      
-    </section>
-    <section id="ugr.ref.jcas.adding_removing_instances_to_indexes">
-      <title>Adding (and removing) instances to (from) indexes</title>
-      <titleabbrev>Updating Indexes</titleabbrev>
-      
-      <para>CAS indexes are maintained automatically by the CAS. But you must add any
-        instances of feature structures you want the index to find, to the indexes by using the
-        call:</para>
-      
-      <programlisting>myInstance.addToIndexes();</programlisting>
-      
-      <para>Do this after setting all features in the instance <emphasis role="bold-italic">which could be used in indexing</emphasis>, for example, in
-        determining the sorting order. After indexing, do not change the values of these
-        particular features because the indexes will not be updated. If you need to change the
-        values, you must first remove the instance from the CAS indexes, change the values,
-        and then add the instance back. To remove an instance from the indexes, use the method:
-        
-        <programlisting>myInstance.removeFromIndexes();</programlisting></para>
-      <note><para>It&apos;s OK to change feature values which are not used in determining
-      sort ordering (or set membership), without removing and re-adding back to the index.
-      </para></note>
-      
-      <para>When writing a Multi-View component, you may need to index instances in multiple
-        CAS views. The methods above use the indexes associated with the current JCas object.
-        There is a variation of the <literal>addToIndexes / removeFromIndexes</literal> methods which
-        takes one argument: a reference to a JCas object holding the view in which you want to 
-        index this instance.
-        <programlisting>myInstance.addToIndexes(anotherJCas)
-myInstance.removeFromIndexes(anotherJCas)</programlisting>
-      </para>
-      
-      <para>
-        You can also explicitly add instances to other views using the addFsToIndexes method on
-        other JCas (or CAS) objects. For instance, if you had 2 other CAS views (myView1 and
-        myView2), in which you wanted to index myInstance, you could write:</para>
-      
-      <programlisting>myInstance.addToIndexes(); // index in the view associated with myInstance
-myView1.addFsToIndexes(myInstance); // index myInstance in myView1
-myView2.addFsToIndexes(myInstance); // index myInstance in myView2</programlisting>
-      
-      <para>
-        The rules for determining which index to use with a particular JCas object are designed to
-        behave the way most would think they should; if you need specific behavior, you can always 
-        explicitly designate which view the index adding and removing operations should work on.
-      </para>
-      
-      <para>
-        The rules are applied in order:
-        <orderedlist>
-          <listitem><para>If the instance is of a subtype of AnnotationBase, then the view is the view associated with the
-            annotation, as specified in the feature holding the view reference in AnnotationBase.</para></listitem>
-          <listitem><para>Otherwise, if the instance was created using the "new" operator, then the view is the view passed to the
-            instance's constructor.</para></listitem>
-          <listitem><para>Otherwise, if the instance was created by getting a feature value, whose range type is a feature
-            structure, from some other instance, then the view is the same as that of the referring instance.</para></listitem>
-          <listitem><para>Otherwise, if the instance was created by any of the Feature Structure Iterator operations over some index,
-            then the view is the view associated with the index.</para></listitem>
-        </orderedlist>
-      </para>
-    </section>
-    
-    <section id="ugr.ref.jcas.using_iterators">
-      <title>Using Iterators</title>
-      
-      <para>Once you have an index obtained from the JCas, you can get an iterator from the
-        index; here is an example:</para>
-      
-      
-      <programlisting>// Alternative 1: unfiltered index, via the plain CAS index repository
-FSIndexRepository ir = jcas.getFSIndexRepository();
-FSIndex myIndex = ir.getIndex("myIndexName");
-FSIterator myIterator = myIndex.iterator();
-
-// Alternative 2: index filtered by type, via the JCas index repository
-JFSIndexRepository jir = jcas.getJFSIndexRepository();
-FSIndex myFilteredIndex = jir.getIndex("myIndexName", Foo.type); // filtered
-FSIterator myFilteredIterator = myFilteredIndex.iterator();</programlisting>
-      
-      <para>Iterators work like normal Java iterators, but are augmented to support
-        additional capabilities. Iterators are described in the CAS Reference, <olink
-          targetdoc="&uima_docs_ref;"
-          targetptr="ugr.ref.cas.indexes_and_iterators"/>.</para>
-      
-    </section>
-    
-    <section id="ugr.ref.jcas.class_loaders">
-      <title>Class Loaders in UIMA</title>
-      
-      <para>The basic concept of a UIMA application includes assembling engines into a flow.
-        The application made up of these engines is run within the UIMA Framework, either by
-        the Collection Processing Manager, or by using more basic UIMA Framework
-        APIs.</para>
-      
-      <para>The UIMA Framework exists within a JVM (Java Virtual Machine). A JVM has the
-        capability to load multiple applications, in a way where each one is isolated from the
-        others, by using a separate class loader for each application. For instance, one set
-        of UIMA Framework Classes could be shared by multiple sets of application-specific
-        classes, even if these application-specific classes had the same names but were
-        different versions.</para>
-      
-      <section id="ugr.ref.jcas.class_loaders.optional">
-        <title>Use of Class Loaders is optional</title>
-        
-        <para>The UIMA framework will use a specific ClassLoader, based on how
-          ResourceManager instances are used. Specific ClassLoaders are only created if
-          you specify an ExtensionClassPath as part of the ResourceManager. If you do not
-          need to support multiple applications within one UIMA framework within a JVM,
-          don&apos;t specify an ExtensionClassPath; in this case, the classloader used
-          will be the one used to load the UIMA framework - usually the overall application
-          class loader.</para>
-        
-        <para>Of course, you should not run multiple UIMA applications together, in this
-          way, if they have different class definitions for the same class name. This
-          includes the JCas <quote>cover</quote> classes. This case might arise, for
-          instance, if both applications extended
-          <literal>uima.tcas.DocumentAnnotation</literal> in differing,
-          incompatible ways. Each application would need its own definition of this class,
-          but only one could be loaded (unless you specify ExtensionClassPath in the
-          ResourceManager which will cause the UIMA application to load its private
-          versions of its classes, from its classpath).</para>
-      </section>
-    </section>
-    
-    <section id="ugr.ref.jcas.accessing_jcas_objects_outside_uima_components">
-      <title>Issues accessing JCas objects outside of UIMA Engine Components</title>
-      
-      <para>If you are using the ExtensionClassPaths, the JCas cover classes are loaded
-        under a class loader created by the ResourceManager part of the UIMA Framework.
-        If you reference the same JCas
-        classes outside of any UIMA component, for instance, in top level application code,
-        the JCas classes used by that top level application code also must be in the class path
-        for the application code.</para>
-      
-      <para>Alternatively, you could do all the JCas processing inside a UIMA component (and do no
-        processing using JCas outside of the UIMA pipeline).</para>
-      
-    </section>
-  </section>
-  
-  <section id="ugr.ref.jcas.setting_up_classpath">
-    <title>Setting up Classpath for JCas</title>
-    
-    <para>The JCas Java classes generated by JCasGen are typically compiled and put into a JAR
-      file, which, in turn, is put into the application&apos;s class path.</para>
-    
-    <para>This JAR file must be generated from the application&apos;s merged type system.
-      This is most conveniently done by opening the top level descriptor used by the
-      application in the Component Descriptor Editor tool, and pressing the Run-JCasGen
-      button on the Type System Definition page.</para>
-    
-  </section>
-  
-  <section id="ugr.ref.jcas.pear_support">
-    <title>PEAR isolation</title>
-    <para>
-      As of version 2.2, the framework supports component descriptors which are PEAR descriptors. 
-      These descriptors define components plus include information on the class path needed to 
-      run them.  The framework uses the class path information to set up a localized class path, just
-      for code running within the PEAR context.  This allows PEAR files requiring different 
-      versions of common code to work well together, even if the classes in the different versions
-      have the same names. 
-    </para>
-    
-  </section>
-  
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"[
+<!ENTITY % uimaents SYSTEM "../entities.ent" >  
+%uimaents;
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<chapter id="ugr.ref.jcas">
+  <title>JCas Reference</title>
+  
+  <para>The CAS is a system for sharing data among annotators, consisting of data structures
+    (definable at run time), sets of indexes over these data, metadata describing these, subjects of
+    analysis, and a high
+    performance serialization/deserialization mechanism. JCas provides a Java approach to
+    accessing CAS data, and is based on using generated, specific Java classes for each CAS
+    type.</para>
+  
+  <para>Annotators process one CAS per call to their process method. During processing,
+    annotators can retrieve feature structures from the passed in CAS, add new ones, modify
+    existing ones, and use and update CAS indexes. Of course, an annotator can also use plain
+    Java Objects in addition; but the data in the CAS is what is shared among annotators within
+    an application.</para>
+  
+  <para>All the facilities present in the APIs for the CAS are available when using the JCas
+    APIs; indeed, you can use the getCas() method to get the corresponding CAS object from a
+    JCas (and vice-versa). The JCas APIs often have helper methods that make using this
+    interface more convenient for Java developers.</para>
+  
+  <para>The data in the CAS are typed objects having fields. JCas uses a set of generated Java
+    classes (each corresponding to a particular CAS type) with <quote>getter</quote> and
+    <quote>setter</quote> methods for the features, plus a constructor so new instances can
+    be made. The Java classes don&apos;t actually store the data in the class instance;
+    instead, the getters and setters forward to the underlying CAS data representation.
+    Because of this, applications which use the JCas interface can share data with annotators
+    using plain CAS (i.e., not using the JCas approach). </para>
+  
+    <para>Users can modify the JCas generated
+    Java classes by adding fields to them; this allows arbitrary non-CAS data to also be
+    represented within the JCas objects, as well; however, the non-CAS data stored in the JCas
+    object instances cannot be shared with annotators using the plain CAS.</para>
+  
+  <para>Data in the CAS initially has no corresponding JCas type instances; these are created
+    as needed at the first reference. This means, if your annotator is passed a large CAS having
+    millions of CAS feature structures, but you only reference a few of them, and no
+    Java JCas object instances were previously created by upstream annotators, the only Java
+    objects that will be created will be those that correspond to the CAS feature structures
+    that you reference.</para>
+  
+  <para>The JCas class Java source files are generated from XML type system descriptions. The
+    JCasGen utility does the work of generating the corresponding Java Class Model for the CAS
+    types. There are a variety of ways JCasGen can be run; these are described later. You
+    include the generated classes with your UIMA component, and you can publish these classes
+    for others who might want to use your type system.</para>
+  
+  <para>The specification of the type system in XML can be written using a conventional text
+    editor, an XML editor, or using the Eclipse plug-in that supports editing UIMA
+    descriptors.</para>
+  
+  <para>Changes to the type system are done by changing the XML and regenerating the
+    corresponding Java Class Models. Of course, once you&apos;ve published your type system
+    for others to use, you should be careful that any changes you make don&apos;t adversely
+    impact the users. Additional features can be added to existing types without breaking
+    other code.</para>
+  
+  <para>A separate Java class is generated for each type; this class implements the CAS
+    FeatureStructure interface, as well as having the special getters and setters for the
+    included features. In the current implementation, an additional helper class per type is
+    also generated. The generated Java classes have methods (getters and setters) for the
+    fields as defined in the XML type specification. Descriptor comments are reflected in the
+    generated Java code as Java-doc style comments.</para>
+  
+  
+  <section id="ugr.ref.jcas.name_spaces">
+    <title>Name Spaces</title>
+    
+    <para>Full Type names consist of a <quote>namespace</quote> prefix dotted with a simple
+      name. Namespaces are used like packages to avoid collisions between types that are
+      defined by different people at different times. The namespace is used as the Java
+      package name for generated Java files.</para>
+      
+    <para>Type names used in the CAS correspond to the generated Java classes directly. If the
+      CAS name is com.myCompany.myProject.ExampleClass, the generated Java class is in the
+      package com.myCompany.myProject, and the class is ExampleClass.</para>
+      
+    <para>
+      An exception to this rule is the built-in types
+      starting with <literal>uima.cas</literal> and <literal>uima.tcas</literal>;
+      these names are mapped to Java packages named
+      <literal>org.apache.uima.jcas.cas</literal> and
+      <literal>org.apache.uima.jcas.tcas</literal>.</para>
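+    
+    <para>As an illustration of these two rules, the first line below uses the example type name
+      from above, and the other two use built-in types:</para>
+    
+    <programlisting>com.myCompany.myProject.ExampleClass &rarr; com.myCompany.myProject.ExampleClass
+uima.cas.TOP                         &rarr; org.apache.uima.jcas.cas.TOP
+uima.tcas.Annotation                 &rarr; org.apache.uima.jcas.tcas.Annotation</programlisting>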
+    
+  </section>
+  
+  <section id="ugr.ref.jcas.use_of_description">
+    <title>XML description element</title>
+    <titleabbrev>Use of XML Description</titleabbrev>
+    
+    <para>Each XML type specification can have &lt;description ...
+      &gt; tags. The description for a type will be copied into the generated Java code, as a
+      Javadoc style comment for the class. When writing these descriptions in the XML type
+      specification file, you might want to use html tags, as allowed in Javadocs.</para>
+    
+    <para>If you use the Component Descriptor Editor, you can write the html tags normally,
+      for instance, <quote>&lt;h1&gt;My Title&lt;/h1&gt;</quote>. The Component
+      Descriptor Editor will take care of converting the actual descriptor source so that it
+      has the leading <quote>&lt;</quote> character written as <quote>&amp;lt;</quote>,
+      to avoid confusing the XML type specification. For example, &lt;p&gt; would be written
+      in the source of the descriptor as &amp;lt;p&gt;. Any characters used in the Javadoc
+      comment must of course be from the character set allowed by the XML type specification.
+      These specifications often start with the line &lt;?xml version=<quote>1.0</quote>
+      encoding=<quote>UTF-8</quote> ?&gt;, which means you can use any of the UTF-8
+      characters.</para>
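+    
+    <para>As a hedged sketch, a description carrying an html heading might appear in the source
+      of the descriptor roughly as follows. The type name
+      <literal>com.example.MyType</literal> and the description text are hypothetical, chosen
+      only for illustration:</para>
+    
+    <programlisting>&lt;typeDescription&gt;
+  &lt;name&gt;com.example.MyType&lt;/name&gt;
+  &lt;description&gt;&amp;lt;h1&gt;My Title&amp;lt;/h1&gt; An example type.&lt;/description&gt;
+  &lt;supertypeName&gt;uima.tcas.Annotation&lt;/supertypeName&gt;
+&lt;/typeDescription&gt;</programlisting>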
+    
+  </section>
+  
+  <section id="ugr.ref.jcas.mapping_built_ins">
+    <title>Mapping built-in CAS types to Java types</title>
+    
+    <para>The built-in primitive CAS types map to Java types as follows:</para>
+    
+    
+    <programlisting>uima.cas.Boolean &rarr; boolean
+uima.cas.Byte    &rarr; byte
+uima.cas.Short   &rarr; short
+uima.cas.Integer &rarr; int
+uima.cas.Long    &rarr; long
+uima.cas.Float   &rarr; float
+uima.cas.Double  &rarr; double
+uima.cas.String  &rarr; String</programlisting>
+    
+  </section>
+  
+  <section id="ugr.ref.jcas.augmenting_generated_code">
+    <title>Augmenting the generated Java Code</title>
+    
+    <para>The Java Class Models generated for each type can be augmented by the user. Typical
+      augmentations include adding additional (non-CAS) fields and methods, and import
+      statements that might be needed to support these. Commonly added methods include
+      additional constructors (having different parameter signatures), and
+      implementations of toString().</para>
+    
+    <para>To augment the code, just edit the generated Java source code for the class named the
+      same as the CAS type. Here&apos;s an example of an additional method you might add; the
+      various getter methods are retrieving values from the instance:</para>
+    
+    
+    <programlisting>public String toString() { // for debugging
+  return "XsgParse "
+    + getslotName() + ": "
+    + getheadWord().getCoveredText()
+    + " seqNo: " + getseqNo()
+    + ", cAddr: " + id
+    + ", size left mods: " + getlMods().size()
+    + ", size right mods: " + getrMods().size();
+}</programlisting>
+ 
+    <section id="ugr.ref.jcas.data_persistence">
+      <title>Persistence of additional data</title>
+      <para>If you add custom instance fields to JCas cover classes, these exist in the JCas cover object instance,
+        but not in the CAS itself. Each time a CAS object is referenced (by an iterator, or by following a Feature
+        Structure reference), a new JCas cover object instance may be created. If you need these values, you can (a)
+        make them CAS values if possible, or (b) hold a reference to the particular JCas cover object instance in
+        your Java code. For some simple cases, setting the performance tuning option JCAS_CACHE_ENABLE (see
+          <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="tug.application.pto"/>)
+         to true
+        will cause the same JCas cover object that was previously used for a particular CAS Feature Structure to be
+        reused. However, this capability won't work when other factors interfere with the ability to reuse the same
+        object.  PEAR isolation is an example of this.</para>
+      <para>Because of this, and because the JCas Cache holds on to the JCas cover objects beyond their useful life and
+        prevents them from being garbage collected, it is normally recommended to run with
+        JCAS_CACHE_ENABLE set to "false".</para>
+    </section>   
+    <section id="ugr.ref.jcas.keeping_augmentations_when_regenerating">
+      <title>Keeping hand-coded augmentations when regenerating</title>
+      
+      <para>If the type system specification changes, you have to re-run the JCasGen
+        generator. This will produce updated Java for the Class Models that capture the
+        changed specification. If you have previously augmented the source for these Java
+        Class Models, your changes must be merged with the newly (re)generated Java source
+        code for the Class Models. This can be done by hand, or you can run the version of JCasGen
+        that is integrated with Eclipse, and use automatic merging that is done using Eclipse&apos;s EMF
+        plug-in. You can obtain Eclipse and the needed EMF plug-in from <ulink
+          url="http://www.eclipse.org/"/>.</para>
+      
+      <para>If you run the generator version that works without using Eclipse, it will not
+        merge Java source changes you may have previously made; if you want them retained,
+        you&apos;ll have to do the merging by hand.</para>
+      
+      <para>The Java source merging will keep additional constructors, additional fields,
+        and any changes you may have made to the readObject method (see below). Merging will
+        <emphasis>not</emphasis> delete classes in the target corresponding to deleted CAS types, which no longer
+        are in the source &ndash; you should delete these by hand.</para>
+      
+      <warning><para>The merging supports Java 1.4 syntactic constructs only.  
+        JCasGen generates Java 1.4 code, so as long as any code you change here also sticks to 
+        only Java 1.4 constructs, the merge will work.  If you use Java 5 or later specific syntax or constructs, the merge
+        operation will likely fail to merge properly.</para></warning>
+    </section>
+    
+    <section id="ugr.ref.jcas.additional_constructors">
+      <title>Additional Constructors</title>
+      
+      <para>Any additional constructors that you add must include the JCas argument. The
+        first line of your constructor is required to be</para>
+      
+      
+      <programlisting>this(jcas);        // run the standard constructor</programlisting>
+      
+      <para>where jcas is the passed in JCas reference. If the type you&apos;re defining
+        extends <literal>uima.tcas.Annotation</literal>, JCasGen will automatically
+        add a constructor which takes 2 additional parameters &ndash; the begin and end Java
+        int values, and set the <literal>uima.tcas.Annotation</literal>
+        <literal>begin</literal> and <literal>end</literal> fields.</para>
+      
+      <para>Here&apos;s an example: If you&apos;re defining a type MyType which has a
+        feature parent, you might make an additional constructor which has an additional
+        argument of parent:</para>
+      
+      
+      <programlisting>MyType(JCas jcas, MyType parent) {
+  this(jcas);        // run the standard constructor
+  setParent(parent); // set the parent field from the parameter
+}</programlisting>
+      
+      <section id="ugr.ref.jcas.using_readobject">
+        <title>Using readObject</title>
+        
+        <para>Fields defined by augmenting the Java Class Model to include additional
+          fields represent data that exist for this class in Java, in a local JVM (Java Virtual
+          Machine), but do not exist in the CAS when it is passed to other environments (for
+          example, passing to a remote annotator).</para>
+        
+        <para>A problem can arise when new instances are created, perhaps by the underlying
+          system when it iterates over an index, which is: how to ensure that any additional
+          non-CAS fields are properly initialized. To allow for arbitrary initialization
+          at instance creation time, an initialization method in the Java Class Model,
+          called readObject is used. The generated default for this method is to do nothing,
+          but it is one of the methods that you can modify &ndash; to do whatever
+          initialization might be needed. It is called with 0 parameters, during the
+          constructor for the object, after the basic object fields have been set up. It can
+          refer to fields in the CAS using the getters and setters, and other fields in the Java
+          object instance being initialized.</para>
+        
+        <para>A pre-existing CAS feature structure could exist if a CAS was being passed to
+          this annotator; in this case the JCas system calls the readObject method when
+          creating the corresponding Java instance for the first time for the CAS feature
+          structure. This can happen at two points: when a new object is being returned from an
+          iterator over a CAS index, or a getter method is getting a field for the first time
+          whose value is a feature structure.</para>
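+        
+        <para>A minimal sketch of such an initialization follows. It assumes a hypothetical
+          non-CAS field <literal>cachedLabel</literal> added by hand to the generated class,
+          and a hypothetical String feature <literal>label</literal> with a generated getter
+          <literal>getLabel()</literal>; neither is part of UIMA. Keep whatever visibility
+          modifier JCasGen generated for readObject in your own class.</para>
+        
+        <programlisting>// hand-added, non-CAS field (hypothetical); exists only in this Java object
+private String cachedLabel;
+
+// called with no parameters while the Java instance is being constructed
+private void readObject() {
+  String label = getLabel();  // read a CAS feature through its getter
+  cachedLabel = (label == null) ? "" : label.trim();
+}</programlisting>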
+        
+      </section>
+    </section>
+    
+    <section id="ugr.ref.jcas.modifying_generated_items">
+      <title>Modifying generated items</title>
+      
+      <para>The following modifications, if made in generated items, will be preserved when
+        regenerating.</para>
+      
+      <para>The public/private etc. flags associated with methods (getters and setters).
+        You can change the default (<quote>public</quote>) if needed.</para>
+      
+      <para><quote>final</quote> or <quote>abstract</quote> can be added to the type
+        itself, with the usual semantics.</para>
+      
+    </section>
+  </section>
+  
+  <section id="ugr.ref.jcas.merging_types_from_other_specs">
+    <title>Merging types</title>
+    <titleabbrev>Merging Types</titleabbrev>
+    <para>Type definitions are merged by the framework from all the components being run together.</para>
+    
+    <section id="ugr.ref.jcas.merging_types.aggregates_and_cpes">
+      <title>Aggregate AEs and CPEs as sources of types</title>
+      
+      <para>When running aggregate AEs (Analysis Engines), or a set of AEs in a collection processing engine, the
+        UIMA framework will build a merged type system (Note: this <quote>merge</quote> is merging types, not to be
+        confused with merging Java source code, discussed above). This merged type system has all the types of every
+        component used in the application.  In addition, application code can use UIMA Framework APIs to read and merge
+        type descriptions, manually.</para>
+      
+      <para>In most cases, each type system can have its own Java Class Models generated individually, perhaps at an
+        earlier time, and the resulting class files (or .jar files containing these class files) can be put in the
+        class path to enable JCas.</para>
+      
+      <para>However, it is possible that there may be multiple definitions of the same CAS type, each of which might
+        have different features defined. In this case, the UIMA framework will create a merged type by accumulating
+        all the defined features for a particular type into that type&apos;s type definition. However, the JCas
+        classes for these types are not automatically merged, which can create some issues for JCas users, as
+        discussed in the next section.</para>
+
+    </section>
+    
+    <section id="ugr.ref.jcas.merging_types.jcasgen_support">
+      <title>JCasGen support for type merging</title>
+      
+      <para>When there are multiple definitions of the same CAS type with different features defined, then JCasGen
+        can be re-run on the merged type system, to create one set of JCas Class definitions for the merged types,
+        which can then be shared by all the components. 
+        Directions for running JCasGen can be found in <olink
+          targetdoc="&uima_docs_tools;" targetptr="ugr.tools.jcasgen"/>. This is typically done by the person who
+        is assembling the Aggregate Analysis Engine or Collection Processing Engine. The resulting merged Java
+        Class Model will then contain get and set methods for the complete set of features. These Java classes must
+        then be made available in the class path, <emphasis>replacing</emphasis> the pre-merge versions of the
+        classes.</para>
+      
+      <para>If hand-modifications were done to the pre-merge versions of the classes, these must be applied to the
+        merged versions, as described in section <xref
+          linkend="ugr.ref.jcas.keeping_augmentations_when_regenerating"/>, above. If just one of the
+        pre-merge versions had hand-modifications, the source for this hand-modified version can be put into the
+        file system where the generated output will go, and the -merge option for JCasGen will automatically
+        merge the hand-modifications with the generated code. If
+        <emphasis>both</emphasis> pre-merged versions had hand-modifications, then these modifications must
+        be manually merged.</para>
+      
+      <para>An alternative to this is packaging the components as individual PEAR files, each with their own
+      version of the JCas generated Classes.  The Framework (as of release 2.2) can run PEAR files using the 
+      pear file descriptor, and supply each component with its particular version of the JCas generated class.</para>
+      
+    </section>
+    
+    <section id="ugr.ref.jcas.impact_of_type_merging_on_composability">
+      <title>Impact of Type Merging on Composability of Annotators</title>
+      <titleabbrev>Type Merging impacts on Composability</titleabbrev>
+      
+      <para>The recommended approach in UIMA is to build and maintain type systems as separate components, which are
+        imported by Annotators. Using this approach, Type Merging does not occur because the Type System and its JCas
+        classes are centrally managed and shared by the annotators.</para>
+      
+      <para>If you do choose to create a JCas Annotator that relies on Type Merging (meaning that your annotator
+        redefines a Type that is already in use elsewhere, and adds its own features), this can negatively impact the
+        reusability of your annotator, unless your component is used as a PEAR file.</para>
+      
+      <para>If you are not using the PEAR file packaging isolation capability, whenever 
+        anyone wants to combine your annotator with another annotator that uses a different version of
+        the same Type, they will need to be aware of all of the issues described in the previous section. They will need
+        to have the know-how to re-run JCasGen and appropriately set up their classpath to include the merged Java
+        classes and to not include the pre-merge classes. (To enable this, you should package these classes
+        separately from other .jar files for your annotator, so that they can be more easily excluded.) And, if you
+        have done hand-modifications to your JCas classes, the person assembling your annotator will need to
+        properly merge those changes. These issues significantly complicate the task of combining annotators, and
+        will cause your annotator not to be as easily reusable as other UIMA annotators. </para>
+      
+    </section>
+    
+    <section id="ugr.ref.jcas.documentannotation_issues">
+      <title>Adding Features to DocumentAnnotation</title>
+      
+      <para>There is one built-in type, <literal>uima.tcas.DocumentAnnotation</literal>, 
+        to which applications can add additional features.  (All other built-in types
+        are "feature-final" and you cannot add additional features to them.)  Frequently,
+        additional features are added to <literal>uima.tcas.DocumentAnnotation</literal> 
+        to provide a place to store document-level metadata.</para>
+      
+      <para>For the same reasons mentioned in the previous section, adding features to 
+        DocumentAnnotation is not recommended if you are using JCas.  Instead, it is recommended
+        that you define your own type for storing your document-level metadata.  You can create 
+        an instance of this type and add it to the indexes in the usual way.  You can then
+        retrieve this instance using the iterator returned from the method <literal>getAllIndexedFS(type)</literal>
+        on an instance of a JFSIndexRepository object.
+        (As of UIMA v2.1, you do not have to declare a custom index in your descriptor to
+        get this to work).</para>
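+      
+      <para>A sketch of this retrieval follows. It assumes a hypothetical JCasGen-generated type
+        <literal>DocMetaData</literal> (not part of UIMA) of which at most one instance has been
+        added to the indexes, and a <literal>jcas</literal> variable holding the current JCas:</para>
+      
+      <programlisting>JFSIndexRepository ir = jcas.getJFSIndexRepository();
+FSIterator it = ir.getAllIndexedFS(DocMetaData.type);
+DocMetaData meta = null;
+if (it.hasNext()) {
+  meta = (DocMetaData) it.next();  // the document-level metadata instance
+}</programlisting>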
+      
+      <para>If you do choose to add features to DocumentAnnotation, there are additional issues to
+        be aware of.  The UIMA SDK provides the JCas cover class for the built-in definition of
+        DocumentAnnotation, in the separate jar file <literal>uima-document-annotation.jar</literal>.
+        If you add additional features to DocumentAnnotation, you must remove this jar file
+        from your classpath, because you will not want to use the default JCas cover class.
+        You will need to re-run JCasGen as described in <xref
+          linkend="ugr.ref.jcas.merging_types.jcasgen_support"/>.  JCasGen will generate a new cover
+        class for DocumentAnnotation, which you must place in your classpath in lieu of the version
+        in <literal>uima-document-annotation.jar</literal>.</para>
+        
+      <para>This is also why the method <literal>JCas.getDocumentAnnotationFs()</literal> returns
+        type <literal>TOP</literal>, rather than type <literal>DocumentAnnotation</literal>.  Because the
+        <literal>DocumentAnnotation</literal> class can be replaced by users, it is not part of
+        <literal>uima-core.jar</literal> and so the core UIMA framework cannot have any references
+        to it.  In your code, you may <quote>cast</quote> the result of <literal>JCas.getDocumentAnnotationFs()</literal> 
+        to type <literal>DocumentAnnotation</literal>, which must be available on the classpath either via 
+        <literal>uima-document-annotation.jar</literal> or by including a custom version that you have generated using JCasGen.</para>
+    </section>
+    
+  </section>
+  
+  <section id="ugr.ref.jcas.using_within_an_annotator">
+    <title>Using JCas within an Annotator</title>
+    
+    <para>To use JCas within an annotator, you must include the generated Java classes output
+      from JCasGen in the class path.</para>
+    
+    <para>An annotator written using JCas is built by defining a class for the annotator that
+      extends JCasAnnotator_ImplBase. The process method for this annotator is
+      written as follows:</para>
+    
+    <programlisting>public void process(JCas jcas)
+     throws AnalysisEngineProcessException {
+  ... // body of annotator goes here
+}</programlisting>
+    
+    <para>The process method is passed the JCas instance to use as a parameter.</para>
+    
+    <para>The JCas reference is used throughout the annotator to refer to the particular JCas
+      instance being worked on. In pooled or multi-threaded implementations, each thread that
+      is processing simultaneously works on its own separate JCas.</para>
+    
+    <para>You can do several kinds of operations using the JCas APIs: create new feature
+      structures (instances of CAS types) (using the new operator), access existing feature
+      structures passed to your annotator in the JCas (for example, by using the next method of
+      an iterator over the feature structures), get and set the fields of a particular
+      instance of a feature structure, and add and remove feature structure instances from
+      the CAS indexes. To support iteration, there are also functions to get and use indexes
+      and iterators over the instances in a JCas.</para>
+    
+    <section id="ugr.ref.jcas.new_instances">
+      <title>Creating new instances using the Java <quote>new</quote> operator</title>
+      <titleabbrev>Creating new instances</titleabbrev>
+      
+      <para>The new operator creates new instances of JCas types. It takes at least one
+        parameter, the JCas instance in which the type is to be created. For example, if a
+        type Meeting is defined, you can create a new instance of it using:
+        
+        <programlisting>Meeting m = new Meeting(jcas);</programlisting></para>
+      
+      <para>Other variations of constructors can be added in custom code; the single
+        parameter version is the one automatically generated by JCasGen. For types that are
+        subtypes of Annotation, JCasGen also generates an additional constructor with
+        additional <quote>begin</quote> and <quote>end</quote> arguments.</para>
+      
+    </section>
+    <section id="ugr.ref.jcas.getters_and_setters">
+      <title>Getters and Setters</title>
+      
+      <para>If the CAS type Meeting had fields location and time, you could get or set these by
+        using getter or setter methods. These methods have names formed by splicing together
+        the word <quote>get</quote> or <quote>set</quote> followed by the field name, with
+        the first letter of the field name capitalized. For instance
+        
+        <programlisting>getLocation()</programlisting></para>
+      
+      <para>The getter forms take no parameters and return the value of the field; the setter
+        forms take one parameter, the value to set into the field, and return void.</para>
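+      
+      <para>The naming rule can be sketched in a few lines of plain Java. This is only an
+        illustration of the convention described above; the class <literal>AccessorNames</literal>
+        and its methods are hypothetical helpers and are not part of UIMA or of the code
+        generated by JCasGen.</para>
+      

```java
// Hypothetical helper (not a UIMA class): derives accessor names from a field name.
class AccessorNames {

    // Prepends the prefix and capitalizes the first letter of the field name.
    static String accessorName(String prefix, String fieldName) {
        return prefix + Character.toUpperCase(fieldName.charAt(0)) + fieldName.substring(1);
    }

    static String getterName(String fieldName) {
        return accessorName("get", fieldName);
    }

    static String setterName(String fieldName) {
        return accessorName("set", fieldName);
    }

    public static void main(String[] args) {
        System.out.println(getterName("location")); // prints "getLocation"
        System.out.println(setterName("time"));     // prints "setTime"
    }
}
```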
+      
+      <para>There are built-in CAS types for arrays of integers, strings, floats, and
+        feature structures. For fields whose values are these types of arrays, there is an
+        alternate form of getters and setters that take an additional parameter, written as
+        the first parameter, which is the index in the array of an item to get or set.</para>
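+      
+      <para>The shape of the indexed accessors can be illustrated with an ordinary Java class.
+        This is a sketch under stated assumptions: <literal>MeetingSketch</literal> and its
+        <literal>attendees</literal> field are invented for the example, and the class is a
+        plain Java object backed by a Java array, not a JCas cover class. It shows only that
+        the array index is written as the first parameter.</para>
+      

```java
// Hypothetical stand-in for a type with an array-valued field named "attendees".
// It only mirrors the accessor signatures; it does not use the CAS.
class MeetingSketch {
    private final String[] attendees;

    MeetingSketch(int size) {
        attendees = new String[size];
    }

    // Plain getter: returns the whole array value.
    String[] getAttendees() {
        return attendees;
    }

    // Indexed getter: the index is the first (and only) parameter.
    String getAttendees(int i) {
        return attendees[i];
    }

    // Indexed setter: the index comes first, then the value to store.
    void setAttendees(int i, String value) {
        attendees[i] = value;
    }

    public static void main(String[] args) {
        MeetingSketch m = new MeetingSketch(2);
        m.setAttendees(0, "Ada");
        System.out.println(m.getAttendees(0)); // prints "Ada"
    }
}
```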
+      
+    </section>
+    
+    <section id="ugr.ref.jcas.obtaining_refs_to_indexes">
+      <title>Obtaining references to Indexes</title>
+      
+      <para>The only way to access instances (not otherwise referenced from other
+        instances) passed in to your annotator in its JCas is to use an iterator over some
+        index. Indexes in the CAS are specified in the annotator descriptor. Indexes have a
+        name; text annotators have a built-in, standard index over all annotations.</para>
+      
+      <para>To get an index, first get the JFSIndexRepository from the JCas using the method
+        jcas.getJFSIndexRepository(). Here are the calls to get indexes:</para>
+      
+      
+      <programlisting>JFSIndexRepository ir = jcas.getJFSIndexRepository();
+
+ir.getIndex(name-of-index) // get the index by its name, a string
+ir.getIndex(name-of-index, Foo.type) // filtered by specific type
+
+ir.getAnnotationIndex()      // get AnnotationIndex
+ir.getAnnotationIndex(Foo.type)      // filtered by specific type</programlisting>
+      
+      <para>For convenience, the getAnnotationIndex method is available directly on the JCas object
+      instance; the implementation merely forwards to the associated index repository.</para>
+      
+      <para>A filtering type must be a subtype of the type specified for this index in its
+        index specification. It can be written either as Foo.type or, if you have an instance
+        of Foo, as</para>
+      
+      <programlisting>fooInstance.jcasType.casType</programlisting>
+      
+      <para>Foo is (of course) an example of the name of the type.</para>
+      
+    </section>
+    <section id="ugr.ref.jcas.adding_removing_instances_to_indexes">
+      <title>Adding (and removing) instances to (from) indexes</title>
+      <titleabbrev>Updating Indexes</titleabbrev>
+      
+      <para>CAS indexes are maintained automatically by the CAS. However, you must explicitly
+        add each feature structure instance that you want the indexes to find, by using the
+        call:</para>
+      
+      <programlisting>myInstance.addToIndexes();</programlisting>
+      
+      <para>Do this after setting all features in the instance <emphasis role="bold-italic">which could be used in indexing</emphasis>, for example, in
+        determining the sorting order. After indexing, do not change the values of these
+        particular features because the indexes will not be updated. If you need to change the
+        values, you must first remove the instance from the CAS indexes, change the values,
+        and then add the instance back. To remove an instance from the indexes, use the method:
+        
+        <programlisting>myInstance.removeFromIndexes();</programlisting></para>
+      <note><para>It&apos;s OK to change feature values which are not used in determining
+      sort ordering (or set membership), without removing and re-adding back to the index.
+      </para></note>
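+      
+      <para>The reason for the remove, change, and re-add sequence can be seen in any sorted
+        collection. The following sketch uses <literal>java.util.TreeSet</literal> purely as an
+        analogy; it makes no claim about how CAS indexes are implemented, and the names
+        <literal>SortedIndexSketch</literal>, <literal>Item</literal>, and <literal>begin</literal>
+        are invented for the example. A sorted collection places an element according to its
+        key at the time it is added, so the key is changed only while the element is outside
+        the collection.</para>
+      

```java
import java.util.Comparator;
import java.util.TreeSet;

// Analogy only: a TreeSet stands in for a sorted index, Item for a feature structure.
class SortedIndexSketch {

    // Hypothetical element with a single key feature used for sorting.
    static final class Item {
        int begin;

        Item(int begin) {
            this.begin = begin;
        }
    }

    // Creates a sorted "index" ordered by the begin value.
    static TreeSet<Item> newIndex() {
        return new TreeSet<>(Comparator.comparingInt((Item item) -> item.begin));
    }

    // Changes a key value safely: remove first, then modify, then add back.
    static void updateKey(TreeSet<Item> index, Item item, int newBegin) {
        index.remove(item);
        item.begin = newBegin;
        index.add(item);
    }

    public static void main(String[] args) {
        TreeSet<Item> index = newIndex();
        Item a = new Item(10);
        Item b = new Item(20);
        index.add(a);
        index.add(b);

        updateKey(index, a, 30);

        // The item is still found, and it now sorts after b.
        System.out.println(index.contains(a));
        System.out.println(index.last() == a);
    }
}
```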
+      
+      <para>When writing a Multi-View component, you may need to index instances in multiple
+        CAS views. The methods above use the indexes associated with the current JCas object.
+        There is a variation of the <literal>addToIndexes / removeFromIndexes</literal> methods which
+        takes one argument: a reference to a JCas object holding the view in which you want to 
+        index this instance.
+        <programlisting>myInstance.addToIndexes(anotherJCas)
+myInstance.removeFromIndexes(anotherJCas)</programlisting>
+      </para>
+      
+      <para>
+        You can also explicitly add instances to other views using the addFsToIndexes method on
+        other JCas (or CAS) objects. For instance, if you had 2 other CAS views (myView1 and
+        myView2), in which you wanted to index myInstance, you could write:</para>
+      
+      <programlisting>myInstance.addToIndexes(); // index myInstance in the view it was created in
+myView1.addFsToIndexes(myInstance); // index myInstance in myView1
+myView2.addFsToIndexes(myInstance); // index myInstance in myView2</programlisting>
+      
+      <para>
+        The rules for determining which index to use with a particular JCas object are designed to
+        behave the way most would think they should; if you need specific behavior, you can always 
+        explicitly designate which view the index adding and removing operations should work on.
+      </para>
+      
+      <para>
+        The rules are:
+        <orderedlist>
+          <listitem><para>If the instance is a subtype of AnnotationBase, then the view is the view associated with the 
+            annotation as specified in the feature holding the view reference in AnnotationBase.</para></listitem>
+          <listitem><para>Otherwise, if the instance was created using the "new" operator, then the view is the view passed to the 
+            instance's constructor.</para></listitem>
+          <listitem><para>Otherwise, if the instance was created by getting a feature value from some other instance, whose range
+            type is a feature structure, then the view is the same as the referring instance.</para></listitem>
+          <listitem><para>Otherwise, if the instance was created by any of the Feature Structure Iterator operations over some index,
+            then it is the view associated with the index.</para></listitem>
+        </orderedlist>
+      </para>
+    </section>
+    
+    <section id="ugr.ref.jcas.using_iterators">
+      <title>Using Iterators</title>
+      
+      <para>Once you have an index obtained from the JCas, you can get an iterator from the
+        index; here is an example:</para>
+      
+      
+      <programlisting>// Either: get the index by name from the FSIndexRepository
+FSIndexRepository ir = jcas.getFSIndexRepository();
+FSIndex myIndex = ir.getIndex("myIndexName");
+FSIterator myIterator = myIndex.iterator();
+
+// Or: get the index filtered by type from the JFSIndexRepository
+JFSIndexRepository ir = jcas.getJFSIndexRepository();
+FSIndex myIndex = ir.getIndex("myIndexName", Foo.type); // filtered
+FSIterator myIterator = myIndex.iterator();</programlisting>
+      
+      <para>Iterators work like normal Java iterators, but are augmented to support
+        additional capabilities. Iterators are described in the CAS Reference, <olink
+          targetdoc="&uima_docs_ref;"
+          targetptr="ugr.ref.cas.indexes_and_iterators"/>.</para>
+      
+    </section>
+    
+    <section id="ugr.ref.jcas.class_loaders">
+      <title>Class Loaders in UIMA</title>
+      
+      <para>The basic concept of a UIMA application includes assembling engines into a flow.
+        An application made up of these engines is run within the UIMA Framework, either by
+        the Collection Processing Manager, or by using more basic UIMA Framework
+        APIs.</para>
+      
+      <para>The UIMA Framework exists within a JVM (Java Virtual Machine). A JVM has the
+        capability to load multiple applications in such a way that each one is isolated from the
+        others, by using a separate class loader for each application. For instance, one set
+        of UIMA Framework classes could be shared by multiple sets of application-specific
+        classes, even if these application-specific classes had the same names but were
+        different versions.</para>
+      
+      <section id="ugr.ref.jcas.class_loaders.optional">
+        <title>Use of Class Loaders is optional</title>
+        
+        <para>The UIMA framework will use a specific ClassLoader, based on how
+          ResourceManager instances are used. Specific ClassLoaders are only created if
+          you specify an ExtensionClassPath as part of the ResourceManager. If you do not
+          need to support multiple applications within one UIMA framework within a JVM,
+          don&apos;t specify an ExtensionClassPath; in this case, the classloader used
+          will be the one used to load the UIMA framework - usually the overall application
+          class loader.</para>
+        
+        <para>Of course, you should not run multiple UIMA applications together, in this
+          way, if they have different class definitions for the same class name. This
+          includes the JCas <quote>cover</quote> classes. This case might arise, for
+          instance, if both applications extended
+          <literal>uima.tcas.DocumentAnnotation</literal> in differing,
+          incompatible ways. Each application would need its own definition of this class,
+          but only one could be loaded (unless you specify ExtensionClassPath in the
+          ResourceManager which will cause the UIMA application to load its private
+          versions of its classes, from its classpath).</para>
+      </section>
+    </section>
+    
+    <section id="ugr.ref.jcas.accessing_jcas_objects_outside_uima_components">
+      <title>Issues accessing JCas objects outside of UIMA Engine Components</title>
+      
+      <para>If you are using an ExtensionClassPath, the JCas cover classes are loaded
+        under a class loader created by the ResourceManager part of the UIMA Framework.
+        If you reference the same JCas classes outside of any UIMA component, for instance,
+        in top level application code, the JCas classes used by that top level application
+        code must also be in the class path for the application code.</para>
+      
+      <para>Alternatively, you could do all the JCas processing inside a UIMA component (and do no
+        processing using JCas outside of the UIMA pipeline).</para>
+      
+    </section>
+  </section>
+  
+  <section id="ugr.ref.jcas.setting_up_classpath">
+    <title>Setting up Classpath for JCas</title>
+    
+    <para>The JCas Java classes generated by JCasGen are typically compiled and put into a JAR
+      file, which, in turn, is put into the application&apos;s class path.</para>
+    
+    <para>This JAR file must be generated from the application&apos;s merged type system.
+      This is most conveniently done by opening the top level descriptor used by the
+      application in the Component Descriptor Editor tool, and pressing the Run-JCasGen
+      button on the Type System Definition page.</para>
+    
+  </section>
+  
+  <section id="ugr.ref.jcas.pear_support">
+    <title>PEAR isolation</title>
+    <para>
+      As of version 2.2, the framework supports component descriptors which are PEAR descriptors. 
+      These descriptors define components plus include information on the class path needed to 
+      run them.  The framework uses the class path information to set up a localized class path, just
+      for code running within the PEAR context.  This allows PEAR files requiring different 
+      versions of common code to work well together, even if the classes in the different versions
+      have the same names. 
+    </para>
+    
+  </section>
+  
 </chapter>
\ No newline at end of file

Propchange: incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/references/ref.jcas.xml
------------------------------------------------------------------------------
    svn:eol-style = native