Posted to commits@uima.apache.org by sc...@apache.org on 2008/08/28 23:28:16 UTC

svn commit: r689997 [29/32] - in /incubator/uima/uimaj/trunk/uima-docbooks: ./ src/ src/docbook/overview_and_setup/ src/docbook/references/ src/docbook/tools/ src/docbook/tutorials_and_users_guides/ src/docbook/uima/organization/ src/olink/references/

Modified: incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/tutorials_and_users_guides/tug.multi_views.xml
URL: http://svn.apache.org/viewvc/incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/tutorials_and_users_guides/tug.multi_views.xml?rev=689997&r1=689996&r2=689997&view=diff
==============================================================================
--- incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/tutorials_and_users_guides/tug.multi_views.xml (original)
+++ incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/tutorials_and_users_guides/tug.multi_views.xml Thu Aug 28 14:28:14 2008
@@ -1,696 +1,696 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
-"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"[
-<!ENTITY % uimaents SYSTEM "../entities.ent">  
-%uimaents;
-]>
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-   http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-<chapter id="ugr.tug.mvs">
-  <title>Multiple CAS Views of an Artifact</title>
-  <titleabbrev>Multiple CAS Views</titleabbrev>
-  
-  <para>UIMA provides an extension to the basic model of the CAS which supports analysis of
-    multiple views of the same artifact, all contained within the CAS. This chapter describes
-    the concepts, terminology, and the API and XML extensions that enable this.</para>
-  
-  <para>Multiple CAS Views can simplify things when different versions of the artifact are
-    needed at different stages of the analysis. They are also key to enabling multimodal
-    analysis where the initial artifact is transformed from one modality to another, or where
-    the artifact itself is multimodal, such as the audio, video and closed-captioned text
-    associated with an MPEG object. Each representation of the artifact can be analyzed
-    independently with the standard UIMA programming model; in addition, multi-view
-    components and applications can be constructed.</para>
-  
-  <para>UIMA supports this by augmenting the CAS with additional light-weight CAS objects,
-    one for each view, where these objects share most of the same underlying CAS, except for two
-    things: each view has its own set of indexed Feature Structures, and each view has its own
-    subject of analysis (Sofa) - its own version of the artifact being analyzed. The Feature
-    Structure instances themselves are in the shared part of the CAS; only the entries in the
-    indexes are unique for each CAS view.</para>
-  
-  <para>All of these CAS view objects are kept together with the CAS, and passed as a unit
-    between components in a UIMA application. APIs exist which allow components and
-    applications to switch among the various view objects, as needed.</para>
-  
-  <para>Feature Structures may be indexed in multiple views, if necessary. New methods on CAS
-    Views facilitate adding or removing Feature Structures to or from their index
-    repositories:</para>
-  
-  
-  <programlisting>aView.addFsToIndexes(aFeatureStructure) 
-aView.removeFsFromIndexes(aFeatureStructure)</programlisting>
-  
-  <para>specify the view in which this Feature Structure should be added to or removed from the
-    indexes.</para>
-  
-  <section id="ugr.tug.mvs.cas_views_and_sofas">
-    <title>CAS Views and Sofas</title>
-    
-    <para>Sofas (see <olink targetdoc="&uima_docs_tutorial_guides;"
-        targetptr="ugr.tug.aas.sofa"/>) and CAS Views are linked. In this implementation,
-      every CAS view has one associated Sofa, and every Sofa has one associated CAS
-      View.</para>
-    
-    <section id="ugr.tug.mvs.naming_views_sofas">
-      <title>Naming CAS Views and Sofas</title>
-      
-      <para>The developer assigns a name to the View / Sofa, which is a simple string
-        (following the rules for Java identifiers, usually without periods, but see special
-        exception below). These names are declared in the component XML metadata, and are
-        used during assembly and by the runtime to enable switching among multiple Views of
-        the CAS at the same time.</para>
-      <note><para>The name is called the Sofa name, for historical reasons, but it applies
-      equally to the View. In the rest of this chapter, we&apos;ll refer to it as the Sofa
-      name.</para></note>
-      
-      <para>Some applications contain components that expect a variable number of Sofas as
-        input or output. An example of a component that takes a variable number of input Sofas
-        could be one that takes several translations of a document and merges them, where each
-        translation was in a separate Sofa. </para>
-      
-      <para> You can specify a variable number of input or output sofa names, where each name
-        has the same base part, by writing the base part of the name (with no periods), followed
-        by a period character and an asterisk character (.*). These denote sofas that have
-        names matching the base part up to the period; for example, names such as
-        <literal>base_name_part.TTX_3d</literal> would match a specification of
-        <literal>base_name_part.*</literal>.</para>
-      
-    </section>
-    
-    <section id="ugr.tug.mvs.multi_view_and_single_view">
-      <title>Multi-View, Single-View components &amp; applications</title>
-      <titleabbrev>Multi/Single View parts in Applications</titleabbrev>
-      
-      <para>Components and applications can be written to be Multi-View or Single-View.
-        Most components used as primitive building blocks are expected to be Single-View.
-        UIMA provides capabilities to combine these kinds of components with Multi-View
-        components when assembling analysis aggregates or applications.</para>
-      
-      <para>Single-View components and applications use only one subject of analysis, and
-        one CAS View. The code and descriptors for these components do not use the facilities
-        described in this chapter.</para>
-      
-      <para>Conversely, Multi-View components and applications are aware of the
-        possibility of multiple Views and Sofas, and have code and XML descriptors that
-        create and manipulate them.</para>
-      
-    </section>
-  </section>
-  
-  <section id="ugr.tug.mvs.multi_view_components">
-    <title>Multi-View Components</title>
-    <section id="ugr.tug.mvs.deciding_multi_view">
-      <title>How UIMA decides if a component is Multi-View</title>
-      <titleabbrev>Deciding: Multi-View</titleabbrev>
-      
-      <para>Every UIMA component has an associated XML Component Descriptor. Multi-View
-        components are identified simply as those whose descriptors declare one or more Sofa
-        names in their Capability sections, as inputs or outputs. If a Component Descriptor
-        does not mention any input or output Sofa names, the framework treats that component
-        as a Single-View component.</para>
-      
-      <para>A Multi-View component is passed a special kind of a CAS object, called a base CAS,
-        which it must use to switch to the particular view it wishes to process. The base CAS
-        object itself has no Sofa and no ability to use Indexes; only the views have that
-        capability.</para>
-      
-    </section>
-    
-    <section id="ugr.tug.mvs.additional_capabilities">
-      <title>Multi-View: additional capabilities</title>
-      
-      <para>Additional capabilities provided for components and applications aware of the
-        possibilities of multiple Views and Sofas include:</para>
-      
-      <itemizedlist spacing="compact"><listitem><para>Creating new Views, and for
-        each, setting up the associated Sofa data</para></listitem>
-        
-        <listitem><para>Getting a reference to an existing View and its associated Sofa, by
-          name </para></listitem>
-        
-        <listitem><para>Specifying a view in which to index a particular Feature Structure
-          instance </para></listitem></itemizedlist>
-      
-    </section>
-    
-    <section id="ugr.tug.mvs.component_xml_metadata">
-      <title>Component XML metadata</title>
-      
-      <para>Each Multi-View component that creates a Sofa or wants to switch to a specific
-        previously created Sofa must declare the name for the Sofa in the capabilities
-        section. For example, a component expecting as input a web document in html format and
-        creating a plain text document for further processing might declare:</para>
-      
-      
-      <programlisting>&lt;capabilities&gt;
-  &lt;capability&gt;
-    &lt;inputs/&gt;
-    &lt;outputs/&gt;
-    &lt;inputSofas&gt;
-<emphasis role="bold">      &lt;sofaName&gt;rawContent&lt;/sofaName&gt;</emphasis>
-    &lt;/inputSofas&gt;
-    &lt;outputSofas&gt;
-<emphasis role="bold">      &lt;sofaName&gt;detagContent&lt;/sofaName&gt;</emphasis>
-    &lt;/outputSofas&gt;
-  &lt;/capability&gt;
-&lt;/capabilities&gt;</programlisting>
-      
-      <para>Details on this specification are found in <olink
-          targetdoc="&uima_docs_ref;"
-          targetptr="ugr.ref.xml.component_descriptor"/>. The Component Descriptor
-        Editor supports Sofa declarations on the <olink targetdoc="&uima_docs_tools;"
-          targetptr="ugr.tools.cde.capabilities"/>.</para>
-      
-    </section>
-  </section>
-  
-  <section id="ugr.tug.mvs.sofa_capabilities_and_apis_for_apps">
-    <title>Sofa Capabilities and APIs for Applications</title>
-    <titleabbrev>Sofa Capabilities &amp; APIs for Apps</titleabbrev>
-    
-    <para>In addition to components, applications can make use of these capabilities. When
-      an application creates a new CAS, it also creates the initial view of that CAS - and this
-      view is the object that is returned from the create call. Additional views beyond this
-      first one can be dynamically created at any time. The application can use the Sofa APIs
-      described in <olink targetdoc="&uima_docs_tutorial_guides;"
-        targetptr="ugr.tug.aas"/> to specify the data to be analyzed.</para>
-    
-    <para>If an Application creates a new CAS, the initial CAS that is created will be a view
-      named <quote>_InitialView</quote>. This name can be used in the application and in
-      Sofa Mapping (see the next section) to refer to this otherwise unnamed view.</para>
-    
-  </section>
-  
-  <section id="ugr.tug.mvs.sofa_name_mapping">
-    <title>Sofa Name Mapping</title>
-    
-    <para>Sofa Name mapping is the mechanism which enables UIMA component developers to
-      choose locally meaningful Sofa names in their source code and let aggregate developers,
-      collection processing engine developers, and application developers connect output
-      Sofas created in one component to input Sofas required in another.</para>
-    
-    <para>At a given aggregation level, the assembler or application developer defines
-      names for all the Sofas, and then specifies how these names map to the contained
-      components, using the Sofa Map.</para>
-    
-    <para>Consider annotator code to create a new CAS view:</para>
-    
-    
-    <programlisting>CAS viewX = cas.createView("X");</programlisting>
-    
-    <para>Or code to get an existing CAS view:</para>
-    
-    <programlisting>CAS viewX = cas.getView("X");</programlisting>
-    
-    <para>Without Sofa name mapping the SofaID for the new Sofa will be <quote>X</quote>.
-      However, if a name mapping for <quote>X</quote> has been specified by the aggregate or
-      CPE calling this annotator, the actual SofaID in the CAS can be different.</para>
-    
-    <para>All Sofas in a CAS must have unique names. This is accomplished by mapping all
-      declared Sofas as described in the following sections. An attempt to create a Sofa with a
-      SofaID already in use will throw an exception.</para>
-    
-    <para>Sofa name mapping must not use the <quote>.</quote> (period) character. Runtime Sofa
-      mapping maps names up to the <quote>.</quote> and appends the period and the following
-      characters to the mapped name.</para>
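The period rule above can be illustrated with a minimal, framework-independent sketch. It is not the UIMA implementation: the class name, the helper `mapSofaName`, and the example names are hypothetical, and splitting at the first period is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class SofaNameMappingSketch {

    /**
     * Maps a component-local Sofa name to its mapped name.
     * Only the part before the first period is looked up (assumption);
     * the period and everything after it are appended unchanged.
     */
    static String mapSofaName(String localName, Map<String, String> mappings) {
        int dot = localName.indexOf('.');
        String base = (dot < 0) ? localName : localName.substring(0, dot);
        String suffix = (dot < 0) ? "" : localName.substring(dot);
        // Names without a mapping pass through unchanged.
        String mappedBase = mappings.getOrDefault(base, base);
        return mappedBase + suffix;
    }

    public static void main(String[] args) {
        Map<String, String> mappings = new HashMap<>();
        mappings.put("X", "GlobalX"); // hypothetical mapping for local name "X"

        System.out.println(mapSofaName("X", mappings));       // GlobalX
        System.out.println(mapSofaName("X.part1", mappings)); // GlobalX.part1
        System.out.println(mapSofaName("Y", mappings));       // Y
    }
}
```

Because the suffix is carried over unchanged, a mapping entry whose key itself contained a period could never be looked up, which is consistent with the restriction stated above.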
-    
-    <para>To get a Java Iterator for all the views in a CAS:</para>
-    
-    <programlisting>Iterator allViews = cas.getViewIterator();</programlisting>
-    
-    <para>To get a Java Iterator for selected views in a CAS, for example, views whose name 
-      is either exactly equal to namePrefix or is of the form namePrefix.suffix, where suffix 
-      can be any String:</para>
-    
-    <programlisting>Iterator someViews = cas.getViewIterator(String namePrefix);</programlisting>
-
-      <note><para>Sofa name mapping is applied to namePrefix.</para></note>
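The selection rule for the prefix form of the view iterator can be sketched in plain Java. This is an illustration of the matching rule only, not framework code: the class, the helpers `matchesPrefix` and `selectViews`, and the view names are hypothetical, and Sofa name mapping of the prefix is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ViewPrefixSketch {

    /** True if viewName equals namePrefix or has the form namePrefix.suffix. */
    static boolean matchesPrefix(String viewName, String namePrefix) {
        return viewName.equals(namePrefix) || viewName.startsWith(namePrefix + ".");
    }

    /** Returns the names that the prefix rule would select, in input order. */
    static List<String> selectViews(List<String> viewNames, String namePrefix) {
        List<String> selected = new ArrayList<>();
        for (String name : viewNames) {
            if (matchesPrefix(name, namePrefix)) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Hypothetical view names, for illustration only.
        List<String> names = Arrays.asList(
            "Translation", "Translation.de", "Translation.fr", "TranslationNotes");
        System.out.println(selectViews(names, "Translation"));
        // [Translation, Translation.de, Translation.fr]
    }
}
```

Note that a name such as `TranslationNotes` is not selected: the prefix must be followed by a period, not merely be a leading substring.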
-    
-    <para>Sofa name mappings are not currently supported for remote Analysis Engines.
-      See <xref linkend="ugr.tug.mvs.name_mapping_remote_services"/>.</para>
-               
-    <section id="ugr.tug.mvs.name_mapping_aggregate">
-      <title>Name Mapping in an Aggregate Descriptor</title>
-      
-      <para>For each component of an Aggregate, name mapping specifies the conversion
-        between component Sofa names and names at the aggregate level.</para>
-      
-      <para>Here&apos;s an example. Consider two Multi-View annotators to be assembled
-        into an aggregate which takes an audio segment consisting of spoken English and
-        produces a German text translation.</para>
-      
-      <para>The first annotator takes an audio segment as input Sofa and produces a text
-        transcript as output Sofa. The annotator designer might choose these Sofa names to be
-        <quote>AudioInput</quote> and <quote>TranscribedText</quote>.</para>
-      
-      <para>The second annotator is designed to translate text from English to German. This
-        developer might choose the input and output Sofa names to be
-        <quote>EnglishDocument</quote> and <quote>GermanDocument</quote>,
-        respectively.</para>
-      
-      <para>In order to hook these two annotators together, the following section would be
-        added to the top level of the aggregate descriptor:</para>
-      
-      
-      <programlisting><![CDATA[<sofaMappings>
-  <sofaMapping>
-    <componentKey>SpeechToText</componentKey>
-    <componentSofaName>AudioInput</componentSofaName>
-    <aggregateSofaName>SegementedAudio</aggregateSofaName>
-  </sofaMapping>
-  <sofaMapping>
-    <componentKey>SpeechToText</componentKey>
-    <componentSofaName>TranscribedText</componentSofaName>
-    <aggregateSofaName>EnglishTranscript</aggregateSofaName>
-  </sofaMapping>
-  <sofaMapping>
-    <componentKey>EnglishToGermanTranslator</componentKey>
-    <componentSofaName>EnglishDocument</componentSofaName>
-    <aggregateSofaName>EnglishTranscript</aggregateSofaName>
-  </sofaMapping>
-  <sofaMapping>
-    <componentKey>EnglishToGermanTranslator</componentKey>
-    <componentSofaName>GermanDocument</componentSofaName>
-    <aggregateSofaName>GermanTranslation</aggregateSofaName>
-  </sofaMapping>
-</sofaMappings>]]></programlisting>
-      
-      <para>The Component Descriptor Editor supports Sofa name mapping in aggregates and
-        simplifies the task. See <olink targetdoc="&uima_docs_tools;"
-          targetptr="ugr.tools.cde.capabilities.sofa_name_mapping"/> for details.</para> 
-    </section>
-    
-    <section id="ugr.tug.mvs.name_mapping_cpe"><title>Name Mapping in a CPE
-      Descriptor</title>
-      
-      <para>The CPE descriptor aggregates together a Collection Reader and CAS Processors
-        (Annotators and CAS Consumers). Sofa mappings can be added to the following elements
-        of CPE descriptors: <literal>&lt;collectionIterator&gt;</literal>,
-        <literal>&lt;casInitializer&gt;</literal> and the
-        <literal>&lt;casProcessor&gt;</literal>. To be consistent with the
-        organization of CPE descriptors, the maps for the CPE descriptor are distributed
-        among the XML markup for each of the parts (collectionIterator, casInitializer,
-        casProcessor). Because of this, the
-        <literal>&lt;componentKey&gt;</literal> element is not needed. Finally, rather than
-        sub-elements for the parts, the XML markup for these uses attributes. See <olink
-          targetdoc="&uima_docs_ref;"
-          targetptr="ugr.ref.xml.cpe_descriptor.descriptor.cas_processors.individual.sofa_name_mappings"/>.</para>
-      
-      <para>Here&apos;s an example. Let&apos;s use the aggregate from the previous section
-        in a collection processing engine. Here we will add a Collection Reader that outputs
-        audio segments in an output Sofa named <quote>nextSegment</quote>. Remember to
-        declare an output Sofa nextSegment in the collection reader description.
-        We&apos;ll add a CAS Consumer in the next section.</para>
-      
-      
-      <programlisting>&lt;collectionReader&gt;
-  &lt;collectionIterator&gt;
-    &lt;descriptor&gt;
-    . . .
-    &lt;/descriptor&gt;
-    &lt;configurationParameterSettings&gt;...&lt;/configurationParameterSettings&gt;
-<emphasis role="bold">    &lt;sofaNameMappings&gt;
-      &lt;sofaNameMapping componentSofaName="nextSegment"
-                       cpeSofaName="SegementedAudio"/&gt;
-      &lt;/sofaNameMappings&gt;
-</emphasis>  &lt;/collectionIterator&gt;
-  &lt;casInitializer/&gt;
-&lt;/collectionReader&gt;</programlisting>
-      
-      <para>At this point the CAS Processor section for the aggregate does not need any Sofa
-        mapping because the aggregate input Sofa has the same name,
-        <quote>SegementedAudio</quote>, as is being produced by the Collection
-        Reader.</para>
-      
-    </section>
-    
-    <section id="ugr.tug.mvs.specifying_cas_view_for_single_view">
-      <title>Specifying the CAS View for a Single-View Component</title>
-      <titleabbrev>CAS View for Single-View Parts</titleabbrev>
-      
-      <para>Single-View components receive a Sofa named <quote>_InitialView</quote>, or
-        a Sofa that is mapped to this name.</para>
-      
-      <para>For example, assume that the CAS Consumer to be used in our CPE is a Single-View
-        component that expects the analysis results associated with the input CAS, and that
-        we want it to use the results from the translated German text Sofa. The following
-        mapping added to the CAS Processor section for the CPE will instruct the CPE to get the
-        CAS view for the German text Sofa and pass it to the CAS Consumer:</para>
-      
-      
-      <programlisting>&lt;casProcessor&gt;
-  . . .
-  <emphasis role="bold">&lt;sofaNameMappings&gt;
-    &lt;sofaNameMapping componentSofaName="_InitialView"
-                           cpeSofaName="GermanTranslation"/&gt;
-  &lt;/sofaNameMappings&gt;
-</emphasis>&lt;/casProcessor&gt;</programlisting>
-      
-      <para id="ugr.tug.mvs.sofa_mapping_leav_out_name">An alternative syntax for
-        this kind of mapping is to simply leave out the component sofa name in this
-        case.</para>
-      
-    </section>
-    
-    <section id="ugr.tug.mvs.name_mapping_application">
-      <title>Name Mapping in a UIMA Application</title>
-      
-      <para>Applications which instantiate UIMA components directly using the
-        UIMAFramework methods can also create a top level Sofa mapping using the
-        <quote>additional parameters</quote> capability.</para>
-      
-      
-      <programlisting>//create a "root" UIMA context for your whole application
-
-UimaContextAdmin rootContext =
-   UIMAFramework.newUimaContext(UIMAFramework.getLogger(),
-      UIMAFramework.newDefaultResourceManager(),
-      UIMAFramework.newConfigurationManager());
-
-XMLInputSource input = new XMLInputSource("test.xml");
-AnalysisEngineDescription desc =
-    UIMAFramework.getXMLParser().parseAnalysisEngineDescription(input);
-
-//setup sofa name mappings using the api
-
-HashMap sofamappings = new HashMap();
-sofamappings.put("localName1", "globalName1");
-sofamappings.put("localName2", "globalName2");
-  
-//create a UIMA Context for the new AE we are about to create
-
-//first argument is unique key among all AEs used in the application
-UimaContextAdmin childContext = rootContext.createChild("myAE", sofamappings);
-
-//instantiate AE, passing the UIMA Context through the additional
-//parameters map
-
-Map additionalParams = new HashMap();
-additionalParams.put(Resource.PARAM_UIMA_CONTEXT, childContext);
-
-AnalysisEngine ae = 
-        UIMAFramework.produceAnalysisEngine(desc,additionalParams);</programlisting>
-      
-      <para>Sofa mappings are applied from the inside out, i.e., local to global. First, any
-        aggregate mappings are applied, then any CPE mappings, and finally, any specified
-        using this <quote>additional parameters</quote> capability.</para>
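The inside-out order can be pictured as a chain of lookups, innermost level first. The sketch below is a simplified model rather than framework code: the class, the helper `resolve`, and the CPE-level and application-level names are hypothetical, while `EnglishDocument` and `EnglishTranscript` come from the aggregate example earlier in this chapter.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class MappingOrderSketch {

    /**
     * Resolves a component-local Sofa name by applying each mapping level in
     * turn, innermost first. A level without an entry leaves the name as-is.
     */
    static String resolve(String localName, List<Map<String, String>> levelsInnermostFirst) {
        String name = localName;
        for (Map<String, String> level : levelsInnermostFirst) {
            name = level.getOrDefault(name, name);
        }
        return name;
    }

    public static void main(String[] args) {
        // Aggregate level: taken from the aggregate descriptor example.
        Map<String, String> aggregate =
            Collections.singletonMap("EnglishDocument", "EnglishTranscript");
        // CPE and application levels: hypothetical names for illustration.
        Map<String, String> cpe =
            Collections.singletonMap("EnglishTranscript", "CpeTranscript");
        Map<String, String> application =
            Collections.singletonMap("CpeTranscript", "AppTranscript");

        String resolved =
            resolve("EnglishDocument", Arrays.asList(aggregate, cpe, application));
        System.out.println(resolved); // AppTranscript
    }
}
```

Each level only sees the name produced by the level inside it, which is why a component can keep its locally chosen names regardless of how it is assembled.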
-      
-    </section>
-    
-    <section id="ugr.tug.mvs.name_mapping_remote_services">
-      <title>Name Mapping for Remote Services</title>
-      
-      <para>Currently, no client-side Sofa mapping information is passed from a UIMA client
-        to a remote service. This can cause complications for UIMA services in a Multi-View
-        application.</para>
-      
-      <para>Remote services in a Multi-View application will work only if the service is Single-View, or if the 
-        Sofa names expected by the service exactly match the Sofa names produced by the client.</para>
-      
-      <para>If your application requires Sofa mappings for a remote Analysis Engine, you
-        can wrap your remotely deployed AE in an aggregate (on the remote side), and specify
-        the necessary Sofa mappings in the descriptor for that aggregate.</para>
-    </section>
-  </section>
-  
-  <section id="ugr.tug.mvs.jcas_extensions_for_multi_views">
-    <title>JCas extensions for Multiple Views</title>
-    
-    <para>The JCas interface to the CAS can be used with any / all views, as well as the base CAS
-      sent to Multi-View components. You can always get a JCas object from an existing CAS
-      object by using the method getJCas(); this call will create the JCas if it doesn&apos;t
-      already exist. If it does exist, it just returns the existing JCas that corresponds to
-      the CAS.</para>
-    
-    <para>JCas implements the getView(...) method, enabling switching to other named
-      views, just like the corresponding method on the CAS. The JCas version, however,
-      returns JCas objects, instead of CAS objects, corresponding to the view.</para>
-  </section>
-  
-  <section id="ugr.tug.mvs.sample_application">
-    <title>Sample Multi-View Application</title>
-    
-    <para>The UIMA SDK contains a simple Sofa example application which demonstrates many
-      Sofa specific concepts and methods. The source code for the application driver is in
-      <literal>examples/src/org/apache/uima/examples/SofaExampleApplication.java</literal>
-      and the Multi-View annotator is given in
-      <literal>SofaExampleAnnotator.java</literal> in the same directory.</para>
-    
-    <para>This sample application demonstrates a language translator annotator which
-      expects an input text Sofa with an English document and creates an output text Sofa
-      containing a German translation. Some of the key Sofa concepts illustrated here
-      include:</para>
-    
-    <itemizedlist spacing="compact"><listitem><para>Sofa creation.</para>
-      </listitem>
-      
-      <listitem><para>Access of multiple CAS views.</para></listitem>
-      
-      <listitem><para>Unique feature structure index space for each view.</para>
-        </listitem>
-      
-      <listitem><para>Feature structures containing cross references between
-        annotations in different CAS views.</para></listitem>
-      
-      <listitem><para>The strong affinity of annotations with a specific Sofa. </para>
-        </listitem></itemizedlist>
-    
-    <section id="ugr.tug.mvs.sample_application.descriptor">
-      <title>Annotator Descriptor</title>
-      
-      <para>The annotator descriptor in
-        <literal>examples/descriptors/analysis_engine/SofaExampleAnnotator.xml</literal>
-        declares an input Sofa named <quote>EnglishDocument</quote> and an output Sofa
-        named <quote>GermanDocument</quote>. A custom type
-        <quote>CrossAnnotation</quote> is also defined:</para>
-      
-      
-      <programlisting><![CDATA[<typeDescription>
-  <name>sofa.test.CrossAnnotation</name>
-  <description/>
-  <supertypeName>uima.tcas.Annotation</supertypeName>
-  <features>
-    <featureDescription>
-      <name>otherAnnotation</name>
-      <description/>
-      <rangeTypeName>uima.tcas.Annotation</rangeTypeName>
-    </featureDescription>
-  </features>
-</typeDescription>]]></programlisting>
-      
-      <para>The <literal>CrossAnnotation</literal> type is derived from
-        <literal>uima.tcas.Annotation</literal> and includes one new feature: a
-        reference to another annotation.</para>
-      
-    </section>
-    
-    <section id="ugr.tug.mvs.sample_application.setup">
-      <title>Application Setup</title>
-      
-      <para>The application driver instantiates an analysis engine,
-        <literal>seAnnotator</literal>, from the annotator descriptor, obtains a new
-        base CAS using that engine&apos;s CAS definition, and creates the expected input
-        Sofa using:</para>
-      
-      
-      <programlisting>CAS cas = seAnnotator.newCAS();
-CAS aView = cas.createView("EnglishDocument");</programlisting>
-      
-      <para>Since <literal>seAnnotator</literal> is a primitive component, and no Sofa
-        mapping has been defined, the SofaID will be <quote>EnglishDocument</quote>.
-        Local Sofa data is set using:</para>
-      
-      
-      <programlisting>aView.setDocumentText("this beer is good");</programlisting>
-      
-      <para>At this point the CAS contains all necessary inputs for the translation
-        annotator and its process method is called.</para>
-      
-    </section>
-    
-    <section id="ugr.tug.mvs.sample_application.annotator_processing">
-      <title>Annotator Processing</title>
-      
-      <para>Annotator processing consists of parsing the English document into individual
-        words, doing word-by-word translation and concatenating the translations into a
-        German translation. Analysis metadata on the English Sofa will be an annotation for
-        each English word. Analysis metadata on the German Sofa will be a
-        <literal>CrossAnnotation</literal> for each German word, where the
-        <literal>otherAnnotation</literal> feature will be a reference to the associated
-        English annotation.</para>
-      
-      <para>Code of interest includes two CAS views:</para>
-      
-      
-      <programlisting>// get View of the English text Sofa
-englishView = aCas.getView("EnglishDocument");
-
-// Create the output German text Sofa
-germanView = aCas.createView("GermanDocument");</programlisting>
-      
-      <para>the indexing of annotations with the appropriate view:</para>
-      
-      
-      <programlisting>englishView.addFsToIndexes(engAnnot);
-. . .
-germanView.addFsToIndexes(germAnnot);</programlisting>
-      
-      <para>and the combining of metadata belonging to different Sofas in the same feature
-        structure:</para>
-      
-      
-      <programlisting>// add link to English text
-germAnnot.setFeatureValue(other, engAnnot);</programlisting>
-      
-    </section>
-    
-    <section id="ugr.tug.mvs.sample_application.accessing_results">
-      <title>Accessing the results of analysis</title>
-      
-      <para>The application needs to get the results of analysis, which may be in different
-        views. Analysis results for each Sofa are dumped independently by iterating over all
-        annotations for each associated CAS view. For the English Sofa:</para>
-      
-      
-      <programlisting>//get annotation iterator for this CAS
-FSIndex anIndex = aView.getAnnotationIndex();
-FSIterator anIter = anIndex.iterator();
-while (anIter.isValid()) {
-  AnnotationFS annot = (AnnotationFS) anIter.get();
-  System.out.println(" " + annot.getType().getName()
-                         + ": " + annot.getCoveredText());
-  anIter.moveToNext();
-}</programlisting>
-      
-      <para>Iterating over all German annotations looks the same, except for the
-        following:</para>
-      
-      
-      <programlisting>if (annot.getType() == cross) {
-  AnnotationFS crossAnnot =
-          (AnnotationFS) annot.getFeatureValue(other);
-  System.out.println("   other annotation feature: "
-          + crossAnnot.getCoveredText());
-}</programlisting>
-      
-      <para>Of particular interest here is the built-in Annotation type method
-        <literal>getCoveredText()</literal>. This method uses the
-        <quote>begin</quote> and <quote>end</quote> features of the annotation to create
-        a substring from the CAS document. The SofaRef feature of the annotation is used to
-        identify the correct Sofa&apos;s data from which to create the substring.</para>
-      
-      <para>The example program output is:</para>
-      
-      
-      <programlisting>---Printing all annotations for English Sofa---
-uima.tcas.DocumentAnnotation: this beer is good
-uima.tcas.Annotation: this
-uima.tcas.Annotation: beer
-uima.tcas.Annotation: is
-uima.tcas.Annotation: good
-      
----Printing all annotations for German Sofa---
-uima.tcas.DocumentAnnotation: das bier ist gut
-sofa.test.CrossAnnotation: das
- other annotation feature: this
-sofa.test.CrossAnnotation: bier
- other annotation feature: beer
-sofa.test.CrossAnnotation: ist
- other annotation feature: is
-sofa.test.CrossAnnotation: gut
- other annotation feature: good</programlisting>
-      
-    </section>
-  </section>
-  
-  <section id="ugr.tug.mvs.views_api_summary">
-    <title>Views API Summary</title>
-    
-    <para>The recommended way to deliver a particular CAS view to a <emphasis role="bold-italic">Single-View</emphasis> component is to use Sofa mapping in
-      the CPE and/or aggregate descriptors.</para>
-    
-    <para>For <emphasis role="bold-italic">Multi-View </emphasis> components or
-      applications, the following methods are used to create or get a reference to a CAS view
-      for a particular Sofa:</para>
-    
-    <para>Creating a new View:</para>
-    
-    
-    <programlisting>JCas newView = aJCas.createView(String localNameOfTheViewBeforeMapping);
-CAS  newView = aCAS .createView(String localNameOfTheViewBeforeMapping);</programlisting>
-    
-    <para>Getting a View from a CAS or JCas:</para>
-    
-    
-    <programlisting><?db-font-size 80% ?>JCas myView = aJCas.getView(String localNameOfTheViewBeforeMapping);
-CAS  myView = aCAS .getView(String localNameOfTheViewBeforeMapping);
-Iterator allViews = aCasOrJCas.getViewIterator();
-Iterator someViews = aCasOrJCas.getViewIterator(String localViewNamePrefix);</programlisting>
-    
-    <para>The following methods are useful for all annotators and applications:</para>
-    
-    <para>Setting Sofa data for a CAS or JCas:</para>
-    
-    
-    <programlisting>aCasOrJCas.setDocumentText(String docText);
-aCasOrJCas.setSofaDataString(String docText, String mimeType);
-aCasOrJCas.setSofaDataArray(FeatureStructure array, String mimeType);
-aCasOrJCas.setSofaDataURI(String uri, String mimeType);</programlisting>
-    
-    <para>Getting Sofa data for a particular CAS or JCas:</para>
-    
-    
-    <programlisting>String doc = aCasOrJCas.getDocumentText();
-String doc = aCasOrJCas.getSofaDataString();
-FeatureStructure array = aCasOrJCas.getSofaDataArray();
-String uri = aCasOrJCas.getSofaDataURI();
-InputStream is = aCasOrJCas.getSofaDataStream();</programlisting>
-    
-  </section>
-  
-  <section id="ugr.tug.mvs.sofa_incompatibilities_v1_v2">
-    <title>Sofa Incompatibilities between UIMA version 1 and version 2</title>
-    <titleabbrev>Sofa Incompatibilities: V1 and V2</titleabbrev>
-    
-    <para>A major change in version 2 is related to the support of Single-View components
-      and applications. Given an analysis engine, <literal>ae</literal>, the API
-      
-      <programlisting>CAS cas = ae.newCAS();</programlisting>
-      used to return the base CAS. Now it returns a view of the Sofa named
-      <quote>_InitialView</quote>. This Sofa will actually only be created if any Sofa data
-      is set for this view. The initial view is used for Single-View applications and
-      Multi-View annotators with no Sofa mapping.</para>
-    
-    <para>The process method of Multi-View annotators receives the base CAS; however, the base
-      CAS no longer has an index repository to hold <quote>global</quote> data. Global data
-      needs to be put in a specific named CAS view of your choice.</para>
-    
-    <para>Because of these changes, the following scenarios will break with v2.0 clients:
-      
-      <itemizedlist spacing="compact"><listitem><para>Any version 1.x services (you
-        must migrate the services to version 2).</para></listitem>
-        
-        <listitem><para>Applications or components explicitly referencing
-          <quote>_DefaultTextSofaName</quote> in code or descriptors.</para>
-          </listitem>
-        
-        <listitem><para>Multi-View applications using the Base CAS index repository.
-          </para></listitem></itemizedlist></para>
-  </section>
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"[
+<!ENTITY % uimaents SYSTEM "../entities.ent">  
+%uimaents;
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<chapter id="ugr.tug.mvs">
+  <title>Multiple CAS Views of an Artifact</title>
+  <titleabbrev>Multiple CAS Views</titleabbrev>
+  
+  <para>UIMA provides an extension to the basic model of the CAS which supports analysis of
+    multiple views of the same artifact, all contained within the CAS. This chapter describes
+    the concepts, terminology, and the API and XML extensions that enable this.</para>
+  
+  <para>Multiple CAS Views can simplify things when different versions of the artifact are
+    needed at different stages of the analysis. They are also key to enabling multimodal
+    analysis where the initial artifact is transformed from one modality to another, or where
+    the artifact itself is multimodal, such as the audio, video and closed-captioned text
+    associated with an MPEG object. Each representation of the artifact can be analyzed
+    independently with the standard UIMA programming model; in addition, multi-view
+    components and applications can be constructed.</para>
+  
+  <para>UIMA supports this by augmenting the CAS with additional light-weight CAS objects,
+    one for each view, where these objects share most of the same underlying CAS, except for two
+    things: each view has its own set of indexed Feature Structures, and each view has its own
+    subject of analysis (Sofa) - its own version of the artifact being analyzed. The Feature
+    Structure instances themselves are in the shared part of the CAS; only the entries in the
+    indexes are unique for each CAS view.</para>
+  
+  <para>All of these CAS view objects are kept together with the CAS, and passed as a unit
+    between components in a UIMA application. APIs exist which allow components and
+    applications to switch among the various view objects, as needed.</para>
+  
+  <para>Feature Structures may be indexed in multiple views, if necessary. New methods on CAS
+    Views facilitate adding or removing Feature Structures to or from their index
+    repositories:</para>
+  
+  
+  <programlisting>aView.addFsToIndexes(aFeatureStructure) 
+aView.removeFsFromIndexes(aFeatureStructure)</programlisting>
+  
+  <para>The view on which the method is called determines the view whose indexes the
+    Feature Structure is added to or removed from.</para>
+  
+  <section id="ugr.tug.mvs.cas_views_and_sofas">
+    <title>CAS Views and Sofas</title>
+    
+    <para>Sofas (see <olink targetdoc="&uima_docs_tutorial_guides;"
+        targetptr="ugr.tug.aas.sofa"/>) and CAS Views are linked. In this implementation,
+      every CAS view has one associated Sofa, and every Sofa has one associated CAS
+      View.</para>
+    
+    <section id="ugr.tug.mvs.naming_views_sofas">
+      <title>Naming CAS Views and Sofas</title>
+      
+      <para>The developer assigns a name to the View / Sofa, which is a simple string
+        (following the rules for Java identifiers, usually without periods, but see special
+        exception below). These names are declared in the component XML metadata, and are
+        used during assembly and by the runtime to enable switching among multiple Views of
+        the CAS at the same time.</para>
+      <note><para>The name is called the Sofa name, for historical reasons, but it applies
+      equally to the View. In the rest of this chapter, we&apos;ll refer to it as the Sofa
+      name.</para></note>
+      
+      <para>Some applications contain components that expect a variable number of Sofas as
+        input or output. An example of a component that takes a variable number of input Sofas
+        could be one that takes several translations of a document and merges them, where each
+        translation is in a separate Sofa.</para>
+      
+      <para>You can specify a variable number of input or output sofa names, where each name
+        has the same base part, by writing the base part of the name (with no periods), followed
+        by a period character and an asterisk character (.*). These denote sofas that have
+        names matching the base part up to the period; for example, names such as
+        <literal>base_name_part.TTX_3d</literal> would match a specification of
+        <literal>base_name_part.*</literal>.</para>
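+      
+      <para>As an illustrative sketch only (the base name <literal>translation</literal> is a
+        made-up example, not a name defined by UIMA), a merging component of the kind described
+        above might declare its variable input Sofas in its capabilities like this:</para>
+      
+      <programlisting>&lt;capabilities&gt;
+  &lt;capability&gt;
+    &lt;inputs/&gt;
+    &lt;outputs/&gt;
+    &lt;inputSofas&gt;
+      &lt;!-- "translation" is a hypothetical base name --&gt;
+      &lt;sofaName&gt;translation.*&lt;/sofaName&gt;
+    &lt;/inputSofas&gt;
+  &lt;/capability&gt;
+&lt;/capabilities&gt;</programlisting>
+      
+      <para>Under this declaration, Sofas with names such as
+        <literal>translation.fr</literal> or <literal>translation.es</literal> would
+        match.</para>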
+      
+    </section>
+    
+    <section id="ugr.tug.mvs.multi_view_and_single_view">
+      <title>Multi-View, Single-View components &amp; applications</title>
+      <titleabbrev>Multi/Single View parts in Applications</titleabbrev>
+      
+      <para>Components and applications can be written to be Multi-View or Single-View.
+        Most components used as primitive building blocks are expected to be Single-View.
+        UIMA provides capabilities to combine these kinds of components with Multi-View
+        components when assembling analysis aggregates or applications.</para>
+      
+      <para>Single-View components and applications use only one subject of analysis, and
+        one CAS View. The code and descriptors for these components do not use the facilities
+        described in this chapter.</para>
+      
+      <para>Conversely, Multi-View components and applications are aware of the
+        possibility of multiple Views and Sofas, and have code and XML descriptors that
+        create and manipulate them.</para>
+      
+    </section>
+  </section>
+  
+  <section id="ugr.tug.mvs.multi_view_components">
+    <title>Multi-View Components</title>
+    <section id="ugr.tug.mvs.deciding_multi_view">
+      <title>How UIMA decides if a component is Multi-View</title>
+      <titleabbrev>Deciding: Multi-View</titleabbrev>
+      
+      <para>Every UIMA component has an associated XML Component Descriptor. Multi-View
+        components are identified simply as those whose descriptors declare one or more Sofa
+        names in their Capability sections, as inputs or outputs. If a Component Descriptor
+        does not mention any input or output Sofa names, the framework treats that component
+        as a Single-View component.</para>
+      
+      <para>A Multi-View component is passed a special kind of CAS object, called a base CAS,
+        which it must use to switch to the particular view it wishes to process. The base CAS
+        object itself has no Sofa and no ability to use Indexes; only the views have that
+        capability.</para>
+      
+    </section>
+    
+    <section id="ugr.tug.mvs.additional_capabilities">
+      <title>Multi-View: additional capabilities</title>
+      
+      <para>Additional capabilities provided for components and applications aware of the
+        possibilities of multiple Views and Sofas include:</para>
+      
+      <itemizedlist spacing="compact"><listitem><para>Creating new Views, and for
+        each, setting up the associated Sofa data</para></listitem>
+        
+        <listitem><para>Getting a reference to an existing View and its associated Sofa, by
+          name </para></listitem>
+        
+        <listitem><para>Specifying a view in which to index a particular Feature Structure
+          instance </para></listitem></itemizedlist>
+      
+    </section>
+    
+    <section id="ugr.tug.mvs.component_xml_metadata">
+      <title>Component XML metadata</title>
+      
+      <para>Each Multi-View component that creates a Sofa or wants to switch to a specific
+        previously created Sofa must declare the name for the Sofa in the capabilities
+        section. For example, a component expecting as input a web document in html format and
+        creating a plain text document for further processing might declare:</para>
+      
+      
+      <programlisting>&lt;capabilities&gt;
+  &lt;capability&gt;
+    &lt;inputs/&gt;
+    &lt;outputs/&gt;
+    &lt;inputSofas&gt;
+<emphasis role="bold">      &lt;sofaName&gt;rawContent&lt;/sofaName&gt;</emphasis>
+    &lt;/inputSofas&gt;
+    &lt;outputSofas&gt;
+<emphasis role="bold">      &lt;sofaName&gt;detagContent&lt;/sofaName&gt;</emphasis>
+    &lt;/outputSofas&gt;
+  &lt;/capability&gt;
+&lt;/capabilities&gt;</programlisting>
+      
+      <para>Details on this specification are found in <olink
+          targetdoc="&uima_docs_ref;"
+          targetptr="ugr.ref.xml.component_descriptor"/>. The Component Descriptor
+        Editor supports Sofa declarations on the <olink targetdoc="&uima_docs_tools;"
+          targetptr="ugr.tools.cde.capabilities"/>.</para>
+      
+    </section>
+  </section>
+  
+  <section id="ugr.tug.mvs.sofa_capabilities_and_apis_for_apps">
+    <title>Sofa Capabilities and APIs for Applications</title>
+    <titleabbrev>Sofa Capabilities &amp; APIs for Apps</titleabbrev>
+    
+    <para>In addition to components, applications can make use of these capabilities. When
+      an application creates a new CAS, it also creates the initial view of that CAS - and this
+      view is the object that is returned from the create call. Additional views beyond this
+      first one can be dynamically created at any time. The application can use the Sofa APIs
+      described in <olink targetdoc="&uima_docs_tutorial_guides;"
+        targetptr="ugr.tug.aas"/> to specify the data to be analyzed.</para>
+    
+    <para>If an Application creates a new CAS, the initial CAS that is created will be a view
+      named <quote>_InitialView</quote>. This name can be used in the application and in
+      Sofa Mapping (see the next section) to refer to this otherwise unnamed view.</para>
+    
+  </section>
+  
+  <section id="ugr.tug.mvs.sofa_name_mapping">
+    <title>Sofa Name Mapping</title>
+    
+    <para>Sofa name mapping is the mechanism that enables UIMA component developers to
+      choose locally meaningful Sofa names in their source code, and lets aggregate
+      developers, collection processing engine developers, and application developers
+      connect output Sofas created in one component to input Sofas required in another.</para>
+    
+    <para>At a given aggregation level, the assembler or application developer defines
+      names for all the Sofas, and then specifies how these names map to the contained
+      components, using the Sofa Map.</para>
+    
+    <para>Consider annotator code to create a new CAS view:</para>
+    
+    
+    <programlisting>CAS viewX = cas.createView("X");</programlisting>
+    
+    <para>Or code to get an existing CAS view:</para>
+    
+    <programlisting>CAS viewX = cas.getView("X");</programlisting>
+    
+    <para>Without Sofa name mapping the SofaID for the new Sofa will be <quote>X</quote>.
+      However, if a name mapping for <quote>X</quote> has been specified by the aggregate or
+      CPE calling this annotator, the actual SofaID in the CAS can be different.</para>
+    
+    <para>All Sofas in a CAS must have unique names. This is accomplished by mapping all
+      declared Sofas as described in the following sections. An attempt to create a Sofa with a
+      SofaID already in use will throw an exception.</para>
+    
+    <para>Sofa name mapping must not use the <quote>.</quote> (period) character. Runtime Sofa
+      mapping maps names up to the <quote>.</quote> and appends the period and the following
+      characters to the mapped name.</para>
+    
+    <para>To get a Java Iterator for all the views in a CAS:</para>
+    
+    <programlisting>Iterator allViews = cas.getViewIterator();</programlisting>
+    
+    <para>To get a Java Iterator for selected views in a CAS, for example, views whose name 
+      is either exactly equal to namePrefix or is of the form namePrefix.suffix, where suffix 
+      can be any String:</para>
+    
+    <programlisting>Iterator someViews = cas.getViewIterator(String namePrefix);</programlisting>
+
+      <note><para>Sofa name mapping is applied to namePrefix.</para></note>
+    
+    <para>Sofa name mappings are not currently supported for remote Analysis Engines.
+      See <xref linkend="ugr.tug.mvs.name_mapping_remote_services"/>.</para>
+               
+    <section id="ugr.tug.mvs.name_mapping_aggregate">
+      <title>Name Mapping in an Aggregate Descriptor</title>
+      
+      <para>For each component of an Aggregate, name mapping specifies the conversion
+        between component Sofa names and names at the aggregate level.</para>
+      
+      <para>Here&apos;s an example. Consider two Multi-View annotators to be assembled
+        into an aggregate which takes an audio segment consisting of spoken English and
+        produces a German text translation.</para>
+      
+      <para>The first annotator takes an audio segment as input Sofa and produces a text
+        transcript as output Sofa. The annotator designer might choose these Sofa names to be
+        <quote>AudioInput</quote> and <quote>TranscribedText</quote>.</para>
+      
+      <para>The second annotator is designed to translate text from English to German. This
+        developer might choose the input and output Sofa names to be
+        <quote>EnglishDocument</quote> and <quote>GermanDocument</quote>,
+        respectively.</para>
+      
+      <para>In order to hook these two annotators together, the following section would be
+        added to the top level of the aggregate descriptor:</para>
+      
+      
+      <programlisting><![CDATA[<sofaMappings>
+  <sofaMapping>
+    <componentKey>SpeechToText</componentKey>
+    <componentSofaName>AudioInput</componentSofaName>
+    <aggregateSofaName>SegementedAudio</aggregateSofaName>
+  </sofaMapping>
+  <sofaMapping>
+    <componentKey>SpeechToText</componentKey>
+    <componentSofaName>TranscribedText</componentSofaName>
+    <aggregateSofaName>EnglishTranscript</aggregateSofaName>
+  </sofaMapping>
+  <sofaMapping>
+    <componentKey>EnglishToGermanTranslator</componentKey>
+    <componentSofaName>EnglishDocument</componentSofaName>
+    <aggregateSofaName>EnglishTranscript</aggregateSofaName>
+  </sofaMapping>
+  <sofaMapping>
+    <componentKey>EnglishToGermanTranslator</componentKey>
+    <componentSofaName>GermanDocument</componentSofaName>
+    <aggregateSofaName>GermanTranslation</aggregateSofaName>
+  </sofaMapping>
+</sofaMappings>]]></programlisting>
+      
+      <para>The Component Descriptor Editor supports Sofa name mapping in aggregates and
+        simplifies the task. See <olink targetdoc="&uima_docs_tools;"
+          targetptr="ugr.tools.cde.capabilities.sofa_name_mapping"/> for details.</para> 
+    </section>
+    
+    <section id="ugr.tug.mvs.name_mapping_cpe"><title>Name Mapping in a CPE
+      Descriptor</title>
+      
+      <para>The CPE descriptor aggregates together a Collection Reader and CAS Processors
+        (Annotators and CAS Consumers). Sofa mappings can be added to the following elements
+        of CPE descriptors: <literal>&lt;collectionIterator&gt;</literal>,
+        <literal>&lt;casInitializer&gt;</literal> and the
+        <literal>&lt;casProcessor&gt;</literal>. To be consistent with the
+        organization of CPE descriptors, the maps for the CPE descriptor are distributed
+        among the XML markup for each of the parts (collectionIterator, casInitializer,
+        casProcessor). Because of this, the
+        <literal>&lt;componentKey&gt;</literal> element is not needed. Finally, rather than
+        sub-elements for the parts, the XML markup for these uses attributes. See <olink
+          targetdoc="&uima_docs_ref;"
+          targetptr="ugr.ref.xml.cpe_descriptor.descriptor.cas_processors.individual.sofa_name_mappings"/>.</para>
+      
+      <para>Here&apos;s an example. Let&apos;s use the aggregate from the previous section
+        in a collection processing engine. Here we will add a Collection Reader that outputs
+        audio segments in an output Sofa named <quote>nextSegment</quote>. Remember to
+        declare an output Sofa nextSegment in the collection reader description.
+        We&apos;ll add a CAS Consumer in the next section.</para>
+      
+      
+      <programlisting>&lt;collectionReader&gt;
+  &lt;collectionIterator&gt;
+    &lt;descriptor&gt;
+    . . .
+    &lt;/descriptor&gt;
+    &lt;configurationParameterSettings&gt;...&lt;/configurationParameterSettings&gt;
+<emphasis role="bold">    &lt;sofaNameMappings&gt;
+      &lt;sofaNameMapping componentSofaName="nextSegment"
+                       cpeSofaName="SegementedAudio"/&gt;
+    &lt;/sofaNameMappings&gt;
+</emphasis>  &lt;/collectionIterator&gt;
+  &lt;casInitializer/&gt;
+&lt;/collectionReader&gt;</programlisting>
+      
+      <para>At this point the CAS Processor section for the aggregate does not need any Sofa
+        mapping because the aggregate input Sofa has the same name,
+        <quote>SegementedAudio</quote>, as is being produced by the Collection
+        Reader.</para>
+      
+    </section>
+    
+    <section id="ugr.tug.mvs.specifying_cas_view_for_single_view">
+      <title>Specifying the CAS View for a Single-View Component</title>
+      <titleabbrev>CAS View for Single-View Parts</titleabbrev>
+      
+      <para>Single-View components receive a Sofa named <quote>_InitialView</quote>, or
+        a Sofa that is mapped to this name.</para>
+      
+      <para>For example, assume that the CAS Consumer to be used in our CPE is a Single-View
+        component that expects the analysis results associated with the input CAS, and that
+        we want it to use the results from the translated German text Sofa. The following
+        mapping added to the CAS Processor section for the CPE will instruct the CPE to get the
+        CAS view for the German text Sofa and pass it to the CAS Consumer:</para>
+      
+      
+      <programlisting>&lt;casProcessor&gt;
+  . . .
+  <emphasis role="bold">&lt;sofaNameMappings&gt;
+    &lt;sofaNameMapping componentSofaName="_InitialView"
+                           cpeSofaName="GermanTranslation"/&gt;
+  &lt;/sofaNameMappings&gt;
+</emphasis>&lt;/casProcessor&gt;</programlisting>
+      
+      <para id="ugr.tug.mvs.sofa_mapping_leav_out_name">An alternative syntax for
+        this kind of mapping is to simply leave out the component sofa name in this
+        case.</para>
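+      
+      <para>As a sketch of that shorter form (assuming, as stated above, that an omitted
+        component Sofa name stands for <quote>_InitialView</quote>), the same mapping
+        might be written as:</para>
+      
+      <programlisting>&lt;casProcessor&gt;
+  . . .
+  &lt;sofaNameMappings&gt;
+    &lt;!-- no componentSofaName given: assumed to mean "_InitialView" --&gt;
+    &lt;sofaNameMapping cpeSofaName="GermanTranslation"/&gt;
+  &lt;/sofaNameMappings&gt;
+&lt;/casProcessor&gt;</programlisting>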
+      
+    </section>
+    
+    <section id="ugr.tug.mvs.name_mapping_application">
+      <title>Name Mapping in a UIMA Application</title>
+      
+      <para>Applications which instantiate UIMA components directly using the
+        UIMAFramework methods can also create a top level Sofa mapping using the
+        <quote>additional parameters</quote> capability.</para>
+      
+      
+      <programlisting>//create a "root" UIMA context for your whole application
+
+UimaContextAdmin rootContext =
+   UIMAFramework.newUimaContext(UIMAFramework.getLogger(),
+      UIMAFramework.newDefaultResourceManager(),
+      UIMAFramework.newConfigurationManager());
+
+XMLInputSource input = new XMLInputSource("test.xml");
+AnalysisEngineDescription desc =
+    UIMAFramework.getXMLParser().parseAnalysisEngineDescription(input);
+
+//setup sofa name mappings using the api
+
+HashMap sofamappings = new HashMap();
+sofamappings.put("localName1", "globalName1");
+sofamappings.put("localName2", "globalName2");
+  
+//create a UIMA Context for the new AE we are about to create
+
+//first argument is unique key among all AEs used in the application
+UimaContextAdmin childContext = rootContext.createChild("myAE", sofamappings);
+
+//instantiate AE, passing the UIMA Context through the additional
+//parameters map
+
+Map additionalParams = new HashMap();
+additionalParams.put(Resource.PARAM_UIMA_CONTEXT, childContext);
+
+AnalysisEngine ae = 
+        UIMAFramework.produceAnalysisEngine(desc,additionalParams);</programlisting>
+      
+      <para>Sofa mappings are applied from the inside out, i.e., local to global. First, any
+        aggregate mappings are applied, then any CPE mappings, and finally, any mappings
+        specified using this <quote>additional parameters</quote> capability.</para>
+      
+    </section>
+    
+    <section id="ugr.tug.mvs.name_mapping_remote_services">
+      <title>Name Mapping for Remote Services</title>
+      
+      <para>Currently, no client-side Sofa mapping information is passed from a UIMA client
+        to a remote service. This can cause complications for UIMA services in a Multi-View
+        application.</para>
+      
+      <para>A remote service will work in a Multi-View application only if the service is
+        Single-View, or if the Sofa names expected by the service exactly match the Sofa
+        names produced by the client.</para>
+      
+      <para>If your application requires Sofa mappings for a remote Analysis Engine, you
+        can wrap your remotely deployed AE in an aggregate (on the remote side), and specify
+        the necessary Sofa mappings in the descriptor for that aggregate.</para>
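+      
+      <para>As a rough sketch of that workaround, the descriptor of the wrapping aggregate
+        could carry a Sofa map like the one below. The component key
+        <literal>WrappedTranslator</literal> and all Sofa names here are hypothetical; the
+        aggregate-level names would have to match the Sofa names the client actually
+        produces:</para>
+      
+      <programlisting><![CDATA[<sofaMappings>
+  <!-- hypothetical component key and Sofa names, for illustration only -->
+  <sofaMapping>
+    <componentKey>WrappedTranslator</componentKey>
+    <componentSofaName>EnglishDocument</componentSofaName>
+    <aggregateSofaName>EnglishTranscript</aggregateSofaName>
+  </sofaMapping>
+  <sofaMapping>
+    <componentKey>WrappedTranslator</componentKey>
+    <componentSofaName>GermanDocument</componentSofaName>
+    <aggregateSofaName>GermanTranslation</aggregateSofaName>
+  </sofaMapping>
+</sofaMappings>]]></programlisting>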
+    </section>
+  </section>
+  
+  <section id="ugr.tug.mvs.jcas_extensions_for_multi_views">
+    <title>JCas extensions for Multiple Views</title>
+    
+    <para>The JCas interface to the CAS can be used with any / all views, as well as the base CAS
+      sent to Multi-View components. You can always get a JCas object from an existing CAS
+      object by using the method getJCas(); this call will create the JCas if it doesn&apos;t
+      already exist. If it does exist, it just returns the existing JCas that corresponds to
+      the CAS.</para>
+    
+    <para>JCas implements the getView(...) method, enabling switching to other named
+      views, just like the corresponding method on the CAS. The JCas version, however,
+      returns JCas objects, instead of CAS objects, corresponding to the view.</para>
+  </section>
+  
+  <section id="ugr.tug.mvs.sample_application">
+    <title>Sample Multi-View Application</title>
+    
+    <para>The UIMA SDK contains a simple Sofa example application which demonstrates many
+      Sofa specific concepts and methods. The source code for the application driver is in
+      <literal>examples/src/org/apache/uima/examples/SofaExampleApplication.java</literal>
+      and the Multi-View annotator is given in
+      <literal>SofaExampleAnnotator.java</literal> in the same directory.</para>
+    
+    <para>This sample application demonstrates a language translator annotator which
+      expects an input text Sofa with an English document and creates an output text Sofa
+      containing a German translation. Some of the key Sofa concepts illustrated here
+      include:</para>
+    
+    <itemizedlist spacing="compact"><listitem><para>Sofa creation.</para>
+      </listitem>
+      
+      <listitem><para>Access of multiple CAS views.</para></listitem>
+      
+      <listitem><para>Unique feature structure index space for each view.</para>
+        </listitem>
+      
+      <listitem><para>Feature structures containing cross references between
+        annotations in different CAS views.</para></listitem>
+      
+      <listitem><para>The strong affinity of annotations with a specific Sofa. </para>
+        </listitem></itemizedlist>
+    
+    <section id="ugr.tug.mvs.sample_application.descriptor">
+      <title>Annotator Descriptor</title>
+      
+      <para>The annotator descriptor in
+        <literal>examples/descriptors/analysis_engine/SofaExampleAnnotator.xml</literal>
+        declares an input Sofa named <quote>EnglishDocument</quote> and an output Sofa
+        named <quote>GermanDocument</quote>. A custom type
+        <quote>CrossAnnotation</quote> is also defined:</para>
+      
+      
+      <programlisting><![CDATA[<typeDescription>
+  <name>sofa.test.CrossAnnotation</name>
+  <description/>
+  <supertypeName>uima.tcas.Annotation</supertypeName>
+  <features>
+    <featureDescription>
+      <name>otherAnnotation</name>
+      <description/>
+      <rangeTypeName>uima.tcas.Annotation</rangeTypeName>
+    </featureDescription>
+  </features>
+</typeDescription>]]></programlisting>
+      
+      <para>The <literal>CrossAnnotation</literal> type is derived from
+        <literal>uima.tcas.Annotation</literal> and includes one new feature: a
+        reference to another annotation.</para>
+      
+    </section>
+    
+    <section id="ugr.tug.mvs.sample_application.setup">
+      <title>Application Setup</title>
+      
+      <para>The application driver instantiates an analysis engine,
+        <literal>seAnnotator</literal>, from the annotator descriptor, obtains a new
+        base CAS using that engine&apos;s CAS definition, and creates the expected input
+        Sofa using:</para>
+      
+      
+      <programlisting>CAS cas = seAnnotator.newCAS();
+CAS aView = cas.createView("EnglishDocument");</programlisting>
+      
+      <para>Since <literal>seAnnotator</literal> is a primitive component, and no Sofa
+        mapping has been defined, the SofaID will be <quote>EnglishDocument</quote>.
+        Local Sofa data is set using:</para>
+      
+      
+      <programlisting>aView.setDocumentText("this beer is good");</programlisting>
+      
+      <para>At this point the CAS contains all necessary inputs for the translation
+        annotator and its process method is called.</para>
+      
+    </section>
+    
+    <section id="ugr.tug.mvs.sample_application.annotator_processing">
+      <title>Annotator Processing</title>
+      
+      <para>Annotator processing consists of parsing the English document into individual
+        words, doing word-by-word translation and concatenating the translations into a
+        German translation. Analysis metadata on the English Sofa will be an annotation for
+        each English word. Analysis metadata on the German Sofa will be a
+        <literal>CrossAnnotation</literal> for each German word, where the
+        <literal>otherAnnotation</literal> feature will be a reference to the associated
+        English annotation.</para>
+      
+      <para>Code of interest includes two CAS views:</para>
+      
+      
+      <programlisting>// get View of the English text Sofa
+englishView = aCas.getView("EnglishDocument");
+
+// Create the output German text Sofa
+germanView = aCas.createView("GermanDocument");</programlisting>
+      
+      <para>the indexing of annotations with the appropriate view:</para>
+      
+      
+      <programlisting>englishView.addFsToIndexes(engAnnot);
+. . .
+germanView.addFsToIndexes(germAnnot);</programlisting>
+      
+      <para>and the combining of metadata belonging to different Sofas in the same feature
+        structure:</para>
+      
+      
+      <programlisting>// add link to English text
+germAnnot.setFeatureValue(other, engAnnot);</programlisting>
+      
+    </section>
+    
+    <section id="ugr.tug.mvs.sample_application.accessing_results">
+      <title>Accessing the results of analysis</title>
+      
+      <para>The application needs to get the results of analysis, which may be in different
+        views. Analysis results for each Sofa are dumped independently by iterating over all
+        annotations for each associated CAS view. For the English Sofa:</para>
+      
+      
+      <programlisting>//get annotation iterator for this CAS
+FSIndex anIndex = aView.getAnnotationIndex();
+FSIterator anIter = anIndex.iterator();
+while (anIter.isValid()) {
+  AnnotationFS annot = (AnnotationFS) anIter.get();
+  System.out.println(" " + annot.getType().getName()
+                         + ": " + annot.getCoveredText());
+  anIter.moveToNext();
+}</programlisting>
+      
+      <para>Iterating over all German annotations looks the same, except for the
+        following:</para>
+      
+      
+      <programlisting>if (annot.getType() == cross) {
+  AnnotationFS crossAnnot =
+          (AnnotationFS) annot.getFeatureValue(other);
+  System.out.println("   other annotation feature: "
+          + crossAnnot.getCoveredText());
+}</programlisting>
+      
+      <para>Of particular interest here is the built-in Annotation type method
+        <literal>getCoveredText()</literal>. This method uses the
+        <quote>begin</quote> and <quote>end</quote> features of the annotation to create
+        a substring from the CAS document. The SofaRef feature of the annotation is used to
+        identify the correct Sofa&apos;s data from which to create the substring.</para>
+      
+      <para>The example program output is:</para>
+      
+      
+      <programlisting>---Printing all annotations for English Sofa---
+uima.tcas.DocumentAnnotation: this beer is good
+uima.tcas.Annotation: this
+uima.tcas.Annotation: beer
+uima.tcas.Annotation: is
+uima.tcas.Annotation: good
+      
+---Printing all annotations for German Sofa---
+uima.tcas.DocumentAnnotation: das bier ist gut
+sofa.test.CrossAnnotation: das
+ other annotation feature: this
+sofa.test.CrossAnnotation: bier
+ other annotation feature: beer
+sofa.test.CrossAnnotation: ist
+ other annotation feature: is
+sofa.test.CrossAnnotation: gut
+ other annotation feature: good</programlisting>
+      
+    </section>
+  </section>
+  
+  <section id="ugr.tug.mvs.views_api_summary">
+    <title>Views API Summary</title>
+    
+    <para>The recommended way to deliver a particular CAS view to a <emphasis role="bold-italic">Single-View</emphasis> component is to use Sofa mapping in
+      the CPE and/or aggregate descriptors.</para>
+    
+    <para>For <emphasis role="bold-italic">Multi-View</emphasis> components or
+      applications, the following methods are used to create or get a reference to a CAS view
+      for a particular Sofa:</para>
+    
+    <para>Creating a new View:</para>
+    
+    
+    <programlisting>JCas newView = aJCas.createView(String localNameOfTheViewBeforeMapping);
+CAS  newView = aCAS .createView(String localNameOfTheViewBeforeMapping);</programlisting>
+    
+    <para>Getting a View from a CAS or JCas:</para>
+    
+    
+    <programlisting><?db-font-size 80% ?>JCas myView = aJCas.getView(String localNameOfTheViewBeforeMapping);
+CAS  myView = aCAS .getView(String localNameOfTheViewBeforeMapping);
+Iterator allViews = aCasOrJCas.getViewIterator();
+Iterator someViews = aCasOrJCas.getViewIterator(String localViewNamePrefix);</programlisting>
+    
+    <para>The following methods are useful for all annotators and applications:</para>
+    
+    <para>Setting Sofa data for a CAS or JCas:</para>
+    
+    
+    <programlisting>aCasOrJCas.setDocumentText(String docText);
+aCasOrJCas.setSofaDataString(String docText, String mimeType);
+aCasOrJCas.setSofaDataArray(FeatureStructure array, String mimeType);
+aCasOrJCas.setSofaDataURI(String uri, String mimeType);</programlisting>
+    
+    <para>Getting Sofa data for a particular CAS or JCas:</para>
+    
+    
+    <programlisting>String doc = aCasOrJCas.getDocumentText();
+String doc = aCasOrJCas.getSofaDataString();
+FeatureStructure array = aCasOrJCas.getSofaDataArray();
+String uri = aCasOrJCas.getSofaDataURI();
+InputStream is = aCasOrJCas.getSofaDataStream();</programlisting>
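+    
+    <para>The calls summarized above might be combined as in the following sketch. The class
+      name <literal>ViewsApiSketch</literal>, the method name, and the view names
+      <quote>EnglishDocument</quote> and <quote>GermanDocument</quote> are assumptions
+      chosen for illustration; the sketch also assumes that no views with those names exist
+      yet in the CAS passed in.</para>
+    
+    
+    <programlisting>import java.util.Iterator;
+
+import org.apache.uima.cas.CAS;
+
+// Hypothetical class, for illustration only.
+public class ViewsApiSketch {
+
+  static void populateAndListViews(CAS aCas) {
+    // Create two views and set the Sofa data for each (view names are assumed)
+    CAS english = aCas.createView("EnglishDocument");
+    english.setDocumentText("this beer is good");
+    CAS german = aCas.createView("GermanDocument");
+    german.setDocumentText("das bier ist gut");
+
+    // Later, get a reference to an existing view by name
+    CAS sameEnglish = aCas.getView("EnglishDocument");
+    System.out.println(sameEnglish.getDocumentText());
+
+    // Or iterate over every view in the CAS
+    Iterator views = aCas.getViewIterator();
+    while (views.hasNext()) {
+      CAS view = (CAS) views.next();
+      System.out.println(view.getViewName() + ": " + view.getDocumentText());
+    }
+  }
+}</programlisting>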
+    
+  </section>
+  
+  <section id="ugr.tug.mvs.sofa_incompatibilities_v1_v2">
+    <title>Sofa Incompatibilities between UIMA version 1 and version 2</title>
+    <titleabbrev>Sofa Incompatibilities: V1 and V2</titleabbrev>
+    
+    <para>A major change in version 2 is related to the support of Single-View components
+      and applications. Given an analysis engine, <literal>ae</literal>, the API
+      
+      <programlisting>CAS cas = ae.newCas();</programlisting>
+      used to return the base CAS. Now it returns a view of the Sofa named
+      <quote>_InitialView</quote>. This Sofa is created only if Sofa data
+      is set for this view. The initial view is used for Single-View applications and
+      Multi-View annotators with no Sofa mapping.</para>
+    
+    <para>The process method of Multi-View annotators receives the base CAS; however, the base
+      CAS no longer has an index repository to hold <quote>global</quote> data. Global data
+      needs to be put in a specific named CAS view of your choice.</para>
+    
+    <para>Because of these changes, the following scenarios will break with v2.0 clients:
+      
+      <itemizedlist spacing="compact"><listitem><para>Any version 1.x services (you
+        must migrate the services to version 2).</para></listitem>
+        
+        <listitem><para>Applications or components explicitly referencing
+          <quote>_DefaultTextSofaName</quote> in code or descriptors.</para>
+          </listitem>
+        
+        <listitem><para>Multi-View applications using the Base CAS index repository.
+          </para></listitem></itemizedlist></para>
+  </section>
 </chapter>
\ No newline at end of file

Propchange: incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/tutorials_and_users_guides/tug.multi_views.xml
------------------------------------------------------------------------------
    svn:eol-style = native

Modified: incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/tutorials_and_users_guides/tug.xmi_emf.xml
URL: http://svn.apache.org/viewvc/incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/tutorials_and_users_guides/tug.xmi_emf.xml?rev=689997&r1=689996&r2=689997&view=diff
==============================================================================
--- incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/tutorials_and_users_guides/tug.xmi_emf.xml (original)
+++ incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/tutorials_and_users_guides/tug.xmi_emf.xml Thu Aug 28 14:28:14 2008
@@ -1,153 +1,153 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
-"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"[
-<!ENTITY % uimaents SYSTEM "../entities.ent">  
-%uimaents;
-]>
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-   http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-<chapter id="ugr.tug.xmi_emf">
-  <title>XMI and EMF Interoperability</title>
-  <titleabbrev>XMI &amp; EMF</titleabbrev>
-  
-  <section id="ugr.tug.xmi_emf.overview">
-    <title>Overview</title>
-    
-    <para>In traditional object-oriented terms, a UIMA Type System is a class model and a UIMA CAS is an object graph.
-      There are established standards in this area
-      &ndash; specifically, <trademark class="registered">UML</trademark> is an <trademark class="trade">
-      OMG</trademark> standard for class models and XMI (XML Metadata Interchange) is an OMG standard for the XML
-      representation of object graphs.</para>
-    
-    <para>Furthermore, the Eclipse Modeling Framework (EMF) is an open-source framework for model-based
-      application development, and it is based on UML and XMI. In EMF, you define class models using a metamodel called
-      Ecore, which is similar to UML. EMF provides tools for converting a UML model to Ecore. EMF can then generate Java
-      classes from your model, and supports persistence of those classes in the XMI format.</para>
-    
-    <para>The UIMA SDK provides tools for interoperability with XMI and EMF. These tools allow conversions of UIMA
-      Type Systems to and from Ecore models, as well as conversions of UIMA CASes to and from XMI format. This provides a
-      number of advantages, including:</para>
-    
-    <blockquote>
-      <para>You can define a model using a UML Editor, such as Rational Rose or EclipseUML, and then automatically
-        convert it to a UIMA Type System.</para>
-      
-      <para>You can take an existing UIMA application, convert its type system to Ecore, and save the CASes it
-        produces to XMI. This data is now in a form where it can easily be ingested by an EMF-based application.</para>
-    </blockquote>
-    
-    <para>More generally, we are adopting the well-documented, open standard XMI as the standard way to represent
-      UIMA-compliant analysis results (replacing the UIMA-specific XCAS format). This use of an open standard
-      enables other applications to more easily produce or consume these UIMA analysis results.</para>
-    
-    <para>For more information on XMI, see Grose et al. <emphasis>Mastering XMI. Java Programming with XMI, XML, and
-      UML.</emphasis> John Wiley &amp; Sons, Inc. 2002.</para>
-    
-    <para>For more information on EMF, see Budinsky et al. <emphasis>Eclipse Modeling Framework 2.0.</emphasis>
-      Addison-Wesley. 2006.</para>
-    
-    <para>For details of how the UIMA CAS is represented in XMI format, see <olink targetdoc="&uima_docs_ref;"
-        targetptr="ugr.ref.xmi"/> .</para>
-    
-  </section>
-  
-  <section id="ugr.tug.xmi_emf.converting_ecore_to_from_uima_type_system">
-    <title>Converting an Ecore Model to or from a UIMA Type System</title>
-    
-    <para>The UIMA SDK provides the following two classes:</para>
-    
-    <para><emphasis role="bold"><literal>Ecore2UimaTypeSystem:</literal>
-      </emphasis> converts from an .ecore model developed using EMF to a UIMA-compliant
-      TypeSystem descriptor. This is a Java class that can be run as a standalone program or
-      invoked from another Java application. To run as a standalone program,
-      execute:</para>
-    
-    <para><command>java org.apache.uima.ecore.Ecore2UimaTypeSystem &lt;ecore
-      file&gt; &lt;output file&gt;</command></para>
-    
-    <para>The input .ecore file will be converted to a UIMA TypeSystem descriptor and written
-      to the specified output file. You can then use the resulting TypeSystem descriptor in
-      your UIMA application.</para>
-    
-    <para><emphasis role="bold"><literal>UimaTypeSystem2Ecore:</literal>
-      </emphasis> converts from a UIMA TypeSystem descriptor to an .ecore model. This is a
-      Java class that can be run as a standalone program or invoked from another Java
-      application. To run as a standalone program, execute:</para>
-    
-    <para><command>java org.apache.uima.ecore.UimaTypeSystem2Ecore
-      &lt;TypeSystem descriptor&gt; &lt;output file&gt;</command></para>
-    
-    <para>The input UIMA TypeSystem descriptor will be converted to an Ecore model file and
-      written to the specified output file. You can then use the resulting Ecore model in EMF
-      applications. The converted type system will include any
-      <literal>&lt;import...&gt;</literal>ed TypeSystems; the fact that they were
-      imported is currently not preserved.</para>
-    
-    <para>To run either of these converters, your classpath will need to include the UIMA jar
-      files as well as the following jar files from the EMF distribution: common.jar,
-      ecore.jar, and ecore.xmi.jar.</para>
-    
-    <para>Also, note that the uima-core.jar file contains the Ecore model file uima.ecore,
-      which defines the built-in UIMA types. You may need to use this file from your EMF
-      applications.</para>
-    
-  </section>
-  
-  <section id="ugr.tug.xmi_emf.using_xmi_cas_serialization">
-    <title>Using XMI CAS Serialization</title>
-    
-    <para>The UIMA SDK provides XMI support through the following two classes:</para>
-    
-    <para><emphasis role="bold"><literal>XmiCasSerializer:</literal></emphasis>
-      can be run from within a UIMA application to write out a CAS to the standard XMI format. The
-      XMI that is generated will be compliant with the Ecore model generated by
-      <literal>UimaTypeSystem2Ecore</literal>. An EMF application could use this Ecore
-      model to ingest and process the XMI produced by the XmiCasSerializer.</para>
-    
-    <para><emphasis role="bold"><literal>XmiCasDeserializer:</literal></emphasis>
-      can be run from within a UIMA application to read in an XMI document and populate a CAS. The
-      XMI must conform to the Ecore model generated by
-      <literal>UimaTypeSystem2Ecore</literal>.</para>
-    
-    <para>Also, the uimaj-examples Eclipse project contains some example code that shows
-      how to use the serializer and deserializer:
-
-    <blockquote>
-    <para><literal>org.apache.uima.examples.xmi.XmiWriterCasConsumer:</literal>
-      This is a CAS Consumer that writes each CAS to an output file in XMI format. It is analogous
-      to the XCasWriter CAS Consumer that has existed in prior UIMA versions, except that it
-      uses the XMI serialization format.</para>
-    
-    <para><literal>org.apache.uima.examples.xmi.XmiCollectionReader:</literal>
-      This is a Collection Reader that reads a directory of XMI files and deserializes each of
-      them into a CAS. For example, this would allow you to build a Collection Processing
-      Engine that reads XMI files, which could contain some previous analysis results, and
-      then do further analysis.</para>
-    </blockquote></para>
-    
-    <para>Finally, in under the folder <literal>uimaj-examples/ecore_src</literal> is
-      the class
-      <literal>org.apache.uima.examples.xmi.XmiEcoreCasConsumer</literal>, which
-      writes each CAS to XMI format and also saves the Type System as an Ecore file. Since this
-      uses the <literal>UimaTypeSystem2Ecore</literal> converter, to compile it you must
-      add to your classpath the EMF jars common.jar, ecore.jar, and ecore.xmi.jar &ndash;
-      see ecore_src/readme.txt for instructions.</para>
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"[
+<!ENTITY % uimaents SYSTEM "../entities.ent">  
+%uimaents;
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<chapter id="ugr.tug.xmi_emf">
+  <title>XMI and EMF Interoperability</title>
+  <titleabbrev>XMI &amp; EMF</titleabbrev>
+  
+  <section id="ugr.tug.xmi_emf.overview">
+    <title>Overview</title>
+    
+    <para>In traditional object-oriented terms, a UIMA Type System is a class model and a UIMA CAS is an object graph.
+      There are established standards in this area
+      &ndash; specifically, <trademark class="registered">UML</trademark> is an <trademark class="trade">
+      OMG</trademark> standard for class models and XMI (XML Metadata Interchange) is an OMG standard for the XML
+      representation of object graphs.</para>
+    
+    <para>Furthermore, the Eclipse Modeling Framework (EMF) is an open-source framework for model-based
+      application development, and it is based on UML and XMI. In EMF, you define class models using a metamodel called
+      Ecore, which is similar to UML. EMF provides tools for converting a UML model to Ecore. EMF can then generate Java
+      classes from your model, and supports persistence of those classes in the XMI format.</para>
+    
+    <para>The UIMA SDK provides tools for interoperability with XMI and EMF. These tools allow conversions of UIMA
+      Type Systems to and from Ecore models, as well as conversions of UIMA CASes to and from XMI format. This provides a
+      number of advantages, including:</para>
+    
+    <blockquote>
+      <para>You can define a model using a UML Editor, such as Rational Rose or EclipseUML, and then automatically
+        convert it to a UIMA Type System.</para>
+      
+      <para>You can take an existing UIMA application, convert its type system to Ecore, and save the CASes it
+        produces to XMI. This data is now in a form where it can easily be ingested by an EMF-based application.</para>
+    </blockquote>
+    
+    <para>More generally, we are adopting the well-documented, open standard XMI as the standard way to represent
+      UIMA-compliant analysis results (replacing the UIMA-specific XCAS format). This use of an open standard
+      enables other applications to more easily produce or consume these UIMA analysis results.</para>
+    
+    <para>For more information on XMI, see Grose et al. <emphasis>Mastering XMI. Java Programming with XMI, XML, and
+      UML.</emphasis> John Wiley &amp; Sons, Inc. 2002.</para>
+    
+    <para>For more information on EMF, see Budinsky et al. <emphasis>Eclipse Modeling Framework 2.0.</emphasis>
+      Addison-Wesley. 2006.</para>
+    
+    <para>For details of how the UIMA CAS is represented in XMI format, see <olink targetdoc="&uima_docs_ref;"
+        targetptr="ugr.ref.xmi"/>.</para>
+    
+  </section>
+  
+  <section id="ugr.tug.xmi_emf.converting_ecore_to_from_uima_type_system">
+    <title>Converting an Ecore Model to or from a UIMA Type System</title>
+    
+    <para>The UIMA SDK provides the following two classes:</para>
+    
+    <para><emphasis role="bold"><literal>Ecore2UimaTypeSystem:</literal>
+      </emphasis> converts from an .ecore model developed using EMF to a UIMA-compliant
+      TypeSystem descriptor. This is a Java class that can be run as a standalone program or
+      invoked from another Java application. To run as a standalone program,
+      execute:</para>
+    
+    <para><command>java org.apache.uima.ecore.Ecore2UimaTypeSystem &lt;ecore
+      file&gt; &lt;output file&gt;</command></para>
+    
+    <para>The input .ecore file will be converted to a UIMA TypeSystem descriptor and written
+      to the specified output file. You can then use the resulting TypeSystem descriptor in
+      your UIMA application.</para>
+    
+    <para><emphasis role="bold"><literal>UimaTypeSystem2Ecore:</literal>
+      </emphasis> converts from a UIMA TypeSystem descriptor to an .ecore model. This is a
+      Java class that can be run as a standalone program or invoked from another Java
+      application. To run as a standalone program, execute:</para>
+    
+    <para><command>java org.apache.uima.ecore.UimaTypeSystem2Ecore
+      &lt;TypeSystem descriptor&gt; &lt;output file&gt;</command></para>
+    
+    <para>The input UIMA TypeSystem descriptor will be converted to an Ecore model file and
+      written to the specified output file. You can then use the resulting Ecore model in EMF
+      applications. The converted type system will include any TypeSystems referenced
+      through <literal>&lt;import...&gt;</literal> elements; the fact that they were
+      imported is currently not preserved.</para>
+    
+    <para>To run either of these converters, your classpath will need to include the UIMA jar
+      files as well as the following jar files from the EMF distribution: common.jar,
+      ecore.jar, and ecore.xmi.jar.</para>
+    
+    <para>Also, note that the uima-core.jar file contains the Ecore model file uima.ecore,
+      which defines the built-in UIMA types. You may need to use this file from your EMF
+      applications.</para>
+    
+  </section>
+  
+  <section id="ugr.tug.xmi_emf.using_xmi_cas_serialization">
+    <title>Using XMI CAS Serialization</title>
+    
+    <para>The UIMA SDK provides XMI support through the following two classes:</para>
+    
+    <para><emphasis role="bold"><literal>XmiCasSerializer:</literal></emphasis>
+      can be run from within a UIMA application to write out a CAS to the standard XMI format. The
+      XMI that is generated will be compliant with the Ecore model generated by
+      <literal>UimaTypeSystem2Ecore</literal>. An EMF application could use this Ecore
+      model to ingest and process the XMI produced by the XmiCasSerializer.</para>
+    
+    <para><emphasis role="bold"><literal>XmiCasDeserializer:</literal></emphasis>
+      can be run from within a UIMA application to read in an XMI document and populate a CAS. The
+      XMI must conform to the Ecore model generated by
+      <literal>UimaTypeSystem2Ecore</literal>.</para>
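+    
+    <para>A minimal sketch of calling these two classes from application code follows. The
+      class name <literal>XmiRoundTripSketch</literal>, the method names, and the
+      <literal>path</literal> parameter are hypothetical. The sketch assumes the CAS passed to
+      <literal>readXmi</literal> was created with a type system compatible with the XMI being
+      read.</para>
+    
+    
+    <programlisting>import java.io.FileInputStream;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+
+import org.apache.uima.cas.CAS;
+import org.apache.uima.cas.impl.XmiCasDeserializer;
+import org.apache.uima.cas.impl.XmiCasSerializer;
+import org.xml.sax.SAXException;
+
+// Hypothetical class, for illustration only.
+public class XmiRoundTripSketch {
+
+  // Write the contents of a CAS to a file in XMI format.
+  static void writeXmi(CAS cas, String path) throws IOException, SAXException {
+    OutputStream out = new FileOutputStream(path);
+    try {
+      XmiCasSerializer.serialize(cas, out);
+    } finally {
+      out.close();
+    }
+  }
+
+  // Populate a CAS from an XMI file.
+  static void readXmi(CAS cas, String path) throws IOException, SAXException {
+    InputStream in = new FileInputStream(path);
+    try {
+      XmiCasDeserializer.deserialize(in, cas);
+    } finally {
+      in.close();
+    }
+  }
+}</programlisting>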
+    
+    <para>Also, the uimaj-examples Eclipse project contains some example code that shows
+      how to use the serializer and deserializer:
+
+    <blockquote>
+    <para><literal>org.apache.uima.examples.xmi.XmiWriterCasConsumer:</literal>
+      This is a CAS Consumer that writes each CAS to an output file in XMI format. It is analogous
+      to the XCasWriter CAS Consumer that has existed in prior UIMA versions, except that it
+      uses the XMI serialization format.</para>
+    
+    <para><literal>org.apache.uima.examples.xmi.XmiCollectionReader:</literal>
+      This is a Collection Reader that reads a directory of XMI files and deserializes each of
+      them into a CAS. For example, this would allow you to build a Collection Processing
+      Engine that reads XMI files, which could contain some previous analysis results, and
+      then do further analysis.</para>
+    </blockquote></para>
+    
+    <para>Finally, under the folder <literal>uimaj-examples/ecore_src</literal> is
+      the class
+      <literal>org.apache.uima.examples.xmi.XmiEcoreCasConsumer</literal>, which
+      writes each CAS to XMI format and also saves the Type System as an Ecore file. Since this
+      uses the <literal>UimaTypeSystem2Ecore</literal> converter, to compile it you must
+      add to your classpath the EMF jars common.jar, ecore.jar, and ecore.xmi.jar &ndash;
+      see ecore_src/readme.txt for instructions.</para>
 
     <section id="ugr.tug.xmi_emf.xml_character_issues">
     <title>Character Encoding Issues with XML Serialization</title>
@@ -181,6 +181,6 @@
     </para>
   
     </section>
-  </section>
-  
+  </section>
+  
 </chapter>
\ No newline at end of file

Propchange: incubator/uima/uimaj/trunk/uima-docbooks/src/docbook/tutorials_and_users_guides/tug.xmi_emf.xml
------------------------------------------------------------------------------
    svn:eol-style = native