Posted to commits@uima.apache.org by al...@apache.org on 2007/03/14 23:12:10 UTC
svn commit: r518354 [17/21] - in /incubator/uima/site/trunk/uima-website:
docs/ docs/downloads/releaseDocs/
docs/downloads/releaseDocs/2.1.0-incubating/
docs/downloads/releaseDocs/2.1.0-incubating/docs/
docs/downloads/releaseDocs/2.1.0-incubating/docs/...
Added: incubator/uima/site/trunk/uima-website/xdocs/downloads/releaseDocs/2.1.0-incubating/docs/html/references/references.html
URL: http://svn.apache.org/viewvc/incubator/uima/site/trunk/uima-website/xdocs/downloads/releaseDocs/2.1.0-incubating/docs/html/references/references.html?view=auto&rev=518354
==============================================================================
--- incubator/uima/site/trunk/uima-website/xdocs/downloads/releaseDocs/2.1.0-incubating/docs/html/references/references.html (added)
+++ incubator/uima/site/trunk/uima-website/xdocs/downloads/releaseDocs/2.1.0-incubating/docs/html/references/references.html Wed Mar 14 15:11:54 2007
@@ -0,0 +1,3786 @@
+<html><head>
+ <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
+ <title>UIMA References</title><link rel="stylesheet" href="css/stylesheet.css" type="text/css"><meta name="generator" content="DocBook XSL Stylesheets V1.70.0"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="book" lang="en" id="d0e2"><div class="titlepage"><div><div><h1 class="title"><a name="d0e2"></a>UIMA References</h1></div><div><div class="authorgroup"><h3 class="corpauthor">Authors: The Apache UIMA Development Community</h3></div></div><div><p class="releaseinfo">Version 2.1</p></div><div><p class="copyright">Copyright © 2006, 2007 The Apache Software Foundation</p></div><div><p class="copyright">Copyright © 2004, 2006 International Business Machines Corporation</p></div><div><div class="legalnotice"><a name="d0e15"></a><p> </p><p><b>Incubation Notice and Disclaimer. </b>Apache UIMA is an effort undergoing incubation at the Apache Software Foundation (ASF).
+ Incubation is required of all newly accepted projects until a further review indicates that
+ the infrastructure, communications, and decision making process have stabilized in a manner
+ consistent with other successful ASF projects. While incubation status is not necessarily
+ a reflection of the completeness or stability of the code,
+ it does indicate that the project has yet to be fully endorsed by the ASF.</p><p> </p><p> </p><p><b>License and Disclaimer. </b>The ASF licenses this documentation
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this documentation except in compliance
+ with the License. You may obtain a copy of the License at
+
+ </p><div class="blockquote"><blockquote class="blockquote"><a href="http://www.apache.org/licenses/LICENSE-2.0" target="_top">http://www.apache.org/licenses/LICENSE-2.0</a></blockquote></div><p>
+
+ Unless required by applicable law or agreed to in writing,
+ this documentation and its contents are distributed under the License
+ on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ </p><p> </p><p> </p><p><b>Trademarks. </b>All terms mentioned in the text that are known to be trademarks or
+ service marks have been appropriately capitalized. Use of such terms
+ in this book should not be regarded as affecting the validity of the
+ trademark or service mark.
+ </p></div></div><div><p class="pubdate">February, 2007</p></div></div><hr></div><div class="toc"><p><b>Table of Contents</b></p><dl><dt><span class="chapter"><a href="#ugr.ref.javadocs">1. Javadocs</a></span></dt><dt><span class="chapter"><a href="#ugr.ref.xml.component_descriptor">2. Component Descriptor Reference</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.xml.component_descriptor.notation">2.1. Notation</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.component_descriptor.imports">2.2. Imports</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.component_descriptor.type_system">2.3. Type System Descriptors</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.xml.component_descriptor.type_system.imports">2.3.1. Imports</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.component_descriptor.type_system.types">2.3.2. Types</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.component
_descriptor.type_system.features">2.3.3. Features</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.component_descriptor.type_system.string_subtypes">2.3.4. String Subtypes</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.ref.xml.component_descriptor.aes">2.4. Analysis Engine Descriptors</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.xml.component_descriptor.aes.primitive">2.4.1. Primitive Analysis Engine Descriptors</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.component_descriptor.aes.aggregate">2.4.2. Aggregate Analysis Engine Descriptors</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.ref.xml.component_descriptor.flow_controller">2.5. Flow Controller Descriptors</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.component_descriptor.collection_processing_parts">2.6. Collection Processing Component Descriptors</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.xm
l.component_descriptor.collection_processing_parts.collection_reader">2.6.1. Collection Reader Descriptors</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.component_descriptor.collection_processing_parts.cas_initializer">2.6.2. CAS Initializer Descriptors (deprecated)</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.component_descriptor.collection_processing_parts.cas_consumer">2.6.3. CAS Consumer Descriptors</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.ref.xml.component_descriptor.service_client">2.7. Service Client Descriptors</a></span></dt></dl></dd><dt><span class="chapter"><a href="#ugr.ref.xml.cpe_descriptor">3. CPE Descriptor Reference</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.xml.cpe_descriptor.overview">3.1. CPE Overview</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.cpe_descriptor.notation">3.2. Notation</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.cpe_descr
iptor.imports">3.3. Imports</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.cpe_descriptor.descriptor">3.4. CPE Descriptor Overview</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.cpe_descriptor.descriptor.collection_reader">3.5. Collection Reader</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.xml.cpe_descriptor.descriptor.collection_reader.error_handling">3.5.1. Error handling for Collection Readers</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.ref.xml.cpe_descriptor.descriptor.cas_processors">3.6. CAS Processors</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.xml.cpe_descriptor.descriptor.cas_processors.individual">3.6.1. Specifying an Individual CAS Processor</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.ref.xml.cpe_descriptor.descriptor.operational_parameters">3.7. CPE Operational Parameters</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.cpe_descript
or.descriptor.resource_manager_configuration">3.8. Resource Manager Configuration</a></span></dt><dt><span class="section"><a href="#ugr.ref.xml.cpe_descriptor.descriptor.example">3.9. Example CPE Descriptor</a></span></dt></dl></dd><dt><span class="chapter"><a href="#ugr.ref.cas">4. CAS Reference</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.cas.javadocs">4.1. JavaDocs</a></span></dt><dt><span class="section"><a href="#ugr.ref.cas.overview">4.2. CAS Overview</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.cas.type_system">4.2.1. The Type System</a></span></dt><dt><span class="section"><a href="#ugr.ref.cas.creating_accessing_manipulating_data">4.2.2. Creating/Accessing/Changing data</a></span></dt><dt><span class="section"><a href="#ugr.ref.cas.creating_using_indexes">4.2.3. Creating and using indexes</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.ref.cas.builtin_types">4.3. Built-in CAS Types</a></span></dt><dt><spa
n class="section"><a href="#ugr.ref.cas.accessing_the_type_system">4.4. Accessing the type system</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.cas.type_system.printer_example">4.4.1. TypeSystemPrinter example</a></span></dt><dt><span class="section"><a href="#ugr.ref.cas.cas_apis_create_modify_feature_structures">4.4.2. Using CAS APIs: Feature Structures</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.ref.cas.creating_feature_structures">4.5. Creating feature structures</a></span></dt><dt><span class="section"><a href="#ugr.ref.cas.accessing_modifying_features_of_feature_structures">4.6. Accessing or modifying Features</a></span></dt><dt><span class="section"><a href="#ugr.ref.cas.indexes_and_iterators">4.7. Indexes and Iterators</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.cas.index.built_in_indexes">4.7.1. Built-in Indexes</a></span></dt><dt><span class="section"><a href="#ugr.ref.cas.index.adding_to_indexes">4.7
.2. Adding Feature Structures to the Indexes</a></span></dt><dt><span class="section"><a href="#ugr.ref.cas.index.iterators">4.7.3. Iterators</a></span></dt><dt><span class="section"><a href="#ugr.ref.cas.index.annotation_index">4.7.4. Special iterators for Annotation types</a></span></dt><dt><span class="section"><a href="#ugr.ref.cas.index.constraints_and_filtered_iterators">4.7.5. Constraints and Filtered iterators</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.ref.cas.guide_to_javadocs">4.8. CAS API's JavaDocs</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.cas.javadocs.cas_package">4.8.1. APIs in the CAS package</a></span></dt></dl></dd></dl></dd><dt><span class="chapter"><a href="#ugr.ref.jcas">5. JCas Reference</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.jcas.name_spaces">5.1. Name Spaces</a></span></dt><dt><span class="section"><a href="#ugr.ref.jcas.use_of_description">5.2. Use of XML Description</a></span>
</dt><dt><span class="section"><a href="#ugr.ref.jcas.mapping_built_ins">5.3. Mapping built-in CAS types to Java types</a></span></dt><dt><span class="section"><a href="#ugr.ref.jcas.augmenting_generated_code">5.4. Augmenting the generated Java Code</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.jcas.keeping_augmentations_when_regenerating">5.4.1. Keeping hand-coded augmentations when regenerating</a></span></dt><dt><span class="section"><a href="#ugr.ref.jcas.additional_constructors">5.4.2. Additional Constructors</a></span></dt><dt><span class="section"><a href="#ugr.ref.jcas.modifying_generated_items">5.4.3. Modifying generated items</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.ref.jcas.merging_types_from_other_specs">5.5. Merging Types</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.jcas.merging_types.aggregates_and_cpes">5.5.1. Aggregate AEs and CPEs as sources of types</a></span></dt><dt><span class="section"><
a href="#ugr.ref.jcas.merging_types.jcasgen_support">5.5.2. JCasGen support for type merging</a></span></dt><dt><span class="section"><a href="#ugr.ref.jcas.impact_of_type_merging_on_composability">5.5.3. Type Merging impacts on Composability</a></span></dt><dt><span class="section"><a href="#ugr.ref.jcas.documentannotation_issues">5.5.4. Adding Features to DocumentAnnotation</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.ref.jcas.using_within_an_annotator">5.6. Using JCas within an Annotator</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.jcas.new_instances">5.6.1. Creating new instances</a></span></dt><dt><span class="section"><a href="#ugr.ref.jcas.getters_and_setters">5.6.2. Getters and Setters</a></span></dt><dt><span class="section"><a href="#ugr.ref.jcas.obtaining_refs_to_indexes">5.6.3. Obtaining references to Indexes</a></span></dt><dt><span class="section"><a href="#ugr.ref.jcas.adding_removing_instances_to_indexes">5.6.4. Updat
ing Indexes</a></span></dt><dt><span class="section"><a href="#ugr.ref.jcas.using_iterators">5.6.5. Using Iterators</a></span></dt><dt><span class="section"><a href="#ugr.ref.jcas.class_loaders">5.6.6. Class Loaders in UIMA</a></span></dt><dt><span class="section"><a href="#ugr.ref.jcas.accessing_jcas_objects_outside_uima_components">5.6.7. Issues accessing JCas objects outside of UIMA Engine Components</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.ref.jcas.setting_up_classpath">5.7. Setting up Classpath for JCas</a></span></dt></dl></dd><dt><span class="chapter"><a href="#ugr.ref.pear">6. PEAR Reference</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.pear.packaging_a_component">6.1. Packaging a UIMA component</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.pear.creating_pear_structure">6.1.1. Creating the PEAR structure</a></span></dt><dt><span class="section"><a href="#ugr.ref.pear.populating_pear_structure">6.1.2. P
opulating the PEAR structure</a></span></dt><dt><span class="section"><a href="#ugr.ref.pear.creating_installation_descriptor">6.1.3. Creating the installation descriptor</a></span></dt><dt><span class="section"><a href="#ugr.ref.pear.installation_descriptor">6.1.4. Installation Descriptor: template</a></span></dt><dt><span class="section"><a href="#ugr.ref.pear.packaging_into_1_file">6.1.5. Packaging the PEAR structure into one file</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.ref.pear.installing">6.2. Installing a PEAR package</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.pear.installing_pear_using_API">6.2.1. Installing a PEAR file using the PEAR APIs</a></span></dt></dl></dd></dl></dd><dt><span class="chapter"><a href="#ugr.ref.xmi">7. XMI CAS Serialization Reference</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.xmi.xmi_tag">7.1. XMI Tag</a></span></dt><dt><span class="section"><a href="#ugr.ref.xmi.feature_st
ructures">7.2. Feature Structures</a></span></dt><dt><span class="section"><a href="#ugr.ref.xmi.primitive_features">7.3. Primitive Features</a></span></dt><dt><span class="section"><a href="#ugr.ref.xmi.reference_features">7.4. Reference Features</a></span></dt><dt><span class="section"><a href="#ugr.ref.xmi.array_and_list_features">7.5. Array and List Features</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.ref.xmi.array_and_list_features.as_multi_valued_properties">7.5.1. Arrays and Lists as Multi-Valued Properties</a></span></dt><dt><span class="section"><a href="#ugr.ref.xmi.array_and_list_features.as_1st_class_objects">7.5.2. Arrays and Lists as First-Class Objects</a></span></dt><dt><span class="section"><a href="#ugr.ref.xmi.null_array_list_elements">7.5.3. Null Array/List Elements</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.ref.xmi.sofas_views">7.6. Subjects of Analysis (Sofas) and Views</a></span></dt><dt><span class="section"><a
href="#ugr.ref.xmi.linking_to_ecore_type_system">7.7. Linking XMI docs to Ecore Type System</a></span></dt></dl></dd></dl></div><div class="chapter" lang="en" id="ugr.ref.javadocs"><div class="titlepage"><div><div><h2 class="title"><a name="ugr.ref.javadocs"></a>Chapter 1. Javadocs</h2></div></div></div><p>The details of all the public APIs for UIMA are contained in the API JavaDocs. These are located in the docs/api
+ directory; the top level to open in your browser is called <a href="api/index.html" target="_top">api/index.html</a>.</p><p>Eclipse supports the ability to attach the JavaDocs to your project. The Javadoc should already be attached
+ to the <code class="literal">uimaj-examples</code> project, if you followed the setup instructions in <a href="../overview_and_setup/overview_and_setup.html#ugr.ovv.eclipse_setup.example_code" class="olink">Section 3.2, “Setting up Eclipse to view Example Code”</a> in <span class="olinkdocname">Overview & Setup</span>. To attach
+ Javadocs to your own Eclipse project, use the following instructions.</p><p>Open a project which is referring to the UIMA APIs in its class path, and open the project properties. Then pick
+ Java Build Path. Pick the "Libraries" tab and select one of the UIMA library entries (if you don't have, for
+ instance, uima-core.jar in this list, it's unlikely your code will compile). Each library entry has a small "+"
+ sign on its left; click that to expand the view to see the Javadoc location. If you highlight that and press Edit, you
+ can add a reference to the Javadocs in the following dialog:
+
+
+ </p><div class="screenshot"><div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="576"><tr><td><img src="../images/references/ref.javadocs/image002.jpg" width="576" alt="Screenshot of attaching Javadoc to source in Eclipse"></td></tr></table></div></div><p>Once you do this, Eclipse can show you JavaDocs for UIMA APIs as you work. To see the JavaDoc for a UIMA API, you
+ can hover over the API class or method, or select it and press shift-F2, or use the menu Navigate <span class="symbol">→</span>
+ OpenExternalJavaDoc, or open the Javadoc view (Window <span class="symbol">→</span> Show View <span class="symbol">→</span> Other
+ <span class="symbol">→</span> Java <span class="symbol">→</span> Javadoc).</p><p>In a similar manner, you can attach the source for the UIMA framework. The source is, of course, available from
+ the Apache UIMA website (<a href="http://incubator.apache.org/uima" target="_top">http://incubator.apache.org/uima</a>).</p></div><div class="chapter" lang="en" id="ugr.ref.xml.component_descriptor"><div class="titlepage"><div><div><h2 class="title"><a name="ugr.ref.xml.component_descriptor"></a>Chapter 2. Component Descriptor Reference</h2></div></div></div><p>This chapter is the reference guide for the UIMA SDK's Component Descriptor XML
+ schema. A <span class="emphasis"><em>Component Descriptor</em></span> (also sometimes called a
+ <span class="emphasis"><em>Resource Specifier</em></span> in the code) is an XML file that either (a)
+ completely describes a component, including all information needed to construct the
+ component and interact with it, or (b) specifies how to connect to and interact with an
+ existing component that has been published as a remote service.
+ <span class="emphasis"><em>Component</em></span> (also called <span class="emphasis"><em>Resource</em></span>) is a
+ general term for modules produced by UIMA developers and used by UIMA applications. The
+ types of Components are: Analysis Engines, Collection Readers, CAS
+ Initializers<sup>[<a name="d0e122" href="#ftn.d0e122">1</a>]</sup>, CAS Consumers, and Collection Processing Engines.
+ However, Collection Processing Engine Descriptors are significantly different in
+ format and are covered in a separate chapter, <a href="references.html#ugr.ref.xml.cpe_descriptor" class="olink">Chapter 3, Collection Processing Engine Descriptor Reference
+ </a>.</p><p><a href="#ugr.ref.xml.component_descriptor.notation" title="2.1. Notation">Section 2.1, “Notation”</a> describes the notation used in this
+ chapter.</p><p><a href="#ugr.ref.xml.component_descriptor.imports" title="2.2. Imports">Section 2.2, “Imports”</a> describes the UIMA SDK's
+ <span class="emphasis"><em>import</em></span> syntax, used to allow XML descriptors to import
+ information from other XML files, to allow sharing of information between several XML
+ descriptors.</p><p><a href="#ugr.ref.xml.component_descriptor.aes" title="2.4. Analysis Engine Descriptors">Section 2.4, “Analysis Engine Descriptors”</a> describes the XML format for <span class="emphasis"><em>Analysis Engine
+ Descriptors</em></span>. These are descriptors that completely describe Analysis
+ Engines, including all information needed to construct and interact with them.</p><p><a href="#ugr.ref.xml.component_descriptor.collection_processing_parts" title="2.6. Collection Processing Component Descriptors">Section 2.6, “Collection Processing Component Descriptors”</a> describes the XML format for
+ <span class="emphasis"><em>Collection Processing Component Descriptors</em></span>. This includes
+ Collection Reader, CAS Initializer, and CAS Consumer Descriptors.</p><p><a href="#ugr.ref.xml.component_descriptor.service_client" title="2.7. Service Client Descriptors">Section 2.7, “Service Client Descriptors”</a> describes the XML format for
+ <span class="emphasis"><em>Service Client Descriptors</em></span>, which specify how to connect to and
+ interact with resources deployed as remote services.</p><div class="section" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="ugr.ref.xml.component_descriptor.notation"></a>2.1. Notation</h2></div></div></div><p>This chapter uses an informal notation to specify the syntax of Component
+ Descriptors. The formal syntax is defined by an XML schema definition, which is
+ contained in the file <code class="literal">resourceSpecifierSchema.xsd</code>,
+ located in the <code class="literal">uima-core.jar</code> file.</p><p>The notation used in this chapter is:</p><div class="itemizedlist"><ul type="disc"><li><p>An ellipsis (...) inside an element body indicates
+ that the substructure of that element has been omitted (to be described in another
+ section of this chapter). An example of this would be:
+
+
+ </p><pre class="programlisting"><analysisEngineMetaData>
+...
+</analysisEngineMetaData></pre><p>
+ An ellipsis immediately after an element indicates that the element type may be
+ repeated arbitrarily many times. For example:
+
+
+ </p><pre class="programlisting"><parameter>[String]</parameter>
+<parameter>[String]</parameter>
+...</pre><p>
+ indicates that there may be arbitrarily many parameter elements in this
+ context.</p></li><li><p>Bracketed expressions (e.g. <code class="literal">[String]</code>)
+ indicate the type of value that may be used at that location.</p></li><li><p>A vertical bar, as in <code class="literal">true|false</code>, indicates
+ alternatives. This can be applied to literal values, bracketed type names, and
+ elements.</p></li><li><p>Which elements are optional and which are required is specified in
+ prose, not in the syntax definition. </p></li></ul></div></div><div class="section" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="ugr.ref.xml.component_descriptor.imports"></a>2.2. Imports</h2></div></div></div><p>The UIMA SDK defines a particular syntax for XML descriptors to import information
+ from other XML files. When one of the following appears in an XML descriptor:
+
+
+ </p><pre class="programlisting"><import location="[URL]" /> or
+<import name="[Name]" /></pre><p>
+ it indicates that information from a separate XML file is being imported. Note that
+ imports are allowed only in certain places in the descriptor. The remainder of this
+ chapter indicates the points at which imports are allowed.</p><p>If an import specifies a <code class="literal">location</code> attribute, the value of
+ that attribute specifies the URL at which the XML file to import will be found. This can be
+ a relative URL, which will be resolved relative to the descriptor containing the
+ <code class="literal">import</code> element, or an absolute URL. Relative URLs can be written
+ without a protocol/scheme (e.g., “<span class="quote">file:</span>”), and without a host machine
+ name. In this case the relative URL might look something like
+ <code class="literal">org/apache/myproj/MyTypeSystem.xml</code>.</p><p>An absolute URL is written with one of the following prefixes, followed by a path
+ such as <code class="literal">org/apache/myproj/MyTypeSystem.xml</code>:
+
+ </p><div class="itemizedlist"><ul type="disc" compact><li><p>file:/ <span class="symbol">←</span> has no network
+ address</p></li><li><p>file:/// <span class="symbol">←</span> has an empty network address</p></li><li><p>file://some.network.address/</p></li></ul></div><p>For more information about URLs, please read the javadoc information for the Java
+ class “<span class="quote">URL</span>”.</p><p>If an import specifies a <code class="literal">name</code> attribute, the value of that
+ attribute should take the form of a Java-style dotted name (e.g.
+ <code class="literal">org.apache.myproj.MyTypeSystem</code>). An .xml file with this name
+ will be searched for in the classpath or datapath (described below). As in Java, the dots
+ in the name will be converted to file path separators. So an import specifying the
+ example name in this paragraph will result in a search for
+ <code class="literal">org/apache/myproj/MyTypeSystem.xml</code> in the classpath or
+ datapath.</p><p><a name="ugr.ref.xml.component_descriptor.datapath"></a>The datapath works similarly to the classpath but can be set programmatically
+ through the resource manager API. Application developers can specify a datapath
+ during initialization, using the following code:
+
+
+ </p><pre class="programlisting">
+ResourceManager resMgr = UIMAFramework.newDefaultResourceManager();
+resMgr.setDataPath(yourPathString);
+AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(desc, resMgr, null);
+</pre><p>The default datapath for the entire JVM can be set via the
+ <code class="literal">uima.datapath</code> Java system property, but this feature should
+ only be used for standalone applications that don't need to run in the same JVM as
+ other code that may need a different datapath.</p><p>Previous versions of UIMA also supported XInclude. That support didn't work in
+ many situations, and it is no longer supported. To include other files, please use
+ <import>.</p></div><div class="section" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="ugr.ref.xml.component_descriptor.type_system"></a>2.3. Type System Descriptors</h2></div></div></div><p>A Type System Descriptor is used to define the types and features that can be
+ represented in the CAS. A Type System Descriptor can be imported into an Analysis Engine
+ or Collection Processing Component Descriptor.</p><p>The basic structure of a Type System Descriptor is as follows:
+
+
+ </p><pre class="programlisting"><typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
+
+ <name> [String] </name>
+ <description>[String]</description>
+ <version>[String]</version>
+ <vendor>[String]</vendor>
+
+ <imports>
+ <import ...>
+ ...
+ </imports>
+
+ <types>
+ <typeDescription>
+ ...
+ </typeDescription>
+
+ ...
+
+ </types>
+
+</typeSystemDescription></pre><p>All of the subelements are optional.</p><div class="section" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="ugr.ref.xml.component_descriptor.type_system.imports"></a>2.3.1. Imports</h3></div></div></div><p>The <code class="literal">imports</code> section allows this descriptor to import
+ types from other type system descriptors. The import syntax is described in <a href="#ugr.ref.xml.component_descriptor.imports" title="2.2. Imports">Section 2.2, “Imports”</a>. A type system may import any number of other type
+ systems and then define additional types which refer to imported types. Circular
+ imports are allowed.</p></div><div class="section" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="ugr.ref.xml.component_descriptor.type_system.types"></a>2.3.2. Types</h3></div></div></div><p>The <code class="literal">types</code> element contains zero or more
+ <code class="literal">typeDescription</code> elements. Each
+ <code class="literal">typeDescription</code> has the form:
+
+
+ </p><pre class="programlisting"><typeDescription>
+ <name>[TypeName]</name>
+ <description>[String]</description>
+ <supertypeName>[TypeName]</supertypeName>
+ <features>
+ ...
+ </features>
+</typeDescription></pre><p>The name element contains the name of the type. A
+ <code class="literal">[TypeName]</code> is a dot-separated list of names, where each name
+ consists of a letter followed by any number of letters, digits, or underscores.
+ <code class="literal">TypeNames</code> are case sensitive. Letter and digit are as defined
+ by Java; therefore, any Unicode letter or digit may be used (subject to the character
+ encoding defined by the descriptor file's XML header). The name following the
+ final dot is considered to be the “<span class="quote">short name</span>” of the type; the
+ preceding portion is the namespace (analogous to the package.class syntax used in
+ Java). Namespaces beginning with uima are reserved and should not be used. Examples
+ of valid type names are:</p><div class="itemizedlist"><ul type="disc" compact><li><p>test.TokenAnnotation</p></li><li><p>org.myorg.TokenAnnotation</p></li><li><p>com.my_company.proj123.TokenAnnotation </p></li></ul></div><p>These would all be considered distinct types since they have different
+ namespaces. Best practice here is to follow the normal Java naming conventions of
+ having namespaces be all lowercase, with the short type names having an initial
+ capital, but this is not mandated, so <code class="literal">ABC.mYtyPE</code> is an allowed
+ type name. Type names without namespaces (e.g.
+ <code class="literal">TokenAnnotation</code> alone) are allowed but discouraged, because
+ naming conflicts can then result when combining annotators that use different
+ type systems.</p><p>The <code class="literal">description</code> element contains a textual description
+ of the type. The <code class="literal">supertypeName</code> element contains the name of the
+ type from which it inherits (this can be set to the name of another user-defined type,
+ or it may be set to any built-in type which may be subclassed, such as
+ <code class="literal">uima.tcas.Annotation</code> for a new annotation
+ type or <code class="literal">uima.cas.TOP</code> for a new type that is not
+ an annotation). All three of these elements are required.</p></div><div class="section" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="ugr.ref.xml.component_descriptor.type_system.features"></a>2.3.3. Features</h3></div></div></div><p>The <code class="literal">features</code> element of a
+ <code class="literal">typeDescription</code> is required only if the type we are specifying
+ introduces new features. If the <code class="literal">features</code> element is present,
+ it contains zero or more <code class="literal">featureDescription</code> elements, each of
+ which has the form:</p><pre class="programlisting"><featureDescription>
+ <name>[Name]</name>
+ <description>[String]</description>
+ <rangeTypeName>[Name]</rangeTypeName>
+ <elementType>[Name]</elementType>
+ <multipleReferencesAllowed>true|false</multipleReferencesAllowed>
+</featureDescription></pre><p>A feature's name follows the same rules as a type short name – a letter
+ followed by any number of letters, digits, or underscores. Feature names are case
+ sensitive.</p><p>The feature's <code class="literal">rangeTypeName</code> specifies the type of
+ value that the feature can take. This may be the name of any type defined in your type
+ system, or one of the predefined types. All of the predefined types have names that are
+ prefixed with <code class="literal">uima.cas</code> or <code class="literal">uima.tcas</code>,
+ for example:
+
+
+ </p><pre class="programlisting">uima.cas.TOP
+uima.cas.String
+uima.cas.Long
+uima.cas.FSArray
+uima.cas.StringList
+uima.tcas.Annotation</pre><p>
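+ As a sketch of how these names are used, a feature whose range is the predefined
+ <code class="literal">uima.cas.FSArray</code> type might be declared as follows. The feature name
+ <code class="literal">tokens</code> and the type <code class="literal">org.myorg.Token</code> are
+ hypothetical, chosen only for illustration; <code class="literal">elementType</code> and
+ <code class="literal">multipleReferencesAllowed</code> are explained below.</p><pre class="programlisting"><featureDescription>
+  <!-- hypothetical feature name -->
+  <name>tokens</name>
+  <description>The tokens covered by this annotation.</description>
+  <rangeTypeName>uima.cas.FSArray</rangeTypeName>
+  <!-- hypothetical user-defined type -->
+  <elementType>org.myorg.Token</elementType>
+  <multipleReferencesAllowed>false</multipleReferencesAllowed>
+</featureDescription></pre><p>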
+ For a complete list of predefined types, see the CAS API documentation.</p><p>The <code class="literal">elementType</code> of a feature is optional, and applies only
+ when the <code class="literal">rangeTypeName</code> is
+ <code class="literal">uima.cas.FSArray</code> or <code class="literal">uima.cas.FSList</code>.
+ The <code class="literal">elementType</code> specifies what type of value can be assigned as
+ an element of the array or list. This must be the name of a non-primitive type. If
+ omitted, it defaults to <code class="literal">uima.cas.TOP</code>, meaning that any
+ FeatureStructure can be assigned as an element of the array or list. Note: depending on
+ the CAS Interface that you use in your code, this constraint may or may not be
+ enforced.</p><p>The <code class="literal">multipleReferencesAllowed</code> element is optional, and
+ applies only when the <code class="literal">rangeTypeName</code> is an array or list type (it
+ applies to arrays and lists of primitive as well as non-primitive types). Setting
+ this to false (the default) indicates that this feature has exclusive ownership of
+ the array or list, so changes to the array or list are localized. Setting this to true
+ indicates that the array or list may be shared, so changes to it may affect other
+ objects in the CAS. Note: there is currently no guarantee that the framework will
+ enforce this restriction. However, this setting may affect how the CAS is
+ serialized.</p></div><div class="section" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="ugr.ref.xml.component_descriptor.type_system.string_subtypes"></a>2.3.4. String Subtypes</h3></div></div></div><p>There is one other special type that you can declare – a subset of the String
+ type that specifies a restricted set of allowed values. This is useful for features
+ that can have only certain String values, such as parts of speech. Here is an example of
+ how to declare such a type:</p><pre class="programlisting"><typeDescription>
+ <name>PartOfSpeech</name>
+ <description>A part of speech.</description>
+ <supertypeName>uima.cas.String</supertypeName>
+ <allowedValues>
+ <value>
+ <string>NN</string>
+ <description>Noun, singular or mass.</description>
+ </value>
+ <value>
+ <string>NNS</string>
+ <description>Noun, plural.</description>
+ </value>
+ <value>
+ <string>VB</string>
+ <description>Verb, base form.</description>
+ </value>
+ ...
+ </allowedValues>
+</typeDescription></pre></div></div><div class="section" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="ugr.ref.xml.component_descriptor.aes"></a>2.4. Analysis Engine Descriptors</h2></div></div></div><p>Analysis Engine (AE) descriptors completely describe Analysis Engines. There
+ are two basic types of Analysis Engines – <span class="emphasis"><em>Primitive</em></span> and
+ <span class="emphasis"><em>Aggregate</em></span>. A <span class="emphasis"><em>Primitive</em></span> Analysis
+ Engine is a container for a single <span class="emphasis"><em>annotator</em></span>, whereas an
+ <span class="emphasis"><em>Aggregate</em></span> Analysis Engine is composed of a collection of other
+ Analysis Engines. (For more information on this and other terminology, see <a href="../overview_and_setup/overview_and_setup.html#ugr.ovv.conceptual" class="olink">Chapter 2, UIMA Conceptual Overview
+ </a> in <span class="olinkdocname">Overview & Setup</span>).</p><p>Both Primitive and Aggregate Analysis Engines have descriptors, and the two types
+ of descriptors have some similarities and some differences. <a href="#ugr.ref.xml.component_descriptor.aes.primitive" title="2.4.1. Primitive Analysis Engine Descriptors">Section 2.4.1, “Primitive Analysis Engine Descriptors”</a>
+ discusses Primitive Analysis Engine descriptors. <a href="#ugr.ref.xml.component_descriptor.aes.aggregate" title="2.4.2. Aggregate Analysis Engine Descriptors">Section 2.4.2, “Aggregate Analysis Engine Descriptors”</a> then
+ describes how Aggregate Analysis Engine descriptors are different.</p><div class="section" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="ugr.ref.xml.component_descriptor.aes.primitive"></a>2.4.1. Primitive Analysis Engine Descriptors</h3></div></div></div><div class="section" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="ugr.ref.xml.component_descriptor.aes.primitive.basic"></a>2.4.1.1. Basic Structure</h4></div></div></div><pre class="programlisting"><?xml version="1.0" encoding="UTF-8" ?>
+<analysisEngineDescription
+ xmlns="http://uima.apache.org/resourceSpecifier">
+ <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
+
+ <primitive>true</primitive>
+ <annotatorImplementationName> [String] </annotatorImplementationName>
+
+ <analysisEngineMetaData>
+ ...
+ </analysisEngineMetaData>
+
+ <externalResourceDependencies>
+ ...
+ </externalResourceDependencies>
+
+ <resourceManagerConfiguration>
+ ...
+ </resourceManagerConfiguration>
+
+</analysisEngineDescription></pre><p>The document begins with a standard XML header. The recommended root tag is
+ <code class="literal"><analysisEngineDescription></code>, although
+ <code class="literal"><taeDescription></code> is also allowed for backwards
+ compatibility.</p><p>Within the root element we declare that we are using the XML namespace
+ <code class="literal">http://uima.apache.org/resourceSpecifier</code>. This
+ namespace is required; without it, the descriptor cannot be
+ validated.</p><p>The first subelement,
+ <code class="literal"><frameworkImplementation></code>, currently must have
+ the value <code class="literal">org.apache.uima.java</code> or
+ <code class="literal">org.apache.uima.cpp</code>. In future versions, there may be
+ other framework implementations, or perhaps implementations produced by other
+ vendors.</p><p>The second subelement, <code class="literal"><primitive></code>, contains
+ the Boolean value <code class="literal">true</code>, indicating that this XML document
+ describes a <span class="emphasis"><em>Primitive</em></span> Analysis Engine.</p><p>The next subelement,
+ <code class="literal"><annotatorImplementationName></code>, is how the UIMA framework
+ determines which annotator class to use. This should contain a fully-qualified
+ Java class name for Java implementations, or the name of a .dll or .so file for C++
+ implementations.</p><p>The <code class="literal"><analysisEngineMetaData></code> object contains
+ descriptive information about the analysis engine and what it does. It is
+ described in <a href="#ugr.ref.xml.component_descriptor.aes.metadata" title="2.4.1.2. Analysis Engine MetaData">Section 2.4.1.2, “Analysis Engine MetaData”</a>.</p><p>The <code class="literal"><externalResourceDependencies></code> and
+ <code class="literal"><resourceManagerConfiguration></code> elements declare
+ the external resource files that the analysis engine relies
+ upon. They are optional and are described in <a href="#ugr.ref.xml.component_descriptor.aes.primitive.external_resource_dependencies" title="2.4.1.10. External Resource Dependencies">Section 2.4.1.10, “External Resource Dependencies”</a> and <a href="#ugr.ref.xml.component_descriptor.aes.primitive.resource_manager_configuration" title="2.4.1.11. Resource Manager Configuration">Section 2.4.1.11, “Resource Manager Configuration”</a>.</p></div><div class="section" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="ugr.ref.xml.component_descriptor.aes.metadata"></a>2.4.1.2. Analysis Engine MetaData</h4></div></div></div><pre class="programlisting"><analysisEngineMetaData>
+ <name> [String] </name>
+ <description>[String]</description>
+ <version>[String]</version>
+ <vendor>[String]</vendor>
+
+ <configurationParameters> ... </configurationParameters>
+
+ <configurationParameterSettings>
+ ...
+ </configurationParameterSettings>
+
+ <typeSystemDescription> ... </typeSystemDescription>
+
+ <typePriorities> ... </typePriorities>
+
+ <fsIndexCollection> ... </fsIndexCollection>
+
+ <capabilities> ... </capabilities>
+
+ <operationalProperties> ... </operationalProperties>
+
+</analysisEngineMetaData></pre><p>The <code class="literal">analysisEngineMetaData</code> element contains four
+ simple string fields – <code class="literal">name</code>,
+ <code class="literal">description</code>, <code class="literal">version</code>, and
+ <code class="literal">vendor</code>. Only the <code class="literal">name</code> field is
+ required, but providing values for the other fields is recommended. The
+ <code class="literal">name</code> field is just a descriptive name meant to be read by
+ users; it does not need to be unique across all Analysis Engines.</p><p>The other sub-elements –
+ <code class="literal">configurationParameters</code>,
+ <code class="literal">configurationParameterSettings</code>,
+ <code class="literal">typeSystemDescription</code>,
+ <code class="literal">typePriorities</code>, <code class="literal">fsIndexCollection</code>,
+ <code class="literal">capabilities</code>, and
+ <code class="literal">operationalProperties</code> – are described in the following
+ sections. The only one of these that is required is
+ <code class="literal">capabilities</code>; the others are optional.</p></div><div class="section" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="ugr.ref.xml.component_descriptor.aes.configuration_parameter_declaration"></a>2.4.1.3. Configuration Parameter Declaration</h4></div></div></div><p>Configuration Parameters are made available to annotator
+ implementations and applications by the following interfaces:
+ <code class="literal">AnnotatorContext</code> <sup>[<a name="d0e570" href="#ftn.d0e570">2</a>]</sup> (passed as an argument to the
+ initialize() method of a version 1 annotator),
+ <code class="literal">ConfigurableResource</code> (every Analysis Engine
+ implements this interface), and <code class="literal">UimaContext</code> (passed
+ as an argument to the initialize() method of a version 2 annotator; you can also
+ obtain it from any resource, including Analysis Engines, by calling
+ <code class="literal">getUimaContext()</code>).</p><p>Use AnnotatorContext within version 1 annotators and UimaContext for
+ version 2 annotators and outside of annotators (for instance, in CasConsumers,
+ or the containing application) to access configuration parameters.</p><p>Configuration parameters are set from the corresponding elements in the
+ XML descriptor for the application. If you need to programmatically change
+ parameter settings within an application, you can use methods in
+ ConfigurableResource; if you do this, you need to call reconfigure()
+ afterwards to have the UIMA framework notify all the contained analysis
+ components that the parameter configuration has changed (the analysis
+ engine's reinitialize() methods will be called). Note that in the current
+ implementation, only integrated deployment components have configuration
+ parameters passed to them; remote components obtain their parameters from
+ their remote startup environment. This will likely change in the
+ future.</p><p>There are two ways to specify the
+ <code class="literal"><configurationParameters></code> section – as a
+ list of configuration parameters or a list of groups. A list of parameters, which
+ are not part of any group, looks like this:
+
+
+ </p><pre class="programlisting"><configurationParameters>
+ <configurationParameter>
+ <name>[String]</name>
+ <description>[String]</description>
+ <type>String|Integer|Float|Boolean</type>
+ <multiValued>true|false</multiValued>
+ <mandatory>true|false</mandatory>
+ <overrides>
+ <parameter>[String]</parameter>
+ <parameter>[String]</parameter>
+ ...
+ </overrides>
+ </configurationParameter>
+ <configurationParameter>
+ ...
+ </configurationParameter>
+ ...
+</configurationParameters></pre><p>For each configuration parameter, the following are specified:</p><div class="itemizedlist"><ul type="disc"><li><p><span class="bold"><strong>name</strong></span>
+ – the name by which the annotator code refers to the parameter. All
+ parameters declared in an analysis engine descriptor must have distinct names
+ (required). The name is composed of normal Java identifier characters.</p></li><li><p><span class="bold"><strong>description</strong></span> – a
+ natural language description of the intent of the parameter
+ (optional)</p></li><li><p><span class="bold"><strong>type</strong></span> – the data
+ type of the parameter's value – must be one of
+ <code class="literal">String</code>, <code class="literal">Integer</code>,
+ <code class="literal">Float</code>, or <code class="literal">Boolean</code>
+ (required).</p></li><li><p><span class="bold"><strong>multiValued</strong></span> –
+ <code class="literal">true</code> if the parameter can take multiple values (an
+ array), <code class="literal">false</code> if the parameter takes only a single value
+ (optional, defaults to false).</p></li><li><p><span class="bold"><strong>mandatory</strong></span> –
+ <code class="literal">true</code> if a value must be provided for the parameter
+ (optional, defaults to false).</p></li><li><p><span class="bold"><strong>overrides</strong></span> – this
+ is used only in aggregate Analysis Engines, but is included here for
+ completeness. See <a href="#ugr.ref.xml.component_descriptor.aes.aggregate.configuration_parameter_overrides" title="2.4.2.4. Configuration Parameter Overrides">Section 2.4.2.4, “Configuration Parameter Overrides”</a>
+ for a discussion of configuration parameter overriding in aggregate
+ Analysis Engines. (optional) </p></li></ul></div><p>A list of groups looks like this:
+
+
+ </p><pre class="programlisting"><configurationParameters defaultGroup="[String]"
+ searchStrategy="none|default_fallback|language_fallback" >
+
+ <commonParameters>
+ [zero or more parameters]
+ </commonParameters>
+
+ <configurationGroup names="name1 name2 name3 ...">
+ [zero or more parameters]
+ </configurationGroup>
+
+ <configurationGroup names="name4 name5 ...">
+ [zero or more parameters]
+ </configurationGroup>
+
+ ...
+
+</configurationParameters></pre><p>Both the <code class="literal"><commonParameters></code> and
+ <code class="literal"><configurationGroup></code> elements contain zero or
+ more <code class="literal"><configurationParameter></code> elements, with
+ the same syntax described above.</p><p>The <code class="literal"><commonParameters></code> element declares
+ parameters that exist in all groups. Each
+ <code class="literal"><configurationGroup></code> element has a names
+ attribute, which contains a list of group names separated by whitespace (space
+ or tab characters). Names consist of any number of non-whitespace characters;
+ however the Component Descriptor Editor tool restricts this to be normal Java
+ identifiers, including the period (.) and the dash (-). One configuration group
+ will be created for each name, and all of the groups will contain the same set of
+ parameters.</p><p>The <code class="literal">defaultGroup</code> attribute specifies the name of the
+ group to be used in the case where an annotator does a lookup for a configuration
+ parameter without specifying a group name. It may also be used as a fallback if the
+ annotator specifies a group that does not exist – see below.</p><p>The <code class="literal">searchStrategy</code> attribute determines the action
+ to be taken when the context is queried for the value of a parameter belonging to a
+ particular configuration group, if that group does not exist or does not contain
+ a value for the requested parameter. There are currently three possible values:
+
+ </p><div class="itemizedlist"><ul type="disc"><li><p><span class="bold"><strong>none</strong></span>
+ – there is no fallback; return null if there is no value in the exact group
+ specified by the user.</p></li><li><p><span class="bold"><strong>default_fallback</strong></span>
+ – if there is no value found in the specified group, look in the default
+ group (as defined by the <code class="literal">defaultGroup</code> attribute).</p></li><li><p><span class="bold"><strong>language_fallback</strong></span>
+ – this setting allows for a specific use of configuration parameter
+ groups where the group names correspond to ISO language and country codes
+ (for an example, see below). The fallback sequence is:
+ <code class="literal"><lang>_<country>_<region> <span class="symbol">→</span>
+ <lang>_<country> <span class="symbol">→</span> <lang> <span class="symbol">→</span>
+ <default></code>.</p></li></ul></div><p>
+ </p><div class="section" lang="en"><div class="titlepage"><div><div><h5 class="title"><a name="ugr.ref.xml.component_descriptor.aes.configuration_parameter_declaration.example"></a>Example</h5></div></div></div><pre class="programlisting"><configurationParameters defaultGroup="en"
+ searchStrategy="language_fallback">
+
+ <commonParameters>
+ <configurationParameter>
+ <name>DictionaryFile</name>
+ <description>Location of dictionary for this
+ language</description>
+ <type>String</type>
+ <multiValued>false</multiValued>
+ <mandatory>false</mandatory>
+ </configurationParameter>
+ </commonParameters>
+
+ <configurationGroup names="en de en-US"/>
+
+ <configurationGroup names="zh">
+ <configurationParameter>
+ <name>DBC_Strategy</name>
+ <description>Strategy for dealing with double-byte
+ characters.</description>
+ <type>String</type>
+ <multiValued>false</multiValued>
+ <mandatory>false</mandatory>
+ </configurationParameter>
+ </configurationGroup>
+
+</configurationParameters></pre><p>In this example, we are declaring a <code class="literal">DictionaryFile</code>
+ parameter that can have a different value for each of the languages that our AE
+ supports
+ – English (general), German, U.S. English, and Chinese. For Chinese
+ only, we also declare a <code class="literal">DBC_Strategy</code>
+ parameter.</p><p>We are using the <code class="literal">language_fallback</code> search
+ strategy, so if an annotator requests the dictionary file for the
+ <code class="literal">en-GB</code> (British English) group, we will fall back to the
+ more general <code class="literal">en</code> group.</p><p>Since we have defined <code class="literal">en</code> as the default group, this
+ value will be returned if the context is queried for the
+ <code class="literal">DictionaryFile</code> parameter without specifying any
+ group name, or if a nonexistent group name is specified.</p></div></div><div class="section" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="ugr.ref.xml.component_descriptor.aes.configuration_parameter_settings"></a>2.4.1.4. Configuration Parameter Settings</h4></div></div></div><p>If no configuration groups were declared, the
+ <code class="literal"><configurationParameterSettings></code> element
+ looks like this:
+
+
+ </p><pre class="programlisting"><configurationParameterSettings>
+ <nameValuePair>
+ <name>[String]</name>
+ <value>
+ <string>[String]</string> |
+ <integer>[Integer]</integer> |
+ <float>[Float]</float> |
+ <boolean>true|false</boolean> |
+ <array> ... </array>
+ </value>
+ </nameValuePair>
+
+ <nameValuePair>
+ ...
+ </nameValuePair>
+ ...
+</configurationParameterSettings></pre><p>There are zero or more <code class="literal">nameValuePair</code> elements. Each
+ <code class="literal">nameValuePair</code> contains a name (which refers to one of the
+ configuration parameters) and a value for that parameter.</p><p>The <code class="literal">value</code> element contains an element that matches
+ the type of the parameter. For single-valued parameters, this is either
+ <code class="literal"><string></code>, <code class="literal"><integer></code>
+ , <code class="literal"><float></code>, or
+ <code class="literal"><boolean></code>. For multi-valued parameters, this is
+ an <code class="literal"><array></code> element, which then contains zero or
+ more instances of the appropriate type of primitive value, e.g.:
+
+
+ </p><pre class="programlisting"><array><string>One</string><string>Two</string></array></pre><p>If configuration groups were declared, then the
+ <code class="literal"><configurationParameterSettings></code> element
+ looks like this:
+
+
+ </p><pre class="programlisting"><configurationParameterSettings>
+
+ <settingsForGroup name="[String]">
+ [one or more <nameValuePair> elements]
+ </settingsForGroup>
+
+ <settingsForGroup name="[String]">
+ [one or more <nameValuePair> elements]
+ </settingsForGroup>
+
+...
+
+</configurationParameterSettings></pre><p>
+ where each <code class="literal"><settingsForGroup></code> element has a name
+ that matches one of the configuration groups declared under the
+ <code class="literal"><configurationParameters></code> element and contains
+ the parameter settings for that group.</p><div class="section" lang="en"><div class="titlepage"><div><div><h5 class="title"><a name="ugr.ref.xml.component_descriptor.aes.configuration_parameter_settings.example"></a>Example</h5></div></div></div><p>Here are the settings that correspond to the parameter declarations in
+ the previous example:
+
+
+ </p><pre class="programlisting"><configurationParameterSettings>
+
+ <settingsForGroup name="en">
+ <nameValuePair>
+ <name>DictionaryFile</name>
+ <value><string>resourcesEnglishdictionary.dat</string></value>
+ </nameValuePair>
+ </settingsForGroup>
+
+ <settingsForGroup name="en-US">
+ <nameValuePair>
+ <name>DictionaryFile</name>
+ <value><string>resourcesEnglish_USdictionary.dat</string></value>
+ </nameValuePair>
+ </settingsForGroup>
+
+ <settingsForGroup name="de">
+ <nameValuePair>
+ <name>DictionaryFile</name>
+ <value><string>resourcesDeutschdictionary.dat</string></value>
+ </nameValuePair>
+ </settingsForGroup>
+
+ <settingsForGroup name="zh">
+ <nameValuePair>
+ <name>DictionaryFile</name>
+ <value><string>resourcesChinesedictionary.dat</string></value>
+ </nameValuePair>
+
+ <nameValuePair>
+ <name>DBC_Strategy</name>
+ <value><string>default</string></value>
+ </nameValuePair>
+
+ </settingsForGroup>
+
+</configurationParameterSettings></pre></div></div><div class="section" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="ugr.ref.xml.component_descriptor.aes.type_system"></a>2.4.1.5. Type System Definition</h4></div></div></div><pre class="programlisting"><typeSystemDescription>
+
+ <name> [String] </name>
+ <description>[String]</description>
+ <version>[String]</version>
+ <vendor>[String]</vendor>
+
+ <imports>
+ <import ...>
+ ...
+ </imports>
+
+ <types>
+ <typeDescription>
+ ...
+ </typeDescription>
+
+ ...
+
+ </types>
+
+</typeSystemDescription></pre><p>A <code class="literal">typeSystemDescription</code> element defines a type
+ system for an Analysis Engine. The syntax for the element is described in <a href="#ugr.ref.xml.component_descriptor.type_system" title="2.3. Type System Descriptors">Section 2.3, “Type System Descriptors”</a>.</p><p>The recommended usage is to <code class="literal">import</code> an external type
+ system, using the import syntax described in <a href="#ugr.ref.xml.component_descriptor.imports" title="2.2. Imports">Section 2.2, “Imports”</a>
+ of this chapter. For example:
+
+
+ </p><pre class="programlisting"><typeSystemDescription>
+ <imports>
+ <import location="MySharedTypeSystem.xml"/>
+ </imports>
+</typeSystemDescription></pre><p>This allows several AEs to share a single type system definition. The file
+ <code class="literal">MySharedTypeSystem.xml</code> would then contain the full
+ type system information, including the <code class="literal">name</code>,
+ <code class="literal">description</code>, <code class="literal">vendor</code>,
+ <code class="literal">version</code>, and <code class="literal">types</code>.</p></div><div class="section" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="ugr.ref.xml.component_descriptor.aes.type_priority"></a>2.4.1.6. Type Priority Definition</h4></div></div></div><pre class="programlisting"><typePriorities>
+ <name> [String] </name>
+ <description>[String]</description>
+ <version>[String]</version>
+ <vendor>[String]</vendor>
+
+ <imports>
+ <import ...>
+ ...
+ </imports>
+
+ <priorityLists>
+ <priorityList>
+ <type>[TypeName]</type>
+ <type>[TypeName]</type>
+ ...
+ </priorityList>
+
+ ...
+
+ </priorityLists>
+</typePriorities></pre><p>The <code class="literal"><typePriorities></code> element contains
+ zero or more <code class="literal"><priorityList></code> elements; each
+ <code class="literal"><priorityList></code> contains zero or more types.
+ Like a type system, a type priorities definition may also declare a name,
+ description, version, and vendor, and may import other type priorities. See
+ <a href="#ugr.ref.xml.component_descriptor.imports" title="2.2. Imports">Section 2.2, “Imports”</a> for the import syntax.</p><p>Type priority is used when iterating over feature structures in the CAS.
+ For example, if the CAS contains a <code class="literal">Sentence</code> annotation
+ and a <code class="literal">Paragraph</code> annotation with the same span of text
+ (i.e. a one-sentence paragraph), which annotation should be returned first
+ by an iterator? Probably the Paragraph, since it is conceptually
+ “<span class="quote">bigger,</span>” but the framework does not know that and must be
+ explicitly told that the Paragraph annotation has priority over the Sentence
+ annotation, like this:
+
+
+ </p><pre class="programlisting"><typePriorities>
+ <priorityList>
+ <type>org.myorg.Paragraph</type>
+ <type>org.myorg.Sentence</type>
+ </priorityList>
+</typePriorities></pre><p>All of the <code class="literal"><priorityList></code> elements defined
+ in the descriptor (and in all component descriptors of an aggregate analysis
+ engine descriptor) are merged to produce a single priority list.</p><p>Subtypes of types specified here are also ordered, unless overridden by
+ another user-specified type ordering. For example, if you specify type A
+ comes before type B, then subtypes of A will come before subtypes of B, unless
+ there is an overriding specification which declares some subtype of B comes
+ before some subtype of A.</p><p>If there are inconsistencies between priority lists (type A declared
+ before type B in one priority list, and type B declared before type A in
+ another), the framework will throw an exception.</p><p>User-defined indexes may declare whether they use the type priority or
+ not; see the next section.</p></div><div class="section" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="ugr.ref.xml.component_descriptor.aes.index"></a>2.4.1.7. Index Definition</h4></div></div></div><pre class="programlisting"><fsIndexCollection>
+
+ <name>[String]</name>
+ <description>[String]</description>
+ <version>[String]</version>
+ <vendor>[String]</vendor>
+
+ <imports>
+ <import ...>
+ ...
+ </imports>
+
+ <fsIndexes>
+
+ <fsIndexDescription>
+ ...
+ </fsIndexDescription>
+
+ <fsIndexDescription>
+ ...
+ </fsIndexDescription>
+
+ </fsIndexes>
+
+</fsIndexCollection></pre><p>The <code class="literal">fsIndexCollection</code> element declares <span class="emphasis"><em>Feature Structure
+ Indexes</em></span>, each of which defines an index that holds feature structures of a given type.
+ Information in the CAS is always accessed through an index. There is a built-in default annotation
+ index declared which can be used to access instances of type
+ <code class="literal">uima.tcas.Annotation</code> (or its subtypes), sorted based on their
+ <code class="literal">begin</code> and <code class="literal">end</code> features. For all other types, there is a
+ default, unsorted (bag) index. If there is a need for a specialized index it must be declared in this
+ element of the descriptor. See <a href="references.html#ugr.ref.cas.indexes_and_iterators" class="olink">Section 4.7, “Indexes and Iterators”</a> for details on FS indexes.</p><p>Like type systems and type priorities, an
+ <code class="literal">fsIndexCollection</code> can declare a
+ <code class="literal">name</code>, <code class="literal">description</code>,
+ <code class="literal">vendor</code>, and <code class="literal">version</code>, and may
+ import other <code class="literal">fsIndexCollection</code>s. The import syntax is
+ described in <a href="#ugr.ref.xml.component_descriptor.imports" title="2.2. Imports">Section 2.2, “Imports”</a>.</p><p>An <code class="literal">fsIndexCollection</code> may also define zero or more
+ <code class="literal">fsIndexDescription</code> elements, each of which defines a
+ single index. Each <code class="literal">fsIndexDescription</code> has the form:
+
+
+ </p><pre class="programlisting"><fsIndexDescription>
+
+ <label>[String]</label>
+ <typeName>[TypeName]</typeName>
+ <kind>sorted|bag|set</kind>
+
+ <keys>
+
+ <fsIndexKey>
+ <featureName>[Name]</featureName>
+ <comparator>standard|reverse</comparator>
+ </fsIndexKey>
+
+ <fsIndexKey>
+ <typePriority/>
+ </fsIndexKey>
+
+ ...
+
+ </keys>
+</fsIndexDescription></pre><p>The <code class="literal">label</code> element defines the name by which
+ applications and annotators refer to this index. The
+ <code class="literal">typeName</code> element contains the name of the type that will
+ be contained in this index. This must match one of the type names defined in the
+ <code class="literal"><typeSystemDescription></code>.</p><p>There are three possible values for the
+ <code class="literal"><kind></code> of index. Sorted indexes enforce an
+ ordering of feature structures, and may contain duplicates. Bag indexes do
+ not enforce ordering, and also may contain duplicates. Set indexes do not
+ enforce ordering and cannot contain duplicates. If the <code class="literal"><kind></code> element is omitted, it will default to
+ sorted, which is the most common type of index.</p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>There is usually no need to explicitly declare a Bag index in your descriptor.
+ As of UIMA v2.1, if you do not declare any index for a type (or any of its
+ supertypes), a Bag index will be automatically created.</p></div><p>An index may define one or more <span class="emphasis"><em>keys</em></span>. These keys
+ determine the sort order of the feature structures within a sorted index, and
+ determine equality for set indexes. Bag indexes do not use keys. Keys are
+ ordered by precedence – the first key is evaluated first, and
+ subsequent keys are evaluated only if necessary.</p><p>Each key is represented by an <code class="literal">fsIndexKey</code> element.
+ Most <code class="literal">fsIndexKey</code> elements contain a
+ <code class="literal">featureName</code> and a <code class="literal">comparator</code>.
+ The <code class="literal">featureName</code> must match the name of one of the
+ features for the type specified in the
+ <code class="literal"><typeName></code> element for this index. The
+ comparator defines how the features will be compared – a value of
+ <code class="literal">standard</code> means that features will be compared using the
+ standard comparison for their data type (e.g. for numerical types, smaller
+ values precede larger values, and for string types, Unicode string
+ comparison is performed). A value of <code class="literal">reverse</code> means that
+ features will be compared using the reverse of the standard comparison (e.g.
+ for numerical types, larger values precede smaller values, etc.). For Set
+ indexes, the comparator direction is ignored – the keys are only used
+ for the equality testing.</p><p>Each key used in comparisons must refer to a feature whose range type is
+ String, Float, or Integer.</p><p>There is a second type of key, one which contains only the
+ <code class="literal"><typePriority/></code> element. When this key is used, it
+ indicates that Feature Structures will be compared using the type priorities
+ declared in the <code class="literal"><typePriorities></code> section of the
+ descriptor.</p></div><div class="section" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="ugr.ref.xml.component_descriptor.aes.capabilities"></a>2.4.1.8. Capabilities</h4></div></div></div><pre class="programlisting"><capabilities>
+ <capability>
+
+ <inputs>
+ <type allAnnotatorFeatures="true|false">[TypeName]</type>
+ ...
+ <feature>[TypeName]:[Name]</feature>
+ ...
+ </inputs>
+
+ <outputs>
+ <type allAnnotatorFeatures="true|false">[TypeName]</type>
+ ...
+ <feature>[TypeName]:[Name]</feature>
+ ...
+ </outputs>
+
+ <languagesSupported>
+ <language>[ISO Language ID]</language>
+ ...
+ </languagesSupported>
+
+ <inputSofas>
+ <sofaName>[name]</sofaName>
+ ...
+ </inputSofas>
+
+ <outputSofas>
+ <sofaName>[name]</sofaName>
+ ...
+ </outputSofas>
+ </capability>
+
+ <capability>
+ ...
+ </capability>
+
+ ...
+
+</capabilities></pre><p>The capabilities definition is used by the UIMA Framework in several
+ ways, including setting up the Results Specification for process calls,
+ routing control for aggregates based on language, and as part of the Sofa
+ mapping function.</p><p>The <code class="literal">capabilities</code> element contains one or more
+ <code class="literal">capability</code> elements. From Version 2 onwards, only one
+ capability set should be used (multiple sets will continue to work for a while,
+ but they are not supported in a logically consistent way).
+ </p><p>Each <code class="literal">capability</code> contains
+ <code class="literal">inputs</code>, <code class="literal">outputs</code>,
+ <code class="literal">languagesSupported</code>, <code class="literal">inputSofas</code>, and <code class="literal">outputSofas</code>.
+ The <code class="literal">inputs</code> and <code class="literal">outputs</code> elements are required (though they may be empty);
+ <code class="literal"><languagesSupported></code>, <code class="literal"><inputSofas></code>,
+ and <code class="literal"><outputSofas></code> are optional.</p><p>Both inputs and outputs may contain a mixture of type and feature
+ elements.</p><p><code class="literal"><type...></code> elements contain the name of one
+ of the types defined in the type system or one of the built-in types. Declaring a
+ type as an input means that this component expects instances of this type to be
+ in the CAS when it receives it to process. Declaring a type as an output means
+ that this component creates new instances of this type in the CAS.</p><p>There is an optional attribute
+ <code class="literal">allAnnotatorFeatures</code>, which defaults to false if
+ omitted. The Component Descriptor Editor tool defaults this to true when a new
+ type is added to the list of inputs and/or outputs. When this attribute is true,
+ it specifies that all of the type's features are also declared as input or
+ output. Otherwise, the features that are required as inputs or populated as
+ outputs must be explicitly specified in feature elements.</p><p><code class="literal"><feature...></code> elements contain the
+ “<span class="quote">fully-qualified</span>” feature name, which is the type name
+ followed by a colon, followed by the feature name, e.g.
+ <code class="literal">org.myorg.TokenAnnotation:lemma</code>.
+ <code class="literal"><feature...></code> elements in the
+ <code class="literal"><inputs></code> section must also have a corresponding
+ type declared as an input. In output sections, this is not required. If the type
+ is not specified as an output, but a feature for that type is, this means that
+ existing instances of the type have the values of the specified features
+ updated. Any type mentioned in a <code class="literal"><feature></code>
+ element must be specified as an input, an output, or both.</p><p><code class="literal">language</code> elements contain one of the ISO language
+ identifiers, such as <code class="literal">en</code> for English, or
+ <code class="literal">en-US</code> for the United States dialect of English.</p><p>The list of language codes can be found here: <a href="http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt" target="_top">http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt</a>
+ and the country codes here:
+ <a href="http://www.chemie.fu-berlin.de/diverse/doc/ISO_3166.html" target="_top">http://www.chemie.fu-berlin.de/diverse/doc/ISO_3166.html</a>
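+ </p><p>As an illustration of the elements described so far, the following sketch declares a single capability set. The type and feature names (<code class="literal">org.myorg.SentenceAnnotation</code>, <code class="literal">org.myorg.TokenAnnotation</code>, <code class="literal">lemma</code>) are hypothetical and would need to be defined in the component's type system:</p><pre class="programlisting"><capabilities>
+   <capability>
+     <!-- hypothetical type names; replace with types from your type system -->
+     <inputs>
+       <type allAnnotatorFeatures="true">org.myorg.SentenceAnnotation</type>
+     </inputs>
+     <outputs>
+       <type>org.myorg.TokenAnnotation</type>
+       <feature>org.myorg.TokenAnnotation:lemma</feature>
+     </outputs>
+     <languagesSupported>
+       <language>en</language>
+       <language>en-US</language>
+     </languagesSupported>
+   </capability>
+</capabilities></pre><p>In this sketch, every feature of the input type is declared as an input, while only the <code class="literal">lemma</code> feature of the output type is declared as populated.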
+ </p><p><code class="literal"><inputSofas></code> and
+ <code class="literal"><outputSofas></code> declare sofa names used by this
+ component. All Sofa names must be unique within a particular capability set. A
+ Sofa name must be an input or an output, and cannot be both. It is an error to have a
+ Sofa name declared as an input in one capability set, and also have it declared
+ as an output in another capability set.</p><p>A <code class="literal"><sofaName></code> is written as a simple
+ Java-style identifier, without any periods in the name, except that it may be
+ written to end in “<span class="quote"><code class="literal">.*</code></span>”. If written in this
+ manner, it specifies a set of Sofa names, all of which start with the base name
+ (the part before the .*) followed by a period and then an arbitrary Java
+ identifier (without periods). This form is used to specify in the descriptor
+ that the component could generate an arbitrary number of Sofas, the exact
+ names and numbers of which are unknown before the component is run.</p></div><div class="section" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="ugr.ref.xml.component_descriptor.aes.operational_properties"></a>2.4.1.9. OperationalProperties</h4></div></div></div><p>Components can declare operational properties that can be
+ useful in deployment. The following are available:</p><pre class="programlisting"><operationalProperties>
+ <modifiesCas> true|false </modifiesCas>
+ <multipleDeploymentAllowed> true|false </multipleDeploymentAllowed>
+ <outputsNewCASes> true|false </outputsNewCASes>
+</operationalProperties></pre><p><code class="literal">modifiesCas</code>, if false, indicates that this
+ component does not modify the CAS. If it is not specified, the default value is
+ true except for CAS Consumer components.</p><p><code class="literal">multipleDeploymentAllowed</code>, if true, allows the
+ component to be deployed multiple times to increase performance through
+ scale-out techniques. If it is not specified, the default value is true,
+ except for CAS Consumer and Collection Reader components.</p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>If you wrap one or more CAS Consumers inside an aggregate as the only
+ components, you must explicitly specify in the aggregate the
+ <code class="literal">multipleDeploymentAllowed</code> property as false (assuming the CAS Consumer
+ components take the default here); otherwise the framework will complain about inconsistent
+ settings for these.</p></div><p><code class="literal">outputsNewCASes</code>, if true, allows the component to
+ create new CASes during processing, for example to break a large artifact into
+ smaller pieces. See <a href="../tutorials_and_users_guides/tutorials_and_users_guides.html#ugr.tug.cm" class="olink">Chapter 7, CAS Multiplier Developer's Guide
+ </a> in <span class="olinkdocname">UIMA Tutorial and Developers' Guides</span> for details.</p></div><div class="section" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="ugr.ref.xml.component_descriptor.aes.primitive.external_resource_dependencies"></a>2.4.1.10. External Resource Dependencies</h4></div></div></div><pre class="programlisting"><externalResourceDependencies>
+ <externalResourceDependency>
+ <key>[String]</key>
+ <description>[String] </description>
+ <interfaceName>[String]</interfaceName>
+ <optional>true|false</optional>
+ </externalResourceDependency>
+
+ <externalResourceDependency>
+ ...
+ </externalResourceDependency>
+
+ ...
+
+</externalResourceDependencies></pre><p>A primitive annotator may declare zero or more
+ <code class="literal"><externalResourceDependency></code> elements. Each
+ dependency has the following elements:
+
+ </p><div class="itemizedlist"><ul type="disc"><li><p><code class="literal">key</code> – the
+ string by which the annotator code will attempt to access the resource. Must
+ be unique within this annotator.</p></li><li><p><code class="literal">description</code> – a textual
+ description of the dependency</p></li><li><p><code class="literal">interfaceName</code> – the
+ fully-qualified name of the Java interface through which the annotator
+ will access the data. This is optional. If not specified, the annotator
+ can only get an InputStream to the data.</p></li><li><p><code class="literal">optional</code> – whether the
+ resource is optional. If false, an exception will be thrown if no resource
+ is assigned to satisfy this dependency. Defaults to false. </p></li></ul></div></div><div class="section" lang="en"><div class="titlepage"><div><div><h4 class="title"><a name="ugr.ref.xml.component_descriptor.aes.primitive.resource_manager_configuration"></a>2.4.1.11. Resource Manager Configuration</h4></div></div></div><pre class="programlisting"><resourceManagerConfiguration>
+
+ <name>[String]</name>
+ <description>[String]</description>
+ <version>[String]</version>
+ <vendor>[String]</vendor>
+
+ <imports>
+ <import ...>
+ ...
+ </imports>
+
+ <externalResources>
+
+ <externalResource>
+ <name>[String]</name>
+ <description>[String]</description>
+ <fileResourceSpecifier>
+ <fileUrl>[URL]</fileUrl>
+ </fileResourceSpecifier>
+ <implementationName>[String]</implementationName>
+ </externalResource>
+ ...
+ </externalResources>
+
+ <externalResourceBindings>
+ <externalResourceBinding>
+ <key>[String]</key>
+ <resourceName>[String]</resourceName>
+ </externalResourceBinding>
+ ...
+ </externalResourceBindings>
+
+</resourceManagerConfiguration></pre><p>This element declares external resources and binds them to
+ annotators' external resource dependencies.</p><p>The <code class="literal">resourceManagerConfiguration</code> element may
+ optionally contain an <code class="literal">import</code>, which allows resource
+ definitions to be stored in a separate (shareable) file. See <a href="#ugr.ref.xml.component_descriptor.imports" title="2.2. Imports">Section 2.2, “Imports”</a> for details.</p><p>The <code class="literal">externalResources</code> element contains zero or
+ more <code class="literal">externalResource</code> elements, each of which
+ consists of:
+
+ </p><div class="itemizedlist"><ul type="disc"><li><p><code class="literal">name</code> – the
+ name of the resource. This name is referred to in the bindings (see below).
+ Resource names need to be unique within any Aggregate Analysis Engine or
+ Collection Processing Engine, so the Java-like
+ <code class="literal">org.myorg.mycomponent.MyResource</code> syntax is
+ recommended.</p></li><li><p><code class="literal">description</code> – English
+ description of the resource</p></li><li><p>Resource Specifier –
+ Declares the location of the resource. There are different
+ possibilities for how this is done (see below).</p></li><li><p><code class="literal">implementationName</code> – The
+ fully-qualified name of the Java class that will be instantiated from the
+ resource data. This is optional; if not specified, the resource will be
+ accessible as an input stream to the raw data. If specified, the Java class
+ must implement the <code class="literal">interfaceName</code> that is
+ specified in the External Resource Dependency to which it is bound.
+ </p></li></ul></div><p>One possibility for the resource specifier is a
+ <code class="literal"><fileResourceSpecifier></code>, as shown above. This
+ simply declares a URL to the resource data. This support is built on the Java
+ class URL and its method URL.openStream(); it supports the protocols
+ “<span class="quote">file</span>”, “<span class="quote">http</span>” and “<span class="quote">jar</span>” (for
+ referring to files in jars) by default, and you can plug in handlers for other
+ protocols. The URL has to start with file: (or some other protocol). It is
+ relative to either the classpath or the “<span class="quote">data path</span>”. The data
+ path works like the classpath but can be set programmatically via
+ <code class="literal">ResourceManager.setDataPath()</code>. Setting the Java
+ System property <code class="literal">uima.datapath</code> also works.</p><p><code class="literal">file:com/apache.d.txt</code> is a relative path;
+ relative paths for resources are resolved using the classpath and/or the
+ datapath. For the file protocol, URLs starting with file:/ or file:/// are
+ absolute. Note that <code class="literal">file://org/apache/d.txt</code> is NOT an
+ absolute path starting with “<span class="quote">org</span>”. The “<span class="quote">//</span>”
+ indicates that what follows is a host name. Therefore if you try to use this URL
+ it will complain that it can't connect to the host “<span class="quote">org</span>”.
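+ </p><p>The distinction between relative and absolute URLs can be sketched with two resource declarations. The resource names and file locations used here are hypothetical:</p><pre class="programlisting"><externalResources>
+   <!-- relative URL: resolved using the classpath and/or the data path -->
+   <externalResource>
+     <name>org.myorg.mycomponent.RelativeDictionary</name>
+     <description>Dictionary located via the classpath or data path</description>
+     <fileResourceSpecifier>
+       <fileUrl>file:org/myorg/dictionary.txt</fileUrl>
+     </fileResourceSpecifier>
+   </externalResource>
+   <!-- absolute URL: a single slash (or three slashes) follows "file:" -->
+   <externalResource>
+     <name>org.myorg.mycomponent.AbsoluteDictionary</name>
+     <description>Dictionary at a fixed file system location</description>
+     <fileResourceSpecifier>
+       <fileUrl>file:/opt/myorg/data/dictionary.txt</fileUrl>
+     </fileResourceSpecifier>
+   </externalResource>
+</externalResources></pre><p>Neither declaration gives an <code class="literal">implementationName</code>, so both resources would be accessible only as input streams to the raw data.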
+ </p><p>Another option is a
+ <code class="literal"><fileLanguageResourceSpecifier></code>, which is
+ intended to support resources, such as dictionaries, that depend on the
+ language of the document being processed. Instead of a single URL, a prefix and
+ suffix are specified, like this:
+
+
+ </p><pre class="programlisting"><fileLanguageResourceSpecifier>
+ <fileUrlPrefix>file:FileLanguageResource_implTest_data_</fileUrlPrefix>
+ <fileUrlSuffix>.dat</fileUrlSuffix>
+</fileLanguageResourceSpecifier></pre><p>The URL of the actual resource is then formed by concatenating the prefix,
+ the language of the document (as an ISO language code, e.g.
+ <code class="literal">en</code> or <code class="literal">en-US</code>
+ – see <a href="#ugr.ref.xml.component_descriptor.aes.capabilities" title="2.4.1.8. Capabilities">Section 2.4.1.8, “Capabilities”</a> for more
+ information), and the suffix.</p><p>The <code class="literal">externalResourceBindings</code> element declares
+ which resources are bound to which dependencies. Each
+ <code class="literal">externalResourceBinding</code> consists of:
+
+ </p><div class="itemizedlist"><ul type="disc"><li><p><code class="literal">key</code> –
+ identifies the dependency. For a binding declared in a primitive analysis
+ engine descriptor, this must match the value of the
+ <code class="literal">key</code> element of one of the
+ <code class="literal">externalResourceDependency</code> elements. Bindings
+ may also be specified in aggregate analysis engine descriptors, in which
+ case a compound key is used
+ – see <a href="#ugr.ref.xml.component_descriptor.aes.aggregate.external_resource_bindings" title="2.4.2.5. External Resource Bindings">Section 2.4.2.5, “External Resource Bindings”</a>
+ .</p></li><li><p><code class="literal">resourceName</code> – the name of
+ the resource satisfying the dependency. This must match the value of the
+ <code class="literal">name</code> element of one of the
+ <code class="literal">externalResource</code> declarations. </p></li></ul></div><p>A given resource dependency may only be bound to one external resource;
+ one external resource may be bound to many dependencies – to allow
[... 2795 lines stripped ...]