Posted to user@uima.apache.org by "Kothuvatiparambil, Viju" <vi...@bankofamerica.com> on 2014/04/20 22:10:09 UTC

SemClass feature not working in ConceptMapper add-on

Hi All, 

I am trying to use the ConceptMapper add-on to assign a SemClass feature to tokens. I am getting the following error:

SEVERE: ConceptMapper SEVERE: FeatureList[1] 'SemClass' specified, but does not exist for type: org.apache.uima.conceptMapper.DictTerm

I configured FeatureList and AttributeList in ConceptMapperOffsetTokenizer.xml as given below:

			<nameValuePair>
				<name>AttributeList</name>
				<value>
					<array>
						<string>canonical</string>
						<string>SemClass</string>
					</array>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>FeatureList</name>
				<value>
					<array>
						<string>DictCanon</string>
						<string>SemClass</string>
					</array>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>ResultingAnnotationName</name>
				<value>
					<string>
						org.apache.uima.conceptMapper.DictTerm
					</string>
				</value>
			</nameValuePair>

Here is my simplified dict.xml file:

<synonym>
  <token canonical="grocery" SemClass="category">
     <variant base="grocery"/>
  </token>
</synonym>

I debugged the problem and found that it is looking for the SemClass feature in resultAnnotationType, which is DictTerm. However, SemClass is not a feature of the DictTerm type.

      resultEnclosingSpan = resultAnnotationType.getFeatureByBaseName(resultEnclosingSpanName);
      if (resultEnclosingSpan == null) {
        logger.logError(PARAM_ENCLOSINGSPAN + " '" + resultEnclosingSpanName
                + "' specified, but does not exist for type: " + resultAnnotationType.getName());
        throw new AnnotatorInitializationException();
      }

I just started using UIMA, so I don't understand the complete architecture yet. Could any of you point me in the right direction? Thanks a lot in advance.

Viju Kothuvatiparambil

Here is the complete ConceptMapperOffsetTokenizer.xml file contents:

<taeDescription xmlns="http://uima.apache.org/resourceSpecifier">
	<frameworkImplementation>org.apache.uima.java</frameworkImplementation>
	<primitive>true</primitive>
	<annotatorImplementationName>org.apache.uima.conceptMapper.ConceptMapper</annotatorImplementationName>
	<analysisEngineMetaData>
		<name>ConceptMapper</name>
		<description></description>
		<version>1</version>
		<vendor></vendor>
		<configurationParameters>
			<configurationParameter>
				<name>caseMatch</name>
				<description>
					this parameter specifies the case folding mode:
					ignoreall - fold everything to lowercase for
					matching insensitive - fold only tokens with initial
					caps to lowercase digitfold - fold all (and only)
					tokens with a digit sensitive - perform no case
					folding
				</description>
				<type>String</type>
				<multiValued>false</multiValued>
				<mandatory>true</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>Stemmer</name>
				<description>
					Name of stemmer class to use before matching. MUST
					have a zero-parameter constructor! If not specified,
					no stemming will be performed.
				</description>
				<type>String</type>
				<multiValued>false</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>ResultingAnnotationName</name>
				<description>
					Name of the annotation type created by this TAE,
					must match the typeSystemDescription entry
				</description>
				<type>String</type>
				<multiValued>false</multiValued>
				<mandatory>true</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>ResultingEnclosingSpanName</name>
				<description>
					Name of the feature in the resultingAnnotation to
					contain the span that encloses it (i.e. its
					sentence)
				</description>
				<type>String</type>
				<multiValued>false</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>AttributeList</name>
				<description>
					List of attribute names for XML dictionary entry
					record - must correspond to FeatureList
				</description>
				<type>String</type>
				<multiValued>true</multiValued>
				<mandatory>true</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>FeatureList</name>
				<description>
					List of feature names for CAS annotation - must
					correspond to AttributeList
				</description>
				<type>String</type>
				<multiValued>true</multiValued>
				<mandatory>true</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>TokenAnnotation</name>
				<description></description>
				<type>String</type>
				<multiValued>false</multiValued>
				<mandatory>true</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>TokenClassFeatureName</name>
				<description>
					Name of feature used when doing lookups against
					IncludedTokenClasses and ExcludedTokenClasses
				</description>
				<type>String</type>
				<multiValued>false</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>TokenTextFeatureName</name>
				<description></description>
				<type>String</type>
				<multiValued>false</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>SpanFeatureStructure</name>
				<description>
					Type of annotation which corresponds to spans of
					data for processing (e.g. a Sentence)
				</description>
				<type>String</type>
				<multiValued>false</multiValued>
				<mandatory>true</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>OrderIndependentLookup</name>
				<description>
					True if should ignore element order during lookup
					(i.e., "top box" would equal "box top"). Default is
					False.
				</description>
				<type>Boolean</type>
				<multiValued>false</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>TokenTypeFeatureName</name>
				<description>
					Name of feature used when doing lookups against
					IncludedTokenTypes and ExcludedTokenTypes
				</description>
				<type>String</type>
				<multiValued>false</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>IncludedTokenTypes</name>
				<description>
					Type of tokens to include in lookups (if not
					supplied, then all types are included except those
					specifically mentioned in ExcludedTokenTypes)
				</description>
				<type>Integer</type>
				<multiValued>true</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>ExcludedTokenTypes</name>
				<description></description>
				<type>Integer</type>
				<multiValued>true</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>ExcludedTokenClasses</name>
				<description>
					Class of tokens to exclude from lookups (if not
					supplied, then all classes are excluded except those
					specifically mentioned in IncludedTokenClasses,
					unless IncludedTokenClasses is not supplied, in
					which case none are excluded)
				</description>
				<type>String</type>
				<multiValued>true</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>IncludedTokenClasses</name>
				<description>
					Class of tokens to include in lookups (if not
					supplied, then all classes are included except those
					specifically mentioned in ExcludedTokenClasses)
				</description>
				<type>String</type>
				<multiValued>true</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>TokenClassWriteBackFeatureNames</name>
				<description>
					names of features that should be written back to a
					token, such as a POS tag
				</description>
				<type>String</type>
				<multiValued>true</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>ResultingAnnotationMatchedTextFeature</name>
				<type>String</type>
				<multiValued>false</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>PrintDictionary</name>
				<type>Boolean</type>
				<multiValued>false</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>SearchStrategy</name>
				<description>
					Can be either "SkipAnyMatch",
					"SkipAnyMatchAllowOverlap" or
					"ContiguousMatch"&#13;&#13;ContiguousMatch: longest
					match of contiguous tokens within enclosing
					span(taking into account included/excluded items).
					DEFAULT strategy &#13;SkipAnyMatch: longest match of
					not-necessarily contiguous tokens within enclosing
					span (taking into account included/excluded items).
					Subsequent lookups begin in span after complete
					match. IMPLIES order-independent lookup
					&#13;SkipAnyMatchAllowOverlap: longest match of
					not-necessarily contiguous tokens within enclosing
					span (taking into account included/excluded items).
					Subsequent lookups begin in span after next token.
					IMPLIES order-independent lookup
				</description>
				<type>String</type>
				<multiValued>false</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>StopWords</name>
				<type>String</type>
				<multiValued>true</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>FindAllMatches</name>
				<type>Boolean</type>
				<multiValued>false</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>MatchedTokensFeatureName</name>
				<type>String</type>
				<multiValued>false</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>ReplaceCommaWithAND</name>
				<type>Boolean</type>
				<multiValued>false</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>TokenizerDescriptorPath</name>
				<type>String</type>
				<multiValued>false</multiValued>
				<mandatory>true</mandatory>
			</configurationParameter>
			<configurationParameter>
				<name>LanguageID</name>
				<type>String</type>
				<multiValued>false</multiValued>
				<mandatory>false</mandatory>
			</configurationParameter>
		</configurationParameters>
		<configurationParameterSettings>
			<nameValuePair>
				<name>caseMatch</name>
				<value>
					<string>ignoreall</string>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>AttributeList</name>
				<value>
					<array>
						<string>canonical</string>
						<string>SemClass</string>
					</array>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>FeatureList</name>
				<value>
					<array>
						<string>DictCanon</string>
						<string>SemClass</string>
					</array>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>TokenAnnotation</name>
				<value>
					<string>uima.tt.TokenAnnotation</string>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>ResultingAnnotationName</name>
				<value>
					<string>
						org.apache.uima.conceptMapper.DictTerm
					</string>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>SpanFeatureStructure</name>
				<value>
					<string>uima.tcas.DocumentAnnotation</string>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>OrderIndependentLookup</name>
				<value>
					<boolean>false</boolean>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>TokenClassWriteBackFeatureNames</name>
				<value>
					<array />
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>IncludedTokenClasses</name>
				<value>
					<array />
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>PrintDictionary</name>
				<value>
					<boolean>false</boolean>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>FindAllMatches</name>
				<value>
					<boolean>false</boolean>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>StopWords</name>
				<value>
					<array />
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>ReplaceCommaWithAND</name>
				<value>
					<boolean>false</boolean>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>TokenizerDescriptorPath</name>
				<value>
					<string>
						/search/uima/conf/descriptors/OffsetTokenizer.xml
					</string>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>ResultingEnclosingSpanName</name>
				<value>
					<string>enclosingSpan</string>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>MatchedTokensFeatureName</name>
				<value>
					<string>matchedTokens</string>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>ResultingAnnotationMatchedTextFeature</name>
				<value>
					<string>matchedText</string>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>SearchStrategy</name>
				<value>
					<string>ContiguousMatch</string>
				</value>
			</nameValuePair>
			<nameValuePair>
				<name>LanguageID</name>
				<value>
					<string>en</string>
				</value>
			</nameValuePair>
		</configurationParameterSettings>
		<typeSystemDescription>
			<imports>
				<import name="org.apache.uima.conceptMapper.DictTerm" />
				<import
					name="org.apache.uima.conceptMapper.support.tokenizer.TokenAnnotation" />
			</imports>
			<types>
				<typeDescription>
					<name>uima.tt.TokenAnnotation</name>
					<description></description>
					<supertypeName>uima.tcas.Annotation</supertypeName>
					<features>
						<featureDescription>
							<name>SemClass</name>
							<description>
								semantic class of token
							</description>
							<rangeTypeName>
								uima.cas.String
							</rangeTypeName>
						</featureDescription>
						<featureDescription>
							<name>POS</name>
							<description>
								Part of speech of the term of which this
								token is a part
							</description>
							<rangeTypeName>
								uima.cas.String
							</rangeTypeName>
						</featureDescription>
						<featureDescription>
							<name>frost_TokenType</name>
							<description></description>
							<rangeTypeName>
								uima.cas.Integer
							</rangeTypeName>
						</featureDescription>
					</features>
				</typeDescription>
			</types>
		</typeSystemDescription>
		<typePriorities>
			<priorityList>
				<!-- <type>uima.tt.SentenceAnnotation</type> -->
				<type>uima.tt.TokenAnnotation</type>
			</priorityList>
		</typePriorities>
		<fsIndexCollection />
		<capabilities>
			<capability>
				<inputs>
					<type allAnnotatorFeatures="true">
						uima.tt.TokenAnnotation
					</type>
					<!-- <type allAnnotatorFeatures="true">uima.tt.SentenceAnnotation</type>
						<type allAnnotatorFeatures="true">uima.tt.ParagraphAnnotation</type> -->
				</inputs>
				<outputs>
					<type allAnnotatorFeatures="true">
						org.apache.uima.conceptMapper.DictTerm
					</type>
					<type allAnnotatorFeatures="true">
						uima.tt.TokenAnnotation
					</type>
					<type allAnnotatorFeatures="true">
						org.apache.uima.conceptMapper.support.tokenizer.TokenAnnotation
					</type>
					<type allAnnotatorFeatures="true">
						uima.tcas.DocumentAnnotation
					</type>
				</outputs>
				<languagesSupported />
			</capability>
		</capabilities>
		<operationalProperties>
			<modifiesCas>true</modifiesCas>
			<multipleDeploymentAllowed>true</multipleDeploymentAllowed>
			<outputsNewCASes>false</outputsNewCASes>
		</operationalProperties>
	</analysisEngineMetaData>
	<externalResourceDependencies>
		<externalResourceDependency>
			<key>DictionaryFile</key>
			<description>dictionary file loader.</description>
			<interfaceName>
				org.apache.uima.conceptMapper.support.dictionaryResource.DictionaryResource
			</interfaceName>
			<optional>false</optional>
		</externalResourceDependency>
	</externalResourceDependencies>
	<resourceManagerConfiguration>
		<externalResources>
			<externalResource>
				<name>DictionaryFileName</name>
				<description>
					A file containing the dictionary. Modify this URL to
					use a different dictionary.
				</description>
				<fileResourceSpecifier>
					<fileUrl>file:/search/uima/conf/testDict.xml</fileUrl>
				</fileResourceSpecifier>
				<implementationName>
					org.apache.uima.conceptMapper.support.dictionaryResource.DictionaryResource_impl
				</implementationName>
			</externalResource>
		</externalResources>
		<externalResourceBindings>
			<externalResourceBinding>
				<key>DictionaryFile</key>
				<resourceName>DictionaryFileName</resourceName>
			</externalResourceBinding>
		</externalResourceBindings>
	</resourceManagerConfiguration>
</taeDescription>

Re: SemClass feature not working in ConceptMapper add-on

Posted by Michael Tanenblatt <sl...@park-slope.net>.

SemClass doesn’t need to be part of the token annotation, but it can be. Let me explain: a DictTerm annotation indicates a match found in the dictionary and may cover multiple token annotations, so a single token annotation is not sufficient to represent the match. However, ConceptMapper can optionally write values back to the individual tokens of a match. So, if a match is found for a dictionary entry that has a SemClass of “X”, ConceptMapper can be configured to set the SemClass feature on the matched token(s) as well as on the DictTerm that covers them.
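
As a rough sketch of that write-back configuration: this assumes TokenClassWriteBackFeatureNames (declared in your descriptor as "names of features that should be written back to a token") is the relevant parameter, and that SemClass has also been added to DictTerm as suggested in the earlier reply quoted below. Under those assumptions the setting might look something like this, replacing the empty array currently in your descriptor:

```xml
<!-- Sketch only: goes inside <configurationParameterSettings>,
     replacing the existing empty <array /> for this parameter. -->
<nameValuePair>
	<name>TokenClassWriteBackFeatureNames</name>
	<value>
		<array>
			<!-- Assumption: a feature of this name exists on both
			     DictTerm and uima.tt.TokenAnnotation. -->
			<string>SemClass</string>
		</array>
	</value>
</nameValuePair>
```

Your uima.tt.TokenAnnotation already declares a SemClass feature, so nothing should need to change on the token side.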

On Apr 21, 2014, at 10:28 AM, Kothuvatiparambil, Viju <vi...@bankofamerica.com> wrote:

> Hi Michael,
> 
> Thank you so much for your reply. I think I can follow your suggestion and get it working, but I still have one more question. I see that SemClass is already in the type system as a feature of uima.tt.TokenAnnotation (see the XML fragment below). What is the purpose of this? How should I decide whether a feature should be part of TokenAnnotation or DictTerm?
> 
> 
> 		<typeSystemDescription>
> 			<imports>
> 				<import name="org.apache.uima.conceptMapper.DictTerm" />
> 				<import
> 					name="org.apache.uima.conceptMapper.support.tokenizer.TokenAnnotation" />
> 			</imports>
> 			<types>
> 				<typeDescription>
> 					<name>uima.tt.TokenAnnotation</name>
> 					<description></description>
> 					<supertypeName>uima.tcas.Annotation</supertypeName>
> 					<features>
> 					
> 						<featureDescription>
> 							<name>SemClass</name>
> 							<description>
> 								semantic class of token
> 							</description>
> 							<rangeTypeName>
> 								uima.cas.String
> 							</rangeTypeName>
> 						</featureDescription>
>                     ....
> 
> Btw, this is a great framework. I can see that I will be using it a lot. I would like to get involved in the development if you are looking for new resources.
> 
> Thanks
> Viju.
> 
> 
> 
> 
> -----Original Message-----
> From: Michael Tanenblatt [mailto:slothrop@park-slope.net] 
> Sent: Monday, April 21, 2014 6:24 AM
> To: user@uima.apache.org
> Subject: Re: SemClass feature not working in ConceptMapper add-on
> 
> You are exactly correct in your analysis: by specifying those values for AttributeList and FeatureList, you are telling ConceptMapper to write the value of the SemClass attribute in your dictionary entries to your resulting annotation, which appears to be DictTerm, and DictTerm as currently defined does not appear to have a SemClass feature. The solution is to extend the definition of the DictTerm type to include the feature SemClass (which should be a String).
> 
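
For illustration, such an extension might look like the fragment below. It is only a sketch: it assumes DictTerm derives from uima.tcas.Annotation and that you edit the type system descriptor that defines it. The description text is made up, and the features DictTerm already has should stay as they are.

```xml
<typeDescription>
	<name>org.apache.uima.conceptMapper.DictTerm</name>
	<!-- Assumed supertype; keep whatever the existing definition declares. -->
	<supertypeName>uima.tcas.Annotation</supertypeName>
	<features>
		<!-- Existing features such as DictCanon remain unchanged. -->
		<featureDescription>
			<name>SemClass</name>
			<description>semantic class copied from the dictionary entry</description>
			<rangeTypeName>uima.cas.String</rangeTypeName>
		</featureDescription>
	</features>
</typeDescription>
```

With the feature declared, the second FeatureList entry (SemClass) should resolve against DictTerm in the same way the first entry (DictCanon) does.
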
> 
> On Apr 20, 2014, at 4:10 PM, Kothuvatiparambil, Viju <vi...@bankofamerica.com> wrote:
> 
>> Hi All, 
>> 
>> I am trying to use the ConceptMapper add-on to assign a SemClass feature to tokens. I am getting the following error:
>> 
>> SEVERE: ConceptMapper SEVERE: FeatureList[1] 'SemClass' specified, but does not exist for type: org.apache.uima.conceptMapper.DictTerm
>> 
>> I configured FeatureList and AttributeList in ConceptMapperOffsetTokenizer.xml as given below:
>> 
>> 			<nameValuePair>
>> 				<name>AttributeList</name>
>> 				<value>
>> 					<array>
>> 						<string>canonical</string>
>> 						<string>SemClass</string>
>> 					</array>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>FeatureList</name>
>> 				<value>
>> 					<array>
>> 						<string>DictCanon</string>
>> 						<string>SemClass</string>
>> 					</array>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>ResultingAnnotationName</name>
>> 				<value>
>> 					<string>
>> 						org.apache.uima.conceptMapper.DictTerm
>> 					</string>
>> 				</value>
>> 			</nameValuePair>
>> 
>> Here is my simplified dict.xml file
>> 
>> <synonym>
>> <token canonical="grocery" SemClass="category">
>>    <variant base="grocery"/>
>> </token>
>> </synonym>
>> 
>> I debugged the problem and found that it is looking for the SemClass feature in resultAnnotationType, which is DictTerm. However, SemClass is not a feature of the DictTerm type.
>> 
>>     resultEnclosingSpan = resultAnnotationType.getFeatureByBaseName(resultEnclosingSpanName);
>>     if (resultEnclosingSpan == null) {
>>       logger.logError(PARAM_ENCLOSINGSPAN + " '" + resultEnclosingSpanName
>>               + "' specified, but does not exist for type: " + resultAnnotationType.getName());
>>       throw new AnnotatorInitializationException();
>>     }
>> 
>> I just started using UIMA, so I don't understand the complete architecture yet. Could any of you point me in the right direction? Thanks a lot in advance.
>> 
>> Viju Kothuvatiparambil
>> 
>> Here is the complete ConceptMapperOffsetTokenizer.xml file contents:
>> 
>> <taeDescription xmlns="http://uima.apache.org/resourceSpecifier">
>> 	<frameworkImplementation>org.apache.uima.java</frameworkImplementation>
>> 	<primitive>true</primitive>
>> 	<annotatorImplementationName>org.apache.uima.conceptMapper.ConceptMapper</annotatorImplementationName>
>> 	<analysisEngineMetaData>
>> 		<name>ConceptMapper</name>
>> 		<description></description>
>> 		<version>1</version>
>> 		<vendor></vendor>
>> 		<configurationParameters>
>> 			<configurationParameter>
>> 				<name>caseMatch</name>
>> 				<description>
>> 					this parameter specifies the case folding mode:
>> 					ignoreall - fold everything to lowercase for
>> 					matching insensitive - fold only tokens with initial
>> 					caps to lowercase digitfold - fold all (and only)
>> 					tokens with a digit sensitive - perform no case
>> 					folding
>> 				</description>
>> 				<type>String</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>true</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>Stemmer</name>
>> 				<description>
>> 					Name of stemmer class to use before matching. MUST
>> 					have a zero-parameter constructor! If not specified,
>> 					no stemming will be performed.
>> 				</description>
>> 				<type>String</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>ResultingAnnotationName</name>
>> 				<description>
>> 					Name of the annotation type created by this TAE,
>> 					must match the typeSystemDescription entry
>> 				</description>
>> 				<type>String</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>true</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>ResultingEnclosingSpanName</name>
>> 				<description>
>> 					Name of the feature in the resultingAnnotation to
>> 					contain the span that encloses it (i.e. its
>> 					sentence)
>> 				</description>
>> 				<type>String</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>AttributeList</name>
>> 				<description>
>> 					List of attribute names for XML dictionary entry
>> 					record - must correspond to FeatureList
>> 				</description>
>> 				<type>String</type>
>> 				<multiValued>true</multiValued>
>> 				<mandatory>true</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>FeatureList</name>
>> 				<description>
>> 					List of feature names for CAS annotation - must
>> 					correspond to AttributeList
>> 				</description>
>> 				<type>String</type>
>> 				<multiValued>true</multiValued>
>> 				<mandatory>true</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>TokenAnnotation</name>
>> 				<description></description>
>> 				<type>String</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>true</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>TokenClassFeatureName</name>
>> 				<description>
>> 					Name of feature used when doing lookups against
>> 					IncludedTokenClasses and ExcludedTokenClasses
>> 				</description>
>> 				<type>String</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>TokenTextFeatureName</name>
>> 				<description></description>
>> 				<type>String</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>SpanFeatureStructure</name>
>> 				<description>
>> 					Type of annotation which corresponds to spans of
>> 					data for processing (e.g. a Sentence)
>> 				</description>
>> 				<type>String</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>true</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>OrderIndependentLookup</name>
>> 				<description>
>> 					True if should ignore element order during lookup
>> 					(i.e., "top box" would equal "box top"). Default is
>> 					False.
>> 				</description>
>> 				<type>Boolean</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>TokenTypeFeatureName</name>
>> 				<description>
>> 					Name of feature used when doing lookups against
>> 					IncludedTokenTypes and ExcludedTokenTypes
>> 				</description>
>> 				<type>String</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>IncludedTokenTypes</name>
>> 				<description>
>> 					Type of tokens to include in lookups (if not
>> 					supplied, then all types are included except those
>> 					specifically mentioned in ExcludedTokenTypes)
>> 				</description>
>> 				<type>Integer</type>
>> 				<multiValued>true</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>ExcludedTokenTypes</name>
>> 				<description></description>
>> 				<type>Integer</type>
>> 				<multiValued>true</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>ExcludedTokenClasses</name>
>> 				<description>
>> 					Class of tokens to exclude from lookups (if not
>> 					supplied, then all classes are excluded except those
>> 					specifically mentioned in IncludedTokenClasses,
>> 					unless IncludedTokenClasses is not supplied, in
>> 					which case none are excluded)
>> 				</description>
>> 				<type>String</type>
>> 				<multiValued>true</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>IncludedTokenClasses</name>
>> 				<description>
>> 					Class of tokens to include in lookups (if not
>> 					supplied, then all classes are included except those
>> 					specifically mentioned in ExcludedTokenClasses)
>> 				</description>
>> 				<type>String</type>
>> 				<multiValued>true</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>TokenClassWriteBackFeatureNames</name>
>> 				<description>
>> 					names of features that should be written back to a
>> 					token, such as a POS tag
>> 				</description>
>> 				<type>String</type>
>> 				<multiValued>true</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>ResultingAnnotationMatchedTextFeature</name>
>> 				<type>String</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>PrintDictionary</name>
>> 				<type>Boolean</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>SearchStrategy</name>
>> 				<description>
>> 					Can be either "SkipAnyMatch",
>> 					"SkipAnyMatchAllowOverlap" or
>> 					"ContiguousMatch"&#13;&#13;ContiguousMatch: longest
>> 					match of contiguous tokens within enclosing
>> 					span(taking into account included/excluded items).
>> 					DEFAULT strategy &#13;SkipAnyMatch: longest match of
>> 					not-necessarily contiguous tokens within enclosing
>> 					span (taking into account included/excluded items).
>> 					Subsequent lookups begin in span after complete
>> 					match. IMPLIES order-independent lookup
>> 					&#13;SkipAnyMatchAllowOverlap: longest match of
>> 					not-necessarily contiguous tokens within enclosing
>> 					span (taking into account included/excluded items).
>> 					Subsequent lookups begin in span after next token.
>> 					IMPLIES order-independent lookup
>> 				</description>
>> 				<type>String</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>StopWords</name>
>> 				<type>String</type>
>> 				<multiValued>true</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>FindAllMatches</name>
>> 				<type>Boolean</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>MatchedTokensFeatureName</name>
>> 				<type>String</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>ReplaceCommaWithAND</name>
>> 				<type>Boolean</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>TokenizerDescriptorPath</name>
>> 				<type>String</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>true</mandatory>
>> 			</configurationParameter>
>> 			<configurationParameter>
>> 				<name>LanguageID</name>
>> 				<type>String</type>
>> 				<multiValued>false</multiValued>
>> 				<mandatory>false</mandatory>
>> 			</configurationParameter>
>> 		</configurationParameters>
>> 		<configurationParameterSettings>
>> 			<nameValuePair>
>> 				<name>caseMatch</name>
>> 				<value>
>> 					<string>ignoreall</string>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>AttributeList</name>
>> 				<value>
>> 					<array>
>> 						<string>canonical</string>
>> 						<string>SemClass</string>
>> 					</array>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>FeatureList</name>
>> 				<value>
>> 					<array>
>> 						<string>DictCanon</string>
>> 						<string>SemClass</string>
>> 					</array>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>TokenAnnotation</name>
>> 				<value>
>> 					<string>uima.tt.TokenAnnotation</string>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>ResultingAnnotationName</name>
>> 				<value>
>> 					<string>
>> 						org.apache.uima.conceptMapper.DictTerm
>> 					</string>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>SpanFeatureStructure</name>
>> 				<value>
>> 					<string>uima.tcas.DocumentAnnotation</string>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>OrderIndependentLookup</name>
>> 				<value>
>> 					<boolean>false</boolean>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>TokenClassWriteBackFeatureNames</name>
>> 				<value>
>> 					<array />
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>IncludedTokenClasses</name>
>> 				<value>
>> 					<array />
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>PrintDictionary</name>
>> 				<value>
>> 					<boolean>false</boolean>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>FindAllMatches</name>
>> 				<value>
>> 					<boolean>false</boolean>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>StopWords</name>
>> 				<value>
>> 					<array />
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>ReplaceCommaWithAND</name>
>> 				<value>
>> 					<boolean>false</boolean>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>TokenizerDescriptorPath</name>
>> 				<value>
>> 					<string>
>> 						/search/uima/conf/descriptors/OffsetTokenizer.xml
>> 					</string>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>ResultingEnclosingSpanName</name>
>> 				<value>
>> 					<string>enclosingSpan</string>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>MatchedTokensFeatureName</name>
>> 				<value>
>> 					<string>matchedTokens</string>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>ResultingAnnotationMatchedTextFeature</name>
>> 				<value>
>> 					<string>matchedText</string>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>SearchStrategy</name>
>> 				<value>
>> 					<string>ContiguousMatch</string>
>> 				</value>
>> 			</nameValuePair>
>> 			<nameValuePair>
>> 				<name>LanguageID</name>
>> 				<value>
>> 					<string>en</string>
>> 				</value>
>> 			</nameValuePair>
>> 		</configurationParameterSettings>
>> 		<typeSystemDescription>
>> 			<imports>
>> 				<import name="org.apache.uima.conceptMapper.DictTerm" />
>> 				<import
>> 					name="org.apache.uima.conceptMapper.support.tokenizer.TokenAnnotation" />
>> 			</imports>
>> 			<types>
>> 				<typeDescription>
>> 					<name>uima.tt.TokenAnnotation</name>
>> 					<description></description>
>> 					<supertypeName>uima.tcas.Annotation</supertypeName>
>> 					<features>
>> 						<featureDescription>
>> 							<name>SemClass</name>
>> 							<description>
>> 								semantic class of token
>> 							</description>
>> 							<rangeTypeName>
>> 								uima.cas.String
>> 							</rangeTypeName>
>> 						</featureDescription>
>> 						<featureDescription>
>> 							<name>POS</name>
>> 							<description>
>> 								Part of SPeech of term to which this
>> 								token is a part
>> 							</description>
>> 							<rangeTypeName>
>> 								uima.cas.String
>> 							</rangeTypeName>
>> 						</featureDescription>
>> 						<featureDescription>
>> 							<name>frost_TokenType</name>
>> 							<description></description>
>> 							<rangeTypeName>
>> 								uima.cas.Integer
>> 							</rangeTypeName>
>> 						</featureDescription>
>> 					</features>
>> 				</typeDescription>
>> 			</types>
>> 		</typeSystemDescription>
>> 		<typePriorities>
>> 			<priorityList>
>> 				<!-- <type>uima.tt.SentenceAnnotation</type> -->
>> 				<type>uima.tt.TokenAnnotation</type>
>> 			</priorityList>
>> 		</typePriorities>
>> 		<fsIndexCollection />
>> 		<capabilities>
>> 			<capability>
>> 				<inputs>
>> 					<type allAnnotatorFeatures="true">
>> 						uima.tt.TokenAnnotation
>> 					</type>
>> 					<!-- <type allAnnotatorFeatures="true">uima.tt.SentenceAnnotation</type>
>> 						<type allAnnotatorFeatures="true">uima.tt.ParagraphAnnotation</type> -->
>> 				</inputs>
>> 				<outputs>
>> 					<type allAnnotatorFeatures="true">
>> 						org.apache.uima.conceptMapper.DictTerm
>> 					</type>
>> 					<type allAnnotatorFeatures="true">
>> 						uima.tt.TokenAnnotation
>> 					</type>
>> 					<type allAnnotatorFeatures="true">
>> 						org.apache.uima.conceptMapper.support.tokenizer.TokenAnnotation
>> 					</type>
>> 					<type allAnnotatorFeatures="true">
>> 						uima.tcas.DocumentAnnotation
>> 					</type>
>> 				</outputs>
>> 				<languagesSupported />
>> 			</capability>
>> 		</capabilities>
>> 		<operationalProperties>
>> 			<modifiesCas>true</modifiesCas>
>> 			<multipleDeploymentAllowed>true</multipleDeploymentAllowed>
>> 			<outputsNewCASes>false</outputsNewCASes>
>> 		</operationalProperties>
>> 	</analysisEngineMetaData>
>> 	<externalResourceDependencies>
>> 		<externalResourceDependency>
>> 			<key>DictionaryFile</key>
>> 			<description>dictionary file loader.</description>
>> 			<interfaceName>
>> 				org.apache.uima.conceptMapper.support.dictionaryResource.DictionaryResource
>> 			</interfaceName>
>> 			<optional>false</optional>
>> 		</externalResourceDependency>
>> 	</externalResourceDependencies>
>> 	<resourceManagerConfiguration>
>> 		<externalResources>
>> 			<externalResource>
>> 				<name>DictionaryFileName</name>
>> 				<description>
>> 					A file containing the dictionary. Modify this URL to
>> 					use a different dictionary.
>> 				</description>
>> 				<fileResourceSpecifier>
>> 					<fileUrl>file:/search/uima/conf/testDict.xml</fileUrl>
>> 				</fileResourceSpecifier>
>> 				<implementationName>
>> 					org.apache.uima.conceptMapper.support.dictionaryResource.DictionaryResource_impl
>> 				</implementationName>
>> 			</externalResource>
>> 		</externalResources>
>> 		<externalResourceBindings>
>> 			<externalResourceBinding>
>> 				<key>DictionaryFile</key>
>> 				<resourceName>DictionaryFileName</resourceName>
>> 			</externalResourceBinding>
>> 		</externalResourceBindings>
>> 	</resourceManagerConfiguration>
>> </taeDescription>

RE: SemClass feature not working in ConceptMapper add-on

Posted by "Kothuvatiparambil, Viju" <vi...@bankofamerica.com>.

Hi Michael,

Thank you so much for your reply. I think I can follow your suggestion and get it working, but I still have one more question. I see that SemClass is already in the type system as a feature of uima.tt.TokenAnnotation (see the XML fragment below). What is its purpose there? How should I decide whether a feature belongs on TokenAnnotation or on DictTerm?


		<typeSystemDescription>
			<imports>
				<import name="org.apache.uima.conceptMapper.DictTerm" />
				<import
					name="org.apache.uima.conceptMapper.support.tokenizer.TokenAnnotation" />
			</imports>
			<types>
				<typeDescription>
					<name>uima.tt.TokenAnnotation</name>
					<description></description>
					<supertypeName>uima.tcas.Annotation</supertypeName>
					<features>
					
						<featureDescription>
							<name>SemClass</name>
							<description>
								semantic class of token
							</description>
							<rangeTypeName>
								uima.cas.String
							</rangeTypeName>
						</featureDescription>
                     ....

Btw, this is a great framework. I can see that I will be using it a lot. I would like to get involved in the development if you are looking for new resources.

Thanks
Viju.


  


Re: SemClass feature not working in ConceptMapper add-on

Posted by Michael Tanenblatt <sl...@park-slope.net>.

You are exactly correct in your analysis: by specifying those values for AttributeList and FeatureList, you are telling ConceptMapper to write the value of the SemClass attribute in your dictionary entries to your resulting annotation, which appears to be DictTerm, and DictTerm does not appear to have a SemClass feature as it is currently defined. The solution is to extend the definition of the DictTerm type to include the feature SemClass (which should be a String).
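
As a rough sketch of that change (not taken from the shipped descriptors), the fragment below declares SemClass on DictTerm. It assumes that DictTerm's supertype is uima.tcas.Annotation and that the fragment is placed inside the <types> element of the descriptor's typeSystemDescription, where UIMA should merge it with the imported DictTerm definition. The description text is only illustrative. Adding the same featureDescription directly to the <features> list in the DictTerm type system file would be an alternative.

```xml
<!-- Sketch only: supertype and placement are assumptions, see the note above. -->
<typeDescription>
	<name>org.apache.uima.conceptMapper.DictTerm</name>
	<description></description>
	<supertypeName>uima.tcas.Annotation</supertypeName>
	<features>
		<!-- New feature that receives the SemClass attribute of a dictionary entry -->
		<featureDescription>
			<name>SemClass</name>
			<description>semantic class of the matched dictionary term</description>
			<rangeTypeName>uima.cas.String</rangeTypeName>
		</featureDescription>
	</features>
</typeDescription>
```

With the feature declared, the existing AttributeList/FeatureList pairing (canonical to DictCanon, SemClass to SemClass) should pass the initialization check. If you read DictTerm through a generated JCas class, that class would probably need to be regenerated to expose the new feature.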


On Apr 20, 2014, at 4:10 PM, Kothuvatiparambil, Viju <vi...@bankofamerica.com> wrote:

> Hi All, 
> 
> I am trying to use the ConceptMapper add-on to assign a SemClass feature to tokens. I am getting the following error:
> 
> SEVERE: ConceptMapper SEVERE: FeatureList[1] 'SemClass' specified, but does not exist for type: org.apache.uima.conceptMapper.DictTerm
> 
> I configured FeatureList and AttributeList in ConceptMapperOffsetTokenizer.xml as given below:
> 
> 			<nameValuePair>
> 				<name>AttributeList</name>
> 				<value>
> 					<array>
> 						<string>canonical</string>
> 						<string>SemClass</string>
> 					</array>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>FeatureList</name>
> 				<value>
> 					<array>
> 						<string>DictCanon</string>
> 						<string>SemClass</string>
> 					</array>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>ResultingAnnotationName</name>
> 				<value>
> 					<string>
> 						org.apache.uima.conceptMapper.DictTerm
> 					</string>
> 				</value>
> 			</nameValuePair>
> 
> Here is my simplified dict.xml file
> 
> <synonym>
>  <token canonical="grocery" SemClass="category">
>     <variant base="grocery"/>
>  </token>
> </synonym>
> 
> I debugged the problem and found that it is looking for the SemClass feature in resultAnnotationType, which is DictTerm. But SemClass is not actually a feature of the DictTerm type.
> 
>      resultEnclosingSpan = resultAnnotationType.getFeatureByBaseName(resultEnclosingSpanName);
>      if (resultEnclosingSpan == null) {
>        logger.logError(PARAM_ENCLOSINGSPAN + " '" + resultEnclosingSpanName
>                + "' specified, but does not exist for type: " + resultAnnotationType.getName());
>        throw new AnnotatorInitializationException();
>      }
> 
> I just started using UIMA, so I don't understand the complete architecture yet. Could any of you point me in the right direction? Thanks a lot in advance.
> 
> Viju Kothuvatiparambil
> 
> Here is the complete ConceptMapperOffsetTokenizer.xml file contents:
> 
> <taeDescription xmlns="http://uima.apache.org/resourceSpecifier">
> 	<frameworkImplementation>org.apache.uima.java</frameworkImplementation>
> 	<primitive>true</primitive>
> 	<annotatorImplementationName>org.apache.uima.conceptMapper.ConceptMapper</annotatorImplementationName>
> 	<analysisEngineMetaData>
> 		<name>ConceptMapper</name>
> 		<description></description>
> 		<version>1</version>
> 		<vendor></vendor>
> 		<configurationParameters>
> 			<configurationParameter>
> 				<name>caseMatch</name>
> 				<description>
> 					this parameter specifies the case folding mode:
> 					ignoreall - fold everything to lowercase for
> 					matching insensitive - fold only tokens with initial
> 					caps to lowercase digitfold - fold all (and only)
> 					tokens with a digit sensitive - perform no case
> 					folding
> 				</description>
> 				<type>String</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>true</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>Stemmer</name>
> 				<description>
> 					Name of stemmer class to use before matching. MUST
> 					have a zero-parameter constructor! If not specified,
> 					no stemming will be performed.
> 				</description>
> 				<type>String</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>ResultingAnnotationName</name>
> 				<description>
> 					Name of the annotation type created by this TAE,
> 					must match the typeSystemDescription entry
> 				</description>
> 				<type>String</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>true</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>ResultingEnclosingSpanName</name>
> 				<description>
> 					Name of the feature in the resultingAnnotation to
> 					contain the span that encloses it (i.e. its
> 					sentence)
> 				</description>
> 				<type>String</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>AttributeList</name>
> 				<description>
> 					List of attribute names for XML dictionary entry
> 					record - must correspond to FeatureList
> 				</description>
> 				<type>String</type>
> 				<multiValued>true</multiValued>
> 				<mandatory>true</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>FeatureList</name>
> 				<description>
> 					List of feature names for CAS annotation - must
> 					correspond to AttributeList
> 				</description>
> 				<type>String</type>
> 				<multiValued>true</multiValued>
> 				<mandatory>true</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>TokenAnnotation</name>
> 				<description></description>
> 				<type>String</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>true</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>TokenClassFeatureName</name>
> 				<description>
> 					Name of feature used when doing lookups against
> 					IncludedTokenClasses and ExcludedTokenClasses
> 				</description>
> 				<type>String</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>TokenTextFeatureName</name>
> 				<description></description>
> 				<type>String</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>SpanFeatureStructure</name>
> 				<description>
> 					Type of annotation which corresponds to spans of
> 					data for processing (e.g. a Sentence)
> 				</description>
> 				<type>String</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>true</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>OrderIndependentLookup</name>
> 				<description>
> 					True if should ignore element order during lookup
> 					(i.e., "top box" would equal "box top"). Default is
> 					False.
> 				</description>
> 				<type>Boolean</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>TokenTypeFeatureName</name>
> 				<description>
> 					Name of feature used when doing lookups against
> 					IncludedTokenTypes and ExcludedTokenTypes
> 				</description>
> 				<type>String</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>IncludedTokenTypes</name>
> 				<description>
> 					Type of tokens to include in lookups (if not
> 					supplied, then all types are included except those
> 					specifically mentioned in ExcludedTokenTypes)
> 				</description>
> 				<type>Integer</type>
> 				<multiValued>true</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>ExcludedTokenTypes</name>
> 				<description></description>
> 				<type>Integer</type>
> 				<multiValued>true</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>ExcludedTokenClasses</name>
> 				<description>
> 					Class of tokens to exclude from lookups (if not
> 					supplied, then all classes are excluded except those
> 					specifically mentioned in IncludedTokenClasses,
> 					unless IncludedTokenClasses is not supplied, in
> 					which case none are excluded)
> 				</description>
> 				<type>String</type>
> 				<multiValued>true</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>IncludedTokenClasses</name>
> 				<description>
> 					Class of tokens to include in lookups (if not
> 					supplied, then all classes are included except those
> 					specifically mentioned in ExcludedTokenClasses)
> 				</description>
> 				<type>String</type>
> 				<multiValued>true</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>TokenClassWriteBackFeatureNames</name>
> 				<description>
> 					names of features that should be written back to a
> 					token, such as a POS tag
> 				</description>
> 				<type>String</type>
> 				<multiValued>true</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>ResultingAnnotationMatchedTextFeature</name>
> 				<type>String</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>PrintDictionary</name>
> 				<type>Boolean</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>SearchStrategy</name>
> 				<description>
> 					Can be either "SkipAnyMatch",
> 					"SkipAnyMatchAllowOverlap" or
> 					"ContiguousMatch"&#13;&#13;ContiguousMatch: longest
> 					match of contiguous tokens within enclosing
> 					span(taking into account included/excluded items).
> 					DEFAULT strategy &#13;SkipAnyMatch: longest match of
> 					not-necessarily contiguous tokens within enclosing
> 					span (taking into account included/excluded items).
> 					Subsequent lookups begin in span after complete
> 					match. IMPLIES order-independent lookup
> 					&#13;SkipAnyMatchAllowOverlap: longest match of
> 					not-necessarily contiguous tokens within enclosing
> 					span (taking into account included/excluded items).
> 					Subsequent lookups begin in span after next token.
> 					IMPLIES order-independent lookup
> 				</description>
> 				<type>String</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>StopWords</name>
> 				<type>String</type>
> 				<multiValued>true</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>FindAllMatches</name>
> 				<type>Boolean</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>MatchedTokensFeatureName</name>
> 				<type>String</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>ReplaceCommaWithAND</name>
> 				<type>Boolean</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>TokenizerDescriptorPath</name>
> 				<type>String</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>true</mandatory>
> 			</configurationParameter>
> 			<configurationParameter>
> 				<name>LanguageID</name>
> 				<type>String</type>
> 				<multiValued>false</multiValued>
> 				<mandatory>false</mandatory>
> 			</configurationParameter>
> 		</configurationParameters>
> 		<configurationParameterSettings>
> 			<nameValuePair>
> 				<name>caseMatch</name>
> 				<value>
> 					<string>ignoreall</string>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>AttributeList</name>
> 				<value>
> 					<array>
> 						<string>canonical</string>
> 						<string>SemClass</string>
> 					</array>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>FeatureList</name>
> 				<value>
> 					<array>
> 						<string>DictCanon</string>
> 						<string>SemClass</string>
> 					</array>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>TokenAnnotation</name>
> 				<value>
> 					<string>uima.tt.TokenAnnotation</string>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>ResultingAnnotationName</name>
> 				<value>
> 					<string>
> 						org.apache.uima.conceptMapper.DictTerm
> 					</string>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>SpanFeatureStructure</name>
> 				<value>
> 					<string>uima.tcas.DocumentAnnotation</string>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>OrderIndependentLookup</name>
> 				<value>
> 					<boolean>false</boolean>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>TokenClassWriteBackFeatureNames</name>
> 				<value>
> 					<array />
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>IncludedTokenClasses</name>
> 				<value>
> 					<array />
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>PrintDictionary</name>
> 				<value>
> 					<boolean>false</boolean>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>FindAllMatches</name>
> 				<value>
> 					<boolean>false</boolean>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>StopWords</name>
> 				<value>
> 					<array />
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>ReplaceCommaWithAND</name>
> 				<value>
> 					<boolean>false</boolean>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>TokenizerDescriptorPath</name>
> 				<value>
> 					<string>
> 						/search/uima/conf/descriptors/OffsetTokenizer.xml
> 					</string>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>ResultingEnclosingSpanName</name>
> 				<value>
> 					<string>enclosingSpan</string>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>MatchedTokensFeatureName</name>
> 				<value>
> 					<string>matchedTokens</string>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>ResultingAnnotationMatchedTextFeature</name>
> 				<value>
> 					<string>matchedText</string>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>SearchStrategy</name>
> 				<value>
> 					<string>ContiguousMatch</string>
> 				</value>
> 			</nameValuePair>
> 			<nameValuePair>
> 				<name>LanguageID</name>
> 				<value>
> 					<string>en</string>
> 				</value>
> 			</nameValuePair>
> 		</configurationParameterSettings>
> 		<typeSystemDescription>
> 			<imports>
> 				<import name="org.apache.uima.conceptMapper.DictTerm" />
> 				<import
> 					name="org.apache.uima.conceptMapper.support.tokenizer.TokenAnnotation" />
> 			</imports>
> 			<types>
> 				<typeDescription>
> 					<name>uima.tt.TokenAnnotation</name>
> 					<description></description>
> 					<supertypeName>uima.tcas.Annotation</supertypeName>
> 					<features>
> 						<featureDescription>
> 							<name>SemClass</name>
> 							<description>
> 								semantic class of token
> 							</description>
> 							<rangeTypeName>
> 								uima.cas.String
> 							</rangeTypeName>
> 						</featureDescription>
> 						<featureDescription>
> 							<name>POS</name>
> 							<description>
> 								Part of speech of the term of which this
> 								token is a part
> 							</description>
> 							<rangeTypeName>
> 								uima.cas.String
> 							</rangeTypeName>
> 						</featureDescription>
> 						<featureDescription>
> 							<name>frost_TokenType</name>
> 							<description></description>
> 							<rangeTypeName>
> 								uima.cas.Integer
> 							</rangeTypeName>
> 						</featureDescription>
> 					</features>
> 				</typeDescription>
> 			</types>
> 		</typeSystemDescription>
> 		<typePriorities>
> 			<priorityList>
> 				<!-- <type>uima.tt.SentenceAnnotation</type> -->
> 				<type>uima.tt.TokenAnnotation</type>
> 			</priorityList>
> 		</typePriorities>
> 		<fsIndexCollection />
> 		<capabilities>
> 			<capability>
> 				<inputs>
> 					<type allAnnotatorFeatures="true">
> 						uima.tt.TokenAnnotation
> 					</type>
> 					<!-- <type allAnnotatorFeatures="true">uima.tt.SentenceAnnotation</type>
> 						<type allAnnotatorFeatures="true">uima.tt.ParagraphAnnotation</type> -->
> 				</inputs>
> 				<outputs>
> 					<type allAnnotatorFeatures="true">
> 						org.apache.uima.conceptMapper.DictTerm
> 					</type>
> 					<type allAnnotatorFeatures="true">
> 						uima.tt.TokenAnnotation
> 					</type>
> 					<type allAnnotatorFeatures="true">
> 						org.apache.uima.conceptMapper.support.tokenizer.TokenAnnotation
> 					</type>
> 					<type allAnnotatorFeatures="true">
> 						uima.tcas.DocumentAnnotation
> 					</type>
> 				</outputs>
> 				<languagesSupported />
> 			</capability>
> 		</capabilities>
> 		<operationalProperties>
> 			<modifiesCas>true</modifiesCas>
> 			<multipleDeploymentAllowed>true</multipleDeploymentAllowed>
> 			<outputsNewCASes>false</outputsNewCASes>
> 		</operationalProperties>
> 	</analysisEngineMetaData>
> 	<externalResourceDependencies>
> 		<externalResourceDependency>
> 			<key>DictionaryFile</key>
> 			<description>dictionary file loader.</description>
> 			<interfaceName>
> 				org.apache.uima.conceptMapper.support.dictionaryResource.DictionaryResource
> 			</interfaceName>
> 			<optional>false</optional>
> 		</externalResourceDependency>
> 	</externalResourceDependencies>
> 	<resourceManagerConfiguration>
> 		<externalResources>
> 			<externalResource>
> 				<name>DictionaryFileName</name>
> 				<description>
> 					A file containing the dictionary. Modify this URL to
> 					use a different dictionary.
> 				</description>
> 				<fileResourceSpecifier>
> 					<fileUrl>file:/search/uima/conf/testDict.xml</fileUrl>
> 				</fileResourceSpecifier>
> 				<implementationName>
> 					org.apache.uima.conceptMapper.support.dictionaryResource.DictionaryResource_impl
> 				</implementationName>
> 			</externalResource>
> 		</externalResources>
> 		<externalResourceBindings>
> 			<externalResourceBinding>
> 				<key>DictionaryFile</key>
> 				<resourceName>DictionaryFileName</resourceName>
> 			</externalResourceBinding>
> 		</externalResourceBindings>
> 	</resourceManagerConfiguration>
> </taeDescription>