Posted to dev@uima.apache.org by Katja Buyko <bu...@coling-uni-jena.de> on 2006/12/12 13:09:02 UTC

String Subtypes (restricted set of allowed values)

Hello!

In my Type System I tried to use a String subtype that specifies 
a restricted set of allowed values (cf. UIMA SDK 23-295).
I assumed that UIMA would then check whether an assigned value is valid 
(i.e., is in the restricted list). That was not the case: I could assign 
other values as well.

Please see the extracts from the Type System and the annotator code 
below. Is there something wrong?

 Type System:

<typeDescription>
<name>de.julielab.jules.types.PennPOSTag</name>
<description></description>
<supertypeName>de.julielab.jules.types.POSTag</supertypeName>
<features>
<featureDescription>
<name>value</name>
<description></description>
<rangeTypeName>de.julielab.jules.types.PennPOS</rangeTypeName>
</featureDescription>
</features>
</typeDescription>
<typeDescription>

<typeDescription>
<name>de.julielab.jules.types.PennPOS</name>
<description></description>
<supertypeName>uima.cas.String</supertypeName>
<allowedValues>
<value>
<string>CC</string>
<description>Coordinating conjunction</description>
</value>
<value>
<string>CD</string>
<description>Cardinal number</description>
</value>
<value>
<string>DT</string>
<description>Determiner</description>
</value>
<value>
</allowedValues>
</typeDescription>
<typeDescription>


Annotator:

posTag = "CD";
PennPOSTag pos = new PennPOSTag(aJCas, token.getBegin(),token.getEnd() );
((PennPOSTag)pos).setValue(posTag);
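
Editor's note: as long as the framework does not enforce the restriction, one possible workaround is to validate the tag in the annotator before calling setValue. The following is only a minimal sketch under assumptions: the class PennPosGuard and the method requireAllowed are hypothetical names (not part of UIMA), and the allowed set is copied by hand from the three values listed in the type system above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not part of UIMA): rejects tags outside the allowed set. */
final class PennPosGuard {

    // Assumption: mirrors the <allowedValues> of de.julielab.jules.types.PennPOS by hand.
    private static final Set<String> ALLOWED =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("CC", "CD", "DT")));

    private PennPosGuard() {}

    /** Returns the tag unchanged if it is in the allowed set, otherwise throws. */
    static String requireAllowed(String tag) {
        if (!ALLOWED.contains(tag)) {
            throw new IllegalArgumentException(
                    "\"" + tag + "\" is not an allowed value for PennPOS");
        }
        return tag;
    }

    public static void main(String[] args) {
        System.out.println(requireAllowed("CD"));
        try {
            requireAllowed("XYZ");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In the annotator shown above, the call would then presumably read pos.setValue(PennPosGuard.requireAllowed(posTag));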


With best regards

E. Buyko




Re: String Subtypes (restricted set of allowed values)

Posted by Ekaterina Buyko <bu...@coling-uni-jena.de>.
Thilo Goetz wrote:
> Ah, true, such an exception should not be thrown.  Can you post the 
> stack trace?
>
> --Thilo
>
I hope this helps. I forced an exception in my annotator. Is there 
another way to get a stack trace?

Stack Trace
--------------------------------------------------

	at de.julielab.jules.ae.OpenNLPPosTagger.process(OpenNLPPosTagger.java:150)
	at com.ibm.uima.reference_impl.analysis_engine.compatibility.AnnotatorAdapter.process(AnnotatorAdapter.java:147)
	at com.ibm.uima.reference_impl.analysis_engine.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:392)
	at com.ibm.uima.reference_impl.analysis_engine.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:297)
	at com.ibm.uima.reference_impl.analysis_engine.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:218)
	at com.ibm.uima.reference_impl.analysis_engine.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:197)
	at com.ibm.uima.reference_impl.analysis_engine.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:243)
	at de.julielab.jules.ae.OpenNLPPOSTaggerAnnotatorTest.testProcess(OpenNLPPOSTaggerAnnotatorTest.java:101)

-------------------------------------------------
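
Editor's note: on the question of obtaining a stack trace without forcing an exception, plain Java offers at least two options: Thread.dumpStack() prints the current stack to System.err, and a Throwable that is created but never thrown can be rendered into a string. The sketch below uses only JDK classes; the class name StackTraceDemo and the method currentStack are hypothetical.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Hypothetical helper: captures the current call stack without throwing anything. */
final class StackTraceDemo {

    private StackTraceDemo() {}

    /** Renders the current stack as text; the Throwable is created but never thrown. */
    static String currentStack() {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        new Throwable("stack snapshot").printStackTrace(writer);
        writer.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        // Option 1: print the current stack directly to System.err.
        Thread.dumpStack();
        // Option 2: keep the trace as a string, e.g. for a log message.
        System.out.println(currentStack());
    }
}
```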

-- 

Ekaterina Buyko
Jena University Language and Information Engineering (JULIE) Lab
Phone: +49-3641-944322
Fax:   +49-3641-944321
email: buyko@coling-uni-jena.de
URL:   http://www.coling.uni-jena.de



Re: String Subtypes (restricted set of allowed values)

Posted by Marshall Schor <ms...@schor.com>.
Thilo Goetz wrote:
> Ah, true, such an exception should not be thrown.  Can you post the 
> stack trace?
I may be confused, but I think the problem is the opposite:

The user is expecting that when a value is set which is not in the 
"allowed-value" set, an exception *should be thrown*, but it isn't.

Is this correct?

-Marshall


Re: String Subtypes (restricted set of allowed values)

Posted by Thilo Goetz <tw...@gmx.de>.
Ah, true, such an exception should not be thrown.  Can you post the 
stack trace?

--Thilo

Ekaterina Buyko wrote:
> Marshall Schor wrote:
>> OK - this should work (that is, when you assign a value not in the set 
>> of allowed values, you should get an exception).
>>
>> In your case, what happens?  Does it just do the assignment without 
>> complaints?
>>
>> -Marshall
> 
> I do not get any exceptions. My process method handles 
> IndexOutOfBoundsException in its catch block.
> What kind of exception should be thrown in the described case?
> 
> Best
> 
> E. Buyko
> 
> 

Re: String Subtypes (restricted set of allowed values)

Posted by Thilo Goetz <tw...@gmx.de>.
Fixed in Apache version with https://issues.apache.org/jira/browse/UIMA-128

--Thilo

Ekaterina Buyko wrote:
>> I just checked the code for setting Strings using JCas, and it seems 
>> you've stumbled onto a bug.
>> The generated JCas code uses low-level string access APIs into the CAS, 
>> and those do not include the invalid-range checking for string subtypes 
>> with allowed values.
>>
>> We'll put in a defect to fix this in an upcoming release (probably on 
>> Apache).
>>
>> Can you work around this for now?
>>
>> -Marshall
> Thank you very much for checking this. We can work around it for now, 
> but this feature would be really nice to have.
> We have designed our type system with restricted types wherever 
> possible in order to have further control over the annotation.
> 
> Best
> 
> E. Buyko
> 

Re: String Subtypes (restricted set of allowed values)

Posted by Ekaterina Buyko <bu...@coling-uni-jena.de>.
> I just checked the code for setting Strings using JCas, and it seems 
> you've stumbled onto a bug.
> The generated JCas code uses low-level string access APIs into the CAS, 
> and those do not include the invalid-range checking for string subtypes 
> with allowed values.
>
> We'll put in a defect to fix this in an upcoming release (probably on 
> Apache).
>
> Can you work around this for now?
>
> -Marshall
Thank you very much for checking this. We can work around it for now, 
but this feature would be really nice to have.
We have designed our type system with restricted types wherever 
possible in order to have further control over the annotation.

Best

E. Buyko




Re: String Subtypes (restricted set of allowed values)

Posted by Marshall Schor <ms...@schor.com>.
I just checked the code for setting Strings using JCas, and it seems 
you've stumbled onto a bug.
The generated JCas code uses low-level string access APIs into the CAS, 
and those do not include the invalid-range checking for string subtypes 
with allowed values.

We'll put in a defect to fix this in an upcoming release (probably on 
Apache).

Can you work around this for now?

-Marshall
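
Editor's note: the kind of gap described above can be pictured with a toy model in which one setter validates and another writes directly. This is only an illustration of the idea, not UIMA's implementation; ToyStringFeature, setChecked and setUnchecked are hypothetical names that do not exist in UIMA.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Toy model only; none of these names exist in UIMA. */
final class ToyStringFeature {

    private final Set<String> allowed;
    private String value;

    ToyStringFeature(String... allowedValues) {
        this.allowed = new HashSet<>(Arrays.asList(allowedValues));
    }

    /** "High-level" path: validates against the allowed set before storing. */
    void setChecked(String newValue) {
        if (!allowed.contains(newValue)) {
            throw new IllegalArgumentException("not allowed: " + newValue);
        }
        setUnchecked(newValue);
    }

    /** "Low-level" path: stores the value directly, with no validation. */
    void setUnchecked(String newValue) {
        this.value = newValue;
    }

    String get() {
        return value;
    }

    public static void main(String[] args) {
        ToyStringFeature pos = new ToyStringFeature("CC", "CD", "DT");
        pos.setUnchecked("XYZ");       // silently accepted, like the reported behaviour
        System.out.println(pos.get());
        pos.setChecked("XYZ");         // throws IllegalArgumentException
    }
}
```

A caller that only ever goes through the unchecked path never sees an error, which matches the symptom reported in this thread.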

Marshall Schor wrote:
> Ekaterina Buyko wrote:
>> <snip>I do not get any exceptions. My process method handles 
>> IndexOutOfBoundsException in its catch block.
>> What kind of exception should be thrown in the described case?
> You should be getting a CASRuntimeException, with the message:
> Error setting string value: string "{0}" is not valid for a value of 
> type "{1}".
>
> with appropriate substitutions for {0} and {1}.  Can you try catching 
> that exception?
> -Marshall
>
>


Re: String Subtypes (restricted set of allowed values)

Posted by Marshall Schor <ms...@schor.com>.
Ekaterina Buyko wrote:
> <snip>I do not get any exceptions. My process method handles 
> IndexOutOfBoundsException in its catch block.
> What kind of exception should be thrown in the described case?
You should be getting a CASRuntimeException, with the message:
Error setting string value: string "{0}" is not valid for a value of 
type "{1}".

with appropriate substitutions for {0} and {1}.  Can you try catching 
that exception?
-Marshall
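
Editor's note: the {0} and {1} placeholders in the message quoted above look like java.text.MessageFormat arguments. Assuming that convention (the thread does not say how UIMA performs the substitution), the final text can be previewed with the JDK alone. MessageTemplateDemo and render are hypothetical names, and "XYZ" is just an example of a value outside the allowed set.

```java
import java.text.MessageFormat;

/** Sketch: fills the quoted message template using java.text.MessageFormat. */
final class MessageTemplateDemo {

    // Template text copied from the message quoted in this thread.
    static final String TEMPLATE =
            "Error setting string value: string \"{0}\" is not valid for a value of type \"{1}\".";

    private MessageTemplateDemo() {}

    /** Substitutes the offending value for {0} and the type name for {1}. */
    static String render(String badValue, String typeName) {
        return MessageFormat.format(TEMPLATE, badValue, typeName);
    }

    public static void main(String[] args) {
        System.out.println(render("XYZ", "de.julielab.jules.types.PennPOS"));
    }
}
```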

Re: String Subtypes (restricted set of allowed values)

Posted by Ekaterina Buyko <bu...@coling-uni-jena.de>.
Marshall Schor wrote:
> OK - this should work (that is, when you assign a value not in the set 
> of allowed values, you should get an exception).
>
> In your case, what happens?  Does it just do the assignment without 
> complaints?
>
> -Marshall

I do not get any exceptions. My process method handles 
IndexOutOfBoundsException in its catch block.
What kind of exception should be thrown in the described case?
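
Editor's note: a catch block for IndexOutOfBoundsException only intercepts that exception and its subclasses, so any other unchecked exception thrown inside the try would propagate past it rather than being handled there. The sketch below shows this with a stand-in exception; StandInCasRuntimeException is a hypothetical class defined here for illustration and is not UIMA's CASRuntimeException.

```java
/** Stand-in for illustration only; this is NOT UIMA's CASRuntimeException. */
final class StandInCasRuntimeException extends RuntimeException {
    StandInCasRuntimeException(String message) {
        super(message);
    }
}

final class CatchScopeDemo {

    private CatchScopeDemo() {}

    /** Reports which handler, if any, saw the exception thrown by the task. */
    static String run(Runnable task) {
        try {
            try {
                task.run();
                return "no exception";
            } catch (IndexOutOfBoundsException e) {
                // Only IndexOutOfBoundsException and its subclasses land here.
                return "caught by IndexOutOfBoundsException handler";
            }
        } catch (RuntimeException e) {
            // Every other unchecked exception skips the inner handler.
            return "escaped to outer handler: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(() -> { throw new ArrayIndexOutOfBoundsException(); }));
        System.out.println(run(() -> { throw new StandInCasRuntimeException("bad value"); }));
    }
}
```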

Best

E. Buyko




Re: String Subtypes (restricted set of allowed values)

Posted by Marshall Schor <ms...@schor.com>.
buyko@coling-uni-jena.de wrote:
> Hi!
>
> I use UIMA SDK Version 2.0.1.
>
> I am sorry, I did not copy the extract from the Type System correctly; 
> here is the correct version.
OK - this should work (that is, when you assign a value not in the set 
of allowed values, you should get an exception).

In your case, what happens?  Does it just do the assignment without 
complaints?

-Marshall

Re: String Subtypes (restricted set of allowed values)

Posted by bu...@coling-uni-jena.de.
Hi!

I use UIMA SDK Version 2.0.1.

I am sorry, I did not copy the extract from the Type System correctly; here
is the correct version.

<typeDescription>
<name>de.julielab.jules.types.PennPOSTag</name>
<description></description>
<supertypeName>de.julielab.jules.types.POSTag</supertypeName>
<features>
<featureDescription>
<name>value</name>
<description></description>
<rangeTypeName>de.julielab.jules.types.PennPOS</rangeTypeName>
</featureDescription>
</features>
</typeDescription>

<typeDescription>
<name>de.julielab.jules.types.PennPOS</name>
<description></description>
<supertypeName>uima.cas.String</supertypeName>
<allowedValues>
<value>
<string>CC</string>
<description>Coordinating conjunction</description>
</value>
<value>
<string>CD</string>
<description>Cardinal number</description>
</value>
<value>
<string>DT</string>
<description>Determiner</description>
</value>
</allowedValues>
</typeDescription>
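
Editor's note: to keep a manual check in sync with the descriptor, the allowed values could be read from the XML instead of being hard-coded. The sketch below uses only the JDK's JAXP parser on a fragment like the one above; AllowedValuesReader and read are hypothetical names, and this is not how UIMA itself loads type systems. A fragment with unmatched tags, like the first posting, is rejected by the parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: reads the <string> entries of a type description with plain JAXP. */
final class AllowedValuesReader {

    private AllowedValuesReader() {}

    /** Returns the text of every <string> element; throws if the XML is not well-formed. */
    static List<String> read(String typeDescriptionXml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(typeDescriptionXml)));
            NodeList strings = doc.getElementsByTagName("string");
            List<String> values = new ArrayList<>();
            for (int i = 0; i < strings.getLength(); i++) {
                values.add(strings.item(i).getTextContent().trim());
            }
            return values;
        } catch (Exception e) {
            // Covers parser configuration, I/O and well-formedness errors.
            throw new IllegalArgumentException("could not read type description", e);
        }
    }

    public static void main(String[] args) {
        String xml =
                "<typeDescription>"
              + "<name>de.julielab.jules.types.PennPOS</name>"
              + "<supertypeName>uima.cas.String</supertypeName>"
              + "<allowedValues>"
              + "<value><string>CC</string><description>Coordinating conjunction</description></value>"
              + "<value><string>CD</string><description>Cardinal number</description></value>"
              + "<value><string>DT</string><description>Determiner</description></value>"
              + "</allowedValues>"
              + "</typeDescription>";
        System.out.println(read(xml)); // prints [CC, CD, DT]
    }
}
```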


With best regards

E. Buyko

Quoting Marshall Schor <ms...@schor.com>:

> Hi - it seems the descriptor (at least as posted) is invalid: it 
> appears to have unmatched XML tags in it.
>
> Assuming these were typos, can you post which version of UIMA you're 
> seeing this issue with?
>
> -Marshall
>
> Katja Buyko wrote:
>> <typeDescription>
>>   <name>de.julielab.jules.types.PennPOS</name>
>>   <description></description>
>>   <supertypeName>uima.cas.String</supertypeName>
>>   <allowedValues>
>>     <value>
>>       <string>CC</string>
>>       <description>Coordinating conjunction</description>
>>     </value>
>>     <value>
>>       <string>CD</string>
>>       <description>Cardinal number</description>
>>     </value>
>>     <value>
>>       <string>DT</string>
>>       <description>Determiner</description>
>>     </value>
>> <value>  <!-- APPEARS TO BE EXTRA??? -->
>> </allowedValues>
>> </typeDescription>
>> <typeDescription>   <!-- APPEARS TO BE EXTRA ??? -->
>
>>
>> Annotator:
>>
>> posTag = "CD";
>> PennPOSTag pos = new PennPOSTag(aJCas, token.getBegin(),token.getEnd() );
>> ((PennPOSTag)pos).setValue(posTag);
>>
>>
>> With best regards
>>
>> E. Buyko





Re: String Subtypes (restricted set of allowed values)

Posted by Marshall Schor <ms...@schor.com>.
Hi - it seems the descriptor (at least as posted) is invalid: it 
appears to have unmatched XML tags in it.

Assuming these were typos, can you post which version of UIMA you're 
seeing this issue with?

-Marshall

Katja Buyko wrote:
> <typeDescription>
>   <name>de.julielab.jules.types.PennPOS</name>
>   <description></description>
>   <supertypeName>uima.cas.String</supertypeName>
>   <allowedValues>
>     <value>
>       <string>CC</string>
>       <description>Coordinating conjunction</description>
>     </value>
>     <value>
>       <string>CD</string>
>       <description>Cardinal number</description>
>     </value>
>     <value>
>       <string>DT</string> 
>       <description>Determiner</description>
>     </value>
> <value>  <!-- APPEARS TO BE EXTRA??? -->
> </allowedValues>
> </typeDescription>
> <typeDescription>   <!-- APPEARS TO BE EXTRA ??? -->

>
> Annotator:
>
> posTag = "CD";
> PennPOSTag pos = new PennPOSTag(aJCas, token.getBegin(),token.getEnd() );
> ((PennPOSTag)pos).setValue(posTag);
>
>
> With best regards
>
> E. Buyko