Posted to user@uima.apache.org by Richard Eckart de Castilho <re...@apache.org> on 2016/12/01 01:23:23 UTC

Re: newbie questions about UIMA Types

Yes, it is possible to customize the generated JCas classes. You can, for example, add your own methods or even your own fields. However, such fields are not saved or loaded when you persist a CAS, e.g. to XMI.

As an example of a custom method, consider the DKPro Core Token "setText(string)" method [1].
If the "string" passed to the method differs from the covered text of the Token, a new
"Form" annotation with the value "string" is created and linked to the Token.
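
To make the idea concrete, here is a minimal plain-Java sketch of that behaviour. It deliberately does not use UIMA or DKPro Core: SimpleToken and SimpleForm are hypothetical stand-ins for the generated Token class and the Form annotation, and the logic follows only the description above, not the linked source. Clearing the form when the text matches again is an assumption of this sketch.

```java
import java.util.Objects;

public class TokenSketch {
    /** Hypothetical stand-in for a "Form" annotation holding an alternative surface form. */
    static final class SimpleForm {
        final String value;

        SimpleForm(String value) {
            this.value = value;
        }
    }

    /** Hypothetical stand-in for a generated Token class with one hand-written method. */
    static final class SimpleToken {
        private final String documentText;
        private final int begin;
        private final int end;
        private SimpleForm form; // null while the token text equals the covered text

        SimpleToken(String documentText, int begin, int end) {
            this.documentText = documentText;
            this.begin = begin;
            this.end = end;
        }

        /** The span of the document that this token covers. */
        String getCoveredText() {
            return documentText.substring(begin, end);
        }

        /** Custom method: only create a form when the text differs from the covered text. */
        void setText(String text) {
            if (Objects.equals(text, getCoveredText())) {
                form = null; // assumption of this sketch: no form needed when texts match
            } else {
                form = new SimpleForm(text);
            }
        }

        /** Returns the form value if one is linked, otherwise the covered text. */
        String getText() {
            return form != null ? form.value : getCoveredText();
        }

        SimpleForm getForm() {
            return form;
        }
    }

    public static void main(String[] args) {
        SimpleToken token = new SimpleToken("Dont panic", 0, 4);
        token.setText("Don't");
        System.out.println(token.getCoveredText() + " -> " + token.getText());
    }
}
```

In a real JCas class, the linked form would be a feature declared in the type system, so it would be persisted with the CAS, unlike a plain Java field.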

Another example is the "links()" method on the DKPro Core CoreferenceChain type [2]. It returns all
elements of the respective coreference chain as a List, sparing the user from manually iterating
over the whole chain to reach every element.
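
The following plain-Java sketch shows the general shape of such a convenience method. It is not the DKPro Core implementation and does not depend on UIMA: Link and Chain are hypothetical stand-ins for a linked coreference structure in which each element points to the next one.

```java
import java.util.ArrayList;
import java.util.List;

public class ChainSketch {
    /** Hypothetical stand-in for one mention in a chain; points to the next mention. */
    static final class Link {
        final String text;
        Link next;

        Link(String text) {
            this.text = text;
        }
    }

    /** Hypothetical stand-in for a chain type that only stores a reference to its first link. */
    static final class Chain {
        Link first;

        /** Custom method: walk the linked structure once and return all links as a List. */
        List<Link> links() {
            List<Link> result = new ArrayList<>();
            for (Link current = first; current != null; current = current.next) {
                result.add(current);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Link ada = new Link("Ada");
        Link she = new Link("she");
        Link her = new Link("her");
        ada.next = she;
        she.next = her;

        Chain chain = new Chain();
        chain.first = ada;

        // Callers get a regular Java List instead of following "next" references themselves.
        for (Link link : chain.links()) {
            System.out.println(link.text);
        }
    }
}
```

The list is rebuilt on every call rather than cached in a field, which keeps the sketch consistent with the point above that extra Java fields are not persisted.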

FSList and friends are built-in types of UIMA Core - you can't modify these. But uimaFIT provides
several methods to make working with these things much more convenient. See

- org.apache.uima.fit.util.FSCollectionFactory and its methods to create FSList etc from Java collections
- org.apache.uima.fit.util.JCasUtil has select methods to retrieve elements from FSList etc
- org.apache.uima.fit.util.FSUtil has methods to conveniently get/set feature values including multi-valued features.

Best,

-- Richard

[1] https://github.com/dkpro/dkpro-core/blob/71fda5c6ba91748b6e87312554e418ac1e2911c6/dkpro-core-api-segmentation-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/api/segmentation/type/Token.java#L313
[2] https://github.com/dkpro/dkpro-core/blob/ba33629fc0f077337f9af39e38e1b58531e1674e/dkpro-core-api-coref-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/api/coref/type/CoreferenceChain.java#L101

> On 30.11.2016, at 20:25, David Fox <Da...@humedica.com> wrote:
> 
> Does the UIMA Java framework support modifying or extending the Java class generated by JCasGen corresponding to a custom Type? If so, are there any common circumstances where this is necessary?
> 
> I didn’t see anything in the examples or documentation about modifying the generated classes, but I also didn’t see anything saying you couldn’t.  I suspect that this is not supported (and that otherwise you wouldn’t be able to pass a CAS between distributed UIMA AS components, or between a Java annotator and a C++ one).  But it would be nice to know for certain.
> 
> The reason I ask is that the set of data structures supported by UIMA types (individual FS references,  FSList linked lists, and FSArray arrays) is fairly limited compared to modern programming languages, which often directly support associative arrays, trees, and graphs.  I’m trying to understand whether this is a restriction on the implementation of custom types (which it would be if modifying/extending the generated class was not supported), or just on the public interface accessible via the UIMA API.
> 
> David

Re: newbie questions about UIMA Types

Posted by David Fox <Da...@humedica.com>.
Thanks for the detailed reply and examples.

I've got some tangentially related questions about types in UIMA C++,
which I hope that either you or someone else can answer:

If you need to use a custom Type in an annotator written with the UIMA C++
SDK, 

1) do you need to define a corresponding custom C++ class (analogous to
the one generated by JCasGen)?
2) if so, is there a comparable CppCasGen, or do you need to write it
manually?

Thanks in advance,
David

