Posted to dev@uima.apache.org by "Marshall Schor (JIRA)" <de...@uima.apache.org> on 2017/10/03 20:17:00 UTC

[jira] [Commented] (UIMA-5601) uv3: CasCopier problems with custom subclasses of DocumentAnnotation

    [ https://issues.apache.org/jira/browse/UIMA-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190270#comment-16190270 ] 

Marshall Schor commented on UIMA-5601:
--------------------------------------

I think this Jira opens a new area for detailed design.  I suspect the issue raised also exists in V2 (perhaps in a somewhat different form).

The framework creates DocumentAnnotation instances automatically when needed, if one does not already exist.  For instance, if you create a new CAS and then call aCas.setDocumentLanguage(...), a new document annotation instance is created.

If you define a type which is a subtype-of-documentAnnotation, there's currently no way to "register" this type as the type that should be used when a new instance is created automatically.  The best that can be done currently, I think, is to ensure the user code creates an instance of your subtype-of-documentAnnotation in each view where it might be used, before it is needed.

I presume this is what is happening in your use case above, is this correct?

Before considering the CasCopier, I wonder what the UIMA framework should do in this case, when the user has not previously created a subtype-of-documentAnnotation, but the type system indicates that such a type (or even several) exists.

If there were just one, the system could create an instance of the lowest-in-the-type-tree subtype. (Note: this would be a design change for UIMA).  This wouldn't work for a case where there were 2 or more subtypes at the maximal depth in the type inheritance hierarchy.
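
To make the selection rule concrete, here is a minimal sketch in plain Java with no UIMA dependency. The type tree is modelled as a child-to-parent map, and the class name, method names, and short type names are all illustrative assumptions, not framework API. It returns the single deepest type below the root, or nothing when two or more types tie at the maximal depth.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of UIMA: models a type tree as child -> parent names.
final class DeepestSubtypeSketch {

  /** Returns how many levels 'type' sits below 'root', or -1 if it does not inherit from root. */
  static int depthBelow(Map<String, String> parentOf, String type, String root) {
    int depth = 0;
    String current = type;
    while (current != null) {          // assumes the parent map has no cycles
      if (current.equals(root)) {
        return depth;
      }
      current = parentOf.get(current);
      depth++;
    }
    return -1;
  }

  /**
   * Returns the only type at maximal depth below root (root itself if it has no subtypes),
   * or empty when two or more types share the maximal depth.
   */
  static Optional<String> deepestUniqueSubtype(Map<String, String> parentOf, String root) {
    int maxDepth = 0;
    List<String> deepest = new ArrayList<>();
    deepest.add(root);
    for (String type : parentOf.keySet()) {
      int depth = depthBelow(parentOf, type, root);
      if (depth > maxDepth) {
        maxDepth = depth;
        deepest.clear();
        deepest.add(type);
      } else if (depth == maxDepth && depth > 0) {
        deepest.add(type);             // tie at the current maximal depth
      }
    }
    return deepest.size() == 1 ? Optional.of(deepest.get(0)) : Optional.empty();
  }
}
```

With only DocumentMetaData below DocumentAnnotation the helper picks DocumentMetaData; adding a second direct subtype makes the result empty, which is the ambiguous case described above.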

Another approach could be to have the system do what it now does: create an instance of DocumentAnnotation when needed (if it doesn't exist), but modify the code that creates new feature structures to "special case" making instances of subtypes of DocumentAnnotation:
  
# if an instance of subtype-of-documentAnnotation already existed, replace the previous one or throw an error?
# if any instance of DocumentAnnotation or a subtype of it exists, delete it and replace it with a new instance of the subtype-of-documentAnnotation, copying the fields of the previous instance into the new one (assuming the features exist).

This would ensure there's only one DocumentAnnotation (or subtype) instance per view.
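
A rough sketch of that special case, again in plain Java rather than against the real CAS implementation. The Fs and View classes are toy stand-ins I made up for illustration. For the first list item the sketch throws, which is only one of the two options raised there.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not UIMA API: a view that holds at most one document annotation.
final class DocAnnotationReplaceSketch {
  static final String BASE_TYPE = "DocumentAnnotation";

  /** Toy feature structure: a type name plus feature values keyed by feature name. */
  static final class Fs {
    final String type;
    final Map<String, Object> features = new HashMap<>();
    Fs(String type) { this.type = type; }
  }

  /** Toy view; documentAnnotation stays null until an instance is created. */
  static final class View {
    Fs documentAnnotation;
  }

  /**
   * Creates an instance of the given subtype in the view. An existing plain
   * DocumentAnnotation is replaced, and its feature values are copied when the
   * subtype defines a feature of the same name. An existing subtype instance
   * is treated as an error (assumption: the "throw" option is chosen).
   */
  static Fs createDocumentAnnotationSubtype(View view, String subtype, Set<String> subtypeFeatures) {
    Fs previous = view.documentAnnotation;
    if (previous != null && !previous.type.equals(BASE_TYPE)) {
      throw new IllegalStateException(
          "View already has a document annotation subtype: " + previous.type);
    }
    Fs created = new Fs(subtype);
    if (previous != null) {
      for (Map.Entry<String, Object> e : previous.features.entrySet()) {
        if (subtypeFeatures.contains(e.getKey())) {
          created.features.put(e.getKey(), e.getValue());
        }
      }
    }
    view.documentAnnotation = created; // the previous instance is dropped from the view
    return created;
  }
}
```

Whichever option is picked for the first case, the view never ends up holding two document annotations at once.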

In the CasCopier case, a further case to consider: what should happen if "lenient" mode is on, and the target Type System doesn't include the subtype-of-documentAnnotation?  Probably the most logical thing to do would be to create an instance of the DocumentAnnotation type in the target, copying those features which are common to the two types.
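
As a sketch of that lenient fallback (plain Java; the class and method names are hypothetical, and the target type system is modelled as a map from type name to its feature names rather than through the real CasCopier API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the lenient fallback; it does not call the real CasCopier.
final class LenientDocAnnotationCopySketch {
  static final String BASE_TYPE = "DocumentAnnotation";

  /** Result of a copy: the type chosen in the target plus the copied feature values. */
  static final class Copied {
    final String type;
    final Map<String, Object> features;
    Copied(String type, Map<String, Object> features) {
      this.type = type;
      this.features = features;
    }
  }

  /**
   * Copies a document annotation (or subtype) into a target type system.
   * If the target lacks the source type, falls back to DocumentAnnotation and
   * copies only the features that the chosen target type also defines.
   */
  static Copied copy(String sourceType,
                     Map<String, Object> sourceFeatures,
                     Map<String, Set<String>> targetTypeSystem) {
    String targetType = targetTypeSystem.containsKey(sourceType) ? sourceType : BASE_TYPE;
    Set<String> allowed = targetTypeSystem.get(targetType);
    if (allowed == null) {
      throw new IllegalArgumentException("Target type system lacks " + BASE_TYPE);
    }
    Map<String, Object> copied = new HashMap<>();
    for (Map.Entry<String, Object> e : sourceFeatures.entrySet()) {
      if (allowed.contains(e.getKey())) {
        copied.put(e.getKey(), e.getValue());
      }
    }
    return new Copied(targetType, copied);
  }
}
```

Features that exist only on the source subtype are silently dropped here, which matches what "lenient" suggests but would still need to be confirmed as the wanted behavior.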

I guess I would lean toward implementing a design change along the above lines.  Because creating multiple subtype-of-documentAnnotation instances seems more likely to be an error than a wanted feature, I would throw an exception for this subcase.

WDYT?

> uv3: CasCopier problems with custom subclasses of DocumentAnnotation
> --------------------------------------------------------------------
>
>                 Key: UIMA-5601
>                 URL: https://issues.apache.org/jira/browse/UIMA-5601
>             Project: UIMA
>          Issue Type: Bug
>          Components: Core Java Framework
>    Affects Versions: 3.0.0SDK-beta
>            Reporter: Richard Eckart de Castilho
>
> It seems as if there may be a bug in the way that CasCopier handles the document annotation.
> Specifically, it seems as if the CasCopier incorrectly handles the case where the target CAS already contains a document annotation. In my case, I do:
> * create the target CAS
> * add a document annotation (DocumentMetaData extends DocumentAnnotation) to the target CAS
> * create the CasCopier with the source and target CAS
> * copy several FSes but *not* the document annotation
> Expected:
> * target CAS contains 1 DocumentMetaData annotation
> Actual:
> * target CAS contains 2 DocumentMetaData annotations
> Also, it seems that `isDocumentAnnotation` may not be able to handle it if a CAS uses a custom subclass of DocumentAnnotation:
> {noformat}
>   private <T extends FeatureStructure> boolean isDocumentAnnotation(T aFS) {
>     if (((TOP)aFS)._getTypeCode() != TypeSystemConstants.docTypeCode) {
>       return false;
>     }
>     if (srcCasDocumentAnnotation == null) {
>       srcCasDocumentAnnotation = srcCasViewImpl.getDocumentAnnotationNoCreate(); 
>     }
>     return aFS == srcCasDocumentAnnotation;
>   }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)