Posted to dev@uima.apache.org by "Richard Eckart de Castilho (JIRA)" <de...@uima.apache.org> on 2016/10/07 17:19:20 UTC

[jira] [Comment Edited] (UIMA-5135) UIMA CasIOUtils enhancements in handling type systems

    [ https://issues.apache.org/jira/browse/UIMA-5135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15555675#comment-15555675 ] 

Richard Eckart de Castilho edited comment on UIMA-5135 at 10/7/16 5:18 PM:
---------------------------------------------------------------------------

It is not uncommon for a single "typesystem.xml" file to be written to the same folder as the XMI files, but I don't feel comfortable baking that convention into CasIOUtils.

We could extend CasIOUtils with a new SerialFormat "XMI_TS"; when that format is used, the tsiOS parameter of the 4-arg save method would be treated as the target for an XML type system.

Regarding UIMA-5120, I wonder whether COMPRESSED_FILTERED_TSI wouldn't be a better option for logging CASes: it already includes the TSI, is more compact than XMI, and is probably faster to read and write.


was (Author: rec):
It is not uncommon that a single "typesystem.xml" file is written to a folder to which XMI files are written.

We could extend CasIOUtils with a new SerialFormat "XMI_TS" and if that is used treat the tsiOS parameter of the 4-arg save method as the target of an XML TS.

Regarding UIMA-5120, I wonder if using COMPRESSED_FILTERED_TSI wouldn't be a better option for logging CASes. It already includes the TSI, it is more compact than XMI and probably faster to read/write.

> UIMA CasIOUtils enhancements in handling type systems
> -----------------------------------------------------
>
>                 Key: UIMA-5135
>                 URL: https://issues.apache.org/jira/browse/UIMA-5135
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>            Reporter: Marshall Schor
>            Priority: Minor
>             Fix For: 3.0.0SDKexp, 2.9.1SDK
>
>
> A recent Jira, UIMA-5120, involved logging CASes to file system directories, including a type system.
> It would be good to have a conventional, supported way to do this common kind of operation, added to CasIOUtils.
> Additionally, it would be good to support as an alternative the standard XML serialization format for type systems.
> Some possible conventions:  
> * multiple cas files, in 1 directory, with one additional file with the name "typesystem.xml".
> * the above style, in one zip file (for example, to be able to read it, one cas at a time, via some iterator).
> * finding a type system via the class path following uimaFIT conventions
> One factor that probably is important is to store the type system for this kind of thing "close to" the serialized forms it applies to.
> It would be possible of course to support multiple conventions.  However, the more conventions, the less benefit from "standardization", so this ought to be a balance.
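
To make the zip-file convention from the list above concrete, here is a minimal sketch using only java.util.zip. It is not part of CasIOUtils and makes no claim about how UIMA would implement this: the class and method names (CasBundleSketch, writeBundle, readBundle) are hypothetical, and the CAS payloads are treated as opaque byte arrays that would in practice come from a serializer such as CasIOUtils.save. Only the entry name "typesystem.xml" is taken from the issue text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration only; not a UIMA API.
class CasBundleSketch {
    // Conventional entry name mentioned in the issue description.
    static final String TYPE_SYSTEM_ENTRY = "typesystem.xml";

    /** Packs one type system and any number of serialized CASes into one zip archive. */
    static byte[] writeBundle(byte[] typeSystemXml, Map<String, byte[]> casesByName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            // Write the type system first so a streaming reader sees it before any CAS.
            putEntry(zip, TYPE_SYSTEM_ENTRY, typeSystemXml);
            for (Map.Entry<String, byte[]> e : casesByName.entrySet()) {
                putEntry(zip, e.getKey(), e.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    private static void putEntry(ZipOutputStream zip, String name, byte[] data) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(data);
        zip.closeEntry();
    }

    /** Reads the archive sequentially, one entry at a time, preserving stored order. */
    static Map<String, byte[]> readBundle(byte[] archive) {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                // read() returns -1 at the end of the current entry, not of the whole archive.
                while ((n = zip.read(chunk)) != -1) {
                    content.write(chunk, 0, n);
                }
                entries.put(entry.getName(), content.toByteArray());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }

    public static void main(String[] args) {
        Map<String, byte[]> cases = new LinkedHashMap<>();
        // Placeholder payloads standing in for real serialized CASes.
        cases.put("doc1.xmi", "cas-1".getBytes(StandardCharsets.UTF_8));
        cases.put("doc2.xmi", "cas-2".getBytes(StandardCharsets.UTF_8));
        byte[] archive = writeBundle("type-system".getBytes(StandardCharsets.UTF_8), cases);
        for (String name : readBundle(archive).keySet()) {
            System.out.println(name);
        }
    }
}
```

Storing the type system as the first entry keeps it "close to" the CASes it applies to and lets a reader load it once before iterating over the remaining entries. A real reader would presumably deserialize each CAS as it is encountered rather than collect all entries in a map.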


