
Posted to user@uima.apache.org by Anton Shuster <br...@gmail.com> on 2009/12/08 22:29:24 UTC

Deserializing using an extended type system

Hello all,

I read in the documentation that, in general, you have to use the same
type system for serializing and deserializing. However, it seems that
this requirement is not exact. I was able to do the following:

1. set up a type system with type A containing feature A1, type B
containing feature B1, type C containing feature C1, and type D with
no features.
2. process a document with an annotator that creates annotations of
type A, B, and C.
3. serialize to a file
4. modify the type system by adding a feature A2 to type A.
5. deserialize from the file
6. continue processing the document with another annotator that
creates annotations of type D.
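
For readers following along, steps 3 and 5 above might look roughly like the sketch below. It assumes the file format is XMI (the format mentioned later in this thread) and uses the static helpers on XmiCasSerializer and XmiCasDeserializer. The class name XmiFileRoundTrip and its two methods are made up for illustration; the CAS passed to load() is assumed to have been created from the modified type system.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.uima.cas.CAS;
import org.apache.uima.cas.impl.XmiCasDeserializer;
import org.apache.uima.cas.impl.XmiCasSerializer;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of UIMA: wraps the XMI calls for steps 3 and 5.
public final class XmiFileRoundTrip {

    private XmiFileRoundTrip() {
    }

    /** Step 3: write the whole CAS to a file as XMI. */
    public static void save(CAS cas, Path file) throws IOException, SAXException {
        try (OutputStream out = Files.newOutputStream(file)) {
            XmiCasSerializer.serialize(cas, out);
        }
    }

    /**
     * Step 5: read the XMI back into a CAS. The target CAS carries its own
     * type system, which may be the extended one (e.g. type A with the
     * additional feature A2). This is the strict, non-lenient variant.
     */
    public static void load(Path file, CAS target) throws IOException, SAXException {
        try (InputStream in = Files.newInputStream(file)) {
            XmiCasDeserializer.deserialize(in, target);
        }
    }
}
```

The point of the sketch is that the deserializer works against whatever type system the target CAS was created with, which is why a type system that only adds things can still read the old file.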

My question is, how flexible is deserializing with respect to modified
type systems? I can't seem to find any documentation on this. In
particular, I'm interested in the following cases:
- types are added
- types are removed
- features are added to some types
- features are removed from some types

Thanks for your help!
--Anton

Re: Deserializing using an extended type system

Posted by Anton Shuster <br...@gmail.com>.

> > - features are removed from some types
> 
> If there are annotations/FSs in your CAS that use those features,
> I believe that will result in a runtime exception.  Not even the
> "lenient" flag will save you there.  Not 100% sure though, you
> may want to try it out or look at the code.

I did a little more testing and just wanted to confirm that
deserializing with a type system where a feature has been removed
from a type works, as long as you use the lenient flag when the XMI
contains annotations for that feature.

--Anton

Re: Deserializing using an extended type system

Posted by Thilo Goetz <tw...@gmx.de>.

On 12/8/2009 22:29, Anton Shuster wrote:
> Hello all,
> 
> I read in the documentation that, in general, you have to use the same
> type system for serializing and deserializing. However, it seems that
> this requirement is not exact. I was able to do the following:
> 
> 1. set up a type system with type A containing feature A1, type B
> containing feature B1, type C containing feature C1, and type D with
> no features.
> 2. process a document with an annotator that creates annotations of
> type A, B, and C.
> 3. serialize to a file
> 4. modify the type system by adding a feature A2 to type A.
> 5. deserialize from the file
> 6. continue processing the document with another annotator that
> creates annotations of type D.
> 
> My question is, how flexible is deserializing with respect to modified
> type systems? I can't seem to find any documentation on this. In
> particular, I'm interested in the following cases:
> - types are added

I think that's fine.

> - types are removed

That's fine as long as you have no annotations/FSs of that type
in the CAS you're deserializing.  If you do, you can specify
the "lenient" flag in the deserializer, see
http://incubator.apache.org/uima/downloads/releaseDocs/2.2.2-incubating/docs/api/org/apache/uima/cas/impl/XmiCasDeserializer.html#deserialize%28java.io.InputStream,%20org.apache.uima.cas.CAS,%20boolean%29
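
A minimal sketch of calling that three-argument overload follows. The method deserialize(InputStream, CAS, boolean) is the one the link above points to; the wrapper class LenientXmiLoad and its method name are made up for illustration, and the target CAS is assumed to have been created from the reduced type system.

```java
import java.io.IOException;
import java.io.InputStream;

import org.apache.uima.cas.CAS;
import org.apache.uima.cas.impl.XmiCasDeserializer;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of UIMA: shows where the lenient flag goes.
public final class LenientXmiLoad {

    private LenientXmiLoad() {
    }

    /**
     * Deserializes XMI into the target CAS with the lenient flag set, so
     * that content whose type is unknown to the target type system is
     * skipped rather than reported as an error.
     */
    public static void loadLenient(InputStream xmi, CAS target)
            throws IOException, SAXException {
        XmiCasDeserializer.deserialize(xmi, target, true);
    }
}
```

Passing false instead of true gives the strict behaviour, where data of a type the target type system does not know is expected to end in an exception.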

> - features are added to some types

That's fine I think.

> - features are removed from some types

If there are annotations/FSs in your CAS that use those features,
I believe that will result in a runtime exception.  Not even the
"lenient" flag will save you there.  Not 100% sure though, you
may want to try it out or look at the code.

--Thilo

> 
> Thanks for your help!
> --Anton