Posted to user@uima.apache.org by Alain Désilets <al...@gmail.com> on 2018/12/15 12:20:33 UTC

Is it possible to define dynamically typed annotations?

Is it possible to create dynamically typed annotations in UIMA? In other
words, would it be possible for users of my system to create a new type of
annotation without having to recompile the Java code?

I need this functionality so that non-dev users can define new types of
Named Entities and train a model that can recognize them without having to
recompile the code.

I suspect the answer is no, because all annotation types correspond to a
Java class. True, those classes are defined in an XML file, but in order to
use them you have to generate the Java code from the XML and recompile your
code.

If UIMA does not yet have something that supports dynamic annotations, I
will have to implement one myself. What I have in mind is to define a
subclass of Annotation called, say, DynamicallyTypedAnnotation, which would
have two new member variables:

    String typeName = null;
    Map<String,Object> attributes = new HashMap<String,Object>();

The 'typeName' variable would correspond to the type of the annotation (ex:
"Room Number" for an annotation that captures the number of a room) and the
'attributes' variable would allow storage of arbitrary information about
the annotation.

Does that make sense?
Thx

Re: Is it possible to define dynamically typed annotations?

Posted by Alain Désilets <al...@gmail.com>.
> There is never a need to recompile as long as you simply stick to the CAS
> API. It is possible to write a piece of Java code that sets up a type
> system, a CAS, adds annotations, etc. without ever having to run JCasGen.


Thx Richard. BTW, your Inception project sounds really interesting:

[2] https://inception-project.github.io


I'll definitely give it a spin. It looks very much like what I was about to
develop to help with the growing number of IE projects here at NRC.

Alain

Re: Is it possible to define dynamically typed annotations?

Posted by Richard Eckart de Castilho <re...@apache.org>.
On 16. Dec 2018, at 01:56, Alain Désilets <al...@gmail.com> wrote:
> 
> I am not sure I understand what you wrote. Although I have been using UIMA
> for 2 years now, I am still baffled by it most of the time ;-).
> 
> It SOUNDS like you are saying that it's possible to add new types in the
> XML typesystem file, and tell a RUNNING application to reload the XML file
> without having to recompile that application. Is that correct?

Yes. 

There is never a need to recompile as long as you simply stick to the CAS API.
It is possible to write a piece of Java code that sets up a type system, a
CAS, adds annotations, etc. without ever having to run JCasGen. E.g.

```
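import org.apache.uima.cas.CAS;
import org.apache.uima.cas.Feature;
import org.apache.uima.cas.Type;
import org.apache.uima.cas.text.AnnotationFS;
import org.apache.uima.resource.metadata.TypeDescription;
import org.apache.uima.resource.metadata.TypeSystemDescription;
import org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl;
import org.apache.uima.util.CasCreationUtils;

// Define type system: one annotation type with a single integer feature
TypeSystemDescription tsd = new TypeSystemDescription_impl();
TypeDescription td = tsd.addType("TestType", "", CAS.TYPE_NAME_ANNOTATION);
td.addFeature("value", "", CAS.TYPE_NAME_INTEGER);

// Create CAS and initialize it with the prev. created TS
CAS cas = CasCreationUtils.createCas(tsd, null, null);

// Add annotation to the CAS, looking up type and feature by name at runtime
Type type = cas.getTypeSystem().getType("TestType");
Feature feat = type.getFeatureByBaseName("value");
AnnotationFS fs = cas.createAnnotation(type, 0, 10);
fs.setIntValue(feat, 42); // any int value; 42 is just an example
cas.addFsToIndexes(fs);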
```

The context in which I am using the CAS reinitialization approach that I described
in my previous mail is annotation editors (i.e. WebAnno [1] and INCEpTION [2]).

Annotation editors are hardly useful if they only support hard-coded types,
i.e. if you need to recompile them to support custom types.

Both editors allow the user to configure their type system through a web-based UI. 
Internally, they represent the annotations in UIMA CAS objects which are persisted
in different ways.

When a user opens a document (i.e. loads a UIMA CAS object), the editor needs to make
sure that the CAS is compatible with whatever type system the user has defined. E.g.
the user might have added new types or features since the document was last opened, or
might even have removed some. Note that the editors do not permit changing the type
of a feature (but the user could remove and re-add it).

The way to ensure that the CAS object is compatible with the current type system is:

* load the CAS object from its persistence format
* serialize it in-memory into the "compressed binary" format
* (re-)initialize the CAS with the current type system
* deserialize the CAS back into the (re-)initialized CAS

The last step is lenient and discards any types/features no longer present
in the current type system.
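
To make the effect of that lenient last step concrete, here is a toy model in
plain Java. It is only a sketch under stated assumptions: it does not use any
UIMA API, and LenientLoadSketch, Record and loadLeniently are hypothetical
names invented for illustration. The target type system is modelled as a map
from type name to the set of feature names that type declares.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of lenient loading; NOT UIMA code, all names are hypothetical. */
public class LenientLoadSketch {

    /** A stored feature structure: a type name plus feature name/value pairs. */
    public static final class Record {
        public final String type;
        public final Map<String, Object> features;

        public Record(String type, Map<String, Object> features) {
            this.type = type;
            this.features = features;
        }
    }

    /**
     * Keeps only records whose type still exists in the target type system and,
     * within those, only the features that the target type still declares.
     */
    public static List<Record> loadLeniently(List<Record> stored,
            Map<String, Set<String>> targetTypeSystem) {
        List<Record> loaded = new ArrayList<>();
        for (Record r : stored) {
            Set<String> declared = targetTypeSystem.get(r.type);
            if (declared == null) {
                continue; // type was removed: drop the whole instance
            }
            Map<String, Object> kept = new LinkedHashMap<>();
            for (Map.Entry<String, Object> e : r.features.entrySet()) {
                if (declared.contains(e.getKey())) {
                    kept.put(e.getKey(), e.getValue()); // feature still declared
                }
            }
            loaded.add(new Record(r.type, kept));
        }
        return loaded;
    }
}
```

Features that were newly added to a type simply have no stored value in this
model, which is why adding types or features is harmless while removing them
loses data.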

Incidentally, the DKPro Core XMI Reader [3] uses the same approach in order to be able
to initialize the CAS from a type system file *while* a pipeline is being
executed. Normally the type system would need to be fixed *before* a pipeline
is executed.

It works for me, but it has its limits. E.g. such an approach is
not viable in a UIMA-AS setup (Jerry may correct me if I am wrong).

There have been thoughts running around from time to time of relaxing the
"committing" of the type system in the CAS. I believe that theoretically, it
may be possible to permit certain modifications to the type system even
after it has been "committed", i.e. within certain constraints, adding
new types and adding new features may be possible - but Marshall
can probably say more about this. One constraint would probably again be
that such a feature could not be used in distributed setups (at least not
without quite a bit of refactoring of the scale-out tools).

Cheers,

-- Richard

[1] https://github.com/webanno/webanno
[2] https://inception-project.github.io
[3] https://github.com/dkpro/dkpro-core/blob/master/dkpro-core-io-xmi-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/io/xmi/XmiReader.java#L117

Re: Is it possible to define dynamically typed annotations?

Posted by Alain Désilets <al...@gmail.com>.
I am not sure I understand what you wrote. Although I have been using UIMA
for 2 years now, I am still baffled by it most of the time ;-).

It SOUNDS like you are saying that it's possible to add new types in the
XML typesystem file, and tell a RUNNING application to reload the XML file
without having to recompile that application. Is that correct?

If so, I don't see a need for Dynamically typed annotations after all.



Re: Is it possible to define dynamically typed annotations?

Posted by Richard Eckart de Castilho <re...@apache.org>.
Hi folks

> On 15. Dec 2018, at 13:41, Nicolas Paris <ni...@riseup.net> wrote:
> 
> On Sat, Dec 15, 2018 at 07:20:33AM -0500, Alain Désilets wrote:
>> Is it possible to create dynamically typed annotations in UIMA? In other
>> words, would it be possible for users of my system to create a new type of
>> annotation without having to recompile the Java code?

My take on the problem is to redefine the CAS in-place. The following
code is used by the WebAnno annotation editor to handle the case where
the user modifies the type system:

---

    /**
     * Load the contents from the source CAS, upgrade it to the target type system and write the
     * results to the target CAS. An in-place upgrade can be achieved by using the same CAS as
     * source and target.
     */
    private void upgradeCas(CAS aSourceCas, CAS aTargetCas, TypeSystemDescription aTargetTypeSystem)
        throws UIMAException, IOException
    {
        // Save source CAS type system (do this early since we might do an in-place upgrade)
        TypeSystem sourceTypeSystem = aSourceCas.getTypeSystem();

        // Save source CAS contents
        ByteArrayOutputStream serializedCasContents = new ByteArrayOutputStream();
        Serialization.serializeWithCompression(aSourceCas, serializedCasContents, sourceTypeSystem);

        // Re-initialize the target CAS with new type system
        CAS tempCas = JCasFactory.createJCas(aTargetTypeSystem).getCas();
        CASCompleteSerializer serializer = Serialization.serializeCASComplete((CASImpl) tempCas);
        Serialization.deserializeCASComplete(serializer, (CASImpl) aTargetCas);

        // Leniently load the source CAS contents into the target CAS
        CasIOUtils.load(new ByteArrayInputStream(serializedCasContents.toByteArray()), aTargetCas,
                sourceTypeSystem);

        // Make sure JCas is properly initialized too
        aTargetCas.getJCas();
    }

---

This procedure takes a while, so it shouldn't be done often. It also discards any non-reachable
feature structures, as well as any information that is not compatible with the target type
system (within any limits that lenient CAS loading may impose). But it works.

Basically you call it with 

          upgradeCas(aCas, aCas, aTargetTypeSystem);

in order to perform an in-place upgrade of a single CAS to the given type system.

Cheers,

-- Richard

Re: Is it possible to define dynamically typed annotations?

Posted by Nicolas Paris <ni...@riseup.net>.
Hi Alain,

On Sat, Dec 15, 2018 at 07:20:33AM -0500, Alain Désilets wrote:
> Is it possible to create dynamically typed annotations in UIMA? In other
> words, would it be possible for users of my system to create a new type of
> annotation without having to recompile the Java code?

I am also very interested in such a feature. Right now, my solution is to
generate a large number of annotation types in advance (say, anno1 to
anno1000); once a user creates their annotation, it is mapped to one of them
internally.

This solution avoids recompiling the program, but it is limited since it
does not allow users to add annotation features or complex annotations.
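
The internal mapping could look roughly like the following plain-Java sketch.
TypeSlotPool and slotFor are hypothetical names, and the "anno" + number
naming simply follows the anno1 to anno1000 example above; persisting the
mapping between runs is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Maps user-defined annotation names onto a fixed pool of pre-generated type names. */
public class TypeSlotPool {
    private final int capacity;
    private final Map<String, String> slotByUserType = new LinkedHashMap<>();

    public TypeSlotPool(int capacity) {
        this.capacity = capacity;
    }

    /** Returns the pre-generated type name for a user type, claiming a free slot on first use. */
    public String slotFor(String userTypeName) {
        String slot = slotByUserType.get(userTypeName);
        if (slot == null) {
            if (slotByUserType.size() >= capacity) {
                throw new IllegalStateException(
                        "All " + capacity + " pre-generated types are in use");
            }
            slot = "anno" + (slotByUserType.size() + 1); // anno1, anno2, ...
            slotByUserType.put(userTypeName, slot);
        }
        return slot;
    }

    public static void main(String[] args) {
        TypeSlotPool pool = new TypeSlotPool(1000);
        System.out.println(pool.slotFor("Room Number")); // prints anno1
        System.out.println(pool.slotFor("Building"));    // prints anno2
        System.out.println(pool.slotFor("Room Number")); // prints anno1 again
    }
}
```

The fixed capacity is the limitation in code form: once the pool is
exhausted, no further user types can be created without regenerating types.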


-- 
nicolas

Re: Is it possible to define dynamically typed annotations?

Posted by Alain Désilets <al...@gmail.com>.
On Sat, Dec 15, 2018 at 1:15 PM Marshall Schor <ms...@schor.com> wrote:

> I guess the question is why have a new type?  The answer to that could
> motivate what properties the solution should have.
>
> What you propose is fine, but in some ways is not a new type, in that it
> doesn't seem to have many of the properties UIMA types have.
>
>     If that is OK in your application, then that's fine.

I guess I didn't make that explicit, but I was thinking that
DynamicallyTypedAnnotation would be a subclass of Annotation.  So it would
inherit all the properties and capabilities of Annotation.

>     If not, please say more about what properties of types you want to
>     have, that this approach might not satisfy.

Well, basically I want to be able to create new types that have new attributes at
runtime, without having to recompile the code. I also want the ability to
select annotations based on the type name, without having to recompile the
code. The reason I need this is that I am building a web app that will
allow non-dev users to train models to recognize new types of entities,
without having to write Java code. Maybe this is already possible with
UIMA, but all the examples I have seen where new types are defined involve
writing XML code and using Eclipse to recompile the app. In my context,
this sounds impractical because the app would have to somehow recompile and
redeploy itself on Tomcat (and I don't even know if that's possible).

With the approach I propose, the user would just provide a name for the new
type, as well as a list of attribute names. The system would then create
instances of the new type like this:

   Map<String,Object> attrs = new HashMap<String,Object>();
   attrs.put("attName1", null);
   ....
   attrs.put("attNameN", null);
   DynamicallyTypedAnnotation ann = new DynamicallyTypedAnnotation(typeName, attrs);
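
A fuller plain-Java sketch of this idea, including selection by type name,
might look as follows. It rests on assumptions: to stay self-contained the
class does NOT extend UIMA's Annotation, so begin and end are ordinary
fields, and the constructor shape and the select helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java stand-in for the proposed class; it does NOT extend UIMA's Annotation. */
public class DynamicallyTypedAnnotation {
    private final String typeName;
    private final int begin;
    private final int end;
    private final Map<String, Object> attributes;

    public DynamicallyTypedAnnotation(String typeName, int begin, int end,
            Map<String, Object> attributes) {
        this.typeName = typeName;
        this.begin = begin;
        this.end = end;
        this.attributes = new HashMap<>(attributes); // defensive copy, null values allowed
    }

    public String getTypeName() { return typeName; }
    public int getBegin() { return begin; }
    public int getEnd() { return end; }
    public Object getAttribute(String name) { return attributes.get(name); }

    /** Selects annotations by their runtime type name (a linear scan, no index). */
    public static List<DynamicallyTypedAnnotation> select(
            List<DynamicallyTypedAnnotation> all, String typeName) {
        List<DynamicallyTypedAnnotation> result = new ArrayList<>();
        for (DynamicallyTypedAnnotation ann : all) {
            if (ann.getTypeName().equals(typeName)) {
                result.add(ann);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> attrs = new HashMap<>();
        attrs.put("floor", 3);
        List<DynamicallyTypedAnnotation> all = new ArrayList<>();
        all.add(new DynamicallyTypedAnnotation("Room Number", 0, 4, attrs));
        all.add(new DynamicallyTypedAnnotation("Building", 5, 8, new HashMap<>()));
        System.out.println(select(all, "Room Number").size()); // prints 1
    }
}
```

Selection here has to scan every annotation, and there is no type hierarchy,
so a query for a supertype would not find instances of its subtypes.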

Alain

Re: Is it possible to define dynamically typed annotations?

Posted by Marshall Schor <ms...@schor.com>.
I guess the question is why have a new type?  The answer to that could motivate
what properties the solution should have.

What you propose is fine, but in some ways is not a new type, in that it doesn't
seem to have many of the properties UIMA types have. 

    If that is OK in your application, then that's fine. 

    If not, please say more about what properties of types you want to have,
    that this approach might not satisfy.

Here are some examples of what having types provides:

1) a type hierarchy - subtypes have features inherited from super types.

2) a way to have "indexes" which provide access to a type (and its subtypes)
instances in the CAS.

3) a way to have getters / setters, with special versions for "array" types that
give access to elements

4) for some types, a way to "order" them in the CAS.  For instance, if a type is
a subtype of "Annotation", it gets (via inheritance) a begin and end "feature",
and there's a built-in index that is sorted, making use of these features (and
also making use of "type priority" ordering).

Note that if you don't need this for your types, then they should *not* be
subtypes of Annotation.

5) a way to serialize / deserialize (in several formats) for storage and
transmission (for instance, when some annotators in a pipeline are remote
services). 

Your suggestion to have a general Map<String,Object> might be an issue for
serialization / deserialization.
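
A plain-Java analogy may help show why. The sketch below uses Java's built-in
object serialization, not any UIMA format, and MapSerializationDemo and
canSerialize are hypothetical names: because the value type is Object, the
map can hold values that the serializer has no way to write.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

/** Analogy only: Java object serialization, not a UIMA serialization format. */
public class MapSerializationDemo {

    /** Returns true if Java's built-in serialization can write the given map. */
    public static boolean canSerialize(Map<String, Object> attributes) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(attributes);
            return true;
        } catch (NotSerializableException e) {
            return false; // some value in the map cannot be written
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> simple = new HashMap<>();
        simple.put("roomNumber", 42);

        Map<String, Object> opaque = new HashMap<>();
        opaque.put("handle", new Object()); // java.lang.Object is not Serializable

        System.out.println(canSerialize(simple)); // prints true
        System.out.println(canSerialize(opaque)); // prints false
    }
}
```

A feature declared with a fixed range type (integer, string, and so on) does
not have this problem, because its range of possible values is known up front.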

-Marshall


Re: Is it possible to define dynamically typed annotations?

Posted by Richard Eckart de Castilho <re...@apache.org>.
On 15. Dec 2018, at 13:20, Alain Désilets <al...@gmail.com> wrote:
> 
> Is it possible to create dynamically typed annotations in UIMA? In other
> words, would it be possible for users of my system to create a new type of
> annotation without having to recompile the Java code?

Alain also posted this question on the DKPro Core user's mailing list.

Part of my answer can be found there:

  https://groups.google.com/d/msg/dkpro-core-user/irvvRr0MaBo/IIabZSslCgAJ

In short: I personally tend towards dynamically changing the type system,
redefining the CAS using the code I posted in my other mail to this list
and using the CAS API in order to work with the CAS. Works nicely for
any types that are not known at compile time.

-- Richard