Posted to user@uima.apache.org by James Baker <ja...@gmail.com> on 2017/03/29 17:11:54 UTC

Retrieving annotator back from analysis engine

In my UIMA application, I have a number of AnalysisEngines (as you might
expect). These were created using UIMAFramework.produceAnalysisEngine(...)
on my annotators, which all extend MyAnnotator (which in turn extends
JCasAnnotator_ImplBase).

I want to get from the AnalysisEngine back to the original class (cast to
MyAnnotator) so that I can access some of the additional functions I've
added to the class. However, I can't seem to work out how to do that. Could
someone give some pointers?

For clarity, I've included below some code showing what I'm trying to achieve
(I'm aware that it doesn't work, as I've tried it!)

----------------------------

// Get the analysis engine from wherever it is; this bit's not important
AnalysisEngine ae = getAnalysisEngine();

MyAnnotator ma = (MyAnnotator) ae; // Throws ClassCastException
ma.callMyFunction(); // This is what I'm really trying to get to

----------------------------

Thanks,
James

Re: Retrieving annotator back from analysis engine

Posted by ja...@gmail.com.

I think the issue with both of those approaches is that the information I need is only available after the annotator has been initialised, so I need access to the actual initialised instance of the class.

In the case of the Flow Controller, I think I end up in the same situation, where I don’t have direct access to the annotator. In the case of the Map, as the initialisation is done within produceAnalysisEngine, I’m still not sure I have a route back to the initialised instance of the class (unless I can get that from the ResourceSpecifier?).

I guess I could perhaps add some code in the annotator’s initialize method that stores the required information at an application level, which I could then associate by name with the analysis engine (is there a way to get the analysis engine name from within the annotator?), if there’s no native way within UIMA to do it.
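
A minimal sketch of that fallback idea, using only the Java standard library so that it runs without UIMA on the classpath. Everything here is hypothetical: AnnotatorInfoStore, SketchAnnotator and the String payload are stand-ins, and the sketch assumes the application hands each annotator a unique name (for example as a configuration parameter) rather than answering whether UIMA itself exposes the engine name.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical application-level store; not part of UIMA. */
final class AnnotatorInfoStore {
    private static final Map<String, String> INFO = new ConcurrentHashMap<>();

    private AnnotatorInfoStore() {}

    /** Called by an annotator once it has finished initialising. */
    static void publish(String engineName, String info) {
        INFO.put(engineName, info);
    }

    /** Called by the application code that runs the engines. */
    static Optional<String> lookup(String engineName) {
        return Optional.ofNullable(INFO.get(engineName));
    }
}

/** Stand-in for a MyAnnotator subclass; a real one would extend JCasAnnotator_ImplBase. */
class SketchAnnotator {
    // Stand-in for the annotator's initialize method. The engineName argument is an
    // assumption: it represents a name chosen by the application and passed in.
    void initialize(String engineName) {
        String info = "ready:" + engineName; // whatever only exists after initialisation
        AnnotatorInfoStore.publish(engineName, info);
    }
}
```

The application would then call AnnotatorInfoStore.lookup with the same name it assigned when building the engine. A static store like this is shared by the whole JVM, so names must be unique across engines.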

Out of interest, how big would the modification to UIMA be to allow what I’m trying to do? Are we talking a significant change, or is it small enough that if I were to put it in as a feature request someone might implement it?

James

> On 30 Mar 2017, at 22:26, Marshall Schor <ms...@schor.com> wrote:
> 
> Hi James,
> 
> Although you might be able to find some chain of references to eventually get
> what you want, it would end up being very brittle, subject to change from
> release to release without warning etc.
> 
> A more general approach that's immune to these problems would be to create a
> map in your application code that calls produceAnalysisEngine: on the next
> line, add an entry whose key is the analysis engine and whose value is the
> resourceSpecifier used to create it (which, in your special case, is also the
> subclass of MyAnnotator, as I understand it).
> 
> Then whenever you need to go from the Analysis Engine to the MyAnnotator
> instance, you just look it up in this map.
> 
> -Marshall

Re: Retrieving annotator back from analysis engine

Posted by Marshall Schor <ms...@schor.com>.

Hi James,

Although you might be able to find some chain of references to eventually get
what you want, it would end up being very brittle, subject to change from
release to release without warning etc.

A more general approach that's immune to these problems would be to create a
map in your application code that calls produceAnalysisEngine: on the next
line, add an entry whose key is the analysis engine and whose value is the
resourceSpecifier used to create it (which, in your special case, is also the
subclass of MyAnnotator, as I understand it).

Then whenever you need to go from the Analysis Engine to the MyAnnotator
instance, you just look it up in this map.
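
A minimal sketch of such a map, written against generic type parameters so that it runs without UIMA on the classpath. EngineRegistry and its method names are hypothetical. In the setting discussed here, E would be the AnalysisEngine returned by produceAnalysisEngine and S the ResourceSpecifier passed to it.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper that remembers what each engine was built from. */
final class EngineRegistry<E, S> {
    // Keys are compared by identity, so two distinct engine objects are never
    // confused with each other, whatever their equals() method does.
    private final Map<E, S> sources = new IdentityHashMap<>();

    /** Call this on the line after the engine is produced. */
    void register(E engine, S source) {
        sources.put(engine, source);
    }

    /** Returns the object the engine was built from, if it was registered. */
    Optional<S> sourceOf(E engine) {
        return Optional.ofNullable(sources.get(engine));
    }
}
```

Note that the lookup returns whatever was stored at registration time. It does not give access to state created inside the engine afterwards.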

-Marshall


On 3/30/2017 10:24 AM, James Baker wrote:
> Thanks Marshall,
>
> What I have is each annotator wrapped as a separate analysis engine
> ("pipeline"), and then I'm manually running each of those in turn because I
> want to be able to control the order. In fact, what I'm really trying to
> achieve is controlling the order that the annotators are run in, based on
> information I get back from them.
>
> Surely the analysis engine/resource specifier must have some kind of
> reference back to the original class, otherwise how does it know what code
> to run? Perhaps there's not a method at the moment to get back to the
> original annotator, but is it stored somewhere I could get to via
> reflection (accepting all the risks and bad practices that entails!)?
>
> James

Re: Retrieving annotator back from analysis engine

Posted by Marshall Schor <ms...@schor.com>.

Hi James,

Here's an approach, which may or may not be appropriate:

Instead of wrapping each annotator as a separate analysis engine, make just one
analysis engine with all the annotators in it, as an aggregate, and give that
aggregate a custom flow controller.

Flow controllers were architected for the use case that you describe.

There's an example flow controller in the uimaj-examples project
(WhiteboardFlowController); you can also see it here:
https://svn.apache.org/repos/asf/uima/uimaj/tags/uimaj-2.9.0/uimaj-examples/src/main/java/org/apache/uima/examples/flow/WhiteboardFlowController.java

This flow controller looks in the CAS after each annotator runs and, based on
what it sees has been added, works out which annotator now has its requirements
met, and runs it next.
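
The selection rule described above can be sketched with plain collections. This is not the WhiteboardFlowController source and does not use the UIMA flow controller API: WhiteboardSketch, Step and the type names in the usage note are hypothetical stand-ins for annotators and the types they read and write.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class WhiteboardSketch {
    /** Hypothetical stand-in for an annotator and its declared inputs and outputs. */
    static final class Step {
        final String name;
        final Set<String> inputs;
        final Set<String> outputs;

        Step(String name, Set<String> inputs, Set<String> outputs) {
            this.name = name;
            this.inputs = inputs;
            this.outputs = outputs;
        }
    }

    /** Returns the names of the steps in the order in which they become runnable. */
    static List<String> order(List<Step> steps, Set<String> initiallyPresent) {
        Set<String> present = new HashSet<>(initiallyPresent); // what the "CAS" holds so far
        List<Step> pending = new ArrayList<>(steps);
        List<String> order = new ArrayList<>();
        boolean progressed = true;
        while (progressed && !pending.isEmpty()) {
            progressed = false;
            for (Step step : pending) {
                if (present.containsAll(step.inputs)) {
                    order.add(step.name);
                    present.addAll(step.outputs); // running the step adds its outputs
                    pending.remove(step);
                    progressed = true;
                    break; // contents changed, so scan the pending steps again
                }
            }
        }
        return order; // steps whose inputs never appear are left out
    }
}
```

For example, given a parser that needs Token and Sentence, a tokenizer that needs nothing and a sentence splitter that needs Token, the sketch runs the tokenizer first, then the sentence splitter, then the parser, regardless of the order in which they were listed.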

I suspect this would be a more efficient and direct way to implement what you want.

Also see:
http://uima.apache.org/d/uimaj-current/tutorials_and_users_guides.html#ugr.tug.fc
- it describes this whiteboard flow controller example a bit.

Would this work for you?

-Marshall

On 3/30/2017 10:24 AM, James Baker wrote:
> Thanks Marshall,
>
> What I have is each annotator wrapped as a separate analysis engine
> ("pipeline"), and then I'm manually running each of those in turn because I
> want to be able to control the order. In fact, what I'm really trying to
> achieve is controlling the order that the annotators are run in, based on
> information I get back from them.
>
> Surely the analysis engine/resource specifier must have some kind of
> reference back to the original class, otherwise how does it know what code
> to run? Perhaps there's not a method at the moment to get back to the
> original annotator, but is it stored somewhere I could get to via
> reflection (accepting all the risks and bad practices that entails!)?
>
> James

Re: Retrieving annotator back from analysis engine

Posted by James Baker <ja...@gmail.com>.

Thanks Marshall,

What I have is each annotator wrapped as a separate analysis engine
("pipeline"), and then I'm manually running each of those in turn because I
want to be able to control the order. In fact, what I'm really trying to
achieve is controlling the order that the annotators are run in, based on
information I get back from them.

Surely the analysis engine/resource specifier must have some kind of
reference back to the original class, otherwise how does it know what code
to run? Perhaps there's not a method at the moment to get back to the
original annotator, but is it stored somewhere I could get to via
reflection (accepting all the risks and bad practices that entails!)?

James

On 30 March 2017 at 15:07, Marshall Schor <ms...@schor.com> wrote:

> Hi James,
>
> The UIMA terminology discusses two kinds of entities:
>
>   a) Annotators - take a CAS in, operate on it, update it, etc.  These are the
> building blocks of pipelines.
>
>   b) UIMA Applications (e.g., "pipelines") made up of some collection of
> Annotators.
>
> In most UIMA applications, there might be one pipeline, which has a number of
> Annotators. Is this what you have?  Or are you running multiple (perhaps
> different) collections of annotators, each having its own pipeline?
>
> The produceAnalysisEngine call takes an object which is a ResourceSpecifier.
> That object is a description of the entire pipeline - what annotators are in it,
> configuration parameters, etc.  The output of that is an AnalysisEngine object
> that represents the whole pipeline.
>
> There's no reference from that AnalysisEngine object back to the
> ResourceSpecifier that was used to direct the construction of the pipeline.
>
> So, I don't think what you want to do can be done.
>
> ============
>
> That being said, perhaps the high-level design can be adjusted.  I'm wondering
> if two things got a bit conflated in the design - the idea of analysis engine
> "components" (e.g. Annotators) and the idea of analysis engines themselves (the
> pipelines that contain the annotators, configuration data, etc.)?
>
> -Marshall

Re: Retrieving annotator back from analysis engine

Posted by Marshall Schor <ms...@schor.com>.

Hi James,

The UIMA terminology discusses two kinds of entities:

  a) Annotators - take a CAS in, operate on it, update it, etc.  These are the
building blocks of pipelines.

  b) UIMA Applications (e.g., "pipelines") made up of some collection of
Annotators.

In most UIMA applications, there might be one pipeline, which has a number of
Annotators. Is this what you have?  Or are you running multiple (perhaps
different) collections of annotators, each having its own pipeline?

The produceAnalysisEngine call takes an object which is a ResourceSpecifier. 
That object is a description of the entire pipeline - what annotators are in it,
configuration parameters, etc.  The output of that is an AnalysisEngine object
that represents the whole pipeline.

There's no reference from that AnalysisEngine object back to the
ResourceSpecifier that was used to direct the construction of the pipeline.

So, I don't think what you want to do can be done.
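
As a rough illustration of why the cast in the original question fails, here is a sketch in plain Java with stand-in types. WrapperSketch, Engine and PrimitiveEngine are hypothetical and are not the UIMA classes. The only assumption carried over is that the engine is built from a description and holds its own annotator instance internally, rather than being that annotator.

```java
final class WrapperSketch {
    /** Stand-in for the engine interface the application sees. */
    interface Engine {
        void process(StringBuilder cas);
    }

    /** Stand-in for the user's annotator class with an extra method. */
    static class MyAnnotator {
        void process(StringBuilder cas) {
            cas.append("[annotated]");
        }

        String callMyFunction() {
            return "extra";
        }
    }

    /** Built from a description (here just a class) and hides the instance it creates. */
    static final class PrimitiveEngine implements Engine {
        private final MyAnnotator delegate;

        PrimitiveEngine(Class<? extends MyAnnotator> annotatorClass) {
            try {
                this.delegate = annotatorClass.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("could not create annotator", e);
            }
        }

        @Override
        public void process(StringBuilder cas) {
            delegate.process(cas); // the engine forwards work to the hidden instance
        }
    }

    /** Returns true when casting the engine to MyAnnotator throws ClassCastException. */
    static boolean castFails(Engine engine) {
        try {
            MyAnnotator ma = (MyAnnotator) (Object) engine;
            return ma == null;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

The engine still runs the annotator's code through its own process method, but the wrapper type is unrelated to MyAnnotator, so the cast cannot succeed.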

============

That being said, perhaps the high-level design can be adjusted.  I'm wondering
if two things got a bit conflated in the design - the idea of analysis engine
"components" (e.g. Annotators) and the idea of analysis engines themselves (the
pipelines that contain the annotators, configuration data, etc.)?

-Marshall

On 3/29/2017 1:11 PM, James Baker wrote:
> In my UIMA application, I have a number of AnalysisEngines (as you might
> expect). These were created using UIMAFramework.produceAnalysisEngine(...)
> on my annotators, which all extend MyAnnotator (which in turn extends
> JCasAnnotator_ImplBase).
>
> I want to get from the AnalysisEngine back to the original class (cast to
> MyAnnotator) so that I can access some of the additional functions I've
> added to the class. However, I can't seem to work out how to do that. Could
> someone give some pointers?
>
> For clarity, I've included below some code showing what I'm trying to achieve
> (I'm aware that it doesn't work, as I've tried it!)
>
> ----------------------------
>
> // Get the analysis engine from wherever it is; this bit's not important
> AnalysisEngine ae = getAnalysisEngine();
>
> MyAnnotator ma = (MyAnnotator) ae; // Throws ClassCastException
> ma.callMyFunction(); // This is what I'm really trying to get to
>
> ----------------------------
>
> Thanks,
> James
>