Posted to user@uima.apache.org by Hugues de Mazancourt <hu...@mazancourt.com> on 2018/04/11 13:54:25 UTC

Dynamically bind resources to AnalysisEngine

Hello,

Is there a way to dynamically bind or update resources for an AnalysisEngine?
My use case: I am building a query parser that will be used to retrieve information from an indexed text database.
The parser performs spelling correction, but it should not treat words that occur in the index as spelling mistakes. The (aggregate) engine is therefore bound to the index vocabulary (i.e. a word list).
My problem: when the index is updated, its vocabulary changes too. I can build a new aggregate parser with the updated resource, but this takes time, mainly to load resources that were already loaded (POS model, lexica, etc.). Is there a way to update a given resource on my parser without rebuilding it?

Thanks for your help,
PS: I'm mostly building on top of DKPro components. I may be missing some basic UIMA mechanisms.
Hugues de Mazancourt
Mazancourt Conseil

E: hugues@mazancourt.com
P: +33-6 72 78 70 33
W: http://www.mazancourt.com


Re: Dynamically bind resources to AnalysisEngine

Posted by Richard Eckart de Castilho <re...@apache.org>.
On 15.04.2018, at 10:04, Hugues de Mazancourt <hu...@mazancourt.com> wrote:
> 
> However, reading Richard’s suggestions, especially this one:
>> 
>> However, if you want to use external resources, having a look at 
>> 
>> https://svn.apache.org/repos/asf/uima/uimafit/trunk/uimafit-core/src/test/java/org/apache/uima/fit/factory/ExternalResourceFactoryTest.java
> 
> …if I read the code correctly, it means that I can bind a POJO as a resource to my AE. I thought a shared resource had to be described (and accessed) through a DataResource.
> If I can directly inject my application’s vocabulary as a resource, then my problems are gone, because the vocabulary object gets updated each time the index changes. Am I missing something?

If you are working in a single JVM, then yes, uimaFIT offers several options to inject POJOs into UIMA components. The most straightforward one is probably the SimpleNamedResourceManager.

-- Richard

Re: Dynamically bind resources to AnalysisEngine

Posted by Hugues de Mazancourt <hu...@mazancourt.com>.
Thanks to all for your answers.
I guess the simplest method is Marshall’s: having my AE explicitly call load() on the resource when changes are detected.

However, reading Richard’s suggestions, especially this one:
> 
> However, if you want to use external resources, having a look at 
> 
> https://svn.apache.org/repos/asf/uima/uimafit/trunk/uimafit-core/src/test/java/org/apache/uima/fit/factory/ExternalResourceFactoryTest.java

…if I read the code correctly, it means that I can bind a POJO as a resource to my AE. I thought a shared resource had to be described (and accessed) through a DataResource.
If I can directly inject my application’s vocabulary as a resource, then my problems are gone, because the vocabulary object gets updated each time the index changes. Am I missing something?

Best,

— Hugues


Re: Dynamically bind resources to AnalysisEngine

Posted by Richard Eckart de Castilho <re...@apache.org>.
Hi,

>> Is there a way to dynamically bind/update resources for an AnalysisEngine ?
> 

> There may be more conventions / built-in ways that DKPro has
> for this scenario.

There are no conventions in DKPro Core for resource binding. DKPro Core should
also not interfere if you use resource binding in components you have
implemented yourself and mix them in a pipeline with DKPro Core components.

> This link in the UIMA Reference manual describes Resources:
> https://uima.apache.org/d/uimaj-2.10.2/references.html#ugr.ref.resources
> 
> See also the Javadocs for SharedResourceObject
> https://uima.apache.org/d/uimaj-2.10.2/apidocs/org/apache/uima/resource/SharedResourceObject.html

uimaFIT also has support for external resources. If you use DKPro Core,
I expect you also make use of uimaFIT. You can find a bit of 
documentation here:

https://uima.apache.org/d/uimafit-current/tools.uimafit.book.html#ugr.tools.uimafit.externalresources

However, if you want to use external resources, have a look at this test case:

https://svn.apache.org/repos/asf/uima/uimafit/trunk/uimafit-core/src/test/java/org/apache/uima/fit/factory/ExternalResourceFactoryTest.java

In particular, you might not want to use a SharedResourceObject. You could
build your parser resource on top of Resource_ImplBase instead and, rather than
relying on SharedResourceObject.load(), implement arbitrary methods of your
own, e.g. "getLatestParser()".

That said, instead of rebinding resources to components, I would suggest that
you put your compiled parsers somewhere a "parser resource" bound
to an analysis engine is able to find them. Then, when the
AE asks the resource for the actual parser, the resource should
return the latest parser available.
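
A minimal sketch of this "ask the resource for the latest object" idea, in plain Java with no UIMA dependency. The names LatestHolder, publish and getLatest are hypothetical and chosen for illustration only; in a real pipeline such a holder could sit behind a resource built on Resource_ImplBase, as suggested above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder (not a UIMA class) that always hands out the newest published object. */
final class LatestHolder<T> {
    private final AtomicReference<T> current;

    LatestHolder(T initial) {
        this.current = new AtomicReference<>(Objects.requireNonNull(initial));
    }

    /** Called by the application whenever a new version (parser, vocabulary, ...) is ready. */
    void publish(T updated) {
        current.set(Objects.requireNonNull(updated));
    }

    /** Called by the annotator each time it needs the object; never returns a stale reference. */
    T getLatest() {
        return current.get();
    }
}

class LatestHolderDemo {
    public static void main(String[] args) {
        Set<String> v1 = new HashSet<>(Arrays.asList("uima"));
        LatestHolder<Set<String>> holder = new LatestHolder<>(v1);
        System.out.println(holder.getLatest().contains("dkpro")); // prints false

        // The index was rebuilt: publish a new vocabulary snapshot.
        Set<String> v2 = new HashSet<>(Arrays.asList("uima", "dkpro"));
        holder.publish(v2);
        System.out.println(holder.getLatest().contains("dkpro")); // prints true
    }
}
```

The annotator keeps a reference to the holder only and calls getLatest() for each document, so nothing has to be rebound when a new version is published. Publishing a fresh snapshot, rather than mutating the old one in place, avoids readers seeing a half-updated object.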

Cheers,

-- Richard

Re: Dynamically bind resources to AnalysisEngine

Posted by Marshall Schor <ms...@schor.com>.
Hi,

I don't know about DKPro, so perhaps someone more familiar with its conventions
can respond.

UIMA supports decoupling resources from annotators, so that a resource can be
shared among the annotators running in a pipeline. I'm guessing you're asking
about this mechanism. Before going into it, note that nothing prevents you from
implementing an annotator (let's call it the spelling corrector annotator) that
loads a dictionary (say, one specified by a configuration parameter) and has
some mechanism to "reload" it if it changes.

This link in the UIMA Reference manual describes Resources:
https://uima.apache.org/d/uimaj-2.10.2/references.html#ugr.ref.resources

See also the Javadocs for SharedResourceObject
https://uima.apache.org/d/uimaj-2.10.2/apidocs/org/apache/uima/resource/SharedResourceObject.html

SharedResourceObject has a "load" method that the user is supposed to implement
to load the resource. Typically, if the resource implements, for example, a
hashmap, load might read an external file and initialize the hashmap from
it.

The implementation of the load method is the responsibility of the resource
implementer. UIMA will instantiate the resource class, and call the load method,
once.

One possibility would be to have your spelling annotator check "every so often"
to see if the on-disk version has changed, and if so, call the load method
again.  If you consider doing this, remember that your annotator might (in some
deployments) be "scaled up" in multiple Java threads, so you might need to do
this under a synchronization lock.
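
A rough sketch of such a check-and-reload step, in plain Java with no UIMA dependency. The class ReloadingWordList and its methods are hypothetical; it assumes the vocabulary is a UTF-8 text file with one word per line and uses the file's modification time as the change signal.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not part of UIMA): a word list that reloads when its file changes. */
final class ReloadingWordList {
    private final Path file;
    private final Object lock = new Object();
    private FileTime loadedAt;                                // guarded by lock
    private volatile Set<String> words = Collections.emptySet();

    ReloadingWordList(Path file) throws IOException {
        this.file = file;
        reloadIfChanged();
    }

    /** Re-reads the file only if its modification time differs from the last load. */
    void reloadIfChanged() throws IOException {
        synchronized (lock) {                                 // one thread reloads at a time
            FileTime modified = Files.getLastModifiedTime(file);
            if (modified.equals(loadedAt)) {
                return;                                       // unchanged since the last load
            }
            Set<String> fresh = new HashSet<>(Files.readAllLines(file, StandardCharsets.UTF_8));
            words = Collections.unmodifiableSet(fresh);       // swap in the new snapshot
            loadedAt = modified;
        }
    }

    /** Lock-free lookup against the current snapshot. */
    boolean contains(String word) {
        return words.contains(word);
    }
}
```

An annotator could call reloadIfChanged() every so often, for example once per document or on a timer, and use contains() for lookups. Only the reload takes the lock; lookups read an immutable snapshot, so several annotator threads can query it concurrently.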

Does this help?  There may be more conventions / built-in ways that DKPro has
for this scenario.

Cheers. -Marshall

