Posted to user@uima.apache.org by James Baker <ja...@gmail.com> on 2014/09/17 11:33:58 UTC

Share database connections between annotators

What is the best way of sharing a database connection (in my case, to
MongoDB) between annotators?

Currently, I instantiate a new database connection for each annotator, but
as my UIMA pipeline now has over 50 annotators (not all of which connect to
the database), I am ending up with a large number of database connections.
Ideally, I'd like to move to a single connection pool that can be shared by
the annotators. The Mongo driver supports connection pooling, but I'm
unsure how best to implement this in UIMA.

Any advice would be appreciated.

Thanks,
James

Re: Share database connections between annotators

Posted by Richard Eckart de Castilho <re...@apache.org>.
You declare an external resource once in the descriptor and then
bind it to various engines.

All engines binding to the same external resource end up talking to the
same external resource instance. If you use CPE, this is also true for
scaled-out instances.

It should be noted though, that external resources have no proper life-cycle.
In particular, they never get notified when the pipeline is over. You'd need
to find some way of notifying your DB pool that connections can be closed.
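
A minimal sketch of one way to handle the missing life-cycle, using only the Java standard library and not taken from the thread: a reference-counted holder that closes the pool when its last user releases it. FakePool and SharedPoolHolder are hypothetical names, and FakePool merely stands in for a real driver-level pool. Which annotator callbacks should call acquire() and release() (for example initialize() and destroy()) is an assumption to verify for your setup.

```java
/** Hypothetical stand-in for a driver-level connection pool. */
final class FakePool implements AutoCloseable {
    private boolean closed = false;

    synchronized boolean isClosed() {
        return closed;
    }

    @Override
    public synchronized void close() {
        closed = true;
    }
}

/** Reference-counted holder: closes the pool when the last user releases it. */
final class SharedPoolHolder {
    private final FakePool pool = new FakePool();
    private int users = 0;

    /** Each component calls this once when it starts using the pool. */
    synchronized FakePool acquire() {
        if (pool.isClosed()) {
            throw new IllegalStateException("pool already closed");
        }
        users++;
        return pool;
    }

    /** Each component calls this once when it is finished with the pool. */
    synchronized void release() {
        if (users == 0) {
            throw new IllegalStateException("release() without matching acquire()");
        }
        users--;
        if (users == 0) {
            pool.close();
        }
    }
}
```

The holder would live inside the shared external resource, so every annotator bound to that resource sees the same counter. A JVM shutdown hook is another option if no suitable callback is available.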

Cheers,

-- Richard

On 17.09.2014, at 12:20, James Baker <ja...@gmail.com> wrote:

> Thanks Johannes,
> 
> I've not used External Resources before and the documentation seems to be
> fairly limited. How do I ensure the resource is shared across multiple
> annotators? Do I give it the same key, name and/or URI? Does it need to be
> declared separately and then referenced somehow?
> 
> James

Re: Share database connections between annotators

Posted by Richard Eckart de Castilho <re...@apache.org>.
Good question ;) I think that should work. I did a quick test modifying one
of the uimaFIT CPE test cases:

  @Test
  public void testMultiBinding() throws Exception {
    ExternalResourceDescription extDesc = createExternalResourceDescription(ResourceWithAssert.class);

    // Binding external resource to each Annotator individually
    AnalysisEngineDescription aed1 = createEngineDescription(MultiBindAE.class,
            MultiBindAE.RES_KEY, extDesc);
    AnalysisEngineDescription aed2 = createEngineDescription(MultiBindAE.class,
            MultiBindAE.RES_KEY, extDesc);

    // Check the external resource was injected
    MultiBindAE.reset();
//    AnalysisEngineDescription aaed = createEngineDescription(aed1, aed2);
    CpePipeline.runPipeline(CollectionReaderFactory.createReaderDescription(Reader.class), aed1, aed2);
  }

Output: 

MultiBindAE: org.apache.uima.fit.cpe.ExternalResourceFactoryTest$ResourceWithAssert@6edb35f8
MultiBindAE: org.apache.uima.fit.cpe.ExternalResourceFactoryTest$ResourceWithAssert@6edb35f8

Looks good I'd say ;)

I believe what happens internally is that the external resource descriptions
from all the delegates are merged into one configuration for the resource manager.

Cheers,

-- Richard

P.S.: For comparison, here is the same test using two separate external resource descriptions (not shared):

  @Test
  public void testMultiBinding() throws Exception {
    ExternalResourceDescription extDesc1 = createExternalResourceDescription(ResourceWithAssert.class);

    ExternalResourceDescription extDesc2 = createExternalResourceDescription(ResourceWithAssert.class);

    // Binding external resource to each Annotator individually
    AnalysisEngineDescription aed1 = createEngineDescription(MultiBindAE.class,
            MultiBindAE.RES_KEY, extDesc1);
    AnalysisEngineDescription aed2 = createEngineDescription(MultiBindAE.class,
            MultiBindAE.RES_KEY, extDesc2);

    // Check the external resource was injected
    MultiBindAE.reset();
//    AnalysisEngineDescription aaed = createEngineDescription(aed1, aed2);
    CpePipeline.runPipeline(CollectionReaderFactory.createReaderDescription(Reader.class), aed1, aed2);
  }

Output: 

MultiBindAE: org.apache.uima.fit.cpe.ExternalResourceFactoryTest$ResourceWithAssert@28a34522
MultiBindAE: org.apache.uima.fit.cpe.ExternalResourceFactoryTest$ResourceWithAssert@113a89af
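
The difference between the two outputs can be pictured with a toy model. This is only a sketch of the behaviour observed above, not UIMA's or uimaFIT's actual code, and ToyResourceRegistry is a hypothetical name. It assumes that each resource description corresponds to one resource name, and that the registry creates at most one instance per name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Toy registry: at most one instance per resource name, created on first lookup. */
final class ToyResourceRegistry {
    private final Map<String, Object> instances = new HashMap<>();

    /** Returns the instance registered under resourceName, creating it if needed. */
    synchronized Object lookup(String resourceName, Supplier<Object> factory) {
        return instances.computeIfAbsent(resourceName, name -> factory.get());
    }
}
```

In this model, two annotators that look up the same name get the same object, while two different names yield two objects, which mirrors the shared and non-shared test runs.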

On 17.09.2014, at 13:54, James Baker <ja...@gmail.com> wrote:

> My annotators aren't encompassed within an aggregate AE, but are chained
> together in a CPE descriptor. Is it possible to define the external
> resource in the CPE descriptor for the annotators to access, rather than
> having to bundle them into an aggregate in order to share the resource?

Re: Share database connections between annotators

Posted by James Baker <ja...@gmail.com>.
My annotators aren't encompassed within an aggregate AE, but are chained
together in a CPE descriptor. Is it possible to define the external
resource in the CPE descriptor for the annotators to access, rather than
having to bundle them into an aggregate in order to share the resource?

On 17 September 2014 12:16, Richard Eckart de Castilho <re...@apache.org>
wrote:

> When you have created a description in uimaFIT, call toXML on it to get the
> configuration file. Mind that the uimaFIT-generated description may contain
> many redundancies, in particular the types might be re-defined multiple
> times.
> uimaFIT doesn't take care of minimizing the description since that is not
> important for its operation.
>
> Cheers,
>
> -- Richard

Re: Share database connections between annotators

Posted by Richard Eckart de Castilho <re...@apache.org>.
When you have created a description in uimaFIT, call toXML on it to get the
configuration file. Mind that the uimaFIT-generated description may contain
many redundancies, in particular the types might be re-defined multiple times.
uimaFIT doesn't take care of minimizing the description since that is not 
important for its operation.

Cheers,

-- Richard

On 17.09.2014, at 13:00, Johannes Darms <jo...@scai.fraunhofer.de> wrote:

> Hey James,
> 
>> How do I ensure the resource is shared across multiple
>> annotators? Do I give it the same key, name and/or URI? Does it need to be
>> declared separately and then referenced somehow?
> I'm not sure how to do it with configuration files. I create and inject them using uimaFIT [1].
> 
> 
> Regards,
> 
> Johannes
> 
> [1] https://uima.apache.org/d/uimafit-current/tools.uimafit.book.html#d5e387


Re: Share database connections between annotators

Posted by Johannes Darms <jo...@scai.fraunhofer.de>.
Hey James,

> How do I ensure the resource is shared across multiple
> annotators? Do I give it the same key, name and/or URI? Does it need to be
> declared separately and then referenced somehow?

I'm not sure how to do it with configuration files. I create and inject them using uimaFIT [1].


Regards,

Johannes

[1] https://uima.apache.org/d/uimafit-current/tools.uimafit.book.html#d5e387

Re: Share database connections between annotators

Posted by James Baker <ja...@gmail.com>.
Thanks Johannes,

I've not used External Resources before and the documentation seems to be
fairly limited. How do I ensure the resource is shared across multiple
annotators? Do I give it the same key, name and/or URI? Does it need to be
declared separately and then referenced somehow?

James

On 17 September 2014 10:57, Johannes Darms <
johannes.darms@scai.fraunhofer.de> wrote:

> Hey James,
>
> I don't know if it's the best way. I would implement a connection pool as
> an External Resource and pass this resource to the different annotators
> in your AE.
>
> Regards,
>
> Johannes
>
> [1]
> https://uima.apache.org/d/uimafit-current/tools.uimafit.book.html#ugr.tools.uimafit.externalresources

Re: Share database connections between annotators

Posted by Johannes Darms <jo...@scai.fraunhofer.de>.
Hey James,

I don't know if it's the best way. I would implement a connection pool as an External Resource and pass this resource to the different annotators in your AE.

Regards,

Johannes

[1] https://uima.apache.org/d/uimafit-current/tools.uimafit.book.html#ugr.tools.uimafit.externalresources
