Posted to user@uima.apache.org by Mihaela M <mm...@yahoo.com> on 2014/02/25 20:15:46 UTC

Accessing uima as pipeline from a REST interface

Hello,

I have a UIMA AS pipeline in which several annotators run in parallel. On top of this I want to build a REST service that invokes the pipeline for a given text and returns the annotations found to the client. The REST service should support a high number of concurrent requests.


Because I need a synchronous call to the pipeline, I thought I should use UimaAsynchronousEngine's sendAndReceive(CAS) method. But because this method is synchronized and blocks until the pipeline returns the reply, if I have only one instance of UimaAsynchronousEngine and processing is slow, all calls to the web service will be handled serially, not in parallel.

In this case, is it feasible to create a pool of UimaAsynchronousEngine clients (the pool size would match the CAS pool size of the UIMA AS pipeline, which would also add more running instances of each annotator) and have the web service reuse an available client from the pool to call the UIMA AS pipeline synchronously? I know that each such client opens some connections to the ActiveMQ broker (at least two), so I expect this to add some overhead to the message broker and my web server, but I don't know how bad it could be.


If I tune the pipeline so that it supports high throughput, what would be the best approach for adding this REST client with high throughput as well?

I'm looking forward to any feedback or suggestions.

Thanks,
Mihaela

Re: Accessing uima as pipeline from a REST interface

Posted by Jaroslaw Cwiklik <ui...@gmail.com>.
In UIMA-AS the sendAndReceive() method is not synchronized, meaning more
than one thread can call it at the same time. Each calling thread blocks
inside sendAndReceive() until a reply comes or the thread is interrupted.
Multiple threads can wait for a reply at the same time.
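
To make the difference between "blocking" and "synchronized" concrete, here
is a minimal sketch using only the JDK. StubClient is a hypothetical stand-in,
not the UIMA AS class; its sendAndReceive() only borrows the name. Each caller
is released only after every other caller has entered the method, so the run
succeeds only if the shared object really lets threads wait side by side.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class SharedClientSketch {

    /** Hypothetical stand-in for a shared client object; NOT the UIMA AS class. */
    static class StubClient {
        private final CountDownLatch allInside;

        StubClient(int expectedCallers) {
            this.allInside = new CountDownLatch(expectedCallers);
        }

        // Blocking but not synchronized: a caller is released only once every
        // expected caller has entered, so a serialized method would time out.
        boolean sendAndReceive(String text) throws InterruptedException {
            allInside.countDown();
            return allInside.await(5, TimeUnit.SECONDS);
        }
    }

    /** True if all request threads were blocked inside the call at the same time. */
    static boolean allCallersOverlap(int callers) {
        StubClient shared = new StubClient(callers);
        ExecutorService requestThreads = Executors.newFixedThreadPool(callers);
        try {
            List<Future<Boolean>> replies = new ArrayList<>();
            for (int i = 0; i < callers; i++) {
                final String text = "request " + i;
                Callable<Boolean> call = () -> shared.sendAndReceive(text);
                replies.add(requestThreads.submit(call));
            }
            boolean overlapped = true;
            for (Future<Boolean> reply : replies) {
                overlapped &= reply.get();
            }
            return overlapped;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            requestThreads.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("all callers overlapped: " + allCallersOverlap(8));
    }
}
```

Adding the synchronized keyword to the stub's sendAndReceive() would make
every caller time out, which is the serial behaviour the question was
worried about.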

You can also consider using the asynchronous send() instead and relying on
the UIMA-AS callback to be notified when a reply comes. The send() does not
block; it returns as soon as the CAS is handed off to an internal queue for
dispatching to the service queue.
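
A rough sketch of the send-plus-callback pattern, again with JDK classes only.
StubAsyncClient, its send() method and the (id, reply) callback are
hypothetical stand-ins, not the UIMA AS API. The request handler returns a
CompletableFuture straight away, and the callback completes it when the reply
arrives on another thread.

```java
import java.util.Locale;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BiConsumer;

public class AsyncSendSketch {

    /** Hypothetical stand-in for an asynchronous client; NOT the UIMA AS class. */
    static class StubAsyncClient {
        private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "stub-pipeline");
            t.setDaemon(true); // do not keep the JVM alive for this sketch
            return t;
        });
        private final AtomicLong nextId = new AtomicLong();
        private final BiConsumer<String, String> onReply;

        StubAsyncClient(BiConsumer<String, String> onReply) {
            this.onReply = onReply;
        }

        // Returns at once with a request id; the reply arrives later on another thread.
        String send(String text) {
            String id = "req-" + nextId.incrementAndGet();
            worker.execute(() -> onReply.accept(id, text.toUpperCase(Locale.ROOT)));
            return id;
        }
    }

    private final ConcurrentMap<String, CompletableFuture<String>> pending =
            new ConcurrentHashMap<>();
    private final StubAsyncClient client =
            new StubAsyncClient((id, reply) -> slot(id).complete(reply));

    // Whichever side arrives first (sender or callback) creates the future.
    private CompletableFuture<String> slot(String id) {
        return pending.computeIfAbsent(id, k -> new CompletableFuture<>());
    }

    /** Non-blocking: the returned future completes when the callback fires. */
    CompletableFuture<String> annotateAsync(String text) {
        String id = client.send(text);
        return slot(id).whenComplete((reply, error) -> pending.remove(id));
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        AsyncSendSketch service = new AsyncSendSketch();
        CompletableFuture<String> reply = service.annotateAsync("some text");
        System.out.println(reply.join());
    }
}
```

The computeIfAbsent() on both sides covers the case where the reply shows up
before the sender has registered its future. A real service would also want a
timeout on each pending request.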

Jerry C


On Wed, Feb 26, 2014 at 10:20 AM, Mihaela M <mm...@yahoo.com> wrote:

> The REST service will be called by multiple clients, concurrently (I have
> a web application that calls this service). On the web server, for each
> request a new thread is created that will use the service instance to call
> the functionality and have the results returned back. If I create only one
> instance of the UimaAsynchronousEngine and call the sendAndReceive() method
> on it, that is also synchronized and blocking, wouldn't this mean that the
> requests from the client will be treated in a serial manner not in
> parallel?
> Because my understanding is that because the client blocks until the reply
> and because that method is synchronized, it won't be able to send further
> CASes to the pipeline (from other threads) and have them processed in
> parallel. Please correct me if I'm wrong.
>
> Thanks,
> Mihaela
>
>
>
>
> On Wednesday, February 26, 2014 4:48 PM, Jaroslaw Cwiklik <
> uimaee@gmail.com> wrote:
>
> Mihaela, does your REST service provide threading to handle client
> requests? If so, you can consider using a shared instance of
> UimaAsynchronousEngine
> client. Each thread would call sendAndReceive() and block until reply
> comes. This would be the most efficient way of doing this I think.
>
>
> Jerry C
>
>
>

Re: Accessing uima as pipeline from a REST interface

Posted by Mihaela M <mm...@yahoo.com>.
The REST service will be called by multiple clients concurrently (I have a web application that calls this service). On the web server, a new thread is created for each request; it uses the service instance to call the functionality and return the results. If I create only one instance of UimaAsynchronousEngine and call its sendAndReceive() method, which is also synchronized and blocking, wouldn't this mean that the client requests are handled serially rather than in parallel?
My understanding is that, because the client blocks until the reply arrives and because that method is synchronized, it won't be able to send further CASes to the pipeline (from other threads) and have them processed in parallel. Please correct me if I'm wrong.

Thanks,
Mihaela





Re: Accessing uima as pipeline from a REST interface

Posted by Jaroslaw Cwiklik <ui...@gmail.com>.
Mihaela, does your REST service provide threading to handle client
requests? If so, you can consider using a shared instance of the
UimaAsynchronousEngine client. Each thread would call sendAndReceive()
and block until the reply comes. I think this would be the most efficient
way of doing it.


Jerry C



Re: Accessing uima as pipeline from a REST interface

Posted by Eddie Epstein <ea...@gmail.com>.
Hi Mihaela,

The UimaAsynchronousEngine is designed to support multiple threads
accessing a service. The engine API object has a CAS pool size parameter
and initial CAS size parameters to support this.

From the RunRemoteAsyncAE.java sample code:
    // Add the Cas Pool Size and initial FS heap size
    appCtx.put(UimaAsynchronousEngine.CasPoolSize, casPoolSize);
    appCtx.put(UIMAFramework.CAS_INITIAL_HEAP_SIZE,
            Integer.valueOf(fsHeapSize / 4).toString());

Since the default CAS size is large, over 2MB, memory can be a problem for
large pool sizes, and a significant memory reduction can be had by reducing
the initial size.

So the engine API object is initialized once, and then multiple threads can
call the same object's getCAS() and sendAndReceive(CAS) methods in
parallel. The CAS pool size can be used to limit the number of concurrent
requests.
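
As a rough illustration of how a bounded pool caps concurrency, here is a
JDK-only sketch. StubPooledClient and its annotate() method are hypothetical
stand-ins, not the UIMA AS API; the Semaphore plays the role of the CAS pool,
so a request thread waits for a free slot the way a caller would wait for a
free CAS. With the real client, I would expect the slot to be handed back by
releasing the CAS (cas.release()) in a finally block.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class CasPoolSketch {

    /** Hypothetical stand-in for a client with a bounded pool; NOT the UIMA AS class. */
    static class StubPooledClient {
        private final Semaphore slots;
        private final AtomicInteger inFlight = new AtomicInteger();
        private final AtomicInteger peak = new AtomicInteger();

        StubPooledClient(int poolSize) {
            this.slots = new Semaphore(poolSize);
        }

        // Blocks while all slots are checked out, then runs the "request".
        String annotate(String text) throws InterruptedException {
            slots.acquire();
            try {
                peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                Thread.sleep(20); // pretend the remote pipeline is busy
                return text.length() + " chars";
            } finally {
                inFlight.decrementAndGet();
                slots.release(); // always hand the slot back
            }
        }

        int peakInFlight() {
            return peak.get();
        }
    }

    /** Runs the given number of request threads and reports the highest overlap seen. */
    static int peakConcurrency(int poolSize, int requests) {
        StubPooledClient shared = new StubPooledClient(poolSize);
        ExecutorService requestThreads = Executors.newFixedThreadPool(requests);
        try {
            List<Future<String>> replies = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final String text = "document " + i;
                Callable<String> call = () -> shared.annotate(text);
                replies.add(requestThreads.submit(call));
            }
            for (Future<String> reply : replies) {
                reply.get(); // wait for every request to finish
            }
            return shared.peakInFlight();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            requestThreads.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("peak in-flight requests: " + peakConcurrency(3, 12));
    }
}
```

However many request threads the web server starts, the number of requests in
flight never exceeds the pool size; the remaining threads simply wait for a
slot.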

Eddie


