Posted to user@uima.apache.org by "John David Osborne (Campus)" <oz...@uab.edu> on 2013/10/19 19:25:46 UTC

"Run as AS aggregate" and pre-fetching

What are the consequences of selecting in a UIMA-AS deployment descriptor
"Run as AS aggregate"?

I found an email from a year ago online where Eddie Epstein wrote:
"UIMA-AS will put every asynchronous component in a separate thread. Using
the ComponentDescriptorEditor on a UIMA-AS deployment
descriptor, marking an aggregate with "Run as AS aggregate" will make
every delegate in *that* aggregate an asynchronous component."

I have a deployment with 32 aggregate analysis engines, but I have not
checked the box "Run as AS aggregate" in the deployment descriptor. Should
I generally be doing this for all aggregate analysis engines? I'm not sure
I understand the tradeoff very well. It sounds like I could get some
performance improvements by checking this box, since everything could run
asynchronously; however, it also sounds like some things may break if my
pipeline isn't really ready to be run asynchronously.

Did I get that right?

Also, I noticed that the Eclipse component editor for UIMA-AS deployment
descriptors doesn't provide the option to set pre-fetching (nor does it
display the current setting).

 -John




-- 
John David Osborne

Research Associate
University of Alabama at Birmingham
Biomedical Informatics
Center for Clinical and Translational Science
1720 7th Avenue South
Sparks Building, Suite 175
Birmingham, AL, 35294




Re: "Run as AS aggregate" and pre-fetching

Posted by Eddie Epstein <ea...@gmail.com>.

In a core UIMA aggregate engine, all annotators run in a single thread, and
the code path from one annotator to the next is short. When the aggregate
is deployed asynchronously, with each annotator in a different thread, that
code path is much longer, and there is thread-switching overhead as well.

In my experience, there are two generally successful approaches to
deploying UIMA-AS with multiple threads.

The simplest is to keep the entire pipeline synchronous and deploy N
pipeline instances, each running in its own thread; this design is good for
high throughput.
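
As a rough illustration of this first approach, here is a minimal sketch in
plain Java. It does not use the UIMA-AS API: runPipeline() is a hypothetical
stand-in for a synchronous aggregate, and the thread pool stands in for N
deployed pipeline instances. Every name in it is an assumption made for the
example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SyncPipelineScaleout {

    // Hypothetical stand-in for a synchronous aggregate: both "annotators"
    // run back to back in the calling thread, with no hand-off in between.
    static String runPipeline(String doc) {
        String normalized = doc.trim().toLowerCase();          // "annotator" 1
        int tokens = normalized.isEmpty()
                ? 0 : normalized.split("\\s+").length;         // "annotator" 2
        return normalized + ":" + tokens;
    }

    // N worker threads, each running the whole pipeline for one document
    // at a time. Results are returned in input order.
    static List<String> processAll(List<String> docs, int instances)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(instances);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String doc : docs) {
                Callable<String> task = () -> runPipeline(doc);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processAll(List.of("A b", "c"), 2));
    }
}
```

Throughput scales with the number of instances because each document stays
in one thread from start to finish.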

The second approach deploys only the top-level aggregate (and carefully
selected 2nd- or 3rd-level aggregates) asynchronously, with the idea that
operations can proceed in parallel and slower components can be replicated;
this design is good for low latency. Note that asynchronous components can
only operate in parallel if they are working on different CASes, so CAS
Multipliers, each with a pool of CASes, are needed.
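
The point about needing multiple CASes can be sketched in plain Java as
well. This is again an illustration under assumptions, not UIMA-AS code:
Cas is a hypothetical reusable container, the two stages are stand-ins for
asynchronous delegates, and the producer loop plays the role of a CAS
Multiplier drawing from a fixed-size pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncStagesSketch {

    // Hypothetical stand-in for a CAS: one reusable document container.
    static final class Cas { String text; int tokens; }

    static final class Result {
        final List<Integer> tokenCounts;
        final int maxInFlight;
        Result(List<Integer> tokenCounts, int maxInFlight) {
            this.tokenCounts = tokenCounts;
            this.maxInFlight = maxInFlight;
        }
    }

    private static final Cas END = new Cas(); // end-of-stream marker

    static Result run(List<String> docs, int casPoolSize) throws Exception {
        BlockingQueue<Cas> pool = new ArrayBlockingQueue<>(casPoolSize);
        for (int i = 0; i < casPoolSize; i++) pool.add(new Cas());
        BlockingQueue<Cas> toStage1 = new LinkedBlockingQueue<>();
        BlockingQueue<Cas> toStage2 = new LinkedBlockingQueue<>();
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxInFlight = new AtomicInteger();
        List<Integer> counts = new ArrayList<>();

        // Stage 1 runs in its own thread and hands each CAS to stage 2.
        Thread stage1 = new Thread(() -> {
            try {
                for (Cas c = toStage1.take(); c != END; c = toStage1.take()) {
                    c.text = c.text.trim();
                    toStage2.put(c);
                }
                toStage2.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Stage 2 runs in its own thread and returns each CAS to the pool.
        Thread stage2 = new Thread(() -> {
            try {
                for (Cas c = toStage2.take(); c != END; c = toStage2.take()) {
                    c.tokens = c.text.isEmpty() ? 0 : c.text.split("\\s+").length;
                    counts.add(c.tokens);
                    inFlight.decrementAndGet();
                    pool.put(c);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        stage1.start();
        stage2.start();

        // Producer: blocks whenever every CAS in the pool is in use.
        for (String doc : docs) {
            Cas c = pool.take();
            c.text = doc;
            maxInFlight.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
            toStage1.put(c);
        }
        toStage1.put(END);
        stage1.join();
        stage2.join();
        return new Result(counts, maxInFlight.get());
    }
}
```

With a pool of one CAS, at most one document is ever in flight, so the two
stages can never overlap even though each has its own thread. A larger pool
allows stage 1 to start on the next document while stage 2 is still busy
with the previous one.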

It is best to keep aggregates synchronous unless there is a useful reason
not to.

Eddie
