Posted to user@uima.apache.org by "RYAN C. CORNIA" <ry...@utah.edu> on 2013/08/07 18:30:29 UTC

UIMA AS Asynchronous = true not behaving as expected.

I'm using UIMA AS 2.4.0, and have an example pipeline with 3 annotators. The third annotator is coded to just sleep for 3 seconds per document to simulate a slow annotator.

If I change the pipeline to async=true and set the number of scale out instances on the slow annotator to be 6, I expected the pipeline to be about 6 times faster. What I see, however, is exactly the same performance.

A bit of debugging shows UIMA AS is creating 6 different copies of the slow annotator, because each one is being called alternately per CAS, but it is waiting for the entire pipeline to complete before getting another CAS off the queue.

Any ideas what may be misconfigured? Or what to look at?

My deployment descriptor is:

<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDeploymentDescription xmlns="http://uima.apache.org/resourceSpecifier">
    <name>defaultFlapDeployDescriptor20130807.095936</name>
    <description/>
    <version>1.0</version>
    <vendor/>
    <deployment protocol="jms" provider="activemq">
        <casPool numberOfCASes="6" initialFsHeapSize="2000000"/>
        <service>
            <inputQueue endpoint="exampleQueue" brokerURL="tcp://localhost:61616" prefetch="0"/>
            <topDescriptor>
                <import location="file:/var/folders/vl/7p2qch6j4kx_kv5chvd093l80000gn/T/flapAggregate311122232121092424.xml"/>
            </topDescriptor>
            <analysisEngine async="true">
                <scaleout numberOfInstances="1"/>
                <delegates>
                    <analysisEngine key="aeWhitespaceTokenizerDescriptor211289c8cf04-b67c-45e2-a1eb-e90a85f39006" async="false">
                        <scaleout numberOfInstances="1"/>
                        <asyncAggregateErrorConfiguration>
                            <getMetadataErrors maxRetries="0" timeout="0" errorAction="terminate"/>
                            <processCasErrors thresholdCount="0" thresholdWindow="0" thresholdAction="terminate"/>
                            <collectionProcessCompleteErrors timeout="0" additionalErrorAction="terminate"/>
                        </asyncAggregateErrorConfiguration>
                    </analysisEngine>
                    <analysisEngine key="aeWordTokenizerDescriptor21126d2902a3-e6ca-4834-89cb-ec1a6c29f281" async="false">
                        <scaleout numberOfInstances="1"/>
                        <asyncAggregateErrorConfiguration>
                            <getMetadataErrors maxRetries="0" timeout="0" errorAction="terminate"/>
                            <processCasErrors thresholdCount="0" thresholdWindow="0" thresholdAction="terminate"/>
                            <collectionProcessCompleteErrors timeout="0" additionalErrorAction="terminate"/>
                        </asyncAggregateErrorConfiguration>
                    </analysisEngine>
                    <analysisEngine key="gov.va.vinci.flap.examples.ae.MySlowAnnotator2112fc3e83f1-f535-40c2-a860-895207bfff1a" async="false">
                        <scaleout numberOfInstances="6"/>
                        <asyncAggregateErrorConfiguration>
                            <getMetadataErrors maxRetries="0" timeout="0" errorAction="terminate"/>
                            <processCasErrors thresholdCount="0" thresholdWindow="0" thresholdAction="terminate"/>
                            <collectionProcessCompleteErrors timeout="0" additionalErrorAction="terminate"/>
                        </asyncAggregateErrorConfiguration>
                    </analysisEngine>
                </delegates>
                <asyncPrimitiveErrorConfiguration>
                    <processCasErrors thresholdCount="0" thresholdWindow="0" thresholdAction="terminate"/>
                    <collectionProcessCompleteErrors timeout="0" additionalErrorAction="terminate"/>
                </asyncPrimitiveErrorConfiguration>
            </analysisEngine>
        </service>
    </deployment>
</analysisEngineDeploymentDescription>

Thanks!
Ryan

Re: UIMA AS Asynchronous = true not behaving as expected.

Posted by Jaroslaw Cwiklik <ui...@gmail.com>.

This behavior seems to be caused by a bug in uima-as. In the async
aggregate there is a single thread servicing the input queue. This thread is
blocked (by design) until the input CAS is returned to the client.
Only then is the service thread allowed to grab the next CAS. This design
prevents concurrent processing of input CASes in an async aggregate where
the CasPool size > 1. The blocking was motivated by the desire to enforce fair
load balancing, where each service (process) only takes as many CASes as it
can process. Enforcing fair load balancing is the right thing to do.
The problem is with the implementation.
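
To make the effect concrete, here is a small toy model in plain Java. It is not UIMA AS code and does not use any UIMA AS API; the class name ListenerBlockingDemo, the method peakInFlight, and all parameter values are made up for illustration. A single loop plays the role of the thread servicing the input queue, a fixed thread pool stands in for the scaled-out slow annotator, and a semaphore controls how many CASes the listener may hand off before it has to wait for one to be returned. With one permit the listener behaves like the blocking described above; with as many permits as the CAS pool size it behaves the way one would expect an async aggregate to behave.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ListenerBlockingDemo {

    /**
     * Pushes casCount work items through `workers` simulated slow-annotator
     * threads and returns the peak number of items in flight at the same time.
     * `permits` is how many items the single listener may hand off before it
     * must wait for one to finish (1 models the blocking listener).
     */
    public static int peakInFlight(int casCount, int workers, int permits, long workMillis)
            throws InterruptedException {
        ExecutorService slowAnnotators = Executors.newFixedThreadPool(workers);
        Semaphore listenerPermits = new Semaphore(permits);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        // This loop stands in for the single thread servicing the input queue.
        for (int i = 0; i < casCount; i++) {
            listenerPermits.acquire(); // wait until allowed to take the next CAS
            slowAnnotators.submit(() -> {
                int now = inFlight.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(workMillis); // stand-in for the slow annotator
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                    listenerPermits.release(); // models "CAS returned to the client"
                }
            });
        }
        slowAnnotators.shutdown();
        slowAnnotators.awaitTermination(1, TimeUnit.MINUTES);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Six annotator instances in both runs; only the listener policy differs.
        System.out.println("blocking listener, peak in flight: " + peakInFlight(12, 6, 1, 50));
        System.out.println("pool-sized listener, peak in flight: " + peakInFlight(12, 6, 6, 50));
    }
}
```

In the one-permit run the peak never exceeds 1, no matter how many annotator instances exist, which matches the "6 instances but no speedup" symptom reported in this thread. In the six-permit run the peak is bounded by the pool size instead.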

I will be working on solving this shortly. It should be fixed in the upcoming
2.4.2 uima-as release. For now I have created a JIRA for this problem:
https://issues.apache.org/jira/browse/UIMA-3160

JC


Re: UIMA AS Asynchronous = true not behaving as expected.

Posted by Jaroslaw Cwiklik <ui...@gmail.com>.

Instead of speculating, I decided to set up a test with a similar
configuration (casPool=6) and am noticing a similar problem. It appears that
the input queue is drained one CAS at a time, as if there were only one
thread processing the input queue.

Investigating...

JC


Re: UIMA AS Asynchronous = true not behaving as expected.

Posted by Eddie Epstein <ea...@gmail.com>.

> Scaling via async=false and the number of instances = 6 DOES speed the
> pipeline up by 6x, so I think the client is working correctly. It seems to
> just be an issue when the async=true. I checked in the JMX console with
> async=true and slow annotator = 6 instances and inside JMX UIMA reports
> there are 6 instances of slow annotator configured. So it appears the
> configuration is getting set right inside of UIMA.
>
> Other thoughts?
>

Use JMX to double-check that the casPool size = 6?

Eddie

Re: UIMA AS Asynchronous = true not behaving as expected.

Posted by "RYAN C. CORNIA" <ry...@utah.edu>.

I removed:

<analysisEngine async="true">
                <scaleout numberOfInstances="1"/> <== removed at the top-level analysis engine

from the deployment descriptor, but with the same results. (There was no
warning before.)

I am calling the engine programmatically:

mUAEngine.setCollectionReader(collectionReader);
mUAEngine.initialize(mAppCtx);

log.info("SerializationStrategy: " + mUAEngine.getSerializationStrategy());

/*
 * Catch exceptions that are thrown during processing, output the exception
 * to the log, then make sure the engine is stopped.
 */
try {
	mUAEngine.process();
	...
}

Scaling via async=false and the number of instances = 6 DOES speed the
pipeline up by 6x, so I think the client is working correctly. It seems to
just be an issue when the async=true. I checked in the JMX console with
async=true and slow annotator = 6 instances and inside JMX UIMA reports
there are 6 instances of slow annotator configured. So it appears the
configuration is getting set right inside of UIMA.

Other thoughts?
Ryan

Re: UIMA AS Asynchronous = true not behaving as expected.

Posted by Eddie Epstein <ea...@gmail.com>.

What client program are you using to drive the uima-as service? Anything
using sendAndReceiveCAS will only send one at a time.

Please use the runRemoteAsyncAE.sh program and specify the number of
outstanding CASes with -p. With no specification the number is 2.

Eddie



Re: UIMA AS Asynchronous = true not behaving as expected.

Posted by Jaroslaw Cwiklik <ui...@gmail.com>.

Just checked: indeed, this warning is not shown if async=true is added. I
thought I was on to something.

I would still advise rerunning the test case without the <scaleout
numberOfInstances="1"/> setting.

Also, use JConsole attached to the broker to confirm that the input
queue is in fact drained one CAS at a time.

JC



Re: UIMA AS Asynchronous = true not behaving as expected.

Posted by Marshall Schor <ms...@schor.com>.

On 8/7/2013 2:07 PM, Jaroslaw Cwiklik wrote:
> When you launch your service, do you see a Warning similar to this:
>
> *** WARN: line-number: 30 Top Lovel Async Primitive specifies a scaleout of
> numberOfInstances="1", but also specifies a Cas Pool size of
> numberOfCASes="6". The Cas Pool size is being forced to be the same as the
> scaleout.
My guess - that message doesn't occur, because the Top Level is not an Async
*Primitive*, because it specifies async=true.

I thought Async Primitive meant async=false (or an actual UIMA primitive)?

-Marshall

Re: UIMA AS Asynchronous = true not behaving as expected.

Posted by Jaroslaw Cwiklik <ui...@gmail.com>.

When you launch your service, do you see a Warning similar to this:

*** WARN: line-number: 30 Top Lovel Async Primitive specifies a scaleout of
numberOfInstances="1", but also specifies a Cas Pool size of
numberOfCASes="6". The Cas Pool size is being forced to be the same as the
scaleout.


If yes,

<analysisEngine async="true">
                <scaleout numberOfInstances="1"/>  *<<<< REMOVE THIS LINE
FROM Deployment Descriptor*

JC

