Posted to dev@uima.apache.org by "Marshall Schor (JIRA)" <ui...@incubator.apache.org> on 2008/08/20 17:52:44 UTC

[jira] Commented: (UIMA-1146) Setting the number of concurrent listeners of a reply queue for Co-located Delegates

    [ https://issues.apache.org/jira/browse/UIMA-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624031#action_12624031 ] 

Marshall Schor commented on UIMA-1146:
--------------------------------------

The need for this became apparent when Adam found the next bottleneck in scaleout using UIMA-AS.  He found that the work of the UIMA Aggregate controller (running the flow controller code, and serializing the CAS back to the sender, if remote, or out to remote delegates) could be a bottleneck.  The framework already supports multiple threads for this work, but only one is being used.

Here is a summary of discussions about this with Eddie, Burn, and Tong.

This issue is about threads for doing the work of an aggregate.
  Threads for doing the work of a primitive are specified using <scaleout numberOfInstances="nnn"/>
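
  As a rough sketch of that existing primitive-side setting (the key "MyAnnotator" and the instance count are made-up values, and the placement inside the descriptor is assumed here rather than quoted from a shipped descriptor):

     <analysisEngine key="MyAnnotator" async="false">
          <!-- hypothetical values: run 4 instances of this primitive -->
          <scaleout numberOfInstances="4"/>
     </analysisEngine>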

The work of an aggregate, done on a thread, is:
  1) deserializing (not needed for communication within co-located components)
  2) running the Flow Controller
  3) serializing a CAS (not always needed)
        either back to a caller (if not co-located) or out to a remote delegate

This work applies to both top-level aggregates and contained, co-located aggregates.

The threads for doing the work are associated with the queues involved.  Each queue could have a different number of threads.

There are 3 queues for an aggregate (top level, or inner, co-located), some of which may not be present in any given deployment.  The three are:

  1) the input queue for this aggregate.  Note: inner aggregates have their own input queue
  2) local reply queue for co-located delegates
  3) remote reply queue for remote delegates
     
UIMA-1130 allowed specifying the scaleout for queue # 3.

This Jira is to allow specifying the scaleouts for queues # 1 and # 2.

This specification is needed at multiple levels of aggregation (for those analysisEngines having async="true"; analysisEngines with async="false" are treated by UIMA-AS as "primitives")

There are two specifications needed in general:  
  1) one for the internal reply queue (in addition to the existing remote reply queues), and
  2) one for the input queue.

I think it would be less confusing if we avoided overloading the same element name (i.e., replyQueue) with different meanings, depending on context.

I would propose the following:

Each <analysisEngine async="true"> element would have a spec for these two scaleout numbers.  This could be done as attributes on the <analysisEngine... > spec itself, or as one or two nested elements.  Here are three proposals:

attributes on <analysisEngine> itself:
     <analysisEngine async="true" internalReplyQueueScaleout="nn1"  inputQueueScaleout="nn2">

one nested element:
     <analysisEngine async="true">
          <aggregateWorkScaleout  internalReplyQueue="nn1"  inputQueue="nn2"/>

two nested elements:
     <analysisEngine async="true">
          <internalReplyQueue scaleout="nn1"/>
          <inputQueue scaleout="nn2"/>

I prefer the first alternative, but not strongly.  If we did this, I would also propose changing what we did for UIMA-1130 to follow this same syntax, adding a remoteReplyQueueScaleout="nn1" to the <remoteAnalysisEngine> element.  We still have time to change that, I think, if we want to.
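
As an illustration only, the first alternative might be spelled out across two levels of aggregation as follows.  The attribute names are the ones proposed above and are not implemented syntax; the scaleout values are arbitrary, and the delegate keys and queue details are borrowed from the UIMA-1146 description quoted below.

     <analysisEngine async="true" internalReplyQueueScaleout="2" inputQueueScaleout="3"> <!-- top aggregate -->
          <delegates>
               <!-- co-located inner aggregate: has its own input queue and internal reply queue -->
               <analysisEngine key="NamesAndPersonTitlesTAE" async="true"
                               internalReplyQueueScaleout="3" inputQueueScaleout="1"/>
               <!-- remote delegate: proposed attribute in place of the UIMA-1130 replyQueue element -->
               <remoteAnalysisEngine key="RoomNumber" remoteReplyQueueScaleout="2">
                    <inputQueue brokerURL="tcp://localhost:61616" endpoint="RoomNumberAnnotatorQueue"/>
               </remoteAnalysisEngine>
          </delegates>
     </analysisEngine>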

Other opinions?

> Setting the number of concurrent listeners of a reply queue for Co-located Delegates
> ------------------------------------------------------------------------------------
>
>                 Key: UIMA-1146
>                 URL: https://issues.apache.org/jira/browse/UIMA-1146
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Async Scaleout
>    Affects Versions: 2.2.2AS
>            Reporter: Tong Fin
>            Assignee: Tong Fin
>
> UIMA-1130 has improved UIMA-AS to allow users to set the number of concurrent listeners of a reply queue for each "remote" delegate. The following is the syntax in the xml deployment descriptor (as an example):
>       <analysisEngine async="true">
>         <delegates>
>           <remoteAnalysisEngine key="RoomNumber">
>             <inputQueue brokerURL="tcp://localhost:61616" endpoint="RoomNumberAnnotatorQueue"/>
>             <replyQueue concurrentConsumers="2" location="remote"/>
>             ...
>           </remoteAnalysisEngine>
>         </delegates>
>         ...
>       </analysisEngine>
> This JIRA will do a similar thing by allowing users to set the number of concurrent listeners of a reply queue for "co-located" delegates inside the UIMA-AS aggregate.
> The following is the "proposed" syntax:
>       <analysisEngine async="true"> <!-- Top aggregate -->
>         <replyQueue concurrentConsumers="2"/>
>         ...
>         <delegates>
>           <analysisEngine key="NamesAndPersonTitlesTAE" async="true"> <!-- co-located aggregate -->
>             <replyQueue concurrentConsumers="3"/>
>             ...
>           </analysisEngine>
>           ...
>         </delegates>
>         ...
>       </analysisEngine>
