Posted to user@uima.apache.org by Charles Proefrock <ch...@hotmail.com> on 2008/10/27 02:34:28 UTC

RE: CPE to AS Transition ... Porting processingUnitThreadCount

We've made progress on our transition and now have an app and a configuration that we believe should meet our needs.  We have an application derived from RunRemoteAsyncAE with the CR and 1 deployed Aggregate.  The deployed aggregate uses a mixture of numInstances and remoteAsyncEngines to achieve the objectives outlined in my original message.  This is not much different from what I originally described, but we now have a better understanding of the AS approach to scale-up, and the result surpasses what we had achieved with the processingUnitThreadCount in the CPE application.
 
We are experiencing one problem at this point that I didn't expect: a null CAS is returned on error conditions.
 
In our CPE implementation we manage a database queue by accessing the queue with a Collection Reader and then completing the queue update via the StatusCallbackListener::entityProcessComplete() call.  Independent of whether the call ends with success or error, the CAS reference is valid, so we can access specific information about the CAS and properly update the database status.
 
Our first attempt at error detection in our AS implementation results in a null CAS passed into the entityProcessComplete() call.  This occurs on a remote delegate timeout exception.  I did not test other error scenarios to determine if the CAS is valid.
 
Please let me know how to configure the system to return a valid CAS reference on error to the StatusCallbackListener.
 
Thanks,
 
Charles



> Date: Thu, 25 Sep 2008 09:27:18 -0400
> From: eaepstein@gmail.com
> To: uima-user@incubator.apache.org
> Subject: Re: CPE to AS Transition ... Porting processingUnitThreadCount
>
> In order to optimize deployment, it is good to focus on where the work is
> being done and then what overhead is added by the framework.
>
> In your case all the work for components CR, A, C and D is expected to be
> done on Machine 0. Separating the CR from the aggregate adds unnecessary CAS
> serialization overhead for every document, so it would be better to move the
> CR into the aggregate. Components A, C or D can be replicated as needed
> (using numInstances as appropriate for each) in the one aggregate instance.
>
> Machines 1..N are then used to scale out multiple instances of B.
>
> RunRemoteAsyncAE could just send an "empty" CAS to kick off the CR in the
> aggregate, or the CAS could contain information about the collection to be
> processed by the CR.
>
> Note that RunRemoteAsyncAE is a fairly simple application, and it is the
> UIMA AS async API that optionally deploys colocated services and/or
> optionally instantiates a CR. My point is that RunRemoteAsyncAE could be
> replaced with a custom application that (via unspecified mechanisms) deploys
> B on remote machines, then deploys the aggregate in the same JVM, runs it,
> and shuts everything down at the end.
>
> Eddie
>
> On Thu, Sep 25, 2008 at 6:54 AM, Charles Proefrock <ch...@hotmail.com> wrote:
>
> > I've reviewed Fig. 4 and Fig. 3. Our system seems closer to Fig. 3
> > (a single Collection Reader (CR) with CasPool size X used to push documents
> > to X services). Assuming the "Service Instance" is an aggregate (AG) with
> > multiple AE steps A..D, we are extending Fig. 3 with another level of remote
> > AE for one of the steps:
> >
> > Machine0: Broker + RunRemoteAsyncAE + 2 AG Service Instances
> > Machine1: RemoteStepB_AE Instance
> > Machine2: RemoteStepB_AE Instance
> >
> > The AG descriptor is configured with A..D in-line, and the AG deployment
> > descriptor has a remote 'B' override, possibly with error handling
> > controls, etc.
> >
> > CR --- || --2-- A B ---
> > || --2-- remote 'B' C
> > D (consumer)
> >
> > If I'm following your guidance, we should not use numInstances in the AG
> > deployment descriptor because we have decided to remote 'B'. Instead we
> > need to deploy the 2 AG Service Instances via our own launch mechanism
> > (as either multiple -d flags on RunRemoteAsyncAE, or independently in
> > their own processes). Let me know if I'm on track.
> >
> > - Charles

Re: CPE to AS Transition ... Porting processingUnitThreadCount

Posted by Jaroslaw Cwiklik <cw...@us.ibm.com>.




Eddie, I agree that this needs to be patched in the 2.2.2 code. Instead of
passing null for a CAS while handling an exception, we can extract the CAS
from the cache and pass it along to the listener's entityProcessComplete()
in the first arg. There is really no need for the client code to maintain
the Map to manage CASes via CAS Ids.

I will fix the code early next week.


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Jerry Cwiklik
 UIMA Extensions
 IBM T.J.  Watson Research Center
 Hawtorne, NY, 10532
 Tel: 914-784-7665,  T/L: 863-7665
 Email: cwiklik@us.ibm.com



"Eddie Epstein" <eaepstein@gmail.com>
10/28/2008 05:48 PM
Please respond to uima-user@incubator.apache.org

To: uima-user@incubator.apache.org
cc:
Subject: Re: CPE to AS Transition ... Porting processingUnitThreadCount




On Sun, Oct 26, 2008 at 9:34 PM, Charles Proefrock <ch...@hotmail.com>
wrote:
>
> Our first attempt at error detection in our AS implementation results in
> a null CAS passed into the entityProcessComplete() call.

One might wonder why the CAS would ever be null for the method
entityProcessComplete(CAS aCas, EntityProcessStatus aStatus). During
initial development, the UIMA AS client released outgoing CASes after
sendCAS, and therefore a CAS was only available when received back
from the service. However, on error services do not return a CAS.
Later the client was changed to cache outgoing CASes.

Given the cache, the original outgoing CAS should be returned to the
handler on errors. See UIMA-1217.

Thanks for alerting us to this problem,
Eddie

Re: CPE to AS Transition ... Porting processingUnitThreadCount

Posted by Eddie Epstein <ea...@gmail.com>.
On Sun, Oct 26, 2008 at 9:34 PM, Charles Proefrock <ch...@hotmail.com> wrote:
>
> Our first attempt at error detection in our AS implementation results in a null CAS passed into the entityProcessComplete() call.

One might wonder why the CAS would ever be null for the method
entityProcessComplete(CAS aCas, EntityProcessStatus aStatus). During
initial development, the UIMA AS client released outgoing CASes after
sendCAS, and therefore a CAS was only available when received back
from the service. However, on error services do not return a CAS.
Later the client was changed to cache outgoing CASes.

Given the cache, the original outgoing CAS should be returned to the
handler on errors. See UIMA-1217.

Thanks for alerting us to this problem,
Eddie

Re: CPE to AS Transition ... Porting processingUnitThreadCount

Posted by Eddie Epstein <ea...@gmail.com>.
Hi Charles,

Glad to hear of your progress. The problem you are describing was
anticipated not long before the first release, and although we added
a mechanism to retrieve the CAS, it was not adequately documented.

The sendCAS method returns a string which is a unique CAS reference ID.
This reference ID is then available from the status object returned to
entityProcessComplete(). The user must save an association from CAS
reference IDs to each CAS given to sendCAS.

Here is code that Jerry Cwiklik gave me to extract the reference ID
from the status object:

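import org.apache.uima.aae.client.UimaASProcessStatus;
import org.apache.uima.cas.CAS;
import org.apache.uima.collection.EntityProcessStatus;

public synchronized void entityProcessComplete(CAS aCAS,
    EntityProcessStatus aProcessStatus)
{
    // The reference ID is the string that sendCAS returned for this CAS.
    String casReferenceId = "";
    if (aProcessStatus instanceof UimaASProcessStatus)
    {
        casReferenceId =
            ((UimaASProcessStatus) aProcessStatus).getCasReferenceId();
    }
    // ... look up the saved CAS by casReferenceId and handle the result ...
}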

The object returned should always be an instance of UimaASProcessStatus,
so the check should not be necessary, but the casting is.
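
For the "save an association" step described above, here is a minimal sketch of one way a client might keep that bookkeeping. PendingCasRegistry is a hypothetical helper, not part of UIMA AS; it is written generically so it does not depend on the UIMA jars, and in a real client the type parameter would presumably be CAS. It assumes the reference ID is non-null when remember() is called.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of UIMA AS: remembers each outgoing item
// under the reference ID that sendCAS returned for it.
final class PendingCasRegistry<T> {
    private final Map<String, T> pending = new ConcurrentHashMap<>();

    // Call right after sendCAS, passing the ID it returned.
    void remember(String casReferenceId, T cas) {
        pending.put(casReferenceId, cas);
    }

    // Call from entityProcessComplete(): returns the saved item and forgets it,
    // or returns null when the ID is null or unknown.
    T complete(String casReferenceId) {
        if (casReferenceId == null) {
            return null;
        }
        return pending.remove(casReferenceId);
    }

    // Number of items sent but not yet completed.
    int size() {
        return pending.size();
    }
}
```

A listener could call complete(casReferenceId) with the ID extracted from the status object and fall back to the returned item whenever the CAS argument is null. Because the ID only exists once sendCAS has returned, a callback could in principle arrive before remember() has run, so a real client may need to handle a null lookup.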

***Note***
While researching this reply, we've found a framework bug (Jira 1216):
on an error condition, the UimaAsynchronousEngine does not release
the CAS given to sendCAS. As a workaround, on errors the listener
should do CAS.release(). Unfortunately, when using the next version
with this bug fixed, the handler code must not do the release.
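
Since the workaround has to be switched off once the fix ships, one option is to put the release decision behind a single flag. The sketch below is only an illustration under that assumption: Releasable and ErrorReleaseWorkaround are hypothetical names, Releasable stands in for the one CAS method used here, and the flag would have to be set by the application according to the UIMA AS version it runs against.

```java
// Hypothetical stand-in for the one CAS method this sketch needs.
interface Releasable {
    void release();
}

// Hypothetical helper: releases on error only while the framework does not.
final class ErrorReleaseWorkaround {
    private final boolean frameworkReleasesOnError;

    ErrorReleaseWorkaround(boolean frameworkReleasesOnError) {
        this.frameworkReleasesOnError = frameworkReleasesOnError;
    }

    // Returns true when this call performed the release itself.
    boolean onComplete(Releasable cas, boolean failed) {
        if (failed && cas != null && !frameworkReleasesOnError) {
            cas.release();
            return true;
        }
        return false;
    }
}
```

In a listener, the failed argument could come from aProcessStatus.isException(), and the CAS passed in would be the one saved at sendCAS time, since the callback's own CAS argument may be null on error.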

Regards,
Eddie
