Posted to user@uima.apache.org by "greg@holmberg.name" <ho...@comcast.net> on 2007/08/15 08:08:34 UTC

CasPools in a general service

I'm wondering how best to use CasPools in my system.

My system is a general service--that is, it may receive concurrent requests with arbitrary AnalysisEngineDescriptions from different applications.

Some requests may use the exact same AED object, in which case, it's obvious that documents processed from those requests can share the same AnalysisEngine and CasPool.

However, it's also likely that the system will receive AEDs that are equivalent--that is, they have the same annotators with the same configuration parameters, but they are two physically different AED objects.

Now, in this case, it would be possible to use the same AE and CasPool, if there were a way to tell that they are equivalent.  Unfortunately, the equals() methods on AnalysisEngineDescription and AnalysisEngine won't tell me this.  So what I currently do is create separate CasPools.

Is it worth it, for performance and memory usage, to write a method that compares two AEDs to determine whether they are equivalent?  Or is creating CASes and CasPools not expensive enough to justify the work, so that I should just continue with separate CasPools?

Going further, it appears that two AnalysisEngines could share the same CasPool if only their type systems are the same--the AEs themselves don't even have to be the same (they could have different configuration parameter values, for example).  They merely need the same CasDefinition. Is there an easy way to determine if two AEDs have the same type system or CasDefinition, and so could share a CasPool?
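
The sharing idea in the paragraph above can be sketched as a registry that hands out one pool per distinct key. This is only an illustration under stated assumptions: SharedPoolRegistry is a hypothetical name, K stands in for whatever describes the type system (it must have value-based equals() and hashCode()), and P stands in for the pool. No UIMA classes are used, so nothing here should be read as UIMA's own API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical registry (not part of UIMA): returns one shared pool per
 * distinct key. K must implement value-based equals() and hashCode().
 */
class SharedPoolRegistry<K, P> {
    private final Map<K, P> pools = new ConcurrentHashMap<>();
    private final Function<K, P> poolFactory;

    SharedPoolRegistry(Function<K, P> poolFactory) {
        this.poolFactory = poolFactory;
    }

    /** Creates the pool on first use of a key; later equal keys reuse it. */
    P poolFor(K key) {
        // computeIfAbsent is atomic on ConcurrentHashMap, so concurrent
        // requests with equal keys end up with the same pool instance.
        return pools.computeIfAbsent(key, poolFactory);
    }

    /** Number of distinct pools created so far. */
    int size() {
        return pools.size();
    }
}
```

With this shape, two requests whose keys compare equal receive the same pool even when the key objects are physically different, so the open question reduces to finding a key whose equals() reflects type system equivalence.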

Thanks,


Greg Holmberg

Re: CasPools in a general service

Posted by Marshall Schor <ms...@schor.com>.
From AnalysisEngineDescriptions, you should be able to get TypeSystemDescriptions.

The equals method for this (see MetaDataObject_impl equals method) appears to do the right thing of recursively descending the substructures, doing equals comparisons on all the parts.
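
To illustrate what a recursively descending equals means in practice, here is a minimal sketch using stand-in classes. FeatureSketch and TypeSketch are hypothetical and are not UIMA classes; they only mimic the idea that a description is equal to another when every nested part is equal. In UIMA itself the description would presumably be obtained with something like aed.getAnalysisEngineMetaData().getTypeSystem(), which is not exercised here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Stand-in for a feature of a type (hypothetical, not a UIMA class). */
final class FeatureSketch {
    final String name;
    final String rangeType;

    FeatureSketch(String name, String rangeType) {
        this.name = name;
        this.rangeType = rangeType;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof FeatureSketch)) return false;
        FeatureSketch other = (FeatureSketch) o;
        return Objects.equals(name, other.name)
            && Objects.equals(rangeType, other.rangeType);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, rangeType);
    }
}

/** Stand-in for a type description holding a list of features. */
final class TypeSketch {
    final String name;
    final List<FeatureSketch> features;

    TypeSketch(String name, List<FeatureSketch> features) {
        this.name = name;
        // Defensive copy so later changes to the caller's list do not leak in.
        this.features = Collections.unmodifiableList(new ArrayList<>(features));
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof TypeSketch)) return false;
        TypeSketch other = (TypeSketch) o;
        // List.equals compares element by element, which in turn calls
        // FeatureSketch.equals: this is the recursive descent.
        return Objects.equals(name, other.name)
            && features.equals(other.features);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, features);
    }
}
```

Two TypeSketch objects built separately from the same names and features compare equal and share a hash code, which is the property a map key for shared pools would need. Note that in this sketch the feature list comparison is order-sensitive.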

Do you have a case where you have two different TypeSystemDescription objects which you consider "equal" for which the equals method isn't working?

Thanks. -Marshall
