Posted to user@uima.apache.org by Richard Eckart de Castilho <ec...@tk.informatik.tu-darmstadt.de> on 2009/08/10 21:52:59 UTC

Are CasConsumers deprecated by AnalysisEngines?

Hello everybody,

Recently, the question of replacing all our CasConsumers with
AnalysisEngines has come up quite frequently, and the potential myth that
CasConsumers should no longer be used and AnalysisEngines should be
preferred instead crops up every once in a while. I tried to find
definitive information about this in the mailing list archives, but
could not.

In 2007 Adam Lally stated:

> In recent versions of UIMA there is essentially no difference  
> between an AnalysisEngine and a CasConsumer

I think at least in CPEs it makes a difference, since starting with the
first CasConsumer the CPE is supposed to run in a single thread, so the
CasConsumer sees all the documents. The CPE, however, is AFAIK
deprecated and replaced by UIMA AS.

If an AnalysisEngine has "allow multiple deployment" set to "false",
that should have the same effect within a CPE or within UIMA AS.

So the questions are:

- Is there any reason why CasConsumers are still required?

- Is there any reason that a CasConsumer cannot be replaced "as  
is" (minus inheriting from a different base class and changing the  
descriptor type) by an AnalysisEngine that may not be deployed multiply?

Best regards,

Richard Eckart de Castilho

-- 
-------------------------------------------------------------------
Richard Eckart de Castilho
Software Engineer
Ubiquitous Knowledge Processing Lab
FB 20 Computer Science Department
Technische Universität Darmstadt
Hochschulstr. 10, D-64289 Darmstadt, Germany
phone +49 (6151) 16 - 6218, fax -5455, room S2/02/E225
eckartde@tk.informatik.tu-darmstadt.de
www.ukp.tu-darmstadt.de
-------------------------------------------------------------------




Re: Are CasConsumers deprecated by AnalysisEngines?

Posted by Jörn Kottmann <ko...@gmail.com>.
Marshall Schor wrote:
> Richard Eckart de Castilho wrote:
>   
>> Hello everybody,
>>
>> Recently, the question of replacing all our CasConsumers with
>> AnalysisEngines has come up quite frequently, and the potential myth that
>> CasConsumers should no longer be used and AnalysisEngines should be
>> preferred instead crops up every once in a while. I tried to find
>> definitive information about this in the mailing list archives, but
>> could not.
>>
>> In 2007 Adam Lally stated:
>>
>>     
>>> In recent versions of UIMA there is essentially no difference between
>>> an AnalysisEngine and a CasConsumer
>>>       
>> I think at least in CPEs it makes a difference, since starting with the
>> first CasConsumer the CPE is supposed to run in a single thread, so the
>> CasConsumer sees all the documents. The CPE, however, is AFAIK
>> deprecated and replaced by UIMA AS.
>>
>> If an AnalysisEngine has "allow multiple deployment" set to "false",
>> that should have the same effect within a CPE or within UIMA AS.
>>
>> So the questions are:
>>
>> - Is there any reason why CasConsumers are still required?
>>
>> - Is there any reason that a CasConsumer cannot be replaced "as is"
>> (minus inheriting from a different base class and changing the
>> descriptor type) by an AnalysisEngine that may not be deployed multiply?
>>     
>
> I think this is basically correct.  The one thing I'm not completely
> sure of, because I haven't checked (:-) ), is whether calling something a
> Cas Consumer moves the component in the default flow to the end of the
> pipeline in a CPE.  In UIMA-AS, it doesn't, I think.
>   
Is there a reason why Cas Consumer is not deprecated?

The tutorial and dev guide suggest that it should no longer be used:

2.4.3:
"... We recommend for future work that users implement and use Analysis 
Engine components instead of CAS Consumers. ..."

Jörn

Re: Are CasConsumers deprecated by AnalysisEngines?

Posted by Marshall Schor <ms...@schor.com>.

Richard Eckart de Castilho wrote:
> Hello everybody,
>
> Recently, the question of replacing all our CasConsumers with
> AnalysisEngines has come up quite frequently, and the potential myth that
> CasConsumers should no longer be used and AnalysisEngines should be
> preferred instead crops up every once in a while. I tried to find
> definitive information about this in the mailing list archives, but
> could not.
>
> In 2007 Adam Lally stated:
>
>> In recent versions of UIMA there is essentially no difference between
>> an AnalysisEngine and a CasConsumer
>
> I think at least in CPEs it makes a difference, since starting with the
> first CasConsumer the CPE is supposed to run in a single thread, so the
> CasConsumer sees all the documents. The CPE, however, is AFAIK
> deprecated and replaced by UIMA AS.
>
> If an AnalysisEngine has "allow multiple deployment" set to "false",
> that should have the same effect within a CPE or within UIMA AS.
>
> So the questions are:
>
> - Is there any reason why CasConsumers are still required?
>
> - Is there any reason that a CasConsumer cannot be replaced "as is"
> (minus inheriting from a different base class and changing the
> descriptor type) by an AnalysisEngine that may not be deployed multiply?

I think this is basically correct.  The one thing I'm not completely
sure of, because I haven't checked (:-) ), is whether calling something a
Cas Consumer moves the component in the default flow to the end of the
pipeline in a CPE.  In UIMA-AS, it doesn't, I think.

-Marshall
>
> Best regards,
>
> Richard Eckart de Castilho
>

Re: Are CasConsumers deprecated by AnalysisEngines?

Posted by Eddie Epstein <ea...@gmail.com>.
> Interesting. How do you replace a CollectionReader with an
> AnalysisEngine? Is there a way for an AnalysisEngine to create a new
> CAS for the pipeline?

Cas Multipliers are a type of Analysis Engine that produces new CASes.
To support them, an additional AE method, processAndOutputNewCASes(CAS
acas), was added.

The CPE does not support processAndOutputNewCASes.
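
To make the idea concrete, here is a rough plain-Java sketch of the
"one CAS in, several new CASes out" pattern. It does not use the UIMA
API: the Multiplier interface, the LineSplitter class and the drain
helper are hypothetical names made up for illustration, and plain
Strings stand in for CASes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class CasMultiplierSketch {

    /** Hypothetical stand-in for an engine that outputs new CASes (not a UIMA type). */
    interface Multiplier {
        // Takes one input "CAS" and returns an iterator over newly produced "CASes".
        Iterator<String> processAndOutputNew(String inputCas);
    }

    /** Example multiplier: one output "CAS" per non-empty line of the input. */
    static final class LineSplitter implements Multiplier {
        @Override
        public Iterator<String> processAndOutputNew(String inputCas) {
            List<String> produced = new ArrayList<>();
            for (String line : inputCas.split("\n")) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    produced.add(trimmed);
                }
            }
            return produced.iterator();
        }
    }

    /** The caller pulls every new "CAS" from the iterator and passes it downstream. */
    static List<String> drain(Multiplier multiplier, String inputCas) {
        List<String> downstream = new ArrayList<>();
        Iterator<String> it = multiplier.processAndOutputNew(inputCas);
        while (it.hasNext()) {
            downstream.add(it.next());
        }
        return downstream;
    }

    public static void main(String[] args) {
        for (String cas : drain(new LineSplitter(), "doc one\n\n doc two \ndoc three")) {
            System.out.println(cas);
        }
    }
}
```

A component written this way can play the role of a collection reader:
the caller hands it one input and keeps pulling new items until the
iterator is exhausted.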

More is at http://incubator.apache.org/uima/downloads/releaseDocs/2.2.2-incubating/docs/html/tutorials_and_users_guides/tutorials_and_users_guides.html#ugr.tug.cm

Eddie

Re: Are CasConsumers deprecated by AnalysisEngines?

Posted by Steven Bethard <st...@gmail.com>.
On Tue, Aug 11, 2009 at 6:49 AM, Eddie Epstein<ea...@gmail.com> wrote:
> CasConsumers are not required. They (and collection readers) can be
> replaced by AnalysisEngines with appropriate operational properties in
> order to have a single component interface that needs to be
> implemented for primitives, aggregates, remote services, or anything
> else that comes along.

Interesting. How do you replace a CollectionReader with an
AnalysisEngine? Is there a way for an AnalysisEngine to create a new
CAS for the pipeline?

Steve
-- 
Where did you get that preposterous hypothesis?
Did Steve tell you that?
        --- The Hiphopopotamus

Re: Are CasConsumers deprecated by AnalysisEngines?

Posted by Eddie Epstein <ea...@gmail.com>.
> In 2007 Adam Lally stated:
>
>> In recent versions of UIMA there is essentially no difference between an
>> AnalysisEngine and a CasConsumer
>
> I think at least in CPEs it makes a difference, since starting with the
> first CasConsumer the CPE is supposed to run in a single thread, so the
> CasConsumer sees all the documents. The CPE, however, is AFAIK deprecated
> and replaced by UIMA AS.

The CPE logic for what runs in the final "CasConsumer thread" is to
walk the pipeline backwards, stopping at the first component that is
"parallelizable". Any non-parallelizable components that occur before
that are run in the processing pipeline thread(s), but only a single
instance of each such component is run and all pipelines route CASes
through it.
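
The backward walk described above could be sketched roughly as follows.
This is plain Java, not the actual CPE implementation: the Component
class, the method names and the example pipeline are all hypothetical
and only model the rule as stated in this message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CpeThreadSplitSketch {

    /** Hypothetical stand-in for a pipeline component (not a UIMA class). */
    static final class Component {
        final String name;
        final boolean parallelizable;

        Component(String name, boolean parallelizable) {
            this.name = name;
            this.parallelizable = parallelizable;
        }
    }

    /**
     * Walks the pipeline backwards and stops at the first parallelizable
     * component. Everything from the returned index onward would run in
     * the final "CasConsumer thread".
     */
    static int consumerThreadStart(List<Component> pipeline) {
        int start = pipeline.size();
        while (start > 0 && !pipeline.get(start - 1).parallelizable) {
            start--;
        }
        return start;
    }

    /** Labels each component with where it would run under the rule above. */
    static List<String> describe(List<Component> pipeline) {
        int start = consumerThreadStart(pipeline);
        List<String> placement = new ArrayList<>();
        for (int i = 0; i < pipeline.size(); i++) {
            Component c = pipeline.get(i);
            String where;
            if (i >= start) {
                where = "final CasConsumer thread";
            } else if (c.parallelizable) {
                where = "processing thread, one instance per thread";
            } else {
                where = "processing thread, single shared instance";
            }
            placement.add(c.name + ": " + where);
        }
        return placement;
    }

    public static void main(String[] args) {
        List<Component> pipeline = Arrays.asList(
                new Component("tokenizer", true),
                new Component("deduplicator", false),
                new Component("tagger", true),
                new Component("indexWriter", false),
                new Component("statsWriter", false));
        for (String line : describe(pipeline)) {
            System.out.println(line);
        }
    }
}
```

In this made-up pipeline the two trailing writers land in the final
thread, while the non-parallelizable deduplicator stays in the
processing stage as a single shared instance.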

The CPE is not deprecated, but all development effort for scale-up
moved to UIMA AS.

>
> If an AnalysisEngine has "allow multiple deployment" set to "false", that
> should have the same effect within a CPE or within UIMA AS.
>
> So the questions are:
>
> - Is there any reason why CasConsumers are still required?
>
> - Is there any reason that a CasConsumer cannot be replaced "as is" (minus
> inheriting from a different base class and changing the descriptor type) by
> an AnalysisEngine that may not be deployed multiply?

CasConsumers are not required. They (and collection readers) can be
replaced by AnalysisEngines with appropriate operational properties in
order to have a single component interface that needs to be
implemented for primitives, aggregates, remote services, or anything
else that comes along.

Eddie