Posted to user@uima.apache.org by Nicolas Paris <ni...@gmail.com> on 2017/06/14 13:40:19 UTC

Parallel Text transformation before CAS

Hi

My goal is to apply a transformation to texts before they are turned
into CASes. As far as I can tell, the only places to do that are a
Collection Reader or a (deprecated) Collection Initialiser.

According to the documentation, the parallel features of UIMA apply
only to Analysis Engines. So if I want to distribute my transformations
on raw texts, I need to hardcode the parallelism in the Collection
Reader. Is there no way to configure it with "casPoolSize"?

Am I missing something?


Thanks in advance,

-- 
Nicolas

Re: Parallel Text transformation before CAS

Posted by Nicolas Paris <ni...@gmail.com>.

> you can use different Views for this kind of task.
> 1. Create a Reader which dumps the text as is, without transformation, into a View/Sofa.

Hi Johannes

Indeed, the View/Sofa method looks promising.

BTW, I now wonder whether the collection process loads all the texts
into RAM before the distributed AEs start running?

-- 
Nicolas

Re: Parallel Text transformation before CAS

Posted by Nicolas Paris <ni...@gmail.com>.

> you can use different Views for this kind of task.
> 1. Create a Reader which dumps the text as is, without transformation, into a View/Sofa.
> 2. Write AEs that take the View/Sofa with your raw text and create a new View/Sofa with the transformed text.
> 
> This way you can exploit the parallel processing of AEs to do the costly transformation.

Hi,
This means storing two copies of the text (one as is, and one
transformed). But the as-is copy is useless afterwards and takes up
resources.

Is there no way to work on only one version?

Thanks

Re: Parallel Text transformation before CAS

Posted by Johannes Darms <jo...@scai.fraunhofer.de>.

Hey Nicolas,

you can use different Views for this kind of task.
1. Create a Reader which dumps the text as is, without transformation, into a View/Sofa.
2. Write AEs that take the View/Sofa with your raw text and create a new View/Sofa with the transformed text.

This way you can exploit the parallel processing of AEs to do the costly transformation.
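
The two steps above can be modelled in plain Java to show the shape of the idea. This is only a rough sketch and deliberately does not use the UIMA API: the class TwoViewSketch, the Doc holder (a stand-in for a CAS), the view names "raw" and "transformed", and the toy transform() are all hypothetical names made up for illustration, and the fixed thread pool merely stands in for whatever parallel AE deployment is actually used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java model of the two-view idea; none of these names are UIMA API.
final class TwoViewSketch {

    // Hypothetical view names.
    static final String RAW_VIEW = "raw";
    static final String TRANSFORMED_VIEW = "transformed";

    // Stand-in for a CAS: a document that holds one text per named view.
    static final class Doc {
        private final Map<String, String> views = new ConcurrentHashMap<>();
        void setView(String name, String text) { views.put(name, text); }
        String getView(String name) { return views.get(name); }
    }

    // Step 1 (reader): store the text untouched, so the reader stays cheap.
    static Doc read(String text) {
        Doc doc = new Doc();
        doc.setView(RAW_VIEW, text);
        return doc;
    }

    // Placeholder for the costly transformation
    // (here: trim, collapse whitespace, lowercase).
    static String transform(String raw) {
        return raw.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }

    // Step 2 (annotator): read the raw view, write the transformed view.
    static void process(Doc doc) {
        doc.setView(TRANSFORMED_VIEW, transform(doc.getView(RAW_VIEW)));
    }

    // Run step 2 on a fixed pool of workers, one task per document.
    static List<Doc> run(List<String> texts, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Doc> docs = new ArrayList<>();
            List<Future<?>> pending = new ArrayList<>();
            for (String text : texts) {
                Doc doc = read(text);
                docs.add(doc);
                pending.add(pool.submit(() -> process(doc)));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface any failure
            }
            return docs;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("transformation failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling TwoViewSketch.run(texts, 4) returns one Doc per input text, each holding both the raw and the transformed view. The reader does no work beyond storing the text, so all of the cost sits in the step that can be run in parallel.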


Regards,

Johannes
