Posted to user@uima.apache.org by Alexander Klenner <al...@scai.fraunhofer.de> on 2012/01/26 10:26:43 UTC

Having multiple instances of an AE writing in the same output file - thread safe

Hi there,

Is there a tutorial for the problem mentioned above? We have multiple instances of an AE whose output has to be collected in one final output file (all instances are supposed to share this file, e.g. via the UIMAContext). The order of writes to this file is not important: it doesn't matter whether CAS-A is written first and CAS-B second or vice versa.

Thanks a lot for any hints,

Alex




--
Alexander G. Klenner
Fraunhofer-Institute for Algorithms and Scientific Computing (SCAI)
Schloss Birlinghoven, D-53754 Sankt Augustin
Tel.: +49 - 2241 - 14 - 2736
E-mail: alexander.garvin.klenner@scai.fraunhofer.de
Internet: http://www.scai.fraunhofer.de



Re: Having multiple instances of an AE writing in the same output file - thread safe

Posted by Alexander Klenner <al...@scai.fraunhofer.de>.
Hi Richard,

thanks a lot, we did it exactly as shown on the uimaFIT example pages.

Cheers,

Alex










Re: Having multiple instances of an AE writing in the same output file - thread safe

Posted by Richard Eckart de Castilho <ec...@ukp.informatik.tu-darmstadt.de>.
Hi Alex,

you could model your output as a shared external resource and inject it into your AEs. 

This page about using external resources in uimaFIT might be helpful for you:

http://code.google.com/p/uimafit/wiki/uimaFitResources
http://code.google.com/p/uimafit/wiki/uimaFitResources#Resources_extending_Resource_ImplBase

I'd recommend that your data sink resource extend Resource_ImplBase rather than implement SharedResourceObject.
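
A minimal sketch of what such a shared data sink could look like, written in plain Java so that it runs without UIMA on the classpath. The names SharedFileSink and SharedFileSinkDemo are hypothetical and not taken from uimaFIT. In a real pipeline the sink class would extend Resource_ImplBase and be injected into each AE as described on the wiki pages above; here plain threads stand in for the AE instances. The only point illustrated is that one synchronized write method is enough to keep records from different instances from interleaving.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical shared sink; in UIMA this class would extend Resource_ImplBase (assumption). */
class SharedFileSink implements AutoCloseable {
    private final BufferedWriter writer;

    SharedFileSink(Path target) throws IOException {
        this.writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8);
    }

    /** Writes one record per line; synchronized so concurrent callers never interleave. */
    synchronized void write(String record) throws IOException {
        writer.write(record);
        writer.newLine();
    }

    @Override
    public synchronized void close() throws IOException {
        writer.close();
    }
}

class SharedFileSinkDemo {
    /** Simulates several AE instances (plain threads here) sharing one sink. */
    static List<String> run(int instances, int perInstance) throws Exception {
        Path out = Files.createTempFile("shared-sink", ".txt");
        try (SharedFileSink sink = new SharedFileSink(out)) {
            List<Thread> threads = new ArrayList<>();
            for (int i = 0; i < instances; i++) {
                final int id = i;
                Thread t = new Thread(() -> {
                    try {
                        for (int j = 0; j < perInstance; j++) {
                            sink.write("instance-" + id + " cas-" + j);
                        }
                    } catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                });
                threads.add(t);
                t.start();
            }
            // Wait for all writers before the sink is closed.
            for (Thread t : threads) {
                t.join();
            }
        }
        List<String> lines = Files.readAllLines(out, StandardCharsets.UTF_8);
        Files.delete(out);
        return lines;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(4, 100).size() + " lines written");
    }
}
```

Because the order of the records does not matter in this use case, no coordination beyond the lock on write is needed.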

-- Richard

-- 
------------------------------------------------------------------- 
Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab (UKP-TUD) 
FB 20 Computer Science Department      
Technische Universität Darmstadt 
Hochschulstr. 10, D-64289 Darmstadt, Germany 
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
eckart@ukp.informatik.tu-darmstadt.de 
www.ukp.tu-darmstadt.de 
Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
------------------------------------------------------------------- 







Re: Having multiple instances of an AE writing in the same output file - thread safe

Posted by Thilo Goetz <tw...@gmx.de>.
On 26/01/12 10:26, Alexander Klenner wrote:
> Hi there,
> 
> Is there a tutorial for the problem mentioned above? We have multiple instances of an AE whose output has to be collected in one final output file (all instances are supposed to share this file, e.g. via the UIMAContext). The order of writes to this file is not important: it doesn't matter whether CAS-A is written first and CAS-B second or vice versa.
> 
> Thanks a lot for any hints,
> 
> Alex

There are any number of solutions to this kind of problem, and
it really depends very much on what sort of infrastructure you
have available.  A sort of low-tech solution would be to write
each CAS to a temp file (using the Java tmp file facility to make
sure each file name is unique).  Then when your processing is
done (e.g., on the batchProcessComplete() call or whatever it's
called), concatenate them into one big file.
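
The temp-file approach could be sketched roughly as follows, in plain Java without any UIMA classes. TempFileCollector and TempFileCollectorDemo are hypothetical names, and plain threads stand in for the AE instances. The sketch assumes one collector object is reachable from all writers; in UIMA, writePart would be called from process() and mergeInto from an end-of-processing callback such as collectionProcessComplete(), but which component performs the merge is left open here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical helper: one temp file per CAS, merged once at the end. */
class TempFileCollector {
    // Thread-safe record of every temp file written so far.
    private final Queue<Path> parts = new ConcurrentLinkedQueue<>();

    /** Writes the output for one CAS to its own uniquely named temp file. */
    void writePart(String casOutput) throws IOException {
        Path part = Files.createTempFile("cas-", ".part");
        Files.write(part, casOutput.getBytes(StandardCharsets.UTF_8));
        parts.add(part);
    }

    /** Concatenates all parts into target and deletes them; call after all writers finish. */
    void mergeInto(Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            Path part;
            while ((part = parts.poll()) != null) {
                Files.copy(part, out);
                Files.delete(part);
            }
        }
    }
}

class TempFileCollectorDemo {
    /** Simulates casCount CASes processed in parallel, then merges the results. */
    static List<String> run(int casCount) throws Exception {
        TempFileCollector collector = new TempFileCollector();
        Thread[] workers = new Thread[casCount];
        for (int i = 0; i < casCount; i++) {
            final String output = "result for cas-" + i + System.lineSeparator();
            workers[i] = new Thread(() -> {
                try {
                    collector.writePart(output);
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        Path target = Files.createTempFile("merged-", ".txt");
        collector.mergeInto(target);
        List<String> lines = Files.readAllLines(target, StandardCharsets.UTF_8);
        Files.delete(target);
        return lines;
    }
}
```

No locking is needed while writing, since every CAS gets its own file; the only shared state is the queue of file names.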

If you have a Java Messaging infrastructure available, or don't
mind setting it up, you could use that to serialize the CASes.

Anyway, it's not really a UIMA-specific issue, right?

HTH,
Thilo
