Posted to user@uima.apache.org by Richard Eckart <ec...@linglit.tu-darmstadt.de> on 2008/06/04 16:35:53 UTC

Session variable potpurri in CPE

Hello there,

I have recently switched from my own home-cooked version of session
variables to using the UIMAContext session. Actually, I am using the
UIMAContextAdmin-rootContext for storing my session variables, as I
need to set them in the CollectionReader and read them in the
CASConsumer.

However, unless I set the casPoolSize to 1, I have the problem that
the CollectionReader overwrites the session variable before the
CASConsumer has read it.

Before that, I had encoded my variables as an annotation within the
CAS, which worked fine.

Is there any way to make use of the CAS pool AND of the UIMA session  
variables at the same time?

Richard Eckart

Technische Universität Darmstadt
Institute of Linguistics and Literary Studies
Department of English Linguistics

Hochschulstrasse 1
64289 Darmstadt
Germany




Re: Session variable potpurri in CPE

Posted by Adam Lally <al...@alum.rpi.edu>.
On Wed, Jun 4, 2008 at 8:42 PM, Richard Eckart
<ec...@linglit.tu-darmstadt.de> wrote:
>> If you wanted your Collection Reader and CAS Consumer to share data
>> that was *not* related to a particular CAS, you could use UIMA's
>> external resource mechanism to accomplish that.
>
> I was originally searching for that (because it was mentioned at the
> UIMA workshop at the LREC) but I only found out how to use the session
> variables. Do you have a pointer to finding out how to use that mechanism?
>

The best place to start for learning about external resources is this
chapter of the tutorial:
http://incubator.apache.org/uima/downloads/releaseDocs/2.2.2-incubating/docs/html/tutorials_and_users_guides/tutorials_and_users_guides.html#ugr.tug.aae.accessing_external_resource_files

What may at first be confusing is that the chapter talks about sharing
access to a data file between components, whereas you don't have a
file.  But the same mechanism can be used without reading from a file.
You could just use a dummy file, or you could use a
customResourceSpecifier instead of a fileResourceSpecifier, see:
http://incubator.apache.org/uima/downloads/releaseDocs/2.2.2-incubating/docs/html/references/references.html#ugr.ref.xml.component_descriptor.custom_resource_specifiers

> What I want is that wherever I put the Java object, it should be
> automatically removed when a CAS is fully processed, no matter
> whether it was successful or not. The object is bound to a document
> but not representable in the CAS. Before, I had an ID encoded in the
> CAS and passed the object out-of-band via a static hashmap (the
> CollectionReader puts the object there under the ID and the
> CASConsumer reads it again). However, I was not too happy with this
> solution. Would this scenario be solvable with the external
> resources approach?
>

I think you will still need a hashmap keyed on the ID encoded in the
CAS (since multiple CASes may be processed concurrently), but
instead of making it static you can put it in your external resource
object, which is then shared between your components.
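
As a rough illustration of that idea, here is a minimal Java sketch of
the object that could sit behind the external resource. It is not UIMA
API code: the class name PerCasObjectStore and its methods are
hypothetical, and whatever the external resource mechanism requires of
a resource implementation (see the linked documentation) is left out
so that the sketch compiles without UIMA on the classpath.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Hypothetical holder for per-document objects, keyed on the ID that is
 * encoded in the CAS. One instance would be shared by the producing and
 * the consuming component instead of a static map.
 */
final class PerCasObjectStore<T> {
    // Thread-safe because reader and consumer run in separate threads.
    private final ConcurrentMap<String, T> objectsByCasId = new ConcurrentHashMap<>();

    /** Called by the producer (e.g. the CollectionReader) when it fills a CAS. */
    void put(String casId, T value) {
        if (objectsByCasId.putIfAbsent(casId, value) != null) {
            throw new IllegalStateException("ID already in use: " + casId);
        }
    }

    /**
     * Called by the consumer (e.g. the CASConsumer). Removes the entry so
     * the object does not outlive the CAS; returns null if the ID is unknown.
     */
    T take(String casId) {
        return objectsByCasId.remove(casId);
    }

    /** Number of objects still waiting to be picked up. */
    int size() {
        return objectsByCasId.size();
    }
}
```

Note that this only cleans up entries the consumer actually picks up.
Entries for CASes that fail before reaching the consumer would need
separate handling, which the sketch does not attempt.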

-Adam

Re: Session variable potpurri in CPE

Posted by Richard Eckart <ec...@linglit.tu-darmstadt.de>.
Hi again,

> It sounds like the values of these variables pertain to a particular
> CAS (since you don't want the values to change until that CAS has been
> fully processed).  If so, then storing them in the CAS was a fine
> solution.  The CollectionReader and CAS Consumers run in separate
> threads, so the CollectionReader definitely may move on to a new CAS
> before the CAS Consumer processes the first one.

The problem is that I need to share data that is not a primitive type
which I could represent in the CAS (it's a rather complex Java object).

> If you wanted your Collection Reader and CAS Consumer to share data
> that was *not* related to a particular CAS, you could use UIMA's
> external resource mechanism to accomplish that.

I was originally searching for that (because it was mentioned at the
UIMA workshop at the LREC) but I only found out how to use the session
variables. Do you have a pointer to finding out how to use that
mechanism?

What I want is that wherever I put the Java object, it should be
automatically removed when a CAS is fully processed, no matter
whether it was successful or not. The object is bound to a document
but not representable in the CAS. Before, I had an ID encoded in the
CAS and passed the object out-of-band via a static hashmap (the
CollectionReader puts the object there under the ID and the
CASConsumer reads it again). However, I was not too happy with this
solution. Would this scenario be solvable with the external resources
approach?

Richard Eckart


Re: Session variable potpurri in CPE

Posted by Adam Lally <al...@alum.rpi.edu>.
Hi Richard,

On Wed, Jun 4, 2008 at 10:35 AM, Richard Eckart
<ec...@linglit.tu-darmstadt.de> wrote:
> Hello there,
>
> I have recently switched from my own home-cooked version of session
> variables to using the UIMAContext
> session. Actually I am using the UIMAContextAdmin-rootContext for storing my
> session variables as I need to set them in the CollectionReader and
> read them in the CASConsumer.
>

The Session support is actually not intended for sharing information
between components, and it's not really fully implemented anyway.  See
this email:
http://www.mail-archive.com/uima-dev@incubator.apache.org/msg04364.html.

> However, unless I set the casPoolSize to 1 I am having the problem that the
> CollectionReader already overwrites the session variable which the
> CASConsumer has not yet read.
>
> Before I had encoded my variables as an annotation within the CAS which
> worked fine.
>

It sounds like the values of these variables pertain to a particular
CAS (since you don't want the values to change until that CAS has been
fully processed).  If so, then storing them in the CAS was a fine
solution.  The CollectionReader and CAS Consumers run in separate
threads, so the CollectionReader definitely may move on to a new CAS
before the CAS Consumer processes the first one.
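
To make the effect of the pool size concrete, here is a small Java
model of that interleaving. It is a deliberately simplified,
single-threaded sketch and does not use UIMA: the class name
SharedSlotModel, the method observedValues, and the "doc-N" values are
made up for illustration, and the model assumes the worst case in
which the reader always runs ahead as far as the pool allows.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class SharedSlotModel {
    /**
     * Returns, for each consumed document in order, the value the consumer
     * finds in the single shared slot (the stand-in for a session variable).
     */
    static List<String> observedValues(int casPoolSize, int documents) {
        if (casPoolSize < 1) {
            throw new IllegalArgumentException("casPoolSize must be at least 1");
        }
        Deque<String> inFlight = new ArrayDeque<>(); // filled, not yet consumed
        List<String> observed = new ArrayList<>();
        String sharedSlot = null;
        int next = 0;
        while (observed.size() < documents) {
            // Reader: fill CASes until the pool is exhausted, overwriting the slot.
            while (next < documents && inFlight.size() < casPoolSize) {
                String doc = "doc-" + next++;
                sharedSlot = doc;
                inFlight.addLast(doc);
            }
            // Consumer: process the oldest CAS and read the shared slot.
            inFlight.removeFirst();
            observed.add(sharedSlot);
        }
        return observed;
    }
}
```

With a pool size of 1 the reader can never get ahead, so each document
sees its own value. With a pool size of 2 the slot already holds the
value for the following document by the time the consumer reads it.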

If you wanted your Collection Reader and CAS Consumer to share data
that was *not* related to a particular CAS, you could use UIMA's
external resource mechanism to accomplish that.

 -Adam