Posted to dev@uima.apache.org by Peter Klügl <pe...@averbis.com> on 2015/08/25 09:53:15 UTC

JCas cover class instance fields and classloaders (was: SharedViewData FS caching)

Hi,

nope, no PEARs used, just a simple junit test (with uimaFIT). I added
the junit test code below...

Yes, the classloaders are actually not the same...

CASImpl line 4133:
svd.jcasClassLoader: sun.misc.Launcher$AppClassLoader
newClassLoader: org.apache.uima.internal.util.UIMAClassLoader

I'll investigate where they come from...

Best,

Peter

https://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-core/src/test/java/org/apache/uima/ruta/engine/StackedScriptsTest.java

...

    String rules1 = "CW{->T1};";
    String rules2 = "T1 W{->T2} W{->T3};";
    String rules3 = "W{PARTOF({T1,T2,T3})->T4};";

    AnalysisEngine rutaAE1 = createEngine(RutaEngine.class,
            RutaEngine.PARAM_RULES, rules1);
    AnalysisEngine rutaAE2 = createEngine(RutaEngine.class,
            RutaEngine.PARAM_RULES, rules2);
    AnalysisEngine rutaAE3 = createEngine(RutaEngine.class,
            RutaEngine.PARAM_RULES, rules3);

    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < LINES; i++) {
      sb.append(DOC_TEXT);
      sb.append("\n");
    }
    CAS cas = RutaTestUtils.getCAS(sb.toString());

    rutaAE1.process(cas);
    rutaAE2.process(cas);
    rutaAE3.process(cas);

...


On 24.08.2015 at 21:03, Marshall Schor wrote:
> are you using the PEAR class path isolation mechanism?
>
> Or, to put it another way, does the argument to line 382 always return the same
> value?  If not, then that is why you're losing the JCas cached values...
>
> Since you say that is what's happening, how come there's a separate class loader
> being used? The purpose of this was to allow different definitions of JCas
> cover classes to co-exist.  When you crossed a boundary into a PEAR, it would
> switch the class loader, and switch the JCas Cache as well (since the cover
> class implementations could well be different).
>
> -Marshall
>
> On 8/24/2015 12:47 PM, Peter Klügl wrote:
>> My investigations so far:
>>
>> line 382 in PrimitiveAnalysisEngine_impl
>> ((CASImpl)view).switchClassLoaderLockCasCL(this.getResourceManager().getExtensionClassLoader());
>>
>> causes the creation of new JCasHashMaps and new JCasHashMapSubMaps and
>> thus the table field is empty again for each analysis engine -> the JCas
>> cover class instance is created anew with empty fields.
>>
>> Best,
>>
>> Peter
>>
>> On 24.08.2015 at 17:11, Peter Klügl wrote:
>>> The code is of course in the current trunk of ruta-core ...
>>> ... and I do not expect you to run it but any help is appreciated ;-)
>>>
>>> Best,
>>>
>>> Peter
>>>
>>> On 24.08.2015 at 17:08, Peter Klügl wrote:
>>>> Here's my test bed:
>>>>
>>>> run the unit test:
>>>> org.apache.uima.ruta.engine.StackedScriptsTest
>>>>
>>>> There should be some logging output like the following.
>>>> There is a log for the first RutaBasic (begin/end/addr) and for the
>>>> content of one of its fields (beginMap), for the begin of the process
>>>> method and after the basics are initialized (when the information is
>>>> recreated/the map (actually arrays) are filled again).
>>>>
>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(702)
>>>> INFO: begin of process : CW{->T1}; - no RutaBasic yet
>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(707)
>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(720)
>>>> INFO: after initBasics : CW{->T1}; - size of beginMap: 8
>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(707)
>>>> INFO: begin of process - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(720)
>>>> INFO: begin of process : T1 W{->T2} W{->T3}; - size of beginMap: 0
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(707)
>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(720)
>>>> INFO: after initBasics : T1 W{->T2} W{->T3}; - size of beginMap: 10
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(707)
>>>> INFO: begin of process - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(720)
>>>> INFO: begin of process : W{PARTOF({T1,T2,T3})->T4}; - size of beginMap: 0
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(707)
>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(720)
>>>> INFO: after initBasics : W{PARTOF({T1,T2,T3})->T4}; - size of beginMap: 10
>>>>
>>>>
>>>> Best,
>>>>
>>>> Peter
>>>>
>>>>
>>>>
>>>>
>>>> On 24.08.2015 at 16:37, Peter Klügl wrote:
>>>>> That's what I did many years ago (maybe 2008/2009)...
>>>>>
>>>>> I thought that this has worked some time ago, but right now the maps are
>>>>> always empty for the next analysis engine.
>>>>>
>>>>> I will clean up my test bed and will point to a reproducible example.
>>>>>
>>>>> Where do I disable the JCas caching (just in case I did that by accident)?
>>>>>
>>>>> Right now, the information is always recreated in Ruta, but that is what
>>>>> I want to avoid in future, at least for some use cases. I have to
>>>>> remember to still support the remote scenario then.
>>>>>
>>>>> Best,
>>>>>
>>>>> Peter
>>>>>
>>>>>
>>>>> On 24.08.2015 at 16:11, Marshall Schor wrote:
>>>>>> I think you're on the right track.
>>>>>>
>>>>>> You can add additional fields to your generated JCas cover class, such as
>>>>>> something like a Java Hash Map.
>>>>>> Provided your users haven't disabled the JCas caching, this will work.
>>>>>>
>>>>>> Some caveats:
>>>>>>
>>>>>> In the general UIMA design, any particular part of a pipeline is supposed to be
>>>>>> "remotable" - that is, converted to a service call to an external service.  When
>>>>>> this is done, the CAS is "serialized" to the remote.  This serialization won't
>>>>>> serialize any of the additional custom fields you may have added to your
>>>>>> JCasGen'd cover class definition.  One way around this is to have a fall-back
>>>>>> which recreates the info if not present.
>>>>>>
>>>>>> The same "serialization" issue applies if you manually serialize the Cas to some
>>>>>> file.
>>>>>>
>>>>>> Would this approach fit your situation?  If not, please explain a bit more
>>>>>> detail (e.g., why it doesn't fit... :-) ).
>>>>>>
>>>>>> -Marshall
>>>>>>
>>>>>>
>>>>>> On 8/24/2015 9:53 AM, Peter Klügl wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> first of all, thanks Marshall :-)
>>>>>>>
>>>>>>> On 24.08.2015 at 15:08, Marshall Schor wrote:
>>>>>>>> I assume you're talking about this.svd.useFSCache :-) This has been disabled for as
>>>>>>>> long as I can recall. 
>>>>>>>> [...]
>>>>>>>>
>>>>>>>> There is currently no option to cache FSs for just some types, other than to
>>>>>>>> create a JCas cover class for those types and run with JCas enabled.
>>>>>>> Let me rephrase it: Is it a realistic option for us to introduce
>>>>>>> something like that?
>>>>>>>
>>>>>>> What do you mean by the second part of the sentence? I am currently
>>>>>>> looking for ways to share information for the same CAS between analysis
>>>>>>> engines. Should it be possible to use normal Java fields of JCas cover
>>>>>>> classes for this purpose? My annotations are recreated all the time and
>>>>>>> thus I am losing the field values ...
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Peter
>>>>>>>
>>>>>>>
>>>>>>>> -Marshall
>>>>>>>>
>>>>>>>> On 8/24/2015 7:42 AM, Peter Klügl wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> what is the current status on FS caching in svd? The comment says that
>>>>>>>>> it is not maintained. If activated, an NPE is thrown because the fsArray
>>>>>>>>> was never initialized. This could be solved by initializing it with a
>>>>>>>>> non-empty array. (I could create an issue and fix it, if wanted).
>>>>>>>>>
>>>>>>>>> In my current (extremely restricted) test bed, the memory consumption
>>>>>>>>> and runtime drop both by about 30% with fs caching.
>>>>>>>>>
>>>>>>>>> I do not have an overview yet: Could there be problems with other parts
>>>>>>>>> of UIMA if we use the caching?
>>>>>>>>>
>>>>>>>>> with a big Ruta hat on:
>>>>>>>>> Is it an option for us to activate the caching on the fly for a specific
>>>>>>>>> type only?
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Peter
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>


Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Richard Eckart de Castilho <re...@apache.org>.
On 25.08.2015, at 11:47, Richard Eckart de Castilho <re...@apache.org> wrote:

> Using createEngine in a row is imho a really bad habit. Users should
> use createEngineDescription as long as possible and either leave it
> to a pipeline (SimplePipeline or CpePipeline) to instantiate the engines,

To elaborate a bit on this: using createEngine() means that UIMA/uimaFIT
are not taking care of the component lifecycles, in particular 
collectionProcessComplete() and destroy() are not called unless done
explicitly. 

-- Richard

Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Peter Klügl <pe...@averbis.com>.
+1 from me :-)

On 25.08.2015 at 14:42, Richard Eckart de Castilho wrote:
> On 25.08.2015, at 12:00, Peter Klügl <pe...@averbis.com> wrote:
>
>> The problems are gone... I should learn how to use uimaFIT correctly. Oh
>> dear, so much trouble for nothing...
>>
>> On the other hand, this will not solve my problems when I enforce the
>> usage of Ruta as a java library. However, I think I can take care of the
>> upcoming problems on the Ruta side of the code, e.g., with the factory
>> you mentioned.
> @Marshall: would it really be (so) bad to change UIMA to use a Thread
> classloader if one is defined?
>
> It might also help if the UIMAClassloader defined an equals method
> that could be checked before dumping the JCas cache. The resource
> managers are very eager at creating new instances of UIMAClassloader
> but these could well be effectively the same (with the same
> parent, same classpath, etc.).
>
> E.g. although uimaFIT is creating new resource managers, it actually
> gives them always the same classloader and the same classpath. But
> JCas cannot see this because of the UIMAClassloader that wraps them,
> thus it unnecessarily flushes its caches.
>
> -- Richard


Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Marshall Schor <ms...@schor.com>.
Re: the phase 2 and the UIMAClassLoader.

When this is created, the code creating it gets to specify the parent class loader.

I'm wondering if that's the right time / place to stick in additional defaulting
along the lines you suggest. I think, for instance, that PEAR classloader
isolation would work better (it relies on the parent being the outer pipeline's
classloader, whatever that may be, and that is set up when the PEAR wrapper is
initialized, if I recall correctly).

One aspect of using the Context Class Loader by default is that any code can
change this, per Thread (assuming it has security permissions).   (Even
retrieving the ContextClassLoader requires security permissions.)
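
To make the "any code can change this, per Thread" point concrete, here is a minimal sketch in plain Java (no UIMA classes involved). It swaps the context class loader for the duration of one call and restores it afterwards. The names ContextClassLoaderScope and callWith are made up for this illustration; nothing like this is claimed to exist in UIMA.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.function.Supplier;

/** Hypothetical helper for illustration only; not part of UIMA. */
final class ContextClassLoaderScope {

  private ContextClassLoaderScope() {}

  /** Runs the action with the given context class loader, then restores the previous one. */
  static <T> T callWith(ClassLoader loader, Supplier<T> action) {
    Thread thread = Thread.currentThread();
    ClassLoader previous = thread.getContextClassLoader();
    thread.setContextClassLoader(loader);
    try {
      return action.get();
    } finally {
      // Restore even if the action throws, so later code on this thread is unaffected.
      thread.setContextClassLoader(previous);
    }
  }

  public static void main(String[] args) {
    ClassLoader before = Thread.currentThread().getContextClassLoader();
    // An empty wrapper loader stands in for "some other code's" loader.
    ClassLoader temporary = new URLClassLoader(new URL[0], before);

    ClassLoader seenInside =
        callWith(temporary, () -> Thread.currentThread().getContextClassLoader());

    System.out.println(seenInside == temporary); // true
    System.out.println(Thread.currentThread().getContextClassLoader() == before); // true
  }
}
```

Code that relies on the context class loader by default therefore sees whatever the last caller on the same thread left there, unless every caller restores it as above.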

Also, I'm not sure it's a good idea to use the Application class loader if the
context class loader is not set, in preference to the UIMAClassLoader's parent;
this could break some class path isolation a user might have intended to set up
(assuming they were not using the ContextClassLoader conventions).

-Marshall


On 9/1/2015 9:25 AM, Richard Eckart de Castilho wrote:
> On a first throw, I'd approach this with a two phase refactoring
>
> Phase 1: factor out classloading
>
> I searched for "classloader" in the code and found quite a few occurrences.
> Not all of them appear to be relevant though (e.g. getters, setters, CL usage
> in CAS). The relevant places appear to be where an appropriate classloader is
> chosen. So we have several places like this
>
>       // get UIMA extension ClassLoader if available
>       final ClassLoader cl = getUimaContextAdmin().getResourceManager().getExtensionClassLoader();
>
>       if (cl != null) {
>         // use UIMA extension ClassLoader to load the class
>         annotatorClass = cl.loadClass(annotatorClassName);
>       } else {
>         // use application ClassLoader to load the class
>         annotatorClass = Class.forName(annotatorClassName);
>       } 
>
> So first, I'd propose creating a utility method that basically covers the
> lines above. Optionally, such a method could be added directly to the
> resource manager (e.g. RM.getClassloader()).
>
> Of course I only did a very superficial scan right now just to get the idea rolling.
>
>
> Phase 2: add support for context classloader
>
> Hopefully after phase 1 we'd have reduced the number of places that handle the choosing of a classloader.
> At that point, I'd extend the "else" case above to use the context classloader if available,
> otherwise the application classloader. 
>
> Likewise, the code of the UimaClassloader would at that point be changed to take the context classloader
> into account. I guess the appropriate place to inject it would be if a class cannot be found in the
> extension classpath and before forwarding to super.loadClass().
>
>
> Is there anything obvious that I missed (very likely)? ;)
>
> Cheers,
>
> -- Richard
>
> On 01.09.2015, at 15:11, Marshall Schor <ms...@schor.com> wrote:
>
>> I'm not against using the context class loader appropriately; can someone draft
>> a proposed change to support this?  I suspect the devil is in the details (if
>> you know that turn of phrase :-) ).
>>
>> -Marshall
>>
>> On 8/28/2015 2:56 PM, Richard Eckart de Castilho wrote:
>>> On 28.08.2015, at 20:09, Marshall Schor <ms...@schor.com> wrote:
>>>
>>>> On 8/25/2015 8:42 AM, Richard Eckart de Castilho wrote:
>>>>> On 25.08.2015, at 12:00, Peter Klügl <pe...@averbis.com> wrote:
>>>>>
>>>>>> The problems are gone... I should learn how to use uimaFIT correctly. Oh
>>>>>> dear, so much trouble for nothing...
>>>>>>
>>>>>> On the other hand, this will not solve my problems when I enforce the
>>>>>> usage of Ruta as a java library. However, I think I can take care of the
>>>>>> upcoming problems on the Ruta side of the code, e.g., with the factory
>>>>>> you mentioned.
>>>>> @Marshall: would it really be (so) bad to change UIMA to use a Thread
>>>>> classloader if one is defined?
>>>> I vaguely recall some previous discussion about it; see uima.markmail.org and
>>>> search on thread classloader.
>>>> A concern I have is that the Thread classloader use is sort of by convention,
>>>> perhaps depending on the framework you might be embedding into (I'm not an
>>>> expert here, so please feel free to correct!); because of this, I think UIMA
>>>> intentionally takes the approach of having this be "outside" the UIMA
>>>> framework.  I think at some point we added an api to allow an embedding to set
>>>> the class loader to use, and it could of course use the thread local one.
>>> I'm not sure what your reservations against taking the context classloader
>>> into account are. I've been googling around a bit and that confirms what I
>>> was already thinking: it is a commonly used mechanism all over the place.
>>> It allows a calling function to provide the called function with access 
>>> to its classloading context even though caller and callee come from different
>>> classloaders. 
>>>
>>> Here is one article on this:
>>>
>>> http://www.z2-environment.net/blog/2012/07/for-techies-protecting-java-in-a-modular-world-context-classloaders/
>>>
>>> In fact, I do believe that using the context classloader could even be the
>>> better approach for UIMA instead of juggling around with PEAR classloaders etc.
>>>
>>> Since I don't know what reservations you have exactly, I also don't know how
>>> to refute them. But maybe a glance at the article linked above might give
>>> you some more context on the context classloaders ;)
>>>
>>> Cheers,
>>>
>>> -- Richard
>


Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Richard Eckart de Castilho <re...@apache.org>.
On a first throw, I'd approach this with a two phase refactoring

Phase 1: factor out classloading

I searched for "classloader" in the code and found quite a few occurrences.
Not all of them appear to be relevant though (e.g. getters, setters, CL usage
in CAS). The relevant places appear to be where an appropriate classloader is
chosen. So we have several places like this

      // get UIMA extension ClassLoader if available
      final ClassLoader cl = getUimaContextAdmin().getResourceManager().getExtensionClassLoader();

      if (cl != null) {
        // use UIMA extension ClassLoader to load the class
        annotatorClass = cl.loadClass(annotatorClassName);
      } else {
        // use application ClassLoader to load the class
        annotatorClass = Class.forName(annotatorClassName);
      } 

So first, I'd propose creating a utility method that basically covers the
lines above. Optionally, such a method could be added directly to the
resource manager (e.g. RM.getClassloader()).
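
As a rough sketch of what such a Phase 1 utility could look like, in plain Java without any UIMA types: the class name ClassLoading and the method signature are assumptions for illustration, and the extension classloader is simply passed in instead of being fetched from a resource manager.

```java
/** Hypothetical utility for illustration only; not an existing UIMA class. */
final class ClassLoading {

  private ClassLoading() {}

  /**
   * Loads a class with the extension classloader if one is given,
   * otherwise falls back to Class.forName, as in the snippet above.
   */
  static Class<?> loadClass(String className, ClassLoader extensionClassLoader)
      throws ClassNotFoundException {
    if (extensionClassLoader != null) {
      // use the extension ClassLoader to load the class
      return extensionClassLoader.loadClass(className);
    }
    // no extension ClassLoader: use the loader that defined this utility class
    return Class.forName(className);
  }

  public static void main(String[] args) throws ClassNotFoundException {
    System.out.println(loadClass("java.lang.String", null) == String.class); // true
    System.out.println(
        loadClass("java.util.ArrayList", ClassLoader.getSystemClassLoader()).getName());
    // java.util.ArrayList
  }
}
```

One detail to watch when centralising this: the one-argument Class.forName resolves against the classloader of the calling class, so after the move it uses the loader of the utility class rather than the loader of each original call site. That only matters if those call sites live in different classloaders.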

Of course I only did a very superficial scan right now just to get the idea rolling.


Phase 2: add support for context classloader

Hopefully after phase 1 we'd have reduced the number of places that handle the choosing of a classloader.
At that point, I'd extend the "else" case above to use the context classloader if available,
otherwise the application classloader. 

Likewise, the code of the UimaClassloader would at that point be changed to take the context classloader
into account. I guess the appropriate place to inject it would be if a class cannot be found in the
extension classpath and before forwarding to super.loadClass().
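
The first half of Phase 2 (the extended "else" case) could be sketched like this, again in plain Java. ContextAwareClassLoading and chooseLoader are hypothetical names, and the change inside the UimaClassloader itself is not covered here.

```java
/** Hypothetical utility for illustration only; not an existing UIMA class. */
final class ContextAwareClassLoading {

  private ContextAwareClassLoading() {}

  /** Preference order: extension loader, then context loader, then this class's own loader. */
  static ClassLoader chooseLoader(ClassLoader extensionClassLoader) {
    if (extensionClassLoader != null) {
      return extensionClassLoader;
    }
    ClassLoader context = Thread.currentThread().getContextClassLoader();
    if (context != null) {
      return context;
    }
    // Stand-in for "the application classloader" in the proposal above.
    return ContextAwareClassLoading.class.getClassLoader();
  }

  /** Loads and initializes the class with whichever loader chooseLoader picks. */
  static Class<?> loadClass(String className, ClassLoader extensionClassLoader)
      throws ClassNotFoundException {
    return Class.forName(className, true, chooseLoader(extensionClassLoader));
  }

  public static void main(String[] args) throws ClassNotFoundException {
    System.out.println(loadClass("java.lang.String", null) == String.class); // true
  }
}
```

Keeping the choice of loader in one method would also give a single place to discuss the ordering, e.g. whether the context classloader should really win over the classloader that loaded the framework.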


Is there anything obvious that I missed (very likely)? ;)

Cheers,

-- Richard

On 01.09.2015, at 15:11, Marshall Schor <ms...@schor.com> wrote:

> I'm not against using the context class loader appropriately; can someone draft
> a proposed change to support this?  I suspect the devil is in the details (if
> you know that turn of phrase :-) ).
> 
> -Marshall
> 
> On 8/28/2015 2:56 PM, Richard Eckart de Castilho wrote:
>> On 28.08.2015, at 20:09, Marshall Schor <ms...@schor.com> wrote:
>> 
>>> On 8/25/2015 8:42 AM, Richard Eckart de Castilho wrote:
>>>> On 25.08.2015, at 12:00, Peter Klügl <pe...@averbis.com> wrote:
>>>> 
>>>>> The problems are gone... I should learn how to use uimaFIT correctly. Oh
>>>>> dear, so much trouble for nothing...
>>>>> 
>>>>> On the other hand, this will not solve my problems when I enforce the
>>>>> usage of Ruta as a java library. However, I think I can take care of the
>>>>> upcoming problems on the Ruta side of the code, e.g., with the factory
>>>>> you mentioned.
>>>> @Marshall: would it really be (so) bad to change UIMA to use a Thread
>>>> classloader if one is defined?
>>> I vaguely recall some previous discussion about it; see uima.markmail.org and
>>> search on thread classloader.
>>> A concern I have is that the Thread classloader use is sort of by convention,
>>> perhaps depending on the framework you might be embedding into (I'm not an
>>> expert here, so please feel free to correct!); because of this, I think UIMA
>>> intentionally takes the approach of having this be "outside" the UIMA
>>> framework.  I think at some point we added an api to allow an embedding to set
>>> the class loader to use, and it could of course use the thread local one.
>> I'm not sure what your reservations against taking the context classloader
>> into account are. I've been googling around a bit and that confirms what I
>> was already thinking: it is a commonly used mechanism all over the place.
>> It allows a calling function to provide the called function with access 
>> to its classloading context even though caller and callee come from different
>> classloaders. 
>> 
>> Here is one article on this:
>> 
>> http://www.z2-environment.net/blog/2012/07/for-techies-protecting-java-in-a-modular-world-context-classloaders/
>> 
>> In fact, I do believe that using the context classloader could even be the
>> better approach for UIMA instead of juggling around with PEAR classloaders etc.
>> 
>> Since I don't know what reservations you have exactly, I also don't know how
>> to refute them. But maybe a glance at the article linked above might give
>> you some more context on the context classloaders ;)
>> 
>> Cheers,
>> 
>> -- Richard
> 


Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Marshall Schor <ms...@schor.com>.
I'm not against using the context class loader appropriately; can someone draft
a proposed change to support this?  I suspect the devil is in the details (if
you know that turn of phrase :-) ).

-Marshall

On 8/28/2015 2:56 PM, Richard Eckart de Castilho wrote:
> On 28.08.2015, at 20:09, Marshall Schor <ms...@schor.com> wrote:
>
>> On 8/25/2015 8:42 AM, Richard Eckart de Castilho wrote:
>>> On 25.08.2015, at 12:00, Peter Klügl <pe...@averbis.com> wrote:
>>>
>>>> The problems are gone... I should learn how to use uimaFIT correctly. Oh
>>>> dear, so much trouble for nothing...
>>>>
>>>> On the other hand, this will not solve my problems when I enforce the
>>>> usage of Ruta as a java library. However, I think I can take care of the
>>>> upcoming problems on the Ruta side of the code, e.g., with the factory
>>>> you mentioned.
>>> @Marshall: would it really be (so) bad to change UIMA to use a Thread
>>> classloader if one is defined?
>> I vaguely recall some previous discussion about it; see uima.markmail.org and
>> search on thread classloader.
>> A concern I have is that the Thread classloader use is sort of by convention,
>> perhaps depending on the framework you might be embedding into (I'm not an
>> expert here, so please feel free to correct!); because of this, I think UIMA
>> intentionally takes the approach of having this be "outside" the UIMA
>> framework.  I think at some point we added an api to allow an embedding to set
>> the class loader to use, and it could of course use the thread local one.
> I'm not sure what your reservations against taking the context classloader
> into account are. I've been googling around a bit and that confirms what I
> was already thinking: it is a commonly used mechanism all over the place.
> It allows a calling function to provide the called function with access 
> to its classloading context even though caller and callee come from different
> classloaders. 
>
> Here is one article on this:
>
> http://www.z2-environment.net/blog/2012/07/for-techies-protecting-java-in-a-modular-world-context-classloaders/
>
> In fact, I do believe that using the context classloader could even be the
> better approach for UIMA instead of juggling around with PEAR classloaders etc.
>
> Since I don't know what reservations you have exactly, I also don't know how
> to refute them. But maybe a glance at the article linked above might give
> you some more context on the context classloaders ;)
>
> Cheers,
>
> -- Richard


Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Richard Eckart de Castilho <re...@apache.org>.
On 28.08.2015, at 20:09, Marshall Schor <ms...@schor.com> wrote:

> On 8/25/2015 8:42 AM, Richard Eckart de Castilho wrote:
>> On 25.08.2015, at 12:00, Peter Klügl <pe...@averbis.com> wrote:
>> 
>>> The problems are gone... I should learn how to use uimaFIT correctly. Oh
>>> dear, so much trouble for nothing...
>>> 
>>> On the other hand, this will not solve my problems when I enforce the
>>> usage of Ruta as a java library. However, I think I can take care of the
>>> upcoming problems on the Ruta side of the code, e.g., with the factory
>>> you mentioned.
>> @Marshall: would it really be (so) bad to change UIMA to use a Thread
>> classloader if one is defined?
> 
> I vaguely recall some previous discussion about it; see uima.markmail.org and
> search on thread classloader.
> A concern I have is that the Thread classloader use is sort of by convention,
> perhaps depending on the framework you might be embedding into (I'm not an
> expert here, so please feel free to correct!); because of this, I think UIMA
> intentionally takes the approach of having this be "outside" the UIMA
> framework.  I think at some point we added an api to allow an embedding to set
> the class loader to use, and it could of course use the thread local one.

I'm not sure what your reservations against taking the context classloader
into account are. I've been googling around a bit and that confirms what I
was already thinking: it is a commonly used mechanism all over the place.
It allows a calling function to provide the called function with access 
to its classloading context even though caller and callee come from different
classloaders. 

Here is one article on this:

http://www.z2-environment.net/blog/2012/07/for-techies-protecting-java-in-a-modular-world-context-classloaders/

In fact, I do believe that using the context classloader could even be the
better approach for UIMA instead of juggling around with PEAR classloaders etc.

Since I don't know what reservations you have exactly, I also don't know how
to refute them. But maybe a glance at the article linked above might give
you some more context on the context classloaders ;)

Cheers,

-- Richard

Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Marshall Schor <ms...@schor.com>.

On 8/25/2015 8:42 AM, Richard Eckart de Castilho wrote:
> On 25.08.2015, at 12:00, Peter Klügl <pe...@averbis.com> wrote:
>
>> The problems are gone... I should learn how to use uimaFIT correctly. Oh
>> dear, so much trouble for nothing...
>>
>> On the other hand, this will not solve my problems when I enforce the
>> usage of Ruta as a java library. However, I think I can take care of the
>> upcoming problems on the Ruta side of the code, e.g., with the factory
>> you mentioned.
> @Marshall: would it really be (so) bad to change UIMA to use a Thread
> classloader if one is defined?

I vaguely recall some previous discussion about it; see uima.markmail.org and
search on thread classloader.
A concern I have is that the Thread classloader use is sort of by convention,
perhaps depending on the framework you might be embedding into (I'm not an
expert here, so please feel free to correct!); because of this, I think UIMA
intentionally takes the approach of having this be "outside" the UIMA
framework.  I think at some point we added an api to allow an embedding to set
the class loader to use, and it could of course use the thread local one.

>
> It might also help if the UIMAClassloader defined an equals method
> that could be checked before dumping the JCas cache. 

+1, sounds like a good idea, if not too hard or slow.

-Marshall

> The resource
> managers are very eager at creating new instances of UIMAClassloader
> but these could well be effectively the same (with the same
> parent, same classpath, etc.).
>
> E.g. although uimaFIT is creating new resource managers, it actually
> gives them always the same classloader and the same classpath. But
> JCas cannot see this because of the UIMAClassloader that wraps them,
> thus it unnecessarily flushes its caches.
>
> -- Richard


Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Richard Eckart de Castilho <re...@apache.org>.
On 25.08.2015, at 12:00, Peter Klügl <pe...@averbis.com> wrote:

> The problems are gone... I should learn how to use uimaFIT correctly. Oh
> dear, so much trouble for nothing...
> 
> On the other hand, this will not solve my problems when I enforce the
> usage of Ruta as a java library. However, I think I can take care of the
> upcoming problems on the Ruta side of the code, e.g., with the factory
> you mentioned.

@Marshall: would it really be (so) bad to change UIMA to use a Thread
classloader if one is defined?

It might also help if the UIMAClassloader defined an equals method
that could be checked before dumping the JCas cache. The resource
managers are very eager at creating new instances of UIMAClassloader
but these could well be effectively the same (with the same
parent, same classpath, etc.).

E.g. although uimaFIT is creating new resource managers, it actually
gives them always the same classloader and the same classpath. But
JCas cannot see this because of the UIMAClassloader that wraps them,
thus it unnecessarily flushes its caches.
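
A minimal sketch of the equals idea, assuming "effectively the same" means the identical parent instance plus the same classpath entries in the same order. ComparableWrapperLoader is a made-up stand-in built on URLClassLoader; it is not the real UIMAClassLoader, and this is only one possible definition of equality.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical stand-in for a wrapper loader; not the real UIMAClassLoader. */
final class ComparableWrapperLoader extends URLClassLoader {

  ComparableWrapperLoader(URL[] classpath, ClassLoader parent) {
    super(classpath, parent);
  }

  /** Classpath as strings, because URL.equals may trigger host name resolution. */
  private List<String> classpathKey() {
    return Arrays.stream(getURLs()).map(URL::toExternalForm).collect(Collectors.toList());
  }

  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof ComparableWrapperLoader)) {
      return false;
    }
    ComparableWrapperLoader that = (ComparableWrapperLoader) other;
    // Same parent instance and same classpath entries in the same order.
    return getParent() == that.getParent() && classpathKey().equals(that.classpathKey());
  }

  @Override
  public int hashCode() {
    return 31 * System.identityHashCode(getParent()) + classpathKey().hashCode();
  }

  public static void main(String[] args) {
    ClassLoader parent = ClassLoader.getSystemClassLoader();
    ComparableWrapperLoader a = new ComparableWrapperLoader(new URL[0], parent);
    ComparableWrapperLoader b = new ComparableWrapperLoader(new URL[0], parent);

    System.out.println(a == b);      // false: two wrapper instances
    System.out.println(a.equals(b)); // true: same parent, same (empty) classpath
  }
}
```

A caveat on the sketch: two loaders that are equal in this sense would still each define their own copy of any class found on their own classpath, so keeping the JCas cache across them is only safe when the cover classes actually come from the shared parent.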

-- Richard

Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Peter Klügl <pe...@averbis.com>.
The problems are gone... I should learn how to use uimaFIT correctly. Oh
dear, so much trouble for nothing...

On the other hand, this will not solve my problems when I enforce the
usage of Ruta as a java library. However, I think I can take care of the
upcoming problems on the Ruta side of the code, e.g., with the factory
you mentioned.

Thanks Richard :-)

Best,

Peter

On 25.08.2015 at 11:47, Richard Eckart de Castilho wrote:
> On 25.08.2015, at 11:32, Peter Klügl <pe...@averbis.com> wrote:
>
>>> How about you rewrite your test using createEngineDescription() and either SimplePipeline or you create an AAE from your individual engines, instantiate them and call process() on it once?
>> I added an alternative without uimaFIT where I use xml descriptors.
>> Here, the JCas cover class instances remain.
>>
>> It's not that the tests fail. The uimaFIT test also passes,
>> since Ruta regenerates the information anyway right now. This is rather
>> a requirement for future development of Ruta.
>>
>> I could restrict the usage of Ruta with a policy like "If you use Ruta
>> with uimaFIT, then you have to create the CAS also with uimaFIT or with
>> the resource manager of the uimaFIT analysis engine..., or it will get
>> really slow when you use it as a java library or you use several
>> separate Ruta analysis engines in one pipeline."
>>
>> If there are other options, I really want to avoid that. I rather prefer
>> to reduce restrictions like getting rid of the type priorities.
> Using createEngine in a row is imho a really bad habit. Users should
> use createEngineDescription as long as possible and either leave it
> to a pipeline (SimplePipeline or CpePipeline) to instantiate the engines,
> or do only a single createEngine that instantiates a whole aggregate and
> call process once.
>
> So if you use the "recommended" way with
>
> engine = createEngine(
>     createEngineDescription(
>       createEngineDescription(AE1.class,...),
>       createEngineDescription(AE2.class,...),
>       createEngineDescription(AE3.class,...)));
>
> engine.process(cas);
>
> do you then still have the problem?
>
> Cheers,
>
> -- Richard


Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Richard Eckart de Castilho <re...@apache.org>.
On 25.08.2015, at 11:32, Peter Klügl <pe...@averbis.com> wrote:

>> How about you rewrite your test using createEngineDescription() and either SimplePipeline or you create an AAE from your individual engines, instantiate them and call process() on it once?
> 
> I added an alternative without uimaFIT where I use xml descriptors.
> Here, the JCas cover class instances remain.
> 
> It's not that the tests fail. The uimaFIT test also passes,
> since Ruta regenerates the information anyway right now. This is rather
> a requirement for future development of Ruta.
> 
> I could restrict the usage of Ruta with a policy like "If you use Ruta
> with uimaFIT, then you have to create the CAS also with uimaFIT or with
> the resource manager of the uimaFIT analysis engine..., or it will get
> really slow when you use it as a java library or you use several
> separate Ruta analysis engines in one pipeline."
> 
> If there are other options, I really want to avoid that. I rather prefer
> to reduce restrictions like getting rid of the type priorities.

Calling createEngine several times in a row is IMHO a really bad habit.
Users should use createEngineDescription for as long as possible and either
leave it to a pipeline (SimplePipeline or CpePipeline) to instantiate the
engines, or make only a single createEngine call that instantiates a whole
aggregate and then call process once.

So if you use the "recommended" way with

engine = createEngine(
    createEngineDescription(
      createEngineDescription(AE1.class,...),
      createEngineDescription(AE2.class,...),
      createEngineDescription(AE3.class,...)));

engine.process(cas);

do you then still have the problem?

Cheers,

-- Richard

Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Peter Klügl <pe...@averbis.com>.
Hi,

On 25.08.2015, at 10:51, Richard Eckart de Castilho wrote:
> Hm... well, if UIMA by default respected the thread classloader as all kinds of other frameworks do, then I guess this would be a non-issue: uimaFIT wouldn't have to do this ResourceManagerCreator thing.
>
> So if the classloader is local to a single JCas instance, why does it get switched while a pipeline is being executed? 

Each resource manager in each analysis engine has its own classloader
instance. When the analysis engine is called, the classloader of the
resource manager is compared to the classloader of the
JCas/SharedViewData. The instance comparison then fails even if the new
one is just a wrapper.
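
To illustrate why an instance comparison fails for a mere wrapper, here is
a minimal, self-contained sketch in plain Java (no UIMA involved).
WrapperLoaderDemo and its methods are hypothetical names, and a
URLClassLoader with an empty classpath only stands in for the
UIMAClassLoader that each resource manager creates.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical demo class, not part of UIMA or uimaFIT.
class WrapperLoaderDemo {

  // Creates a fresh loader with no classpath entries of its own, so every
  // lookup is delegated to the given parent.
  static ClassLoader newWrapper(ClassLoader parent) {
    return new URLClassLoader(new URL[0], parent);
  }

  // Returns true if both loaders resolve the name to the same Class object.
  static boolean resolveSameClass(ClassLoader a, ClassLoader b, String name) {
    try {
      return a.loadClass(name) == b.loadClass(name);
    } catch (ClassNotFoundException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    ClassLoader parent = ClassLoader.getSystemClassLoader();
    ClassLoader w1 = newWrapper(parent);
    ClassLoader w2 = newWrapper(parent);
    // Two wrappers around the same parent are still two distinct instances.
    System.out.println("same instance: " + (w1 == w2));
    // Yet both hand out the very same classes via parent delegation.
    System.out.println("same classes: "
        + resolveSameClass(w1, w2, "java.lang.String"));
  }
}
```

In this sketch the two wrappers resolve identical classes, yet a check
with == or != sees them as different loaders.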

> How about rewriting your test using createEngineDescription() and either SimplePipeline, or creating an AAE from your individual engines, instantiating it, and calling process() on it once?

I added an alternative without uimaFIT where I use xml descriptors.
Here, the JCas cover class instances remain.

The point is not that the tests fail. The uimaFIT test also passes,
since Ruta currently regenerates the information anyway. Rather, this is
a requirement for future development of Ruta.
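
The "regenerate if missing" behaviour can be pictured with the following
rough sketch. It is plain Java under stated assumptions, not Ruta code:
CoverWithDerivedState, letterCounts() and rebuilds() are hypothetical
names, and the letter count is just a placeholder for whatever derived
information a cover class instance might keep in an ordinary Java field.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a cover class with an extra Java-only field.
class CoverWithDerivedState {
  private final String coveredText;
  // Derived state that lives only in this Java object; it is gone whenever
  // the object is recreated, so null means "not computed yet".
  private Map<Character, Integer> letterCounts;
  private int rebuilds;

  CoverWithDerivedState(String coveredText) {
    this.coveredText = coveredText;
  }

  // Fallback: recompute the derived state on first access if it is missing.
  Map<Character, Integer> letterCounts() {
    if (letterCounts == null) {
      Map<Character, Integer> counts = new HashMap<>();
      for (char c : coveredText.toCharArray()) {
        counts.merge(c, 1, Integer::sum);
      }
      letterCounts = counts;
      rebuilds++;
    }
    return letterCounts;
  }

  // Number of times the derived state had to be rebuilt for this instance.
  int rebuilds() {
    return rebuilds;
  }
}
```

As long as the same instance is reused, the derived state is computed
once. Every freshly created instance pays the recomputation cost again,
which is the overhead that a surviving cover class instance would avoid.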

I could restrict the usage of Ruta with a policy like "If you use Ruta
with uimaFIT, then you have to create the CAS also with uimaFIT or with
the resource manager of the uimaFIT analysis engine..., or it will get
really slow when you use it as a java library or you use several
separate Ruta analysis engines in one pipeline."

If there are other options, I really want to avoid that. I would rather
reduce restrictions, such as getting rid of the type priorities.

Best,

Peter

> Cheers,
>
> -- Richard
>
> On 25.08.2015, at 10:31, Peter Klügl <pe...@averbis.com> wrote:
>
>> Hi,
>>
>> tested it without uimaFIT, works perfectly fine.
>>
>> On 25.08.2015, at 10:18, Richard Eckart de Castilho wrote:
>>> You can implement an alternative ResourceManagerCreator that does not set a new classloader and pass it to uimaFIT using ResourceManagerFactory.setResourceManagerCreator() before you run your application.
>> I try to avoid such things because all users of Ruta would need to
>> remember to do that in their applications. It would just cause problems.
>>
>>> But why is the classloader that uimaFIT sets not adequate?
>> It's adequate, but there is line 4133 in CASImpl:
>>
>> public void switchClassLoader(ClassLoader newClassLoader) {
>>    if (null == newClassLoader) { // is null if no cl set
>>      return;
>>    }
>>    if (newClassLoader != this.svd.jcasClassLoader) { // <- line 4133
>>      // System.out.println("Switching to new class loader");
>>      this.svd.jcasClassLoader = newClassLoader;
>>      if (null != this.jcas) {
>>        ((JCasImpl) this.jcas).switchClassLoader(newClassLoader);
>>      }
>>    }
>>  }
>>
>> which causes the drop of all cached JCas cover class instances if the
>> class loader instance changes.
>>
>> Best,
>>
>> Peter
>>
>>
>>
>>> For a bit of history on why this classloader is set, see:
>>>
>>>  https://issues.apache.org/jira/browse/UIMA-3692
>>>
>>> -- Richard
>>>
>>> On 25.08.2015, at 10:13, Peter Klügl <pe...@averbis.com> wrote:
>>>
>>>> Here's what happens:
>>>>
>>>> uimaFIT creates a new resource manager and sets the extension classpath,
>>>> which causes the creation of a new UIMAClassLoader.
>>>>
>>>> ResourceManager_impl.setExtensionClassPath(ClassLoader, String, boolean) line: 229
>>>> ResourceManagerFactory$DefaultResourceManagerCreator.newResourceManager() line: 62
>>>> ResourceManagerFactory.newResourceManager() line: 42
>>>> AnalysisEngineFactory.createEngine(AnalysisEngineDescription, Object...) line: 205
>>>> AnalysisEngineFactory.createEngine(Class<AnalysisComponent>, Object...) line: 281
>>>> StackedScriptsTest.test() line: 43
>>>>
>>>> I am not yet sure how we can/should solve this problem...
>>>>
>>>> Best,
>>>>
>>>> Peter


Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Marshall Schor <ms...@schor.com>.
Re: why a classloader local to a single JCas instance might be switched, this is
only to support PEAR classpath isolation, where the PEAR might have different
JCas implementations. 

For UIMA v3 I'm thinking of not supporting this, because the Feature Structures
in v3 are the JCas classes; switching to other implementations would mean the
PEAR couldn't "see" the FeatureStructures.  I'm thinking of replacing this with
an approach to attempt to merge any JCas classes in the PEAR classpath, with
those outside of the PEAR, and have just one definition for the whole pipeline.
There's also a thought to allow "local-to-the PEAR" versions of things - this
means no data comes in or goes out of the PEAR, for those particular classes.

-Marshall

On 8/25/2015 4:51 AM, Richard Eckart de Castilho wrote:
> Hm... well, if UIMA by default respected the thread classloader as all kinds of other frameworks do, then I guess this would be a non-issue: uimaFIT wouldn't have to do this ResourceManagerCreator thing.
>
> So if the classloader is local to a single JCas instance, why does it get switched while a pipeline is being executed?
>
> How about rewriting your test using createEngineDescription() and either SimplePipeline, or creating an AAE from your individual engines, instantiating it, and calling process() on it once?
>
> Cheers,
>
> -- Richard
>
> On 25.08.2015, at 10:31, Peter Klügl <pe...@averbis.com> wrote:
>
>> Hi,
>>
>> tested it without uimaFIT, works perfectly fine.
>>
>> On 25.08.2015, at 10:18, Richard Eckart de Castilho wrote:
>>> You can implement an alternative ResourceManagerCreator that does not set a new classloader and pass it to uimaFIT using ResourceManagerFactory.setResourceManagerCreator() before you run your application.
>> I try to avoid such things because all users of Ruta would need to
>> remember to do that in their applications. It would just cause problems.
>>
>>> But why is the classloader that uimaFIT sets not adequate?
>> It's adequate, but there is line 4133 in CASImpl:
>>
>> public void switchClassLoader(ClassLoader newClassLoader) {
>>    if (null == newClassLoader) { // is null if no cl set
>>      return;
>>    }
>>    if (newClassLoader != this.svd.jcasClassLoader) { // <- line 4133
>>      // System.out.println("Switching to new class loader");
>>      this.svd.jcasClassLoader = newClassLoader;
>>      if (null != this.jcas) {
>>        ((JCasImpl) this.jcas).switchClassLoader(newClassLoader);
>>      }
>>    }
>>  }
>>
>> which causes the drop of all cached JCas cover class instances if the
>> class loader instance changes.
>>
>> Best,
>>
>> Peter
>>
>>
>>
>>> For a bit of history on why this classloader is set, see:
>>>
>>>  https://issues.apache.org/jira/browse/UIMA-3692
>>>
>>> -- Richard
>>>
>>> On 25.08.2015, at 10:13, Peter Klügl <pe...@averbis.com> wrote:
>>>
>>>> Here's what happens:
>>>>
>>>> uimaFIT creates a new resource manager and sets the extension classpath,
>>>> which causes the creation of a new UIMAClassLoader.
>>>>
>>>> ResourceManager_impl.setExtensionClassPath(ClassLoader, String, boolean) line: 229
>>>> ResourceManagerFactory$DefaultResourceManagerCreator.newResourceManager() line: 62
>>>> ResourceManagerFactory.newResourceManager() line: 42
>>>> AnalysisEngineFactory.createEngine(AnalysisEngineDescription, Object...) line: 205
>>>> AnalysisEngineFactory.createEngine(Class<AnalysisComponent>, Object...) line: 281
>>>> StackedScriptsTest.test() line: 43
>>>>
>>>> I am not yet sure how we can/should solve this problem...
>>>>
>>>> Best,
>>>>
>>>> Peter
>


Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Richard Eckart de Castilho <re...@apache.org>.
Hm... well, if UIMA by default respected the thread classloader as all kinds of other frameworks do, then I guess this would be a non-issue: uimaFIT wouldn't have to do this ResourceManagerCreator thing.

So if the classloader is local to a single JCas instance, why does it get switched while a pipeline is being executed?

How about rewriting your test using createEngineDescription() and either SimplePipeline, or creating an AAE from your individual engines, instantiating it, and calling process() on it once?

Cheers,

-- Richard

On 25.08.2015, at 10:31, Peter Klügl <pe...@averbis.com> wrote:

> Hi,
> 
> tested it without uimaFIT, works perfectly fine.
> 
> On 25.08.2015, at 10:18, Richard Eckart de Castilho wrote:
>> You can implement an alternative ResourceManagerCreator that does not set a new classloader and pass it to uimaFIT using ResourceManagerFactory.setResourceManagerCreator() before you run your application.
> 
> I try to avoid such things because all users of Ruta would need to
> remember to do that in their applications. It would just cause problems.
> 
>> But why is the classloader that uimaFIT sets not adequate?
> 
> It's adequate, but there is line 4133 in CASImpl:
> 
> public void switchClassLoader(ClassLoader newClassLoader) {
>    if (null == newClassLoader) { // is null if no cl set
>      return;
>    }
>    if (newClassLoader != this.svd.jcasClassLoader) { // <- line 4133
>      // System.out.println("Switching to new class loader");
>      this.svd.jcasClassLoader = newClassLoader;
>      if (null != this.jcas) {
>        ((JCasImpl) this.jcas).switchClassLoader(newClassLoader);
>      }
>    }
>  }
> 
> which causes the drop of all cached JCas cover class instances if the
> class loader instance changes.
> 
> Best,
> 
> Peter
> 
> 
> 
>> For a bit of history on why this classloader is set, see:
>> 
>>  https://issues.apache.org/jira/browse/UIMA-3692
>> 
>> -- Richard
>> 
>> On 25.08.2015, at 10:13, Peter Klügl <pe...@averbis.com> wrote:
>> 
>>> Here's what happens:
>>> 
>>> uimaFIT creates a new resource manager and sets the extension classpath,
>>> which causes the creation of a new UIMAClassLoader.
>>> 
>>> ResourceManager_impl.setExtensionClassPath(ClassLoader, String, boolean) line: 229
>>> ResourceManagerFactory$DefaultResourceManagerCreator.newResourceManager() line: 62
>>> ResourceManagerFactory.newResourceManager() line: 42
>>> AnalysisEngineFactory.createEngine(AnalysisEngineDescription, Object...) line: 205
>>> AnalysisEngineFactory.createEngine(Class<AnalysisComponent>, Object...) line: 281
>>> StackedScriptsTest.test() line: 43
>>> 
>>> I am not yet sure how we can/should solve this problem...
>>> 
>>> Best,
>>> 
>>> Peter
> 


Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Peter Klügl <pe...@averbis.com>.
Hi,

tested it without uimaFIT, works perfectly fine.

On 25.08.2015, at 10:18, Richard Eckart de Castilho wrote:
> You can implement an alternative ResourceManagerCreator that does not set a new classloader and pass it to uimaFIT using ResourceManagerFactory.setResourceManagerCreator() before you run your application.

I try to avoid such things because all users of Ruta would need to
remember to do that in their applications. It would just cause problems.

> But why is the classloader that uimaFIT sets not adequate?

It's adequate, but there is line 4133 in CASImpl:

public void switchClassLoader(ClassLoader newClassLoader) {
    if (null == newClassLoader) { // is null if no cl set
      return;
    }
    if (newClassLoader != this.svd.jcasClassLoader) { // <- line 4133
      // System.out.println("Switching to new class loader");
      this.svd.jcasClassLoader = newClassLoader;
      if (null != this.jcas) {
        ((JCasImpl) this.jcas).switchClassLoader(newClassLoader);
      }
    }
  }

which causes the drop of all cached JCas cover class instances if the
class loader instance changes.
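
The effect of that identity check can be reproduced with a small
stand-alone sketch. This is a simplified model in plain Java, not the
CASImpl code: MiniCas, getCover() and cachedCount() are hypothetical
names, and clearing a HashMap is only an assumption standing in for
whatever the real JCas cache does when the loader is switched.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a CAS with a cover-instance cache.
class MiniCas {
  private ClassLoader jcasClassLoader;
  // Maps an FS address to its cached cover object.
  private final Map<Integer, Object> coverCache = new HashMap<>();

  MiniCas(ClassLoader initial) {
    this.jcasClassLoader = initial;
  }

  // Returns the cached cover object for the address, creating it on demand.
  Object getCover(int addr) {
    return coverCache.computeIfAbsent(addr, a -> new Object());
  }

  // Mirrors the shape of the quoted method: the comparison is by identity,
  // so any new loader instance invalidates the cache.
  void switchClassLoader(ClassLoader newClassLoader) {
    if (newClassLoader == null) {
      return;
    }
    if (newClassLoader != jcasClassLoader) {
      jcasClassLoader = newClassLoader;
      coverCache.clear();
    }
  }

  int cachedCount() {
    return coverCache.size();
  }
}
```

Passing the same loader instance again keeps the cached objects, while
passing a new wrapper around the same parent empties the cache, so the
next getCover() call for the same address returns a fresh object with
empty fields.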

Best,

Peter



> For a bit of history on why this classloader is set, see:
>
>   https://issues.apache.org/jira/browse/UIMA-3692
>
> -- Richard
>
> On 25.08.2015, at 10:13, Peter Klügl <pe...@averbis.com> wrote:
>
>> Here's what happens:
>>
>> uimaFIT creates a new resource manager and sets the extension classpath,
>> which causes the creation of a new UIMAClassLoader.
>>
>> ResourceManager_impl.setExtensionClassPath(ClassLoader, String, boolean) line: 229
>> ResourceManagerFactory$DefaultResourceManagerCreator.newResourceManager() line: 62
>> ResourceManagerFactory.newResourceManager() line: 42
>> AnalysisEngineFactory.createEngine(AnalysisEngineDescription, Object...) line: 205
>> AnalysisEngineFactory.createEngine(Class<AnalysisComponent>, Object...) line: 281
>> StackedScriptsTest.test() line: 43
>>
>> I am not yet sure how we can/should solve this problem...
>>
>> Best,
>>
>> Peter


Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Richard Eckart de Castilho <re...@apache.org>.
You can implement an alternative ResourceManagerCreator that does not set a new classloader and pass it to uimaFIT using ResourceManagerFactory.setResourceManagerCreator() before you run your application.

But why is the classloader that uimaFIT sets not adequate?

For a bit of history on why this classloader is set, see:

  https://issues.apache.org/jira/browse/UIMA-3692

-- Richard

On 25.08.2015, at 10:13, Peter Klügl <pe...@averbis.com> wrote:

> Here's what happens:
> 
> uimaFIT creates a new resource manager and sets the extension classpath,
> which causes the creation of a new UIMAClassLoader.
> 
> ResourceManager_impl.setExtensionClassPath(ClassLoader, String, boolean) line: 229
> ResourceManagerFactory$DefaultResourceManagerCreator.newResourceManager() line: 62
> ResourceManagerFactory.newResourceManager() line: 42
> AnalysisEngineFactory.createEngine(AnalysisEngineDescription, Object...) line: 205
> AnalysisEngineFactory.createEngine(Class<AnalysisComponent>, Object...) line: 281
> StackedScriptsTest.test() line: 43
> 
> I am not yet sure how we can/should solve this problem...
> 
> Best,
> 
> Peter


Re: JCas cover class instance fields, classloaders and uimaFIT

Posted by Peter Klügl <pe...@averbis.com>.
Here's what happens:

uimaFIT creates a new resource manager and sets the extension classpath,
which causes the creation of a new UIMAClassLoader.

ResourceManager_impl.setExtensionClassPath(ClassLoader, String, boolean) line: 229
ResourceManagerFactory$DefaultResourceManagerCreator.newResourceManager() line: 62
ResourceManagerFactory.newResourceManager() line: 42
AnalysisEngineFactory.createEngine(AnalysisEngineDescription, Object...) line: 205
AnalysisEngineFactory.createEngine(Class<AnalysisComponent>, Object...) line: 281
StackedScriptsTest.test() line: 43

I am not yet sure how we can/should solve this problem...

Best,

Peter



On 25.08.2015, at 09:53, Peter Klügl wrote:
> Hi,
>
> nope, no PEARs used, just a simple junit test (with uimaFIT). I added
> the junit test code below...
>
> Yes, the classloaders are actually not the same...
>
> CASImpl line 4133:
> svd.jcasClassLoader: sun.misc.Launcher$AppClassLoader
> newClassLoader:org.apache.uima.internal.util.UIMAClassLoader
>
> I'll investigate where they come from...
>
> Best,
>
> Peter
>
> https://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-core/src/test/java/org/apache/uima/ruta/engine/StackedScriptsTest.java
>
> ...
>
>     String rules1 = "CW{->T1};";
>     String rules2 = "T1 W{->T2} W{->T3};";
>     String rules3 = "W{PARTOF({T1,T2,T3})->T4};";
>
>     AnalysisEngine rutaAE1 = createEngine(RutaEngine.class,
> RutaEngine.PARAM_RULES, rules1);
>     AnalysisEngine rutaAE2 = createEngine(RutaEngine.class,
> RutaEngine.PARAM_RULES, rules2);
>     AnalysisEngine rutaAE3 = createEngine(RutaEngine.class,
> RutaEngine.PARAM_RULES, rules3);
>
>     StringBuilder sb = new StringBuilder();
>     for (int i = 0; i < LINES; i++) {
>       sb.append(DOC_TEXT);
>       sb.append("\n");
>     }
>     CAS cas = RutaTestUtils.getCAS(sb.toString());
>
>     rutaAE1.process(cas);
>     rutaAE2.process(cas);
>     rutaAE3.process(cas);
>
> ...
>
>
> On 24.08.2015, at 21:03, Marshall Schor wrote:
>> are you using the PEAR class path isolation mechanism?
>>
>> Or, to put it another way, does the argument to line 382 always return the same
>> value?  If not, then that is why you're losing the JCas cached values...
>>
>> Since you say that is what's happening, how come there's a separate class loader
>> being used? The purpose of this was to allow different definitions of JCas
>> cover classes to co-exist.  When you crossed a boundary into a PEAR, it would
>> switch the class loader, and switch the JCas Cache as well (since the cover
>> class implementations could well be different).
>>
>> -Marshall
>>
>> On 8/24/2015 12:47 PM, Peter Klügl wrote:
>>> My investigations so far:
>>>
>>> line 382 in PrimitiveAnalysisEngine_impl
>>> ((CASImpl)view).switchClassLoaderLockCasCL(this.getResourceManager().getExtensionClassLoader());
>>>
>>> causes the creation of new JCasHashMaps and new JCasHashMapSubMaps and
>>> thus the table field is empty again for each analysis engine -> the JCas
>>> cover class instance is created anew with empty fields.
>>>
>>> Best,
>>>
>>> Peter
>>>
>>> On 24.08.2015, at 17:11, Peter Klügl wrote:
>>>> The code is of course in the current trunk of ruta-core ...
>>>> ... and I do not expect you to run it but any help is appreciated ;-)
>>>>
>>>> Best,
>>>>
>>>> Peter
>>>>
>>>> On 24.08.2015, at 17:08, Peter Klügl wrote:
>>>>> Here's my test bed:
>>>>>
>>>>> run the unit test:
>>>>> org.apache.uima.ruta.engine.StackedScriptsTest
>>>>>
>>>>> There should be some logging output like the following.
>>>>> There is a log for the first RutaBasic (begin/end/addr) and for the
>>>>> content of one of its fields (beginMap), for the begin of the process
>>>>> method and after the basics are initialized (when the information is
>>>>> recreated/the map (actually arrays) are filled again).
>>>>>
>>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(702)
>>>>> INFO: begin of process : CW{->T1}; - no RutaBasic yet
>>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(707)
>>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(720)
>>>>> INFO: after initBasics : CW{->T1}; - size of beginMap: 8
>>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(707)
>>>>> INFO: begin of process - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(720)
>>>>> INFO: begin of process : T1 W{->T2} W{->T3}; - size of beginMap: 0
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(707)
>>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(720)
>>>>> INFO: after initBasics : T1 W{->T2} W{->T3}; - size of beginMap: 10
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(707)
>>>>> INFO: begin of process - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(720)
>>>>> INFO: begin of process : W{PARTOF({T1,T2,T3})->T4}; - size of beginMap: 0
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(707)
>>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(720)
>>>>> INFO: after initBasics : W{PARTOF({T1,T2,T3})->T4}; - size of beginMap: 10
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>> Peter
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 24.08.2015, at 16:37, Peter Klügl wrote:
>>>>>> That's what I did many years ago (maybe 2008/2009)...
>>>>>>
>>>>>> I thought that this has worked some time ago, but right now the maps are
>>>>>> always empty for the next analysis engine.
>>>>>>
>>>>>> I will clean up my test bed and will point to a reproducible example.
>>>>>>
>>>>>> Where do I disable the JCas caching (just in case I did that by accident)?
>>>>>>
>>>>>> Right now, the information is always recreated in Ruta, but that is what
>>>>>> I want to avoid in future, at least for some use cases. I have to
>>>>>> remember to still support the remote scenario then.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Peter
>>>>>>
>>>>>>
>>>>>> On 24.08.2015, at 16:11, Marshall Schor wrote:
>>>>>>> I think you're on the right track.
>>>>>>>
>>>>>>> You can add additional fields to your generated JCas cover class, such as
>>>>>>> something like a Java Hash Map.
>>>>>>> Provided your users haven't disabled the JCas caching, this will work.
>>>>>>>
>>>>>>> Some caveats:
>>>>>>>
>>>>>>> In the general UIMA design, any particular part of a pipeline is supposed to be
>>>>>>> "remotable" - that is, converted to a service call to an external service.  When
>>>>>>> this is done, the CAS is "serialized" to the remote.  This serialization won't
>>>>>>> serialize any of the additional custom fields you may have added to your
>>>>>>> JCasGen'd cover class definition.  One way around this is to have a fall-back
>>>>>>> which recreates the info if not present.
>>>>>>>
>>>>>>> The same "serialization" issue applies if you manually serialize the Cas to some
>>>>>>> file.
>>>>>>>
>>>>>>> Would this approach fit your situation?  If not, please explain a bit more
>>>>>>> detail (e.g., why it doesn't fit... :-) ).
>>>>>>>
>>>>>>> -Marshall
>>>>>>>
>>>>>>>
>>>>>>> On 8/24/2015 9:53 AM, Peter Klügl wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> first of all, thanks Marshall :-)
>>>>>>>>
>>>>>>>> On 24.08.2015, at 15:08, Marshall Schor wrote:
>>>>>>>>> I assume you're talking about this.svd.useFSCache :-) This has been disabled for as
>>>>>>>>> long as I can recall. 
>>>>>>>>> [...]
>>>>>>>>>
>>>>>>>>> There is currently no option to cache FSs for just some types, other than to
>>>>>>>>> create a JCas cover class for those types and run with JCas enabled.
>>>>>>>> Let me rephrase it: Is it a realistic option for us to introduce
>>>>>>>> something like that?
>>>>>>>>
>>>>>>>> What do you mean by the second part of the sentence? I am currently
>>>>>>>> looking for ways to share information for the same CAS between analysis
>>>>>>>> engines. Should it be possible to use normal java fields of JCas cover
>>>>>>>> classes for this purpose? My annotations are recreated all the time and
>>>>>>>> thus I am losing the field values ...
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Peter
>>>>>>>>
>>>>>>>>
>>>>>>>>> -Marshall
>>>>>>>>>
>>>>>>>>> On 8/24/2015 7:42 AM, Peter Klügl wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> what is the current status on FS caching in svd? The comment says that
>>>>>>>>>> it is not maintained. If activated, an NPE is thrown because the fsArray
>>>>>>>>>> was never initialized. This could be solved by initializing it with a
>>>>>>>>>> non-empty array. (I could create an issue and fix it, if wanted).
>>>>>>>>>>
>>>>>>>>>> In my current (extremely restricted) test bed, the memory consumption
>>>>>>>>>> and runtime drop both by about 30% with fs caching.
>>>>>>>>>>
>>>>>>>>>> I do not have an overview yet: Could there be problems with other parts
>>>>>>>>>> of UIMA if we use the caching?
>>>>>>>>>>
>>>>>>>>>> with a big Ruta hat on:
>>>>>>>>>> Is it an option for us to activate the caching on the fly for a specific
>>>>>>>>>> type only?
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>>
>>>>>>>>>> Peter
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>