You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@groovy.apache.org by Alain Stalder <as...@span.ch> on 2016/05/13 00:22:02 UTC

Improve Groovy class loading performance and memory management

Hi everybody

I had considered to submit a major feature request for Groovy with the 
text further below, because it appears to me that there might possibly 
be considerable room for improvement, but since I am not developing 
Groovy myself I thought that maybe I would just present it here and if 
people think such a feature request (or a somewhat similar one) would 
still make sense, it could still be created (by me or someone else)...

---
Improve Groovy class loading performance and memory management

Consider the maybe most typical use case of Groovy as a dynamic 
language: Within a Java VM, Groovy scripts are dynamically compiled, be 
it explicitly by using a GroovyShell or a GroovyClassLoader, or more 
implictly e.g. with a ConfigSlurper.

Qualitatively this often has the following result in the Java VM: 
Metaspace resp. PermGen, and Heap in parallel, just grow until a 
configured limit is reached (and note that there is none by default for 
Metaspace in Java 8 and later), often only then is it garbage collected. 
With Java classes, at least with simple ones, this looks often 
different, those appear to be garbage collected much more quickly.

Another qualitative difference is that loading a Groovy class and 
instantiating it seems typically to be considerably slower than 
instantiating a Java class with similar functionality, even quite 
drastically so, more than one would expect even considering the need to 
create metadata for dynamic function calls etc.

At least that has been my experience over the past few years.

Seen from very far ("management", "marketing", "sales"), this not 
optimal for a dynamic language, getting these two things right - memory 
management and class loading performace -  is arguably "core business".

Obviously Groovy classes are more complex than most Java classes and 
there is existing code and limitations of Java VMs and probably more 
issues...

But in the medium term, I think and feel that it would be good to find 
out what really impedes better performance and a more regular garbage 
collection. Possibly the problems are really hard and complex, but some 
might also be easy, and maybe some would become quite easy after some 
simple changes in the Java VM, but before such changes can be requested, 
it has to be analyzed what these could or should be.

Attached is a simple command line Java test program, ClassGCTester.java, 
that can be used to make some very basic measurements of class loading 
memory behavior and performance. The source code and a readme with two 
examples (one for Java and one for Groovy) is also at 
https://github.com/jexler/classgc, and I think it might help to 
illustrate current limitations and maybe also a little bit initially 
with analysis.
---

Note the tool at https://github.com/jexler/classgc and the readme with 
two illustrative examples - one Java class, one Groovy class - there...

Alain



Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
Ah, no, not with ClassValue, sorry for the spam...

On 16.05.16 13:37, Alain Stalder wrote:
> Ah, I think now I am getting it... In order to cache ClassInfo you 
> need a key that identifies the class...
>
> On 16.05.16 12:46, Alain Stalder wrote:
>> My argument is still the same: ClassInfo (or other assiociated 
>> metadata) only makes sense if you have your hands on a class (or an 
>> instance of it) to apply it to. The one who wants to do something 
>> with the class/instance has it and in principle can pass it down to 
>> ClassInfo in order to extract whatever is needed. If there is no 
>> "client" with a class/instance, there is no need to create ClassInfo 
>> (or similar). And if the class is garbage collected, automatically 
>> ClassInfo cannot be accessed with such queries any more, and then 
>> also the JVM bug with ClassValue would no longer affect Groovy, 
>> ClassValue could be used again by default. 
>
>


Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
Ah, I think now I am getting it... In order to cache ClassInfo you need 
a key that identifies the class...

On 16.05.16 12:46, Alain Stalder wrote:
> My argument is still the same: ClassInfo (or other assiociated 
> metadata) only makes sense if you have your hands on a class (or an 
> instance of it) to apply it to. The one who wants to do something with 
> the class/instance has it and in principle can pass it down to 
> ClassInfo in order to extract whatever is needed. If there is no 
> "client" with a class/instance, there is no need to create ClassInfo 
> (or similar). And if the class is garbage collected, automatically 
> ClassInfo cannot be accessed with such queries any more, and then also 
> the JVM bug with ClassValue would no longer affect Groovy, ClassValue 
> could be used again by default. 


Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
Maybe that explains the general idea better although a bit abstractly:

classValue.get(clazz).getOrCreateClassInfo(clazz).getOrCreateMetaClass(clazz).doWhatever(clazz-if-needed).getSomeMoreIfNeeded(clazz-if-needed)

Don't store the class in any objects of that chain, get the associated 
object only from ClassValue. Maybe that is too limiting in practice, 
maybe it just requires to think a bit differently, I don't know.

Alain

On 16.05.16 14:01, Jochen Theodorou wrote:
>> Which brings my mind back to my question regarding whether it is "good
>> architecture" to have a reference to the class in ClassInfo (or any
>> other metadata associated with a class) - again, I mean fundamentally,
>> independently of whether this is an option for a Groovy 2.4.7 or even
>> anything before a Groovy 3, because I fear it would likely require to
>> change several Groovy APIs and internals.
>
> ok, let�s assume the ClassInfo does not reference the class, then as 
> soon as you have a MetaClass, you have a reference to the class again. 
> If not there, then in the method accessors...
>


Re: Improve Groovy class loading performance and memory management

Posted by Jochen Theodorou <bl...@gmx.org>.
On 16.05.2016 12:46, Alain Stalder wrote:
> In order to get a better understanding, I made two configurable changes
> in ClassInfo, in a branch from the GROOVY_2_4_6 tag (ClassInfo is still
> practically the same in the GROOVY_2_4_X branch):
>
> - -Dgctest.classreftype=(hard|soft|weak|phantom), where hard=as today,
> soft=SoftReference
> - -Dgctest.cacheclassvalue=(true|false), if true and using ClassValue,
> then do not cache it
>
> See here:
> https://github.com/jexler/groovy/compare/GROOVY_2_4_6...jexler:f92c2866653208ad68db5580b5bf9febc347fe1d
>
>
> Compiled Groovy JAR:
> https://www.jexler.net/groovy-2.4.6-gctest.jar
[...]
> Then I ran a full matrix of tests:
>
> ------------------------------------------------------------------------
>   same loader | use class value | cache class value | hard | soft | weak
> ------------------------------------------------------------------------
>   YES         | YES             | YES               | FAIL | FAIL | FAIL
>   YES         | YES             | NO                | FAIL | FAIL | FAIL
>   YES         | NO              | --                |  OK  |  OK* | OK*
> ------------------------------------------------------------------------
>   NO          | YES             | YES               |  OK  |  OK* | OK*
>   NO          | YES             | NO                |  OK  |  OK* | OK*
>   NO          | NO              | --                | FAIL |  OK* | OK*
> ------------------------------------------------------------------------
>
> - "same loader" <=> java [opts] -XX:MaxMetaspaceSize=64m -Xmx512m -cp .
> ClassGCTester -cp groovy-2.4.6-gctest.jar:filling/ -parent null -classes
> GroovyFilling
> - not "same loader" <=> java [opts] -XX:MaxMetaspaceSize=64m -Xmx512m
> -cp .:groovy-2.4.6-gctest.jar ClassGCTester -cp filling/ -parent tester
> -classes GroovyFilling
> - "use class value" <=> -Dgroovy.use.classvalue=<true|false>
> - "cache class value" <=> -Dgctest.cacheclassvalue=<true|false>
> - "hard"|"soft"|"weak" <=> -Dgctest.classreftype=<hard|soft|weak>
>
> * Garbage collection in all cases still only when the limit on Metaspace
> or Heap is reached.
>
> So:
> - Caching ClassValue or not made no difference.
> - Using weak oder soft references did not help when using ClassValue.
> - When not using ClassValue, using weak or soft references helped. :)

Even if hard references had been used everywhere, it is only for a 
single iteration. It means unless data leaks into a more global 
structure, it must be collectable. So the non-ClassValue version working 
in this scenario is no sign of correctness for the memory awareness of 
the used structures at all.


> Actually the latter is also reflected (as I noticed in retrospect) by
> the pull request by John Wagenleitner for "GROOVY-7683 - Memory leak
> when using Groovy as JSR-223 scripting language":
> https://github.com/apache/groovy/pull/219/files
>
> There a WeakReference is used.

yes, I think that is something we should do.

> Which brings my mind back to my question regarding whether it is "good
> architecture" to have a reference to the class in ClassInfo (or any
> other metadata associated with a class) - again, I mean fundamentally,
> independently of whether this is an option for a Groovy 2.4.7 or even
> anything before a Groovy 3, because I fear it would likely require to
> change several Groovy APIs and internals.

ok, let�s assume the ClassInfo does not reference the class, then as 
soon as you have a MetaClass, you have a reference to the class again. 
If not there, then in the method accessors...

bye Jochen


Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
In order to get a better understanding, I made two configurable changes 
in ClassInfo, in a branch from the GROOVY_2_4_6 tag (ClassInfo is still 
practically the same in the GROOVY_2_4_X branch):

- -Dgctest.classreftype=(hard|soft|weak|phantom), where hard=as today, 
soft=SoftReference
- -Dgctest.cacheclassvalue=(true|false), if true and using ClassValue, 
then do not cache it

See here:
https://github.com/jexler/groovy/compare/GROOVY_2_4_6...jexler:f92c2866653208ad68db5580b5bf9febc347fe1d

Compiled Groovy JAR:
https://www.jexler.net/groovy-2.4.6-gctest.jar

First thing I learned was that you cannot get the value of a 
PhantomReference, it always returns null, by design. From its Javadoc: 
"In order to ensure that a reclaimable object remains so, the referent 
of a phantom reference may not be retrieved: The get method of a phantom 
reference always returns null."

(By the way, this very probably means that the already existing 
PhantomReference myThread in ClassInfo makes no sense.)

Then I ran a full matrix of tests:

------------------------------------------------------------------------
  same loader | use class value | cache class value | hard | soft | weak
------------------------------------------------------------------------
  YES         | YES             | YES               | FAIL | FAIL | FAIL
  YES         | YES             | NO                | FAIL | FAIL | FAIL
  YES         | NO              | --                |  OK  |  OK* | OK*
------------------------------------------------------------------------
  NO          | YES             | YES               |  OK  |  OK* | OK*
  NO          | YES             | NO                |  OK  |  OK* | OK*
  NO          | NO              | --                | FAIL |  OK* | OK*
------------------------------------------------------------------------

- "same loader" <=> java [opts] -XX:MaxMetaspaceSize=64m -Xmx512m -cp . 
ClassGCTester -cp groovy-2.4.6-gctest.jar:filling/ -parent null -classes 
GroovyFilling
- not "same loader" <=> java [opts] -XX:MaxMetaspaceSize=64m -Xmx512m 
-cp .:groovy-2.4.6-gctest.jar ClassGCTester -cp filling/ -parent tester 
-classes GroovyFilling
- "use class value" <=> -Dgroovy.use.classvalue=<true|false>
- "cache class value" <=> -Dgctest.cacheclassvalue=<true|false>
- "hard"|"soft"|"weak" <=> -Dgctest.classreftype=<hard|soft|weak>

* Garbage collection in all cases still only when the limit on Metaspace 
or Heap is reached.

So:
- Caching ClassValue or not made no difference.
- Using weak oder soft references did not help when using ClassValue.
- When not using ClassValue, using weak or soft references helped. :)

Actually the latter is also reflected (as I noticed in retrospect) by 
the pull request by John Wagenleitner for "GROOVY-7683 - Memory leak 
when using Groovy as JSR-223 scripting language": 
https://github.com/apache/groovy/pull/219/files

There a WeakReference is used.

Which brings my mind back to my question regarding whether it is "good 
architecture" to have a reference to the class in ClassInfo (or any 
other metadata associated with a class) - again, I mean fundamentally, 
independently of whether this is an option for a Groovy 2.4.7 or even 
anything before a Groovy 3, because I fear it would likely require to 
change several Groovy APIs and internals.

If using now a WeakReference or SoftReferencefor the class reference in 
ClassInfo instead of a hard reference, you now have to handle the case 
where the class is already null because it has been garbage collected. 
(Actually this is in principle more likely with a WeakReference than 
with a SoftReference, so I would rather tend to favor SoftReference 
because class GC so far only kicks in when a memory limit is reached 
anyway, but likely it makes no difference in practice exactly for the 
same reason. Actually, this may even save the situation, maybe in 
practice you never get the Reference to return null because classes and 
ClassInfo are only garbage collected together when the memory limit is 
reached in a Java VM that does nothing else then, but I am not sure...)

My argument is still the same: ClassInfo (or other assiociated metadata) 
only makes sense if you have your hands on a class (or an instance of 
it) to apply it to. The one who wants to do something with the 
class/instance has it and in principle can pass it down to ClassInfo in 
order to extract whatever is needed. If there is no "client" with a 
class/instance, there is no need to create ClassInfo (or similar). And 
if the class is garbage collected, automatically ClassInfo cannot be 
accessed with such queries any more, and then also the JVM bug with 
ClassValue would no longer affect Groovy, ClassValue could be used again 
by default.

But I don't want to make too much of this.

Using a WeakReference or SoftReference for the class reference in 
ClassInfo would already be step forward, at least no better realistic 
ideas from my side at the moment...

Alain

On 15.05.16 12:37, Jochen Theodorou wrote:
> On 15.05.2016 10:39, Alain Stalder wrote:
>> Thanks, that clarifies a lot to me, especially SoftReference.
>>
>> So with Groovy it is only realistic to have GC of classes (and attached
>> ClassInfo) kick in once a limit on Metaspace/PermGen (or Heap) is
>> reached - fine with me, no point to try to "outrun the bear"... :)
>
> well... I do think the ClassValue version should not have this 
> behaviour. But for this I think we would have to ensure not to keep 
> any references to the ClassValue anywhere in a global strucutre. Not 
> even as a WeakReference... PhantomReference would probably be ok... 
> but I find the usages for PhantomReferences quite rare...and not 
> fitting here I guess
>
>> A general question (current implementation and most likely APIs to keep
>> aside): Why does ClassInfo need a reference to the class? To me the use
>> case would be that you have an Groovy object or a Groovy class and want
>> to do something with it (call a static or instance method, for example),
>> so you only need to find ClassInfo from the class and then maybe pass
>> the class temporarily just for doing things, but don't need it a
>> reference back from ClassInfo.
>
> ClassInfo represents a cached reflective information of a Class, plus 
> some more internal stuff. To create that structure you need the Class. 
> And if you do not want to do it eager, you need to keep a reference... 
> at least till after init. Of course that does not have to be a 
> SoftReference.
>
> [...]
>> This allows, for example, two produce two of the known
>> "OutOfMemoryError: Metaspace|PermGen" issues with Groovy 2.4.6, as 
>> follows.
> [...]
>
> good job
>
> bye Jochen
>
>


Re: Improve Groovy class loading performance and memory management

Posted by Jochen Theodorou <bl...@gmx.org>.
On 15.05.2016 10:39, Alain Stalder wrote:
> Thanks, that clarifies a lot to me, especially SoftReference.
>
> So with Groovy it is only realistic to have GC of classes (and attached
> ClassInfo) kick in once a limit on Metaspace/PermGen (or Heap) is
> reached - fine with me, no point to try to "outrun the bear"... :)

well... I do think the ClassValue version should not have this 
behaviour. But for this I think we would have to ensure not to keep any 
references to the ClassValue anywhere in a global strucutre. Not even as 
a WeakReference... PhantomReference would probably be ok... but I find 
the usages for PhantomReferences quite rare...and not fitting here I guess

> A general question (current implementation and most likely APIs to keep
> aside): Why does ClassInfo need a reference to the class? To me the use
> case would be that you have an Groovy object or a Groovy class and want
> to do something with it (call a static or instance method, for example),
> so you only need to find ClassInfo from the class and then maybe pass
> the class temporarily just for doing things, but don't need it a
> reference back from ClassInfo.

ClassInfo represents a cached reflective information of a Class, plus 
some more internal stuff. To create that structure you need the Class. 
And if you do not want to do it eager, you need to keep a reference... 
at least till after init. Of course that does not have to be a 
SoftReference.

[...]
> This allows, for example, two produce two of the known
> "OutOfMemoryError: Metaspace|PermGen" issues with Groovy 2.4.6, as follows.
[...]

good job

bye Jochen


Re: Improve Groovy class loading performance and memory management

Posted by John Wagenleitner <jo...@gmail.com>.
On Tue, May 17, 2016 at 12:48 AM, Alain Stalder <as...@span.ch> wrote:

>
> On 17.05.16 09:04, Alain Stalder wrote:
>
> PS: Note that Introspector.flushFromCaches(clazz) was experimentally
> really not necessary in this case, but maybe has to do with the simple
> nature of the test script ("42") and only calling a (no-args)
> constructor... In any case very promising...
>
>
> Ah, that's simply because it is already called in
> InvokerHelper.removeClass():
>
>     public static void removeClass(Class clazz) {
>         metaRegistry.removeMetaClass(clazz);
>         ClassInfo.remove(clazz);
>         Introspector.flushFromCaches(clazz);
>     }
>
> Experimentally, for the test with ClassGCTester, the first call
> (metaRegistry.removeMetaClass(clazz)) was not necessary to have garbage
> collection before Metaspace reaches the maximum, the other two were.
>
>

I believe the removeMetaClass call is only there in case the metaclass
changed.  Any added methods cause the weak metaclass to be replaced by a
strong metaclass (ExpandoMetaClass) and that has a strong ref to the class
requiring removing the metaclass in order to allow GC to work.

Re: Improve Groovy class loading performance and memory management

Posted by John Wagenleitner <jo...@gmail.com>.
On Tue, May 17, 2016 at 12:23 PM, Alain Stalder <as...@span.ch> wrote:

> [...]
>
> As I said, so far rather a hack, probably better to reimplement the
> GroovyClassValuePreJava7 class instead? Performance under concurrent use?
> Are other caches that apparently exist in ClassInfo also no issue under
> different circumstances? (And at some point: does it work across VMs and
> OSes etc.?)
>


I think performance in general and not just under concurrent use is
extremely important for ClassInfo.  My understanding is that the static
cache it holds of ClassInfo's is queried on every method call (at least in
dynamic groovy).  That is probably why the current hash-based caches are
used to save from the O(n) retrieval from a globalClassSet which is
implemented as a linked list.

Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
jwagenleitner wrote:
 > I think performance in general and not just under concurrent use is 
extremely important for ClassInfo.  My understanding is that the static 
cache it holds of ClassInfo's is queried on every method call (at least 
in dynamic groovy).  That is probably why the current hash-based caches 
are used to save from the O(n) retrieval from a globalClassSet which is 
implemented as a linked list.

Too bad, I see no way at the moment how to get "on-the-fly" garbage 
collection of ClassInfo and fast concurrent access to it.

What I quickly tried  was to change the type of object stored with 
globalClassValue from ClassInfo to other types, but all failed in some 
ways (I did not test the full matrix of (Java 7 ClassValue or not) and 
(same class loader as Groovy for compiled Groovy script or not), but 
gave up once something failed):
- WeakReference<ClassInfo>: Garbage collected even if the class is still 
referenced.
- SoftReference<ClassInfo>: OutOfMemoryError but took longer than usual.
- WeakHashMap<Class,ClassInfo>: i.e. a WeakHashMap with just a single 
entry (so no synchronization needed) - that was actually my biggest 
hope, but: OutOfMemoryError.

jwagenleitner wrote:
 >>> So why not have the GroovyClassLoader keep a set of all classes it
 >>> compiled itself and were loaded and offer a new ~
 >>> GroovyClassLoader#finalCleanup() method that removes meta information
 >>> for all these classes so that they would become immediately 
eligible for
 >>> garbage collection? (I guess InvokerHelper.removeClass(clazz) and
 >>> Introspector.flushFromCaches(clazz), but whatever is needed...)
 >>
 >> GroovyClassLoader (GCL) actually represents a tree of class loaders. 
for each compilation GCL will spawn an instance of InnerLoader. Since 
two different compilations are supposed to know each others
classes a list of classes is kept in GCL itself (see classCache). The 
inner loader itself is not referenced by GCL. Because of that list GCL 
has the clearCache method to remove classes from previous compilations.
 >>
 >> Why did we use this structure? GCL is supposed to offer you the 
possibility to compile the same class multiple times. That means you 
will get the same class multiple times. At the same time a class must be 
defined under the same name only once in a given defining class loader. 
As a result trying to define a class, that already exists under that 
name results in an error. A classloading constraint is actually to 
return the same class instance each time you request a class with a 
certain name. Is implies the error before.... it also means GCL is 
breaking those constraints knowingly.
 >>
 >> Anyway... I think such a cleanup method is misplaced in GCL, since 
it spans beyond the classloader... how about GroovySystem?
 >>
 > I agree that if a method were added I don't think GCL is the right 
place and that something like GroovySystem#removeClass(Class) or 
GroovySystem#flushFromCaches(Class) would be good.

I guess this would help a little, but - again - as soon as you use e.g. 
a closure in a script, you have more than one class from a compilation, 
and I guess this happens often.

In cases where a script was compiled at runtime, that would have to be 
by a GroovyClassLoader$InnerLoader (which extends GroovyClassLoader). 
Now, that $InnerLoader could be obtained with 
script.getClass().getClassLoader() - you could tell so by whether it is 
a GroovyClassLoader#InnerLoader - and tell it to clean up all classes it 
loaded which would all be related to the only script it compiled. If you 
want to offer that - which I think would probably make sense - you would 
have to add a new method to GroovyClassLoader or at least to 
$InnerLoader - like #cleanupCompiledScripts() or whatever - and then 
offer its functionality also from GroovySystem for convenience, probably 
in two variants, one that really only removes just the indicated class 
and one extends to all classes compiled from the same script in case it 
was compiled at runtime from a Groovy script.

This will certainly also not cover all use cases (Groovy classes loaded 
from the file system by an URLClassLoader, for example, there I see now 
way how to track which classes would belong to which main script etc.), 
but I think the use case of scripts compiled at runtime would still 
justify it. It would offer a relatively clean way to make all classes 
from a script compilation available for GC more quickly.

I would maybe also offer a similar method for ConfigSlurper, for 
convenience, because that also implicitly always compiles a Groovy 
script (the config), thus filling Metaspace/PermGen - with almost 
certainly nobody expecting this offhand - so that users would not have 
to explicitly get the class from the parsed object and then call the 
removal function of GroovySystem. (There a different type of loader 
seems to be used, RootLoader, but also usually compilation only gives a 
single class, I would estimate.)

Finally, Groovy class loading is so dynamic/flexible that many things 
become impossible to untangle (I am just waiting now for Jochen 
Theodorou to say that some things I wrote above are not always like 
that), so if my hopes really fade regarding a prospect for "on-the-fly" 
GC in the relatively near future (before a Groovy 3), I might consider 
to add similar cleanup functionality to Grengine, where I estimate it 
could be done in a much more structured and controlled way.

Script:

def script = new GroovyShell().parse("99")
println script.getClass().getClassLoader()

def a = new ConfigSlurper().parse("b{c=5}")
assert a.b.c == 5
println a.getClass().getClassLoader()

Output:

groovy.lang.GroovyClassLoader$InnerLoader@7f010382
org.codehaus.groovy.tools.RootLoader@5451c3a8

Alain





Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
On 17.05.16 21:23, Alain Stalder wrote:
> I managed to make a Groovy version where garbage collection of 
> ClassInfo happens before the limit in Metaspace or Heap is reached (!) 
> - so far it is just a hack, but maybe it can contribute to a solution...
Branch/diff with the hack: 
https://github.com/apache/groovy/compare/master...jexler:classinfo-gc-hack
Compiled JAR: 
https://www.jexler.net/groovy-2.5.0-SNAPSHOT-classinf-gc-hack.jar



Re: Improve Groovy class loading performance and memory management

Posted by Jochen Theodorou <bl...@gmx.org>.
hi all,

I thought I try to give an update on the ClassValue issue. Well.. my 
first suggestion is to really not activate it by default. That is 
because Classvalue  has quite special semantics.

I am trying to explain them a bit here....

so let us define AV as the value computed by a ClassValue aClassValue 
and aClass the class value we want an AV for.

so there is relation (aClass,aClassValue)-> AV

In this relation there will be a strong reference of aClass to AV. 
aClass and aClassValue can be (in theory) collected independent of AV, 
AV can only be collected after aClass or aClassValue have been 
collected. Even if AV references aClass it can still be collected - 
under conditions

No imagine AV is actually called ClassInfo and from our Groovy runtime 
and aClass is Integer. Since it is a system class, aClass will be never 
collected, since aClassValue is an instance of our runtime, its class 
and ClassInfo will have the same class loader. As there is a strong 
reference to ClassInfo, that ClassInfo instance will not be collected. 
And as ClassInfo will stay loaded, so will any class of the groovy 
runtime. And since the class value is usually in a static field, that 
mean class value too. In conclusion that means the runtime will not be 
unloaded at all, class space used up and the final is a OOME.

There is no chance our approach in the current implementation can work. 
To avoid the problem we would have to do a lot of things. first of all, 
we have to avoid having an AV, which is from our runtime. So at the very 
least we would need something like a WeakReference<ClassInfo> computed 
from the classvalue, instead of ClassInfo directly. Next we would have 
to keep a list of ClassInfo to avoid their garbage collection right 
away. And then we would have to find a way to remove entries from there 
upon class removal.

And that�s not all of it :(

bye Jochen



Re: Improve Groovy class loading performance and memory management

Posted by Jochen Theodorou <bl...@gmx.org>.
On 18.05.2016 11:59, Alain Stalder wrote:
> Looking at that code for GlobalClassset below now, the itemsMap is only
> used for two things:
> - put(), where performance is not crucial because it happens only once
> per loaded class (which is relatively expensive anyway)
> - get(), where performance is crucial
> but no iterations or removals etc. necessary.

yes

> Could it work to try get() first without synchronize and if it returns
> null or throws an exception, just try again in a synchronize(itemsMap)
> block? I have no experience with doing such a thing with a
> (Weak)HashMap... Could a synchronized put() fail if there is an
> unsycnchronized get() at the same time?
>
> I think I will try that unless someone tells me it can't work...?

not safe, no. The trouble is that it can work... for you, in your test. 
But there is no guarantee it will still work on a different machine. And 
then things like this can happen: 
https://bz.apache.org/bugzilla/show_bug.cgi?id=50078

What we used to use is ManagedConcurrentMap instead. Or you could try 
using Guava, you can make in there a concurrent weak keyed hashmap. Only 
we cannot use guava for Groovy. The library has just a too high payload

> If it was worth a try:
>
> Any tips on tests I could run to compare performance of Groovy master
> with this branch (and verify that it is thread-safe)?
> I know there are tests in the benchmark directory of the groovy sources
> - which one(s) could I maybe run for this?

those benchmarks are largely number oriented, they won�t help you. You 
could try for example run many scripts concurrently, using the same 
groovy runtime. That usually gives a rough idea about the performance of 
this part of the code. A verification, that it is thread safe you won�t 
get with a benchmark, only a stress test.

bye Jochen

Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
Looking at that code for GlobalClassset below now, the itemsMap is only 
used for two things:
- put(), where performance is not crucial because it happens only once 
per loaded class (which is relatively expensive anyway)
- get(), where performance is crucial
but no iterations or removals etc. necessary.

Could it work to try get() first without synchronize and if it returns 
null or throws an exception, just try again in a synchronize(itemsMap) 
block? I have no experience with doing such a thing with a 
(Weak)HashMap... Could a synchronized put() fail if there is an 
unsycnchronized get() at the same time?

I think I will try that unless someone tells me it can't work...?

If it was worth a try:

Any tips on tests I could run to compare performance of Groovy master 
with this branch (and verify that it is thread-safe)?
I know there are tests in the benchmark directory of the groovy sources 
- which one(s) could I maybe run for this?

Actually, the fact that classes would be collected on-the-fly would 
reduce the size of itemsMapm which would in principle shorten access 
times, but maybe not significantly - would have to be seen then...

Alain


On 18.05.16 10:02, Alain Stalder wrote:
>
> On 18.05.16 09:10, Jochen Theodorou wrote:
>>>      private static class GlobalClassSet {
>>>
>>>          //private final ManagedLinkedList<ClassInfo> items = new
>>> ManagedLinkedList<ClassInfo>(weakBundle);
>>>          private final WeakHashMap<Class,WeakReference<ClassInfo>> items 
>>>
>>> = new WeakHashMap<Class,WeakReference<ClassInfo>>();
>>
>> would be actually interesting to keep the list and see if it can 
>> still garbage collect
>>
> Looks like it can. (As I would have expected because 
> ClassInfo.remove(clazz) did not touch that list before and that was 
> sufficient to get GC on-the-fly provided you also do 
> Introspector.flushFromCaches(clazz) ):
>
> --
>     private static class GlobalClassSet {
>
>         private final ManagedLinkedList<ClassInfo> itemsList = new 
> ManagedLinkedList<ClassInfo>(weakBundle);
>         private final WeakHashMap<Class,WeakReference<ClassInfo>> 
> itemsMap = new WeakHashMap<Class,WeakReference<ClassInfo>>();
>
>         public int size(){
>             return values().size();
>         }
>
>         public int fullSize(){
>             return values().size();
>         }
>
>         public Collection<ClassInfo> values(){
>             synchronized(itemsList){
>                 return Arrays.asList(itemsList.toArray(new ClassInfo[0]));
>             }
>         }
>
>         public void add(ClassInfo value){
>             synchronized(itemsList) {
>                 itemsList.add(value);
>             }
>             synchronized(itemsMap) {
>                 itemsMap.put(value.klazz, new 
> WeakReference<ClassInfo>(value));
>             }
>         }
>
>         public ClassInfo get(Class cls) {
>             WeakReference<ClassInfo> ref;
>             synchronized(itemsMap) {
>                 ref = itemsMap.get(cls);
>             }
>             ClassInfo info;
>             if (ref == null) {
>                 //System.out.println("ClassInfo Ref is null: " + 
> cls.getName());
>                 info = new ClassInfo(cls);
>                 synchronized (itemsMap) {
>                     itemsMap.put(cls, new WeakReference<ClassInfo>(info));
>                 }
>                 return info;
>             }
>             info = ref.get();
>             if (info == null) {
>                 //System.out.println("ClassInfo is null: " + 
> cls.getName());
>                 info = new ClassInfo(cls);
>                 itemsMap.put(cls, new WeakReference<ClassInfo>(info));
>                 return info;
>             }
>             return info;
>         }
>
>     }
> --
>
> $ java -XX:MaxMetaspaceSize=64m -Xmx512m -cp 
> .:groovy-2.5.0-SNAPSHOT.jar ClassGCTester -cp filling/ -parent tester 
> -classes GroovyFilling
> (does a Introspector.flushFromCaches(clazz) for each loaded class)
>
> Secs Test classes              Metaspace/PermGen Heap   Load time 
> Create time    Run time Cleanup time
>        #loaded  #remaining        used committed       used 
> committed     average     average     average      average
>    0         1           1       6.3m       6.5m      13.4m 245.5m     
> 0.890ms    14.308ms  0.026168ms   0.019285ms
>    1       435         435       8.9m      10.1m      22.1m 245.5m     
> 0.365ms     1.825ms  0.000064ms   0.000009ms
>    2      1202        1202      11.9m      14.6m      66.1m 245.5m     
> 0.280ms     1.314ms  0.000024ms   0.000001ms
>    3      2197        2197      15.7m      20.4m      83.8m 309.5m     
> 0.240ms     1.070ms  0.000010ms   0.000001ms
>    4      3247         966      11.0m      16.8m      16.5m 242.0m     
> 0.226ms     0.959ms  0.000006ms   0.000000ms
>    5      4396        2115      15.4m      20.3m      44.5m 238.0m     
> 0.208ms     0.886ms  0.000005ms   0.000000ms
>    6      5415        3134      19.3m      26.0m      54.4m 235.5m     
> 0.202ms     0.863ms  0.000009ms   0.000000ms
>    7      6458         667       9.8m      18.0m      94.7m 266.5m     
> 0.203ms     0.839ms  0.000003ms   0.000000ms
>    8      7550        1759      14.0m      21.4m     122.0m 268.5m     
> 0.198ms     0.821ms  0.000003ms   0.000000ms
>    9      8748        2957      18.6m      25.9m      46.3m 268.5m     
> 0.191ms     0.799ms  0.000003ms   0.000000ms
> [...]
>
> Very interesting because the list contains references to the class and 
> yet it can be garbage collected on-the-fly... Maybe that could help to 
> find a solution?
>
> Alain
>


Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
On 18.05.16 09:10, Jochen Theodorou wrote:
>>      private static class GlobalClassSet {
>>
>>          //private final ManagedLinkedList<ClassInfo> items = new
>> ManagedLinkedList<ClassInfo>(weakBundle);
>>          private final WeakHashMap<Class,WeakReference<ClassInfo>> items
>> = new WeakHashMap<Class,WeakReference<ClassInfo>>();
>
> would be actually interesting to keep the list and see if it can still 
> garbage collect
>
Looks like it can. (As I would have expected because 
ClassInfo.remove(clazz) did not touch that list before and that was 
sufficient to get GC on-the-fly provided you also do 
Introspector.flushFromCaches(clazz) ):

--
     private static class GlobalClassSet {

         private final ManagedLinkedList<ClassInfo> itemsList = new 
ManagedLinkedList<ClassInfo>(weakBundle);
         private final WeakHashMap<Class,WeakReference<ClassInfo>> 
itemsMap = new WeakHashMap<Class,WeakReference<ClassInfo>>();

         public int size(){
             return values().size();
         }

         public int fullSize(){
             return values().size();
         }

         public Collection<ClassInfo> values(){
             synchronized(itemsList){
                 return Arrays.asList(itemsList.toArray(new ClassInfo[0]));
             }
         }

         public void add(ClassInfo value){
             synchronized(itemsList) {
                 itemsList.add(value);
             }
             synchronized(itemsMap) {
                 itemsMap.put(value.klazz, new 
WeakReference<ClassInfo>(value));
             }
         }

         public ClassInfo get(Class cls) {
             WeakReference<ClassInfo> ref;
             synchronized(itemsMap) {
                 ref = itemsMap.get(cls);
             }
             ClassInfo info;
             if (ref == null) {
                 //System.out.println("ClassInfo Ref is null: " + 
cls.getName());
                 info = new ClassInfo(cls);
                 synchronized (itemsMap) {
                     itemsMap.put(cls, new WeakReference<ClassInfo>(info));
                 }
                 return info;
             }
             info = ref.get();
             if (info == null) {
                 //System.out.println("ClassInfo is null: " + 
cls.getName());
                 info = new ClassInfo(cls);
                 itemsMap.put(cls, new WeakReference<ClassInfo>(info));
                 return info;
             }
             return info;
         }

     }
--

$ java -XX:MaxMetaspaceSize=64m -Xmx512m -cp .:groovy-2.5.0-SNAPSHOT.jar 
ClassGCTester -cp filling/ -parent tester -classes GroovyFilling
(does a Introspector.flushFromCaches(clazz) for each loaded class)

Secs Test classes              Metaspace/PermGen Heap   Load time Create 
time    Run time Cleanup time
        #loaded  #remaining        used committed       used 
committed     average     average     average      average
    0         1           1       6.3m       6.5m      13.4m 245.5m     
0.890ms    14.308ms  0.026168ms   0.019285ms
    1       435         435       8.9m      10.1m      22.1m 245.5m     
0.365ms     1.825ms  0.000064ms   0.000009ms
    2      1202        1202      11.9m      14.6m      66.1m 245.5m     
0.280ms     1.314ms  0.000024ms   0.000001ms
    3      2197        2197      15.7m      20.4m      83.8m 309.5m     
0.240ms     1.070ms  0.000010ms   0.000001ms
    4      3247         966      11.0m      16.8m      16.5m 242.0m     
0.226ms     0.959ms  0.000006ms   0.000000ms
    5      4396        2115      15.4m      20.3m      44.5m 238.0m     
0.208ms     0.886ms  0.000005ms   0.000000ms
    6      5415        3134      19.3m      26.0m      54.4m 235.5m     
0.202ms     0.863ms  0.000009ms   0.000000ms
    7      6458         667       9.8m      18.0m      94.7m 266.5m     
0.203ms     0.839ms  0.000003ms   0.000000ms
    8      7550        1759      14.0m      21.4m     122.0m 268.5m     
0.198ms     0.821ms  0.000003ms   0.000000ms
    9      8748        2957      18.6m      25.9m      46.3m 268.5m     
0.191ms     0.799ms  0.000003ms   0.000000ms
[...]

Very interesting because the list contains references to the class and 
yet it can be garbage collected on-the-fly... Maybe that could help to 
find a solution?

Alain


Re: Improve Groovy class loading performance and memory management

Posted by Jochen Theodorou <bl...@gmx.org>.
On 17.05.2016 21:23, Alain Stalder wrote:
[...]
> 3) Value in the WeakHashMap is a Wrapper with a WeakReference to the class:
>
> private static WeakHashMap<Class<?>, Wrapper> weakFillingClassesMap =
> new WeakHashMap<Class<?>, Wrapper>();
> private static class Wrapper { public WeakReference<Class<?>> clazz; }
> ...
> Wrapper wrapper = new Wrapper();
> wrapper.clazz = new WeakReference<Class<?>>(clazz);
> weakFillingClassesMap.put(clazz, wrapper);
>
> => Can immediately be garbage collected (i.e. before limit on Metaspace
> or Heap is reached)
>
> --
>
> 4) Value in the WeakHashMap is a WeakReference<Wrapper> with a hard
> reference to the class in the Wrapper:
>
> private static WeakHashMap<Class<?>, WeakReference<Wrapper>>
> weakFillingClassesMap = new WeakHashMap<Class<?>,
> WeakReference<Wrapper>>();\u2028private static class Wrapper { public
> Class<?> clazz; }
> ...
> Wrapper wrapper = new Wrapper();
> wrapper.clazz = clazz;
> weakFillingClassesMap.put(clazz, new WeakReference<Wrapper>(wrapper));
>
> => Can immediately be garbage collected (i.e. before limit on Metaspace
> or Heap is reached)
>
> --
>
> So, the basic idea would to refactor ClassInfo caches to use 3) or 4)
> and maybe to override Introspector...

since I am looking for something that works with ClassInfo in the end, I 
guess 3) is the way to go.

[...]
>      private static class GlobalClassSet {
>
>          //private final ManagedLinkedList<ClassInfo> items = new
> ManagedLinkedList<ClassInfo>(weakBundle);
>          private final WeakHashMap<Class,WeakReference<ClassInfo>> items
> = new WeakHashMap<Class,WeakReference<ClassInfo>>();

would be actually interesting to keep the list and see if it can still 
garbage collect

[....]
> What looks less than ideal is the first synchronize on items in get(),
> but I don't know to what degree that would matter in practice, I don't
> know how often that is called. In my tests this version appeared even to
> be slightly faster than the one that is using Java 7 ClassValue, but
> there was just a single thread...

in worst case, this is called for about every dynamic method 
invocation... so this should be better not blocking so much.

[...]
> As I said, so far rather a hack, probably better to reimplement the
> GroovyClassValuePreJava7 class instead?

reimplement to what?

> Performance under concurrent
> use? Are other caches that apparently exist in ClassInfo also no issue
> under different circumstances? (And at some point: does it work across
> VMs and OSes etc.?)

that`s to be tested

> Would it make sense to implement a "GroovyIntrospector" which caches
> things in a WeakHashMap<Class,<WeakReference<Method[]>> instead of in a
> WeakHashMap<Class,Method[]> as does Introspector, or something like
> that? Not sure there, because it is all static and not sure how much
> this has or will change from Java release to Java release, but maybe
> that is not so important, just need an implementation that works? Or is
> it sort of a public API for Groovy classes that is widely used?

I think we already have code for the bean stuff in Groovy itself. What 
it does not do is the bean info part. Removing the Introspector code 
(ideally making it optional) is quite high on my wishlist for a major 
version

bye Jochen


Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
I managed to make a Groovy version where garbage collection of ClassInfo 
happens before the limit in Metaspace or Heap is reached (!) - so far it 
is just a hack, but maybe it can contribute to a solution...

First some "basic research" on when the Java VM can garbage collect a 
class, performed with a slightly modified ClassGCTester and the simple 
"JavaFilling" class from previous tests. Again Oracle JDK 8 (Mac).

--

1) Original test setup (class reference as key, constant String "" as 
value):

private static WeakHashMap<Class<?>, String> weakFillingClassesMap = new 
WeakHashMap<Class<?>, String>();
...
weakFillingClassesMap.put(clazz, "");

=> Can immediately be garbage collected (i.e. before limit on Metaspace 
or Heap is reached), as expected, of course

--

2) Value in the WeakHashMap is a Wrapper with a hard reference to the class:

private static WeakHashMap<Class<?>, Wrapper> weakFillingClassesMap = 
new WeakHashMap<Class<?>, Wrapper>();
private static class Wrapper { public Class<?> clazz; }
...
Wrapper wrapper = new Wrapper();
wrapper.clazz = clazz;
weakFillingClassesMap.put(clazz, wrapper);

=> Cannot be garbage collected, OutOfMemoryError once limit on Metaspace 
or Heap is reached

--

3) Value in the WeakHashMap is a Wrapper with a WeakReference to the class:

private static WeakHashMap<Class<?>, Wrapper> weakFillingClassesMap = 
new WeakHashMap<Class<?>, Wrapper>();
private static class Wrapper { public WeakReference<Class<?>> clazz; }
...
Wrapper wrapper = new Wrapper();
wrapper.clazz = new WeakReference<Class<?>>(clazz);
weakFillingClassesMap.put(clazz, wrapper);

=> Can immediately be garbage collected (i.e. before limit on Metaspace 
or Heap is reached)

--

4) Value in the WeakHashMap is a WeakReference<Wrapper> with a hard 
reference to the class in the Wrapper:

private static WeakHashMap<Class<?>, WeakReference<Wrapper>> 
weakFillingClassesMap = new WeakHashMap<Class<?>, 
WeakReference<Wrapper>>();\u2028private static class Wrapper { public 
Class<?> clazz; }
...
Wrapper wrapper = new Wrapper();
wrapper.clazz = clazz;
weakFillingClassesMap.put(clazz, new WeakReference<Wrapper>(wrapper));

=> Can immediately be garbage collected (i.e. before limit on Metaspace 
or Heap is reached)

--

So, the basic idea would to refactor ClassInfo caches to use 3) or 4) 
and maybe to override Introspector...

Here's the hack I made for ClassInfo, based on the master branch (note 
that in that branch there is even still a hard reference to the Class in 
ClassInfo):

Not using ClassValue stuff at all:

     /*private static final GroovyClassValue<ClassInfo> globalClassValue 
= GroovyClassValueFactory.createGroovyClassValue(new 
ComputeValue<ClassInfo>(){
         @Override
         public ClassInfo computeValue(Class<?> type) {
             ClassInfo ret = new ClassInfo(type);
             globalClassSet.add(ret);
             return ret;
         }
     });*/

Instead getting ClassInfo from class from a refactored GlobalClassSet:

     public static ClassInfo getClassInfo (Class cls) {
         return globalClassSet.get(cls);
         //return globalClassValue.get(cls);
     }

and here is the refactored GlobalClassSet, now based on a WeakHashMap:

     private static class GlobalClassSet {

         //private final ManagedLinkedList<ClassInfo> items = new 
ManagedLinkedList<ClassInfo>(weakBundle);
         private final WeakHashMap<Class,WeakReference<ClassInfo>> items 
= new WeakHashMap<Class,WeakReference<ClassInfo>>();

         public int size(){
             return values().size();
         }

         public int fullSize(){
             return values().size();
         }

         public Collection<ClassInfo> values(){
             synchronized(items){
                 Collection<WeakReference<ClassInfo>> values = 
items.values();
                 List<ClassInfo> list = new ArrayList<ClassInfo>();
                 for (WeakReference<ClassInfo> value : values) {
                     ClassInfo info = value.get();
                     if (info != null) {
                         //System.out.println("ClassInfo is null");
                         list.add(info);
                     }
                 }
                 return list;
                 //return Arrays.asList(items.toArray(new ClassInfo[0]));
             }
         }

         public void add(ClassInfo value){
             synchronized(items){
                 //items.add(value);
                 items.put(value.klazz, new 
WeakReference<ClassInfo>(value));
             }
         }

         public ClassInfo get(Class cls) {
             WeakReference<ClassInfo> ref;
             synchronized(items) {
                 ref = items.get(cls);
             }
             ClassInfo info;
             if (ref == null) {
                 //System.out.println("ClassInfo Ref is null: " + 
cls.getName());
                 info = new ClassInfo(cls);
                 synchronized (items) {
                     items.put(cls, new WeakReference<ClassInfo>(info));
                 }
                 return info;
             }
             info = ref.get();
             if (info == null) {
                 //System.out.println("ClassInfo is null: " + 
cls.getName());
                 info = new ClassInfo(cls);
                 items.put(cls, new WeakReference<ClassInfo>(info));
                 return info;
             }
             return info;
         }

     }

What looks less than ideal is the first synchronize on items in get(), 
but I don't know to what degree that would matter in practice, I don't 
know how often that is called. In my tests this version appeared even to 
be slightly faster than the one that is using Java 7 ClassValue, but 
there was just a single thread...

For testing with ClassGCTester I made a manual cleanup of the 
Introspector cache after loading each class in the loop, if only loading 
the GroovyFilling class from the URLClassLoader like this:

    Introspector.flushFromCaches(clazz);

If loading also all of Groovy from the URLClassLoader I had to do it 
like this to clean up for all Groovy classes (which of course makes 
things slower):

    Introspector.flushCaches();

Sample output if only loading the GroovyFilling class from the 
URLClassLoader:

--
Java Version:    1.8.0_92
Groovy Version:  2.5.0-SNAPSHOT
Java Class Path: .:groovy-2.5.0-SNAPSHOT.jar
Arguments:       -cp filling/ -parent tester -classes GroovyFilling
VM Arguments:    -XX:MaxMetaspaceSize=64m -Xmx512m
PID:             57399

Secs Test classes              Metaspace/PermGen Heap   Load time Create 
time
        #loaded  #remaining        used committed       used 
committed     average     average
    0         1           1       6.3m       6.5m      13.3m 245.5m     
1.150ms    15.411ms
    1       498         498       9.1m      10.5m      29.9m 245.5m     
0.339ms     1.598ms
    2      1361        1361      12.4m      15.5m      26.1m 245.5m     
0.267ms     1.164ms
    3      2303          10       7.3m      16.8m       4.6m 229.0m     
0.252ms     1.022ms
    4      3496        1203      11.8m      16.8m      46.3m 244.5m     
0.218ms     0.904ms
    5      4640        2347      16.2m      21.4m      71.6m 240.0m     
0.207ms     0.852ms
    6      5637        3344      20.0m      27.1m      74.8m 237.5m     
0.203ms     0.843ms
    7      6827        1024      11.1m      18.2m      19.1m 269.0m     
0.200ms     0.808ms
    8      8120        2317      16.1m      23.4m      73.2m 254.0m     
0.191ms     0.779ms
    9      9382        3579      20.9m      28.6m     122.7m 269.0m     
0.184ms     0.761ms
   10     10669        1036      11.2m      22.0m      21.3m 274.5m     
0.183ms     0.741ms
   11     12010        2377      16.3m      24.3m      81.9m 289.5m     
0.177ms     0.726ms
   12     13275        3642      21.2m      29.3m     129.5m 288.5m     
0.174ms     0.718ms
   13     14482        4849      25.8m      36.0m      42.9m 289.5m     
0.172ms     0.714ms
   14     15792        1177      11.7m      22.7m      36.1m 319.5m     
0.172ms     0.704ms
   15     17112        2497      16.8m      27.0m      86.5m 319.5m     
0.169ms     0.697ms
--

Sample output if loading Groovy and the GroovyFilling class from the 
URLClassLoader:

--
Java Version:    1.8.0_92
Groovy Version:  2.5.0-SNAPSHOT
Java Class Path: .
Arguments:       -cp groovy-2.5.0-SNAPSHOT.jar:filling/ -parent null 
-classes GroovyFilling
VM Arguments:    -XX:MaxMetaspaceSize=64m -Xmx512m
PID:             59454

Secs Test classes              Metaspace/PermGen Heap   Load time Create 
time
        #loaded  #remaining        used committed       used 
committed     average     average
    0         1           1       7.9m       8.5m      18.0m 245.5m     
2.345ms   122.303ms
    1        10           3      10.3m      16.8m      24.2m 225.0m     
1.798ms   103.932ms
    2        20           1       7.0m      20.0m      20.4m 267.5m     
1.512ms   100.385ms
    3        29          10      22.4m      23.9m      83.2m 267.5m     
1.436ms   101.481ms
    4        41           7      17.3m      21.3m      65.7m 319.5m     
1.332ms    97.067ms
    5        52           2       8.8m      21.5m      31.2m 352.0m     
1.284ms    95.194ms
    6        64          14      29.3m      31.2m     116.6m 352.0m     
1.235ms    92.452ms
    7        76           9      20.8m      23.4m      85.5m 315.5m     
1.193ms    90.991ms
    8        88           4      12.3m      20.5m      35.2m 362.5m     
1.166ms    90.134ms
    9       100          16      32.8m      34.5m      54.7m 366.5m     
1.135ms    88.930ms
   10       112          11      24.2m      26.8m      84.6m 397.0m     
1.109ms    88.191ms
[...]
--

As I said, so far rather a hack, probably better to reimplement the 
GroovyClassValuePreJava7 class instead? Performance under concurrent 
use? Are other caches that apparently exist in ClassInfo also no issue 
under different circumstances? (And at some point: does it work across 
VMs and OSes etc.?)

Would it make sense to implement a "GroovyIntrospector" which caches 
things in a WeakHashMap<Class,<WeakReference<Method[]>> instead of in a 
WeakHashMap<Class,Method[]> as does Introspector, or something like 
that? Not sure there, because it is all static and not sure how much 
this has or will change from Java release to Java release, but maybe 
that is not so important, just need an implementation that works? Or is 
it sort of a public API for Groovy classes that is widely used?

Alain

Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
On 17.05.16 09:48, Alain Stalder wrote:
> Experimentally, for the test with ClassGCTester, the first call 
> (metaRegistry.removeMetaClass(clazz)) was not necessary to have 
> garbage collection before Metaspace reaches the maximum, the other two 
> were.
Makes sense:
metaRegistry keeps no reference to the class, instead it gets ClassInfo 
and stores the MetaClass there.
Introspector has a reference to the class: It contains a WeakHashMap 
(resp. a class derived of it) with the class as the key and an array of 
java.lang.reflect.Method as the value, which, in turn, has the class 
reference in a private field.

Alain

Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
On 17.05.16 09:04, Alain Stalder wrote:
> PS: Note that Introspector.flushFromCaches(clazz) was experimentally 
> really not necessary in this case, but maybe has to do with the simple 
> nature of the test script ("42") and only calling a (no-args) 
> constructor... In any case very promising...

Ah, that's simply because it is already called in 
InvokerHelper.removeClass():

     public static void removeClass(Class clazz) {
         metaRegistry.removeMetaClass(clazz);
         ClassInfo.remove(clazz);
         Introspector.flushFromCaches(clazz);
     }

Experimentally, for the test with ClassGCTester, the first call 
(metaRegistry.removeMetaClass(clazz)) was not necessary to have garbage 
collection before Metaspace reaches the maximum, the other two were.

Alain

Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
On 17.05.16 07:53, Alain Stalder wrote:
> I will definitely try out the InvokerHelper.removeClass(clazz) with 
> added ClassInfo removal plus Introspector.flushFromCaches(clazz) and 
> see if I can get garbage collection before reaching the limit on 
> Metaspace or Heap.
Fantastic, I really got it to work :)

I took groovy master plus changes for GROOVY-7646 and GROOVY-7683 and 
modified the test in ClassGCTester as follows:

--
                 long nanoT0 = System.nanoTime();
                 Class<?> clazz = classLoader.loadClass(testClassName);
                 long nanoT1 = System.nanoTime();
                 clazz.newInstance();
                 long nanoT2 = System.nanoTime();
                 loadTimeTotal += (nanoT1 - nanoT0);
                 createTimeTotal += (nanoT2 - nanoT1);
                 weakFillingClassesMap.put(clazz, "");

                 // added:
                 Class<?> invokerHelperClass = 
ClassGCTester.class.getClassLoader().loadClass("org.codehaus.groovy.runtime.InvokerHelper");
                 Method removeClassMethod = 
invokerHelperClass.getDeclaredMethod("removeClass", Class.class);
                 removeClassMethod.invoke(null, clazz);
                 //Introspector.flushFromCaches(clazz);
--

An then ran the test as follows (not using ClassValue, loading the test 
class not from the same ClassLoader as Groovy itself):

$ java -XX:MaxMetaspaceSize=64m -Xmx512m -cp .:groovy-2.5.0-SNAPSHOT.jar 
ClassGCTester -cp filling/ -parent tester -classes GroovyFilling

--
Secs Test classes              Metaspace/PermGen Heap   Load time Create 
time
        #loaded  #remaining        used committed       used 
committed     average     average
    0         1           1       6.3m       6.5m      12.8m 245.5m     
0.838ms    11.112ms
    1       486         486       9.2m      10.5m      27.6m 245.5m     
0.326ms     1.624ms
    2      1404        1404      12.7m      16.1m      30.3m 245.5m     
0.244ms     1.124ms
    3      2308          61       7.6m      16.8m      12.2m 228.5m     
0.241ms     1.014ms
    4      3461        1214      12.0m      16.8m      46.4m 244.0m     
0.212ms     0.908ms
    5      4577        2330      16.3m      21.4m      69.1m 240.0m     
0.200ms     0.860ms
    6      5581        3334      20.1m      27.4m      75.1m 237.5m     
0.197ms     0.847ms
    7      6703         974      11.1m      18.2m      11.3m 268.0m     
0.197ms     0.818ms
    8      7996        2267      16.0m      23.4m      65.7m 253.5m     
0.188ms     0.786ms
    9      9261        3532      20.9m      28.5m     115.1m 267.5m     
0.182ms     0.765ms
   10     10518         960      11.0m      20.0m      10.9m 285.0m     
0.181ms     0.747ms
   11     11841        2283      16.1m      24.0m      68.9m 285.0m     
0.177ms     0.730ms
   12     13097        3539      20.9m      29.0m     113.9m 285.0m     
0.173ms     0.722ms
   13     14288         331       8.7m      21.3m      49.6m 314.0m     
0.174ms     0.715ms
   14     15640        1683      13.8m      22.9m     105.5m 316.0m     
0.170ms     0.705ms
   15     16923        2966      18.7m      27.9m     150.4m 315.5m     
0.168ms     0.699ms
   16     18128        4171      23.3m      32.6m      54.6m 316.0m     
0.166ms     0.697ms
   17     19360         628       9.8m      21.5m      88.6m 347.0m     
0.167ms     0.692ms
   18     20707        1975      15.0m      24.6m     137.6m 346.5m     
0.165ms     0.685ms
   19     21958        3226      19.7m      29.6m      39.1m 347.5m     
0.164ms     0.683ms
   20     23172        4440      24.4m      34.4m      66.4m 349.5m     
0.163ms     0.682ms
   21     24430         861      10.7m      21.9m     120.7m 383.5m     
0.163ms     0.678ms
   22     25725        2156      15.7m      25.5m      19.0m 381.0m     
0.162ms     0.675ms
   23     26986        3417      20.5m      30.5m      49.3m 385.0m     
0.161ms     0.674ms
   24     28192        4623      25.1m      35.3m      70.8m 384.5m     
0.161ms     0.673ms
   25     29454         956      11.1m      21.0m     134.6m 416.0m     
0.161ms     0.671ms
   26     30737        2239      16.0m      26.3m      25.2m 416.5m     
0.160ms     0.669ms
[...]
--

Nicely garbage collects repeatedly, Metaspace stay well below the 
configured maximum of 64m...

PS: Note that Introspector.flushFromCaches(clazz) was experimentally 
really not necessary in this case, but maybe has to do with the simple 
nature of the test script ("42") and only calling a (no-args) 
constructor... In any case very promising...

Alain




Re: Improve Groovy class loading performance and memory management

Posted by John Wagenleitner <jo...@gmail.com>.
On Mon, May 16, 2016 at 11:56 PM, Jochen Theodorou <bl...@gmx.org>
wrote:

> On 17.05.2016 07:53, Alain Stalder wrote:
>
>> That looks very good to me :)
>>
>> I will definitely try out the InvokerHelper.removeClass(clazz) with
>> added ClassInfo removal plus Introspector.flushFromCaches(clazz) and see
>> if I can get garbage collection before reaching the limit on Metaspace
>> or Heap.
>>
>> And, maybe something like the following could be added to the
>> GroovyClassLoader? Thinking aloud:
>>
>> Assuming the following is true: Any class can only be garbage collected
>> once its ClassLoader can be garbage collected, because each class keeps
>> a reference to its ClassLoader (so that it can use it to load further
>> classes when needed when running methods).
>>
>
> not only the class, the classloader internals also keep such a reference.
> And I mean java.lang.ClassLoader here.
>
> So why not have the GroovyClassLoader keep a set of all classes it
>> compiled itself and were loaded and offer a new ~
>> GroovyClassLoader#finalCleanup() method that removes meta information
>> for all these classes so that they would become immediately eligible for
>> garbage collection? (I guess InvokerHelper.removeClass(clazz) and
>> Introspector.flushFromCaches(clazz), but whatever is needed...)
>>
>
> GroovyClassLoader (GCL) actually represents a tree of class loaders. for
> each compilation GCL will spawn an instance of InnerLoader. Since two
> different compilations are supposed to know each others classes a list of
> classes is kept in GCL itself (see classCache). The inner loader itself is
> not referenced by GCL. Because of that list GCL has the clearCache method
> to remove classes from previous compilations.
>
> Why did we use this structure? GCL is supposed to offer you the
> possibility to compile the same class multiple times. That means you will
> get the same class multiple times. At the same time a class must be defined
> under the same name only once in a given defining class loader. As a result
> trying to define a class, that already exists under that name results in an
> error. A classloading constraint is actually to return the same class
> instance each time you request a class with a certain name. Is implies the
> error before.... it also means GCL is breaking those constraints knowingly.
>
> Anyway... I think such a cleanup method is misplaced in GCL, since it
> spans beyond the classloader... how about GroovySystem?
>
>

I agree that if a method were added I don't think GCL is the right place
and that something like GroovySystem#removeClass(Class) or
GroovySystem#flushFromCaches(Class) would be good.

Re: Improve Groovy class loading performance and memory management

Posted by Jochen Theodorou <bl...@gmx.org>.
On 17.05.2016 07:53, Alain Stalder wrote:
> That looks very good to me :)
>
> I will definitely try out the InvokerHelper.removeClass(clazz) with
> added ClassInfo removal plus Introspector.flushFromCaches(clazz) and see
> if I can get garbage collection before reaching the limit on Metaspace
> or Heap.
>
> And, maybe something like the following could be added to the
> GroovyClassLoader? Thinking aloud:
>
> Assuming the following is true: Any class can only be garbage collected
> once its ClassLoader can be garbage collected, because each class keeps
> a reference to its ClassLoader (so that it can use it to load further
> classes when needed when running methods).

not only the class, the classloader internals also keep such a 
reference. And I mean java.lang.ClassLoader here.

> So why not have the GroovyClassLoader keep a set of all classes it
> compiled itself and were loaded and offer a new ~
> GroovyClassLoader#finalCleanup() method that removes meta information
> for all these classes so that they would become immediately eligible for
> garbage collection? (I guess InvokerHelper.removeClass(clazz) and
> Introspector.flushFromCaches(clazz), but whatever is needed...)

GroovyClassLoader (GCL) actually represents a tree of class loaders. for 
each compilation GCL will spawn an instance of InnerLoader. Since two 
different compilations are supposed to know each others classes a list 
of classes is kept in GCL itself (see classCache). The inner loader 
itself is not referenced by GCL. Because of that list GCL has the 
clearCache method to remove classes from previous compilations.

Why did we use this structure? GCL is supposed to offer you the 
possibility to compile the same class multiple times. That means you 
will get the same class multiple times. At the same time a class must be 
defined under the same name only once in a given defining class loader. 
As a result trying to define a class, that already exists under that 
name results in an error. A classloading constraint is actually to 
return the same class instance each time you request a class with a 
certain name. Is implies the error before.... it also means GCL is 
breaking those constraints knowingly.

Anyway... I think such a cleanup method is misplaced in GCL, since it 
spans beyond the classloader... how about GroovySystem?

bye Jochen

Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
That looks very good to me :)

I will definitely try out the InvokerHelper.removeClass(clazz) with 
added ClassInfo removal plus Introspector.flushFromCaches(clazz) and see 
if I can get garbage collection before reaching the limit on Metaspace 
or Heap.

And, maybe something like the following could be added to the 
GroovyClassLoader? Thinking aloud:

Assuming the following is true: Any class can only be garbage collected 
once its ClassLoader can be garbage collected, because each class keeps 
a reference to its ClassLoader (so that it can use it to load further 
classes when needed when running methods).

So why not have the GroovyClassLoader keep a set of all classes it 
compiled itself and were loaded and offer a new ~ 
GroovyClassLoader#finalCleanup() method that removes meta information 
for all these classes so that they would become immediately eligible for 
garbage collection? (I guess InvokerHelper.removeClass(clazz) and 
Introspector.flushFromCaches(clazz), but whatever is needed...)

This would not help with Groovy classes that were precompiled and 
loaded, say, with an URLClassLoader, but would help with ones that were 
dynamically compiled at runtime.

Alain


On 17.05.16 00:35, John Wagenleitner wrote:
>
>
> On Mon, May 16, 2016 at 1:34 PM, Alain Stalder <astalder@span.ch 
> <ma...@span.ch>> wrote:
>
>     Thanks, I had not looked at the master branch, ClassInfo source
>     looks quite a bit cleaner there already :)
>
>     Regarding programmatic cleanup (GROOVY-7646), I think that is a
>     good idea, but in the details there might be some obstacles.
>
>
> I definitely agree, many obstacles usually present themselves once you 
> scratch the surface. :)
>
>
>
>     For example this sequence of calls to GroovyShell:
>
>     def shell = new GroovyShell()
>     def script1 = shell.parse("42")
>     assert script1.class.name <http://script1.class.name> == "Script1"
>     def script2 = shell.parse("new Script1().run()")
>     assert script2.run() == 42
>
>     def script3 = shell.parse("99", "Nintetynine")
>     assert script3.class.name <http://script3.class.name> == "Nintetynine"
>
>     def file = new File("Fiftyfive.groovy")
>     file.setText("55")
>     def script4 = shell.parse(file)
>     assert script4.class.name <http://script4.class.name> == "Fiftyfive"
>
>     So, classes accumulate (in the GroovyClassLoader) and can be
>     addressed by their names in subsequent scripts. (And for more
>     complex script expressions, more than one class might be the
>     result of compilation, e.g. with closures or inner classes, enums,
>     etc.)
>
>
>
> This passes with the changes in place for GROOVY-7646, though calls to 
> parse don't call the added clean-up code. It still passes if I change 
> parse to run.
>
>
>
>     I would estimate that in the case where a script is run with a
>     name given automatically by the GroovyShell ("Script1", "Script2",
>     ...) it would be OK to do the cleanup (and I guess using
>     GroovyShell that way might be a very common case?), but when it
>     comes to explicitly named scripts, doing so might change behaviour
>     of existing code.
>
>
>
> For quite some time GroovyShell#evaluate(Reader,String filename) was 
> doing this kind of cleanup [1].  Cleanup meaning removing the 
> metaClass, the ClassInfo from the cache and the Introspector 
> beanInfoCache.  As long as the ClassLoader and any of it's classes are 
> still referenced the Classes that result from parse/run calls would 
> still be available.  But you are right, there are so many ways the 
> shell can be used it is difficult to tell what it might break.
>
>
>     I just took a look at GroovyScriptEngine which also has run()
>     methods and, if I remember correctly, it recompiles all scripts if
>     one of them changes (to get dependencies right), so in principle
>     lots of classes to cleanup for each time this happens. But I am
>     not sure if that is possible there, because there is also a
>     createScript() method, so possibly still objects/classes that are
>     in use around.
>
>     (And I have also just started to think about Grengine in that
>     context, my open source library for using Groovy in a Java VM (and
>     which almost nobody uses ;), there it might be easier to build in
>     such automatic removal because the approach is more structured,
>     although a bit less dynamic.)
>
>     Hmn, would really be great if there was a way to achieve constant
>     garbage collection of Groovy classes.
>
>
>
> I take constant to mean not waiting until heap/metaspace is filled 
> before collection.  If so, from that I've seen that would still 
> require some user intervention 
> (java.beans.Introspector.flushFromCaches(clazz)) in order to clear the 
> Introspector cache which keeps a Soft Reference to main method of a 
> Script class which in turns references the Class.
>
>
>
>     Alain
>

Re: Improve Groovy class loading performance and memory management

Posted by John Wagenleitner <jo...@gmail.com>.
On Mon, May 16, 2016 at 1:34 PM, Alain Stalder <as...@span.ch> wrote:

> Thanks, I had not looked at the master branch, ClassInfo source looks
> quite a bit cleaner there already :)
>
> Regarding programmatic cleanup (GROOVY-7646), I think that is a good idea,
> but in the details there might be some obstacles.
>

I definitely agree, many obstacles usually present themselves once you
scratch the surface. :)



>
> For example this sequence of calls to GroovyShell:
>
> def shell = new GroovyShell()
> def script1 = shell.parse("42")
> assert script1.class.name == "Script1"
> def script2 = shell.parse("new Script1().run()")
> assert script2.run() == 42
>
> def script3 = shell.parse("99", "Nintetynine")
> assert script3.class.name == "Nintetynine"
>
> def file = new File("Fiftyfive.groovy")
> file.setText("55")
> def script4 = shell.parse(file)
> assert script4.class.name == "Fiftyfive"
>
> So, classes accumulate (in the GroovyClassLoader) and can be addressed by
> their names in subsequent scripts. (And for more complex script
> expressions, more than one class might be the result of compilation, e.g.
> with closures or inner classes, enums, etc.)
>


This passes with the changes in place for GROOVY-7646, though calls to
parse don't call the added clean-up code.  It still passes if I change
parse to run.



> I would estimate that in the case where a script is run with a name given
> automatically by the GroovyShell ("Script1", "Script2", ...) it would be OK
> to do the cleanup (and I guess using GroovyShell that way might be a very
> common case?), but when it comes to explicitly named scripts, doing so
> might change behaviour of existing code.
>


For quite some time GroovyShell#evaluate(Reader,String filename) was doing
this kind of cleanup [1].  Cleanup meaning removing the metaClass, the
ClassInfo from the cache and the Introspector beanInfoCache.  As long as
the ClassLoader and any of it's classes are still referenced the Classes
that result from parse/run calls would still be available.  But you are
right, there are so many ways the shell can be used it is difficult to tell
what it might break.



>
> I just took a look at GroovyScriptEngine which also has run() methods and,
> if I remember correctly, it recompiles all scripts if one of them changes
> (to get dependencies right), so in principle lots of classes to cleanup for
> each time this happens. But I am not sure if that is possible there,
> because there is also a createScript() method, so possibly still
> objects/classes that are in use around.
>
> (And I have also just started to think about Grengine in that context, my
> open source library for using Groovy in a Java VM (and which almost nobody
> uses ;), there it might be easier to build in such automatic removal
> because the approach is more structured, although a bit less dynamic.)
>
> Hmn, would really be great if there was a way to achieve constant garbage
> collection of Groovy classes.
>


I take constant to mean not waiting until heap/metaspace is filled before
collection.  If so, from that I've seen that would still require some user
intervention (java.beans.Introspector.flushFromCaches(clazz)) in order to
clear the Introspector cache which keeps a Soft Reference to main method of
a Script class which in turns references the Class.



>
>
> Alain
>
>
>

[1]
https://github.com/apache/groovy/commit/5724870025c25622015ba13c0310def5742d0b2f#diff-62f4f9c1bd5efea3ddcfe563c25f953eR459



> On 16.05.16 18:18, John Wagenleitner wrote:
>
> Just catching up on this thread, very interesting discussion and will have
> to give the posted test code a try.
>
> You are right about the PhantomReference and it has been removed in
> master [1] along with the local cache that used it.  Due to some
> refactorings that were not in 2_4_X at the time it wasn't removed from that
> branch.  But probably should be cleaned up if any fixes for the memory
> issues to ClassInfo are merged into that branch in the near future.
>
> I think the suggestion of referencing the Class via the ClassInfo from
> metaclasses/cachedclasses would be a good one, the less places the Class is
> kept the better.  Unfortunately since it's a protected field on
> MetaClassImpl that would be a breaking change would be something for a 3.0
> as you pointed out.
>
> So far, I have found that keeping a WeakReference to the Class in
> ClassInfo allows it to be collected (mostly tested with non-ClassValue
> version of ClassInfo).  At least one exception is if methods are added to
> the metaclass then it's required to setMetaClass(null) since the
> ExpandoMetaClass is a strong reference on ClassInfo and EMC has a strong
> reference to the Class.  What is difficult to determine is if keeping a
> WeakReference can cause any potential issues.  Only possible problem I can
> see is if the methods of the Class A were referenced in the MetaMethodIndex
> for Class B, but I think in that case as long as the Class B was strongly
> referenced the index itself would keep Class A referenced.
>
> In environments where lots of scripts are being parsed and run and
> references to the Class are not retained, it might be worth having a way to
> programmatically initiate the cleanup so as not to have to wait for the
> Soft References to be collected.  The extra performance costs of clearing a
> few references might not be as high as consistently hitting the upper heap
> limit constantly.  It is something I have looked at for GROOVY-7646 [2].
> Parsed groovy classes should be collectible by default without any
> intervention, but there may be cases where an API to help speed the removal
> might be useful too.
>
>
> [1]
> https://github.com/apache/groovy/commit/e967039222dc01a59824f95d9313a3b2e7aa9f50
>
> [2] https://github.com/apache/groovy/pull/325
>
>
> On Mon, May 16, 2016 at 8:01 AM, Alain Stalder <as...@span.ch> wrote:
>
>> My time here is running up, other things to attend to, so here is what
>> I wrote about the current state of class loading and garbage collection
>> in Groovy in the just updated user manual of Grengine:
>>
>> https://www.grengine.ch/manual.html#the-cost-of-session-separation
>> --
>> ==== The Cost of Session Separation
>>
>> Although loading classes from bytecode obtained from compiling Groovy
>> scripts
>> is a lot less expensive than compiling them (plus afterwards also loading
>> the
>> resulting bytecode), it is still somewhat more expensive than one might
>> naively
>> expect and there are a few things to be aware of when operating that way.
>>
>> In the following, I will simply call classes compiled by the Groovy
>> compiler
>> from Groovy scripts/sources _Groovy classes_ and classes compiled by the
>> Java
>> compiler from Java sources _Java classes_.
>>
>> * *Class Loading* +
>>   Experimentally, loading of a typical Groovy class is often about 10
>> times
>>   slower than loading a Java class with similarly complex source code, but
>>   both are relatively expensive operations (of the order of a millisecond
>>   for a small Groovy class, to give a rough indication). For Java classes,
>>   this is apparently mainly expensive because some security checks have to
>>   be made on the bytecode. For Groovy classes, it is mainly expensive
>>   because some meta information is needed to later efficiently call
>> methods
>>   dynamically, and the like.
>> * *Garbage Collection* +
>>   Classes are stored in _PermGen_ (up to Java 7) resp. _Metaspace_ (Java 8
>>   and later) plus some associated data on the Heap, at least for Groovy
>>   classes the latter is normally the case (meta information). Whereas for
>>   Java classes, unused classes appear to be usually garbage collected from
>>   PermGen/Metaspace continuously, with Groovy classes this experimentally
>>   does not happen before PermGen/Metaspace or the Heap reach a configured
>>   limit. Why exactly this is so and whether it is easy to change and
>> whether
>>   it will change in the future, is difficult to answer for me, I find the
>>   code around it is rather convoluted, hard to untangle. Note that by
>> default
>>   on Java VMs there is typically no limit set for Metaspace (but there is
>>   for PermGen), so setting a limit is crucial in practice when using
>> Groovy.
>> * *Garbage Collection Bugs* +
>>   In the past, several Groovy versions had failed at garbage collecting
>>   Groovy classes and their class loaders, resulting finally in an
>>   `OutOfMemoryError` due to exhaustion of PermGen/Metaspace or the Heap,
>>   whichever limit was reached first. If when you are reading this, Groovy
>>   2.4.6 is (still) the newest version, make sure you set the system
>> property
>>   `groovy.use.classvalue=true` in the context of Grengine. Note that under
>>   different circumstances, like the one described in
>>   https://issues.apache.org/jira/browse/GROOVY-7591[GROOVY-7591:
>>   Use of ClassValue causes major memory leak] you would instead have had
>> to
>>   set it to false! That Groovy bug is actually in turn due to a bug in
>>   Oracle/OpenJDK Java VMs regarding garbage collection under some
>>   circumstances, more precisely a bug in a new feature (`ClassValue`)
>>   introduced in order to make thing easier(!) for dynamic languages in the
>>   Java VM, see
>> https://bugs.openjdk.java.net/browse/JDK-8136353[JDK-8136353].
>>
>> So, if you want to use session separation with Greninge (or otherwise want
>> to load many Groovy classes repeately), first set a limit on
>> PermGen/Metaspace,
>> then verify that classes can be garbage collected in an environment close
>> to
>> production and that throughput under load would be sufficient (despite the
>> relatively slow class loading performance of Groovy (and Java) classes in
>> the
>> Java VM) and then use it. And don't forget to repeat this at least when
>> you
>> upgrade Groovy to a new version, but possibly also when you upgrade Java.
>>
>> Or see the next section for an alternative...
>> --
>>
>> PS: By the way, very funny how Jochen Theodorou "garbage collected" what I
>> wrote about PhantomReference to a "[...]"...
>>
>> Good luck with Groovy garbage collection.
>>
>> Alain
>>
>
>
>

Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
Thanks, I had not looked at the master branch, ClassInfo source looks 
quite a bit cleaner there already :)

Regarding programmatic cleanup (GROOVY-7646), I think that is a good 
idea, but in the details there might be some obstacles.

For example this sequence of calls to GroovyShell:

def shell = new GroovyShell()
def script1 = shell.parse("42")
assert script1.class.name == "Script1"
def script2 = shell.parse("new Script1().run()")
assert script2.run() == 42

def script3 = shell.parse("99", "Nintetynine")
assert script3.class.name == "Nintetynine"

def file = new File("Fiftyfive.groovy")
file.setText("55")
def script4 = shell.parse(file)
assert script4.class.name == "Fiftyfive"

So, classes accumulate (in the GroovyClassLoader) and can be addressed 
by their names in subsequent scripts. (And for more complex script 
expressions, more than one class might be the result of compilation, 
e.g. with closures or inner classes, enums, etc.)

I would estimate that in the case where a script is run with a name 
given automatically by the GroovyShell ("Script1", "Script2", ...) it 
would be OK to do the cleanup (and I guess using GroovyShell that way 
might be a very common case?), but when it comes to explicitly named 
scripts, doing so might change behaviour of existing code.

I just took a look at GroovyScriptEngine which also has run() methods 
and, if I remember correctly, it recompiles all scripts if one of them 
changes (to get dependencies right), so in principle lots of classes to 
cleanup for each time this happens. But I am not sure if that is 
possible there, because there is also a createScript() method, so 
possibly still objects/classes that are in use around.

(And I have also just started to think about Grengine in that context, 
my open source library for using Groovy in a Java VM (and which almost 
nobody uses ;), there it might be easier to build in such automatic 
removal because the approach is more structured, although a bit less 
dynamic.)

Hmn, would really be great if there was a way to achieve constant 
garbage collection of Groovy classes.

Alain

On 16.05.16 18:18, John Wagenleitner wrote:
> Just catching up on this thread, very interesting discussion and will 
> have to give the posted test code a try.
>
> You are right about the PhantomReference and it has been removed in 
> master [1] along with the local cache that used it.  Due to some 
> refactorings that were not in 2_4_X at the time it wasn't removed from 
> that branch. But probably should be cleaned up if any fixes for the 
> memory issues to ClassInfo are merged into that branch in the near future.
>
> I think the suggestion of referencing the Class via the ClassInfo from 
> metaclasses/cachedclasses would be a good one, the less places the 
> Class is kept the better.  Unfortunately since it's a protected field 
> on MetaClassImpl that would be a breaking change would be something 
> for a 3.0 as you pointed out.
>
> So far, I have found that keeping a WeakReference to the Class in 
> ClassInfo allows it to be collected (mostly tested with non-ClassValue 
> version of ClassInfo).  At least one exception is if methods are added 
> to the metaclass then it's required to setMetaClass(null) since the 
> ExpandoMetaClass is a strong reference on ClassInfo and EMC has a 
> strong reference to the Class.  What is difficult to determine is if 
> keeping a WeakReference can cause any potential issues.  Only possible 
> problem I can see is if the methods of the Class A were referenced in 
> the MetaMethodIndex for Class B, but I think in that case as long as 
> the Class B was strongly referenced the index itself would keep Class 
> A referenced.
>
> In environments where lots of scripts are being parsed and run and 
> references to the Class are not retained, it might be worth having a 
> way to programmatically initiate the cleanup so as not to have to wait 
> for the Soft References to be collected.  The extra performance costs 
> of clearing a few references might not be as high as consistently 
> hitting the upper heap limit constantly.  It is something I have 
> looked at for GROOVY-7646 [2].  Parsed groovy classes should be 
> collectible by default without any intervention, but there may be 
> cases where an API to help speed the removal might be useful too.
>
>
> [1] 
> https://github.com/apache/groovy/commit/e967039222dc01a59824f95d9313a3b2e7aa9f50 
>
> [2] https://github.com/apache/groovy/pull/325
>
>
> On Mon, May 16, 2016 at 8:01 AM, Alain Stalder <astalder@span.ch 
> <ma...@span.ch>> wrote:
>
>     My time here is running up, other things to attend to, so here is what
>     I wrote about the current state of class loading and garbage
>     collection
>     in Groovy in the just updated user manual of Grengine:
>
>     https://www.grengine.ch/manual.html#the-cost-of-session-separation
>     --
>     ==== The Cost of Session Separation
>
>     Although loading classes from bytecode obtained from compiling
>     Groovy scripts
>     is a lot less expensive than compiling them (plus afterwards also
>     loading the
>     resulting bytecode), it is still somewhat more expensive than one
>     might naively
>     expect and there are a few things to be aware of when operating
>     that way.
>
>     In the following, I will simply call classes compiled by the
>     Groovy compiler
>     from Groovy scripts/sources _Groovy classes_ and classes compiled
>     by the Java
>     compiler from Java sources _Java classes_.
>
>     * *Class Loading* +
>       Experimentally, loading of a typical Groovy class is often about
>     10 times
>       slower than loading a Java class with similarly complex source
>     code, but
>       both are relatively expensive operations (of the order of a
>     millisecond
>       for a small Groovy class, to give a rough indication). For Java
>     classes,
>       this is apparently mainly expensive because some security checks
>     have to
>       be made on the bytecode. For Groovy classes, it is mainly expensive
>       because some meta information is needed to later efficiently
>     call methods
>       dynamically, and the like.
>     * *Garbage Collection* +
>       Classes are stored in _PermGen_ (up to Java 7) resp. _Metaspace_
>     (Java 8
>       and later) plus some associated data on the Heap, at least for
>     Groovy
>       classes the latter is normally the case (meta information).
>     Whereas for
>       Java classes, unused classes appear to be usually garbage
>     collected from
>       PermGen/Metaspace continuously, with Groovy classes this
>     experimentally
>       does not happen before PermGen/Metaspace or the Heap reach a
>     configured
>       limit. Why exactly this is so and whether it is easy to change
>     and whether
>       it will change in the future, is difficult to answer for me, I
>     find the
>       code around it is rather convoluted, hard to untangle. Note that
>     by default
>       on Java VMs there is typically no limit set for Metaspace (but
>     there is
>       for PermGen), so setting a limit is crucial in practice when
>     using Groovy.
>     * *Garbage Collection Bugs* +
>       In the past, several Groovy versions had failed at garbage
>     collecting
>       Groovy classes and their class loaders, resulting finally in an
>       `OutOfMemoryError` due to exhaustion of PermGen/Metaspace or the
>     Heap,
>       whichever limit was reached first. If when you are reading this,
>     Groovy
>       2.4.6 is (still) the newest version, make sure you set the
>     system property
>       `groovy.use.classvalue=true` in the context of Grengine. Note
>     that under
>       different circumstances, like the one described in
>     https://issues.apache.org/jira/browse/GROOVY-7591[GROOVY-7591
>     <https://issues.apache.org/jira/browse/GROOVY-7591%5BGROOVY-7591>:
>       Use of ClassValue causes major memory leak] you would instead
>     have had to
>       set it to false! That Groovy bug is actually in turn due to a bug in
>       Oracle/OpenJDK Java VMs regarding garbage collection under some
>       circumstances, more precisely a bug in a new feature (`ClassValue`)
>       introduced in order to make thing easier(!) for dynamic
>     languages in the
>       Java VM, see
>     https://bugs.openjdk.java.net/browse/JDK-8136353[JDK-8136353]
>     <https://bugs.openjdk.java.net/browse/JDK-8136353%5BJDK-8136353%5D>.
>
>     So, if you want to use session separation with Greninge (or
>     otherwise want
>     to load many Groovy classes repeately), first set a limit on
>     PermGen/Metaspace,
>     then verify that classes can be garbage collected in an
>     environment close to
>     production and that throughput under load would be sufficient
>     (despite the
>     relatively slow class loading performance of Groovy (and Java)
>     classes in the
>     Java VM) and then use it. And don't forget to repeat this at least
>     when you
>     upgrade Groovy to a new version, but possibly also when you
>     upgrade Java.
>
>     Or see the next section for an alternative...
>     --
>
>     PS: By the way, very funny how Jochen Theodorou "garbage
>     collected" what I
>     wrote about PhantomReference to a "[...]"...
>
>     Good luck with Groovy garbage collection.
>
>     Alain
>
>


Re: Improve Groovy class loading performance and memory management

Posted by John Wagenleitner <jo...@gmail.com>.
Just catching up on this thread, very interesting discussion and will have
to give the posted test code a try.

You are right about the PhantomReference and it has been removed in master
[1] along with the local cache that used it.  Due to some refactorings that
were not in 2_4_X at the time it wasn't removed from that branch.  But
probably should be cleaned up if any fixes for the memory issues to
ClassInfo are merged into that branch in the near future.

I think the suggestion of referencing the Class via the ClassInfo from
metaclasses/cachedclasses would be a good one, the less places the Class is
kept the better.  Unfortunately since it's a protected field on
MetaClassImpl that would be a breaking change would be something for a 3.0
as you pointed out.

So far, I have found that keeping a WeakReference to the Class in ClassInfo
allows it to be collected (mostly tested with non-ClassValue version of
ClassInfo).  At least one exception is if methods are added to the
metaclass then it's required to setMetaClass(null) since the
ExpandoMetaClass is a strong reference on ClassInfo and EMC has a strong
reference to the Class.  What is difficult to determine is if keeping a
WeakReference can cause any potential issues.  Only possible problem I can
see is if the methods of the Class A were referenced in the MetaMethodIndex
for Class B, but I think in that case as long as the Class B was strongly
referenced the index itself would keep Class A referenced.

In environments where lots of scripts are being parsed and run and
references to the Class are not retained, it might be worth having a way to
programmatically initiate the cleanup so as not to have to wait for the
Soft References to be collected.  The extra performance costs of clearing a
few references might not be as high as consistently hitting the upper heap
limit constantly.  It is something I have looked at for GROOVY-7646 [2].
Parsed groovy classes should be collectible by default without any
intervention, but there may be cases where an API to help speed the removal
might be useful too.


[1]
https://github.com/apache/groovy/commit/e967039222dc01a59824f95d9313a3b2e7aa9f50

[2] https://github.com/apache/groovy/pull/325


On Mon, May 16, 2016 at 8:01 AM, Alain Stalder <as...@span.ch> wrote:

> My time here is running up, other things to attend to, so here is what
> I wrote about the current state of class loading and garbage collection
> in Groovy in the just updated user manual of Grengine:
>
> https://www.grengine.ch/manual.html#the-cost-of-session-separation
> --
> ==== The Cost of Session Separation
>
> Although loading classes from bytecode obtained from compiling Groovy
> scripts
> is a lot less expensive than compiling them (plus afterwards also loading
> the
> resulting bytecode), it is still somewhat more expensive than one might
> naively
> expect and there are a few things to be aware of when operating that way.
>
> In the following, I will simply call classes compiled by the Groovy
> compiler
> from Groovy scripts/sources _Groovy classes_ and classes compiled by the
> Java
> compiler from Java sources _Java classes_.
>
> * *Class Loading* +
>   Experimentally, loading of a typical Groovy class is often about 10 times
>   slower than loading a Java class with similarly complex source code, but
>   both are relatively expensive operations (of the order of a millisecond
>   for a small Groovy class, to give a rough indication). For Java classes,
>   this is apparently mainly expensive because some security checks have to
>   be made on the bytecode. For Groovy classes, it is mainly expensive
>   because some meta information is needed to later efficiently call methods
>   dynamically, and the like.
> * *Garbage Collection* +
>   Classes are stored in _PermGen_ (up to Java 7) resp. _Metaspace_ (Java 8
>   and later) plus some associated data on the Heap, at least for Groovy
>   classes the latter is normally the case (meta information). Whereas for
>   Java classes, unused classes appear to be usually garbage collected from
>   PermGen/Metaspace continuously, with Groovy classes this experimentally
>   does not happen before PermGen/Metaspace or the Heap reach a configured
>   limit. Why exactly this is so and whether it is easy to change and
> whether
>   it will change in the future, is difficult to answer for me, I find the
>   code around it is rather convoluted, hard to untangle. Note that by
> default
>   on Java VMs there is typically no limit set for Metaspace (but there is
>   for PermGen), so setting a limit is crucial in practice when using
> Groovy.
> * *Garbage Collection Bugs* +
>   In the past, several Groovy versions had failed at garbage collecting
>   Groovy classes and their class loaders, resulting finally in an
>   `OutOfMemoryError` due to exhaustion of PermGen/Metaspace or the Heap,
>   whichever limit was reached first. If when you are reading this, Groovy
>   2.4.6 is (still) the newest version, make sure you set the system
> property
>   `groovy.use.classvalue=true` in the context of Grengine. Note that under
>   different circumstances, like the one described in
>   https://issues.apache.org/jira/browse/GROOVY-7591[GROOVY-7591:
>   Use of ClassValue causes major memory leak] you would instead have had to
>   set it to false! That Groovy bug is actually in turn due to a bug in
>   Oracle/OpenJDK Java VMs regarding garbage collection under some
>   circumstances, more precisely a bug in a new feature (`ClassValue`)
>   introduced in order to make thing easier(!) for dynamic languages in the
>   Java VM, see
> https://bugs.openjdk.java.net/browse/JDK-8136353[JDK-8136353].
>
> So, if you want to use session separation with Greninge (or otherwise want
> to load many Groovy classes repeately), first set a limit on
> PermGen/Metaspace,
> then verify that classes can be garbage collected in an environment close
> to
> production and that throughput under load would be sufficient (despite the
> relatively slow class loading performance of Groovy (and Java) classes in
> the
> Java VM) and then use it. And don't forget to repeat this at least when you
> upgrade Groovy to a new version, but possibly also when you upgrade Java.
>
> Or see the next section for an alternative...
> --
>
> PS: By the way, very funny how Jochen Theodorou "garbage collected" what I
> wrote about PhantomReference to a "[...]"...
>
> Good luck with Groovy garbage collection.
>
> Alain
>

Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
My time here is running up, other things to attend to, so here is what
I wrote about the current state of class loading and garbage collection
in Groovy in the just updated user manual of Grengine:

https://www.grengine.ch/manual.html#the-cost-of-session-separation
--
==== The Cost of Session Separation

Although loading classes from bytecode obtained from compiling Groovy 
scripts
is a lot less expensive than compiling them (plus afterwards also 
loading the
resulting bytecode), it is still somewhat more expensive than one might 
naively
expect and there are a few things to be aware of when operating that way.

In the following, I will simply call classes compiled by the Groovy compiler
from Groovy scripts/sources _Groovy classes_ and classes compiled by the 
Java
compiler from Java sources _Java classes_.

* *Class Loading* +
   Experimentally, loading of a typical Groovy class is often about 10 times
   slower than loading a Java class with similarly complex source code, but
   both are relatively expensive operations (of the order of a millisecond
   for a small Groovy class, to give a rough indication). For Java classes,
   this is apparently mainly expensive because some security checks have to
   be made on the bytecode. For Groovy classes, it is mainly expensive
   because some meta information is needed to later efficiently call methods
   dynamically, and the like.
* *Garbage Collection* +
   Classes are stored in _PermGen_ (up to Java 7) resp. _Metaspace_ (Java 8
   and later) plus some associated data on the Heap, at least for Groovy
   classes the latter is normally the case (meta information). Whereas for
   Java classes, unused classes appear to be usually garbage collected from
   PermGen/Metaspace continuously, with Groovy classes this experimentally
   does not happen before PermGen/Metaspace or the Heap reach a configured
   limit. Why exactly this is so and whether it is easy to change and 
whether
   it will change in the future, is difficult to answer for me, I find the
   code around it is rather convoluted, hard to untangle. Note that by 
default
   on Java VMs there is typically no limit set for Metaspace (but there is
   for PermGen), so setting a limit is crucial in practice when using 
Groovy.
* *Garbage Collection Bugs* +
   In the past, several Groovy versions had failed at garbage collecting
   Groovy classes and their class loaders, resulting finally in an
   `OutOfMemoryError` due to exhaustion of PermGen/Metaspace or the Heap,
   whichever limit was reached first. If when you are reading this, Groovy
   2.4.6 is (still) the newest version, make sure you set the system 
property
   `groovy.use.classvalue=true` in the context of Grengine. Note that under
   different circumstances, like the one described in
   https://issues.apache.org/jira/browse/GROOVY-7591[GROOVY-7591:
   Use of ClassValue causes major memory leak] you would instead have had to
   set it to false! That Groovy bug is actually in turn due to a bug in
   Oracle/OpenJDK Java VMs regarding garbage collection under some
   circumstances, more precisely a bug in a new feature (`ClassValue`)
   introduced in order to make thing easier(!) for dynamic languages in the
   Java VM, see 
https://bugs.openjdk.java.net/browse/JDK-8136353[JDK-8136353].

So, if you want to use session separation with Greninge (or otherwise want
to load many Groovy classes repeately), first set a limit on 
PermGen/Metaspace,
then verify that classes can be garbage collected in an environment close to
production and that throughput under load would be sufficient (despite the
relatively slow class loading performance of Groovy (and Java) classes 
in the
Java VM) and then use it. And don't forget to repeat this at least when you
upgrade Groovy to a new version, but possibly also when you upgrade Java.

Or see the next section for an alternative...
--

PS: By the way, very funny how Jochen Theodorou "garbage collected" what I
wrote about PhantomReference to a "[...]"...

Good luck with Groovy garbage collection.

Alain

Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
Thanks, that clarifies a lot to me, especially SoftReference.

So with Groovy it is only realistic to have GC of classes (and attached 
ClassInfo) kick in once a limit on Metaspace/PermGen (or Heap) is 
reached - fine with me, no point to try to "outrun the bear"... :)

A general question (current implementation and most likely APIs to keep 
aside): Why does ClassInfo need a reference to the class? To me the use 
case would be that you have an Groovy object or a Groovy class and want 
to do something with it (call a static or instance method, for example), 
so you only need to find ClassInfo from the class and then maybe pass 
the class temporarily just for doing things, but don't need it a 
reference back from ClassInfo.

I presume I am simply overlooking something rather obvious, but what 
would be the use case(s)? Or maybe historical reasons?

I have updated ClassGCTester ("version" 2.0.0) such that the class path 
for the URLClassLoader can be specified in more detail.

https://github.com/jexler/classgc
--
Usage: java [java-args] ClassGCTester -cp <class-path> -parent 
[tester|null] -classes <classes> [-wait]

   -cp:      Class path for URLClassLoader that will load the test classes.
             Directories and JARs separated by ':' or ';'.
             Example: .:classes/:libs/groovy-2.4.6.jar
   -parent:  Parent class loader for URLClassLoader that will load the 
test classes:
             'tester': ClassGCTester.class.getClassLoader() - i.e. same 
as tester
             'null':   null - i.e. no parent
   -classes: Test classes to load from the URLClassLoader in a loop.
             Fully qualified class names separated by ':' or ';'.
             Example: SampleInDefaultPackage:net.sample.Sample
   -wait:    Optional, whether to wait for key pressed before starting 
the test.
             Allows to attach external tools from the start.
             (Like 'jvisualvm' or 'jstat -gc <interval-ms> 
<number-of-iterations>' etc.)
             Note that this tool usually prints out its PID for convenience.
--

This allows, for example, two produce two of the known 
"OutOfMemoryError: Metaspace|PermGen" issues with Groovy 2.4.6, as follows.

Directory Structure:
- GroovyGCTester.class
- groovy-2.4.6.jar
- filling/GroovyFilling.class ("42" compiled, for example, but could be 
practically any Groovy class)

If you load Groovy only once with the class loader of ClassGCTester, and 
set groovy.use.classvalue=false (the default) you get the error:

$ java -XX:MaxMetaspaceSize=64m -Xmx512m -Dgroovy.use.classvalue=false 
-cp .:groovy-2.4.6.jar ClassGCTester -cp filling/ -parent tester 
-classes GroovyFilling
--
Secs Test classes              Metaspace/PermGen Heap   Load time Create 
time
        #loaded  #remaining        used committed       used 
committed     average     average
[...]
   11      8107        8107      38.3m      55.1m     293.8m 455.5m     
0.433ms     0.909ms
   12      8579        8579      40.1m      57.9m     298.0m 455.5m     
0.581ms     0.909ms
   13      8665        8665      40.4m      58.5m     309.5m 455.5m     
0.577ms     0.908ms
   14      9423        9423      43.3m      63.0m     327.0m 455.5m     
0.545ms     0.940ms
Exception in thread "main" java.lang.OutOfMemoryError: Metaspace
[...]
--

With true GC kicks in at the limit of 64m:
--
Secs Test classes              Metaspace/PermGen Heap   Load time Create 
time
        #loaded  #remaining        used committed       used 
committed     average     average
[...]
   12      8579        8579      40.0m      57.9m     300.7m 455.5m     
0.485ms     0.915ms
   13      9424        9424      43.3m      62.8m     330.0m 455.5m     
0.494ms     0.907ms
   14      9670          19       7.3m      30.0m      21.0m 367.0m     
0.527ms     0.908ms
   15     10899        1248      12.0m      31.3m     103.2m 367.0m     
0.484ms     0.880ms
[...]
--

If you instead load Groovy with the same URLClassLoader that loads 
GroovyFilling each time, it passes with groovy.use.classvalue=false (the 
default):

$ java -XX:MaxMetaspaceSize=64m -Xmx512m -Dgroovy.use.classvalue=false 
-cp . ClassGCTester -cp groovy-2.4.6.jar:filling/ -parent null -classes 
GroovyFilling
--
Secs Test classes              Metaspace/PermGen Heap   Load time Create 
time
        #loaded  #remaining        used committed       used 
committed     average     average
[...]
    1         9           9      22.4m      23.1m      26.9m 242.0m     
1.622ms   112.702ms
    2        18          18      38.1m      39.4m      47.9m 308.0m     
1.419ms   112.346ms
    3        27          27      53.7m      55.4m     112.4m 308.0m     
1.367ms   111.185ms
    4        36           5      14.0m      28.0m      41.5m 457.0m     
1.315ms   111.633ms
    5        47          16      33.0m      37.4m      49.0m 474.5m     
1.242ms   106.555ms
[...]
--

Whereas you get the error with true:
--
[...]
Secs Test classes              Metaspace/PermGen Heap   Load time Create 
time
        #loaded  #remaining        used committed       used 
committed     average     average
    0         1           1       8.0m       8.5m      17.2m 245.5m     
2.158ms   119.715ms
    1         9           9      22.3m      23.0m      24.2m 239.5m     
1.596ms   109.466ms
    2        18          18      37.8m      39.3m      46.2m 302.0m     
1.371ms   109.369ms
    3        27          27      53.3m      55.3m     109.8m 302.0m     
1.388ms   109.633ms
Exception in thread "main"
Exception: java.lang.OutOfMemoryError thrown from the 
UncaughtExceptionHandler in thread "main"
--

For completeness, ClassGCTester was compiled with Java 6 for Java 6, 
GroovyFiller with Groovy 2.4.6 (on Java 8, to Java 5) and the tests were 
run with Java 8 (Oracle JDK, Mac).

Alain













Re: Improve Groovy class loading performance and memory management

Posted by Jochen Theodorou <bl...@gmx.org>.
On 14.05.2016 09:54, Alain Stalder wrote:
[...]
> Lazy initialization of MetaClass:
>
> I could well imagine that this makes a noticeable difference, but you
> are probably much better able to estimate this offhand.
>
> If it made a noticeable difference, I guess it would also impact heap
> consumption due to loaded Groovy classes (less MetaClass instances) to a
> similar degree.

less meta classes is always good.

> Maybe measure in some way for how many and which classes MetaClass is
> actually ever needed/used? Maybe a modified Groovy test version that
> somehow records this and then use this Groovy test version in some
> "realistic" setups where lots of Groovy classes are loaded and
> instantiated? (Just thinking aloud...)

well... as soon as you access a property or call a method, you will most 
likely need a meta class of that class. What could be profit are small 
scripts. In case of Closure you will loose that already, because even if 
it does only return something like 42 and thus does not need to make a 
dynamic call, the call to the method containing that code is called 
dynamically, thus you need the meta class. On the other hand we have 
here ClosureMetaClass, with a smaller footprint exactly for that.

> Garbage collection without setting a limit on Metaspace:
>
> For the simple Java test class JavaFilling, the VM collected unused
> classes without having to set MaxMetaspaceSize, in the case of
> GroovyFilling this was not the case.

because of the use of SoftReference. The problem with using 
WeakReference for example is, that then you will get lots of meta class 
initializations, which can be a big impact on performance. ClassValue is 
supposed to make this better though

> I lack experience with garbage collection of "pure" Java classes, so I
> am not sure if the behavior observed with Java is only like this for
> very simple classes with little dependencies on other loaded classes. If
> that was so, I guess there would be very little that could be gained
> from trying to change Groovy behavior.

Well, in Java the classes are not referenced but structures as complex 
as we have to use like the table/classinfo/metaclass system. ClassValue 
is suppose to help here.

> One approach could be to look closer into that, maybe run ClassGCTester
> with some "pure Java" library JARs and load classes from there and
> observe Metaspace?

frankly, most Java applications do not even use more than one 
classloader, thus class unloading is of no  question there. That goes 
even as far as not being able to use the standard garbage collector if 
you want classes being unloaded properly. Well, this changed with JDK8. 
Even a JSP web app, does use a very limited amount of classes (but of 
course more class loaders than a standard Java app).

> Or maybe approach it the other way: Use a modified Groovy test version
> that *does* use a WeakHashMap, just to see if that would make a
> difference here and if investing more effort into that direction could
> amount to anything?

WeakHashMap cannot handle concurrency. Each and every access would have 
to be synchronized. In a simple application with GUI you can already 
have more than one thread operating potentially causing the creation of 
a meta class.

bye Jochen

Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
After reading a few times, some more feedback...

CompileStatic:

Cool that this thread (and maybe ClassGCTester?) has already yielded an 
issue that impacts performance (and I presume might likely be relatively 
straightforward to fix)  :)

Lazy initialization of MetaClass:

I could well imagine that this makes a noticeable difference, but you 
are probably much better able to estimate this offhand.

If it made a noticeable difference, I guess it would also impact heap 
consumption due to loaded Groovy classes (less MetaClass instances) to a 
similar degree.

Maybe measure in some way for how many and which classes MetaClass is 
actually ever needed/used? Maybe a modified Groovy test version that 
somehow records this and then use this Groovy test version in some 
"realistic" setups where lots of Groovy classes are loaded and 
instantiated? (Just thinking aloud...)

Garbage collection without setting a limit on Metaspace:

For the simple Java test class JavaFilling, the VM collected unused 
classes without having to set MaxMetaspaceSize, in the case of 
GroovyFilling this was not the case.

I lack experience with garbage collection of "pure" Java classes, so I 
am not sure if the behavior observed with Java is only like this for 
very simple classes with little dependencies on other loaded classes. If 
that was so, I guess there would be very little that could be gained 
from trying to change Groovy behavior.

One approach could be to look closer into that, maybe run ClassGCTester 
with some "pure Java" library JARs and load classes from there and 
observe Metaspace?

Or maybe approach it the other way: Use a modified Groovy test version 
that *does* use a WeakHashMap, just to see if that would make a 
difference here and if investing more effort into that direction could 
amount to anything? I am not familiar enough with the implementation to 
know if such a test change would be trivial or not, but interested 
enough to take at least a closer look, and I guess also to try to build 
Groovy myself for the first time, because this could be handy sometime 
anyway...

Alain

On 13.05.16 13:03, Jochen Theodorou wrote:
> On 13.05.2016 02:22, Alain Stalder wrote:
> [...]
>> Qualitatively this often has the following result in the Java VM:
>> Metaspace resp. PermGen, and Heap in parallel, just grow until a
>> configured limit is reached (and note that there is none by default for
>> Metaspace in Java 8 and later), often only then is it garbage collected.
>> With Java classes, at least with simple ones, this looks often
>> different, those appear to be garbage collected much more quickly.
>>
>> Another qualitative difference is that loading a Groovy class and
>> instantiating it seems typically to be considerably slower than
>> instantiating a Java class with similar functionality, even quite
>> drastically so, more than one would expect even considering the need to
>> create metadata for dynamic function calls etc.
>>
>> At least that has been my experience over the past few years.
>
> this is going to be a long mail ;)
>
> so let us make three things to discuss here:
>
> 1) Object initialization performance
> 2) class verification/complexity
> 3) garbage collection of unused classes
>
> And we have to distinguish here between usages of ClassValue, 
> invokedynamic and the traditional version of those... that makes 4 
> aspects.
>
> So I will write several things you probably already know, but other 
> reading here might not. And even though I simplify a bit, please bear 
> with me ;) And first of all, let us talk about the meta class 
> system... that mostly targets ClassValue then.
>
> The old version of the meta class system uses a big global table for 
> all meta classes, with a class key and ClassInfo as value. In 
> ClassInfo we have the meta class, which might be either recreate-able 
> (in which case the meta class is soft referenced) or not (in which 
> case it is a strong reference). The idea being, that as soon as the 
> class can be garbage collected, the ClassInfo can as well, and with it 
> the meta class.
>
> Problem 1 here is, the meta class holds a strong reference to the 
> class, so if the ClassInfo holds a strong reference to the meta class, 
> this entry in the table will never be collected. I mention this only 
> for completeness, since you did not set a permanent meta class in your 
> test
>
> Problem 2 here is, the code is concurrent, which rules out WeakHashMap 
> and forced us to implement our own map.
>
> In the ClassValue version we do not have our own table anymore and let 
> the JVM manage this.
>
> To avoid the lookup cost of the meta class, every Groovy class has a 
> reference to its ClassInfo. The meta class and ClassInfo are lazy 
> initialized.... well, "populated with actual data" in case of ClassInfo.
>
> Some classes extend GroovyObjectSupport, which does the initialization 
> in the constructor already, groovy.lang.Script is one of them. That 
> means every time you create an instance of a new script class, you get 
> the meta class already, even if the meta class is not used. let us 
> have a small look at the bytecode of the constructors (x.groovy which 
> only returns 42) of such a script:
>
>>   // access flags 0x1
>>   public <init>()V
>>     ALOAD 0
>>     INVOKESPECIAL groovy/lang/Script.<init> ()V
>>    L0
>>     INVOKESTATIC x.$getCallSiteArray 
>> ()[Lorg/codehaus/groovy/runtime/callsite/CallSite;
>>     ASTORE 1
>>    L1
>>     RETURN
>>
>>   // access flags 0x1
>>   public <init>(Lgroovy/lang/Binding;)V
>>    L0
>>     INVOKESTATIC x.$getCallSiteArray 
>> ()[Lorg/codehaus/groovy/runtime/callsite/CallSite;
>>     ASTORE 2
>>     ALOAD 0
>>     ALOAD 1
>>     INVOKESPECIAL groovy/lang/Script.<init> (Lgroovy/lang/Binding;)V
>>    L1
>>     RETURN
>
> as you can see here, there is a getCallSiteArray call in here and 
> direct methods calls for initialization. The getCallSiteArray call in 
> here is actually surplus, but it is difficult to decide in the 
> compiler early on if we will need it or not, because the callsite 
> array here is basically an array wrapper, which supplies method names 
> for method calls as well as specialized code for doing dynamic method 
> calls. Would the constructor for example contain a method call to a 
> method foo(), you would see some thing like getting the CallSite and 
> executing "call" on it.
>
> Why are we doing static method calls in the constructor here? Because 
> in several cases the compiler optimizes the dynamic call away here. 
> Basically you cannot provide a super constructor in Groovy, which 
> means only the one statically defined does count. And as long as the 
> given types are matching enough for the compiler to decide, we can 
> create direct method calls.
>
> So what does this mean for the object initialization performance in 
> cases of Scripts so far? We eagerly initialize ClassInfo and MetaClass 
> for each script. That means a lookup by reflection of the complete 
> structure of the class and its super classes... which is cached, so 
> the super class lookup will be much lower the next time. initializing 
> the first meta class will also initialize the extension method lookup, 
> which can also take a bit time... again, a one-time cost here.
>
> The invokedynamic version will again use the jdk internals for the 
> callsites, thus $getCallSiteArray is never called nor are the methods 
> around this created. This saves bytecode size (faster verification), 
> and in fact the two mentioned constructors above would look just like 
> they would in Java. This of course still implies the meta class system 
> init.
>
> Now... the actual code of the script is to be found in the run method, 
> which looks like this in old groovy:
>
>>   // access flags 0x1
>>   public run()Ljava/lang/Object;
>>    L0
>>     INVOKESTATIC x.$getCallSiteArray 
>> ()[Lorg/codehaus/groovy/runtime/callsite/CallSite;
>>     ASTORE 1
>>    L1
>>     LINENUMBER 1 L1
>>     BIPUSH 42
>>     INVOKESTATIC java/lang/Integer.valueOf (I)Ljava/lang/Integer;
>>     ARETURN
>>    L2
>>     ACONST_NULL
>>     ARETURN
>
> again we see the call to get the callsite array. But otherwise the 
> method should like this in Java too. Since we are not calling any 
> actual methods and just return 42, we have again needless overhead in 
> this. Accordingly the invokedynamic version looks like this:
>
>>   // access flags 0x1
>>   public run()Ljava/lang/Object;
>>    L0
>>     LINENUMBER 1 L0
>>     BIPUSH 42
>>     INVOKESTATIC java/lang/Integer.valueOf (I)Ljava/lang/Integer;
>>     ARETURN
>>    L1
>>    FRAME FULL [] [java/lang/Throwable]
>>     NOP
>>     ATHROW
>
> Frames appear in this one, because the old groovy bytecode is java5 
> compatible, while invokedynamic requires java7 and thus the new faster 
> verifier with frames. The frames basically support the verifier to 
> make it faster.... I never tested if that is actually the case.
>
> To sum it up a little so far.... the invokedynamic version uses less 
> bytecode and the new verifier 1685 vs 2140, both could execute that 
> script without ever using the meta class system, if run is called 
> directly and if not Script would initialize the meta class eagerly. Of 
> course the later is for nil as soon as you have some method calls in 
> there, which are not static.
>
> Since we now have some basics, we should start talking about your test 
> case.
>
> First thing... I don�t think you load the class new every time. (Taken 
> from my machine, with java8u25 and Groovy 2.4.6)
>>    0         1           1      10.7m 11.4m        14.4m    
>> 240.0m     0.105ms     0.061ms
>>    1    760081           1      11.4m      11.8m        28.3m 
>> 246.5m     0.000ms     0.000ms
>
> I see here the first time class loading with 105 and a creation time 
> with 61. This class loading time has to appear every time and must 
> never be 0. But that is the case. So even if you create a new 
> URLClassLoader without defining a root, it still has the system class 
> loader as parent. Providing null as parent class loader fixes that 
> problem. And then I got a more interesting:
>
>>    0         1           1      10.6m 11.4m        14.4m    
>> 240.0m     0.213ms     0.063ms
>>    1     11519        2227      16.2m      35.9m        24.2m 
>> 246.0m     0.060ms     0.022ms
>>    2     27103        4055      19.4m      46.8m        40.7m 
>> 229.0m     0.051ms     0.019ms
>>    3     42465           1      12.2m      34.4m         4.5m 
>> 235.0m     0.049ms     0.019ms
>>    4     57606         580      13.3m      34.4m         9.0m 
>> 236.0m     0.048ms     0.018ms
>>    5     73506        1918      15.8m      39.1m        21.3m 
>> 236.0m     0.047ms     0.018ms
>>    6     89212        2482      16.8m      43.0m        25.8m 
>> 212.5m     0.047ms     0.018ms
>>    7    104578           1      12.4m      41.7m         4.0m 
>> 232.0m     0.046ms     0.018ms
>
> actually this was with java5 compatible code... with java8 standard I 
> get something like this:
>
>>    0         1           1      10.6m 11.4m        14.4m    
>> 240.0m     0.214ms     0.029ms
>>    1     12117        2861      17.3m      38.4m        30.4m 
>> 246.0m     0.062ms     0.016ms
>>    2     29143        1303      14.6m      36.6m        15.8m 
>> 236.0m     0.052ms     0.014ms
>>    3     46633        4267      19.8m      48.3m        43.2m 
>> 235.0m     0.048ms     0.013ms
>>    4     62634         900      14.0m      35.2m        11.4m 
>> 236.0m     0.048ms     0.013ms
>>    5     80618        4358      20.1m      48.7m        43.5m 
>> 235.5m     0.046ms     0.013ms
>>    6     97964        2336      16.6m      41.0m        25.3m 
>> 236.5m     0.046ms     0.012ms
>
> This makes me actually wondering a bit as of why the creation time 
> goes down, but not the load time. And after some thinking things 
> become clear to me...when you load a class using loadClass, the class 
> is not verified immediately. This happens lazy on actual static 
> initialization or once you ask the class for its structure... like 
> when using getMethods. So what load time tests here is purely the 
> class loader performance, nothing else. A difference in load time 
> would thus indicate solely that more or less classes are used. This 
> also means the creation time here includes the verification time of 
> the class, which I guess amounts to the .006 in difference between the 
> two.
>
> Now lets get further and change the simple java class with a run 
> method to extend Groovy�s script class.. well this requires further 
> changes, since now of course giving the groovy jar on the classpath 
> will no longer enable the spawned URLClassloader to load groovy 
> classes. Anyway... after fixing that:
>
>>    0         1           1      13.4m 13.8m        24.0m    
>> 240.0m     1.290ms    90.612ms
>>    1        12          12      39.3m      40.1m        64.3m 
>> 190.0m     0.985ms    82.673ms
>>    2        23          23      65.0m      66.5m        56.0m 
>> 246.0m     1.024ms    88.072ms
>>    3        34           5      29.3m      46.8m        27.6m 
>> 230.5m     0.976ms    88.245ms
>>    4        48          19      54.8m      64.0m        39.2m 
>> 228.0m     0.929ms    83.359ms
>>    5        60           1      19.6m      51.3m        13.2m 
>> 228.0m     0.892ms    84.574ms
>>    6        73          14      42.6m      55.8m        36.8m 
>> 228.0m     0.859ms    82.042ms
> ...
>>   46       593           1      21.8m 62.2m         7.9m    
>> 220.5m     0.783ms    76.878ms
>>   47       607          15      47.0m      62.3m        41.8m 
>> 219.0m     0.787ms    76.695ms
>>   48       621          29      78.3m      82.5m        44.5m 
>> 220.5m     0.782ms    76.563ms
>
> nothing much more will happen. I guess having a fast SSD is paying off 
> here, creation times would have been 1s+ otherwise. Anyway... as you 
> can see by the load time, we load a ton more classes now and creation 
> is much more expensive, since we now load part of the meta class 
> system. Without any further changes, this would be the best Groovy can 
> achieve without actually changing the way scripts are done. But as you 
> can see as well, the memory behaviour is actually quite well. I would 
> have liked to test the ClassValue version, but it crashes almost 
> immediately after my changes.
>
> Next step is then to use the simple "42" script:
> ...
>>   46       596          17      52.0m 63.7m        36.9m    
>> 219.5m     0.840ms    76.407ms
>>   47       609           1      21.7m      60.7m        11.5m 
>> 223.0m     0.835ms    76.536ms
>>   48       621          13      43.0m      60.7m        27.0m 
>> 223.0m     0.830ms    76.434ms
>
> we now load even more classes like CallSite and such, but creation 
> times are about the same meaning the overhead here is almost zero in 
> the end. Unsurprisingly the invokedynamic version behaves about the 
> same. Maybe the load times are a bit less good... which indicates more 
> classes being loaded internally in the JDK code to support invokedynamic.
>
> Now a further step... I can change the script from being "42" to:
>
> public class GroovyFilling {
>   public Object run() {
>     return 42;
>   }
> }
>
>
> so basically the same as the java version... This means Script will no 
> longer be used here!
>
>>   46       638          28      74.9m 79.6m        50.7m    
>> 227.5m     0.626ms    71.473ms
>>   47       650          10      35.4m      60.3m        41.1m 
>> 227.5m     0.631ms    71.630ms
>>   48       665          25      68.4m      74.6m        47.4m 
>> 227.5m     0.626ms    71.512ms
>
> well... we seem to load less classes, but we still get quite high 
> creation times. looking at the bytecode I find this (for the indy 
> version, but that does not really matter):
>
>>   // access flags 0x1
>>   public <init>()V
>>     ALOAD 0
>>     INVOKESPECIAL java/lang/Object.<init> ()V
>>    L0
>>     ALOAD 0
>>     INVOKEVIRTUAL GroovyFilling.$getStaticMetaClass 
>> ()Lgroovy/lang/MetaClass;
>>     ASTORE 1
>>     ALOAD 1
>>     ALOAD 0
>>     SWAP
>>     PUTFIELD GroovyFilling.metaClass : Lgroovy/lang/MetaClass;
>>     ALOAD 1
>>     POP
>>    L1
>>     RETURN
>
> So we actually do an eager meta class creation here as well. The worst 
> part here is... we get this even with static compilation... which 
> absolutely is a bug.
>
> So to improve this test here I guess trying to get rid of 
> $getStaticMetaClass and making Script (maybe even GroovyObjectSupport) 
> init meta classes lazy would be a gain. In the times of pre Groovy 1.8 
> this would not have amassed to much gain, since we did dynamic calls 
> all other the place. But since then we managed to reduce those calls 
> quite a lot, so I see an actual potential to reduce creation times for 
> simple cases. Of course in the more complex ones or in case in which 
> the super class constructor cannot be chosen statically by the 
> compiler, all bets are off again.
>
> bye Jochen
>
> .
>


Re: Improve Groovy class loading performance and memory management

Posted by Alain Stalder <as...@span.ch>.
First about the bug that classes were loaded only once: This did not 
happen in my case for the GroovyFilling because I used different 
directories for ClassGCTester and GroovyFilling, so the parent of the 
URLClassLoader could not load the test class, but it happened 
(unintentionally) in my tests for JavaFilling which is why I though that 
loading Java classes was so incredibly fast... :(

With that fixed, I got much more reasonable output, about 
0.1ms+0.02ms=0.12ms for loading JavaFilling and about 0.4ms + 0.9ms = 
1.3ms for the original GroovyFilling. So, there is not all that much of 
room for improvement, at least a lot less than I feared, considering 
that Groovy classes do have more capabilities.

$ java -cp . ClassGCTester filling/ JavaFilling

Secs Test classes              Metaspace/PermGen Heap   Load time Create 
time
        #loaded  #remaining        used committed       used 
committed     average     average
    0         1           1       2.9m       4.8m       2.6m 245.5m     
0.812ms     0.098ms
    1      5577        2803       7.4m      21.3m      37.9m 177.5m     
0.136ms     0.028ms
    2     13610        3437       8.2m      25.9m      45.6m 361.0m     
0.114ms     0.024ms
    3     21579        2836       7.4m      24.0m      38.3m 513.5m     
0.109ms     0.023ms
    4     29582        2269       6.7m      21.8m      31.6m 733.5m     
0.106ms     0.022ms
    5     37792        1909       6.3m      20.5m      28.7m 990.5m     
0.104ms     0.022ms
    6     45894        1441       5.7m      18.6m      22.3m 1257.0m     
0.103ms     0.022ms
    7     54049        1026       5.2m      17.6m      19.1m 1575.5m     
0.102ms     0.021ms

$ java -XX:MaxMetaspaceSize=64m -Xmx512m -Dgroovy.use.classvalue=true 
-cp .:groovy-2.4.6.jar ClassGCTester filling/ GroovyFilling

Secs Test classes              Metaspace/PermGen Heap   Load time Create 
time
        #loaded  #remaining        used committed       used 
committed     average     average
    0         1           1       6.3m       6.5m      13.4m 245.5m     
0.875ms    11.525ms
    1       474         474       9.0m      10.3m      36.5m 245.5m     
0.350ms     1.694ms
    2      1312        1312      12.2m      15.1m     103.7m 309.5m     
0.265ms     1.224ms
    3      2291        2291      16.0m      21.0m      87.4m 389.0m     
0.417ms     1.034ms
    4      2977        2977      18.6m      25.1m     178.4m 389.0m     
0.360ms     0.962ms
    5      4065        4065      22.8m      31.5m     216.0m 389.0m     
0.307ms     0.905ms
    6      4641        4641      25.0m      34.9m     164.0m 455.5m     
0.444ms     0.892ms
    7      5314        5314      27.6m      38.8m     213.2m 455.5m     
0.412ms     0.888ms

What I still observed was that for loading JavaFilling, Metaspace does 
not grow indefinitely even without a limit (see above), but it does for 
GroovyFilling, which I can also understand. Would be nice if it was 
possible that Groovy classes were also collected so quickly, but in 
practice I guess once you know that, you just have to set a reasonable 
maximum for Metaspace, and when you operate a "server environment" you 
have to take care of these kinds of things anyway, I would say.

So, essentially I am quite happy with the results and with how Groovy 
fares :)

I have made a "release" 1.1.0 of ClassGCTester with an added check that 
the test class cannot be loaded from the classpath of ClassGCTester 
alone and with a fix for the display of Metaspace/PermGen (this matches 
now roughly the output of "jstat -gc ..." for MC and MU, resp. the 
equivalents for PermGen), plus updated the readme and examples.

So far for the "broad picture", some feedback regarding your detailed 
analysis hopefully a bit later, I think I will first read it again a few 
more times - very interesting... :)

Alain


On 13.05.16 13:03, Jochen Theodorou wrote:
> On 13.05.2016 02:22, Alain Stalder wrote:
> [...]
>> Qualitatively this often has the following result in the Java VM:
>> Metaspace resp. PermGen, and Heap in parallel, just grow until a
>> configured limit is reached (and note that there is none by default for
>> Metaspace in Java 8 and later), often only then is it garbage collected.
>> With Java classes, at least with simple ones, this looks often
>> different, those appear to be garbage collected much more quickly.
>>
>> Another qualitative difference is that loading a Groovy class and
>> instantiating it seems typically to be considerably slower than
>> instantiating a Java class with similar functionality, even quite
>> drastically so, more than one would expect even considering the need to
>> create metadata for dynamic function calls etc.
>>
>> At least that has been my experience over the past few years.
>
> this is going to be a long mail ;)
>
> so let us make three things to discuss here:
>
> 1) Object initialization performance
> 2) class verification/complexity
> 3) garbage collection of unused classes
>
> And we have to distinguish here between usages of ClassValue, 
> invokedynamic and the traditional version of those... that makes 4 
> aspects.
>
> So I will write several things you probably already know, but other 
> reading here might not. And even though I simplify a bit, please bear 
> with me ;) And first of all, let us talk about the meta class 
> system... that mostly targets ClassValue then.
>
> The old version of the meta class system uses a big global table for 
> all meta classes, with a class key and ClassInfo as value. In 
> ClassInfo we have the meta class, which might be either recreate-able 
> (in which case the meta class is soft referenced) or not (in which 
> case it is a strong reference). The idea being, that as soon as the 
> class can be garbage collected, the ClassInfo can as well, and with it 
> the meta class.
>
> Problem 1 here is, the meta class holds a strong reference to the 
> class, so if the ClassInfo holds a strong reference to the meta class, 
> this entry in the table will never be collected. I mention this only 
> for completeness, since you did not set a permanent meta class in your 
> test
>
> Problem 2 here is, the code is concurrent, which rules out WeakHashMap 
> and forced us to implement our own map.
>
> In the ClassValue version we do not have our own table anymore and let 
> the JVM manage this.
>
> To avoid the lookup cost of the meta class, every Groovy class has a 
> reference to its ClassInfo. The meta class and ClassInfo are lazy 
> initialized.... well, "populated with actual data" in case of ClassInfo.
>
> Some classes extend GroovyObjectSupport, which does the initialization 
> in the constructor already, groovy.lang.Script is one of them. That 
> means every time you create an instance of a new script class, you get 
> the meta class already, even if the meta class is not used. let us 
> have a small look at the bytecode of the constructors (x.groovy which 
> only returns 42) of such a script:
>
>>   // access flags 0x1
>>   public <init>()V
>>     ALOAD 0
>>     INVOKESPECIAL groovy/lang/Script.<init> ()V
>>    L0
>>     INVOKESTATIC x.$getCallSiteArray 
>> ()[Lorg/codehaus/groovy/runtime/callsite/CallSite;
>>     ASTORE 1
>>    L1
>>     RETURN
>>
>>   // access flags 0x1
>>   public <init>(Lgroovy/lang/Binding;)V
>>    L0
>>     INVOKESTATIC x.$getCallSiteArray 
>> ()[Lorg/codehaus/groovy/runtime/callsite/CallSite;
>>     ASTORE 2
>>     ALOAD 0
>>     ALOAD 1
>>     INVOKESPECIAL groovy/lang/Script.<init> (Lgroovy/lang/Binding;)V
>>    L1
>>     RETURN
>
> as you can see here, there is a getCallSiteArray call in here and 
> direct methods calls for initialization. The getCallSiteArray call in 
> here is actually surplus, but it is difficult to decide in the 
> compiler early on if we will need it or not, because the callsite 
> array here is basically an array wrapper, which supplies method names 
> for method calls as well as specialized code for doing dynamic method 
> calls. Would the constructor for example contain a method call to a 
> method foo(), you would see some thing like getting the CallSite and 
> executing "call" on it.
>
> Why are we doing static method calls in the constructor here? Because 
> in several cases the compiler optimizes the dynamic call away here. 
> Basically you cannot provide a super constructor in Groovy, which 
> means only the one statically defined does count. And as long as the 
> given types are matching enough for the compiler to decide, we can 
> create direct method calls.
>
> So what does this mean for the object initialization performance in 
> cases of Scripts so far? We eagerly initialize ClassInfo and MetaClass 
> for each script. That means a lookup by reflection of the complete 
> structure of the class and its super classes... which is cached, so 
> the super class lookup will be much lower the next time. initializing 
> the first meta class will also initialize the extension method lookup, 
> which can also take a bit time... again, a one-time cost here.
>
> The invokedynamic version will again use the jdk internals for the 
> callsites, thus $getCallSiteArray is never called nor are the methods 
> around this created. This saves bytecode size (faster verification), 
> and in fact the two mentioned constructors above would look just like 
> they would in Java. This of course still implies the meta class system 
> init.
>
> Now... the actual code of the script is to be found in the run method, 
> which looks like this in old groovy:
>
>>   // access flags 0x1
>>   public run()Ljava/lang/Object;
>>    L0
>>     INVOKESTATIC x.$getCallSiteArray 
>> ()[Lorg/codehaus/groovy/runtime/callsite/CallSite;
>>     ASTORE 1
>>    L1
>>     LINENUMBER 1 L1
>>     BIPUSH 42
>>     INVOKESTATIC java/lang/Integer.valueOf (I)Ljava/lang/Integer;
>>     ARETURN
>>    L2
>>     ACONST_NULL
>>     ARETURN
>
> again we see the call to get the callsite array. But otherwise the 
> method should like this in Java too. Since we are not calling any 
> actual methods and just return 42, we have again needless overhead in 
> this. Accordingly the invokedynamic version looks like this:
>
>>   // access flags 0x1
>>   public run()Ljava/lang/Object;
>>    L0
>>     LINENUMBER 1 L0
>>     BIPUSH 42
>>     INVOKESTATIC java/lang/Integer.valueOf (I)Ljava/lang/Integer;
>>     ARETURN
>>    L1
>>    FRAME FULL [] [java/lang/Throwable]
>>     NOP
>>     ATHROW
>
> Frames appear in this one, because the old groovy bytecode is java5 
> compatible, while invokedynamic requires java7 and thus the new faster 
> verifier with frames. The frames basically support the verifier to 
> make it faster.... I never tested if that is actually the case.
>
> To sum it up a little so far.... the invokedynamic version uses less 
> bytecode and the new verifier 1685 vs 2140, both could execute that 
> script without ever using the meta class system, if run is called 
> directly and if not Script would initialize the meta class eagerly. Of 
> course the later is for nil as soon as you have some method calls in 
> there, which are not static.
>
> Since we now have some basics, we should start talking about your test 
> case.
>
> First thing... I don�t think you load the class new every time. (Taken 
> from my machine, with java8u25 and Groovy 2.4.6)
>>    0         1           1      10.7m 11.4m        14.4m    
>> 240.0m     0.105ms     0.061ms
>>    1    760081           1      11.4m      11.8m        28.3m 
>> 246.5m     0.000ms     0.000ms
>
> I see here the first time class loading with 105 and a creation time 
> with 61. This class loading time has to appear every time and must 
> never be 0. But that is the case. So even if you create a new 
> URLClassLoader without defining a root, it still has the system class 
> loader as parent. Providing null as parent class loader fixes that 
> problem. And then I got a more interesting:
>
>>    0         1           1      10.6m 11.4m        14.4m    
>> 240.0m     0.213ms     0.063ms
>>    1     11519        2227      16.2m      35.9m        24.2m 
>> 246.0m     0.060ms     0.022ms
>>    2     27103        4055      19.4m      46.8m        40.7m 
>> 229.0m     0.051ms     0.019ms
>>    3     42465           1      12.2m      34.4m         4.5m 
>> 235.0m     0.049ms     0.019ms
>>    4     57606         580      13.3m      34.4m         9.0m 
>> 236.0m     0.048ms     0.018ms
>>    5     73506        1918      15.8m      39.1m        21.3m 
>> 236.0m     0.047ms     0.018ms
>>    6     89212        2482      16.8m      43.0m        25.8m 
>> 212.5m     0.047ms     0.018ms
>>    7    104578           1      12.4m      41.7m         4.0m 
>> 232.0m     0.046ms     0.018ms
>
> actually this was with java5 compatible code... with java8 standard I 
> get something like this:
>
>>    0         1           1      10.6m 11.4m        14.4m    
>> 240.0m     0.214ms     0.029ms
>>    1     12117        2861      17.3m      38.4m        30.4m 
>> 246.0m     0.062ms     0.016ms
>>    2     29143        1303      14.6m      36.6m        15.8m 
>> 236.0m     0.052ms     0.014ms
>>    3     46633        4267      19.8m      48.3m        43.2m 
>> 235.0m     0.048ms     0.013ms
>>    4     62634         900      14.0m      35.2m        11.4m 
>> 236.0m     0.048ms     0.013ms
>>    5     80618        4358      20.1m      48.7m        43.5m 
>> 235.5m     0.046ms     0.013ms
>>    6     97964        2336      16.6m      41.0m        25.3m 
>> 236.5m     0.046ms     0.012ms
>
> This makes me actually wondering a bit as of why the creation time 
> goes down, but not the load time. And after some thinking things 
> become clear to me...when you load a class using loadClass, the class 
> is not verified immediately. This happens lazy on actual static 
> initialization or once you ask the class for its structure... like 
> when using getMethods. So what load time tests here is purely the 
> class loader performance, nothing else. A difference in load time 
> would thus indicate solely that more or less classes are used. This 
> also means the creation time here includes the verification time of 
> the class, which I guess amounts to the .006 in difference between the 
> two.
>
> Now lets get further and change the simple java class with a run 
> method to extend Groovy�s script class.. well this requires further 
> changes, since now of course giving the groovy jar on the classpath 
> will no longer enable the spawned URLClassloader to load groovy 
> classes. Anyway... after fixing that:
>
>>    0         1           1      13.4m 13.8m        24.0m    
>> 240.0m     1.290ms    90.612ms
>>    1        12          12      39.3m      40.1m        64.3m 
>> 190.0m     0.985ms    82.673ms
>>    2        23          23      65.0m      66.5m        56.0m 
>> 246.0m     1.024ms    88.072ms
>>    3        34           5      29.3m      46.8m        27.6m 
>> 230.5m     0.976ms    88.245ms
>>    4        48          19      54.8m      64.0m        39.2m 
>> 228.0m     0.929ms    83.359ms
>>    5        60           1      19.6m      51.3m        13.2m 
>> 228.0m     0.892ms    84.574ms
>>    6        73          14      42.6m      55.8m        36.8m 
>> 228.0m     0.859ms    82.042ms
> ...
>>   46       593           1      21.8m 62.2m         7.9m    
>> 220.5m     0.783ms    76.878ms
>>   47       607          15      47.0m      62.3m        41.8m 
>> 219.0m     0.787ms    76.695ms
>>   48       621          29      78.3m      82.5m        44.5m 
>> 220.5m     0.782ms    76.563ms
>
> nothing much more will happen. I guess having a fast SSD is paying off 
> here, creation times would have been 1s+ otherwise. Anyway... as you 
> can see by the load time, we load a ton more classes now and creation 
> is much more expensive, since we now load part of the meta class 
> system. Without any further changes, this would be the best Groovy can 
> achieve without actually changing the way scripts are done. But as you 
> can see as well, the memory behaviour is actually quite well. I would 
> have liked to test the ClassValue version, but it crashes almost 
> immediately after my changes.
>
> Next step is then to use the simple "42" script:
> ...
>>   46       596          17      52.0m 63.7m        36.9m    
>> 219.5m     0.840ms    76.407ms
>>   47       609           1      21.7m      60.7m        11.5m 
>> 223.0m     0.835ms    76.536ms
>>   48       621          13      43.0m      60.7m        27.0m 
>> 223.0m     0.830ms    76.434ms
>
> we now load even more classes like CallSite and such, but creation 
> times are about the same meaning the overhead here is almost zero in 
> the end. Unsurprisingly the invokedynamic version behaves about the 
> same. Maybe the load times are a bit less good... which indicates more 
> classes being loaded internally in the JDK code to support invokedynamic.
>
> Now a further step... I can change the script from being "42" to:
>
> public class GroovyFilling {
>   public Object run() {
>     return 42;
>   }
> }
>
>
> so basically the same as the java version... This means Script will no 
> longer be used here!
>
>>   46       638          28      74.9m 79.6m        50.7m    
>> 227.5m     0.626ms    71.473ms
>>   47       650          10      35.4m      60.3m        41.1m 
>> 227.5m     0.631ms    71.630ms
>>   48       665          25      68.4m      74.6m        47.4m 
>> 227.5m     0.626ms    71.512ms
>
> well... we seem to load less classes, but we still get quite high 
> creation times. looking at the bytecode I find this (for the indy 
> version, but that does not really matter):
>
>>   // access flags 0x1
>>   public <init>()V
>>     ALOAD 0
>>     INVOKESPECIAL java/lang/Object.<init> ()V
>>    L0
>>     ALOAD 0
>>     INVOKEVIRTUAL GroovyFilling.$getStaticMetaClass 
>> ()Lgroovy/lang/MetaClass;
>>     ASTORE 1
>>     ALOAD 1
>>     ALOAD 0
>>     SWAP
>>     PUTFIELD GroovyFilling.metaClass : Lgroovy/lang/MetaClass;
>>     ALOAD 1
>>     POP
>>    L1
>>     RETURN
>
> So we actually do an eager meta class creation here as well. The worst 
> part here is... we get this even with static compilation... which 
> absolutely is a bug.
>
> So to improve this test here I guess trying to get rid of 
> $getStaticMetaClass and making Script (maybe even GroovyObjectSupport) 
> init meta classes lazy would be a gain. In the times of pre Groovy 1.8 
> this would not have amassed to much gain, since we did dynamic calls 
> all other the place. But since then we managed to reduce those calls 
> quite a lot, so I see an actual potential to reduce creation times for 
> simple cases. Of course in the more complex ones or in case in which 
> the super class constructor cannot be chosen statically by the 
> compiler, all bets are off again.
>
> bye Jochen
>
> .
>


Re: Improve Groovy class loading performance and memory management

Posted by Jochen Theodorou <bl...@gmx.org>.
On 13.05.2016 02:22, Alain Stalder wrote:
[...]
> Qualitatively this often has the following result in the Java VM:
> Metaspace resp. PermGen, and Heap in parallel, just grow until a
> configured limit is reached (and note that there is none by default for
> Metaspace in Java 8 and later), often only then is it garbage collected.
> With Java classes, at least with simple ones, this looks often
> different, those appear to be garbage collected much more quickly.
>
> Another qualitative difference is that loading a Groovy class and
> instantiating it seems typically to be considerably slower than
> instantiating a Java class with similar functionality, even quite
> drastically so, more than one would expect even considering the need to
> create metadata for dynamic function calls etc.
>
> At least that has been my experience over the past few years.

this is going to be a long mail ;)

so let us make three things to discuss here:

1) Object initialization performance
2) class verification/complexity
3) garbage collection of unused classes

And we have to distinguish here between usages of ClassValue, 
invokedynamic and the traditional version of those... that makes 4 aspects.

So I will write several things you probably already know, but other 
reading here might not. And even though I simplify a bit, please bear 
with me ;) And first of all, let us talk about the meta class system... 
that mostly targets ClassValue then.

The old version of the meta class system uses a big global table for all 
meta classes, with a class key and ClassInfo as value. In ClassInfo we 
have the meta class, which might be either recreate-able (in which case 
the meta class is soft referenced) or not (in which case it is a strong 
reference). The idea being, that as soon as the class can be garbage 
collected, the ClassInfo can as well, and with it the meta class.

Problem 1 here is, the meta class holds a strong reference to the class, 
so if the ClassInfo holds a strong reference to the meta class, this 
entry in the table will never be collected. I mention this only for 
completeness, since you did not set a permanent meta class in your test

Problem 2 here is, the code is concurrent, which rules out WeakHashMap 
and forced us to implement our own map.

In the ClassValue version we do not have our own table anymore and let 
the JVM manage this.

To avoid the lookup cost of the meta class, every Groovy class has a 
reference to its ClassInfo. The meta class and ClassInfo are lazy 
initialized.... well, "populated with actual data" in case of ClassInfo.

Some classes extend GroovyObjectSupport, which does the initialization 
in the constructor already, groovy.lang.Script is one of them. That 
means every time you create an instance of a new script class, you get 
the meta class already, even if the meta class is not used. let us have 
a small look at the bytecode of the constructors (x.groovy which only 
returns 42) of such a script:

>   // access flags 0x1
>   public <init>()V
>     ALOAD 0
>     INVOKESPECIAL groovy/lang/Script.<init> ()V
>    L0
>     INVOKESTATIC x.$getCallSiteArray ()[Lorg/codehaus/groovy/runtime/callsite/CallSite;
>     ASTORE 1
>    L1
>     RETURN
>
>   // access flags 0x1
>   public <init>(Lgroovy/lang/Binding;)V
>    L0
>     INVOKESTATIC x.$getCallSiteArray ()[Lorg/codehaus/groovy/runtime/callsite/CallSite;
>     ASTORE 2
>     ALOAD 0
>     ALOAD 1
>     INVOKESPECIAL groovy/lang/Script.<init> (Lgroovy/lang/Binding;)V
>    L1
>     RETURN

as you can see here, there is a getCallSiteArray call in here and direct 
methods calls for initialization. The getCallSiteArray call in here is 
actually surplus, but it is difficult to decide in the compiler early on 
if we will need it or not, because the callsite array here is basically 
an array wrapper, which supplies method names for method calls as well 
as specialized code for doing dynamic method calls. Would the 
constructor for example contain a method call to a method foo(), you 
would see some thing like getting the CallSite and executing "call" on it.

Why are we doing static method calls in the constructor here? Because in 
several cases the compiler optimizes the dynamic call away here. 
Basically you cannot provide a super constructor in Groovy, which means 
only the one statically defined does count. And as long as the given 
types are matching enough for the compiler to decide, we can create 
direct method calls.

So what does this mean for the object initialization performance in 
cases of Scripts so far? We eagerly initialize ClassInfo and MetaClass 
for each script. That means a lookup by reflection of the complete 
structure of the class and its super classes... which is cached, so the 
super class lookup will be much lower the next time. initializing the 
first meta class will also initialize the extension method lookup, which 
can also take a bit time... again, a one-time cost here.

The invokedynamic version will again use the jdk internals for the 
callsites, thus $getCallSiteArray is never called nor are the methods 
around this created. This saves bytecode size (faster verification), and 
in fact the two mentioned constructors above would look just like they 
would in Java. This of course still implies the meta class system init.

Now... the actual code of the script is to be found in the run method, 
which looks like this in old groovy:

>   // access flags 0x1
>   public run()Ljava/lang/Object;
>    L0
>     INVOKESTATIC x.$getCallSiteArray ()[Lorg/codehaus/groovy/runtime/callsite/CallSite;
>     ASTORE 1
>    L1
>     LINENUMBER 1 L1
>     BIPUSH 42
>     INVOKESTATIC java/lang/Integer.valueOf (I)Ljava/lang/Integer;
>     ARETURN
>    L2
>     ACONST_NULL
>     ARETURN

again we see the call to get the callsite array. But otherwise the 
method should like this in Java too. Since we are not calling any actual 
methods and just return 42, we have again needless overhead in this. 
Accordingly the invokedynamic version looks like this:

>   // access flags 0x1
>   public run()Ljava/lang/Object;
>    L0
>     LINENUMBER 1 L0
>     BIPUSH 42
>     INVOKESTATIC java/lang/Integer.valueOf (I)Ljava/lang/Integer;
>     ARETURN
>    L1
>    FRAME FULL [] [java/lang/Throwable]
>     NOP
>     ATHROW

Frames appear in this one, because the old groovy bytecode is java5 
compatible, while invokedynamic requires java7 and thus the new faster 
verifier with frames. The frames basically support the verifier to make 
it faster.... I never tested if that is actually the case.

To sum it up a little so far.... the invokedynamic version uses less 
bytecode and the new verifier 1685 vs 2140, both could execute that 
script without ever using the meta class system, if run is called 
directly and if not Script would initialize the meta class eagerly. Of 
course the later is for nil as soon as you have some method calls in 
there, which are not static.

Since we now have some basics, we should start talking about your test case.

First thing... I don�t think you load the class new every time. (Taken 
from my machine, with java8u25 and Groovy 2.4.6)
>    0         1           1      10.7m      11.4m        14.4m    240.0m     0.105ms     0.061ms
>    1    760081           1      11.4m      11.8m        28.3m    246.5m     0.000ms     0.000ms

I see here the first time class loading with 105 and a creation time 
with 61. This class loading time has to appear every time and must never 
be 0. But that is the case. So even if you create a new URLClassLoader 
without defining a root, it still has the system class loader as parent. 
Providing null as parent class loader fixes that problem. And then I got 
a more interesting:

>    0         1           1      10.6m      11.4m        14.4m    240.0m     0.213ms     0.063ms
>    1     11519        2227      16.2m      35.9m        24.2m    246.0m     0.060ms     0.022ms
>    2     27103        4055      19.4m      46.8m        40.7m    229.0m     0.051ms     0.019ms
>    3     42465           1      12.2m      34.4m         4.5m    235.0m     0.049ms     0.019ms
>    4     57606         580      13.3m      34.4m         9.0m    236.0m     0.048ms     0.018ms
>    5     73506        1918      15.8m      39.1m        21.3m    236.0m     0.047ms     0.018ms
>    6     89212        2482      16.8m      43.0m        25.8m    212.5m     0.047ms     0.018ms
>    7    104578           1      12.4m      41.7m         4.0m    232.0m     0.046ms     0.018ms

actually this was with java5 compatible code... with java8 standard I 
get something like this:

>    0         1           1      10.6m      11.4m        14.4m    240.0m     0.214ms     0.029ms
>    1     12117        2861      17.3m      38.4m        30.4m    246.0m     0.062ms     0.016ms
>    2     29143        1303      14.6m      36.6m        15.8m    236.0m     0.052ms     0.014ms
>    3     46633        4267      19.8m      48.3m        43.2m    235.0m     0.048ms     0.013ms
>    4     62634         900      14.0m      35.2m        11.4m    236.0m     0.048ms     0.013ms
>    5     80618        4358      20.1m      48.7m        43.5m    235.5m     0.046ms     0.013ms
>    6     97964        2336      16.6m      41.0m        25.3m    236.5m     0.046ms     0.012ms

This makes me actually wondering a bit as of why the creation time goes 
down, but not the load time. And after some thinking things become clear 
to me...when you load a class using loadClass, the class is not verified 
immediately. This happens lazy on actual static initialization or once 
you ask the class for its structure... like when using getMethods. So 
what load time tests here is purely the class loader performance, 
nothing else. A difference in load time would thus indicate solely that 
more or less classes are used. This also means the creation time here 
includes the verification time of the class, which I guess amounts to 
the .006 in difference between the two.

Now lets get further and change the simple java class with a run method 
to extend Groovy�s script class.. well this requires further changes, 
since now of course giving the groovy jar on the classpath will no 
longer enable the spawned URLClassloader to load groovy classes. 
Anyway... after fixing that:

>    0         1           1      13.4m      13.8m        24.0m    240.0m     1.290ms    90.612ms
>    1        12          12      39.3m      40.1m        64.3m    190.0m     0.985ms    82.673ms
>    2        23          23      65.0m      66.5m        56.0m    246.0m     1.024ms    88.072ms
>    3        34           5      29.3m      46.8m        27.6m    230.5m     0.976ms    88.245ms
>    4        48          19      54.8m      64.0m        39.2m    228.0m     0.929ms    83.359ms
>    5        60           1      19.6m      51.3m        13.2m    228.0m     0.892ms    84.574ms
>    6        73          14      42.6m      55.8m        36.8m    228.0m     0.859ms    82.042ms
...
>   46       593           1      21.8m      62.2m         7.9m    220.5m     0.783ms    76.878ms
>   47       607          15      47.0m      62.3m        41.8m    219.0m     0.787ms    76.695ms
>   48       621          29      78.3m      82.5m        44.5m    220.5m     0.782ms    76.563ms

nothing much more will happen. I guess having a fast SSD is paying off 
here, creation times would have been 1s+ otherwise. Anyway... as you can 
see by the load time, we load a ton more classes now and creation is 
much more expensive, since we now load part of the meta class system. 
Without any further changes, this would be the best Groovy can achieve 
without actually changing the way scripts are done. But as you can see 
as well, the memory behaviour is actually quite well. I would have liked 
to test the ClassValue version, but it crashes almost immediately after 
my changes.

Next step is then to use the simple "42" script:
...
>   46       596          17      52.0m      63.7m        36.9m    219.5m     0.840ms    76.407ms
>   47       609           1      21.7m      60.7m        11.5m    223.0m     0.835ms    76.536ms
>   48       621          13      43.0m      60.7m        27.0m    223.0m     0.830ms    76.434ms

we now load even more classes like CallSite and such, but creation times 
are about the same meaning the overhead here is almost zero in the end. 
Unsurprisingly the invokedynamic version behaves about the same. Maybe 
the load times are a bit less good... which indicates more classes being 
loaded internally in the JDK code to support invokedynamic.

Now a further step... I can change the script from being "42" to:

public class GroovyFilling {
   public Object run() {
     return 42;
   }
}


so basically the same as the java version... This means Script will no 
longer be used here!

>   46       638          28      74.9m      79.6m        50.7m    227.5m     0.626ms    71.473ms
>   47       650          10      35.4m      60.3m        41.1m    227.5m     0.631ms    71.630ms
>   48       665          25      68.4m      74.6m        47.4m    227.5m     0.626ms    71.512ms

well... we seem to load less classes, but we still get quite high 
creation times. looking at the bytecode I find this (for the indy 
version, but that does not really matter):

>   // access flags 0x1
>   public <init>()V
>     ALOAD 0
>     INVOKESPECIAL java/lang/Object.<init> ()V
>    L0
>     ALOAD 0
>     INVOKEVIRTUAL GroovyFilling.$getStaticMetaClass ()Lgroovy/lang/MetaClass;
>     ASTORE 1
>     ALOAD 1
>     ALOAD 0
>     SWAP
>     PUTFIELD GroovyFilling.metaClass : Lgroovy/lang/MetaClass;
>     ALOAD 1
>     POP
>    L1
>     RETURN

So we actually do an eager meta class creation here as well. The worst 
part here is... we get this even with static compilation... which 
absolutely is a bug.

So to improve this test here I guess trying to get rid of 
$getStaticMetaClass and making Script (maybe even GroovyObjectSupport) 
init meta classes lazy would be a gain. In the times of pre Groovy 1.8 
this would not have amassed to much gain, since we did dynamic calls all 
other the place. But since then we managed to reduce those calls quite a 
lot, so I see an actual potential to reduce creation times for simple 
cases. Of course in the more complex ones or in case in which the super 
class constructor cannot be chosen statically by the compiler, all bets 
are off again.

bye Jochen