Posted to dev@harmony.apache.org by Aleksey Ignatenko <al...@gmail.com> on 2006/10/24 11:51:25 UTC

[drlvm] Class unloading support

Hello all!



As you probably know, the current version of Harmony DRLVM has no class
unloading support. As a result, some Java applications accumulate memory
leaks that lead to memory exhaustion and crashes.

In this message I would like to describe two approaches to class unloading
in DRLVM and propose implementing one of them as the basic one. Pros and cons
of both approaches are presented below. Let's name these approaches:

   1. Mark and scan based approach.
   2. Automatic class unloading approach.



*Current DRLVM implementation specifics.*



All java.lang.Class (j.l.Class) and java.lang.ClassLoader (j.l.Classloader)
instances are enumerated as strong roots inside the VM, so all j.l.Class and
j.l.Classloader instances are always reachable.



To unload a class loader CL, three conditions must be fulfilled (*):

   1. The j.l.Classloader instance of CL is unreachable.
   2. The classes (j.l.Class instances) loaded by CL are unreachable.
   3. No object of any class loaded by CL exists.



Here is a brief description of both approaches:



*Mark and scan based approach.*

A Java heap trace is performed by VM Core at the beginning of stop-the-world.
If a class loader and its classes are unreachable and there are no objects
of these classes, the class loader is excluded from enumeration so that the GC
collects it. After the GC has run and the corresponding j.l.Classloader
instance has been collected, the native resources are removed from the C heap:
the class loader and all classes loaded by it, jitted code and so on. The
corresponding Java objects should already have been collected by the GC at
this moment.

Pros:

- Simplicity: requires only additional mark&scan functionality on the VM side
to detect classes for unloading, plus a few changes to the enumeration
algorithm.

Cons:

- Requires additional GC/VM functionality to trace j.l.Class and
j.l.Classloader instances from each object.

- Duplicates mark&scan functionality on VM side.

- Affects every plugged GC.

- "Stop-the-world" state of VM is required, i.e. all threads except the one
performing unloading should be suspended.

- Possibly some additional limitations on new GCs.
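
A minimal sketch of the VM-side trace described for the mark and scan approach, written as a toy Java model rather than DRLVM code. The names `MarkScanModel`, `Obj` and `unloadable` are hypothetical. Each model object carries the name of the loader it keeps alive (it may stand for a plain object, a j.l.Class instance or the j.l.Classloader instance itself), so a loader that is never seen during the trace satisfies all three conditions (*).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class MarkScanModel {
    /** A heap object; 'loader' names the class loader this object keeps alive. */
    static final class Obj {
        final String loader;
        final List<Obj> refs = new ArrayList<>();
        Obj(String loader) { this.loader = loader; }
    }

    /** Traces from the roots and returns the loaders no reachable object belongs to. */
    static Set<String> unloadable(Collection<String> allLoaders, Collection<Obj> roots) {
        // Identity-based mark set, so cycles in the object graph terminate.
        Set<Obj> marked = Collections.newSetFromMap(new IdentityHashMap<Obj, Boolean>());
        Set<String> liveLoaders = new TreeSet<>();
        Deque<Obj> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Obj o = work.pop();
            if (!marked.add(o)) {
                continue; // already visited
            }
            liveLoaders.add(o.loader);
            work.addAll(o.refs);
        }
        Set<String> result = new TreeSet<>(allLoaders);
        result.removeAll(liveLoaders);
        return result;
    }
}
```

In the real VM the loaders returned here would be the ones excluded from root enumeration before the GC runs; the model only shows the reachability decision.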



*Automatic class unloading approach.*

"Automatic class unloading" means that a j.l.Classloader instance is unloaded
automatically (without additional enumeration tricks or GC dependency), and
once we detect that a class loader has been unloaded we destroy its native
resources. To do that we need to provide two things:

   1. Introduce a reference from each object to its j.l.Class instance.
   2. Class registry - introduce references from j.l.Classes to its
   defining j.l.Classloader and references from j.l.Classloader to
   j.l.Classes loaded by it (unloading is to be done for j.l.Classloader
   and corresponding j.l.Classes at once).



*Introduce reference from object to its j.l.Class instance.*

DRLVM has some implementation specifics here. An object is described by a
native VTable structure, which has pointers to the class and other related
data. VTables can have different sizes depending on the object's class. The
main idea of referencing j.l.Class from an object is to make the VTable a
special Java object with a reference to the appropriate j.l.Class instance,
while it looks like a regular object from the GC's point of view. The VTable
pointer is located in the object at offset zero and can therefore simply be
treated as a reference field. Thus we can trace the j.l.Class instance from an
object via its VTable object. VTable objects are considered pinned, for
simplicity.



In summary, with the class registry and the reference from each object to its
j.l.Class instance, we guarantee that a class loader CL can be unloaded only
if the three conditions described above (*) are fulfilled. To find out when
the Java part of a class loader has been unloaded, the j.l.Classloader
instance should be enumerated as a weak root. When this root becomes null,
destroy the native memory of the corresponding class loader.



Pros:

- Unification of unloading approach – no additional requirements from GC.

- Stop-the-world is not required.

- GC handles VTables automatically as regular objects.

Cons:

- The number of objects increases.

- The memory footprint increases for both the native and Java heaps (as VTable
objects appear).
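
The weak-root step of the automatic approach can be illustrated at the Java level, with the caveat that this is an analogy and not VM-internal code. Here `java.lang.ref.WeakReference` stands in for the VM's weak root, a `String` stands in for the loader's native data, and `LoaderWeakRoots`, `register` and `sweep` are hypothetical names.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class LoaderWeakRoots {
    /** A weak root for one loader plus a stand-in for its native (C heap) data. */
    private static final class Entry {
        final WeakReference<ClassLoader> root;
        final String nativeData;
        Entry(ClassLoader loader, String nativeData) {
            this.root = new WeakReference<>(loader);
            this.nativeData = nativeData;
        }
    }

    private final List<Entry> entries = new ArrayList<>();
    private final List<String> freed = new ArrayList<>();

    /** Registers a loader; the returned reference is the weak root the GC may clear. */
    WeakReference<ClassLoader> register(ClassLoader loader, String nativeData) {
        Entry e = new Entry(loader, nativeData);
        entries.add(e);
        return e.root;
    }

    /** Frees the native data of every loader whose weak root is now null. */
    int sweep() {
        int count = 0;
        for (Iterator<Entry> it = entries.iterator(); it.hasNext(); ) {
            Entry e = it.next();
            if (e.root.get() == null) {
                freed.add(e.nativeData); // a real VM would release C-heap memory here
                it.remove();
                count++;
            }
        }
        return count;
    }

    List<String> freed() { return freed; }
}
```

A `sweep()` call after each collection would play the role of "when this root becomes null, destroy native memory". The sketch says nothing about how the GC decides to clear the root; that is exactly what the object-to-VTable-to-class references are for.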



*Conclusion.*

I prefer the automatic class unloading approach because of the properties
described above. It is the more flexible and promising solution. Also, the
JVM specification is closer to the automatic class unloading approach, while
the mark and scan based approach looks more like a class unloading
workaround.





Please, do not hesitate to ask questions.

Best regards,

Aleksey Ignatenko,

Intel Enterprise Solutions Software Division.

Re: [drlvm] Class unloading support

Posted by Etienne Gagnon <eg...@sablevm.org>.

Aleksey Ignatenko wrote:
>> Am I wrong, or does this proposition imply collecting classes
>> independently from their class loader?  If this is the case, I have to
>>...
> 
> Yes, you are wrong. This proposition implies collection of classloader and
> classes loaded by it at once. You can see what is "class registry" in
> the first letter of the discussion -

Excellent.  That was one of my main worries.

>> And what about gagnon-phd.pdf:
>... 
> 
> Drlvm already has similar functionality: look at classloader.h, function
> void* Alloc(size_t size); You'll see that most of classloader's data (not
> 100% yet) is already allocated from pool of that classloader.

Heh!  Super.

Etienne

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support

Posted by Aleksey Ignatenko <al...@gmail.com>.

 Hello, Etienne.

>Am I wrong, or does this proposition imply collecting classes
>independently from their class loader?  If this is the case, I have to
>say that I disagree with the proposed approach.

Yes, you are wrong. This proposition implies collection of classloader and
classes loaded by it at once. You can see what is "class registry" in
the first letter of the discussion -

"Class registry - introduce references from j.l.Classes to its defining
j.l.Classloader and references from j.l.Classloader to j.l.Classes loaded by
it (unloading is to be done for j.l.Classloader and corresponding
j.l.Classes at once)."

And what about gagnon-phd.pdf:
> very effective approach for managing class-loader related memory
Drlvm already has similar functionality: look at classloader.h, function
void* Alloc(size_t size); You'll see that most of classloader's data (not
100% yet) is already allocated from pool of that classloader.

Aleksey.




Re: [drlvm] Class unloading support

Posted by Etienne Gagnon <eg...@sablevm.org>.

Hi Weldon,

Weldon Washburn wrote:
> I read section 3.2.3 (Class-Loader-Specific Memory) of gagnon-phd.pdf.
> Please tell me if the following is a correct interpretation.  You create
> a new memory manager that is uniquely associated with each new class
> loader.

Right.

>  All the C data structures associated with a class loader (classes,
> vtables, etc) are "malloc()ed" out of the associated memory manager.

[For those who have not read it...]

"malloc()ed" is a big word...  It is rather "simpleAlloc()ed", i.e.,
once allocated, you cannot free it (...or if you do, the "free-list"
manager is very minimal, performs no checks [you have to tell it how
much you are freeing] and no aggregation).  I do discuss this in the
Chapter, of course, and you can look at the implementation in SableVM.
[The SableVM trunk is under AL2.0 (unlike released versions)].

>  When
> the class loader becomes unreachable, then its associated memory manager is
> deallocated which automatically frees all the associated C structs
> (classes, vtables, etc.)

Yep.
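
For readers who have not seen the scheme, here is a rough Java model of a per-class-loader memory manager of the kind discussed above: allocation only bumps a pointer inside a chunk, nothing is freed individually, and everything is dropped at once when the loader goes away. This is a sketch under assumptions, not the SableVM or DRLVM code; `LoaderArena`, `alloc`, `releaseAll` and the chunk size are hypothetical, and a real VM would do this in C over native memory.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class LoaderArena {
    private final int chunkSize;
    private final List<byte[]> chunks = new ArrayList<>();
    private int used;          // bytes used in the newest chunk
    private boolean released;

    LoaderArena(int chunkSize) { this.chunkSize = chunkSize; }

    /** Bump-allocates 'size' bytes; there is no per-allocation free. */
    ByteBuffer alloc(int size) {
        if (released) {
            throw new IllegalStateException("arena already released");
        }
        if (size < 0) {
            throw new IllegalArgumentException("negative size");
        }
        // Start a new chunk when the request does not fit in the current one.
        if (chunks.isEmpty() || size > chunks.get(chunks.size() - 1).length - used) {
            chunks.add(new byte[Math.max(chunkSize, size)]);
            used = 0;
        }
        byte[] chunk = chunks.get(chunks.size() - 1);
        ByteBuffer block = ByteBuffer.wrap(chunk, used, size).slice();
        used += size;
        return block;
    }

    /** Drops every chunk at once, as when the owning class loader is unloaded. */
    void releaseAll() {
        chunks.clear();
        used = 0;
        released = true;
    }

    int chunkCount() { return chunks.size(); }
}
```

The point of the design is that unloading costs one pass over a short chunk list, regardless of how many classes, vtables or code blocks were allocated from the arena.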

Etienne

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support

Posted by Weldon Washburn <we...@gmail.com>.

Etienne,

I read section 3.2.3 (Class-Loader-Specific Memory) of gagnon-phd.pdf.
Please tell me if the following is a correct interpretation.  You create
a new memory manager that is uniquely associated with each new class
loader.  All the C data structures associated with a class loader (classes,
vtables, etc) are "malloc()ed" out of the associated memory manager.  When
the class loader becomes unreachable, then its associated memory manager is
deallocated which automatically frees all the associated C structs (classes,
vtables, etc.)

Everyone,
Does it make sense to try to implement Etienne's scheme?



-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Etienne Gagnon <eg...@sablevm.org>.

>>> If I get it right, in case of automagic unloading, GC does all the job
>>> without a knowledge whether it collects a class, a classloader or
>>> whatever else.
>>> Perhaps I'm missing something, but to provide a callback on class
>>> unloading, the GC must know the semantic of the object being collected.

Am I wrong, or does this proposition imply collecting classes
independently from their class loader?  If this is the case, I have to
say that I disagree with the proposed approach.

The JVM spec says quite clearly:

 2.17.8 Unloading of Classes and Interfaces

 A class or interface may be unloaded if and only if its class loader is
 unreachable. The bootstrap class loader is always reachable; as a
 result, system classes may never be unloaded.

Just think about it.  One could take an instance "o" of a class C loaded
by L, call it (C,L), and call o.getClass().hashCode().  Store this
integer somewhere.  Then, "o" could die, and maybe (C,L) unloaded while
L is still reachable.  As L is still reachable, some code could do a
L.findClass("C").hashCode().  This will likely result in a different
hashcode, in full breach of both the VM and API specifications.
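
The invariant at stake can be written down as a small Java check. This is only an illustration: `ClassIdentityCheck` and `sameClassOnLookup` are hypothetical names, and `Class.forName(name, false, loader)` is used in place of `L.findClass("C")` because `findClass` is a protected method.

```java
class ClassIdentityCheck {
    /** Returns true if looking the class up again yields the same Class object and hash code. */
    static boolean sameClassOnLookup(Object o) {
        Class<?> first = o.getClass();
        int storedHash = first.hashCode();           // the integer "stored somewhere"
        ClassLoader loader = first.getClassLoader(); // null means the bootstrap loader
        try {
            Class<?> again = Class.forName(first.getName(), false, loader);
            return again == first && again.hashCode() == storedHash;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

As long as the loader is reachable this must hold for any ordinary named class; unloading (C,L) on its own and re-defining it later would make the second lookup return a different `Class` object.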


On a related note, for memory management I highly encourage Drlvm to
look at Chapter 3 of http://sablevm.org/people/egagnon/gagnon-phd.pdf
which presents a simple, yet very effective approach for managing
class-loader related memory (i.e. memory used to store internal class
data, vtables, jitted code) so that it can all be freed efficiently at
class-loader unloading time.

Etienne


-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support

Posted by Alex Astapchuk <al...@gmail.com>.

Ivan,

Ivan Volosyuk:
> On 10/29/06, Alex Astapchuk <al...@gmail.com> wrote:
>> Mikhail Fursov:
>> > On 10/28/06, Alex Astapchuk <al...@gmail.com> wrote:
>> >>
>> >> Aleksey,
>> >>
>> >> >   1. Mark and scan based approach.
>> >> >   2. Automatic class unloading approach.
>> >>
>> >> In the #2, is there any chance for other components to be notified
>> >> about unloaded classes?
>> >>
>> >
>> > Alex,
>> > I asked Aleksey about the same feature some time ago. I was interested
>> > if it's possible to deallocate profiler's data in EM for unloaded
>> > methods. The answer was: OK you will get a callback from VM. So, this
>> > feature is in the design. Let's wait Aleksey to give us more details
>> > about it.
>>
>> Hmmm...  Yes, some more details would be nice.
>> If I get it right, in case of automagic unloading, GC does all the job
>> without a knowledge whether it collects a class, a classloader or
>> whatever else.
>> Perhaps I'm missing something, but to provide a callback on class
>> unloading, the GC must know the semantic of the object being collected.
> 
> The callback will be called by class unloading implementation (for
> #1). It will definitely know everything about classloader being

Sure, I see no problem with #1.

But I'm really curious how the callback can be implemented in the 
*second* approach - with automatic class unloading.


-- 
Thanks,
   Alex


> deallocated. EM just needs to make relation between its data
> structures with corresponding classloader and free them by request.
>

Re: [drlvm] Class unloading support

Posted by Xiao-Feng Li <xi...@gmail.com>.

uh... reference counting for class loader, interesting.

One thing, could you help clarify: how can the classloader know
that there is a class (loaded by it) that has surviving objects? Do we need
to trace the object header to find the class, then find the classloader,
and then mark the classloader? Is this virtually the same as
solution #1?

Thanks,
xiaofeng

On 10/29/06, Etienne Gagnon <eg...@sablevm.org> wrote:
> I have missed some messages of this thread, yet I do not remember seeing
> a discussion of what seems to me the obvious solution to the problem.
> So, here it is.
>
> Why don't you simply add a reference count on classes which is
> incremented on object allocation and decremented on object reclamation?
>  [In case you use a copying collector, you could keep a separate count
> (in the class) for the collected area, so that you only have to count
> copied objects].  You would also use reference counting for the class
> loader (therefore eliminating any cyclic problem that you could have
> with normal garbage collection).  This would work very well as unloading
> only happens when the class loader can be unloaded along all of its classes.
>
> No need for any supportive information in object header, or anything
> complex...  Am I really missing something?
>
> Just an idea...
>
> Etienne
>
>
> --
> Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
> SableVM:                                       http://www.sablevm.org/
> SableCC:                                       http://www.sablecc.org/
>
>
>
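
The counting scheme quoted above can be modeled as plain bookkeeping. This sketch assumes one reading of the proposal: a per-class count of live instances plus a separate count of references to the loader itself. The names `CountedLoader`, `onAllocate`, `onReclaim`, `addLoaderRef` and `dropLoaderRef` are hypothetical, and the open question of how the GC would drive `onReclaim` is not addressed here.

```java
import java.util.HashMap;
import java.util.Map;

class CountedLoader {
    private final Map<String, Integer> liveInstances = new HashMap<>();
    private int loaderRefs; // hypothetical count of references to the loader itself

    /** Called when an instance of a class defined by this loader is allocated. */
    void onAllocate(String className) {
        liveInstances.merge(className, 1, Integer::sum);
    }

    /** Called when an instance of a class defined by this loader is reclaimed. */
    void onReclaim(String className) {
        Integer n = liveInstances.get(className);
        if (n == null || n == 0) {
            throw new IllegalStateException("no live instance of " + className);
        }
        liveInstances.put(className, n - 1);
    }

    void addLoaderRef() { loaderRefs++; }

    void dropLoaderRef() {
        if (loaderRefs == 0) {
            throw new IllegalStateException("loader reference count underflow");
        }
        loaderRefs--;
    }

    /** The loader and all of its classes may go only when every count is zero. */
    boolean canUnload() {
        if (loaderRefs > 0) {
            return false;
        }
        for (int n : liveInstances.values()) {
            if (n > 0) {
                return false;
            }
        }
        return true;
    }
}
```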

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Ivan Volosyuk <iv...@gmail.com>.

On 10/31/06, Xiao-Feng Li <xi...@gmail.com> wrote:
> On 10/31/06, Pavel Pervov <pm...@gmail.com> wrote:
> > > 7- Each class loader structure maintains a set of boolean flags, one
> > > flag per "non-nursery" garbage collected area (even when thread-local
> > > heaps are used).  The flag is set when an instance of a class loaded by
> > > this class loader is moved into the related GC-area.  The flag is unset
> > > when the GC-area is emptied, or (optionally) when it can be determined
> > > that no instance of a class loaded by this class loader remains in the
> > > GC-area.  This is best implemented as follows: a) use an unconditional
> > > write of "true" in the flag every time an object is moved into the
> > > GC-area by the garbage collector, b) unset the related flag in "all"
> > > class loader structures just before collecting a GC-area, then setting
> > > the flag back when an object survives in the area.
> >
> >
> > Requires identification of object' class type during GC. Will most
> > probably degrade GC performance.
>
> Yes, this is also my concern.

Yes, tracing and marking of VTable objects can be cheaper than tracing
object->vtable->class->classloader for each object.

Even proposal #2 will degrade performance, but this approach will degrade
it even more.

--
Ivan
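
To make the flag protocol in the quoted point 7 concrete, here is a toy Java model. It assumes GC areas are numbered from zero and loaders are identified by name; `AreaFlags`, `markPresent`, `beforeCollect` and `mayHaveInstances` are hypothetical names. The cost concern raised in this thread sits in the call to `markPresent`, which needs the object's loader for every moved or surviving object.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

class AreaFlags {
    // One bit per non-nursery GC area, kept per class loader.
    private final Map<String, BitSet> flagsByLoader = new HashMap<>();

    /** (a) Unconditional write of "true" when an object is moved into, or survives in, an area. */
    void markPresent(String loader, int area) {
        flagsByLoader.computeIfAbsent(loader, k -> new BitSet()).set(area);
    }

    /** (b) Clears the area's flag in all loaders just before that area is collected. */
    void beforeCollect(int area) {
        for (BitSet flags : flagsByLoader.values()) {
            flags.clear(area);
        }
    }

    /** True if some area may still hold an instance of a class from this loader. */
    boolean mayHaveInstances(String loader) {
        BitSet flags = flagsByLoader.get(loader);
        return flags != null && !flags.isEmpty();
    }
}
```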

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Weldon Washburn <we...@gmail.com>.

On 11/9/06, Robin Garner <ro...@anu.edu.au> wrote:
>
> Geir Magnusson Jr. wrote:
> >
> >
> > Weldon Washburn wrote:
> >>
> >>
> >> On 11/8/06, *Geir Magnusson Jr.* <geir@pobox.com> wrote:
> >>
> >>
> >>
> >>     Weldon Washburn wrote:
> >>      > On 11/7/06, Ivan Volosyuk <ivan.volosyuk@gmail.com> wrote:
> >>      >>
> >>      >> On 07 Nov 2006 14:35:55 +0600, Egor Pasko <egor.pasko@gmail.com> wrote:
> >>      >> > > I already have one idea how to benefit from movable vtables.
> >>      >
> >>      >
> >>      > There would have to be a very compelling argument for making vtables
> >>      > movable.  Like a business workload that Harmony needs to run within the
> >>      > next 12 months.
> >>
> >>     How would a business workload need this directly?
> >>
> >> That's the point.  I can't figure out any compelling story for moving
> >> vtables.  As far as I can tell, it's over-engineering.   I would love
> >> to be proven wrong.
> >
> > But isn't this simply an implementation detail of something that is
> > important, namely the class unloading?
> >
> > geir


I have no problem calling it an implementation detail.  It's an important
implementation detail that somehow got mixed into the design conversation.
Worth noting is that ultimately the committer is on the hook for committing
an implementation.  It would be good to have the discussion on moving vtable
implementation before someone spends a bunch of time on it.

> While it did come up as an issue in the class-unloading talks I think
> most of us believe it to be orthogonal.
>
> cheers
>
> --
> Robin Garner
> Dept. of Computer Science
> Australian National University
>



-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Robin Garner <ro...@anu.edu.au>.

Geir Magnusson Jr. wrote:
> 
> 
> Weldon Washburn wrote:
>>
>>
>> On 11/8/06, *Geir Magnusson Jr.* <geir@pobox.com> wrote:
>>
>>
>>
>>     Weldon Washburn wrote:
>>      > On 11/7/06, Ivan Volosyuk <ivan.volosyuk@gmail.com> wrote:
>>      >>
>>      >> On 07 Nov 2006 14:35:55 +0600, Egor Pasko <egor.pasko@gmail.com> wrote:
>>      >> > > I already have one idea how to benefit from movable vtables.
>>      >
>>      >
>>      > There would have to be a very compelling argument for making vtables
>>      > movable.  Like a business workload that Harmony needs to run within the
>>      > next 12 months.
>>
>>     How would a business workload need this directly?
>>
>> That's the point.  I can't figure out any compelling story for moving
>> vtables.  As far as I can tell, it's over-engineering.   I would love
>> to be proven wrong.
> 
> But isn't this simply an implementation detail of something that is 
> important, namely the class unloading?
> 
> geir

While it did come up as an issue in the class-unloading talks I think 
most of us believe it to be orthogonal.

cheers

-- 
Robin Garner
Dept. of Computer Science
Australian National University

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.


Weldon Washburn wrote:
> 
> 
> On 11/8/06, *Geir Magnusson Jr.* <geir@pobox.com> wrote:
>
>
>
>     Weldon Washburn wrote:
>      > On 11/7/06, Ivan Volosyuk <ivan.volosyuk@gmail.com> wrote:
>      >>
>      >> On 07 Nov 2006 14:35:55 +0600, Egor Pasko <egor.pasko@gmail.com> wrote:
>      >> > > I already have one idea how to benefit from movable vtables.
>      >
>      >
>      > There would have to be a very compelling argument for making vtables
>      > movable.  Like a business workload that Harmony needs to run within the
>      > next 12 months.
>
>     How would a business workload need this directly?
>
> That's the point.  I can't figure out any compelling story for moving
> vtables.  As far as I can tell, it's over-engineering.   I would love to
> be proven wrong.

But isn't this simply an implementation detail of something that is 
important, namely the class unloading?

geir

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Weldon Washburn <we...@gmail.com>.

On 11/8/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>
>
>
> Weldon Washburn wrote:
> > On 11/7/06, Ivan Volosyuk <iv...@gmail.com> wrote:
> >>
> >> On 07 Nov 2006 14:35:55 +0600, Egor Pasko <eg...@gmail.com> wrote:
> >> > > I already have one idea how to benefit from movable vtables.
> >
> >
> > There would have to be a very compelling argument for making vtables
> > movable.  Like a business workload that Harmony needs to run within the
> > next
> > 12 months.
>
> How would a business workload need this directly?


That's the point.  I can't figure out any compelling story for moving
vtables.  As far as I can tell, it's over-engineering.   I would love to be
proven wrong.

> geir
>
>


-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.


Weldon Washburn wrote:
> On 11/7/06, Ivan Volosyuk <iv...@gmail.com> wrote:
>>
>> On 07 Nov 2006 14:35:55 +0600, Egor Pasko <eg...@gmail.com> wrote:
>> > > I already have one idea how to benefit from movable vtables.
> 
> 
> There would have to be a very compelling argument for making vtables
> movable.  Like a business workload that Harmony needs to run within the 
> next
> 12 months.

How would a business workload need this directly?

geir

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Robin Garner <ro...@anu.edu.au>.

Weldon Washburn wrote:
> On 11/7/06, Ivan Volosyuk <iv...@gmail.com> wrote:
>>
>> On 07 Nov 2006 14:35:55 +0600, Egor Pasko <eg...@gmail.com> wrote:
>> > > I already have one idea how to benefit from movable vtables.
> 
> 
> There would have to be a very compelling argument for making vtables
> movable.  Like a business workload that Harmony needs to run within the 
> next
> 12 months.

The cost of moving vtables would be huge.  It would have to be a very 
hefty optimization :)

-- 
Robin Garner
Dept. of Computer Science
Australian National University

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Weldon Washburn <we...@gmail.com>.

On 11/7/06, Ivan Volosyuk <iv...@gmail.com> wrote:
>
> On 07 Nov 2006 14:35:55 +0600, Egor Pasko <eg...@gmail.com> wrote:
> > > I already have one idea how to benefit from movable vtables.

There would have to be a very compelling argument for making vtables
movable.  Like a business workload that Harmony needs to run within the next
12 months.

>
> > in GCV4.1? :)
>
> Yes
>
> --
> Ivan
> Intel Enterprise Solutions Software Division
>

-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Ivan Volosyuk <iv...@gmail.com>.

On 07 Nov 2006 14:35:55 +0600, Egor Pasko <eg...@gmail.com> wrote:
> > I already have one idea how to benefit from movable vtables.
>
> in GCV4.1? :)

Yes

-- 
Ivan
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x217 day of Apache Harmony Ivan Volosyuk wrote:
> In current GCv4.1 implementation there is an assumption that vtables
> will not move. It is used in compaction algorithm. Strictly speaking,
> the only thing I need is to distinguish objects and vtables during
> allocation. If so, one of GC algorithms may treat vtables as pinned
> objects, while another could make use of the ability to move the
> vtables. 

Ivan, thank you for making it clear!

> I already have one idea how to benefit from movable vtables.

in GCV4.1? :)

> --
> Ivan
> 
> On 03 Nov 2006 14:34:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
> > On the 0x214 day of Apache Harmony Aleksey Ignatenko wrote:
> > > Egor,
> > >
> > > Vtable objects pinning is required not only by JIT, this is also required by
> > > GC, which relies on that VTables are non movable. So this not a way to
> > > disable guarded devirtualization. Pinning is required anyway.
> >
> > Sorry, but I am not aware of places, where pinning is required other
> > > than for JIT. If you mention one or two, that would be great for
> > understanding and the next step to beat my ignorance in this subject :)
> >
> > > On 01 Nov 2006 10:37:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
> > > >
> > > > On the 0x214 day of Apache Harmony Rana Dasgupta wrote:
> > > > > On 10/31/06, Etienne Gagnon <egagnon@sablevm.org> wrote:
> > > > >
> > > > > > >Yet:
> > > > > >
> > > > > > >1- You do need pinning, so you rule out some of the simplest GCs
> > > > > > >(e.g. simple, non-generational copying without pinning.)
> > > > > > >[Apparently, for some very large heaps, simple copying can be
> > > > > > >quite difficult to beat, efficiency wise, if you believe some
> > > > > > >relatively recent JikesRVM related paper...]
> > > > >
> > > > >
> > > > > Yes, this was one of my concerns about the vtable object approach.
> > > > > This is limiting, but this is one specific GC requirement. (Maybe
> > > > > for GCs that don't support pinning, the JIT can compare
> > > > > object->vtable->class for guarded devirtualization, or even not do
> > > > > guarded devirtualization, sort of support the GC in downlevel
> > > > > mode). For the refcounting method we need to hand off between GC
> > > > > and VM before and after processing weak references, update the
> > > > > generational or semispace related CL flags, and also use the GC to
> > > > > undo or rescue CL instances that may come alive due to the
> > > > > generational flag processing.
> > > > >
> > > > >
> > > > >
> > > > > > >2- You do have overhead even on minor collections.  With my approach,
> > > > > > >you could limit the (quite similar to yours, if you put a
> > > > > > >class-loader/NULL pointer in the vtable) overhead only to selected GC
> > > > > > >cycles.
> > > > >
> > > > >
> > > > > I think the main advantage of the vtable object approach is that it
> > > > > is somewhat elegant and natural, if one can get past the idea of non
> > > > > C vtables :-). Special casing to avoid object->vtable scans during
> > > > > minor collections etc. just breaks that. Relying on GC all the way
> > > > > forces a class unloading overhead to every GC cycle (even for the
> > > > > young generation collections). There is also a space overhead that I
> > > > > can't really estimate (proportional to class ....etc. etc.). As I
> > > > > understood it, there is no impact on MMTk based GCs, but I may be
> > > > > wrong.
> > > > > If class unloading is done at specific moments only, the refcounting
> > > > > approach does not add a perf overhead to each GC cycle, there is no
> > > > > heap overhead of the method either. But the former implies yet
> > > > > another secondary heuristic to optimally choose the class unloading
> > > > > triggers, this depends on the application profile and is not really
> > > > > once an hour/day etc.
> > > > > My guess (humbly) would be that the refcounting method "may" be
> > > > > somewhat more time/space efficient, but that's probably not the only
> > > > > issue. There is the issue of implementation correctness, existing
> > > > > code, etc. And I don't know what's the best way to go to the next
> > > > > step.
> > > > > A suggestion could be to take Harmony-2000, review it, put it in a
> > > > > branch,
> > > >
> > > > an alternative: JIT can disable guarded devirtualization via an
> > > > option. Commit the unloading, use/tune GCV5 with that option until it
> > > > supports pinning. No branch required.
> > > >
> > > > > tune and test it, wait for GCV5 to start supporting pinning, try
> > > > > with MMTk, and then integrate. If we do this, the refcounting
> > > > > approach would be a fallback for DRLVM.
> > > > > We need to decide on next steps, we cannot debate the algorithm
> > > > > forever :-)
> 

-- 
Egor Pasko

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Ivan Volosyuk <iv...@gmail.com>.

In current GCv4.1 implementation there is an assumption that vtables
will not move. It is used in compaction algorithm. Strictly speaking,
the only thing I need is to distinguish objects and vtables during
allocation. If so, one of GC algorithms may treat vtables as pinned
objects, while another could make use of the ability to move the
vtables. I already have one idea how to benefit from movable vtables.
--
Ivan

On 03 Nov 2006 14:34:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
> On the 0x214 day of Apache Harmony Aleksey Ignatenko wrote:
> > Egor,
> >
> > Vtable objects pinning is required not only by JIT, this is also required by
> > GC, which relies on that VTables are non movable. So this not a way to
> > disable guarded devirtualization. Pinning is required anyway.
>
> Sorry, but I am not aware of places, where pinning is required other
> than for JIT. If you mention one or two, that would be great for
> understanding and the next step to beat my ignorance in this subject :)
>
> > On 01 Nov 2006 10:37:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
> > >
> > > On the 0x214 day of Apache Harmony Rana Dasgupta wrote:
> > > > On 10/31/06, Etienne Gagnon <egagnon@sablevm.org > wrote:
> > > >
> > > > > >Yet:
> > > > >
> > > > > >1- You do need pinning, so you rule out some of the simplest GCs (e.g
> > > .
> > > > > >simple, non-generational copying without pinning.)  [Apparently, for
> > > > > >some very large heaps, simple copying a can be quite difficult to
> > > beat,
> > > > > >efficiency wise, if you believe some relatively recent JikesRVM
> > > related
> > > > > >paper...]
> > > >
> > > >
> > > > Yes, this was one of my  concerns about the vtable object approach. This
> > > is
> > > > limiting, but this is one specific GC requirement. (Maybe for GC's that
> > > > don't support pinning, the JIT can compare object->vtable->class for
> > > guarded
> > > > devirtualization, or even not do guarded devirtualization, sort of
> > > support
> > > > the GC in downlevel mode). For the refcounting method we need to hand
> > > off
> > > > between  GC and VM before and after processing weak references, update
> > > the
> > > > generational or semispace related CL flags, and also use the GC to undo
> > > or
> > > > rescue CL instances that may come alive due to the generational flag
> > > > processing.
> > > >
> > > >
> > > >
> > > > > >2- You do have overhead even on minor collections.  With my approach,
> > > > > >you could limit the (quite similar to yours, if you put a
> > > > > >class-loader/NULL pointer in the vtable) overhead only to selected GC
> > > > > >cycles.
> > > >
> > > >
> > > > I think the main advantage of the vtable object approach is that it is
> > > > somewhat elegant and natural, if one can get past the idea of non C
> > > vtables
> > > > :-). Special casing to avoid object->vtable scans during minor
> > > collections
> > > > etc. just breaks that. Relying on GC all the way forces a class
> > > unloading
> > > > overhead to every GC cycle( even for the young generation collections ).
> > > > There is also a space overhead that I can't really estimate(
> > > proportional to
> > > > class ....etc. etc.). As I understood it, there is no impact on MMTk
> > > based
> > > > GC's, but I may be wrong.
> > > > If class unloading is done at specific moments only, the refcounting
> > > > approach does not add a perf overhead to each GC cycle, there is no heap
> > > > overhead of the method either. But the former implies yet another
> > > > secondary heuristic to optimally choose the class unloading triggers,
> > > this
> > > > depends on the application profile and is not really once an hour/day
> > > etc.
> > > > My guess( humbly ) would be that the refcounting method "may" be
> > > somewhat
> > > > more time/space efficient, but that's probably not the only issue. There
> > > is
> > > > the issue of implementation correctness, existing code, etc. And I don't
> > > > know what's the best way to go to the next step.
> > > > A suggestion could be to take Harmony-2000, review it, put it in a
> > > > branch,
> > >
> > > an alternative: JIT can disable guarded devirtualization via an
> > > option. Commit the unloading, use/tune GCV5 with that option until it
> > > supports pinning. No branch required.
> > >
> > > > tune and test it , wait for GCV5 to start supporting pinning, try with
> > > MMTk,
> > > > and then integrate. If we do this, the refcounting approach would be a
> > > > fallback for DRLVM.
> > > > We need to decide on next steps, we cannot debate the algorithm forever
> > > :-)

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x214 day of Apache Harmony Aleksey Ignatenko wrote:
> Egor,
> 
> Vtable objects pinning is required not only by JIT, this is also required by
> GC, which relies on that VTables are non movable. So this not a way to
> disable guarded devirtualization. Pinning is required anyway.

Sorry, but I am not aware of any place where pinning is required other
than for the JIT. If you mention one or two, that would be great for my
understanding and a next step in beating my ignorance on this subject :)

> On 01 Nov 2006 10:37:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
> >
> > On the 0x214 day of Apache Harmony Rana Dasgupta wrote:
> > > On 10/31/06, Etienne Gagnon <egagnon@sablevm.org > wrote:
> > >
> > > > >Yet:
> > > >
> > > > >1- You do need pinning, so you rule out some of the simplest GCs (e.g
> > .
> > > > >simple, non-generational copying without pinning.)  [Apparently, for
> > > > >some very large heaps, simple copying a can be quite difficult to
> > beat,
> > > > >efficiency wise, if you believe some relatively recent JikesRVM
> > related
> > > > >paper...]
> > >
> > >
> > > Yes, this was one of my  concerns about the vtable object approach. This
> > is
> > > limiting, but this is one specific GC requirement. (Maybe for GC's that
> > > don't support pinning, the JIT can compare object->vtable->class for
> > guarded
> > > devirtualization, or even not do guarded devirtualization, sort of
> > support
> > > the GC in downlevel mode). For the refcounting method we need to hand
> > off
> > > between  GC and VM before and after processing weak references, update
> > the
> > > generational or semispace related CL flags, and also use the GC to undo
> > or
> > > rescue CL instances that may come alive due to the generational flag
> > > processing.
> > >
> > >
> > >
> > > > >2- You do have overhead even on minor collections.  With my approach,
> > > > >you could limit the (quite similar to yours, if you put a
> > > > >class-loader/NULL pointer in the vtable) overhead only to selected GC
> > > > >cycles.
> > >
> > >
> > > I think the main advantage of the vtable object approach is that it is
> > > somewhat elegant and natural, if one can get past the idea of non C
> > vtables
> > > :-). Special casing to avoid object->vtable scans during minor
> > collections
> > > etc. just breaks that. Relying on GC all the way forces a class
> > unloading
> > > overhead to every GC cycle( even for the young generation collections ).
> > > There is also a space overhead that I can't really estimate(
> > proportional to
> > > class ....etc. etc.). As I understood it, there is no impact on MMTk
> > based
> > > GC's, but I may be wrong.
> > > If class unloading is done at specific moments only, the refcounting
> > > approach does not add a perf overhead to each GC cycle, there is no heap
> > > overhead of the method either. But the former implies yet another
> > > secondary heuristic to optimally choose the class unloading triggers,
> > this
> > > depends on the application profile and is not really once an hour/day
> > etc.
> > > My guess( humbly ) would be that the refcounting method "may" be
> > somewhat
> > > more time/space efficient, but that's probably not the only issue. There
> > is
> > > the issue of implementation correctness, existing code, etc. And I don't
> > > know what's the best way to go to the next step.
> > > A suggestion could be to take Harmony-2000, review it, put it in a
> > > branch,
> >
> > an alternative: JIT can disable guarded devirtualization via an
> > option. Commit the unloading, use/tune GCV5 with that option until it
> > supports pinning. No branch required.
> >
> > > tune and test it , wait for GCV5 to start supporting pinning, try with
> > MMTk,
> > > and then integrate. If we do this, the refcounting approach would be a
> > > fallback for DRLVM.
> > > We need to decide on next steps, we cannot debate the algorithm forever
> > :-)
> >
> > --
> > Egor Pasko, Intel Managed Runtime Division
> >
> >

-- 
Egor Pasko

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Aleksey Ignatenko <al...@gmail.com>.

Egor,

Pinning of vtable objects is required not only by the JIT; it is also
required by the GC, which relies on VTables being non-movable. So disabling
guarded devirtualization is not a way out. Pinning is required anyway.

Aleksey.


On 01 Nov 2006 10:37:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
>
> On the 0x214 day of Apache Harmony Rana Dasgupta wrote:
> > On 10/31/06, Etienne Gagnon <egagnon@sablevm.org > wrote:
> >
> > > >Yet:
> > >
> > > >1- You do need pinning, so you rule out some of the simplest GCs (e.g
> .
> > > >simple, non-generational copying without pinning.)  [Apparently, for
> > > >some very large heaps, simple copying a can be quite difficult to
> beat,
> > > >efficiency wise, if you believe some relatively recent JikesRVM
> related
> > > >paper...]
> >
> >
> > Yes, this was one of my  concerns about the vtable object approach. This
> is
> > limiting, but this is one specific GC requirement. (Maybe for GC's that
> > don't support pinning, the JIT can compare object->vtable->class for
> guarded
> > devirtualization, or even not do guarded devirtualization, sort of
> support
> > the GC in downlevel mode). For the refcounting method we need to hand
> off
> > between  GC and VM before and after processing weak references, update
> the
> > generational or semispace related CL flags, and also use the GC to undo
> or
> > rescue CL instances that may come alive due to the generational flag
> > processing.
> >
> >
> >
> > > >2- You do have overhead even on minor collections.  With my approach,
> > > >you could limit the (quite similar to yours, if you put a
> > > >class-loader/NULL pointer in the vtable) overhead only to selected GC
> > > >cycles.
> >
> >
> > I think the main advantage of the vtable object approach is that it is
> > somewhat elegant and natural, if one can get past the idea of non C
> vtables
> > :-). Special casing to avoid object->vtable scans during minor
> collections
> > etc. just breaks that. Relying on GC all the way forces a class
> unloading
> > overhead to every GC cycle( even for the young generation collections ).
> > There is also a space overhead that I can't really estimate(
> proportional to
> > class ....etc. etc.). As I understood it, there is no impact on MMTk
> based
> > GC's, but I may be wrong.
> > If class unloading is done at specific moments only, the refcounting
> > approach does not add a perf overhead to each GC cycle, there is no heap
> > overhead of the method either. But the former implies yet another
> > secondary heuristic to optimally choose the class unloading triggers,
> this
> > depends on the application profile and is not really once an hour/day
> etc.
> > My guess( humbly ) would be that the refcounting method "may" be
> somewhat
> > more time/space efficient, but that's probably not the only issue. There
> is
> > the issue of implementation correctness, existing code, etc. And I don't
> > know what's the best way to go to the next step.
> > A suggestion could be to take Harmony-2000, review it, put it in a
> > branch,
>
> an alternative: JIT can disable guarded devirtualization via an
> option. Commit the unloading, use/tune GCV5 with that option until it
> supports pinning. No branch required.
>
> > tune and test it , wait for GCV5 to start supporting pinning, try with
> MMTk,
> > and then integrate. If we do this, the refcounting approach would be a
> > fallback for DRLVM.
> > We need to decide on next steps, we cannot debate the algorithm forever
> :-)
>
> --
> Egor Pasko, Intel Managed Runtime Division
>
>

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Ivan Volosyuk <iv...@gmail.com>.

On 11/2/06, Xiao-Feng Li <xi...@gmail.com> wrote:
> On 11/2/06, Robin Garner <ro...@anu.edu.au> wrote:
> > Xiao-Feng Li wrote:
> > > On 11/1/06, Mikhail Fursov <mi...@gmail.com> wrote:
> > >> On 01 Nov 2006 16:05:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
> > >> >
> > >> > On the 0x214 day of Apache Harmony Mikhail Fursov wrote:
> > >> > > On 01 Nov 2006 15:56:28 +0600, Egor Pasko <eg...@gmail.com>
> > >> wrote:
> > >> > > >
> > >> > > > agreed. not patching .. just reporting 'golden' VTable refs to
> > >> GC, am
> > >> > > > I right?
> > >> > > >
> > >> > > Yes, and everytime we report it to GC and GC moves an object - it
> > >> > patches
> > >> > > the address we report.
> > >> >
> > >> > so, by saying "patching" you insist to put immediate operands into
> > >> > instructions? That's probably worth it, but it extends the JIT<->GC
> > >> > interface. How about making a simple operand (reg/mem) as the first
> > >> step?
> > >>
> > >>
> > >> Egor, I thinks this is slightly more complicated problem. If vtable
> > >> object
> > >> is moved we must update all devirtualization points in every method
> > >> compiled
> > >> before. It can require an extension of JIT<->VM<->GC interface.
> > >> Another solution I see is to collect info about all devirtualization
> > >> points
> > >> in JIT (code addrs) and report these addresses as enumeration roots.
> > >> This is
> > >> JIT-only solution, and disadvantage is a significant (~hot methods count)
> > >> increase of number of objects reported.
> > >>
> > >> On the other hand I see no reasons to unpin vtables in the nearest future
> > >> (Let's GC guru correct me). If you use special (freelist-type ?)
> > >> allocator
> > >> in GC the memory fragmentation when unloading pinned vtable objects
> > >> could be
> > >> low.
> > >
> > > Yes, vtable should never be moved except for very weird reason. And
> > > yes, to pin certain amount of objects is not a big performance issue
> > > (in both temporal and spatial wise).
> > >
> > > -xiaofeng
> > >
> > >> --
> > >> Mikhail Fursov
> > >>
> > >>
> >
> > In MMTk, this kind of 'pinning' is an allocation-time policy decision of
> > the type I was advocating in the GC helpers thread.  Once a GC allows
> > for the idea of supporting multiple collection policies (which
> > generational GC requires in any case), then adding a non-moving space to
> > a memory manager is easy.
> >
> > Most memory managers will have a non-moving large object space no matter
> >   what the primary collection policy is.  The DRLVM collectors have this
> > too, don't they ?
> > Pinning an object after allocation is a harder problem, but not
> > something required in this case.
>
> Yes, I agree with all what you said. And DRLVM GCv4/v5 doesn't move
> large objects at the moment.

GCv4.1 does. There is no problem supporting pinned allocation here anyway.

-- 
Ivan
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Xiao-Feng Li <xi...@gmail.com>.

On 11/2/06, Robin Garner <ro...@anu.edu.au> wrote:
> Xiao-Feng Li wrote:
> > On 11/1/06, Mikhail Fursov <mi...@gmail.com> wrote:
> >> On 01 Nov 2006 16:05:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
> >> >
> >> > On the 0x214 day of Apache Harmony Mikhail Fursov wrote:
> >> > > On 01 Nov 2006 15:56:28 +0600, Egor Pasko <eg...@gmail.com>
> >> wrote:
> >> > > >
> >> > > > agreed. not patching .. just reporting 'golden' VTable refs to
> >> GC, am
> >> > > > I right?
> >> > > >
> >> > > Yes, and everytime we report it to GC and GC moves an object - it
> >> > patches
> >> > > the address we report.
> >> >
> >> > so, by saying "patching" you insist to put immediate operands into
> >> > instructions? That's probably worth it, but it extends the JIT<->GC
> >> > interface. How about making a simple operand (reg/mem) as the first
> >> step?
> >>
> >>
> >> Egor, I thinks this is slightly more complicated problem. If vtable
> >> object
> >> is moved we must update all devirtualization points in every method
> >> compiled
> >> before. It can require an extension of JIT<->VM<->GC interface.
> >> Another solution I see is to collect info about all devirtualization
> >> points
> >> in JIT (code addrs) and report these addresses as enumeration roots.
> >> This is
> >> JIT-only solution, and disadvantage is a significant (~hot methods count)
> >> increase of number of objects reported.
> >>
> >> On the other hand I see no reasons to unpin vtables in the nearest future
> >> (Let's GC guru correct me). If you use special (freelist-type ?)
> >> allocator
> >> in GC the memory fragmentation when unloading pinned vtable objects
> >> could be
> >> low.
> >
> > Yes, vtable should never be moved except for very weird reason. And
> > yes, to pin certain amount of objects is not a big performance issue
> > (in both temporal and spatial wise).
> >
> > -xiaofeng
> >
> >> --
> >> Mikhail Fursov
> >>
> >>
>
> In MMTk, this kind of 'pinning' is an allocation-time policy decision of
> the type I was advocating in the GC helpers thread.  Once a GC allows
> for the idea of supporting multiple collection policies (which
> generational GC requires in any case), then adding a non-moving space to
> a memory manager is easy.
>
> Most memory managers will have a non-moving large object space no matter
>   what the primary collection policy is.  The DRLVM collectors have this
> too, don't they ?
> Pinning an object after allocation is a harder problem, but not
> something required in this case.

Yes, I agree with everything you said. And DRLVM GCv4/v5 doesn't move
large objects at the moment.

Thanks,
xiaofeng

> cheers
>
>

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Robin Garner <ro...@anu.edu.au>.

Xiao-Feng Li wrote:
> On 11/1/06, Mikhail Fursov <mi...@gmail.com> wrote:
>> On 01 Nov 2006 16:05:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
>> >
>> > On the 0x214 day of Apache Harmony Mikhail Fursov wrote:
>> > > On 01 Nov 2006 15:56:28 +0600, Egor Pasko <eg...@gmail.com> 
>> wrote:
>> > > >
>> > > > agreed. not patching .. just reporting 'golden' VTable refs to 
>> GC, am
>> > > > I right?
>> > > >
>> > > Yes, and everytime we report it to GC and GC moves an object - it
>> > patches
>> > > the address we report.
>> >
>> > so, by saying "patching" you insist to put immediate operands into
>> > instructions? That's probably worth it, but it extends the JIT<->GC
>> > interface. How about making a simple operand (reg/mem) as the first 
>> step?
>>
>>
>> Egor, I thinks this is slightly more complicated problem. If vtable 
>> object
>> is moved we must update all devirtualization points in every method 
>> compiled
>> before. It can require an extension of JIT<->VM<->GC interface.
>> Another solution I see is to collect info about all devirtualization 
>> points
>> in JIT (code addrs) and report these addresses as enumeration roots. 
>> This is
>> JIT-only solution, and disadvantage is a significant (~hot methods count)
>> increase of number of objects reported.
>>
>> On the other hand I see no reasons to unpin vtables in the nearest future
>> (Let's GC guru correct me). If you use special (freelist-type ?) 
>> allocator
>> in GC the memory fragmentation when unloading pinned vtable objects 
>> could be
>> low.
> 
> Yes, vtable should never be moved except for very weird reason. And
> yes, to pin certain amount of objects is not a big performance issue
> (in both temporal and spatial wise).
> 
> -xiaofeng
> 
>> -- 
>> Mikhail Fursov
>>
>>

In MMTk, this kind of 'pinning' is an allocation-time policy decision of 
the type I was advocating in the GC helpers thread.  Once a GC allows 
for the idea of supporting multiple collection policies (which 
generational GC requires in any case), then adding a non-moving space to 
a memory manager is easy.

Most memory managers will have a non-moving large object space no matter
what the primary collection policy is.  The DRLVM collectors have this
too, don't they?

Pinning an object after allocation is a harder problem, but not 
something required in this case.

cheers

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Xiao-Feng Li <xi...@gmail.com>.

On 11/1/06, Mikhail Fursov <mi...@gmail.com> wrote:
> On 01 Nov 2006 16:05:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
> >
> > On the 0x214 day of Apache Harmony Mikhail Fursov wrote:
> > > On 01 Nov 2006 15:56:28 +0600, Egor Pasko <eg...@gmail.com> wrote:
> > > >
> > > > agreed. not patching .. just reporting 'golden' VTable refs to GC, am
> > > > I right?
> > > >
> > > Yes, and everytime we report it to GC and GC moves an object - it
> > patches
> > > the address we report.
> >
> > so, by saying "patching" you insist to put immediate operands into
> > instructions? That's probably worth it, but it extends the JIT<->GC
> > interface. How about making a simple operand (reg/mem) as the first step?
>
>
> Egor, I thinks this is slightly more complicated problem. If vtable object
> is moved we must update all devirtualization points in every method compiled
> before. It can require an extension of JIT<->VM<->GC interface.
> Another solution I see is to collect info about all devirtualization points
> in JIT (code addrs) and report these addresses as enumeration roots. This is
> JIT-only solution, and disadvantage is a significant (~hot methods count)
> increase of number of objects reported.
>
> On the other hand I see no reasons to unpin vtables in the nearest future
> (Let's GC guru correct me). If you use special (freelist-type ?) allocator
> in GC the memory fragmentation when unloading pinned vtable objects could be
> low.

Yes, vtables should never be moved except for some very weird reason. And
yes, pinning a certain number of objects is not a big performance issue
(in either time or space).

-xiaofeng

> --
> Mikhail Fursov
>
>

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Mikhail Fursov <mi...@gmail.com>.

On 01 Nov 2006 16:05:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
>
> On the 0x214 day of Apache Harmony Mikhail Fursov wrote:
> > On 01 Nov 2006 15:56:28 +0600, Egor Pasko <eg...@gmail.com> wrote:
> > >
> > > agreed. not patching .. just reporting 'golden' VTable refs to GC, am
> > > I right?
> > >
> > Yes, and everytime we report it to GC and GC moves an object - it
> patches
> > the address we report.
>
> so, by saying "patching" you insist to put immediate operands into
> instructions? That's probably worth it, but it extends the JIT<->GC
> interface. How about making a simple operand (reg/mem) as the first step?

Egor, I think this is a slightly more complicated problem. If a vtable object
is moved we must update all devirtualization points in every method compiled
before. This may require an extension of the JIT<->VM<->GC interface.
Another solution I see is to collect info about all devirtualization points
in the JIT (code addrs) and report these addresses as enumeration roots. This
is a JIT-only solution, and its disadvantage is a significant (~hot methods
count) increase in the number of objects reported.

On the other hand, I see no reason to unpin vtables in the near future
(let the GC gurus correct me). If you use a special (freelist-type?) allocator
in the GC, the memory fragmentation from unloading pinned vtable objects
could be low.
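
The JIT-only variant could look roughly like the sketch below: the JIT
remembers every code location that embeds a vtable address, and each such
location is rewritten when the vtable moves. This is only an illustration
under simplifying assumptions (jitted code is modelled as a list of words);
CodeBuffer, DevirtSiteRegistry and vtable_moved are hypothetical names, not
part of any existing JIT or GC interface.

```python
class CodeBuffer:
    """Stand-in for jitted code: a list of words, some holding vtable addresses."""

    def __init__(self, words):
        self.words = list(words)


class DevirtSiteRegistry:
    """Records (code buffer, offset) pairs whose word is an embedded vtable address."""

    def __init__(self):
        self.sites = []

    def record(self, code, offset):
        """Called by the JIT each time it emits a devirtualization guard."""
        self.sites.append((code, offset))

    def vtable_moved(self, old_addr, new_addr):
        """Patch every recorded site that still holds old_addr; return the count."""
        patched = 0
        for code, offset in self.sites:
            if code.words[offset] == old_addr:
                code.words[offset] = new_addr
                patched += 1
        return patched
```

The number of recorded sites grows with the number of guards emitted, which
is the cost mentioned above.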

-- 
Mikhail Fursov

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x214 day of Apache Harmony Mikhail Fursov wrote:
> On 01 Nov 2006 15:56:28 +0600, Egor Pasko <eg...@gmail.com> wrote:
> >
> > agreed. not patching .. just reporting 'golden' VTable refs to GC, am
> > I right?
> >
> Yes, and everytime we report it to GC and GC moves an object - it patches
> the address we report.

so, by saying "patching" you insist on putting immediate operands into
instructions? That's probably worth it, but it extends the JIT<->GC
interface. How about using a simple operand (reg/mem) as the first step?
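
To make the reg/mem variant concrete, here is a rough sketch in which the
guard loads the expected vtable address from one memory slot per class, and
only that slot is reported to the GC. All names (VTableSlot, Obj,
guard_passes, gc_move_vtable) are hypothetical, and the GC is reduced to a
single function for illustration.

```python
class VTableSlot:
    """One memory cell per class holding the current vtable address."""

    def __init__(self, addr):
        self.addr = addr


class Obj:
    """Object header reduced to its vtable address."""

    def __init__(self, vtable_addr):
        self.vtable_addr = vtable_addr


def guard_passes(obj, slot):
    """Guard with a memory operand: load the expected address, then compare."""
    return obj.vtable_addr == slot.addr


def gc_move_vtable(slot, objects, new_addr):
    """The GC updates the single reported slot and the headers of live objects."""
    old_addr = slot.addr
    slot.addr = new_addr
    for obj in objects:
        if obj.vtable_addr == old_addr:
            obj.vtable_addr = new_addr
```

No jitted code is rewritten here; the price is one extra memory load in
every guard.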

-- 
Egor Pasko, Intel Managed Runtime Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Mikhail Fursov <mi...@gmail.com>.

On 01 Nov 2006 15:56:28 +0600, Egor Pasko <eg...@gmail.com> wrote:
>
> agreed. not patching .. just reporting 'golden' VTable refs to GC, am
> I right?
>
Yes, and every time we report it to the GC and the GC moves an object, it
patches the address we report.


-- 
Mikhail Fursov

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x214 day of Apache Harmony Mikhail Fursov wrote:
> On 11/1/06, Rana Dasgupta <rd...@gmail.com> wrote:
> >
> > Maybe for GC's that don't support pinning, the JIT can compare
> > object->vtable->class for guarded
> > devirtualization, or even not do guarded devirtualization, sort of
> > support
> > the GC in downlevel mode
> 
> 
> I think this is not a long term solution for a JIT. IMO the best solutions
> for a JIT with unpinned vtables would be
> 1) Short term: turn devirtualization off (As Egor has proposed)
> 2) Long term: patch devirtualization calls when GC moves object (usual
> enumeration routine)

agreed. not patching .. just reporting 'golden' VTable refs to GC, am
I right?

> Storing vtable in the object without additional indirection in memory is
> important from the performance POV.

-- 
Egor Pasko, Intel Managed Runtime Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Mikhail Fursov <mi...@gmail.com>.

On 11/1/06, Rana Dasgupta <rd...@gmail.com> wrote:
>
> Maybe for GC's that don't support pinning, the JIT can compare
> object->vtable->class for guarded
> devirtualization, or even not do guarded devirtualization, sort of
> support
> the GC in downlevel mode

I think this is not a long-term solution for a JIT. IMO the best solutions
for a JIT with unpinned vtables would be
1) Short term: turn devirtualization off (as Egor has proposed)
2) Long term: patch devirtualization calls when the GC moves an object (usual
enumeration routine)

Storing the vtable in the object without additional indirection in memory is
important from the performance POV.
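
For readers less familiar with the technique, the two guard flavours
discussed here can be sketched as follows. This is a simplified model, not
JIT output: VTable, Obj, make_guarded_call and make_class_guarded_call are
hypothetical names, and Python object identity stands in for a raw address
comparison.

```python
class VTable:
    """Per-class method table plus a back-reference to its class."""

    def __init__(self, klass, methods):
        self.klass = klass
        self.methods = methods


class Obj:
    def __init__(self, vtable):
        self.vtable = vtable


def call_virtual(obj, name):
    """Ordinary virtual dispatch through the receiver's vtable."""
    return obj.vtable.methods[name](obj)


def make_guarded_call(expected_vtable, name):
    """Call site guarded on the vtable pointer itself (needs a stable address)."""
    direct_target = expected_vtable.methods[name]

    def call(obj):
        if obj.vtable is expected_vtable:  # guard: one pointer comparison
            return direct_target(obj)      # fast path: direct call
        return call_virtual(obj, name)     # slow path: virtual call
    return call


def make_class_guarded_call(expected_class, direct_target, name):
    """Call site guarded on object->vtable->class (one extra load per guard)."""
    def call(obj):
        if obj.vtable.klass == expected_class:
            return direct_target(obj)
        return call_virtual(obj, name)
    return call
```

The first guard breaks if the vtable is relocated without the call site
being updated; the second survives relocation at the cost of the additional
indirection.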

-- 
Mikhail Fursov

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Weldon Washburn <we...@gmail.com>.

On 10/31/06, Rana Dasgupta <rd...@gmail.com> wrote:
>
> On 10/31/06, Etienne Gagnon <egagnon@sablevm.org > wrote:
>
> > >Yet:
> >
> > >1- You do need pinning, so you rule out some of the simplest GCs (e.g.
> > >simple, non-generational copying without pinning.)  [Apparently, for
> > >some very large heaps, simple copying a can be quite difficult to beat,
> > >efficiency wise, if you believe some relatively recent JikesRVM related
> > >paper...]
>
>
> Yes, this was one of my  concerns about the vtable object approach. This
> is
> limiting, but this is one specific GC requirement. (Maybe for GC's that
> don't support pinning, the JIT can compare object->vtable->class for
> guarded
> devirtualization, or even not do guarded devirtualization, sort of
> support
> the GC in downlevel mode). For the refcounting method we need to hand off
> between  GC and VM before and after processing weak references, update the
> generational or semispace related CL flags, and also use the GC to undo or
> rescue CL instances that may come alive due to the generational flag
> processing.


> > >2- You do have overhead even on minor collections.  With my approach,
> > >you could limit the (quite similar to yours, if you put a
> > >class-loader/NULL pointer in the vtable) overhead only to selected GC
> > >cycles.
>
>
> I think the main advantage of the vtable object approach is that it is
> somewhat elegant and natural, if one can get past the idea of non C
> vtables
> :-). Special casing to avoid object->vtable scans during minor collections
> etc. just breaks that. Relying on GC all the way forces a class unloading
> overhead to every GC cycle( even for the young generation collections ).
> There is also a space overhead that I can't really estimate( proportional
> to
> class ....etc. etc.). As I understood it, there is no impact on MMTk based
> GC's, but I may be wrong.


Actually Robin Garner in the other class unloading thread ([drlvm] Class
unloading support) said minor mods to MMTk might be required.

> If class unloading is done at specific moments only, the refcounting
> approach does not add a perf overhead to each GC cycle, there is no heap
> overhead of the method either. But the former implies yet another
> secondary heuristic to optimally choose the class unloading triggers, this
> depends on the application profile and is not really once an hour/day etc.
> My guess( humbly ) would be that the refcounting method "may" be somewhat
> more time/space efficient, but that's probably not the only issue. There
> is
> the issue of implementation correctness, existing code, etc. And I don't
> know what's the best way to go to the next step.
> A suggestion could be to take Harmony-2000, review it, put it in a branch,
> tune and test it , wait for GCV5 to start supporting pinning, try with
> MMTk,
> and then integrate.



+1
I can't really visualize the changes to 40 files by looking at a diff file.
It seems inefficient for all of us to battle applying the patch simply to be
able to look at the code and set break points with the debugger.


> If we do this, the refcounting approach would be a
> fallback for DRLVM.
> We need to decide on next steps, we cannot debate the algorithm forever
> :-)







-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x214 day of Apache Harmony Rana Dasgupta wrote:
> On 10/31/06, Etienne Gagnon <egagnon@sablevm.org > wrote:
> 
> > >Yet:
> >
> > >1- You do need pinning, so you rule out some of the simplest GCs (e.g.
> > >simple, non-generational copying without pinning.)  [Apparently, for
> > >some very large heaps, simple copying a can be quite difficult to beat,
> > >efficiency wise, if you believe some relatively recent JikesRVM related
> > >paper...]
> 
> 
> Yes, this was one of my  concerns about the vtable object approach. This is
> limiting, but this is one specific GC requirement. (Maybe for GC's that
> don't support pinning, the JIT can compare object->vtable->class for guarded
> devirtualization, or even not do guarded devirtualization, sort of support
> the GC in downlevel mode). For the refcounting method we need to hand off
> between  GC and VM before and after processing weak references, update the
> generational or semispace related CL flags, and also use the GC to undo or
> rescue CL instances that may come alive due to the generational flag
> processing.
> 
> 
> 
> > >2- You do have overhead even on minor collections.  With my approach,
> > >you could limit the (quite similar to yours, if you put a
> > >class-loader/NULL pointer in the vtable) overhead only to selected GC
> > >cycles.
> 
> 
> I think the main advantage of the vtable object approach is that it is
> somewhat elegant and natural, if one can get past the idea of non C vtables
> :-). Special casing to avoid object->vtable scans during minor collections
> etc. just breaks that. Relying on GC all the way forces a class unloading
> overhead to every GC cycle( even for the young generation collections ).
> There is also a space overhead that I can't really estimate( proportional to
> class ....etc. etc.). As I understood it, there is no impact on MMTk based
> GC's, but I may be wrong.
> If class unloading is done at specific moments only, the refcounting
> approach does not add a perf overhead to each GC cycle, there is no heap
> overhead of the method either. But the former implies yet another
> secondary heuristic to optimally choose the class unloading triggers, this
> depends on the application profile and is not really once an hour/day etc.
> My guess( humbly ) would be that the refcounting method "may" be somewhat
> more time/space efficient, but that's probably not the only issue. There is
> the issue of implementation correctness, existing code, etc. And I don't
> know what's the best way to go to the next step.
> A suggestion could be to take Harmony-2000, review it, put it in a
> branch,

an alternative: the JIT can disable guarded devirtualization via an
option. Commit the unloading, use/tune GCV5 with that option until it
supports pinning. No branch required.

> tune and test it , wait for GCV5 to start supporting pinning, try with MMTk,
> and then integrate. If we do this, the refcounting approach would be a
> fallback for DRLVM.
> We need to decide on next steps, we cannot debate the algorithm forever :-)

-- 
Egor Pasko, Intel Managed Runtime Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Rana Dasgupta <rd...@gmail.com>.

On 10/31/06, Etienne Gagnon <eg...@sablevm.org> wrote:

> >Yet:
>
> >1- You do need pinning, so you rule out some of the simplest GCs (e.g.
> >simple, non-generational copying without pinning.)  [Apparently, for
> >some very large heaps, simple copying can be quite difficult to beat,
> >efficiency wise, if you believe some relatively recent JikesRVM related
> >paper...]

Yes, this was one of my concerns about the vtable object approach. It is
limiting, but it is one specific GC requirement. (Maybe for GCs that don't
support pinning, the JIT can compare object->vtable->class for guarded
devirtualization, or even skip guarded devirtualization, to sort of support
the GC in downlevel mode.) For the refcounting method we need to hand off
between GC and VM before and after processing weak references, update the
generational or semispace related CL flags, and also use the GC to undo or
rescue CL instances that may come alive due to the generational flag
processing.
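
To make the object->vtable->class comparison concrete, here is a minimal
sketch of a guarded devirtualized call site. It is only an illustration:
NativeClass, VTable, Obj and make_guarded_call are hypothetical names, not
DRLVM code, and Python object identity stands in for comparing native
pointers. The point is that the guard compares a non-moving class
descriptor reached through the vtable, so it keeps working even if the GC
moves (or copies) the vtable object itself.

```python
class NativeClass:
    """Non-moving, VM-internal class descriptor (hypothetical)."""
    def __init__(self, name):
        self.name = name


class VTable:
    """A vtable kept as a movable heap object; it points at its class."""
    def __init__(self, klass, methods):
        self.klass = klass
        self.methods = methods


class Obj:
    def __init__(self, vtable):
        self.vtable = vtable


def make_guarded_call(expected_class, method_name, inlined_body):
    """Build a call site that inlines the target for one expected class."""
    def call(obj):
        # Guard: compare the class reached through the vtable, not the
        # vtable address, so no pinned vtable is required.
        if obj.vtable.klass is expected_class:
            return inlined_body(obj)
        # Guard failed: fall back to ordinary virtual dispatch.
        return obj.vtable.methods[method_name](obj)
    return call
```

The price of this variant is one extra dereference per guard compared with
checking the vtable address directly.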

> >2- You do have overhead even on minor collections.  With my approach,
> >you could limit the (quite similar to yours, if you put a
> >class-loader/NULL pointer in the vtable) overhead only to selected GC
> >cycles.

I think the main advantage of the vtable object approach is that it is
somewhat elegant and natural, if one can get past the idea of non-C vtables
:-). Special casing to avoid object->vtable scans during minor collections
etc. just breaks that. Relying on the GC all the way adds a class unloading
overhead to every GC cycle (even for young generation collections).
There is also a space overhead that I can't really estimate (proportional to
class ....etc. etc.). As I understood it, there is no impact on MMTk based
GCs, but I may be wrong.

If class unloading is done at specific moments only, the refcounting
approach does not add a perf overhead to each GC cycle, and it has no heap
overhead either. But the former implies yet another secondary heuristic to
choose the class unloading triggers optimally; this depends on the
application profile and is not really once an hour/day etc.

My guess (humbly) would be that the refcounting method "may" be somewhat
more time/space efficient, but that's probably not the only issue. There is
the issue of implementation correctness, existing code, etc. And I don't
know what's the best way to go to the next step.

A suggestion could be to take HARMONY-2000, review it, put it in a branch,
tune and test it, wait for GCV5 to start supporting pinning, try with MMTk,
and then integrate. If we do this, the refcounting approach would be a
fallback for DRLVM.

We need to decide on next steps, we cannot debate the algorithm forever :-)

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Weldon Washburn <we...@gmail.com>.

On 10/31/06, Etienne Gagnon <eg...@sablevm.org> wrote:
>
> Yet:
>
> 1- You do need pinning, so you rule out some of the simplest GCs (e.g.
> simple, non-generational copying without pinning.)  [Apparently, for
> some very large heaps, simple copying can be quite difficult to beat,
> efficiency wise, if you believe some relatively recent JikesRVM related
> paper...]
>
> 2- You do have overhead even on minor collections.  With my approach,
> you could limit the (quite similar to yours, if you put a
> class-loader/NULL pointer in the vtable) overhead only to selected GC
> cycles.
>
> Of course, I am sure that all of the proposed approaches have their
> benefits/drawbacks.  I was simply contributing to the ongoing
> discussion.  I have no special reason to try very hard to convince you
> that "my idea is better than yours"!  I'm only joining the debate for
> trying find the most suitable solution.  I've already gained knowledge,
> from the discussion so far, that I'll be able to apply eventually in
> SableVM. :-)
>
> Maybe the best solution lies in mixing some of the various ideas
> proposed so far...


I too learned a lot from this thread.  I also suspect a better solution will
emerge from these kinds of discussions.

> Etienne
>
> Ivan Volosyuk wrote:
> > Actually, no need to add the overhead to _all_ cycles. We don't need
> > to trace the vtables everytime. On minor collections all the pinned
> > vtables can be linearly scanned, thus most expensive tracing from
> > object to vtable can be avoided in this case.
> >
>
> --
> Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
> SableVM:                                       http://www.sablevm.org/
> SableCC:                                       http://www.sablecc.org/
>
>
>


-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Etienne Gagnon <eg...@sablevm.org>.

Yet:

1- You do need pinning, so you rule out some of the simplest GCs (e.g.
simple, non-generational copying without pinning.)  [Apparently, for
some very large heaps, simple copying can be quite difficult to beat,
efficiency wise, if you believe some relatively recent JikesRVM related
paper...]

2- You do have overhead even on minor collections.  With my approach,
you could limit the (quite similar to yours, if you put a
class-loader/NULL pointer in the vtable) overhead only to selected GC
cycles.

Of course, I am sure that all of the proposed approaches have their
benefits/drawbacks.  I was simply contributing to the ongoing
discussion.  I have no special reason to try very hard to convince you
that "my idea is better than yours"!  I'm only joining the debate to
try to find the most suitable solution.  I've already gained knowledge,
from the discussion so far, that I'll be able to apply eventually in
SableVM. :-)

Maybe the best solution lies in mixing some of the various ideas
proposed so far...

Etienne

Ivan Volosyuk wrote:
> Actually, no need to add the overhead to _all_ cycles. We don't need
> to trace the vtables everytime. On minor collections all the pinned
> vtables can be linearly scanned, thus most expensive tracing from
> object to vtable can be avoided in this case.
> 

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Ivan Volosyuk <iv...@gmail.com>.

Actually, there is no need to add the overhead to _all_ cycles. We don't
need to trace the vtables every time. On minor collections all the pinned
vtables can be linearly scanned, so the most expensive tracing, from
object to vtable, can be avoided in this case.
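
A rough sketch of this idea, under simplifying assumptions: vtables are
modelled as pinned heap objects kept in one list, and VTable, Obj,
minor_collection and major_collection are hypothetical names rather than
DRLVM or GC code. Minor collections mark every pinned vtable with a linear
scan and never follow the object->vtable edge; only major collections trace
that edge and report vtables that no live object refers to.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class VTable:
    """A vtable modelled as a pinned heap object (hypothetical layout)."""
    name: str
    marked: bool = False


@dataclass(eq=False)
class Obj:
    vtable: VTable
    refs: list = field(default_factory=list)


def trace(roots, follow_vtable):
    """Visit everything reachable from roots.

    Returns the number of object->vtable edges that were followed.
    """
    seen = set()
    vtable_edges = 0
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        if follow_vtable:
            vtable_edges += 1
            obj.vtable.marked = True
        stack.extend(obj.refs)
    return vtable_edges


def minor_collection(roots, pinned_vtables):
    # Linear scan: keep every pinned vtable alive without per-object work.
    for vt in pinned_vtables:
        vt.marked = True
    return trace(roots, follow_vtable=False)


def major_collection(roots, pinned_vtables):
    for vt in pinned_vtables:
        vt.marked = False
    trace(roots, follow_vtable=True)
    # Unmarked vtables are candidates for class unloading.
    return [vt for vt in pinned_vtables if not vt.marked]
```

In a real VM the vtables would also be reachable from j.l.Class and class
loader objects; the sketch leaves those edges out to keep the contrast
between the two kinds of cycle visible.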

-- 
Ivan
Intel Enterprise Solutions Software Division

On 10/31/06, Etienne Gagnon <eg...@sablevm.org> wrote:
> Actually, I think that Java vtables would be more expensive than my
> proposed approach (when you take my proposed improvements in my reply to
> Pavel Pervov), as you add overhead to all GC cycles!  [Unless you don't
> "trace" from every visited object to its vtable?]
>
> I really don't like much the idea of an "object" vtable.  It requires
> things such as "pinning", etc.  Looks more expensive than my solution.
>
> Etienne
>
> Rana Dasgupta wrote:
> > Etienne,
> >  This is a good design, thanks. Conceptually, reference counting in the VM
> > is somewhat similar to Aleksey's proposal 1, if I understand correctly. This
> > design also requires quite a few hand-offs between the VM and GC. In DRLVM,
> > the problem is that we have quite a few GC's, not all within our control.
> >  However, it seems to me that we can either desire to make unloading
> > automatic, in which case, we will need things like java vtables etc and
> > leave most things to the GC. Or we can do refcounting or tracing in the VM,
> > and work lock step with the GC(s). I am not sure which is the better way.

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Etienne Gagnon <eg...@sablevm.org>.

Actually, I think that Java vtables would be more expensive than my
proposed approach (when you take my proposed improvements in my reply to
Pavel Pervov), as you add overhead to all GC cycles!  [Unless you don't
"trace" from every visited object to its vtable?]

I really don't much like the idea of an "object" vtable.  It requires
things such as "pinning", etc.  It looks more expensive than my solution.

Etienne

Rana Dasgupta wrote:
> Etienne,
>  This is a good design, thanks. Conceptually, reference counting in the VM
> is somewhat similar to Aleksey's proposal 1, if I understand correctly. This
> design also requires quite a few hand-offs between the VM and GC. In DRLVM,
> the problem is that we have quite a few GC's, not all within our control.
>  However, it seems to me that we can either desire to make unloading
> automatic, in which case, we will need things like java vtables etc and
> leave most things to the GC. Or we can do refcounting or tracing in the VM,
> and work lock step with the GC(s). I am not sure which is the better way.

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Rana Dasgupta <rd...@gmail.com>.

Etienne,
  This is a good design, thanks. Conceptually, reference counting in the VM
is somewhat similar to Aleksey's proposal 1, if I understand correctly. This
design also requires quite a few hand-offs between the VM and GC. In DRLVM,
the problem is that we have quite a few GCs, not all within our control.
  However, it seems to me that we can either aim to make unloading
automatic, in which case we will need things like Java vtables etc. and
leave most things to the GC, or we can do refcounting or tracing in the VM
and work in lockstep with the GC(s). I am not sure which is the better way.

Thanks,
Rana


On 10/30/06, Etienne Gagnon <eg...@sablevm.org> wrote:
>
> Hi all,
>
> Here's a more structured proposal for a simple and effective
> implementation of class unloading support.
>
> In accordance with Section 2.17.8 of the JVM spec, class unloading (and
> its related native resource cleanup) can only happen when the class
> loader instance becomes unreachable.  For this to happen, we put in
> place the following things:
>
> 1- Each class loader is represented by some VM internal structure.
> [We'll call it the "class loader structure"].
>
> 2- Each class loader internal structure, except (optionally) the
> bootstrap class loader, maintains a weak reference to an object
> instance of class ClassLoader (or some subclass).  The Java instance
> has some opaque pointer back to the internal VM structure.   The Java
> instance is usually created before the internal VM structure.  The
> instance constructor is usually in charge of creating the internal VM
> structure.  [We'll call it the "class loader instance"]
>
> 3- Each class loader instance maintains a collection of loaded classes.
> A class/interface is never removed from this collection.  This
> collection maintains "hard" (i.e. "not weak") references to
> classes/interfaces.
>
> 4- [Informative] A class loader instance is also most likely to maintain
> a collection of classes for which it has "initiated" class loading.
> This collection should use hard references (as weak references won't
> lead to earlier class loading).
>
> 5- Each class loader instance maintains a hard reference to its parent
> class loader.  This reference is (optionally) null if the parent is the
> bootstrap class loader.
>
> 6- Each j.l.Class instance maintains a hard reference to the class
> loader instance of the class loader that has loaded it.  [This is not
> the "initiating" loaders, but really the "loading" loader].
>
> 7- Each class loader structure maintains a set of boolean flags, one
> flag per "non-nursery" garbage collected area (even when thread-local
> heaps are used).  The flag is set when an instance of a class loaded by
> this class loader is moved into the related GC-area.  The flag is unset
> when the GC-area is emptied, or (optionally) when it can be determined
> that no instance of a class loaded by this class loader remains in the
> GC-area.  This is best implemented as follows: a) use an unconditional
> write of "true" in the flag every time an object is moved into the
> GC-area by the garbage collector, b) unset the related flag in "all"
> class loader structures just before collecting a GC-area, then setting
> the flag back when an object survives in the area.
>
> 8- Each method invocation frame maintains a hard reference to either its
> surrounding instance (in case of instance methods, i.e. invokevirtual,
> invokeinterface, and invokespecial) or its surrounding class
> (invokestatic).  This is already required for synchronized methods
> (it's not a good idea to allow the instance to be collected before the
> end of a synchronized instance method call; yep, learned the hard way
> in SableVM...)  So, the "overhead" is quite minimal.  The importance of
> this is in the correctness of not letting a class loader die while a
> static/instance method of a class loaded by it is still active, leading
> to premature release of native resources (such as jitted code, etc.).
>
> 9- A little magic is required to prevent premature collection of a class
> loader instance and its loaded j.l.Class instances (see [3-] above), as
> object instances do not maintain a hard reference to their j.l.Class
> instance, yet we want to preserve the correctness of Object.getClass().
>
> So, the simplest approach is to maintain a hard reference in a class
> loader structure to its class loader instance (in addition to the weak
> reference in [2-] above).  This reference is kept always set (thus
> preventing collection of the class loader instance), except when *all*
> the following conditions are met:
> a) All nurseries are empty.
> b) All GC-area flags are unset.
>
> Actually, for making this practical and preserving correctness, it's a
> little trickier.  It requires a 2-step process, much like the
> object-finalization dance.  Here's a typical example:
>
> On a major collection, where all nurseries are collected, and some (but
> not necessarily all) other GC-areas are collected, we do the following
> sequence of actions:
> a) All class loader structures are visited.  All flags related to
>   non-nursery GC-areas that we intend to collect are unset.  If this
>   leads to *all* flags to be unset, the hard reference to the class
>   loader instance is set to NULL (thus enabling, possibly, the
>   collection of the class loader instance).
>
> b) The garbage collection cycle is started and proceeds as usual.
>   Note that the work mandated in [7-] above is also done, which might
>   lead to setting back some flags in class loader structures that had
>   all their flags unset in [a)].
>
> c) After the initial garbage collection is applied, and just before
>   the usual treatment of weak references (where they are set to NULL
>   when pointing to a collected object), all class loader structures
>   are visited again.  The hard pointer of every class loader structure
>   that has any flag set is set back to point to the class loader
>   instance if it was NULL (same as how object instances are preserved
>   for finalization).
>
> d) If [c)] has triggered any change (i.e. it mandates the survival of
>   additional class loader instances that were due to die), the garbage
>   collection cycle is "extended" to rescue the additional class loader
>   instances and all objects they can reach.
>
> e) Any additional work of the garbage collection cycle is done (e.g.
>   soft, weak, and phantom references, finalization handling).
>
> f) All class loader structures are visited again.  Every structure for
>   which the weak reference has NOT been set to NULL has its hard
>   reference set to the weak reference target.  Every structure for
>   which the weak reference has been set to NULL is now ready to be
>   unloaded ( i.e. release all of its native resources, including jitted
>   code, class information, method information, vtables, and so on).
>
>
> In addition, I highly recommend using the approach proposed in Chapter 3
> of http://sablevm.org/people/egagnon/gagnon-phd.pdf for managing
> class-loader related memory.  It has many advantages:
>
> 1- No "header space" overhead for very small allocations.  [This is a
> typical "hidden" space overhead of malloc() implementations to allow
> for later free() calls].
> 2- Minimal memory fragmentation.  [Allocation only happens in large
>   blocks].
> 3- Simple and very efficient allocation.  [No overhead for complex
>   management of freeing small areas later].
> 4- Efficient freeing of large memory blocks on class unloading.
> 5- Possibility of clever usage of this memory; see Chapter 4 of the same
>   document for the implementation of sparse interface virtual tables
>   enabling invokeinterface at the simple cost of invokevirtual.  :-)
>
>
> I hope this is useful to both projects [drlvm][sablevm]  :-)
>
> Etienne
>
> (C) 2006 by Etienne M. Gagnon <eg...@sablebm.org>
> This text is licensed under the Apache License, Version 2.0.
>
> [You may add this document in svn;  I am willing to sign the required
> Apache agreement to make it so, if you intend to use it in drlvm's
> implementation].
>
> --
> Etienne M. Gagnon, Ph.D.             http://www.info2.uqam.ca/~egagnon/
> SableVM:                                       http://www.sablevm.org/
> SableCC:                                       http://www.sablecc.org/
>
>
>
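
The collection sequence a) to f) quoted above can be sketched as a small
simulation. This is only an illustration under stated assumptions, not
SableVM or DRLVM code: LoaderStruct and major_collection are hypothetical
names, the "instance" field stands in for the weak reference of point 2,
nurseries are assumed to be emptied by the major collection, and the
transitive rescue of step d) (for example of parent loaders) is left out.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class LoaderStruct:
    """VM-internal class loader structure (points 1, 2, 7 and 9)."""
    name: str
    instance: object                           # weak reference target
    flags: dict = field(default_factory=dict)  # GC-area name -> bool
    hard_ref: object = None
    unloaded: bool = False

    def __post_init__(self):
        # Point 9: the hard reference is normally kept set.
        self.hard_ref = self.instance


def major_collection(loaders, collected_areas, survivors, rooted_instances):
    """Simulate one major collection.

    survivors: (loader_struct, area) pairs, one per surviving object.
    rooted_instances: loader instances reachable from ordinary roots.
    Returns the names of the loaders that were unloaded.
    """
    # a) Clear the flags of the areas about to be collected; drop the
    #    hard reference when no flag remains set.
    for cl in loaders:
        for area in collected_areas:
            cl.flags[area] = False
        if not any(cl.flags.values()):
            cl.hard_ref = None
    # b) The collection runs; each survivor unconditionally sets the
    #    flag of the area it ends up in (point 7).
    for cl, area in survivors:
        cl.flags[area] = True
    # c) and d) Rescue loaders that still have instances somewhere.
    for cl in loaders:
        if cl.hard_ref is None and any(cl.flags.values()):
            cl.hard_ref = cl.instance
    # e) Weak reference processing: clear the weak reference when the
    #    instance is neither rescued nor otherwise reachable.
    for cl in loaders:
        if cl.hard_ref is None and cl.instance not in rooted_instances:
            cl.instance = None
    # f) Restore hard references for survivors, unload the rest.
    unloaded = []
    for cl in loaders:
        if cl.instance is not None:
            cl.hard_ref = cl.instance
        else:
            cl.unloaded = True  # release jitted code, vtables, and so on
            unloaded.append(cl.name)
    return unloaded
```

For example, with three loaders whose only flags are in an "old" area, a
collection of "old" in which only an object of the first loader survives
leaves that loader alive, unloads a loader with no survivors and no rooted
instance, and never touches a loader that still has a flag set in an
uncollected area.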

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Xiao-Feng Li <xi...@gmail.com>.

On 10/31/06, Pavel Pervov <pm...@gmail.com> wrote:
> Ignatenko vs Gagnon proposal checklist follows. :)
>
>
> > In accordance with Section 2.17.8 of the JVM spec, class unloading (and
> > its related native resource cleanup) can only happen when the class
> > loader instance becomes unreachable.  For this to happen, we put in
> > place the following things:
>
> > 1- Each class loader is represented by some VM internal structure.
> > [We'll call it the "class loader structure"].
>
>
> This is true.
>
>
>
> > 2- Each class loader internal structure, except (optionally) the
> > bootstrap class loader, maintains a weak reference to an object
> > instance of class ClassLoader (or some subclass).  The Java instance
> > has some opaque pointer back to the internal VM structure.   The Java
> > instance is usually created before the internal VM structure.  The
> > instance constructor is usually in charge of creating the internal VM
> > structure.  [We'll call it the "class loader instance"]
>
>
> This is true.
>
>
>
> > 3- Each class loader instance maintains a collection of loaded classes.
> > A class/interface is never removed from this collection.  This
> > collection maintains "hard" (i.e. "not weak") references to
> > classes/interfaces.
>
>
> This is true.
>
>
>
> > 4- [Informative] A class loader instance is also most likely to maintain
> > a collection of classes for which it has "initiated" class loading.
> > This collection should use hard references (as weak references won't
> > lead to earlier class loading).
>
>
> This is not true. Look for the thread "[drlvm] Non-bug difference
> HARMONY-1688?", where Eugene Ostrovsky described initiating loaders in
> detail, with links to the specification.
>
>
> > 5- Each class loader instance maintains a hard reference to its parent
> > class loader.  This reference is (optionally) null if the parent is the
> > bootstrap class loader.
>
>
> This is true. This is actually a part of delegation framework.
>
>
>
> > 6- Each j.l.Class instance maintains a hard reference to the class
> > loader instance of the class loader that has loaded it.  [This is not
> > the "initiating" loaders, but really the "loading" loader].
>
>
> This is true. AFAIU, this class loader is called "defining" loader for a
> class.
>
>
>
> > 7- Each class loader structure maintains a set of boolean flags, one
> > flag per "non-nursery" garbage collected area (even when thread-local
> > heaps are used).  The flag is set when an instance of a class loaded by
> > this class loader is moved into the related GC-area.  The flag is unset
> > when the GC-area is emptied, or (optionally) when it can be determined
> > that no instance of a class loaded by this class loader remains in the
> > GC-area.  This is best implemented as follows: a) use an unconditional
> > write of "true" in the flag every time an object is moved into the
> > GC-area by the garbage collector, b) unset the related flag in "all"
> > class loader structures just before collecting a GC-area, then setting
> > the flag back when an object survives in the area.
>
>
> Requires identification of the object's class type during GC. Will most
> probably degrade GC performance.

Yes, this is also my concern.

Thanks,
xiaofeng

> > 8- Each method invocation frame maintains a hard reference to either its
> > surrounding instance (in case of instance methods, i.e. invokevirtual,
> > invokeinterface, and invokespecial) or its surrounding class
> > (invokestatic).  This is already required for synchronized methods
> > (it's not a good idea to allow the instance to be collected before the
> > end of a synchronized instance method call; yep, learned the hard way
> > in SableVM...)  So, the "overhead" is quite minimal.  The importance of
> > this is in the correctness of not letting a class loader die while a
> > static/instance method of a class loaded by it is still active, leading
> > to premature release of native resources (such as jitted code, etc.).
>
>
> Not generally true for optimizing JITs. "This" (or "class") can be omitted
> from enumeration if it is not used anywhere in the code. Generally, this
> technique reduces number of registers used in the code ("register pressure"
> they call it :)).
>
>
>
> > 9- A little magic is required to prevent premature collection of a class
> > loader instance and its loaded j.l.Class instances (see [3-] above), as
> > object instances do not maintain a hard reference to their j.l.Class
> > instance, yet we want to preserve the correctness of Object.getClass().
> >
> > So, the simplest approach is to maintain a hard reference in a class
> > loader structure to its class loader instance (in addition to the weak
> > reference in [2-] above).  This reference is kept always set (thus
> > preventing collection of the class loader instance), except when *all*
> > the following conditions are met:
> > a) All nurseries are empty.
> > b) All GC-area flags are unset.
>
>
> This requires more involvement of a GC in unloading process and affects GC
> code more. In DRLVM, GC is designed to be a replaceable component. Moreover,
> we already have 3 different working GCs and MMTk on the way. So, including
> GC into the design is not a good idea for DRLVM.
>
>
> <SNIP>
>
> > In addition, I highly recommend using the approach proposed in Chapter 3
> > of http://sablevm.org/people/egagnon/gagnon-phd.pdf for managing
> > class-loader related memory.  It has many advantages:
>
>
> It is also true. Per class loader memory allocation is already used for part
> of data allocated for this class loader. Look in HARMONY-2000 which brings
> per-class loader pools to the extent.
>
> <SNIP>
>
> --
> Pavel Pervov,
> Intel Enterprise Solutions Software Division
>
>

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Weldon Washburn <we...@gmail.com>.

On 10/30/06, Pavel Pervov <pm...@gmail.com> wrote:
>
> Ignatenko vs Gagnon proposal checklist follows. :)
>
>
> > In accordance with Section 2.17.8 of the JVM spec, class unloading (and
> > its related native resource cleanup) can only happen when the class
> > loader instance becomes unreachable.  For this to happen, we put in
> > place the following things:
>
> > 1- Each class loader is represented by some VM internal structure.
> > [We'll call it the "class loader structure"].
>
>
> This is true.
>
>
>
> > 2- Each class loader internal structure, except (optionally) the
> > bootstrap class loader, maintains a weak reference to an object
> > instance of class ClassLoader (or some subclass).  The Java instance
> > has some opaque pointer back to the internal VM structure.   The Java
> > instance is usually created before the internal VM structure.  The
> > instance constructor is usually in charge of creating the internal VM
> > structure.  [We'll call it the "class loader instance"]
>
>
> This is true.
>
>
>
> > 3- Each class loader instance maintains a collection of loaded classes.
> > A class/interface is never removed from this collection.  This
> > collection maintains "hard" (i.e. "not weak") references to
> > classes/interfaces.
>
>
> This is true.
>
>
>
> > 4- [Informative] A class loader instance is also most likely to maintain
> > a collection of classes for which it has "initiated" class loading.
> > This collection should use hard references (as weak references won't
> > lead to earlier class loading).
>
>
> This is not true. Look for the thread "[drlvm] Non-bug difference
> HARMONY-1688?", where Eugene Ostrovsky described initiating loaders in
> detail, with links to the specification.
>
>
> > 5- Each class loader instance maintains a hard reference to its parent
> > class loader.  This reference is (optionally) null if the parent is the
> > bootstrap class loader.
>
>
> This is true. This is actually a part of delegation framework.
>
>
>
> > 6- Each j.l.Class instance maintains a hard reference to the class
> > loader instance of the class loader that has loaded it.  [This is not
> > the "initiating" loaders, but really the "loading" loader].
>
>
> This is true. AFAIU, this class loader is called "defining" loader for a
> class.
>
>
>
> > 7- Each class loader structure maintains a set of boolean flags, one
> > flag per "non-nursery" garbage collected area (even when thread-local
> > heaps are used).  The flag is set when an instance of a class loaded by
> > this class loader is moved into the related GC-area.  The flag is unset
> > when the GC-area is emptied, or (optionally) when it can be determined
> > that no instance of a class loaded by this class loader remains in the
> > GC-area.  This is best implemented as follows: a) use an unconditional
> > write of "true" in the flag every time an object is moved into the
> > GC-area by the garbage collector, b) unset the related flag in "all"
> > class loader structures just before collecting a GC-area, then setting
> > the flag back when an object survives in the area.
>
>
> Requires identification of the object's class type during GC. Will most
> probably degrade GC performance.


Good point.  To know how large the performance impact is, it would have
to be measured.

> > 8- Each method invocation frame maintains a hard reference to either its
> > surrounding instance (in case of instance methods, i.e. invokevirtual,
> > invokeinterface, and invokespecial) or its surrounding class
> > (invokestatic).  This is already required for synchronized methods
> > (it's not a good idea to allow the instance to be collected before the
> > end of a synchronized instance method call; yep, learned the hard way
> > in SableVM...)  So, the "overhead" is quite minimal.  The importance of
> > this is in the correctness of not letting a class loader die while a
> > static/instance method of a class loaded by it is still active, leading
> > to premature release of native resources (such as jitted code, etc.).
>
>
> Not generally true for optimizing JITs. "This" (or "class") can be omitted
> from enumeration if it is not used anywhere in the code. Generally, this
> technique reduces number of registers used in the code ("register pressure"
> they call it :)).


Good point.  If a JIT inlines a method that makes zero reference to "this",
there may not be a way of identifying the class involved.

> > 9- A little magic is required to prevent premature collection of a class
> > loader instance and its loaded j.l.Class instances (see [3-] above), as
> > object instances do not maintain a hard reference to their j.l.Class
> > instance, yet we want to preserve the correctness of Object.getClass().
> >
> > So, the simplest approach is to maintain a hard reference in a class
> > loader structure to its class loader instance (in addition to the weak
> > reference in [2-] above).  This reference is kept always set (thus
> > preventing collection of the class loader instance), except when *all*
> > the following conditions are met:
> > a) All nurseries are empty.
> > b) All GC-area flags are unset.
>
>
> This requires more involvement of a GC in unloading process and affects GC
> code more. In DRLVM, GC is designed to be a replaceable component. Moreover,
> we already have 3 different working GCs and MMTk on the way. So, including
> GC into the design is not a good idea for DRLVM.


Good point.

> <SNIP>
>
> > In addition, I highly recommend using the approach proposed in Chapter 3
> > of http://sablevm.org/people/egagnon/gagnon-phd.pdf for managing
> > class-loader related memory.  It has many advantages:
>
>
> It is also true. Per class loader memory allocation is already used for part
> of data allocated for this class loader. Look in HARMONY-2000 which brings
> per-class loader pools to the extent.
>
> <SNIP>
>
> --
> Pavel Pervov,
> Intel Enterprise Solutions Software Division
>
>


-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Etienne Gagnon <eg...@sablevm.org>.

>> 4- [Informative] A class loader instance is also most likely to maintain
>>...
> This is not true. Look for the thread "[drlvm] Non-bug difference
> HARMONY-1688?", where Eugene Ostrovsky described initiating loaders in
> detail, with links to the specification.

OK.

>> 7- Each class loader structure maintains a set of boolean flags, one
>>...
> Requires identification of the object's class type during GC. Will most
> probably degrade GC performance.

Not necessarily.  It really depends whether you want to "always" care
about class unloading, or if you only care about it when doing "major"
collections.  Maybe you only want to unload classes on "full
collections", when all generations are collected.  In such a case, you
would not do anything special (e.g. not maintain these bits) during any
other collection than full ones.

As for type identification, this is not necessarily required.  You only
need to add a pointer in the vtable header (That's a 1-liner in SableVM)
that points to:

1- NULL for any class of an "unloadable class loader" (e.g. bootstrap,
system?)

2- ClassLoader structure, for ones that we wish to unload (user class
loader).

Maybe that's the "big" change to the vtable that was argued about in
this thread?  If yes, the "bigness" of it was quite misleading to me;
such a change is a trivial one, to me.  In SableVM, it's really just the
following change:

1- Add a field in the vtable "struct" in file type.h  (1 line)
2- Initialize the field to non-zero for classes of non-bootstrap loader
(1 line).

No big deal...
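
As a rough illustration of the vtable field described above (a sketch
only, with hypothetical names LoaderStruct, VTable, Obj and note_survivor;
it is not the actual SableVM type.h layout), the collector can reach the
class loader structure with one dereference and one NULL check, without
identifying the object's class:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class LoaderStruct:
    """Class loader structure holding one flag per GC-area."""
    name: str
    flags: dict = field(default_factory=dict)


@dataclass(eq=False)
class VTable:
    class_name: str
    # None plays the role of NULL: the class belongs to a loader that is
    # never unloaded (e.g. bootstrap), so the collector has nothing to do.
    loader: Optional[LoaderStruct] = None


@dataclass(eq=False)
class Obj:
    vtable: VTable


def note_survivor(obj, area):
    """Called by the collector when obj is moved into (or survives in) area."""
    loader = obj.vtable.loader
    if loader is not None:
        loader.flags[area] = True  # unconditional write, no type lookup
```

Objects of bootstrap classes cost only the NULL check, while objects of
user-loaded classes additionally pay one store into the loader structure.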

As an additional optimization(???), one could set a bit in the object
header when the pointer (in the vtable) is not NULL, yet parsing the
bits might cost more than dereferencing the vtable pointer and checking
the field against NULL.  [I know, this is most probably a very bad idea!]


You could even go further and only do class unloading when a special
request is made for it.  This way, you don't do anything special during
normal collection.  When the special request is done, you do a full GC
and unload any class (and loader) you can...

I guess that some of these ideas had already been somehow discussed on
this thread; I likely did misunderstand some of the few messages I read.


>> 8- Each method invocation frame maintains a hard reference to either its
>>...
> Not generally true for optimizing JITs. "This" (or "class") can be omitted
> from enumeration if it is not used anywhere in the code. Generally, this
> technique reduces number of registers used in the code ("register pressure"
> they call it :)).

OK.  Yet, for correctness, you want to make sure that at any time you
want to unload classes, you do make sure that you take into account
classes of active methods.  This can be achieved in various ways; I was
proposing one that was natural to SableVM. :-)

>> 9- A little magic is required to prevent premature collection of a class
>>...
> This requires more involvement of the GC in the unloading process and affects GC
> code more. In DRLVM, GC is designed to be a replaceable component.
> Moreover,
> we already have 3 different working GCs and MMTk on the way. So, including
> GC into the design is not a good idea for DRLVM.

There is a dependency between GC and class unloading.  Somehow, you must
be aware if there are still instances, around, of needed classes.  You
don't need to "always" care for class unloading, while doing GC; as I
said above, you could reduce the overhead to well defined moments.  [You
could have rules such as: at full collections only, and no more than
once per 1 hour | 10 minutes | ...]
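
Such a rule can be sketched as a small policy object. This is a minimal
illustration under assumed names (UnloadPolicy, should_unload); the 10-minute
interval is just one of the example values mentioned above, not a
recommendation.

```python
class UnloadPolicy:
    """Hypothetical policy: unload only on full GCs, at most once per interval."""

    def __init__(self, min_interval_s: float):
        self.min_interval_s = min_interval_s
        self.last_unload_s = None  # time of the last unloading pass, if any

    def should_unload(self, is_full_gc: bool, now_s: float) -> bool:
        if not is_full_gc:
            return False  # other collections pay no unloading overhead
        if (self.last_unload_s is not None
                and now_s - self.last_unload_s < self.min_interval_s):
            return False  # too soon since the previous pass
        self.last_unload_s = now_s
        return True


policy = UnloadPolicy(min_interval_s=600.0)  # 10 minutes, as an example
decisions = [
    policy.should_unload(is_full_gc=False, now_s=0.0),
    policy.should_unload(is_full_gc=True, now_s=1.0),
    policy.should_unload(is_full_gc=True, now_s=300.0),
    policy.should_unload(is_full_gc=True, now_s=700.0),
]
```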


-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by Pavel Pervov <pm...@gmail.com>.

Ignatenko vs Gagnon proposal checklist follows. :)


> In accordance with Section 2.17.8 of the JVM spec, class unloading (and
> its related native resource cleanup) can only happen when the class
> loader instance becomes unreachable.  For this to happen, we put in
> place the following things:

> 1- Each class loader is represented by some VM internal structure.
> [We'll call it the "class loader structure"].


This is true.



> 2- Each class loader internal structure, except (optionally) the
> bootstrap class loader, maintains a weak reference to an object
> instance of class ClassLoader (or some subclass).  The Java instance
> has some opaque pointer back to the internal VM structure.   The Java
> instance is usually created before the internal VM structure.  The
> instance constructor is usually in charge of creating the internal VM
> structure.  [We'll call it the "class loader instance"]


This is true.



> 3- Each class loader instance maintains a collection of loaded classes.
> A class/interface is never removed from this collection.  This
> collection maintains "hard" (i.e. "not weak") references to
> classes/interfaces.


This is true.



> 4- [Informative] A class loader instance is also most likely to maintain
> a collection of classes for which it has "initiated" class loading.
> This collection should use hard references (as weak references won't
> lead to earlier class loading).


This is not true. Look for the thread "[drlvm] Non-bug difference
HARMONY-1688?", where Eugene Ostrovsky described initiating loaders in
detail with links to the specification.


> 5- Each class loader instance maintains a hard reference to its parent
> class loader.  This reference is (optionally) null if the parent is the
> bootstrap class loader.


This is true. This is actually a part of delegation framework.



> 6- Each j.l.Class instance maintains a hard reference to the class
> loader instance of the class loader that has loaded it.  [This is not
> the "initiating" loaders, but really the "loading" loader].


This is true. AFAIU, this class loader is called "defining" loader for a
class.



> 7- Each class loader structure maintains a set of boolean flags, one
> flag per "non-nursery" garbage collected area (even when thread-local
> heaps are used).  The flag is set when an instance of a class loaded by
> this class loader is moved into the related GC-area.  The flag is unset
> when the GC-area is emptied, or (optionally) when it can be determined
> that no instance of a class loaded by this class loader remains in the
> GC-area.  This is best implemented as follows: a) use an unconditional
> write of "true" in the flag every time an object is moved into the
> GC-area by the garbage collector, b) unset the related flag in "all"
> class loader structures just before collecting a GC-area, then setting
> the flag back when an object survives in the area.


Requires identification of an object's class type during GC. Will most
probably degrade GC performance.



> 8- Each method invocation frame maintains a hard reference to either its
> surrounding instance (in case of instance methods, i.e. invokevirtual,
> invokeinterface, and invokespecial) or its surrounding class
> (invokestatic).  This is already required for synchronized methods
> (it's not a good idea to allow the instance to be collected before the
> end of a synchronized instance method call; yep, learned the hard way
> in SableVM...)  So, the "overhead" is quite minimal.  The importance of
> this is in the correctness of not letting a class loader die while a
> static/instance method of a class loaded by it is still active, leading
> to premature release of native resources (such as jitted code, etc.).


Not generally true for optimizing JITs. "This" (or "class") can be omitted
from enumeration if it is not used anywhere in the code. Generally, this
technique reduces number of registers used in the code ("register pressure"
they call it :)).



> 9- A little magic is required to prevent premature collection of a class
> loader instance and its loaded j.l.Class instances (see [3-] above), as
> object instances do not maintain a hard reference to their j.l.Class
> instance, yet we want to preserve the correctness of Object.getClass().
>
> So, the simplest approach is to maintain a hard reference in a class
> loader structure to its class loader instance (in addition to the weak
> reference in [2-] above).  This reference is kept always set (thus
> preventing collection of the class loader instance), except when *all*
> the following conditions are met:
> a) All nurseries are empty.
> b) All GC-area flags are unset.


This requires more involvement of the GC in the unloading process and affects GC
code more. In DRLVM, GC is designed to be a replaceable component. Moreover,
we already have 3 different working GCs and MMTk on the way. So, including
GC into the design is not a good idea for DRLVM.


<SNIP>

> In addition, I highly recommend using the approach proposed in Chapter 3
> of http://sablevm.org/people/egagnon/gagnon-phd.pdf for managing
> class-loader related memory.  It has many advantages:


It is also true. Per class loader memory allocation is already used for part
of data allocated for this class loader. Look in HARMONY-2000 which brings
per-class loader pools to the extent.

<SNIP>

-- 
Pavel Pervov,
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by Weldon Washburn <we...@gmail.com>.

I like it.  I don't fully understand the fine details yet.  But overall it
seems to be a clean design.  Maybe it makes sense for someone to prototype
this in drlvm.

On 10/30/06, Etienne Gagnon <eg...@sablevm.org> wrote:
>
> Hi all,
>
> Here's a more structured proposal for a simple and effective
> implementation of class unloading support.
>
> In accordance with Section 2.17.8 of the JVM spec, class unloading (and
> its related native resource cleanup) can only happen when the class
> loader instance becomes unreachable.  For this to happen, we put in
> place the following things:
>
> 1- Each class loader is represented by some VM internal structure.
> [We'll call it the "class loader structure"].
>
> 2- Each class loader internal structure, except (optionally) the
> bootstrap class loader, maintains a weak reference to an object
> instance of class ClassLoader (or some subclass).  The Java instance
> has some opaque pointer back to the internal VM structure.   The Java
> instance is usually created before the internal VM structure.  The
> instance constructor is usually in charge of creating the internal VM
> structure.  [We'll call it the "class loader instance"]
>
> 3- Each class loader instance maintains a collection of loaded classes.
> A class/interface is never removed from this collection.  This
> collection maintains "hard" (i.e. "not weak") references to
> classes/interfaces.
>
> 4- [Informative] A class loader instance is also most likely to maintain
> a collection of classes for which it has "initiated" class loading.
> This collection should use hard references (as weak references won't
> lead to earlier class loading).
>
> 5- Each class loader instance maintains a hard reference to its parent
> class loader.  This reference is (optionally) null if the parent is the
> bootstrap class loader.
>
> 6- Each j.l.Class instance maintains a hard reference to the class
> loader instance of the class loader that has loaded it.  [This is not
> the "initiating" loaders, but really the "loading" loader].
>
> 7- Each class loader structure maintains a set of boolean flags, one
> flag per "non-nursery" garbage collected area (even when thread-local
> heaps are used).  The flag is set when an instance of a class loaded by
> this class loader is moved into the related GC-area.  The flag is unset
> when the GC-area is emptied, or (optionally) when it can be determined
> that no instance of a class loaded by this class loader remains in the
> GC-area.  This is best implemented as follows: a) use an unconditional
> write of "true" in the flag every time an object is moved into the
> GC-area by the garbage collector, b) unset the related flag in "all"
> class loader structures just before collecting a GC-area, then setting
> the flag back when an object survives in the area.
>
> 8- Each method invocation frame maintains a hard reference to either its
> surrounding instance (in case of instance methods, i.e. invokevirtual,
> invokeinterface, and invokespecial) or its surrounding class
> (invokestatic).  This is already required for synchronized methods
> (it's not a good idea to allow the instance to be collected before the
> end of a synchronized instance method call; yep, learned the hard way
> in SableVM...)  So, the "overhead" is quite minimal.  The importance of
> this is in the correctness of not letting a class loader die while a
> static/instance method of a class loaded by it is still active, leading
> to premature release of native resources (such as jitted code, etc.).
>
> 9- A little magic is required to prevent premature collection of a class
> loader instance and its loaded j.l.Class instances (see [3-] above), as
> object instances do not maintain a hard reference to their j.l.Class
> instance, yet we want to preserve the correctness of Object.getClass().
>
> So, the simplest approach is to maintain a hard reference in a class
> loader structure to its class loader instance (in addition to the weak
> reference in [2-] above).  This reference is kept always set (thus
> preventing collection of the class loader instance), except when *all*
> the following conditions are met:
> a) All nurseries are empty.
> b) All GC-area flags are unset.
>
> Actually, for making this practical and preserving correctness, it's a
> little trickier.  It requires a 2-step process, much like the
> object-finalization dance.  Here's a typical example:
>
> On a major collection, where all nurseries are collected, and some (but
> not necessarily all) other GC-areas are collected, we do the following
> sequence of actions:
> a) All class loader structures are visited.  All flags related to
>   non-nursery GC-areas that we intend to collect are unset.  If this
>   leads to *all* flags to be unset, the hard reference to the class
>   loader instance is set to NULL (thus enabling, possibly, the
>   collection of the class loader instance).
>
> b) The garbage collection cycle is started and proceeds as usual.
>   Note that the work mandated in [7-] above is also done, which might
>   lead to setting back some flags in class loader structures that had
>   all their flags unset in [a)].
>
> c) After the initial garbage collection is applied, and just before
>   the usual treatment of weak references (where they are set to NULL
>   when pointing to a collected object), all class loader structures
>   are visited again.  The hard pointer of every class loader structure
>   that has any flag set is set back to point to the class loader
>   instance if it was NULL (same as how object instances are preserved
>   for finalization).
>
> d) If [c)] has triggered any change (i.e. it mandates the survival of
>   additional class loader instances that were due to die), the garbage
>   collection cycle is "extended" to rescue the additional class loader
>   instances and all objects they can reach.
>
> e) Any additional work of the garbage collection cycle is done (e.g.
>   soft, weak, and phantom references, finalization handling).
>
> f) All class loader structures are visited again.  Every structure for
>   which the weak reference has NOT been set to NULL has its hard
>   reference set to the weak reference target.  Every structure for
>   which the weak reference has been set to NULL is now ready to be
>   unloaded (i.e. release all of its native resources, including jitted
>   code, class information, method information, vtables, and so on).
>
>
> In addition, I highly recommend using the approach proposed in Chapter 3
> of http://sablevm.org/people/egagnon/gagnon-phd.pdf for managing
> class-loader related memory.  It has many advantages:
>
> 1- No "header space" overhead for very small allocations.  [This is a
> typical "hidden" space overhead of malloc() implementations to allow
> for later free() calls].
> 2- Minimal memory fragmentation.  [Allocation only happens in large
>   blocks].
> 3- Simple and very efficient allocation.  [No overhead for complex
>   management of freeing small areas later].
> 4- Efficient freeing of large memory blocks on class unloading.
> 5- Possibility of clever usage of this memory; see Chapter 4 of the same
>   document for the implementation of sparse interface virtual tables
>   enabling invokeinterface at the simple cost of invokevirtual.  :-)
>
>
> I hope this is useful to both projects [drlvm][sablevm]  :-)
>
> Etienne
>
> (C) 2006 by Etienne M. Gagnon <eg...@sablebm.org>
> This text is licensed under the Apache License, Version 2.0.
>
> [You may add this document in svn;  I am willing to sign the required
> Apache agreement to make it so, if you intend to use it in drlvm's
> implementation].
>
> --
> Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
> SableVM:                                       http://www.sablevm.org/
> SableCC:                                       http://www.sablecc.org/
>
>
>


-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [admin] ICLA / ACQ (Was: [drlvm][sablevm] Design of Class Unloading Support)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

The ICLA should be faxed to the number on the document.  The ACQ can be sent
scanned.

Etienne Gagnon wrote:
> Geir Magnusson Jr. wrote:
>> However, it would be great if you had an ICLA and ACQ on file to save
>> you the trouble of typing this in the future :)  "Better living through
>> paperwork!"
> 
> OK; I should have made this a while ago...  Can they be submitted by
> email (where?) as "scanned" documents in PDF format?  This is usually
> accepted here as much as Fax documents.  [Much better quality, actually!]
> 
> Etienne
>

[admin] ICLA / ACQ (Was: [drlvm][sablevm] Design of Class Unloading Support)

Posted by Etienne Gagnon <eg...@sablevm.org>.

Geir Magnusson Jr. wrote:
> However, it would be great if you had an ICLA and ACQ on file to save
> you the trouble of typing this in the future :)  "Better living through
> paperwork!"

OK; I should have made this a while ago...  Can they be submitted by
email (where?) as "scanned" documents in PDF format?  This is usually
accepted here as much as Fax documents.  [Much better quality, actually!]

Etienne

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

Etienne Gagnon wrote:

[SNIP]

> 
> I hope this is useful to both projects [drlvm][sablevm]  :-)

This was really great - I need to go back and read it carefully.  Thanks
so much!

> 
> Etienne
> 
> (C) 2006 by Etienne M. Gagnon <eg...@sablebm.org>
> This text is licensed under the Apache License, Version 2.0.
> 
> [You may add this document in svn;  I am willing to sign the required
> Apache agreement to make it so, if you intend to use it in drlvm's
> implementation].

This isn't really necessary - by the terms of this list, anything
submitted is considered a contribution under the terms of the Apache
license and ICLA, unless noted otherwise as "NOT A CONTRIBUTION".

However, it would be great if you had an ICLA and ACQ on file to save
you the trouble of typing this in the future :)  "Better living through 
paperwork!"

geir

[drlvm][sablevm] Design of Class Unloading Support

Posted by Etienne Gagnon <eg...@sablevm.org>.

Hi all,

Here's a more structured proposal for a simple and effective
implementation of class unloading support.

In accordance with Section 2.17.8 of the JVM spec, class unloading (and
its related native resource cleanup) can only happen when the class
loader instance becomes unreachable.  For this to happen, we put in
place the following things:

1- Each class loader is represented by some VM internal structure.
 [We'll call it the "class loader structure"].

2- Each class loader internal structure, except (optionally) the
 bootstrap class loader, maintains a weak reference to an object
 instance of class ClassLoader (or some subclass).  The Java instance
 has some opaque pointer back to the internal VM structure.   The Java
 instance is usually created before the internal VM structure.  The
 instance constructor is usually in charge of creating the internal VM
 structure.  [We'll call it the "class loader instance"]

3- Each class loader instance maintains a collection of loaded classes.
 A class/interface is never removed from this collection.  This
 collection maintains "hard" (i.e. "not weak") references to
 classes/interfaces.

4- [Informative] A class loader instance is also most likely to maintain
 a collection of classes for which it has "initiated" class loading.
 This collection should use hard references (as weak references won't
 lead to earlier class loading).

5- Each class loader instance maintains a hard reference to its parent
 class loader.  This reference is (optionally) null if the parent is the
 bootstrap class loader.

6- Each j.l.Class instance maintains a hard reference to the class
 loader instance of the class loader that has loaded it.  [This is not
 the "initiating" loaders, but really the "loading" loader].

7- Each class loader structure maintains a set of boolean flags, one
 flag per "non-nursery" garbage collected area (even when thread-local
 heaps are used).  The flag is set when an instance of a class loaded by
 this class loader is moved into the related GC-area.  The flag is unset
 when the GC-area is emptied, or (optionally) when it can be determined
 that no instance of a class loaded by this class loader remains in the
 GC-area.  This is best implemented as follows: a) use an unconditional
 write of "true" in the flag every time an object is moved into the
 GC-area by the garbage collector, b) unset the related flag in "all"
 class loader structures just before collecting a GC-area, then setting
 the flag back when an object survives in the area.

8- Each method invocation frame maintains a hard reference to either its
 surrounding instance (in case of instance methods, i.e. invokevirtual,
 invokeinterface, and invokespecial) or its surrounding class
 (invokestatic).  This is already required for synchronized methods
 (it's not a good idea to allow the instance to be collected before the
 end of a synchronized instance method call; yep, learned the hard way
 in SableVM...)  So, the "overhead" is quite minimal.  The importance of
 this is in the correctness of not letting a class loader die while a
 static/instance method of a class loaded by it is still active, leading
 to premature release of native resources (such as jitted code, etc.).

9- A little magic is required to prevent premature collection of a class
 loader instance and its loaded j.l.Class instances (see [3-] above), as
  object instances do not maintain a hard reference to their j.l.Class
 instance, yet we want to preserve the correctness of Object.getClass().

 So, the simplest approach is to maintain a hard reference in a class
 loader structure to its class loader instance (in addition to the weak
 reference in [2-] above).  This reference is kept always set (thus
 preventing collection of the class loader instance), except when *all*
 the following conditions are met:
  a) All nurseries are empty.
  b) All GC-area flags are unset.

 Actually, for making this practical and preserving correctness, it's a
 little trickier.  It requires a 2-step process, much like the
 object-finalization dance.  Here's a typical example:

 On a major collection, where all nurseries are collected, and some (but
 not necessarily all) other GC-areas are collected, we do the following
 sequence of actions:
  a) All class loader structures are visited.  All flags related to
   non-nursery GC-areas that we intend to collect are unset.  If this
   leads to *all* flags to be unset, the hard reference to the class
   loader instance is set to NULL (thus enabling, possibly, the
   collection of the class loader instance).

  b) The garbage collection cycle is started and proceeds as usual.
   Note that the work mandated in [7-] above is also done, which might
   lead to setting back some flags in class loader structures that had
   all their flags unset in [a)].

  c) After the initial garbage collection is applied, and just before
   the usual treatment of weak references (where they are set to NULL
   when pointing to a collected object), all class loader structures
   are visited again.  The hard pointer of every class loader structure
   that has any flag set is set back to point to the class loader
   instance if it was NULL (same as how object instances are preserved
   for finalization).

  d) If [c)] has triggered any change (i.e. it mandates the survival of
   additional class loader instances that were due to die), the garbage
   collection cycle is "extended" to rescue the additional class loader
   instances and all objects they can reach.

  e) Any additional work of the garbage collection cycle is done (e.g.
   soft, weak, and phantom references, finalization handling).

  f) All class loader structures are visited again.  Every structure for
   which the weak reference has NOT been set to NULL has its hard
   reference set to the weak reference target.  Every structure for
   which the weak reference has been set to NULL is now ready to be
   unloaded (i.e. release all of its native resources, including jitted
   code, class information, method information, vtables, and so on).
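
The sequence a) to f) above can be walked through with a toy model. The
sketch below is a simplified Python simulation, not an implementation: the
names Loader and major_collection are hypothetical, the collector itself is
replaced by a precomputed table of survivors, and steps c) and d) are
collapsed into a single pass because the model has no real object graph to
re-trace.

```python
class Loader:
    """Hypothetical class loader structure from items [2-], [7-] and [9-]."""

    def __init__(self, name, areas):
        self.name = name
        self.flags = {a: False for a in areas}  # one flag per non-nursery area
        self.instance_alive = True  # models the weak reference target
        self.hard_ref = True        # hard reference to the loader instance
        self.unloaded = False


def major_collection(loaders, collected_areas, survivors, externally_reachable):
    """survivors maps an area to the loaders of objects that survive in it.
    externally_reachable holds loaders whose instance other roots still reach."""
    # a) unset flags of areas about to be collected; drop hard ref if all unset
    for ld in loaders:
        for area in collected_areas:
            ld.flags[area] = False
        if not any(ld.flags.values()):
            ld.hard_ref = False
    # b) the collection runs; each surviving object sets its loader's flag back
    for area in collected_areas:
        for ld in survivors.get(area, []):
            ld.flags[area] = True
    # c) and d) loaders with any flag set get their hard reference back
    for ld in loaders:
        if any(ld.flags.values()):
            ld.hard_ref = True
    # e) weak reference handling: an unreferenced loader instance dies
    for ld in loaders:
        if not ld.hard_ref and ld not in externally_reachable:
            ld.instance_alive = False
    # f) restore hard refs of live instances; the others are ready to unload
    for ld in loaders:
        if ld.instance_alive:
            ld.hard_ref = True
        else:
            ld.unloaded = True


areas = ["old1", "old2"]
a, b, c, d = (Loader(n, areas) for n in "ABCD")
a.flags["old1"] = True   # A has an object in old1 that will survive
b.flags["old1"] = True   # B has an object in old1 that will die
c.flags["old2"] = True   # C has an object in old2, which is not collected
# D has no objects at all, but its ClassLoader instance is still referenced

major_collection([a, b, c, d], ["old1"], {"old1": [a]}, {d})
result = {ld.name: ld.unloaded for ld in (a, b, c, d)}
```

Only B ends up ready for unloading: A is rescued by its surviving object, C
by the flag of an area that was not collected, and D by the outside reference
to its loader instance.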


In addition, I highly recommend using the approach proposed in Chapter 3
of http://sablevm.org/people/egagnon/gagnon-phd.pdf for managing
class-loader related memory.  It has many advantages:

1- No "header space" overhead for very small allocations.  [This is a
 typical "hidden" space overhead of malloc() implementations to allow
 for later free() calls].
2- Minimal memory fragmentation.  [Allocation only happens in large
   blocks].
3- Simple and very efficient allocation.  [No overhead for complex
   management of freeing small areas later].
4- Efficient freeing of large memory blocks on class unloading.
5- Possibility of clever usage of this memory; see Chapter 4 of the same
   document for the implementation of sparse interface virtual tables
   enabling invokeinterface at the simple cost of invokevirtual.  :-)
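
To illustrate advantages 1 to 4, here is a minimal sketch of a per-class-loader
bump allocator. It is derived only from the list above, not from the thesis
itself; the class LoaderArena and its methods are hypothetical, and a Python
bytearray stands in for a large block of native memory.

```python
class LoaderArena:
    """Hypothetical per-class-loader bump allocator over large blocks."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.blocks = []   # large blocks obtained from the system
        self.offset = 0    # next free byte in the current block

    def alloc(self, size: int):
        """Return (block_index, offset) for size bytes; no per-chunk header."""
        if size > self.block_size:
            raise ValueError("request larger than a block")
        if not self.blocks or self.offset + size > self.block_size:
            self.blocks.append(bytearray(self.block_size))
            self.offset = 0
        start = self.offset
        self.offset += size  # allocation is a single pointer bump
        return len(self.blocks) - 1, start

    def release_all(self) -> int:
        """Free every block at once when the class loader is unloaded."""
        freed = len(self.blocks)
        self.blocks.clear()
        self.offset = 0
        return freed


arena = LoaderArena(block_size=64)
first = arena.alloc(24)
second = arena.alloc(24)
third = arena.alloc(24)   # 48 + 24 > 64, so a second block is started
blocks_freed = arena.release_all()
```

Nothing is freed individually, so no bookkeeping is kept per allocation, and
unloading a loader releases its blocks in one step.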


I hope this is useful to both projects [drlvm][sablevm]  :-)

Etienne

(C) 2006 by Etienne M. Gagnon <eg...@sablebm.org>
This text is licensed under the Apache License, Version 2.0.

[You may add this document in svn;  I am willing to sign the required
Apache agreement to make it so, if you intend to use it in drlvm's
implementation].

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support

Posted by Weldon Washburn <we...@gmail.com>.

Etienne,

I like your ideas.  It looks like it should work.  We need to carefully look
at all the corner cases to make sure we don't restrict the development of GC
algorithms.  For example, make sure concurrent GC algorithms can work with
your proposed scheme.

On 10/29/06, Etienne Gagnon <eg...@sablevm.org> wrote:
>
> I don't know about drlvm, but SableVM does keep a reference (i.e. a
> local native reference) in each method activation frame to either the
> instance (in case of an instance method) or to the class object (in case
> of a static method).  This is required for correctly (and efficiently)
> implementing the "synchronized" method modifier.  Of course, one could
> have distinct method activation records for synchronized vs
> unsynchronized methods, yet, if you think about it, keeping identical
> activation records gives you, almost for free, the class/instance
> survival you seek to prevent class unloading.
>
> The class loader should not be unloaded before 2 conditions are met:
> 1- No instances of any loaded class.
> 2- The ClassLoader "instance" has been garbage collected.
>
> This means that, internally, there is a class loader structure which
> maintains a "weak" global native reference to its ClassLoader instance.
>
> Hoping this helps...  Any volunteer to try it in SableVM?  Much easier
> than coding it in drlvm for doing some initial experimentation. ;-)
>
> Etienne
>
> Ivan Volosyuk wrote:
> > I like your idea. We can skip counting on young generation.
> >
> > Good, this approach doesn't force us to convert VTables to java objects.
> >
> > There is one more thing to clarify. Even with no objects in the heap, a
> > method running on the stack can keep a class loader from being unloaded.
> > How can we deal with that? Should we examine the root set before
> > triggering deallocation?
>
> --
> Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
> SableVM:                                       http://www.sablevm.org/
> SableCC:                                       http://www.sablecc.org/
>
>
>


-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Etienne Gagnon <eg...@sablevm.org>.

I don't know about drlvm, but SableVM does keep a reference (i.e. a
local native reference) in each method activation frame to either the
instance (in case of an instance method) or to the class object (in case
of a static method).  This is required for correctly (and efficiently)
implementing the "synchronized" method modifier.  Of course, one could
have distinct method activation records for synchronized vs
unsynchronized methods, yet, if you think about it, keeping identical
activation records gives you, almost for free, the class/instance
survival you seek to prevent class unloading.

The class loader should not be unloaded before 2 conditions are met:
1- No instances of any loaded class.
2- The ClassLoader "instance" has been garbage collected.

This means that, internally, there is a class loader structure which
maintains a "weak" global native reference to its ClassLoader instance.
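
As a rough illustration of the two conditions and the weak reference, the
Python sketch below uses the standard weakref module in place of a weak
global native reference. The names ClassLoaderInstance and LoaderStruct are
hypothetical, and the live_instances counter is only a stand-in for whatever
mechanism the VM uses to establish condition 1.

```python
import gc
import weakref


class ClassLoaderInstance:
    """Stand-in for the Java-level ClassLoader object."""


class LoaderStruct:
    """Hypothetical VM-internal structure holding only a weak reference."""

    def __init__(self, instance):
        self.instance_ref = weakref.ref(instance)
        self.live_instances = 0  # instances of classes loaded by this loader

    def can_unload(self) -> bool:
        no_instances = self.live_instances == 0         # condition 1
        loader_collected = self.instance_ref() is None  # condition 2
        return no_instances and loader_collected


loader = ClassLoaderInstance()
struct = LoaderStruct(loader)
struct.live_instances = 1

before = struct.can_unload()            # an object and the loader both exist
struct.live_instances = 0
still_referenced = struct.can_unload()  # no objects, but loader is reachable
del loader                              # drop the last strong reference
gc.collect()
after = struct.can_unload()
```

Because the structure holds only a weak reference, it never keeps the loader
instance alive by itself; it merely observes when the instance is gone.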

Hoping this helps...  Any volunteer to try it in SableVM?  Much easier
than coding it in drlvm for doing some initial experimentation. ;-)

Etienne

Ivan Volosyuk wrote:
> I like your idea. We can skip counting on young generation.
> 
> Good, this approach doesn't force us to convert VTables to java objects.
> 
> There is one more thing to clarify. Even with no objects in the heap, a
> method running on the stack can keep a class loader from being unloaded.
> How can we deal with that? Should we examine the root set before
> triggering deallocation?

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support

Posted by Ivan Volosyuk <iv...@gmail.com>.

I like your idea. We can skip counting on young generation.

Good, this approach doesn't force us to convert VTables to java objects.

There is one more thing to clarify. Even with no objects in the heap, a
method running on the stack can keep a class loader from being unloaded.
How can we deal with that? Should we examine the root set before
triggering deallocation?
--
Ivan

On 10/29/06, Etienne Gagnon <eg...@sablevm.org> wrote:
> Wait a minute!  I missed something.  Actually, there is no need to track
> allocations in the young generation!  Only survivals.  So, you apply the
> boolean trick only for objects that survive the nursery collection.
>
> So, there would be no bit overhead in objects, nor work overhead on
> allocations.  Just a little overhead on moving objects across generations.
>
> As for the class loader, there are many solutions; one is to maintain a
> list of loaded classes with surviving objects.  Every time a class sets
> one of its gc-area booleans to false (at the end of a gc cycle), it
> checks whether all its other gc-area booleans are also false.  If yes,
> it removes itself from the class-loader "loaded class with surviving
> objects" list.  When this list becomes empty, the class loader can be
> unloaded (as soon as it is not referenced elsewhere).
>
> Making any sense?
>
> Etienne
>
> Etienne Gagnon wrote:
> > Ivan Volosyuk wrote:
> >
> >>If I understand you correctly, you suggest to increment
> >>per-classloader object counter on allocation... It can be much
> >>overhead with the solution, as most of the objects die young.
> >>Do I miss something?
> >
> >
> > No, I was thinking about a "per-class" counter.  Actually, a counter is
> > not needed.  A simple boolean is sufficient (one boolean per gc
> > generation in each class), so the cost would be a single unconditional
> > memory write per object allocation.  I would think that this would be
> > lost in the noise of object field "zero" initialization. No?
> >
> > Etienne
> >
>
> --
> Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
> SableVM:                                       http://www.sablevm.org/
> SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support

Posted by Etienne Gagnon <eg...@sablevm.org>.

OK, my brain seems to be just waking up...  I was thinking of per-class
to reduce indirections at allocation time.  There's no need for
per-class data, only "one boolean per GC generation" in every class
loader.  The boolean is set when an instance (of a class loaded by this
class loader) is moved into the generation, and is unset when the
generation is collected.
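
A minimal sketch of the "one boolean per GC generation in every class
loader" idea, for illustration only. The class and method names are
hypothetical (they are not DRLVM structures), and the sketch assumes that
objects surviving a collection are moved or re-registered, so that the flag
is set again after it has been cleared:

```java
/** Hypothetical VM-side record kept for one class loader. */
final class LoaderLiveness {
    // One flag per tracked (non-nursery) GC generation.
    private final boolean[] hasInstancesIn;

    LoaderLiveness(int generations) {
        hasInstancesIn = new boolean[generations];
    }

    /** The GC moved an instance of one of this loader's classes into a generation. */
    void onInstanceMovedInto(int generation) {
        hasInstancesIn[generation] = true;
    }

    /** The given generation is being collected; survivors will set the flag again. */
    void onGenerationCollected(int generation) {
        hasInstancesIn[generation] = false;
    }

    /** True if some tracked generation may still hold an instance. */
    boolean mayHaveInstances() {
        for (boolean flag : hasInstancesIn) {
            if (flag) {
                return true;
            }
        }
        return false;
    }
}

final class LoaderLivenessDemo {
    public static void main(String[] args) {
        LoaderLiveness loader = new LoaderLiveness(2);
        loader.onInstanceMovedInto(0);
        System.out.println(loader.mayHaveInstances());
        loader.onGenerationCollected(0);
        System.out.println(loader.mayHaveInstances());
    }
}
```

A loader whose flags are all clear still has to satisfy the other
conditions (its j.l.Classloader and j.l.Class instances unreachable) before
it can be unloaded.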

As I said, I missed some of this thread's messages.  Maybe you already
came to a similar solution, maybe not.  I just remember seeing messages
about modifying the object header to help class unloading, which seemed
overly onerous to me.

Regards,

Etienne

Etienne Gagnon wrote:
> Wait a minute!  I missed something.  Actually, there is no need to track
> ...

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support

Posted by Etienne Gagnon <eg...@sablevm.org>.

Wait a minute!  I missed something.  Actually, there is no need to track
allocations in the young generation!  Only survivals.  So, you apply the
boolean trick only for objects that survive the nursery collection.

So, there would be no bit overhead in objects, nor work overhead on
allocations.  Just a little overhead on moving objects across generations.

As for the class loader, there are many solutions; one is to maintain a
list of loaded classes with surviving objects.  Every time a class sets
one of its gc-area booleans to false (at the end of a gc cycle), it
checks whether all its other gc-area booleans are also false.  If yes,
it removes itself from the class-loader "loaded class with surviving
objects" list.  When this list becomes empty, the class loader can be
unloaded (as soon as it is not referenced elsewhere).
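
The bookkeeping described above could look roughly like the following
sketch. All names are hypothetical, and the hooks (onObjectPromoted,
onAreaEmptied) stand in for whatever the GC would actually call; this is
not an existing DRLVM interface:

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical per-loader record: the "loaded classes with surviving objects" list. */
final class TrackedLoader {
    final Set<TrackedClass> classesWithSurvivors = new HashSet<>();

    /** True when no class of this loader has a surviving instance. */
    boolean hasNoSurvivingInstances() {
        return classesWithSurvivors.isEmpty();
    }
}

/** Hypothetical per-class record with one boolean per GC area. */
final class TrackedClass {
    private final TrackedLoader loader;
    private final boolean[] survivorsIn;

    TrackedClass(TrackedLoader loader, int areas) {
        this.loader = loader;
        this.survivorsIn = new boolean[areas];
    }

    /** An instance survived the nursery and was moved into the given area. */
    void onObjectPromoted(int area) {
        survivorsIn[area] = true;
        loader.classesWithSurvivors.add(this);
    }

    /** At the end of a GC cycle, no instance of this class is left in the area. */
    void onAreaEmptied(int area) {
        survivorsIn[area] = false;
        for (boolean flag : survivorsIn) {
            if (flag) {
                return; // still alive in some other area
            }
        }
        loader.classesWithSurvivors.remove(this);
    }
}
```

Nothing is touched on allocation in the nursery; the only cost is in the
promotion path and at the end of a collection.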

Making any sense?

Etienne

Etienne Gagnon wrote:
> Ivan Volosyuk wrote:
> 
>>If I understand you correctly, you suggest to increment
>>per-classloader object counter on allocation... It can be much
>>overhead with the solution, as most of the objects die young.
>>Do I miss something?
> 
> 
> No, I was thinking about a "per-class" counter.  Actually, a counter is
> not needed.  A simple boolean is sufficient (one boolean per gc
> generation in each class), so the cost would be a single unconditional
> memory write per object allocation.  I would think that this would be
> lost in the noise of object field "zero" initialization. No?
> 
> Etienne
> 

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support

Posted by Etienne Gagnon <eg...@sablevm.org>.

Ivan Volosyuk wrote:
> If I understand you correctly, you suggest to increment
> per-classloader object counter on allocation... It can be much
> overhead with the solution, as most of the objects die young.
> Do I miss something?

No, I was thinking about a "per-class" counter.  Actually, a counter is
not needed.  A simple boolean is sufficient (one boolean per gc
generation in each class), so the cost would be a single unconditional
memory write per object allocation.  I would think that this would be
lost in the noise of object field "zero" initialization. No?

Etienne

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support

Posted by Ivan Volosyuk <iv...@gmail.com>.

If I understand you correctly, you suggest incrementing a
per-classloader object counter on allocation... That could add a lot of
overhead, as most objects die young.
Am I missing something?
--
Ivan

On 10/29/06, Etienne Gagnon <eg...@sablevm.org> wrote:
> I have missed some messages of this thread, yet I do not remember seeing
> a discussion of what seems to me the obvious solution to the problem.
> So, here it is.
>
> Why don't you simply add a reference count on classes which is
> incremented on object allocation and decremented on object reclamation?
>  [In case you use a copying collector, you could keep a separate count
> (in the class) for the collected area, so that you only have to count
> copied objects].  You would also use reference counting for the class
> loader (therefore eliminating any cyclic problem that you could have
> with normal garbage collection).  This would work very well as unloading
> only happens when the class loader can be unloaded along all of its classes.
>
> No need for any supportive information in object header, or anything
> complex...  Am I really missing something?
>
> Just an idea...
>
> Etienne
>
>
> --
> Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
> SableVM:                                       http://www.sablevm.org/
> SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support

Posted by Etienne Gagnon <eg...@sablevm.org>.

I have missed some messages of this thread, yet I do not remember seeing
a discussion of what seems to me the obvious solution to the problem.
So, here it is.

Why don't you simply add a reference count on classes which is
incremented on object allocation and decremented on object reclamation?
 [In case you use a copying collector, you could keep a separate count
(in the class) for the collected area, so that you only have to count
copied objects].  You would also use reference counting for the class
loader (therefore eliminating any cyclic problem that you could have
with normal garbage collection).  This would work very well as unloading
only happens when the class loader can be unloaded along with all of its classes.
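
One possible reading of this counting scheme, as a rough single-threaded
sketch. The names are hypothetical and the hooks onAllocate/onReclaim are
assumed to be called by the allocator and the collector; a real VM would
need atomic or per-thread counters:

```java
/** Hypothetical per-loader count of classes that still have live instances. */
final class CountedLoader {
    private int liveClasses;

    void classBecameLive() {
        liveClasses++;
    }

    void classBecameDead() {
        liveClasses--;
    }

    boolean hasLiveInstances() {
        return liveClasses > 0;
    }
}

/** Hypothetical per-class instance counter. */
final class CountedClass {
    private final CountedLoader loader;
    private long instances;

    CountedClass(CountedLoader loader) {
        this.loader = loader;
    }

    /** Called on object allocation (or on copy, for a copying collector). */
    void onAllocate() {
        if (instances++ == 0) {
            loader.classBecameLive();
        }
    }

    /** Called when an instance is reclaimed. */
    void onReclaim() {
        if (--instances == 0) {
            loader.classBecameDead();
        }
    }
}
```

The loader becomes a candidate for unloading once hasLiveInstances()
returns false and nothing else references the loader or its classes.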

No need for any supportive information in object header, or anything
complex...  Am I really missing something?

Just an idea...

Etienne


-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support

Posted by Ivan Volosyuk <iv...@gmail.com>.

On 10/29/06, Alex Astapchuk <al...@gmail.com> wrote:
> Mikhail Fursov:
> > On 10/28/06, Alex Astapchuk <al...@gmail.com> wrote:
> >>
> >> Aleksey,
> >>
> >> >   1. Mark and scan based approach.
> >> >   2. Automatic class unloading approach.
> >>
> >> In the #2, is there any chance for other components to be notified about
> >>    unloaded classes?
> >>
> >
> > Alex,
> > I asked Aleksey about the same feature some time ago. I was interested if
> > it's possible to deallocate profiler's data in EM for unloaded methods. The
> > answer was: OK you will get a callback from VM. So, this feature is in the
> > design. Let's wait Aleksey to give us more details about it.
>
> Hmmm...  Yes, some more details would be nice.
> If I get it right, in case of automagic unloading, GC does all the job
> without a knowledge whether it collects a class, a classloader or
> whatever else.
> Perhaps I'm missing something, but to provide a callback on class
> unloading, the GC must know the semantic of the object being collected.

The callback will be called by the class unloading implementation (for
#1). It will definitely know everything about the classloader being
deallocated. EM just needs to associate its data structures with the
corresponding classloader and free them on request.
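
To illustrate the kind of callback discussed here, a small sketch follows.
UnloadNotifier and ProfileStore are hypothetical names invented for this
example (ProfileStore stands in for EM's profiler data); they do not
correspond to an existing VM or EM interface:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical VM-side registry of "class loader unloaded" listeners. */
final class UnloadNotifier {
    private final List<Consumer<Object>> listeners = new ArrayList<>();

    void addListener(Consumer<Object> listener) {
        listeners.add(listener);
    }

    /** Called by the unloading code once a loader's native data is being destroyed. */
    void loaderUnloaded(Object loaderId) {
        for (Consumer<Object> listener : listeners) {
            listener.accept(loaderId);
        }
    }
}

/** Hypothetical component that keys its data by class loader. */
final class ProfileStore {
    private final Map<Object, List<String>> profilesByLoader = new HashMap<>();

    void record(Object loaderId, String methodProfile) {
        profilesByLoader.computeIfAbsent(loaderId, k -> new ArrayList<>()).add(methodProfile);
    }

    /** Frees everything associated with the unloaded loader. */
    void dropLoader(Object loaderId) {
        profilesByLoader.remove(loaderId);
    }

    int loaderCount() {
        return profilesByLoader.size();
    }
}
```

A component would register once, e.g. notifier.addListener(store::dropLoader),
and then only has to keep its data keyed by class loader.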

-- 
Ivan
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Aleksey Ignatenko <al...@gmail.com>.

> Hmmm...  Yes, some more details would be nice.
> If I get it right, in case of automagic unloading, GC does all the job
> without a knowledge whether it collects a class, a classloader or
> whatever else.
> Perhaps I'm missing something, but to provide a callback on class
> unloading, the GC must know the semantic of the object being collected.
>
> ?

Alex,
GC does not need any special knowledge about the semantics of the object
being collected. We simply provide a weak root to j.l.Classloader inside the VM.
When the VM detects that this weak reference has been zeroed (e.g. at the end
of a GC cycle), it destroys all native resources of the corresponding
classloader, or schedules their destruction for some later time.
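
The weak-root scheme can be pictured with java.lang.ref.WeakReference, as
in the sketch below. This is only an analogy written in Java: inside the
VM the weak root would be a native slot enumerated to the GC, and
NativeLoaderData and LoaderTable are hypothetical names:

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical native-side data of one class loader. */
final class NativeLoaderData {
    // Weak root: does not keep the j.l.ClassLoader instance alive.
    final WeakReference<Object> javaLoader;
    boolean freed;

    NativeLoaderData(Object loader) {
        this.javaLoader = new WeakReference<>(loader);
    }

    /** Stands in for destroying classes, jitted code and so on. */
    void free() {
        freed = true;
    }
}

/** Hypothetical table of all loaders known to the VM. */
final class LoaderTable {
    private final List<NativeLoaderData> entries = new ArrayList<>();

    NativeLoaderData register(Object loader) {
        NativeLoaderData data = new NativeLoaderData(loader);
        entries.add(data);
        return data;
    }

    /** Run at the end of a GC cycle: free loaders whose weak root was zeroed. */
    int sweepAfterGc() {
        int freedCount = 0;
        for (Iterator<NativeLoaderData> it = entries.iterator(); it.hasNext();) {
            NativeLoaderData data = it.next();
            if (data.javaLoader.get() == null) {
                data.free();
                it.remove();
                freedCount++;
            }
        }
        return freedCount;
    }
}
```

The GC only clears the reference; all knowledge about what a class loader
is stays in sweepAfterGc(), on the VM side.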

I hope, I answered your question.
Aleksey.



Re: [drlvm] Class unloading support

Posted by Alex Astapchuk <al...@gmail.com>.

Mikhail Fursov:
> On 10/28/06, Alex Astapchuk <al...@gmail.com> wrote:
>>
>> Aleksey,
>>
>> >   1. Mark and scan based approach.
>> >   2. Automatic class unloading approach.
>>
>> In the #2, is there any chance for other components to be notified about
>>    unloaded classes?
>>
> 
> Alex,
> I asked Aleksey about the same feature some time ago. I was interested if
> it's possible to deallocate profiler's data in EM for unloaded methods. The
> answer was: OK you will get a callback from VM. So, this feature is in the
> design. Let's wait Aleksey to give us more details about it.

Hmmm...  Yes, some more details would be nice.
If I get it right, in the case of automagic unloading, the GC does the whole
job without knowing whether it collects a class, a classloader or
whatever else.
Perhaps I'm missing something, but to provide a callback on class
unloading, the GC must know the semantics of the object being collected.

?

-- 
Thanks,
   Alex

Re: [drlvm] Class unloading support

Posted by Mikhail Fursov <mi...@gmail.com>.

On 10/28/06, Alex Astapchuk <al...@gmail.com> wrote:
>
> Aleksey,
>
> >   1. Mark and scan based approach.
> >   2. Automatic class unloading approach.
>
> In the #2, is there any chance for other components to be notified about
>    unloaded classes?
>

Alex,
I asked Aleksey about the same feature some time ago. I was interested if
it's possible to deallocate profiler's data in EM for unloaded methods. The
answer was: OK you will get a callback from VM. So, this feature is in the
design. Let's wait for Aleksey to give us more details about it.

-- 
Mikhail Fursov

Re: [drlvm] Class unloading support

Posted by Alex Astapchuk <al...@gmail.com>.

Aleksey,

 >   1. Mark and scan based approach.
 >   2. Automatic class unloading approach.

In #2, is there any chance for other components to be notified about
unloaded classes?

I can imagine many scenarios in which a component needs to keep some data
associated with the classes. If a class gets unloaded, then other
components may need to clean up their internal data as well.

The use cases I can think of are:
- a JIT that does whole-program optimizations and associates some data
with the classes - profile data, for example.

- a GC that again uses some class-related data, for example to improve
locality of allocated objects based on their types.

Perhaps other scenarios exist.

Or maybe you have a proposal for how to associate resources with the
classes in such cases?

-- 
Thanks,
   Alex



Re: [drlvm] Class unloading support

Posted by Pavel Pervov <pm...@gmail.com>.

And yes, there will be a reference to the corresponding java/lang/Class in the
VTable object, which means that j/l/Class will be reachable from any object of
this class.

So, still no object size increase. :)

Pavel.




-- 
Pavel Pervov,
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Pavel Pervov <pm...@gmail.com>.

Weldon,

You sort of missed that in the proposal. The current (possibly compressed)
vtable pointer at offset 0 in the object layout will be replaced with a
(possibly compressed) ManagedObject*. The VTable itself is going to become an
object allocated in the Java heap.

So, no increase in object size.
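
The resulting reference chain (object -> VTable object -> j.l.Class ->
j.l.Classloader -> its classes) can be modelled with a toy heap graph and a
plain reachability trace. This is only a sketch of the idea; HeapNode and
VTableGraph are hypothetical names and have nothing to do with the real
object layout:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy heap object: a name plus its outgoing references. */
final class HeapNode {
    final String name;
    final List<HeapNode> refs = new ArrayList<>();

    HeapNode(String name) {
        this.name = name;
    }
}

final class VTableGraph {
    /** Builds object -> vtable -> class <-> loader and returns the object. */
    static HeapNode instanceOfLoadedClass() {
        HeapNode loader = new HeapNode("loader");
        HeapNode clazz = new HeapNode("class");
        HeapNode vtable = new HeapNode("vtable");
        HeapNode object = new HeapNode("object");
        object.refs.add(vtable); // slot at offset 0 now references the VTable object
        vtable.refs.add(clazz);  // VTable object references its j.l.Class
        clazz.refs.add(loader);  // class registry: class -> defining loader
        loader.refs.add(clazz);  // class registry: loader -> loaded classes
        return object;
    }

    /** Ordinary tracing: names of everything reachable from the given roots. */
    static Set<String> reachable(HeapNode... roots) {
        Set<HeapNode> seen = new HashSet<>();
        Deque<HeapNode> work = new ArrayDeque<>();
        for (HeapNode root : roots) {
            work.push(root);
        }
        while (!work.isEmpty()) {
            HeapNode node = work.pop();
            if (seen.add(node)) {
                for (HeapNode child : node.refs) {
                    work.push(child);
                }
            }
        }
        Set<String> names = new TreeSet<>();
        for (HeapNode node : seen) {
            names.add(node.name);
        }
        return names;
    }
}
```

As long as one instance is a root (or reachable from one), the trace keeps
the VTable object, the class and the loader alive without any special
handling in the GC; with no roots, nothing is reachable and the whole group
can be collected together.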

Pavel.



-- 
Pavel Pervov,
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Weldon Washburn <we...@gmail.com>.

On 10/24/06, Aleksey Ignatenko <al...@gmail.com> wrote:
>
>
> *Automatic class unloading approach.*
>
> "Automatic class unloading" means that j.l.Classloader instance is
> unloaded
> automatically (w/o additional enumeration tricks or GC dependency) and
> after
> we detect that some class loader was unloaded we destroy its native
> resources. To do that we need to provide two conditions:
>
>   1. Introduce reference from object to its j.l.Class instance.


hmm... I think this means the object header size will increase by
sizeof(reference_ptr).  In addition to the cons listed below, the added ref
ptr can cause cache pollution problems.  In old studies, adding a ref ptr
to the object header degraded overall performance by about 3%.  Maybe it makes
sense to add a dummy slot to the existing object header layout to see what
the footprint and speed impact will be on modern hardware.







-- 
Weldon Washburn
Intel Middleware Products Division

Re: [drlvm] Class unloading support

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x20D day of Apache Harmony Geir Magnusson, Jr. wrote:
> Egor Pasko wrote:
> > On the 0x20C day of Apache Harmony Aleksey Ignatenko wrote:
> >> Hello all!
> >>
> >>
> >>
> >> As you probably know current version of harmony DRLVM has no class unloading
> >> support. This leads to the fact that some Java applications accumulate
> >> memory leaks leading to memory overflow and crashes.
> >>
> >> In this message I would like to describe two approaches for class unloading
> >> in DRLVM and propose to implement one of them as basic. Pros and cons for
> >> both approaches are presented below. Lets name these approaches:
> >>
> >>    1. Mark and scan based approach.
> >>    2. Automatic class unloading approach.
> > I am +1 to (2)=(Automatic class unloading approach). Do not like
> > stop-the-world. But it has 1 more "cons" -- JIT should change it's
> > devirtualizer
> > accordingly to the VTable change. Doable, of course.
> > BTW, is it reasonable to "compress" or "enumerate" references to
> > j.l.Class in each object to reduce the footprint? How many classes are
> > alive in heavy-duty applications? not very much probably.
> 
> What is your sense of "very much"?  IOW, how many would make you say,
> "yeah, that's a lot"

hm, 1/10 of the whole heap is probably very much :)

-- 
Egor Pasko, Intel Managed Runtime Division

Re: [drlvm] Class unloading support

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.


Egor Pasko wrote:
> On the 0x20C day of Apache Harmony Aleksey Ignatenko wrote:
>> Hello all!
>>
>>
>>
>> As you probably know current version of harmony DRLVM has no class unloading
>> support. This leads to the fact that some Java applications accumulate
>> memory leaks leading to memory overflow and crashes.
>>
>> In this message I would like to describe two approaches for class unloading
>> in DRLVM and propose to implement one of them as basic. Pros and cons for
>> both approaches are presented below. Lets name these approaches:
>>
>>    1. Mark and scan based approach.
>>    2. Automatic class unloading approach.
> 
> I am +1 to (2)=(Automatic class unloading approach). Do not like
> stop-the-world. 
> 
> But it has 1 more "cons" -- JIT should change it's devirtualizer
> accordingly to the VTable change. Doable, of course.
> 
> BTW, is it reasonable to "compress" or "enumerate" references to
> j.l.Class in each object to reduce the footprint? How many classes are
> alive in heavy-duty applications? not very much probably.

What is your sense of "very much"?  IOW, how many would make you say, 
"yeah, that's a lot"

geir

Re: [drlvm] Class unloading support

Posted by Aleksey Ignatenko <al...@gmail.com>.

The number of additional VTable objects will be proportional to the number of
loaded classes, which is considerably less than the total number of
objects, so the "1/10 of the whole heap" figure looks unrealistic. Anyway,
precise measurements can be done only after the implementation is ready, so I
suppose it is too early to talk about a possible 10% impact on performance
and footprint :)
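
As a back-of-the-envelope illustration of why the overhead scales with the
number of classes rather than with the number of objects, here is a tiny
calculation. The input numbers (10,000 loaded classes, 512 bytes per VTable
object, 512 MB heap) are made up for the example and are not measurements:

```java
final class VTableFootprint {
    /** Fraction of the heap taken by VTable objects, one per loaded class. */
    static double heapFraction(long loadedClasses, long avgVTableBytes, long heapBytes) {
        return (double) (loadedClasses * avgVTableBytes) / heapBytes;
    }

    public static void main(String[] args) {
        // Hypothetical inputs, chosen only to show the order of magnitude.
        double fraction = heapFraction(10_000, 512, 512L * 1024 * 1024);
        System.out.printf("VTable objects: %.2f%% of heap%n", fraction * 100);
    }
}
```

With these assumed inputs the VTable objects stay below 1% of the heap;
real figures would have to come from measurements on the finished
implementation.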

Aleksey.

On 10/26/06, Ivan Volosyuk <iv...@gmail.com> wrote:
>
> The comment was for this statement:
> > 1) "1/10 of the whole heap is probably very much :)"
>
> If we spare 10% of total space for additional objects which doesn't
> required for computations we may have mentioned peformance decrease on
> memory intensive usage.
> --
> Ivan
>
>
> On 10/26/06, Mikhail Fursov <mi...@gmail.com> wrote:
> > On 10/26/06, Ivan Volosyuk <iv...@gmail.com> wrote:
> > >
> > > It means 10% more pressure on GC subsystem. In memory intensive cases
> > > it can be up to 10% decrease in performance.
> > >
> >
> > Ivan,
> > If you have 10% more classes loaded it does not mean that you need 10%
> > more memory space for all of runtime objects you have.
> >
> > --
> > Mikhail Fursov
>

Re: [drlvm] Class unloading support

Posted by Mikhail Fursov <mi...@gmail.com>.

On 10/26/06, Ivan Volosyuk <iv...@gmail.com> wrote:
>
> The comment was for this statement:
> > 1) "1/10 of the whole heap is probably very much :)"

Ok. Now I understand the difference. Heap size is not a linear function of
"class size" or "number of classes loaded". So let's skip it :)

> If we spare 10% of total space for additional objects which doesn't
> required for computations we may have mentioned peformance decrease on
> memory intensive usage.

Yes I agree with you here.

-- 
Mikhail Fursov

Re: [drlvm] Class unloading support

Posted by Ivan Volosyuk <iv...@gmail.com>.

The comment was for this statement:
> 1) "1/10 of the whole heap is probably very much :)"

If we spend 10% of total space on additional objects which are not
required for computations, we may get the mentioned performance decrease
on memory-intensive usage.
--
Ivan


On 10/26/06, Mikhail Fursov <mi...@gmail.com> wrote:
> On 10/26/06, Ivan Volosyuk <iv...@gmail.com> wrote:
> >
> > It means 10% more pressure on GC subsystem. In memory intensive cases
> > it can be up to 10% decrease in performance.
> >
>
> Ivan,
> If you have 10% more classes loaded it does not mean that you need 10% more
> memory space for all of runtime objects you have.
>
> --
> Mikhail Fursov

Re: [drlvm] Class unloading support

Posted by Mikhail Fursov <mi...@gmail.com>.

On 10/26/06, Ivan Volosyuk <iv...@gmail.com> wrote:
>
> It means 10% more pressure on GC subsystem. In memory intensive cases
> it can be up to 10% decrease in performance.
>

Ivan,
If you have 10% more classes loaded it does not mean that you need 10% more
memory space for all of runtime objects you have.

-- 
Mikhail Fursov

Re: [drlvm] Class unloading support

Posted by Ivan Volosyuk <iv...@gmail.com>.

On 10/26/06, Mikhail Fursov <mi...@gmail.com> wrote:
> Egor,
> I would rather disagree in most of the details. :(
>
> 1) "1/10 of the whole heap is probably very much :)"
>
> No. You can't measure performance in method/class numbers and proportions at
> all. It's normal situation when only 1% of methods consume 90% of CPU ticks.
> Not all of them depends on devirtualization or classes unloaded. So you
> can't say that 1 class unloaded is OK but 100% is bad. It's a lottery.

It means 10% more pressure on GC subsystem. In memory intensive cases
it can be up to 10% decrease in performance.

-- 
Ivan
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x20E day of Apache Harmony Mikhail Fursov wrote:
> On 26 Oct 2006 15:46:13 +0700, Egor Pasko <eg...@gmail.com> wrote:
> >
> > > No. You can't measure performance in method/class numbers and proportions at
> > > all. It's normal situation when only 1% of methods consume 90% of CPU ticks.
> > > Not all of them depends on devirtualization or classes unloaded. So you
> > > can't say that 1 class unloaded is OK but 100% is bad. It's a lottery.
> >
> > I do not quite understand with what I disagree here :)
> >
> > I meant, increasing footprint by 1/10 due to class unloading support
> > seems too much for a big application.
> 
> What numbers do you mean when you say "too much"? Memory footprint? 

yes, I did mean that

> Not: the memory footprint is the number of objects (not classes)
> created.  

I was afraid of more references kept in total. That could have
increased the footprint. Aleksey clarified that no extra pointers are
introduced (the VTable ptr in each object becomes a VTable ref); ptrs and
refs have the same size (refs can be even less space-consuming on x86_64
due to common compression).

Aleksey also pointed out that some extra refs appear -- pointers to
VTable from VTable objects. So, we are losing (4 *
<number_of_VTables>) bytes more with this approach (on IA32). But it
is "not much", IMHO :)

I cannot clarify more :)

> Performance: not again.

Mikhail, sorry if I caused confusion :(

> > Of course, it's a personal impression, application-dependant,
> > etc. Just want to know the real number. They say, it would be much
> > less than that.
> 
> 
> The real number depends on a real application and application type.  With AS
> that runs for months without reboot you can have almost every class reloaded
> thousands of times. A big and configured client application can use multiple
> classloaders only to load classes and never unload it.

Surely, I agree here.

> > 2)  "hm, keeping VTable pointers on operands (and reporting them in root
> > > sets) solves the problem. That slightly increases register pressure,
> > > but I think is a better solution than pinning VTables (and rather
> > > straightforward)."
> > >
> > > The  Alexey's solution does not affect devirtualizer at all.
> >
> > How can we do without devirtualizer changes here? We need to replace
> > Object->Vtable with Object->class->vtable. Am I missing something?
> 
> Before and after Alexey's solution you have a constant in object header. Why
> to change devirtualizer here?

OK. Rereading the thread helped me now. Sorry. object->vtable stays
the same. Now I see your point. Sure, VTables should be pinned in this
case! Anyway, improving devirtualizer to report VTables seems
useful. Not obvious, need to make specific performance measurements,
but..

> > Unpinning vtables you have to include them into enumeration. I'm not
> > > sure that moving vtables to opnds as object-type is a simple task
> > > and think that it l will affect GC-enumeration part in JIT too.
> >
> > You think, GC enumeration would be difficult here? why? It's an
> > ordinary object.
> 
> I have no exact estimation right now. It's not an easy task even to estimate
> all of the places where we have to change the design if vtable are unpinned.

I see no places except devirtualizer and CG (implementing a special
ldfield instruction to load VTable by offset). Oh, yes, Translator
inlining. Collecting the places started :)

> This is the reason I asked Aleksey to be careful if he decides to unpin
> vtables someday.

OK, let it be. (I just do not like these pinned objects due to such a
"minor" issue)

> > > Another cons: type profiling we will have soon. You have to include it into
> > > enumeration.
> >
> > why not? I see no extra complication here.
> 
> Yes. The complication is enumeration from inside of profile collector
> itself.  It's possible, but, again, let's be careful.

Oh, yes, value profile collector should be an interesting piece of code.

-- 
Egor Pasko, Intel Managed Runtime Division

Re: [drlvm] Class unloading support

Posted by Mikhail Fursov <mi...@gmail.com>.

On 26 Oct 2006 15:46:13 +0700, Egor Pasko <eg...@gmail.com> wrote:
>
> > No. You can't measure performance in method/class numbers and proportions at
> > all. It's normal situation when only 1% of methods consume 90% of CPU ticks.
> > Not all of them depends on devirtualization or classes unloaded. So you
> > can't say that 1 class unloaded is OK but 100% is bad. It's a lottery.
>
> I do not quite understand with what I disagree here :)
>
> I meant, increasing footprint by 1/10 due to class unloading support
> seems too much for a big application.

What numbers do you mean when you say "too much"? Memory footprint? No: the
memory footprint is determined by the number of objects (not classes) created.
Performance? No again.

> Of course, it's a personal
> impression, application-dependant, etc. Just want to know the real
> number. They say, it would be much less than that.

The real number depends on the real application and the application type. With
an application server (AS) that runs for months without reboot you can have
almost every class reloaded thousands of times. A big, configured client
application can use multiple classloaders only to load classes and never
unload them.

> 2)  "hm, keeping VTable pointers on operands (and reporting them in root
> > sets) solves the problem. That slightly increases register pressure,
> > but I think is a better solution than pinning VTables (and rather
> > straightforward)."
> >
> > The  Alexey's solution does not affect devirtualizer at all.
>
> How can we do without devirtualizer changes here? We need to replace
> Object->Vtable with Object->class->vtable. Am I missing something?

Before and after Alexey's solution you have a constant in the object header.
Why change the devirtualizer here?

> Unpinning vtables you have to include them into enumeration. I'm not
> > sure that moving vtables to opnds as object-type is a simple task
> > and think that it l will affect GC-enumeration part in JIT too.
>
> You think, GC enumeration would be difficult here? why? It's an
> ordinary object.

I have no exact estimate right now. It's not an easy task even to estimate
all of the places where we have to change the design if vtables are unpinned.

This is the reason I asked Aleksey to be careful if he decides to unpin
vtables someday.

> > Another cons: type profiling we will have soon. You have to include it into
> > enumeration.
>
> why not? I see no extra complication here.

Yes. The complication is enumeration from inside of profile collector
itself.  It's possible, but, again, let's be careful.

-- 
Mikhail Fursov

Re: [drlvm] Class unloading support

Posted by Pavel Ozhdikhin <pa...@gmail.com>.

Egor,

Object->VTable will still exist so no changes in devirtualizer are needed.

Thanks,
Pavel


On 26 Oct 2006 15:46:13 +0700, Egor Pasko <eg...@gmail.com> wrote:
>
> On the 0x20E day of Apache Harmony Mikhail Fursov wrote:
> > Egor,
> > I would rather disagree in most of the details. :(
>
> Thanks for that, but I do not see any significant disagreement :)
>
> > 1) "1/10 of the whole heap is probably very much :)"
> >
> > No. You can't measure performance in method/class numbers and proportions at
> > all. It's normal situation when only 1% of methods consume 90% of CPU ticks.
> > Not all of them depends on devirtualization or classes unloaded. So you
> > can't say that 1 class unloaded is OK but 100% is bad. It's a lottery.
>
> I do not quite understand with what I disagree here :)
>
> I meant, increasing footprint by 1/10 due to class unloading support
> seems too much for a big application. Of course, it's a personal
> impression, application-dependant, etc. Just want to know the real
> number. They say, it would be much less than that.
>
> > 2)  "hm, keeping VTable pointers on operands (and reporting them in root
> > sets) solves the problem. That slightly increases register pressure,
> > but I think is a better solution than pinning VTables (and rather
> > straightforward)."
> >
> > The  Alexey's solution does not affect devirtualizer at all.
>
> How can we do without devirtualizer changes here? We need to replace
> Object->Vtable with Object->class->vtable. Am I missing something?
>
> > Unpinning vtables you have to include them into enumeration. I'm not
> > sure that moving vtables to opnds as object-type is a simple task
> > and think that it l will affect GC-enumeration part in JIT too.
>
> You think, GC enumeration would be difficult here? why? It's an
> ordinary object.
>
> > Another cons: type profiling we will have soon. You have to include it into
> > enumeration.
>
> why not? I see no extra complication here.
>
> --
> Egor Pasko, Intel Managed Runtime Division
>
>

Re: [drlvm] Class unloading support

Posted by Ivan Volosyuk <iv...@gmail.com>.

AFAIU, we have pinned VTables in the initial version of unloading
support. This means that we can still rely on a VTable not moving
and compare its address with an already known constant. No performance
degradation, no new inter-component contracts.
--
Ivan

On 10/26/06, Mikhail Fursov <mi...@gmail.com> wrote:
> On 10/26/06, Pavel Pervov <pm...@gmail.com> wrote:
> >
> > Mikhail,
> >
> > Now if you introduce one more level of indirection and start comparing
> > Classes, not VTables, you do not need to enumerate.
>
>
> Yes. But I see 2 cons here
> 1) Performance: extra memory access for every devirtualization guard.
> 2) What component will be responsible for this extra level? This is a new
> inter-component contract.

Re: [drlvm] Class unloading support

Posted by Mikhail Fursov <mi...@gmail.com>.

On 10/26/06, Pavel Pervov <pm...@gmail.com> wrote:
>
> Mikhail,
>
> Now if you introduce one more level of indirection and start comparing
> Classes, not VTables, you do not need to enumerate.

Yes. But I see 2 cons here
1) Performance: extra memory access for every devirtualization guard.
2) What component will be responsible for this extra level? This is a new
inter-component contract.

-- 
Mikhail Fursov

Re: [drlvm] Class unloading support

Posted by Pavel Pervov <pm...@gmail.com>.

Mikhail,

Now if you introduce one more level of indirection and start comparing
Classes, not VTables, you do not need to enumerate.

On 10/26/06, Mikhail Fursov <mi...@gmail.com> wrote:
>
> On 10/26/06, Pavel Pervov <pm...@gmail.com> wrote:
> >
> > My statement is based on the following: JIT only needs vtable to call
> > (interface) methods. If it's true, then I don't see where it will be
> > necessary to enumerate vtables. The only possibility I can imagine is when
> > JIT reuse vtable invoking several methods of the same object.
>
>
> Pavel,
> here is an example of the devirtualization: eax is an object reference:
>
> cmp         eax,0  ;
> je          0131E778 ;
> mov         ecx,dword ptr [eax]
> cmp         ecx,13D5250h
>
> As you  see JIT hardcodes vtable as a constant. If you move it - you have to
> enumerate it.
>
>
>
> All,
> Let's allow to Alexey to commit his code as it is and discuss unpinning with
> details in future?
>
> --
> Mikhail Fursov
>
>


-- 
Pavel Pervov,
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x20E day of Apache Harmony Mikhail Fursov wrote:
> On 10/26/06, Pavel Pervov <pm...@gmail.com> wrote:
> >
> > My statement is based on the following: JIT only needs vtable to call
> > (interface) methods. If it's true, then I don't see where it will be
> > necessary to enumerate vtables. The only possibility I can imagine is when
> > JIT reuse vtable invoking several methods of the same object.
> 
> 
> Pavel,
> here is an example of the devirtualization: eax is an object reference:
> 
> cmp         eax,0  ;
> je          0131E778 ;
> mov         ecx,dword ptr [eax]
> cmp         ecx,13D5250h
> 
> As you  see JIT hardcodes vtable as a constant. If you move it - you have to
> enumerate it.
> 
> 
> 
> All,
> Let's allow to Alexey to commit his code as it is and discuss unpinning with
> details in future?

+1
I think, discussion was worth it. Now I am aware of the set of features
Aleksey introduces.

-- 
Egor Pasko, Intel Managed Runtime Division

Re: [drlvm] Class unloading support

Posted by Mikhail Fursov <mi...@gmail.com>.

On 10/26/06, Pavel Pervov <pm...@gmail.com> wrote:
>
> My statement is based on the following: JIT only needs vtable to call
> (interface) methods. If it's true, then I don't see where it will be
> necessary to enumerate vtables. The only possibility I can imagine is when
> JIT reuse vtable invoking several methods of the same object.

Pavel,
here is an example of the devirtualization: eax is an object reference:

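cmp         eax,0                ; null check on the object reference
je          0131E778             ; jump away if the reference is null
mov         ecx,dword ptr [eax]  ; load the vtable pointer from the object header
cmp         ecx,13D5250h         ; guard: compare with the hardcoded vtable address

As you see, the JIT hardcodes the vtable address as a constant. If you move
the vtable, you have to enumerate it.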
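
For illustration only, here is a toy Python model of such a guard. It is not
DRLVM or JIT code, and every name in it is hypothetical. It only shows why a
constant baked into compiled code stops matching once the vtable is relocated,
unless the JIT reports that constant so it can be patched.

```python
# Toy model of a devirtualization guard. Not DRLVM code: all class and
# function names here are made up for illustration.

class Obj:
    def __init__(self, vtable_addr):
        self.vtable_addr = vtable_addr  # first word of the object header


class CompiledGuard:
    """Models the `cmp ecx, <constant>` check emitted by the JIT."""

    def __init__(self, expected_vtable_addr):
        self.expected = expected_vtable_addr  # hardcoded at compile time

    def call(self, obj):
        if obj is None:
            return "null-check-failed"
        if obj.vtable_addr == self.expected:
            return "fast-path"   # devirtualized (direct or inlined) call
        return "slow-path"       # fall back to a regular virtual call


def move_vtable(objects, guards, old_addr, new_addr, enumerate_jit_constants):
    """Models a GC that relocates a vtable from old_addr to new_addr."""
    for o in objects:
        if o.vtable_addr == old_addr:
            o.vtable_addr = new_addr
    if enumerate_jit_constants:
        # Only possible if the JIT reports its embedded constants to the GC.
        for g in guards:
            if g.expected == old_addr:
                g.expected = new_addr


obj = Obj(0x13D5250)
guard = CompiledGuard(0x13D5250)
print(guard.call(obj))  # fast-path

move_vtable([obj], [guard], 0x13D5250, 0x2000000, enumerate_jit_constants=False)
print(guard.call(obj))  # slow-path: the stale constant no longer matches

guard.expected = 0x2000000  # what enumeration and patching would have done
print(guard.call(obj))  # fast-path
```

In this model a stale constant only costs the fast path. In a real VM it could
be worse, since something else may later occupy the old address.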

All,
Let's allow Alexey to commit his code as it is and discuss unpinning in
detail in the future?

-- 
Mikhail Fursov

Re: [drlvm] Class unloading support

Posted by Pavel Pervov <pm...@gmail.com>.

My statement is based on the following: the JIT only needs a vtable to call
(interface) methods. If that's true, then I don't see where it will be
necessary to enumerate vtables. The only possibility I can imagine is when
the JIT reuses a vtable while invoking several methods of the same object.

On 10/26/06, Mikhail Fursov <mi...@gmail.com> wrote:

> On 10/26/06, Pavel Pervov <pm...@gmail.com> wrote:
> >
> > Unpinning vtables does not cause the need to enumerate them from JIT, if JIT
> > does not store pointers to them.
>
>
> AFAIU vtable is inlined into a special Java object.
> If you move the object -> vtable is also moved.
> Am I missing something here?
>
> --
> Mikhail Fursov
>
>
-- 
Pavel Pervov,
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x20E day of Apache Harmony Mikhail Fursov wrote:
> On 10/26/06, Pavel Pervov <pm...@gmail.com> wrote:
> >
> > Unpinning vtables does not cause the need to enumerate them from JIT, if JIT
> > does not store pointers to them.
> 
> 
> AFAIU vtable is inlined into a special Java object.
> If you move the object -> vtable is also moved.
> Am I missing something here?

IMHO, the VTable *is* the object. It is pinned, so it does not move ;)

-- 
Egor Pasko, Intel Managed Runtime Division

Re: [drlvm] Class unloading support

Posted by Mikhail Fursov <mi...@gmail.com>.

On 10/26/06, Pavel Pervov <pm...@gmail.com> wrote:
>
> Unpinning vtables does not cause the need to enumerate them from JIT, if JIT
> does not store pointers to them.

AFAIU vtable is inlined into a special Java object.
If you move the object -> vtable is also moved.
Am I missing something here?

-- 
Mikhail Fursov

Re: [drlvm] Class unloading support

Posted by Pavel Pervov <pm...@gmail.com>.

<SNIP>

On 26 Oct 2006 19:34:04 +0700, Egor Pasko <eg...@gmail.com> wrote:
>
> yeah, that is why unpinning them might appear a perf-boost as well as
> slowdown. I cannot find any solution not to slow down here. (thinking
> of obj->class->vtable which is bad for virtual calls, keeping both
> obj->vtable->class and obj->class->vtable is bad for footprint) Is it
> really the problem each JVM designer meets? :)


Some VM developers keep all info on the class in one block in Java heap:
java/lang/Class, VM part of class data, vtable of that class - everything.
:)

-- 
Pavel Pervov,
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x20E day of Apache Harmony Pavel Pervov wrote:
> > The  Alexey's solution does not affect devirtualizer at all.
> >
> > How can we do without devirtualizer changes here? We need to replace
> > Object->Vtable with Object->class->vtable. Am I missing something?
> 
> 
> If vtables are pinned Aleksey's solution does not affect JITs at all.
> 
> If vtables move, it will be one more level of indirection for guarded
> devirtualization: was object->vtable, will be object->vtable->class.

yeah, that is why unpinning them might turn out to be a perf boost as well as
a slowdown. I cannot find any solution that avoids the slowdown here. (thinking
of obj->class->vtable which is bad for virtual calls; keeping both
obj->vtable->class and obj->class->vtable is bad for footprint) Is it
really the problem each JVM designer meets? :)

> Unpinning vtables does not cause the need to enumerate them from JIT, if JIT
> does not store pointers to them.

yes, as well as all other objects

-- 
Egor Pasko, Intel Managed Runtime Division

Re: [drlvm] Class unloading support

Posted by Pavel Pervov <pm...@gmail.com>.

>
> <SNIP>


Egor,

> The  Alexey's solution does not affect devirtualizer at all.
>
> How can we do without devirtualizer changes here? We need to replace
> Object->Vtable with Object->class->vtable. Am I missing something?


If vtables are pinned Aleksey's solution does not affect JITs at all.

If vtables move, it will be one more level of indirection for guarded
devirtualization: was object->vtable, will be object->vtable->class.

Unpinning vtables does not cause the need to enumerate them from JIT, if JIT
does not store pointers to them.

-- 
Pavel Pervov,
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x20E day of Apache Harmony Mikhail Fursov wrote:
> Egor,
> I would rather disagree in most of the details. :(

Thanks for that, but I do not see any significant disagreement :)

> 1) "1/10 of the whole heap is probably very much :)"
> 
> No. You can't measure performance in method/class numbers and proportions at
> all. It's normal situation when only 1% of methods consume 90% of CPU ticks.
> Not all of them depends on devirtualization or classes unloaded. So you
> can't say that 1 class unloaded is OK but 100% is bad. It's a lottery.

I do not quite understand with what I disagree here :)

I meant, increasing footprint by 1/10 due to class unloading support
seems too much for a big application. Of course, it's a personal
impression, application-dependant, etc. Just want to know the real
number. They say, it would be much less than that.

> 2)  "hm, keeping VTable pointers on operands (and reporting them in root
> sets) solves the problem. That slightly increases register pressure,
> but I think is a better solution than pinning VTables (and rather
> straightforward)."
> 
> The  Alexey's solution does not affect devirtualizer at all. 

How can we do without devirtualizer changes here? We need to replace
Object->Vtable with Object->class->vtable. Am I missing something?

> Unpinning vtables you have to include them into enumeration. I'm not
> sure that moving vtables to opnds as object-type is a simple task
> and think that it l will affect GC-enumeration part in JIT too.

You think, GC enumeration would be difficult here? why? It's an
ordinary object. 

> Another cons: type profiling we will have soon. You have to include it into
> enumeration.

why not? I see no extra complication here. 

-- 
Egor Pasko, Intel Managed Runtime Division

Re: [drlvm] Class unloading support

Posted by Mikhail Fursov <mi...@gmail.com>.

Egor,
I would rather disagree in most of the details. :(

1) "1/10 of the whole heap is probably very much :)"

No. You can't measure performance in method/class numbers and proportions at
all. It's a normal situation when only 1% of methods consume 90% of CPU ticks.
Not all of them depend on devirtualization or on unloaded classes. So you
can't say that 1 class unloaded is OK but 100% is bad. It's a lottery.

2)  "hm, keeping VTable pointers on operands (and reporting them in root
sets) solves the problem. That slightly increases register pressure,
but I think is a better solution than pinning VTables (and rather
straightforward)."

Alexey's solution does not affect the devirtualizer at all. If you unpin
vtables, you have to include them in enumeration. I'm not sure that moving
vtables to opnds as object-type is a simple task, and I think that it will
affect the GC-enumeration part in the JIT too.
Another con: the type profiling we will have soon. You have to include it in
enumeration.




-- 
Mikhail Fursov

Re: [drlvm] Class unloading support

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x20D day of Apache Harmony Mikhail Fursov wrote:
> On 10/25/06, Aleksey Ignatenko <al...@gmail.com> wrote:
> >
> > Egor,
> > >But it has 1 more "cons" -- JIT should change it's devirtualizer
> > >accordingly to the VTable change. Doable, of course.
> > There is no need to change struct VTable structure - it could be simply
> > inlined in pinned VTable object + 1 additional reference field to
> > j.l.Class.
> > So there won't be too much work to do on JIT side.
> >
> +1 for "Automatic class unloading approach".
> But, please, keep vtables pinned in the first version. If you make vtable
> objects unpinned JIT have to track if the object is moved and patch all
> devirtualized calls (not a simple task..).

hm, keeping VTable pointers on operands (and reporting them in root
sets) solves the problem. That slightly increases register pressure,
but I think is a better solution than pinning VTables (and rather
straightforward).

-- 
Egor Pasko, Intel Managed Runtime Division

Re: [drlvm] Class unloading support

Posted by Mikhail Fursov <mi...@gmail.com>.

On 10/25/06, Aleksey Ignatenko <al...@gmail.com> wrote:
>
> Egor,
> >But it has 1 more "cons" -- JIT should change it's devirtualizer
> >accordingly to the VTable change. Doable, of course.
> There is no need to change struct VTable structure - it could be simply
> inlined in pinned VTable object + 1 additional reference field to
> j.l.Class.
> So there won't be too much work to do on JIT side.
>
+1 for "Automatic class unloading approach".
But, please, keep vtables pinned in the first version. If you make vtable
objects unpinned, the JIT has to track whether the object is moved and patch
all devirtualized calls (not a simple task..).
-- 
Mikhail Fursov

Re: [drlvm] Class unloading support

Posted by Rana Dasgupta <rd...@gmail.com>.

Aleksey,
  Thanks for the answers, please see inline.

On 10/26/06, Aleksey Ignatenko <al...@gmail.com> wrote:
>
> >Rana,
>
> >- One example is Eclipse, when you build something with ant. E.g. building
> >kernel classes with ant in Eclipse will create separate class loader every
> >time developer performs build operation, so it means that sources rebuilding
> >will initiate VM native memory accumulation (memory leaks: class related
> >data, JIT code and so on) on Eclipse.

This sounds like an Eclipse bug or strange design to me that they don't
maintain state.

> >Also I think that drlvm
> >would not be able to pass some cyclic many days scenarios which use separate
> >class loader for every step, e.g. some reliability testing.

That was my point. That we are trying to do an invasive optimization for a
problem that does not exist yet. We don't have these cyclic scenarios on
memory constrained systems.

>
> >Every object  is described with VTable object, but VTable object is also a
> >full Java object, it means it is to have its own VTable object. But there 1
> >little specifics that prevents us from infinite sequence of VTables. VTable
> >describing object has variable size according to the objects class specifics
> >(number of methods). Lets name VTable of object as VT. Now VT should have
> >own VTable, name it VTable_for_VT. This VTable_for_VT describes VT (e.g.
> >size of VT as it's different for different classes), but in its turn
> >VTable_for_VT has always the same size (as VT is specific and has no methods
> >in its class). Therefore all VTable_for_VT objects could be described with
> >only one last VTable object, name it VTable_for_VTable.

Yes, I understand that you break the recursion by creating a self
referential structure when the vtables stop being unique. Thanks. That was
why I asked if the Vtable class is a special class.

> >-Yes. To avoid heap fragmentation GC should allocate pinned objects via
> >special malloc like function like gc_pinned_alloc (see
> >issues.apache.org/jira/browse/HARMONY-1935).

Let's wait to hear from other GC developers as well, what do you think? Eg.,
one solution can be to allocate pinned objects at the edge, or outside the
main Java heap, but these have several implications on performance or on the
success of your proposed method.

> >- Heap pressure is to be measured when unloading is done, but I feel
> >optimistic on that because number of VTable object is small comparing to the
> >total number of objects on heap. Adding j.l.Class reference to every object
> >will lead to great memory footprint increase. This about object overhead.
> >Right now it is 8 bytes for every object (IA32, Compressed mode). Adding
> >additional reference to the object header will increase object overhead on 4
> >bytes (IA32). Lets consider some application having about 1 million of
> >objects, them memory footprint will increase on 4Mb. I'm not familiar what
> >is the number of objects on Eclipse launch. Does anyone have this data?

I agree with the overall statement that adding a reference to each object as
opposed to each class (vtable) would possibly create more heap pressure. I
can't do more theoretical analysis than that. But Java vtables have other
issues, as discussed on this thread... an impact on devirtualization, a need
for pinning, etc. The question is whether we need this optimization, and all
the associated changes, now. And whether this is the best way to implement it.

Thanks,
Rana

Re: [drlvm] Class unloading support

Posted by Aleksey Ignatenko <al...@gmail.com>.

Rana,



>You state that DRLVM does not implement the class unloading optimization,
>and this may create memory pressure on some applications that load many
>classes. Do we have a real case / example where an application is stuck for
>insufficient memory because it uses a lot of classes initially and then
>stops using them, but these are not unloaded? One can imagine a web browser
>doing something like this. Is a web browser a typical use case for the
>Harmony JVM?



- One example is Eclipse, when you build something with ant. E.g. building
kernel classes with ant in Eclipse creates a separate class loader every
time the developer performs a build, so rebuilding sources makes the VM
accumulate native memory (leaks of class-related data, JIT code and so on)
in Eclipse. This is very similar to the normal development process, so as
soon as the developer runs out of native memory on his PC, he will have to
stop drlvm and rerun Eclipse to get rid of the memory overflow caused by the
absence of class unloading. Also, I think that drlvm would not be able to pass
some cyclic multi-day scenarios which use a separate class loader for every
step, e.g. some reliability testing.

>Regarding your engineering choices, choice 2 seems nicer, but again I have
>some questions.
>1. In the class registry, is the reference from the j.l.class instance to
>the j.l.CL instance a weak reference and the reverse not a weak reference?



- The class registry means *strong references* from j.l.Classes to their
defining j.l.Classloader and *strong references* from j.l.Classloader to the
j.l.Classes loaded by it. We provide weak roots (weak references inside the VM)
to every j.l.Classloader because we have to detect the moment when the VM can
start unloading the corresponding class loader. As soon as the 3 class
unloading conditions (see (*) in the first letter) are fulfilled, the
j.l.Classloader object is collected by GC and the weak root to it is
zeroed. This is the moment when the VM detects that the class loader was
unloaded, and it destroys its native resources.
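
A rough sketch of the weak-root scheme described above, using Python's own
collector and the weakref module as stand-ins for the VM's GC and weak roots.
All class and method names are hypothetical, and objects pointing at their
class directly (rather than through a VTable) is a simplification.

```python
import gc
import weakref


class Klass:
    """Stands in for a j.l.Class instance."""
    def __init__(self, name, loader):
        self.name = name
        self.loader = loader          # strong reference: class -> defining loader


class Loader:
    """Stands in for a j.l.Classloader instance."""
    def __init__(self, name):
        self.name = name
        self.classes = []             # strong references: loader -> its classes

    def define(self, class_name):
        k = Klass(class_name, self)
        self.classes.append(k)
        return k


class VM:
    """Keeps only weak roots to loaders, plus their (fake) native resources."""
    def __init__(self):
        self.weak_roots = {}          # loader name -> weak reference to the loader
        self.native = {}              # loader name -> fake native data

    def register(self, loader):
        self.weak_roots[loader.name] = weakref.ref(loader)
        self.native[loader.name] = "native data for " + loader.name

    def after_gc(self):
        """Free native resources of every loader whose weak root was cleared."""
        unloaded = [n for n, ref in self.weak_roots.items() if ref() is None]
        for n in unloaded:
            del self.weak_roots[n]
            del self.native[n]
        return unloaded


vm = VM()
cl = Loader("CL")
vm.register(cl)
k = cl.define("Foo")
still_used = k                        # something still holds on to the class

del cl, k
gc.collect()
print(vm.after_gc())                  # [] : loader reachable through still_used

del still_used
gc.collect()
print(vm.after_gc())                  # ['CL'] : weak root cleared, natives freed
```

The loader and its classes form a strongly connected group, so they stay alive
or die together; the VM never has to decide anything beyond checking whether
its weak root was cleared.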

>2. I am missing something about the java vtable object. Is it  a first class
>java object with its own java class? In this case the vtable object would
>have its own vtable which is a java object, but that also would have a
>vtable and so on...??? In other words if every java object has a vtable,
>which is a also a java object.......



- You are partially right; as I already said in another letter, the number of
VTables is *proportional* (not equal) to the number of classes. Now the details:

Every object is described by a VTable object, but a VTable object is also a
full Java object, which means it has to have its own VTable object. But there
is one specific detail that prevents an infinite sequence of VTables. A VTable
describing an object has variable size, depending on the specifics of the
object's class (number of methods). Let's name the VTable of an object VT. Now
VT should have its own VTable; name it VTable_for_VT. This VTable_for_VT
describes VT (e.g. the size of VT, as it differs between classes), but
VTable_for_VT itself always has the same size (as VT is special and has no
methods in its class). Therefore all VTable_for_VT objects can be described by
a single last VTable object; name it VTable_for_VTable. So, we have the
following structure:

VTable_for_VTable -> VTable_for_VT -> VT -> object. The VTable_for_VTable
object is the only one for the whole VM; VTable_for_VT and VT objects are
created for every class loaded.
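
To make the chain concrete, here is a small Python model of it. This is only a
sketch: the names, the byte sizes and the assumption that VTable_for_VTable
describes itself are illustrative, not taken from the DRLVM sources.

```python
class VTableObj:
    """A VTable modeled as an ordinary object that has its own describing VTable."""
    def __init__(self, name, size, described_by=None):
        self.name = name
        self.size = size                    # size in bytes (made-up numbers)
        # Assumption: the root describes itself, which keeps the chain finite.
        self.vtable = described_by if described_by is not None else self


FIXED_VT_FOR_VT_SIZE = 16                   # hypothetical constant size

# Exactly one of these exists per VM.
VTABLE_FOR_VTABLE = VTableObj("VTable_for_VTable", FIXED_VT_FOR_VT_SIZE)


def load_class(class_name, num_methods):
    """Create the two per-class VTable objects and return VT."""
    vt_for_vt = VTableObj("VTable_for_VT(%s)" % class_name,
                          FIXED_VT_FOR_VT_SIZE, VTABLE_FOR_VTABLE)
    vt_size = 16 + 4 * num_methods          # varies with the number of methods
    return VTableObj("VT(%s)" % class_name, vt_size, vt_for_vt)


def chain(vt):
    """Names of the VTables reached from vt, ending at the self-describing root."""
    names = [vt.name]
    while vt.vtable is not vt:
        vt = vt.vtable
        names.append(vt.name)
    return names


print(chain(load_class("Foo", 10)))
# ['VT(Foo)', 'VTable_for_VT(Foo)', 'VTable_for_VTable']
```

Whatever the class, the walk stops after three steps, and each loaded class
adds exactly two VTable objects.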

So as you can see, there is no infinite "tail" of VTables behind
every object. This may be a bit unclear at first; if so, I
could draw some pictures to show the idea.



>3. If I am misunderstanding the above(  I hope ), the vtable objects would
>need to be pinned to avoid patching virtual calls after GC, efficient
>dispatching etc. Does this not put a requirement on compatible GC's to be
>able to deal with pinned objects?



- Yes. To avoid heap fragmentation, the GC should allocate pinned objects via
a special malloc-like function such as gc_pinned_alloc (see
issues.apache.org/jira/browse/HARMONY-1935).

>4. Why cannot one have a j.l.class reference in the object header, as
Weldon
>mentions, instead of this new vtable java type? Is the peformance impact
>known and do we understand it as compared to heap pressure due to the new
>vtable object?

- Heap pressure is to be measured when unloading is done, but I feel
optimistic about it because the number of VTable objects is small compared
to the total number of objects on the heap. Adding a j.l.Class reference to
every object would greatly increase the memory footprint. Think about object
overhead: right now it is 8 bytes for every object (IA32, compressed mode).
Adding another reference to the object header would increase the overhead by
4 bytes (IA32). Consider an application with about 1 million objects: the
memory footprint would increase by 4 MB. I don't know how many objects
there are on Eclipse launch. Does anyone have this data?
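
The arithmetic behind that estimate can be written out as a short C++
sketch. The function names are made up for illustration, and the object
count is the assumed figure of 1 million from above, not a measurement.

```cpp
#include <cassert>
#include <cstdint>

// Extra heap bytes from adding one reference field to every object header.
constexpr std::uint64_t extra_header_bytes(std::uint64_t object_count,
                                           std::uint64_t ref_size_bytes) {
    return object_count * ref_size_bytes;
}

// Relative growth of the per-object header, in percent.
constexpr std::uint64_t header_growth_percent(std::uint64_t old_header_bytes,
                                              std::uint64_t added_bytes) {
    return added_bytes * 100 / old_header_bytes;
}

// 1 million objects with a 4-byte reference (IA32): 4,000,000 bytes, about 4 MB.
static_assert(extra_header_bytes(1000000, 4) == 4000000, "1M objects x 4 bytes");
// Going from an 8-byte header to a 12-byte header is a 50% increase.
static_assert(header_growth_percent(8, 4) == 50, "8 -> 12 byte header");
```

The absolute number scales linearly with the object count, so the real cost
depends on how many live objects an application such as Eclipse keeps.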

Initially there was an idea to trace j.l.Class via the native struct VTable
(easily doable in the gcv4 version of GC), but that led to a problem: the
slot referencing j.l.Class was outside the Java heap, and there would be
problems with the next version of GC. I think Ivan could explain it in
detail.



The current automatic unloading design is heavily connected to the current
drlvm architecture and implementation specifics, therefore we have to choose
between pinning and memory overhead or maybe something else to get uniform
support of class unloading.

Aleksey.


On 10/27/06, Weldon Washburn <we...@gmail.com> wrote:
>
> Steve Blackburn was in Portland Oregon today.  I mentioned the idea of
> adding a  reference pointer from object to its j.l.Class instance.  MMTk
> was
> not designed with this idea in mind.  It looks like you will need to fix
> this part of MMTk and maintain it yourself.  Steve did not seem thrilled
> at
> adding this support to MMTk code base.
>
> Have we looked at other class unloading designs?  From what I have read in
> open literature on object layout, I don't recall any special fields to
> support class unloading.
>
>
> On 10/26/06, Rana Dasgupta <rd...@gmail.com> wrote:
> >
> > Aleksey,
> >   I had a couple of questions.
> >   You state that DRLVM does not implement the class unloading
> > optimization,
> > and this may create memory pressure on some applications that load many
> > classes. Do we have a real case / example where an application is stuck
> > for
> > insufficient memory because it uses a lot of classes initially and then
> > stops using them, but these are not unloaded? One can imagine a web
> > browser
> > doing something like this. Is a web browser a typical use case for the
> > Harmony JVM?
> >
> > Regarding your engineering choices, choice 2 seems nicer, but again I
> have
> > some questions.
> >
> > 1. In the class registry, is the reference from the j.l.class instance
> to
> > the j.l.CL instance a weak refernce and the reverse not a weak
> reference?
> > 2. I am missing something about the java vtable object. Is it  a first
> > class
> > java object with its own java class? In this case the vtable object
> would
> > have its own vtable which is a java object, but that also would have a
> > vtable and so on...??? In other words if every java object has a vtable,
> > which is a also a java object.......
> > 3. If I am misunderstanding the above(  I hope ), the vtable objects
> would
> > need to be pinned to avoid patching virtual calls after GC, efficient
> > dispatching etc. Does this not put a requirement on compatible GC's to
> be
> > able to deal with pinned objects?
> > 4. Why cannot one have a j.l.class reference in the object header, as
> > Weldon
> > mentions, instead of this new vtable java type? Is the peformance impact
> > known and do we understand it as compared to heap pressure due to the
> new
> > vtable object?
> >
> > Thanks,
> > Rana
> >
> >
> >
> >
> > > On 10/24/06, Aleksey Ignatenko <al...@gmail.com> wrote:
> > > >
> > > > Egor,
> > > > >But it has 1 more "cons" -- JIT should change it's devirtualizer
> > > > >accordingly to the VTable change. Doable, of course.
> > > > There is no need to change struct VTable structure - it could be
> > simply
> > > > inlined in pinned VTable object + 1 additional reference field to
> > > > j.l.Class.
> > > > So there won't be too much work to do on JIT side.
> > > >
> > > > >BTW, is it reasonable to "compress" or "enumerate" references to
> > > > >j.l.Class in each object to reduce the footprint? How many classes
> > are
> > > > >alive in heavy-duty applications? not very much probably.
> > > > We are to trace j.l.Class from every object via VTable to detect if
> > > > there is
> > > > any live object of that j.l.Class. This one of requirements of class
> > > > unloading.
> > > > As for footprint - there is already pointer to struct VTable in
> every
> > > > object, so changing this pointer to reference to VTable Object will
> > have
> > > > no
> > > > effect on footprint. Compressed VTable pointers will be changed to
> > > > compressed references. The only effect is that VTable object is a
> full
> > > > Java
> > > > object and in its turn it is to have own VTable, so number of VTable
> > > > objects
> > > > will encrease for every class. As Vtable is a small object footprint
> > > > will
> > > > encrease only for tens of bytes for every loaded class, and as I
> know,
> > > > there
> > > > are loaded several thousands classes on Eclipse startup, therefore
> > > > footprint
> > > > increase is negligible.
> > > >
> > > > Aleksey Ignatenko,
> > > > Intel Enterprise Solutions Software Division
> > >
> > > .
> >
> >
>
>
> --
> Weldon Washburn
> Intel Enterprise Solutions Software Division
>
>

Re: [drlvm] Class unloading support

Posted by Rana Dasgupta <rd...@gmail.com>.

Yes, I thought about this breaking MMTk too. BTW, I don't know if Jikes does
class unloading, but they may get automatic unloading almost for free since
it is all java?

I am not very knowledgeable regarding other methods, but I think you can
always count manually to figure out when the loader instance becomes
unreachable. That would be like Aleksey's method 1, which is quite invasive.

Thanks,
Rana

> On 10/26/06, Weldon Washburn <we...@gmail.com> wrote:
> >
> > Steve Blackburn was in Portland Oregon today.  I mentioned the idea of
> > adding a  reference pointer from object to its j.l.Class instance.  MMTk
> > was
> > not designed with this idea in mind.  It looks like you will need to fix
> > this part of MMTk and maintain it yourself.  Steve did not seem thrilled
> > at
> > adding this support to MMTk code base.
> >
> > Have we looked at other class unloading designs?  From what I have read
> > in
> > open literature on object layout, I don't recall any special fields to
> > support class unloading.
> >
> >

Re: [drlvm] Class unloading support

Posted by Weldon Washburn <we...@gmail.com>.

Sorry for the confusion.  We are getting ourselves all tangled up with
subconversations in this thread.  There have been 90+ replies to the
original posting.

No patch containing class unloading will be committed until Harmony has a
design and the design has been implemented.

What is being discussed is a patch that cleans up the malloc/free of C
structs that are currently used for class loading.  I have looked at the
proposed patch.  It looks to have low impact on stability.  It contains no
class unloading code.  It's not urgent to apply this patch.  I will hold off
doing anything until the confusion clears.  It might even be better to open
a new JIRA titled something like, "classloader malloc/free cleanup".  Note
there are currently 12 files attached to HARMONY-2000.

The patch at issue was split out of the original class unloading patch to
isolate independent problems.


On 11/10/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>
> Hang on - we aren't going to consider this patch quite yet, are we?  We
> have a very active and fruitful discussion going on regarding alternate
> approaches?
>
> geir
>
>
> Aleksey Ignatenko wrote:
> > Weldon, I have attached updated patch to H-2000:
> > cleanup_sources_1558_merged.patch.
> > Please, see comments.
> >
> > Aleksey.
> >
> >
> > On 11/10/06, Weldon Washburn <we...@gmail.com> wrote:
> >>
> >> Aleksey,
> >> I tried to apply native_sources_cleanup_upd.patch.  svn HEAD has
> changed
> >> and
> >> the patch no longer works.  Part of the problem is that JIRA 1558 has
> >> been
> >> committed.  In addition to the below issues, I posted comments to
> >> JIRA HARMONY-2000.
> >>
> >>
> >> On 11/2/06, Weldon Washburn <we...@gmail.com> wrote:
> >> >
> >> > Aleksey,
> >> >
> >> > Excellent step forward -- breaking the patch into two pieces.   This
> >> made
> >> > the patch(es) much more readable.
> >> >
> >> > I glanced at native_sources_cleanup.patch.  It looks like code for
> >> > alloc/dealloc vtables and jitted code blocks.  The original patch
> made
> >> > vtables into objects.  Will native_sources_cleanup need to change if
> >> vtables
> >> > are normal C structs instead?  Also, I see reference to path
> >> .../gcv4/...  I
> >> > guess this will need to change to support gc_gen and gc_cc.
> >> >
> >> > Once you get native_sources_cleanup.patch in good shape, I have no
> >> problem
> >> > committing it.
> >> >
> >> > If there is no other debate on class unloading design, I will call
> >> for a
> >> > vote in a seperate email.
> >> >
> >> >
> >> >
> >> > On 11/2/06, Aleksey Ignatenko <al...@gmail.com> wrote:
> >> > >
> >> > > Hi, everyone.
> >> > >
> >> > > I've splitted Harmony-2000 (see details:
> >> > > http://issues.apache.org/jira/browse/HARMONY-2000) patch with
> >> automatic
> >> > > class unloading implementation into 2 independent parts:
> >> > > 1. cleaning native resources (native_sources_cleanup.patch).
> >> > > 2. automatic unloading design implementation
> (auto_unloading.patch).
> >> > >
> >> > > The first part is independent for all class unloading designs and
> >> could
> >> > > be
> >> > > commited. The second part is class unloading design implementation
> >> (the
> >> > > best
> >> > > class unloading approach is discussed now).
> >> > >
> >> > > I propose to commit native_sources_cleanup.patch and continue class
> >> > > unloading development with minimal requirements on drlvm.
> >> > >
> >> > > Aleksey.
> >> > >
> >> > >
> >> > > On 11/1/06, Aleksey Ignatenko <al...@gmail.com> wrote:
> >> > > >
> >> > > > Oops, sorry, there was a typo in my suggestion. It should be:
> >> > > >                 if (cl->IsBootstrap() || env->b_VTable_trace_is_not_supported_by_GC)
> >> > > >                 {
> >> > > >                     vm_enumerate_jlc(c);
> >> > > >                     if (c->vtable)
> >> > > >                         vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
> >> > > >                 }
> >> > > >
> >> > > > Aleksey.
> >> > > >
> >> > > >  On 11/1/06, Aleksey Ignatenko < aleksey.ignatenko@gmail.com>
> >> wrote:
> >> > > > >
> >> > > > > Weldon,
> >> > > > >
> >> > > > > >A question for all involved.  Is it possible to somehow make
> it
> >> > > appear
> >> > > > > that
> >> > > > > >the new objects related to unloading  (VTable, ClassLoader,
> >> > > etc)  are
> >> > > > > always
> >> > > > > >reachable and thus never collected?  I am trying to figure out
> a
> >> > > way to
> >> > > > > make
> >> > > > > >integration of class unloading independent of correct support
> >> > > inside
> >> > > > > the GC
> >> > > > > >and JIT.  This option could be a command line switch or
> compile
> >> > > time
> >> > > > > option.
> >> > > > >
> >> > > > > I agree with Robin:
> >> > > > > >Simple.  Keep a list or table of these objects as part of the
> >> root
> >> > > set.
> >> > > > > >Enumerate it optionally depending on a command line option.
> >> > > > >
> >> > > > > Details: you can see from the Harmony-2000 patch that this is
> >> > > > > already done for Bootstrap classes. If you look at the
> >> > > > > vm_enumerate_static_fields() function in root_set_enum_common.cpp
> >> > > > > (with the patch applied), there is this code:
> >> > > > >                 if (cl->IsBootstrap())
> >> > > > >                 {
> >> > > > >                     vm_enumerate_jlc(c);
> >> > > > >                     if (c->vtable)
> >> > > > >                         vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
> >> > > > >                 }
> >> > > > >                 else
> >> > > > >                 {
> >> > > > >                     vm_enumerate_jlc(c, true/*weak*/);
> >> > > > >                 }
> >> > > > > You can see that there are strong roots to Bootstrap j.l.Classes
> >> > > > > and their VTable objects. So I suppose it would be very simple to
> >> > > > > propagate strong roots to all other classes (not only Bootstrap),
> >> > > > > something like:
> >> > > > >                 if (cl->IsBootstrap() && env->b_VTable_trace_is_not_supported_by_GC)
> >> > > > >                 {
> >> > > > >                     vm_enumerate_jlc(c);
> >> > > > >                     if (c->vtable)
> >> > > > >                         vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
> >> > > > >                 }
> >> > > > > where b_VTable_trace_is_not_supported_by_GC is a flag which is set
> >> > > > > according to the GC in use. This will force switching off any class
> >> > > > > unloading support.
> >> > > > >
> >> > > > > Aleksey.
> >> > > > >
> >> > > > >  On 11/1/06, Robin Garner <robin.garner@anu.edu.au > wrote:
> >> > > > > >
> >> > > > > > Weldon Washburn wrote:
> >> > > > > > > On 10/30/06, Robin Garner < robin.garner@anu.edu.au >
> wrote:
> >> > > > > > >>
> >> > > > > > >> Weldon Washburn wrote:
> >> > > > > > >> > On 10/27/06, Geir Magnusson Jr. < geir@pobox.com> wrote:
> >> > > > > > >> >>
> >> > > > > > >> >>
> >> > > > > > >> >>
> >> > > > > > >> >> Weldon Washburn wrote:
> >> > > > > > >> >> > Steve Blackburn was in Portland Oregon today.  I
> >> mentioned
> >> > > the
> >> > > > > > idea
> >> > > > > > >> of
> >> > > > > > >> >> > adding a  reference pointer from object to its
> >> > > j.l.Classinstance.
> >> > > > > > >> >> MMTk
> >> > > > > > >> >> > was
> >> > > > > > >> >> > not designed with this idea in mind.  It looks like
> you
> >> > > will
> >> > > > > > need to
> >> > > > > > >> >> fix
> >> > > > > > >> >> > this part of MMTk and maintain it yourself.  Steve
> did
> >> not
> >> > > > > > seem
> >> > > > > > >> >> thrilled
> >> > > > > > >> >> at
> >> > > > > > >> >> > adding this support to MMTk code base.
> >> > > > > > >>
> >> > > > > > >> Actually I think the answer may have been a little garbled
> >> > > along
> >> > > > > > the way
> >> > > > > > >> here: MMTk is not a memory manager *for* Java, it is
> >> simply a
> >> > > > > > memory
> >> > > > > > >> manager.  We have been careful to eliminate
> >> language-specific
> >> > > > > > features
> >> > > > > > >> in the heap that it manages.  MMTk has been used to
> >> manage C#
> >> > > (in
> >> > > > > > the
> >> > > > > > >> Rotor VM) and was being incorporated into a Haskell
> runtime
> >> > > until I
> >> > > > > > ran
> >> > > > > > >> out of time.
> >> > > > > > >>
> >> > > > > > >> Therefore, MMTk knows nothing about the concept of class
> >> > > unloading,
> >> > > > > > or
> >> > > > > > >> java.lang.Class .
> >> > > > > > >>
> >> > > > > > >> >> How does MMTk support class unloading then?
> >> > > > > > >> >
> >> > > > > > >> >
> >> > > > > > >> > MMTk has no special support for class unloading.  This
> may
> >> > > have
> >> > > > > > >> > something to
> >> > > > > > >> > do with the entire system being written in Java thus
> class
> >> > > > > > unloading
> >> > > > > > >> come
> >> > > > > > >> > along for free.  If there needs to be a modification to
> >> > > support
> >> > > > > > special
> >> > > > > > >> > case
> >> > > > > > >> > objects in DRLVM, someone will need to fixup MMTk and
> >> provide
> >> > > > > > onging
> >> > > > > > >> > support of this patch in Harmony.  I have zero idea how
> >> big
> >> > > this
> >> > > > > > effort
> >> > > > > > >> > would be.   It would also be good to hear what the
> impact
> >> > > will be
> >> > > > > > on
> >> > > > > > >> GCV5.
> >> > > > > > >>
> >> > > > > > >> MMTk implements several algorithms for retaining the
> >> reachable
> >> > > > > > objects
> >> > > > > > >> in a graph and recycling space used by unreachable
> ones.  It
> >> > > relies
> >> > > > > > on
> >> > > > > > >> the host VM to provide a set of roots.  It supports
> several
> >> > > > > > different
> >> > > > > > >> semantics of 'weak' references, including but not
> >> confined to
> >> > > those
> >> > > > > > >> required by Java.
> >> > > > > > >>
> >> > > > > > >> If you can implement class unloading using those (which
> the
> >> > > current
> >> > > > > >
> >> > > > > > >> proposal does), then MMTk can help.
> >> > > > > > >>
> >> > > > > > >> If you want to put a pointer to the j.l.Class in the
> object
> >> > > header,
> >> > > > > > MMTk
> >> > > > > > >> will not care, as it has no way of knowing.  If you put an
> >> > > > > > additional
> >> > > > > > >> pointer into the body of every object, then MMTk will see
> it
> >> as
> >> > > > > > just
> >> > > > > > >> another object to scan.
> >> > > > > > >>
> >> > > > > > >> Remember MMTk is a memory manager, not a Java VM!
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >> Conversely, supporting some exotic class unloading
> mechanism
> >> in
> >> > >
> >> > > > > > MMTk
> >> > > > > > >> shouldn't be hard and wouldn't deter me from trying it
> out.
> >> > > > > > >
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > Robin, it would be great if you can get MMTk to support
> this
> >> > > class
> >> > > > > > > unloading
> >> > > > > > > effort.  My main concern is the ongoing maintenance of MMTk
> >> > > class
> >> > > > > > unloading
> >> > > > > > > support.
> >> > > > > >
> >> > > > > > I haven't seen any proposal that requires MMTk to be
> modified,
> >> so
> >> > > it's
> >> > > > > > a
> >> > > > > > moot point at the moment.
> >> > > > > >
> >> > > > > > > A question for all involved.  Is it possible to somehow
> make
> >> it
> >> > > > > > appear that
> >> > > > > > > the new objects related to unloading  (VTable, ClassLoader,
> >> > > > > > etc)  are
> >> > > > > > > always
> >> > > > > > > reachable and thus never collected?  I am trying to figure
> >> out
> >> a
> >> > > way
> >> > > > > > to
> >> > > > > > > make
> >> > > > > > > integration of class unloading independent of correct
> support
> >> > > inside
> >> > > > > > the GC
> >> > > > > > > and JIT.  This option could be a command line switch or
> >> compile
> >> > > time
> >> > > > > >
> >> > > > > > > option.
> >> > > > > >
> >> > > > > > Simple.  Keep a list or table of these objects as part of the
> >> root
> >> > >
> >> > > > > > set.
> >> > > > > > Enumerate it optionally depending on a command line option.
> >> > > > > >
> >> > > > > > cheers,
> >> > > > > > Robin
> >> > > > > >
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> > >
> >> >
> >> >
> >> > --
> >> > Weldon Washburn
> >> > Intel Enterprise Solutions Software Division
> >> >
> >>
> >>
> >>
> >> --
> >> Weldon Washburn
> >> Intel Enterprise Solutions Software Division
> >>
> >>
> >
>



-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Aleksey Ignatenko <al...@gmail.com>.

I've added a class unloading test to the HARMONY-2000 attachments:
Test_unloading_native_lib.zip. This test is specific to the drlvm class
unloading implementation.

Aleksey.


On 11/3/06, Aleksey Ignatenko <al...@gmail.com> wrote:
>
> Weldon,
> >I glanced at native_sources_cleanup.patch.  It looks like code for
> >alloc/dealloc vtables and jitted code blocks.  The original patch made
> >vtables into objects.  Will native_sources_cleanup need to change if
> vtables
> >are normal C structs instead?  Also, I see reference to path
> .../gcv4/...  I
> >guess this will need to change to support gc_gen and gc_cc.
> Vtables are not affected by the native resource cleanup patch (no change
> from C struct to object).
> GCV4: There is some code cleanup and native resource cleaning in gcv4. The
> same will be done for gc_gen and gc_cc by the GC people in a separate JIRA,
> because it could affect performance.
>
> I have updated the patches to their final versions:
> native_sources_cleanup_upd.patch, auto_unloading_upd.patch. So I suppose
> native_sources_cleanup_upd.patch is ready for commit.
> Aleksey.
>
>
>
>  On 11/2/06, Weldon Washburn <we...@gmail.com> wrote:
> >
> > Aleksey,
> >
> > Excellent step forward -- breaking the patch into two pieces.   This
> > made
> > the patch(es) much more readable.
> >
> > I glanced at native_sources_cleanup.patch.  It looks like code for
> > alloc/dealloc vtables and jitted code blocks.  The original patch made
> > vtables into objects.  Will native_sources_cleanup need to change if
> > vtables
> > are normal C structs instead?  Also, I see reference to path
> > .../gcv4/...  I
> > guess this will need to change to support gc_gen and gc_cc.
> >
> > Once you get native_sources_cleanup.patch in good shape, I have no
> > problem
> > committing it.
> >
> > If there is no other debate on class unloading design, I will call for a
> > vote in a seperate email.
> >
> >
> >
> > On 11/2/06, Aleksey Ignatenko < aleksey.ignatenko@gmail.com> wrote:
> > >
> > > Hi, everyone.
> > >
> > > I've splitted Harmony-2000 (see details:
> > > http://issues.apache.org/jira/browse/HARMONY-2000 ) patch with
> > automatic
> > > class unloading implementation into 2 independent parts:
> > > 1. cleaning native resources (native_sources_cleanup.patch).
> > > 2. automatic unloading design implementation (auto_unloading.patch).
> > >
> > > The first part is independent for all class unloading designs and
> > could be
> > > commited. The second part is class unloading design implementation
> > (the
> > > best
> > > class unloading approach is discussed now).
> > >
> > > I propose to commit native_sources_cleanup.patch and continue class
> > > unloading development with minimal requirements on drlvm.
> > >
> > > Aleksey.
> > >
> > >
> > > On 11/1/06, Aleksey Ignatenko < aleksey.ignatenko@gmail.com> wrote:
> > > >
> > > > > Oops, sorry, there was a typo in my suggestion. It should be:
> > > > >                 if (cl->IsBootstrap() || env->b_VTable_trace_is_not_supported_by_GC)
> > > > >                 {
> > > > >                     vm_enumerate_jlc(c);
> > > > >                     if (c->vtable)
> > > > >                         vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
> > > > >                 }
> > > >
> > > > Aleksey.
> > > >
> > > >  On 11/1/06, Aleksey Ignatenko <aleksey.ignatenko@gmail.com > wrote:
> > > > >
> > > > > Weldon,
> > > > >
> > > > > >A question for all involved.  Is it possible to somehow make it
> > > appear
> > > > > that
> > > > > >the new objects related to unloading  (VTable, ClassLoader,
> > etc)  are
> > > > > always
> > > > > >reachable and thus never collected?  I am trying to figure out a
> > way
> > > to
> > > > > make
> > > > > >integration of class unloading independent of correct support
> > inside
> > > > > the GC
> > > > > >and JIT.  This option could be a command line switch or compile
> > time
> > > > > option.
> > > > >
> > > > > I agree with Robin:
> > > > > >Simple.  Keep a list or table of these objects as part of the
> > root
> > > set.
> > > > > >Enumerate it optionally depending on a command line option.
> > > > >
> > > > > > Details: you can see from the Harmony-2000 patch that this is
> > > > > > already done for Bootstrap classes. If you look at the
> > > > > > vm_enumerate_static_fields() function in root_set_enum_common.cpp
> > > > > > (with the patch applied), there is this code:
> > > > > >                 if (cl->IsBootstrap())
> > > > > >                 {
> > > > > >                     vm_enumerate_jlc(c);
> > > > > >                     if (c->vtable)
> > > > > >                         vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
> > > > > >                 }
> > > > > >                 else
> > > > > >                 {
> > > > > >                     vm_enumerate_jlc(c, true/*weak*/);
> > > > > >                 }
> > > > > > You can see that there are strong roots to Bootstrap j.l.Classes
> > > > > > and their VTable objects. So I suppose it would be very simple to
> > > > > > propagate strong roots to all other classes (not only Bootstrap),
> > > > > > something like:
> > > > > >                 if (cl->IsBootstrap() && env->b_VTable_trace_is_not_supported_by_GC)
> > > > > >                 {
> > > > > >                     vm_enumerate_jlc(c);
> > > > > >                     if (c->vtable)
> > > > > >                         vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
> > > > > >                 }
> > > > > > where b_VTable_trace_is_not_supported_by_GC is a flag which is set
> > > > > > according to the GC in use. This will force switching off any class
> > > > > > unloading support.
> > > > >
> > > > > Aleksey.
> > > > >
> > > > >  On 11/1/06, Robin Garner < robin.garner@anu.edu.au > wrote:
> > > > > >
> > > > > > Weldon Washburn wrote:
> > > > > > > On 10/30/06, Robin Garner < robin.garner@anu.edu.au > wrote:
> > > > > > >>
> > > > > > >> Weldon Washburn wrote:
> > > > > > >> > On 10/27/06, Geir Magnusson Jr. < geir@pobox.com> wrote:
> > > > > > >> >>
> > > > > > >> >>
> > > > > > >> >>
> > > > > > >> >> Weldon Washburn wrote:
> > > > > > >> >> > Steve Blackburn was in Portland Oregon today.  I
> > mentioned
> > > the
> > > > > > idea
> > > > > > >> of
> > > > > > >> >> > adding a  reference pointer from object to its
> > > j.l.Classinstance.
> > > > > > >> >> MMTk
> > > > > > >> >> > was
> > > > > > >> >> > not designed with this idea in mind.  It looks like you
> > will
> > > > > > need to
> > > > > > >> >> fix
> > > > > > >> >> > this part of MMTk and maintain it yourself.  Steve did
> > not
> > > > > > seem
> > > > > > >> >> thrilled
> > > > > > >> >> at
> > > > > > >> >> > adding this support to MMTk code base.
> > > > > > >>
> > > > > > >> Actually I think the answer may have been a little garbled
> > along
> > > > > > the way
> > > > > > >> here: MMTk is not a memory manager *for* Java, it is simply a
> > > > > > memory
> > > > > > >> manager.  We have been careful to eliminate language-specific
> > > > > > features
> > > > > > >> in the heap that it manages.  MMTk has been used to manage C#
> > (in
> > > > > > the
> > > > > > >> Rotor VM) and was being incorporated into a Haskell runtime
> > until
> > > I
> > > > > > ran
> > > > > > >> out of time.
> > > > > > >>
> > > > > > >> Therefore, MMTk knows nothing about the concept of class
> > > unloading,
> > > > > > or
> > > > > > >> java.lang.Class.
> > > > > > >>
> > > > > > >> >> How does MMTk support class unloading then?
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > MMTk has no special support for class unloading.  This may
> > have
> > > > > > >> > something to
> > > > > > >> > do with the entire system being written in Java thus class
> > > > > > unloading
> > > > > > >> come
> > > > > > >> > along for free.  If there needs to be a modification to
> > support
> > > > > > special
> > > > > > >> > case
> > > > > > >> > objects in DRLVM, someone will need to fixup MMTk and
> > provide
> > > > > > onging
> > > > > > >> > support of this patch in Harmony.  I have zero idea how big
> >
> > > this
> > > > > > effort
> > > > > > >> > would be.   It would also be good to hear what the impact
> > will
> > > be
> > > > > > on
> > > > > > >> GCV5.
> > > > > > >>
> > > > > > >> MMTk implements several algorithms for retaining the
> > reachable
> > > > > > objects
> > > > > > >> in a graph and recycling space used by unreachable ones.  It
> > > relies
> > > > > > on
> > > > > > >> the host VM to provide a set of roots.  It supports several
> > > > > > different
> > > > > > >> semantics of 'weak' references, including but not confined to
> >
> > > those
> > > > > > >> required by Java.
> > > > > > >>
> > > > > > >> If you can implement class unloading using those (which the
> > > current
> > > > > >
> > > > > > >> proposal does), then MMTk can help.
> > > > > > >>
> > > > > > >> If you want to put a pointer to the j.l.Class in the object
> > > header,
> > > > > > MMTk
> > > > > > >> will not care, as it has no way of knowing.  If you put an
> > > > > > additional
> > > > > > >> pointer into the body of every object, then MMTk will see it
> > as
> > > > > > just
> > > > > > >> another object to scan.
> > > > > > >>
> > > > > > >> Remember MMTk is a memory manager, not a Java VM!
> > > > > > >>
> > > > > > >>
> > > > > > >> Conversely, supporting some exotic class unloading mechanism
> > in
> > > > > > MMTk
> > > > > > >> shouldn't be hard and wouldn't deter me from trying it out.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Robin, it would be great if you can get MMTk to support this
> > class
> > > > > > > unloading
> > > > > > > effort.  My main concern is the ongoing maintenance of MMTk
> > class
> > > > > > unloading
> > > > > > > support.
> > > > > >
> > > > > > I haven't seen any proposal that requires MMTk to be modified,
> > so
> > > it's
> > > > > > a
> > > > > > moot point at the moment.
> > > > > >
> > > > > > > A question for all involved.  Is it possible to somehow make
> > it
> > > > > > appear that
> > > > > > > the new objects related to unloading  (VTable, ClassLoader,
> > > > > > etc)  are
> > > > > > > always
> > > > > > > reachable and thus never collected?  I am trying to figure out
> > a
> > > way
> > > > > > to
> > > > > > > make
> > > > > > > integration of class unloading independent of correct support
> > > inside
> > > > > > the GC
> > > > > > > and JIT.  This option could be a command line switch or
> > compile
> > > time
> > > > > >
> > > > > > > option.
> > > > > >
> > > > > > Simple.  Keep a list or table of these objects as part of the
> > root
> > > > > > set.
> > > > > > Enumerate it optionally depending on a command line option.
> > > > > >
> > > > > > cheers,
> > > > > > Robin
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> > >
> >
> >
> > --
> > Weldon Washburn
> > Intel Enterprise Solutions Software Division
> >
> >
>

Re: [drlvm] Class unloading support

Posted by Aleksey Ignatenko <al...@gmail.com>.

Weldon, I've created a separate JIRA, H-2158, called "native resource
cleanup", so you can proceed with closing H-2000.

Aleksey.


On 11/10/06, Weldon Washburn <we...@gmail.com> wrote:
>
> On 11/10/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
> >
> > I wonder if you might want to create a new JIRA that's clear about what
> > the point is, and close the class unload JIRa for now.
>
>
>
> I was hoping someone would suggest closing HARMONY-2000.  Unless there are
> objections in the next 24 hours, consider it done.
>
>
> geir
> >
> >
> > Pavel Pervov wrote:
> > > On 11/10/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
> > >>
> > >> Hang on - we aren't going to consider this patch quite yet, are
> we?  We
> > >> have a very active and fruitful discussion going on regarding
> alternate
> > >> approaches?
> > >>
> > >> geir
> > >
> > >
> > > This part of the patch does not contain class unloading implementation
> > > but instead contain native resources cleanup code, which is required
> by
> > any
> > > choosen class unloading design to be implemented in DRLVM.
> > > So, my +1 to commit this part, and hold on with the second, until
> > > harmony-dev derives the decision.
> > >
> > > Regards
> >
>
>
>
> --
> Weldon Washburn
> Intel Enterprise Solutions Software Division
>
>

Re: [drlvm] Class unloading support

Posted by Weldon Washburn <we...@gmail.com>.

On 11/10/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>
> I wonder if you might want to create a new JIRA that's clear about what
> the point is, and close the class unload JIRa for now.



I was hoping someone would suggest closing HARMONY-2000.  Unless there are
objections in the next 24 hours, consider it done.


> geir
>
>
> Pavel Pervov wrote:
> > On 11/10/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
> >>
> >> Hang on - we aren't going to consider this patch quite yet, are we?  We
> >> have a very active and fruitful discussion going on regarding alternate
> >> approaches?
> >>
> >> geir
> >
> >
> > This part of the patch does not contain class unloading implementation
> > but instead contain native resources cleanup code, which is required by
> any
> > choosen class unloading design to be implemented in DRLVM.
> > So, my +1 to commit this part, and hold on with the second, until
> > harmony-dev derives the decision.
> >
> > Regards
>



-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

I wonder if you might want to create a new JIRA that's clear about what
the point is, and close the class unloading JIRA for now.

geir


Pavel Pervov wrote:
> On 11/10/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>>
>> Hang on - we aren't going to consider this patch quite yet, are we?  We
>> have a very active and fruitful discussion going on regarding alternate
>> approaches?
>>
>> geir
> 
> 
> This part of the patch does not contain class unloading implementation
> but instead contain native resources cleanup code, which is required by any
> choosen class unloading design to be implemented in DRLVM.
> So, my +1 to commit this part, and hold on with the second, until
> harmony-dev derives the decision.
> 
> Regards

Re: [drlvm] Class unloading support

Posted by Pavel Pervov <pm...@gmail.com>.

On 11/10/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>
> Hang on - we aren't going to consider this patch quite yet, are we?  We
> have a very active and fruitful discussion going on regarding alternate
> approaches?
>
> geir

This part of the patch does not contain the class unloading implementation;
instead it contains native resource cleanup code, which is required by any
class unloading design chosen for DRLVM.
So, my +1 to commit this part, and hold off on the second until
harmony-dev reaches a decision.

Regards
-- 
Pavel Pervov,
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

Hang on - we aren't going to consider this patch quite yet, are we?  We 
have a very active and fruitful discussion going on regarding alternate 
approaches?

geir


Aleksey Ignatenko wrote:
> Weldon, I have attached updated patch to H-2000:
> cleanup_sources_1558_merged.patch.
> Please, see comments.
> 
> Aleksey.
> 
> 
> On 11/10/06, Weldon Washburn <we...@gmail.com> wrote:
>>
>> Aleksey,
>> I tried to apply native_sources_cleanup_upd.patch.  svn HEAD has changed
>> and
>> the patch no longer works.  Part of the problem is that JIRA 1558 has 
>> been
>> committed.  In addition to the below issues, I posted comments to
>> JIRA HARMONY-2000.
>>
>>
>> On 11/2/06, Weldon Washburn <we...@gmail.com> wrote:
>> >
>> > Aleksey,
>> >
>> > Excellent step forward -- breaking the patch into two pieces.   This
>> made
>> > the patch(es) much more readable.
>> >
>> > I glanced at native_sources_cleanup.patch.  It looks like code for
>> > alloc/dealloc vtables and jitted code blocks.  The original patch made
>> > vtables into objects.  Will native_sources_cleanup need to change if
>> vtables
>> > are normal C structs instead?  Also, I see reference to path
>> .../gcv4/...  I
>> > guess this will need to change to support gc_gen and gc_cc.
>> >
>> > Once you get native_sources_cleanup.patch in good shape, I have no
>> problem
>> > committing it.
>> >
>> > If there is no other debate on class unloading design, I will call 
>> for a
>> > vote in a seperate email.
>> >
>> >
>> >
>> > On 11/2/06, Aleksey Ignatenko <al...@gmail.com> wrote:
>> > >
>> > > Hi, everyone.
>> > >
>> > > I've splitted Harmony-2000 (see details:
>> > > http://issues.apache.org/jira/browse/HARMONY-2000) patch with
>> automatic
>> > > class unloading implementation into 2 independent parts:
>> > > 1. cleaning native resources (native_sources_cleanup.patch).
>> > > 2. automatic unloading design implementation (auto_unloading.patch).
>> > >
>> > > The first part is independent for all class unloading designs and
>> could
>> > > be
>> > > commited. The second part is class unloading design implementation
>> (the
>> > > best
>> > > class unloading approach is discussed now).
>> > >
>> > > I propose to commit native_sources_cleanup.patch and continue class
>> > > unloading development with minimal requirements on drlvm.
>> > >
>> > > Aleksey.
>> > >
>> > >
>> > > On 11/1/06, Aleksey Ignatenko <al...@gmail.com> wrote:
>> > > >
>> > > > Oops, sorry, misprinted in my suggestion:
>> > > >                 if (cl->IsBootstrap() || env->b_VTable_trace_is_not_supported_by_GC)
>> > > >                 {
>> > > >                     vm_enumerate_jlc(c);
>> > > >                     if (c->vtable)
>> > > >                         vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
>> > > >                 }
>> > > >
>> > > > Aleksey.
>> > > >
>> > > >  On 11/1/06, Aleksey Ignatenko < aleksey.ignatenko@gmail.com> 
>> wrote:
>> > > > >
>> > > > > Weldon,
>> > > > >
>> > > > > >A question for all involved.  Is it possible to somehow make it
>> > > appear
>> > > > > that
>> > > > > >the new objects related to unloading  (VTable, ClassLoader,
>> > > etc)  are
>> > > > > always
>> > > > > >reachable and thus never collected?  I am trying to figure out a
>> > > way to
>> > > > > make
>> > > > > >integration of class unloading independent of correct support
>> > > inside
>> > > > > the GC
>> > > > > >and JIT.  This option could be a command line switch or compile
>> > > time
>> > > > > option.
>> > > > >
>> > > > > I agree with Robin:
>> > > > > >Simple.  Keep a list or table of these objects as part of the
>> root
>> > > set.
>> > > > > >Enumerate it optionally depending on a command line option.
>> > > > >
>> > > > > Details: you can see from Harmony-2000 patch, that this is done
>> for
>> > > > > Bootstrap classes already. If you look at 
>> root_set_enum_common.cpp
>> > > (with the
>> > > > > patch applied) vm_enumerate_static_fields() function, there is
>> line:
>> > > > >                 if (cl->IsBootstrap())
>> > > > >                 {
>> > > > >                     vm_enumerate_jlc(c);
>> > > > >                     if (c->vtable)
>> > > > >
>> > > vm_enumerate_root_reference((void**)&c->vtObj,
>> > > > > FALSE);
>> > > > >                 }
>> > > > >                 else
>> > > > >                 {
>> > > > >                     vm_enumerate_jlc(c, true/*weak*/);
>> > > > >                 }
>> > > > > You can see, that there are strong roots to Bootstrap
>> j.l.Classesand
>> > > > > their VTable objects. So I suppose, that it would be very simple
>> to
>> > > > > propogate strong roots to all other classes (not only Bootstrap),
>> > > something
>> > > > > like:
>> > > > >                 if (cl->IsBootstrap() *&&
>> > > > > env->b_VTable_trace_is_not_supported_by_GC*)
>> > > > >                 {
>> > > > >                     vm_enumerate_jlc(c);
>> > > > >                     if (c->vtable)
>> > > > >
>> > > vm_enumerate_root_reference((void**)&c->vtObj,
>> > > > > FALSE);
>> > > > >                 }
>> > > > > where *b_VTable_trace_is_not_supported_by_GC *is flag which is 
>> set
>> > > > > according to used GC. This will force switching off any class
>> > > unloading
>> > > > > support.
>> > > > >
>> > > > > Aleksey.
>> > > > >
>> > > > >  On 11/1/06, Robin Garner <robin.garner@anu.edu.au > wrote:
>> > > > > >
>> > > > > > Weldon Washburn wrote:
>> > > > > > > On 10/30/06, Robin Garner < robin.garner@anu.edu.au > wrote:
>> > > > > > >>
>> > > > > > >> Weldon Washburn wrote:
>> > > > > > >> > On 10/27/06, Geir Magnusson Jr. < geir@pobox.com> wrote:
>> > > > > > >> >>
>> > > > > > >> >>
>> > > > > > >> >>
>> > > > > > >> >> Weldon Washburn wrote:
>> > > > > > >> >> > Steve Blackburn was in Portland Oregon today.  I
>> mentioned
>> > > the
>> > > > > > idea
>> > > > > > >> of
>> > > > > > >> >> > adding a  reference pointer from object to its
>> > > j.l.Classinstance.
>> > > > > > >> >> MMTk
>> > > > > > >> >> > was
>> > > > > > >> >> > not designed with this idea in mind.  It looks like you
>> > > will
>> > > > > > need to
>> > > > > > >> >> fix
>> > > > > > >> >> > this part of MMTk and maintain it yourself.  Steve did
>> not
>> > > > > > seem
>> > > > > > >> >> thrilled
>> > > > > > >> >> at
>> > > > > > >> >> > adding this support to MMTk code base.
>> > > > > > >>
>> > > > > > >> Actually I think the answer may have been a little garbled
>> > > along
>> > > > > > the way
>> > > > > > >> here: MMTk is not a memory manager *for* Java, it is 
>> simply a
>> > > > > > memory
>> > > > > > >> manager.  We have been careful to eliminate 
>> language-specific
>> > > > > > features
>> > > > > > >> in the heap that it manages.  MMTk has been used to 
>> manage C#
>> > > (in
>> > > > > > the
>> > > > > > >> Rotor VM) and was being incorporated into a Haskell runtime
>> > > until I
>> > > > > > ran
>> > > > > > >> out of time.
>> > > > > > >>
>> > > > > > >> Therefore, MMTk knows nothing about the concept of class
>> > > unloading,
>> > > > > > or
>> > > > > > >> java.lang.Class .
>> > > > > > >>
>> > > > > > >> >> How does MMTk support class unloading then?
>> > > > > > >> >
>> > > > > > >> >
>> > > > > > >> > MMTk has no special support for class unloading.  This may
>> > > have
>> > > > > > >> > something to
>> > > > > > >> > do with the entire system being written in Java thus class
>> > > > > > unloading
>> > > > > > >> come
>> > > > > > >> > along for free.  If there needs to be a modification to
>> > > support
>> > > > > > special
>> > > > > > >> > case
>> > > > > > >> > objects in DRLVM, someone will need to fixup MMTk and
>> provide
>> > > > > > onging
>> > > > > > >> > support of this patch in Harmony.  I have zero idea how 
>> big
>> > > this
>> > > > > > effort
>> > > > > > >> > would be.   It would also be good to hear what the impact
>> > > will be
>> > > > > > on
>> > > > > > >> GCV5.
>> > > > > > >>
>> > > > > > >> MMTk implements several algorithms for retaining the
>> reachable
>> > > > > > objects
>> > > > > > >> in a graph and recycling space used by unreachable ones.  It
>> > > relies
>> > > > > > on
>> > > > > > >> the host VM to provide a set of roots.  It supports several
>> > > > > > different
>> > > > > > >> semantics of 'weak' references, including but not 
>> confined to
>> > > those
>> > > > > > >> required by Java.
>> > > > > > >>
>> > > > > > >> If you can implement class unloading using those (which the
>> > > current
>> > > > > >
>> > > > > > >> proposal does), then MMTk can help.
>> > > > > > >>
>> > > > > > >> If you want to put a pointer to the j.l.Class in the object
>> > > header,
>> > > > > > MMTk
>> > > > > > >> will not care, as it has no way of knowing.  If you put an
>> > > > > > additional
>> > > > > > >> pointer into the body of every object, then MMTk will see it
>> as
>> > > > > > just
>> > > > > > >> another object to scan.
>> > > > > > >>
>> > > > > > >> Remember MMTk is a memory manager, not a Java VM!
>> > > > > > >>
>> > > > > > >>
>> > > > > > >> Conversely, supporting some exotic class unloading mechanism
>> in
>> > >
>> > > > > > MMTk
>> > > > > > >> shouldn't be hard and wouldn't deter me from trying it out.
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > Robin, it would be great if you can get MMTk to support this
>> > > class
>> > > > > > > unloading
>> > > > > > > effort.  My main concern is the ongoing maintenance of MMTk
>> > > class
>> > > > > > unloading
>> > > > > > > support.
>> > > > > >
>> > > > > > I haven't seen any proposal that requires MMTk to be modified,
>> so
>> > > it's
>> > > > > > a
>> > > > > > moot point at the moment.
>> > > > > >
>> > > > > > > A question for all involved.  Is it possible to somehow make
>> it
>> > > > > > appear that
>> > > > > > > the new objects related to unloading  (VTable, ClassLoader,
>> > > > > > etc)  are
>> > > > > > > always
>> > > > > > > reachable and thus never collected?  I am trying to figure 
>> out
>> a
>> > > way
>> > > > > > to
>> > > > > > > make
>> > > > > > > integration of class unloading independent of correct support
>> > > inside
>> > > > > > the GC
>> > > > > > > and JIT.  This option could be a command line switch or
>> compile
>> > > time
>> > > > > >
>> > > > > > > option.
>> > > > > >
>> > > > > > Simple.  Keep a list or table of these objects as part of the
>> root
>> > >
>> > > > > > set.
>> > > > > > Enumerate it optionally depending on a command line option.
>> > > > > >
>> > > > > > cheers,
>> > > > > > Robin
>> > > > > >
>> > > > >
>> > > > >
>> > > >
>> > >
>> > >
>> >
>> >
>> > --
>> > Weldon Washburn
>> > Intel Enterprise Solutions Software Division
>> >
>>
>>
>>
>> -- 
>> Weldon Washburn
>> Intel Enterprise Solutions Software Division
>>
>>
>

Re: [drlvm] Class unloading support

Posted by Aleksey Ignatenko <al...@gmail.com>.

Weldon, I have attached an updated patch to H-2000:
cleanup_sources_1558_merged.patch.
Please see the comments.

Aleksey.


On 11/10/06, Weldon Washburn <we...@gmail.com> wrote:
>
> Aleksey,
> I tried to apply native_sources_cleanup_upd.patch.  svn HEAD has changed
> and
> the patch no longer works.  Part of the problem is that JIRA 1558 has been
> committed.  In addition to the below issues, I posted comments to
> JIRA HARMONY-2000.
>
>
> On 11/2/06, Weldon Washburn <we...@gmail.com> wrote:
> >
> > Aleksey,
> >
> > Excellent step forward -- breaking the patch into two pieces.   This
> made
> > the patch(es) much more readable.
> >
> > I glanced at native_sources_cleanup.patch.  It looks like code for
> > alloc/dealloc vtables and jitted code blocks.  The original patch made
> > vtables into objects.  Will native_sources_cleanup need to change if
> vtables
> > are normal C structs instead?  Also, I see reference to path
> .../gcv4/...  I
> > guess this will need to change to support gc_gen and gc_cc.
> >
> > Once you get native_sources_cleanup.patch in good shape, I have no
> problem
> > committing it.
> >
> > If there is no other debate on class unloading design, I will call for a
> > vote in a seperate email.
> >
> >
> >
> > On 11/2/06, Aleksey Ignatenko <al...@gmail.com> wrote:
> > >
> > > Hi, everyone.
> > >
> > > I've splitted Harmony-2000 (see details:
> > > http://issues.apache.org/jira/browse/HARMONY-2000) patch with
> automatic
> > > class unloading implementation into 2 independent parts:
> > > 1. cleaning native resources (native_sources_cleanup.patch).
> > > 2. automatic unloading design implementation (auto_unloading.patch).
> > >
> > > The first part is independent for all class unloading designs and
> could
> > > be
> > > commited. The second part is class unloading design implementation
> (the
> > > best
> > > class unloading approach is discussed now).
> > >
> > > I propose to commit native_sources_cleanup.patch and continue class
> > > unloading development with minimal requirements on drlvm.
> > >
> > > Aleksey.
> > >
> > >
> > > On 11/1/06, Aleksey Ignatenko <al...@gmail.com> wrote:
> > > >
> > > > Oops, sorry, misprinted in my suggestion:
> > > >                 if (cl->IsBootstrap() *||
> > > *env->b_VTable_trace_is_not_supported_by_GC)
> > > >
> > > >                 {
> > > >                     vm_enumerate_jlc(c);
> > > >                     if (c->vtable)
> > > >
> vm_enumerate_root_reference((void**)&c->vtObj,
> > > > FALSE);
> > > >                 }
> > > >
> > > > Aleksey.
> > > >
> > > >  On 11/1/06, Aleksey Ignatenko < aleksey.ignatenko@gmail.com> wrote:
> > > > >
> > > > > Weldon,
> > > > >
> > > > > >A question for all involved.  Is it possible to somehow make it
> > > appear
> > > > > that
> > > > > >the new objects related to unloading  (VTable, ClassLoader,
> > > etc)  are
> > > > > always
> > > > > >reachable and thus never collected?  I am trying to figure out a
> > > way to
> > > > > make
> > > > > >integration of class unloading independent of correct support
> > > inside
> > > > > the GC
> > > > > >and JIT.  This option could be a command line switch or compile
> > > time
> > > > > option.
> > > > >
> > > > > I agree with Robin:
> > > > > >Simple.  Keep a list or table of these objects as part of the
> root
> > > set.
> > > > > >Enumerate it optionally depending on a command line option.
> > > > >
> > > > > Details: you can see from Harmony-2000 patch, that this is done
> for
> > > > > Bootstrap classes already. If you look at root_set_enum_common.cpp
> > > (with the
> > > > > patch applied) vm_enumerate_static_fields() function, there is
> line:
> > > > >                 if (cl->IsBootstrap())
> > > > >                 {
> > > > >                     vm_enumerate_jlc(c);
> > > > >                     if (c->vtable)
> > > > >
> > > vm_enumerate_root_reference((void**)&c->vtObj,
> > > > > FALSE);
> > > > >                 }
> > > > >                 else
> > > > >                 {
> > > > >                     vm_enumerate_jlc(c, true/*weak*/);
> > > > >                 }
> > > > > You can see, that there are strong roots to Bootstrap
> j.l.Classesand
> > > > > their VTable objects. So I suppose, that it would be very simple
> to
> > > > > propogate strong roots to all other classes (not only Bootstrap),
> > > something
> > > > > like:
> > > > >                 if (cl->IsBootstrap() *&&
> > > > > env->b_VTable_trace_is_not_supported_by_GC*)
> > > > >                 {
> > > > >                     vm_enumerate_jlc(c);
> > > > >                     if (c->vtable)
> > > > >
> > > vm_enumerate_root_reference((void**)&c->vtObj,
> > > > > FALSE);
> > > > >                 }
> > > > > where *b_VTable_trace_is_not_supported_by_GC *is flag which is set
> > > > > according to used GC. This will force switching off any class
> > > unloading
> > > > > support.
> > > > >
> > > > > Aleksey.
> > > > >
> > > > >  On 11/1/06, Robin Garner <robin.garner@anu.edu.au > wrote:
> > > > > >
> > > > > > Weldon Washburn wrote:
> > > > > > > On 10/30/06, Robin Garner < robin.garner@anu.edu.au > wrote:
> > > > > > >>
> > > > > > >> Weldon Washburn wrote:
> > > > > > >> > On 10/27/06, Geir Magnusson Jr. < geir@pobox.com> wrote:
> > > > > > >> >>
> > > > > > >> >>
> > > > > > >> >>
> > > > > > >> >> Weldon Washburn wrote:
> > > > > > >> >> > Steve Blackburn was in Portland Oregon today.  I
> mentioned
> > > the
> > > > > > idea
> > > > > > >> of
> > > > > > >> >> > adding a  reference pointer from object to its
> > > j.l.Classinstance.
> > > > > > >> >> MMTk
> > > > > > >> >> > was
> > > > > > >> >> > not designed with this idea in mind.  It looks like you
> > > will
> > > > > > need to
> > > > > > >> >> fix
> > > > > > >> >> > this part of MMTk and maintain it yourself.  Steve did
> not
> > > > > > seem
> > > > > > >> >> thrilled
> > > > > > >> >> at
> > > > > > >> >> > adding this support to MMTk code base.
> > > > > > >>
> > > > > > >> Actually I think the answer may have been a little garbled
> > > along
> > > > > > the way
> > > > > > >> here: MMTk is not a memory manager *for* Java, it is simply a
> > > > > > memory
> > > > > > >> manager.  We have been careful to eliminate language-specific
> > > > > > features
> > > > > > >> in the heap that it manages.  MMTk has been used to manage C#
> > > (in
> > > > > > the
> > > > > > >> Rotor VM) and was being incorporated into a Haskell runtime
> > > until I
> > > > > > ran
> > > > > > >> out of time.
> > > > > > >>
> > > > > > >> Therefore, MMTk knows nothing about the concept of class
> > > unloading,
> > > > > > or
> > > > > > >> java.lang.Class .
> > > > > > >>
> > > > > > >> >> How does MMTk support class unloading then?
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > MMTk has no special support for class unloading.  This may
> > > have
> > > > > > >> > something to
> > > > > > >> > do with the entire system being written in Java thus class
> > > > > > unloading
> > > > > > >> come
> > > > > > >> > along for free.  If there needs to be a modification to
> > > support
> > > > > > special
> > > > > > >> > case
> > > > > > >> > objects in DRLVM, someone will need to fixup MMTk and
> provide
> > > > > > onging
> > > > > > >> > support of this patch in Harmony.  I have zero idea how big
> > > this
> > > > > > effort
> > > > > > >> > would be.   It would also be good to hear what the impact
> > > will be
> > > > > > on
> > > > > > >> GCV5.
> > > > > > >>
> > > > > > >> MMTk implements several algorithms for retaining the
> reachable
> > > > > > objects
> > > > > > >> in a graph and recycling space used by unreachable ones.  It
> > > relies
> > > > > > on
> > > > > > >> the host VM to provide a set of roots.  It supports several
> > > > > > different
> > > > > > >> semantics of 'weak' references, including but not confined to
> > > those
> > > > > > >> required by Java.
> > > > > > >>
> > > > > > >> If you can implement class unloading using those (which the
> > > current
> > > > > >
> > > > > > >> proposal does), then MMTk can help.
> > > > > > >>
> > > > > > >> If you want to put a pointer to the j.l.Class in the object
> > > header,
> > > > > > MMTk
> > > > > > >> will not care, as it has no way of knowing.  If you put an
> > > > > > additional
> > > > > > >> pointer into the body of every object, then MMTk will see it
> as
> > > > > > just
> > > > > > >> another object to scan.
> > > > > > >>
> > > > > > >> Remember MMTk is a memory manager, not a Java VM!
> > > > > > >>
> > > > > > >>
> > > > > > >> Conversely, supporting some exotic class unloading mechanism
> in
> > >
> > > > > > MMTk
> > > > > > >> shouldn't be hard and wouldn't deter me from trying it out.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Robin, it would be great if you can get MMTk to support this
> > > class
> > > > > > > unloading
> > > > > > > effort.  My main concern is the ongoing maintenance of MMTk
> > > class
> > > > > > unloading
> > > > > > > support.
> > > > > >
> > > > > > I haven't seen any proposal that requires MMTk to be modified,
> so
> > > it's
> > > > > > a
> > > > > > moot point at the moment.
> > > > > >
> > > > > > > A question for all involved.  Is it possible to somehow make
> it
> > > > > > appear that
> > > > > > > the new objects related to unloading  (VTable, ClassLoader,
> > > > > > etc)  are
> > > > > > > always
> > > > > > > reachable and thus never collected?  I am trying to figure out
> a
> > > way
> > > > > > to
> > > > > > > make
> > > > > > > integration of class unloading independent of correct support
> > > inside
> > > > > > the GC
> > > > > > > and JIT.  This option could be a command line switch or
> compile
> > > time
> > > > > >
> > > > > > > option.
> > > > > >
> > > > > > Simple.  Keep a list or table of these objects as part of the
> root
> > >
> > > > > > set.
> > > > > > Enumerate it optionally depending on a command line option.
> > > > > >
> > > > > > cheers,
> > > > > > Robin
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> > >
> >
> >
> > --
> > Weldon Washburn
> > Intel Enterprise Solutions Software Division
> >
>
>
>
> --
> Weldon Washburn
> Intel Enterprise Solutions Software Division
>
>

Re: [drlvm] Class unloading support

Posted by Weldon Washburn <we...@gmail.com>.

Aleksey,
I tried to apply native_sources_cleanup_upd.patch.  svn HEAD has changed and
the patch no longer works.  Part of the problem is that JIRA 1558 has been
committed.  In addition to the issues below, I posted comments to
JIRA HARMONY-2000.


On 11/2/06, Weldon Washburn <we...@gmail.com> wrote:
>
> Aleksey,
>
> Excellent step forward -- breaking the patch into two pieces.   This made
> the patch(es) much more readable.
>
> I glanced at native_sources_cleanup.patch.  It looks like code for
> alloc/dealloc vtables and jitted code blocks.  The original patch made
> vtables into objects.  Will native_sources_cleanup need to change if vtables
> are normal C structs instead?  Also, I see reference to path .../gcv4/...  I
> guess this will need to change to support gc_gen and gc_cc.
>
> Once you get native_sources_cleanup.patch in good shape, I have no problem
> committing it.
>
> If there is no other debate on class unloading design, I will call for a
> vote in a separate email.
>
>
>
> On 11/2/06, Aleksey Ignatenko <al...@gmail.com> wrote:
> >
> > Hi, everyone.
> >
> > I've split the Harmony-2000 patch (see details:
> > http://issues.apache.org/jira/browse/HARMONY-2000) with the automatic
> > class unloading implementation into 2 independent parts:
> > 1. cleaning native resources (native_sources_cleanup.patch).
> > 2. automatic unloading design implementation (auto_unloading.patch).
> >
> > The first part is independent of all class unloading designs and could be
> > committed. The second part is the class unloading design implementation (the
> > best class unloading approach is being discussed now).
> >
> > I propose to commit native_sources_cleanup.patch and continue class
> > unloading development with minimal requirements on drlvm.
> >
> > Aleksey.
> >
> >
> > On 11/1/06, Aleksey Ignatenko <al...@gmail.com> wrote:
> > >
> > > Oops, sorry, misprinted in my suggestion:
> > >                 if (cl->IsBootstrap() || env->b_VTable_trace_is_not_supported_by_GC)
> > >                 {
> > >                     vm_enumerate_jlc(c);
> > >                     if (c->vtable)
> > >                         vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
> > >                 }
> > >
> > > Aleksey.
> > >
> > >  On 11/1/06, Aleksey Ignatenko < aleksey.ignatenko@gmail.com> wrote:
> > > >
> > > > Weldon,
> > > >
> > > > >A question for all involved.  Is it possible to somehow make it
> > appear
> > > > that
> > > > >the new objects related to unloading  (VTable, ClassLoader,
> > etc)  are
> > > > always
> > > > >reachable and thus never collected?  I am trying to figure out a
> > way to
> > > > make
> > > > >integration of class unloading independent of correct support
> > inside
> > > > the GC
> > > > >and JIT.  This option could be a command line switch or compile
> > time
> > > > option.
> > > >
> > > > I agree with Robin:
> > > > >Simple.  Keep a list or table of these objects as part of the root
> > set.
> > > > >Enumerate it optionally depending on a command line option.
> > > >
> > > > Details: you can see from Harmony-2000 patch, that this is done for
> > > > Bootstrap classes already. If you look at root_set_enum_common.cpp (with the
> > > > patch applied) vm_enumerate_static_fields() function, there is line:
> > > >                 if (cl->IsBootstrap())
> > > >                 {
> > > >                     vm_enumerate_jlc(c);
> > > >                     if (c->vtable)
> > > >                         vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
> > > >                 }
> > > >                 else
> > > >                 {
> > > >                     vm_enumerate_jlc(c, true/*weak*/);
> > > >                 }
> > > > You can see, that there are strong roots to Bootstrap j.l.Classes and
> > > > their VTable objects. So I suppose, that it would be very simple to
> > > > propagate strong roots to all other classes (not only Bootstrap), something
> > > > like:
> > > >                 if (cl->IsBootstrap() && env->b_VTable_trace_is_not_supported_by_GC)
> > > >                 {
> > > >                     vm_enumerate_jlc(c);
> > > >                     if (c->vtable)
> > > >                         vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
> > > >                 }
> > > > where b_VTable_trace_is_not_supported_by_GC is flag which is set
> > > > according to used GC. This will force switching off any class unloading
> > > > support.
> > > >
> > > > Aleksey.
> > > >
> > > >  On 11/1/06, Robin Garner <robin.garner@anu.edu.au > wrote:
> > > > >
> > > > > Weldon Washburn wrote:
> > > > > > On 10/30/06, Robin Garner < robin.garner@anu.edu.au > wrote:
> > > > > >>
> > > > > >> Weldon Washburn wrote:
> > > > > >> > On 10/27/06, Geir Magnusson Jr. < geir@pobox.com> wrote:
> > > > > >> >>
> > > > > >> >>
> > > > > >> >>
> > > > > >> >> Weldon Washburn wrote:
> > > > > >> >> > Steve Blackburn was in Portland Oregon today.  I mentioned
> > the
> > > > > idea
> > > > > >> of
> > > > > >> >> > adding a  reference pointer from object to its
> > j.l.Classinstance.
> > > > > >> >> MMTk
> > > > > >> >> > was
> > > > > >> >> > not designed with this idea in mind.  It looks like you
> > will
> > > > > need to
> > > > > >> >> fix
> > > > > >> >> > this part of MMTk and maintain it yourself.  Steve did not
> > > > > seem
> > > > > >> >> thrilled
> > > > > >> >> at
> > > > > >> >> > adding this support to MMTk code base.
> > > > > >>
> > > > > >> Actually I think the answer may have been a little garbled
> > along
> > > > > the way
> > > > > >> here: MMTk is not a memory manager *for* Java, it is simply a
> > > > > memory
> > > > > >> manager.  We have been careful to eliminate language-specific
> > > > > features
> > > > > >> in the heap that it manages.  MMTk has been used to manage C#
> > (in
> > > > > the
> > > > > >> Rotor VM) and was being incorporated into a Haskell runtime
> > until I
> > > > > ran
> > > > > >> out of time.
> > > > > >>
> > > > > >> Therefore, MMTk knows nothing about the concept of class
> > unloading,
> > > > > or
> > > > > >> java.lang.Class .
> > > > > >>
> > > > > >> >> How does MMTk support class unloading then?
> > > > > >> >
> > > > > >> >
> > > > > >> > MMTk has no special support for class unloading.  This may
> > have
> > > > > >> > something to
> > > > > >> > do with the entire system being written in Java thus class
> > > > > unloading
> > > > > >> come
> > > > > >> > along for free.  If there needs to be a modification to
> > support
> > > > > special
> > > > > >> > case
> > > > > >> > objects in DRLVM, someone will need to fixup MMTk and provide
> > > > > onging
> > > > > >> > support of this patch in Harmony.  I have zero idea how big
> > this
> > > > > effort
> > > > > >> > would be.   It would also be good to hear what the impact
> > will be
> > > > > on
> > > > > >> GCV5.
> > > > > >>
> > > > > >> MMTk implements several algorithms for retaining the reachable
> > > > > objects
> > > > > >> in a graph and recycling space used by unreachable ones.  It
> > relies
> > > > > on
> > > > > >> the host VM to provide a set of roots.  It supports several
> > > > > different
> > > > > >> semantics of 'weak' references, including but not confined to
> > those
> > > > > >> required by Java.
> > > > > >>
> > > > > >> If you can implement class unloading using those (which the
> > current
> > > > >
> > > > > >> proposal does), then MMTk can help.
> > > > > >>
> > > > > >> If you want to put a pointer to the j.l.Class in the object
> > header,
> > > > > MMTk
> > > > > >> will not care, as it has no way of knowing.  If you put an
> > > > > additional
> > > > > >> pointer into the body of every object, then MMTk will see it as
> > > > > just
> > > > > >> another object to scan.
> > > > > >>
> > > > > >> Remember MMTk is a memory manager, not a Java VM!
> > > > > >>
> > > > > >>
> > > > > >> Conversely, supporting some exotic class unloading mechanism in
> >
> > > > > MMTk
> > > > > >> shouldn't be hard and wouldn't deter me from trying it out.
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Robin, it would be great if you can get MMTk to support this class unloading
> > > > > > > effort.  My main concern is the ongoing maintenance of MMTk class unloading
> > > > > > > support.
> > > > > >
> > > > > > I haven't seen any proposal that requires MMTk to be modified, so it's a
> > > > > > moot point at the moment.
> > > > >
> > > > > > > A question for all involved.  Is it possible to somehow make it appear that
> > > > > > > the new objects related to unloading  (VTable, ClassLoader, etc)  are always
> > > > > > > reachable and thus never collected?  I am trying to figure out a way to make
> > > > > > > integration of class unloading independent of correct support inside the GC
> > > > > > > and JIT.  This option could be a command line switch or compile time
> > > > > > > option.
> > > > > >
> > > > > > Simple.  Keep a list or table of these objects as part of the root set.
> > > > > > Enumerate it optionally depending on a command line option.
> > > > >
> > > > > cheers,
> > > > > Robin
> > > > >
> > > >
> > > >
> > >
> >
> >
>
>
> --
> Weldon Washburn
> Intel Enterprise Solutions Software Division
>



-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Aleksey Ignatenko <al...@gmail.com>.

Weldon,
>I glanced at native_sources_cleanup.patch.  It looks like code for
>alloc/dealloc vtables and jitted code blocks.  The original patch made
>vtables into objects.  Will native_sources_cleanup need to change if
>vtables are normal C structs instead?  Also, I see reference to path
>.../gcv4/...  I guess this will need to change to support gc_gen and gc_cc.
Vtables are not affected by the native resource cleanup patch (there is no
change from C struct to object).
GCV4: the patch contains some code cleanup and native resource cleaning in
gcv4. The same will be done for gc_gen and gc_cc by the GC people under a
separate JIRA, because it could affect performance.
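
To make the idea of cleaning native resources concrete, here is a minimal
sketch of how vtables and jitted code blocks could be tied to the class
loader that owns them, so that unloading the loader releases them together.
This is an illustration only and is not taken from
native_sources_cleanup.patch: NativeResourcePool, LoaderModel and
unload_native_resources are hypothetical names, and plain heap blocks stand
in for what would be executable memory in a real VM.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical model: every native allocation made on behalf of a class
// loader is recorded in that loader's pool, so it can be released in one step.
class NativeResourcePool {
public:
    // Allocates a zero-filled block owned by this pool (for example a vtable
    // or a jitted code block; a real VM would use executable memory for code).
    unsigned char* allocate(std::size_t bytes) {
        assert(bytes > 0);
        blocks_.push_back(std::make_unique<unsigned char[]>(bytes));
        bytes_in_use_ += bytes;
        return blocks_.back().get();
    }

    // Frees every block; meant to run once the loader is known to be unreachable.
    void release_all() {
        blocks_.clear();
        bytes_in_use_ = 0;
    }

    std::size_t bytes_in_use() const { return bytes_in_use_; }
    std::size_t block_count() const { return blocks_.size(); }

private:
    std::vector<std::unique_ptr<unsigned char[]>> blocks_;
    std::size_t bytes_in_use_ = 0;
};

// Hypothetical per-loader bookkeeping: one pool per kind of native resource.
struct LoaderModel {
    NativeResourcePool vtables;
    NativeResourcePool jitted_code;
};

// Releases all native resources of an unloaded loader and returns the bytes freed.
std::size_t unload_native_resources(LoaderModel& loader) {
    std::size_t freed =
        loader.vtables.bytes_in_use() + loader.jitted_code.bytes_in_use();
    loader.vtables.release_all();
    loader.jitted_code.release_all();
    return freed;
}
```

Keeping the pools per loader means the cleanup step does not depend on which
class unloading design finally detects that the loader is dead.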

I have updated the patches to their final versions:
native_sources_cleanup_upd.patch and auto_unloading_upd.patch. So I suppose
native_sources_cleanup_upd.patch is ready for commit.
Aleksey.




Re: [drlvm] Class unloading support

Posted by Weldon Washburn <we...@gmail.com>.

Aleksey,

Excellent step forward -- breaking the patch into two pieces.   This made
the patch(es) much more readable.

I glanced at native_sources_cleanup.patch.  It looks like code for
alloc/dealloc vtables and jitted code blocks.  The original patch made
vtables into objects.  Will native_sources_cleanup need to change if vtables
are normal C structs instead?  Also, I see reference to path .../gcv4/...  I
guess this will need to change to support gc_gen and gc_cc.

Once you get native_sources_cleanup.patch in good shape, I have no problem
committing it.

If there is no other debate on class unloading design, I will call for a
vote in a separate email.





-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Aleksey Ignatenko <al...@gmail.com>.

Rana,
>Aleksey, how would one test this?
First of all, the acceptance tests should pass. This is required because the
jitted code allocation mechanism was changed. Second, I will try to provide
one simple test on class unloading today. We can use the class unloading
implementation in Harmony-2000 to pass it.

>The second part is class unloading design implementation (the best
>class unloading approach is discussed now).
>I did not understand, sorry:-) Which best approach?
It was already evening yesterday, so I probably did not express my
thoughts correctly :) I meant that there are at least 3 proposals on the
Harmony dev list for class unloading, and "the best approach" is the one
accepted by Harmony as best according to criteria such as performance.

Aleksey.


Re: [drlvm] Class unloading support

Posted by Rana Dasgupta <rd...@gmail.com>.

On 11/2/06, Aleksey Ignatenko <al...@gmail.com> wrote:
>
> >Hi, everyone.
>
> >I've split the Harmony-2000 patch (see details:
> >http://issues.apache.org/jira/browse/HARMONY-2000) with automatic
> >class unloading implementation into 2 independent parts:
> >1. cleaning native resources (native_sources_cleanup.patch).
> >2. automatic unloading design implementation (auto_unloading.patch).
>
> >The first part is independent of all class unloading designs and could be
> >committed.


Aleksey, how would one test this?

>The second part is class unloading design implementation (the best
>class unloading approach is discussed now).


I did not understand, sorry:-) Which best approach?

Re: [drlvm] Class unloading support

Posted by Aleksey Ignatenko <al...@gmail.com>.

Hi, everyone.

I've split the Harmony-2000 patch (see details:
http://issues.apache.org/jira/browse/HARMONY-2000) with the automatic
class unloading implementation into 2 independent parts:
1. cleaning native resources (native_sources_cleanup.patch).
2. automatic unloading design implementation (auto_unloading.patch).

The first part is independent of all class unloading designs and could be
committed. The second part is the class unloading design implementation (the
best class unloading approach is being discussed now).

I propose to commit native_sources_cleanup.patch and continue class
unloading development with minimal requirements on drlvm.

Aleksey.



Re: [drlvm] Class unloading support

Posted by Aleksey Ignatenko <al...@gmail.com>.

Oops, sorry, there was a misprint in my suggestion; the condition should use ||:
                if (cl->IsBootstrap() || env->b_VTable_trace_is_not_supported_by_GC)
                {
                    vm_enumerate_jlc(c);
                    if (c->vtable)
                        vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
                }
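
To illustrate the effect of the corrected condition, here is a small
self-contained sketch. It only models the decision and does not call the real
vm_enumerate_jlc() or vm_enumerate_root_reference(); ClassLoaderModel,
ClassModel, RootKind, enumerate_class_roots and unloading_possible are
hypothetical names introduced for the example, and the flag parameter stands
in for b_VTable_trace_is_not_supported_by_GC.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the VM structures named in the thread.
struct ClassLoaderModel {
    bool is_bootstrap;
};

struct ClassModel {
    std::string name;
    const ClassLoaderModel* loader;
    bool has_vtable;
};

enum class RootKind { StrongClass, StrongVTable, WeakClass };

struct RootEntry {
    std::string class_name;
    RootKind kind;
};

// Models the corrected condition: bootstrap classes are always strong roots,
// and every other class becomes a strong root too when the GC cannot trace
// VTable objects. Otherwise the class is reported as a weak root only.
std::vector<RootEntry> enumerate_class_roots(const std::vector<ClassModel>& classes,
                                             bool vtable_trace_not_supported_by_gc) {
    std::vector<RootEntry> roots;
    for (const ClassModel& c : classes) {
        assert(c.loader != nullptr);
        if (c.loader->is_bootstrap || vtable_trace_not_supported_by_gc) {
            roots.push_back({c.name, RootKind::StrongClass});
            if (c.has_vtable) {
                roots.push_back({c.name, RootKind::StrongVTable});
            }
        } else {
            roots.push_back({c.name, RootKind::WeakClass});
        }
    }
    return roots;
}

// True when at least one class is only weakly rooted, so it could be unloaded.
bool unloading_possible(const std::vector<RootEntry>& roots) {
    for (const RootEntry& r : roots) {
        if (r.kind == RootKind::WeakClass) {
            return true;
        }
    }
    return false;
}
```

With the flag set, every class ends up strongly rooted and nothing can be
unloaded, which is the "switch off" behaviour asked for. With && instead of ||
the flag would have no such effect on non-bootstrap classes.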

Aleksey.

On 11/1/06, Aleksey Ignatenko <al...@gmail.com> wrote:
>
> Weldon,
>
> >A question for all involved.  Is it possible to somehow make it appear that
> >the new objects related to unloading  (VTable, ClassLoader, etc)  are always
> >reachable and thus never collected?  I am trying to figure out a way to make
> >integration of class unloading independent of correct support inside the GC
> >and JIT.  This option could be a command line switch or compile time option.
>
> I agree with Robin:
> >Simple.  Keep a list or table of these objects as part of the root set.
> >Enumerate it optionally depending on a command line option.
>
> Details: you can see from the Harmony-2000 patch that this is already done
> for Bootstrap classes. If you look at the vm_enumerate_static_fields()
> function in root_set_enum_common.cpp (with the patch applied), there are
> these lines:
>                 if (cl->IsBootstrap())
>                 {
>                     vm_enumerate_jlc(c);
>                     if (c->vtable)
>                         vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
>                 }
>                 else
>                 {
>                     vm_enumerate_jlc(c, true/*weak*/);
>                 }
> You can see that there are strong roots to Bootstrap j.l.Classes and
> their VTable objects. So I suppose that it would be very simple to
> propagate strong roots to all other classes (not only Bootstrap), something
> like:
>                 if (cl->IsBootstrap() && env->b_VTable_trace_is_not_supported_by_GC)
>                 {
>                     vm_enumerate_jlc(c);
>                     if (c->vtable)
>                         vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
>                 }
> where b_VTable_trace_is_not_supported_by_GC is a flag which is set
> according to the GC in use. Setting it will switch off any class unloading
> support.
>
> Aleksey.
>
>  On 11/1/06, Robin Garner <ro...@anu.edu.au> wrote:
> >
> > Weldon Washburn wrote:
> > > On 10/30/06, Robin Garner <robin.garner@anu.edu.au > wrote:
> > >>
> > >> Weldon Washburn wrote:
> > >> > On 10/27/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
> > >> >>
> > >> >>
> > >> >>
> > >> >> Weldon Washburn wrote:
> > >> >> > Steve Blackburn was in Portland Oregon today.  I mentioned the
> > idea
> > >> of
> > >> >> > adding a  reference pointer from object to its j.l.Class instance.
> > >> >> MMTk
> > >> >> > was
> > >> >> > not designed with this idea in mind.  It looks like you will
> > need to
> > >> >> fix
> > >> >> > this part of MMTk and maintain it yourself.  Steve did not seem
> > >> >> thrilled
> > >> >> at
> > >> >> > adding this support to MMTk code base.
> > >>
> > >> Actually I think the answer may have been a little garbled along the
> > way
> > >> here: MMTk is not a memory manager *for* Java, it is simply a memory
> > >> manager.  We have been careful to eliminate language-specific
> > features
> > >> in the heap that it manages.  MMTk has been used to manage C# (in the
> > >> Rotor VM) and was being incorporated into a Haskell runtime until I
> > ran
> > >> out of time.
> > >>
> > >> Therefore, MMTk knows nothing about the concept of class unloading,
> > or
> > >> java.lang.Class.
> > >>
> > >> >> How does MMTk support class unloading then?
> > >> >
> > >> >
> > >> > MMTk has no special support for class unloading.  This may have
> > >> > something to
> > >> > do with the entire system being written in Java thus class
> > unloading
> > >> come
> > >> > along for free.  If there needs to be a modification to support
> > special
> > >> > case
> > >> > objects in DRLVM, someone will need to fixup MMTk and provide
> > ongoing
> > >> > support of this patch in Harmony.  I have zero idea how big this
> > effort
> > >> > would be.   It would also be good to hear what the impact will be
> > on
> > >> GCV5.
> > >>
> > >> MMTk implements several algorithms for retaining the reachable
> > objects
> > >> in a graph and recycling space used by unreachable ones.  It relies
> > on
> > >> the host VM to provide a set of roots.  It supports several different
> > >> semantics of 'weak' references, including but not confined to those
> > >> required by Java.
> > >>
> > >> If you can implement class unloading using those (which the current
> > >> proposal does), then MMTk can help.
> > >>
> > >> If you want to put a pointer to the j.l.Class in the object header,
> > MMTk
> > >> will not care, as it has no way of knowing.  If you put an additional
> >
> > >> pointer into the body of every object, then MMTk will see it as just
> > >> another object to scan.
> > >>
> > >> Remember MMTk is a memory manager, not a Java VM!
> > >>
> > >>
> > >> Conversely, supporting some exotic class unloading mechanism in MMTk
> > >> shouldn't be hard and wouldn't deter me from trying it out.
> > >
> > >
> > >
> > > Robin, it would be great if you can get MMTk to support this class
> > > unloading
> > > effort.  My main concern is the ongoing maintenance of MMTk class
> > unloading
> > > support.
> >
> > I haven't seen any proposal that requires MMTk to be modified, so it's a
> > moot point at the moment.
> >
> > > A question for all involved.  Is it possible to somehow make it appear
> > that
> > > the new objects related to unloading  (VTable, ClassLoader, etc)  are
> > > always
> > > reachable and thus never collected?  I am trying to figure out a way
> > to
> > > make
> > > integration of class unloading independent of correct support inside
> > the GC
> > > and JIT.  This option could be a command line switch or compile time
> > > option.
> >
> > Simple.  Keep a list or table of these objects as part of the root set.
> > Enumerate it optionally depending on a command line option.
> >
> > cheers,
> > Robin
> >
>
>

Re: [drlvm] Class unloading support

Posted by Aleksey Ignatenko <al...@gmail.com>.

Weldon,

>A question for all involved.  Is it possible to somehow make it appear that
>the new objects related to unloading  (VTable, ClassLoader, etc)  are
always
>reachable and thus never collected?  I am trying to figure out a way to
make
>integration of class unloading independent of correct support inside the GC
>and JIT.  This option could be a command line switch or compile time
option.

I agree with Robin:
>Simple.  Keep a list or table of these objects as part of the root set.
>Enumerate it optionally depending on a command line option.

Details: as you can see from the Harmony-2000 patch, this is already done for
Bootstrap classes. If you look at the vm_enumerate_static_fields() function in
root_set_enum_common.cpp (with the patch applied), there is this code:
                if (cl->IsBootstrap())
                {
                    // Bootstrap classes are never unloaded: strong roots.
                    vm_enumerate_jlc(c);
                    if (c->vtable)
                        vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
                }
                else
                {
                    // Other classes get a weak root so they can be collected.
                    vm_enumerate_jlc(c, true/*weak*/);
                }
You can see that there are strong roots to Bootstrap j.l.Classes and their
VTable objects. So I suppose that it would be very simple to propagate
strong roots to all other classes (not only Bootstrap), something like:
                if (cl->IsBootstrap() ||
                    env->b_VTable_trace_is_not_supported_by_GC)
                {
                    vm_enumerate_jlc(c);
                    if (c->vtable)
                        vm_enumerate_root_reference((void**)&c->vtObj, FALSE);
                }
where b_VTable_trace_is_not_supported_by_GC is a flag that is set according
to the GC in use. Setting it switches off class unloading support entirely.
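
To make the effect of such a switch easier to see, here is a small
self-contained sketch. It is not DRLVM code: ClassInfo, Root, RootKind and
enumerate_class_roots are hypothetical names invented for illustration, and
the flag is combined with a logical OR so that, when set, it extends strong
roots to non-Bootstrap classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for VM structures; not the DRLVM definitions.
struct ClassInfo {
    std::string name;
    bool loaded_by_bootstrap;
    bool has_vtable;
};

enum class RootKind { Strong, Weak };

struct Root {
    std::string what;
    RootKind kind;
};

// Builds the list of class-related roots handed to the GC. When
// unloading_disabled is true (for example, the GC cannot trace VTable
// objects), every class is treated like a Bootstrap class and gets strong
// roots, so nothing can ever be unloaded.
std::vector<Root> enumerate_class_roots(const std::vector<ClassInfo>& classes,
                                        bool unloading_disabled) {
    std::vector<Root> roots;
    for (const ClassInfo& c : classes) {
        if (c.loaded_by_bootstrap || unloading_disabled) {
            roots.push_back({c.name + ".class", RootKind::Strong});
            if (c.has_vtable)
                roots.push_back({c.name + ".vtable", RootKind::Strong});
        } else {
            // Weak root: the class object may be collected.
            roots.push_back({c.name + ".class", RootKind::Weak});
        }
    }
    return roots;
}
```

With the flag off, only Bootstrap classes are pinned; with it on, the GC sees
every j.l.Class and VTable as always reachable.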

Aleksey.

On 11/1/06, Robin Garner <ro...@anu.edu.au> wrote:
>
> Weldon Washburn wrote:
> > On 10/30/06, Robin Garner <ro...@anu.edu.au> wrote:
> >>
> >> Weldon Washburn wrote:
> >> > On 10/27/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
> >> >>
> >> >>
> >> >>
> >> >> Weldon Washburn wrote:
> >> >> > Steve Blackburn was in Portland Oregon today.  I mentioned the
> idea
> >> of
> >> >> > adding a  reference pointer from object to its j.l.Class instance.
> >> >> MMTk
> >> >> > was
> >> >> > not designed with this idea in mind.  It looks like you will need
> to
> >> >> fix
> >> >> > this part of MMTk and maintain it yourself.  Steve did not seem
> >> >> thrilled
> >> >> at
> >> >> > adding this support to MMTk code base.
> >>
> >> Actually I think the answer may have been a little garbled along the
> way
> >> here: MMTk is not a memory manager *for* Java, it is simply a memory
> >> manager.  We have been careful to eliminate language-specific features
> >> in the heap that it manages.  MMTk has been used to manage C# (in the
> >> Rotor VM) and was being incorporated into a Haskell runtime until I ran
> >> out of time.
> >>
> >> Therefore, MMTk knows nothing about the concept of class unloading, or
> >> java.lang.Class.
> >>
> >> >> How does MMTk support class unloading then?
> >> >
> >> >
> >> > MMTk has no special support for class unloading.  This may have
> >> > something to
> >> > do with the entire system being written in Java thus class unloading
> >> come
> >> > along for free.  If there needs to be a modification to support
> special
> >> > case
> >> > objects in DRLVM, someone will need to fix up MMTk and provide ongoing
> >> > support of this patch in Harmony.  I have zero idea how big this
> effort
> >> > would be.   It would also be good to hear what the impact will be on
> >> GCV5.
> >>
> >> MMTk implements several algorithms for retaining the reachable objects
> >> in a graph and recycling space used by unreachable ones.  It relies on
> >> the host VM to provide a set of roots.  It supports several different
> >> semantics of 'weak' references, including but not confined to those
> >> required by Java.
> >>
> >> If you can implement class unloading using those (which the current
> >> proposal does), then MMTk can help.
> >>
> >> If you want to put a pointer to the j.l.Class in the object header,
> MMTk
> >> will not care, as it has no way of knowing.  If you put an additional
> >> pointer into the body of every object, then MMTk will see it as just
> >> another object to scan.
> >>
> >> Remember MMTk is a memory manager, not a Java VM!
> >>
> >>
> >> Conversely, supporting some exotic class unloading mechanism in MMTk
> >> shouldn't be hard and wouldn't deter me from trying it out.
> >
> >
> >
> > Robin, it would be great if you can get MMTk to support this class
> > unloading
> > effort.  My main concern is the ongoing maintenance of MMTk class
> unloading
> > support.
>
> I haven't seen any proposal that requires MMTk to be modified, so it's a
> moot point at the moment.
>
> > A question for all involved.  Is it possible to somehow make it appear
> that
> > the new objects related to unloading  (VTable, ClassLoader, etc)  are
> > always
> > reachable and thus never collected?  I am trying to figure out a way to
> > make
> > integration of class unloading independent of correct support inside the
> GC
> > and JIT.  This option could be a command line switch or compile time
> > option.
>
> Simple.  Keep a list or table of these objects as part of the root set.
> Enumerate it optionally depending on a command line option.
>
> cheers,
> Robin
>

Re: [drlvm] Class unloading support

Posted by Robin Garner <ro...@anu.edu.au>.

Weldon Washburn wrote:
> On 10/30/06, Robin Garner <ro...@anu.edu.au> wrote:
>>
>> Weldon Washburn wrote:
>> > On 10/27/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>> >>
>> >>
>> >>
>> >> Weldon Washburn wrote:
>> >> > Steve Blackburn was in Portland Oregon today.  I mentioned the idea
>> of
>> >> > adding a  reference pointer from object to its j.l.Class instance.
>> >> MMTk
>> >> > was
>> >> > not designed with this idea in mind.  It looks like you will need to
>> >> fix
>> >> > this part of MMTk and maintain it yourself.  Steve did not seem
>> >> thrilled
>> >> at
>> >> > adding this support to MMTk code base.
>>
>> Actually I think the answer may have been a little garbled along the way
>> here: MMTk is not a memory manager *for* Java, it is simply a memory
>> manager.  We have been careful to eliminate language-specific features
>> in the heap that it manages.  MMTk has been used to manage C# (in the
>> Rotor VM) and was being incorporated into a Haskell runtime until I ran
>> out of time.
>>
>> Therefore, MMTk knows nothing about the concept of class unloading, or
>> java.lang.Class.
>>
>> >> How does MMTk support class unloading then?
>> >
>> >
>> > MMTk has no special support for class unloading.  This may have
>> > something to
>> > do with the entire system being written in Java thus class unloading
>> come
>> > along for free.  If there needs to be a modification to support special
>> > case
>> > objects in DRLVM, someone will need to fix up MMTk and provide ongoing
>> > support of this patch in Harmony.  I have zero idea how big this effort
>> > would be.   It would also be good to hear what the impact will be on
>> GCV5.
>>
>> MMTk implements several algorithms for retaining the reachable objects
>> in a graph and recycling space used by unreachable ones.  It relies on
>> the host VM to provide a set of roots.  It supports several different
>> semantics of 'weak' references, including but not confined to those
>> required by Java.
>>
>> If you can implement class unloading using those (which the current
>> proposal does), then MMTk can help.
>>
>> If you want to put a pointer to the j.l.Class in the object header, MMTk
>> will not care, as it has no way of knowing.  If you put an additional
>> pointer into the body of every object, then MMTk will see it as just
>> another object to scan.
>>
>> Remember MMTk is a memory manager, not a Java VM!
>>
>>
>> Conversely, supporting some exotic class unloading mechanism in MMTk
>> shouldn't be hard and wouldn't deter me from trying it out.
> 
> 
> 
> Robin, it would be great if you can get MMTk to support this class 
> unloading
> effort.  My main concern is the ongoing maintenance of MMTk class unloading
> support.

I haven't seen any proposal that requires MMTk to be modified, so it's a 
moot point at the moment.

> A question for all involved.  Is it possible to somehow make it appear that
> the new objects related to unloading  (VTable, ClassLoader, etc)  are 
> always
> reachable and thus never collected?  I am trying to figure out a way to 
> make
> integration of class unloading independent of correct support inside the GC
> and JIT.  This option could be a command line switch or compile time 
> option.

Simple.  Keep a list or table of these objects as part of the root set. 
  Enumerate it optionally depending on a command line option.

cheers,
Robin

Re: [drlvm] Class unloading support

Posted by Weldon Washburn <we...@gmail.com>.

On 10/30/06, Robin Garner <ro...@anu.edu.au> wrote:
>
> Weldon Washburn wrote:
> > On 10/27/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
> >>
> >>
> >>
> >> Weldon Washburn wrote:
> >> > Steve Blackburn was in Portland Oregon today.  I mentioned the idea
> of
> >> > adding a  reference pointer from object to its j.l.Class instance.
> >> MMTk
> >> > was
> >> > not designed with this idea in mind.  It looks like you will need to
> >> fix
> >> > this part of MMTk and maintain it yourself.  Steve did not seem
> >> thrilled
> >> at
> >> > adding this support to MMTk code base.
>
> Actually I think the answer may have been a little garbled along the way
> here: MMTk is not a memory manager *for* Java, it is simply a memory
> manager.  We have been careful to eliminate language-specific features
> in the heap that it manages.  MMTk has been used to manage C# (in the
> Rotor VM) and was being incorporated into a Haskell runtime until I ran
> out of time.
>
> Therefore, MMTk knows nothing about the concept of class unloading, or
> java.lang.Class.
>
> >> How does MMTk support class unloading then?
> >
> >
> > MMTk has no special support for class unloading.  This may have
> > something to
> > do with the entire system being written in Java thus class unloading
> come
> > along for free.  If there needs to be a modification to support special
> > case
> > objects in DRLVM, someone will need to fix up MMTk and provide ongoing
> > support of this patch in Harmony.  I have zero idea how big this effort
> > would be.   It would also be good to hear what the impact will be on
> GCV5.
>
> MMTk implements several algorithms for retaining the reachable objects
> in a graph and recycling space used by unreachable ones.  It relies on
> the host VM to provide a set of roots.  It supports several different
> semantics of 'weak' references, including but not confined to those
> required by Java.
>
> If you can implement class unloading using those (which the current
> proposal does), then MMTk can help.
>
> If you want to put a pointer to the j.l.Class in the object header, MMTk
> will not care, as it has no way of knowing.  If you put an additional
> pointer into the body of every object, then MMTk will see it as just
> another object to scan.
>
> Remember MMTk is a memory manager, not a Java VM!
>
>
> Conversely, supporting some exotic class unloading mechanism in MMTk
> shouldn't be hard and wouldn't deter me from trying it out.



Robin, it would be great if you can get MMTk to support this class unloading
effort.  My main concern is the ongoing maintenance of MMTk class unloading
support.

A question for all involved.  Is it possible to somehow make it appear that
the new objects related to unloading  (VTable, ClassLoader, etc)  are always
reachable and thus never collected?  I am trying to figure out a way to make
integration of class unloading independent of correct support inside the GC
and JIT.  This option could be a command line switch or compile time option.

> If (as a
> wild idea) you wanted to periodically scan the heap, and count all
> references to each classloader, you could implement this with very
> little work as a TraceLocal object, and then extend the GC plan you
> wanted with an additional GC phase that would periodically do one of
> these scans after a major GC (for example).
>
> cheers
>



-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support - tested one approach

Posted by Weldon Washburn <we...@gmail.com>.

Interesting idea!   It seems the real issue is "marking and sweeping" the
vtables.  A stab at categorizing the approaches:

a)
Force vtables to be as similar to ordinary java objects as possible.  The
upside: existing GC algorithms will work unaltered.  The downside is vtables
of vtables of vtables...  This has already been discussed at length.

b)
Force vtables to live and die in a unique "vtable space".  The big
challenge seems to be building a custom GC algorithm that is simpler and
less invasive than doing a) above.  Most likely the performance of the
custom GC algorithm will never be an issue.  Vtables have some very
interesting properties that may make this doable.  The 4 (or 8) bytes at
offset "K" always point to a class structure which, in turn, always has a
pointer at offset "Z" back to the vtable.  There are way fewer vtables than
objects.  For practical reasons, it can be assumed that vtables will always
be pinned.  The number of class structs/objects is no greater than the
number of vtables.

A half-baked scheme that might be good enough:  Partition off 50 megabytes
as a hard, fixed vtable space.  Then do a word-by-word scan to pick up
candidate pointers to class structs.  Then filter the candidate class struct
pointers by verifying the back pointers.  Occasionally there might be
"floating garbage" with this approach, but a valid, live vtable should never
be accidentally freed.  The filtered set is the set of "live" vtables.  Next,
scan the live vtables looking for members that were never marked by the
regular GC as mentioned below.  When one is found, zero the vtable, link it
onto a free vtable space list, and mark the class struct as "vtable-less".
Process the "vtable-less" class struct, etc...
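
A rough sketch of that scan, under simplifying assumptions of my own: the
vtable space is modelled as a flat array of words, every vtable is
kVTableWords long, the class pointer is the first word and a GC mark is the
second, and class structs live in one contiguous array so a candidate pointer
can be range-checked before it is dereferenced. None of these names or
layouts come from DRLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical class struct with a back pointer to its vtable.
struct ClassStruct {
    const std::uintptr_t* vtable = nullptr;  // first word of the vtable
    bool vtable_less = false;
};

constexpr std::size_t kVTableWords = 4;  // assumed fixed vtable size
constexpr std::size_t kClassSlot = 0;    // word holding the class pointer
constexpr std::size_t kMarkSlot = 1;     // word set by the regular GC

// True if `word` is the address of an element of `classes`.
bool is_class_struct(std::uintptr_t word,
                     const std::vector<ClassStruct>& classes) {
    if (classes.empty()) return false;
    auto base = reinterpret_cast<std::uintptr_t>(classes.data());
    auto end = base + classes.size() * sizeof(ClassStruct);
    return word >= base && word < end &&
           (word - base) % sizeof(ClassStruct) == 0;
}

// Scans the vtable space word by word, keeps only candidates whose class
// struct points back at them, and frees those the regular GC never marked.
// Returns the word indices of the freed vtables (a stand-in for a free list).
std::vector<std::size_t> sweep_vtable_space(std::vector<std::uintptr_t>& space,
                                            std::vector<ClassStruct>& classes) {
    std::vector<std::size_t> free_list;
    for (std::size_t i = 0; i + kVTableWords <= space.size(); ++i) {
        std::uintptr_t candidate = space[i + kClassSlot];
        if (!is_class_struct(candidate, classes)) continue;
        auto* cls = reinterpret_cast<ClassStruct*>(candidate);
        if (cls->vtable != &space[i]) continue;   // back pointer disagrees
        if (space[i + kMarkSlot] != 0) continue;  // still has live instances
        for (std::size_t w = 0; w < kVTableWords; ++w) space[i + w] = 0;
        cls->vtable = nullptr;
        cls->vtable_less = true;
        free_list.push_back(i);
    }
    return free_list;
}
```

The back-pointer check is what makes the word-by-word scan safe: a stray word
that happens to equal a class struct address is ignored unless that class
struct points back at exactly that location.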

It seems as long as part of the system is built using garbage collected java
objects and part of the system is built using malloc/free C structs, the
problem of releasing connected objects/C_structs will be messy and hacky.
In a sense, this issue is really a motivation for re-writing the whole VM in
Java... hmmm...

On 10/31/06, Robin Garner <ro...@anu.edu.au> wrote:
>
> Actually, just thinking about how I would implement this in JikesRVM, I
> would use the reachability based algorithm, but piggyback on the
> existing GC mechanisms:
>
> - Allocate a byte (or word) in each vtable for the purpose of tracking
> class reachability.
> - Periodically, at a time when no GC is running (even the most
> aggressive concurrent GC algorithms have these, I believe), zero this
> flag across all vtables.  This is the beginning of a class-unloading
> epoch.
> - During each GC, when the GC is fetching the GC map for an object,
> *unconditionally* write a value to the class reachability byte.  It may
> make sense for this byte to be in either the first cache-line of the
> vtable, or the cache line that points to the GC map - just make sure the
> mark operation doesn't in general fetch an additional cache line.
> - At a point in the sufficiently far future, when all reachable objects
> are known to have been traced by the GC, sweep the vtables and check the
> reachability of the classloaders.
>
> The features of this approach are:
>
> - Minimal additional work at GC time.  The additional write will cause
> some additional memory traffic, but a) it's to memory that is already
> guaranteed to be in L1 cache, and b) it's an unconditional independent
> write, and c) multiple writes to common classes will be absorbed by the
> write buffer.
>
> - Space cost of at most 1 word per vtable.
>
> - This works whether vtables are objects or VM structures
>
> - If the relationship between a class and a vtable is not 1:1, this only
> need affect the periodic sweep process, which should be infrequent and
> small compared to a GC.
>
> - shouldn't need a stop-the-world at any point.
>
> I've implemented and tested the GC-relevant part of this in JikesRVM,
> and the GC time overhead appears to be just under 1% in the MMTk
> MarkSweep collector.
>
> cheers,
> Robin
>
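
For readers who want the quoted scheme in concrete form, here is a minimal
sketch. It is an illustration only, not JikesRVM or DRLVM code: VTable,
Object, begin_epoch, gc_visit and unloadable_loaders are hypothetical names,
class loaders are reduced to integer ids, and only the "no live instances"
part of the unloading condition is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vtable carrying the one-byte reachability mark.
struct VTable {
    std::uint8_t reachable = 0;
    std::size_t loader_id = 0;  // which class loader defined the class
};

struct Object {
    VTable* vtable;
};

// Start of a class-unloading epoch: clear the mark in every vtable.
void begin_epoch(std::vector<VTable>& vtables) {
    for (VTable& vt : vtables) vt.reachable = 0;
}

// Called by the GC for each object it visits. The store is unconditional:
// no load, no test, no branch.
inline void gc_visit(const Object& obj) {
    obj.vtable->reachable = 1;
}

// End of the epoch, after all reachable objects have been traced: sweep the
// vtables and report loaders none of whose classes had a live instance.
std::vector<std::size_t> unloadable_loaders(const std::vector<VTable>& vtables,
                                            std::size_t loader_count) {
    std::vector<bool> live(loader_count, false);
    for (const VTable& vt : vtables)
        if (vt.reachable) live[vt.loader_id] = true;
    std::vector<std::size_t> result;
    for (std::size_t id = 0; id < loader_count; ++id)
        if (!live[id]) result.push_back(id);
    return result;
}
```

A real implementation would still have to confirm that the j.l.ClassLoader
and j.l.Class instances themselves are unreachable before unloading anything.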

-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support - tested one approach

Posted by Ivan Volosyuk <iv...@gmail.com>.

+1 for this approach. It will give us some kind of class unloading
without much performance impact on GC.
--
Ivan

On 11/1/06, Robin Garner <ro...@anu.edu.au> wrote:
> Actually, just thinking about how I would implement this in JikesRVM, I
> would use the reachability based algorithm, but piggyback on the
> existing GC mechanisms:
>
> - Allocate a byte (or word) in each vtable for the purpose of tracking
> class reachability.
> - Periodically, at a time when no GC is running (even the most
> aggressive concurrent GC algorithms have these, I believe), zero this
> flag across all vtables.  This is the beginning of a class-unloading epoch.
> - During each GC, when the GC is fetching the GC map for an object,
> *unconditionally* write a value to the class reachability byte.  It may
> make sense for this byte to be in either the first cache-line of the
> vtable, or the cache line that points to the GC map - just make sure the
> mark operation doesn't in general fetch an additional cache line.
> - At a point in the sufficiently far future, when all reachable objects
> are known to have been traced by the GC, sweep the vtables and check the
> reachability of the classloaders.
>
> The features of this approach are:
>
> - Minimal additional work at GC time.  The additional write will cause
> some additional memory traffic, but a) it's to memory that is already
> guaranteed to be in L1 cache, and b) it's an unconditional independent
> write, and c) multiple writes to common classes will be absorbed by the
> write buffer.
>
> - Space cost of at most 1 word per vtable.
>
> - This works whether vtables are objects or VM structures
>
> - If the relationship between a class and a vtable is not 1:1, this only
> need affect the periodic sweep process, which should be infrequent and
> small compared to a GC.
>
> - shouldn't need a stop-the-world at any point.
>
> I've implemented and tested the GC-relevant part of this in JikesRVM,
> and the GC time overhead appears to be just under 1% in the MMTk
> MarkSweep collector.
>
> cheers,
> Robin
-- 
Ivan
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support - tested one approach

Posted by Robin Garner <ro...@anu.edu.au>.

Weldon Washburn wrote:
> 
> It's probably in the noise but it might be possible to
> even reduce the overhead of clearing the vtable "mark" by using an epoch
> number instead of a boolean.  The idea is that after every major GC,
> increment the value used for the mark.  When sweeping the vtables, the
> stale mark values are the unreachable classes.
> 
> cheers

Right.  I'm assuming we're all on the same page here, but I'll spell it 
out anyway:  The number of objects is orders of magnitude higher than 
the number of classes, so any operation on a 'per-class' basis can 
afford to be expensive, whereas per-object operations need to be fast.

Just looking at my stats for SpecJVM98, JBB 2000 and DaCapo 2006-10, the 
ratio of live objects to classes loaded is ~500:1 (geometric mean).  The 
extremes are 11:1 (jython) to 24000:1 (hsqldb).  These are probably also 
  very small heaps compared to enterprise workloads, which would drive 
the number of objects/class up.

The other consideration is that you are not going to want to check for 
unloadable classloaders at every GC.

Therefore, within reason, I don't think the efficiency of per-class 
operations is much of a consideration.

 > It's probably in the noise but it might be possible to
 > even reduce the overhead of clearing the vtable "mark" by using an epoch
 > number instead of a boolean.

Under the circumstances, requiring an additional register during GC to 
hold the class epoch number probably loses more than it gains.

cheers

Re: [drlvm] Class unloading support - tested one approach

Posted by Weldon Washburn <we...@gmail.com>.

On 11/1/06, Robin Garner <ro...@anu.edu.au> wrote:
>
> Rana Dasgupta wrote:
> > Robin,
> >    The basic difference of this with Etienne's method is that the flag
> is
> > on the vtable, instead of the CL instance. Do I understand correctly ?
> The
> > GC perf impact is therefore reduced because you need to lookup
> > object->vtable instead of object->class->CLinstance when tracing the
> heap.
> > The space overhead is correspondingly slightly higher. Also the GC
> impact
> > may look lower because there are a couple of pseudo GC cycles to reset
> the
> > vtables and sweep the vtables.
> >
> > Thanks,
> > Rana
>
> The relevant part of Etienne's design I believe is this:
>
> > 7- Each class loader structure maintains a set of boolean flags, one
> >  flag per "non-nursery" garbage collected area (even when thread-local
> >  heaps are used).  The flag is set when an instance of a class loaded by
> >  this class loader is moved into the related GC-area.  The flag is unset
> >  when the GC-area is emptied, or (optionally) when it can be determined
> >  that no instance of a class loaded by this class loader remains in the
> >  GC-area.  This is best implemented as follows: a) use an unconditional
> >  write of "true" in the flag every time an object is moved into the
> >  GC-area by the garbage collector, b) unset the related flag in "all"
> >  class loader structures just before collecting a GC-area, then setting
> >  the flag back when an object survives in the area.
>
> My design differs in several key ways from this:
> 1. There is no requirement for a per non-nursery area flag
> 2. The mark byte/word is set unconditionally whenever an object is
> visited by the GC, not when an object is moved into a particular mature
> space.  This may be the same for some GCs, but not all.
> 3. The mark byte/word is an unconditional write - Etienne's proposal
> would use a load/mask/write sequence.  This is performance critical.
> 4. My memory of x86 assembler is a little rusty, but I believe a
> constant store can be done without requiring a register for the value to
> be written, whereas or-ing a bit value into a word requires a temporary
> register or two.
> 5. In a parallel GC, setting bits in a mask requires a synchronized
> update.  My design doesn't.
>
> The point is an unconditional store to a structure you are already
> accessing is very cheap, whereas register spills, loads and synchronized
> updates are expensive.


I might be missing something here, but my take is that Robin's design is
really the best one.  It's probably in the noise, but it might be possible to
reduce the overhead of clearing the vtable "mark" even further by using an
epoch number instead of a boolean.  The idea is to increment the value used
for the mark after every major GC.  When sweeping the vtables, the classes
with stale mark values are the unreachable ones.
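
A small sketch of the epoch-number variant, with hypothetical names
(EpochMarker, last_seen_epoch) and assuming a 32-bit counter that is never
allowed to wrap around:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vtable field: the epoch in which an instance was last seen.
// Zero means "never seen".
struct VTable {
    std::uint32_t last_seen_epoch = 0;
};

struct EpochMarker {
    std::uint32_t current = 1;

    // Called by the GC for each visited object's vtable.
    void mark(VTable& vt) const { vt.last_seen_epoch = current; }

    // After a major GC: any vtable whose stamp is not the current epoch had
    // no live instance in that GC. Advancing the epoch replaces the separate
    // pass that would otherwise clear every mark.
    std::vector<std::size_t> sweep_and_advance(const std::vector<VTable>& vts) {
        std::vector<std::size_t> stale;
        for (std::size_t i = 0; i < vts.size(); ++i)
            if (vts[i].last_seen_epoch != current) stale.push_back(i);
        ++current;
        return stale;
    }
};
```

The trade-off is that the marking code now has to read the current epoch
value instead of storing a constant.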

cheers
>
> > On 10/31/06, Robin Garner <ro...@anu.edu.au> wrote:
> >>
> >> Actually, just thinking about how I would implement this in JikesRVM, I
> >> would use the reachability based algorithm, but piggyback on the
> >> existing GC mechanisms:
> >>
> >> - Allocate a byte (or word) in each vtable for the purpose of tracking
> >> class reachability.
> >> - Periodically, at a time when no GC is running (even the most
> >> aggressive concurrent GC algorithms have these, I believe), zero this
> >> flag across all vtables.  This is the beginning of a class-unloading
> >> epoch.
> >> - During each GC, when the GC is fetching the GC map for an object,
> >> *unconditionally* write a value to the class reachability byte.  It may
> >> make sense for this byte to be in either the first cache-line of the
> >> vtable, or the cache line that points to the GC map - just make sure
> the
> >> mark operation doesn't in general fetch an additional cache line.
> >> - At a point in the sufficiently far future, when all reachable objects
> >> are known to have been traced by the GC, sweep the vtables and check
> the
> >> reachability of the classloaders.
> >>
> >> The features of this approach are:
> >>
> >> - Minimal additional work at GC time.  The additional write will cause
> >> some additional memory traffic, but a) it's to memory that is already
> >> guaranteed to be in L1 cache, and b) it's an unconditional independent
> >> write, and c) multiple writes to common classes will be absorbed by the
> >> write buffer.
> >>
> >> - Space cost of at most 1 word per vtable.
> >>
> >> - This works whether vtables are objects or VM structures
> >>
> >> - If the relationship between a class and a vtable is not 1:1, this
> only
> >> need affect the periodic sweep process, which should be infrequent and
> >> small compared to a GC.
> >>
> >> - shouldn't need a stop-the-world at any point.
> >>
> >> I've implemented and tested the GC-relevant part of this in JikesRVM,
> >> and the GC time overhead appears to be just under 1% in the MMTk
> >> MarkSweep collector.
> >>
> >> cheers,
> >> Robin
> >>
> >
>
>


-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support - tested one approach

Posted by Robin Garner <ro...@anu.edu.au>.

Rana Dasgupta wrote:
> Robin,
>    The basic difference of this with Etienne's method is that the flag is
> on the vtable, instead of the CL instance. Do I understand correctly ? The
> GC perf impact is therefore reduced because you need to lookup
> object->vtable instead of object->class->CLinstance when tracing the heap.
> The space overhead is correspondingly slightly higher. Also the GC impact
> may look lower because there are a couple of pseudo GC cycles to reset the
> vtables and sweep the vtables.
> 
> Thanks,
> Rana

The relevant part of Etienne's design I believe is this:

> 7- Each class loader structure maintains a set of boolean flags, one
>  flag per "non-nursery" garbage collected area (even when thread-local
>  heaps are used).  The flag is set when an instance of a class loaded by
>  this class loader is moved into the related GC-area.  The flag is unset
>  when the GC-area is emptied, or (optionally) when it can be determined
>  that no instance of a class loaded by this class loader remains in the
>  GC-area.  This is best implemented as follows: a) use an unconditional
>  write of "true" in the flag every time an object is moved into the
>  GC-area by the garbage collector, b) unset the related flag in "all"
>  class loader structures just before collecting a GC-area, then setting
>  the flag back when an object survives in the area.

My design differs in several key ways from this:
1. There is no requirement for a per non-nursery area flag
2. The mark byte/word is set unconditionally whenever an object is 
visited by the GC, not when an object is moved into a particular mature 
space.  This may be the same for some GCs, but not all.
3. The mark byte/word is an unconditional write - Etienne's proposal 
would use a load/mask/write sequence.  This is performance critical.
4. My memory of x86 assembler is a little rusty, but I believe a 
constant store can be done without requiring a register for the value to 
be written, whereas or-ing a bit value into a word requires a temporary
register or two.
5. In a parallel GC, setting bits in a mask requires a synchronized 
update.  My design doesn't.

The point is an unconditional store to a structure you are already 
accessing is very cheap, whereas register spills, loads and synchronized 
updates are expensive.
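
To illustrate the contrast in points 3 to 5, here is a hedged sketch of the
two kinds of update. The type names are hypothetical, and the bit-mask
version reflects only how the alternative is characterized above, not any
actual implementation. std::atomic is used for both so that concurrent GC
threads are well defined in C++; with relaxed ordering the byte store
typically compiles to an ordinary store instruction.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Flag-per-area style: one bit per GC area packed into a shared word.
// Setting a bit is a read-modify-write, so in a parallel GC it must be an
// atomic (synchronized) operation or concurrent updates can be lost.
// Assumes area < 32.
struct LoaderFlags {
    std::atomic<std::uint32_t> area_bits{0};

    void set_area(unsigned area) {
        area_bits.fetch_or(1u << area, std::memory_order_relaxed);
    }
};

// Mark-byte style: a constant is stored unconditionally. Every writer stores
// the same value, so no read-modify-write is needed and lost updates are
// impossible.
struct VTableMark {
    std::atomic<std::uint8_t> reachable{0};

    void mark() { reachable.store(1, std::memory_order_relaxed); }
};
```

Calling mark() any number of times from any number of threads leaves the byte
at 1, which is why the store needs no coordination.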

cheers

> On 10/31/06, Robin Garner <ro...@anu.edu.au> wrote:
>>
>> Actually, just thinking about how I would implement this in JikesRVM, I
>> would use the reachability based algorithm, but piggyback on the
>> existing GC mechanisms:
>>
>> - Allocate a byte (or word) in each vtable for the purpose of tracking
>> class reachability.
>> - Periodically, at a time when no GC is running (even the most
>> aggressive concurrent GC algorithms have these, I believe), zero this
>> flag across all vtables.  This is the beginning of a class-unloading
>> epoch.
>> - During each GC, when the GC is fetching the GC map for an object,
>> *unconditionally* write a value to the class reachability byte.  It may
>> make sense for this byte to be in either the first cache-line of the
>> vtable, or the cache line that points to the GC map - just make sure the
>> mark operation doesn't in general fetch an additional cache line.
>> - At a point in the sufficiently far future, when all reachable objects
>> are known to have been traced by the GC, sweep the vtables and check the
>> reachability of the classloaders.
>>
>> The features of this approach are:
>>
>> - Minimal additional work at GC time.  The additional write will cause
>> some additional memory traffic, but a) it's to memory that is already
>> guaranteed to be in L1 cache, and b) it's an unconditional independent
>> write, and c) multiple writes to common classes will be absorbed by the
>> write buffer.
>>
>> - Space cost of at most 1 word per vtable.
>>
>> - This works whether vtables are objects or VM structures
>>
>> - If the relationship between a class and a vtable is not 1:1, this only
>> need affect the periodic sweep process, which should be infrequent and
>> small compared to a GC.
>>
>> - shouldn't need a stop-the-world at any point.
>>
>> I've implemented and tested the GC-relevant part of this in JikesRVM,
>> and the GC time overhead appears to be just under 1% in the MMTk
>> MarkSweep collector.
>>
>> cheers,
>> Robin
>>
>

Re: [drlvm] Class unloading support - tested one approach

Posted by Rana Dasgupta <rd...@gmail.com>.

Robin,
    The basic difference between this and Etienne's method is that the flag is
on the vtable, instead of the CL instance. Do I understand correctly? The
GC perf impact is therefore reduced because you need to look up
object->vtable instead of object->class->CLinstance when tracing the heap.
The space overhead is correspondingly slightly higher. Also the GC impact
may look lower because there are a couple of pseudo GC cycles to reset the
vtables and sweep the vtables.

Thanks,
Rana





On 10/31/06, Robin Garner <ro...@anu.edu.au> wrote:
>
> Actually, just thinking about how I would implement this in JikesRVM, I
> would use the reachability based algorithm, but piggyback on the
> existing GC mechanisms:
>
> - Allocate a byte (or word) in each vtable for the purpose of tracking
> class reachability.
> - Periodically, at a time when no GC is running (even the most
> aggressive concurrent GC algorithms have these, I believe), zero this
> flag across all vtables.  This is the beginning of a class-unloading
> epoch.
> - During each GC, when the GC is fetching the GC map for an object,
> *unconditionally* write a value to the class reachability byte.  It may
> make sense for this byte to be in either the first cache-line of the
> vtable, or the cache line that points to the GC map - just make sure the
> mark operation doesn't in general fetch an additional cache line.
> - At a point in the sufficiently far future, when all reachable objects
> are known to have been traced by the GC, sweep the vtables and check the
> reachability of the classloaders.
>
> The features of this approach are:
>
> - Minimal additional work at GC time.  The additional write will cause
> some additional memory traffic, but a) it's to memory that is already
> guaranteed to be in L1 cache, and b) it's an unconditional independent
> write, and c) multiple writes to common classes will be absorbed by the
> write buffer.
>
> - Space cost of at most 1 word per vtable.
>
> - This works whether vtables are objects or VM structures
>
> - If the relationship between a class and a vtable is not 1:1, this only
> need affect the periodic sweep process, which should be infrequent and
> small compared to a GC.
>
> - shouldn't need a stop-the-world at any point.
>
> I've implemented and tested the GC-relevant part of this in JikesRVM,
> and the GC time overhead appears to be just under 1% in the MMTk
> MarkSweep collector.
>
> cheers,
> Robin
>

Re: [drlvm] Class unloading support - tested one approach

Posted by Etienne Gagnon <eg...@sablevm.org>.

Ivan Volosyuk wrote:
> We will get rid of false sharing. That's true. But it will still be quite
> expensive to write those '1' values, because of ping-ponging of the
> cache line between processors. I see only one solution to this: use
> separate mark bits in vtable per GC thread which should reside in
> different cache lines and different from that word containing gcmap
> pointer.

Thinking about it...  Doesn't the "object vtable" suffer from the same
problem, anyway?  It's probably worse, as it will be quite unfeasible to
try to locate them in the "right" cache lines!  Yep, another point
against object-vtables...

Etienne
-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support - tested one approach

Posted by Robin Garner <ro...@anu.edu.au>.

Ivan Volosyuk wrote:
> On 11/9/06, Etienne Gagnon <eg...@sablevm.org> wrote:
>> Ivan Volosyuk wrote:
>> > We will get rid of false sharing. That's true. But it will still be quite
>> > expensive to write those '1' values, because of ping-ponging of the
>> > cache line between processors. I see only one solution to this: use
>> > separate mark bits in vtable per GC thread which should reside in
>> > different cache lines and different from that word containing gcmap
>> > pointer.
>>
>> The only thing that a GC thread does is write "1" in this slot; it never
>> writes "0".  So, it is not very important in what order (or even "when")
>> this word is finally committed to main memory, as long as there is some
>> barrier before the "end of epoch collection" ensuring that all
>> processors' cache write buffers are committed to memory before tracing
>> vtables (or gc maps).
>>
>> You don't need memory coherency on write-without-read. :-)
> 
> I don't speak about memory coherency, I speak about bus load with
> useless memory traffic between processors and poor CPU cache usage.
> 
Surely this wouldn't happen in a sufficiently weak memory model?  Let's 
just not support x64 :-)

But I think this false sharing may be what kills this particular idea.
The next cheapest option should be to use a side array of bytes - as 
long as calculating the address of the mark byte can be done without any 
loads or register spills, it should still be cheaper than a full 
test-and-mark operation on the vtable.  I guess there are cache policies 
where this may still be slow on an SMP machine.

Side metadata is easiest to do when objects are in a specific space and 
have coarse alignment.  Any ideas what the typical size of a DRLVM vtable 
is ?  Would 256 bytes be an excessive alignment boundary ?
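
A rough sketch of the side-array idea, assuming vtables live in one contiguous space and are aligned to 256 bytes as floated above. `SideMarkTable`, `base` and the use of `long` values as addresses are illustrative assumptions, not DRLVM or MMTk code. The mark-byte index is derived from the vtable address with a subtract and a shift, so nothing is loaded from the vtable itself.

```java
import java.util.Arrays;

final class SideMarkTable {
    static final int LOG_ALIGN = 8;   // 2^8 = 256-byte vtable alignment (assumed)

    private final long base;          // start address of the assumed vtable space
    private final byte[] marks;       // one mark byte per 256-byte slot

    SideMarkTable(long base, long spaceBytes) {
        this.base = base;
        this.marks = new byte[(int) (spaceBytes >>> LOG_ALIGN)];
    }

    private int slot(long vtableAddress) {
        return (int) ((vtableAddress - base) >>> LOG_ALIGN);
    }

    /** Subtract, shift, store: the address arithmetic needs no memory loads. */
    void mark(long vtableAddress) {
        marks[slot(vtableAddress)] = 1;
    }

    boolean isMarked(long vtableAddress) {
        return marks[slot(vtableAddress)] != 0;
    }

    /** Start of a class-unloading epoch: zero every mark. */
    void clear() {
        Arrays.fill(marks, (byte) 0);
    }
}
```

With 256-byte alignment the side array costs one byte per 256 bytes of vtable space, about 0.4%. A vtable larger than 256 bytes would simply own several slots, only the first of which is ever written.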

I'll try it out in the next day or so.  Unfortunately I don't have 
access to anything with more parallelism than a Pentium D, so it's not 
likely to be conclusive.

-- 
Robin Garner
Dept. of Computer Science
Australian National University
http://cs.anu.edu.au/people/Robin.Garner/

Re: [drlvm] Class unloading support - tested one approach

Posted by Ivan Volosyuk <iv...@gmail.com>.

On 11/9/06, Etienne Gagnon <eg...@sablevm.org> wrote:
> Ivan Volosyuk wrote:
> > We will get rid of false sharing. That's true. But it will still be quite
> > expensive to write those '1' values, because of ping-ponging of the
> > cache line between processors. I see only one solution to this: use
> > separate mark bits in vtable per GC thread which should reside in
> > different cache lines and different from that word containing gcmap
> > pointer.
>
> The only thing that a GC thread does is write "1" in this slot; it never
> writes "0".  So, it is not very important in what order (or even "when")
> this word is finally committed to main memory, as long as there is some
> barrier before the "end of epoch collection" ensuring that all
> processors' cache write buffers are committed to memory before tracing
> vtables (or gc maps).
>
> You don't need memory coherency on write-without-read. :-)

I don't speak about memory coherency, I speak about bus load with
useless memory traffic between processors and poor CPU cache usage.

-- 
Ivan
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support - tested one approach

Posted by Salikh Zakirov <Sa...@Intel.com>.

Etienne Gagnon wrote:
> Ivan Volosyuk wrote:
>> We will get rid of false sharing. That's true. But it will still be quite
>> expensive to write those '1' values, because of ping-ponging of the
>> cache line between processors. I see only one solution to this: use
>> separate mark bits in vtable per GC thread which should reside in
>> different cache lines and different from that word containing gcmap
>> pointer.
> 
> The only thing that a GC thread does is write "1" in this slot; it never
> writes "0".  So, it is not very important in what order (or even "when")
> this word is finally committed to main memory, as long as there is some
> barrier before the "end of epoch collection" ensuring that all
> processors' cache write buffers are committed to memory before tracing
> vtables (or gc maps).

The "false sharing" problem occurs whenever one processor writes into
the cache line other processors read from, because it invalidates loaded
copies and makes other processors read it again from main memory.
It doesn't matter whether they write 1 or 0.

In our case, both gcmap pointer and gcmap itself are likely to be read
by multiple processors, so writing to a location nearby may lead
to false sharing.

Re: [drlvm] Class unloading support - tested one approach

Posted by Etienne Gagnon <eg...@sablevm.org>.

Ivan Volosyuk wrote:
> We will get rid of false sharing. That's true. But it will still be quite
> expensive to write those '1' values, because of ping-ponging of the
> cache line between processors. I see only one solution to this: use
> separate mark bits in vtable per GC thread which should reside in
> different cache lines and different from that word containing gcmap
> pointer.

The only thing that a GC thread does is write "1" in this slot; it never
writes "0".  So, it is not very important in what order (or even "when")
this word is finally committed to main memory, as long as there is some
barrier before the "end of epoch collection" ensuring that all
processors' cache write buffers are committed to memory before tracing
vtables (or gc maps).

You don't need memory coherency on write-without-read. :-)
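
To illustrate the ordering argument only (it says nothing about the bus-traffic cost), here is a small Java sketch under stated assumptions: GC workers are plain threads, the marks are a shared `byte[]`, and `Thread.join()` plays the role of the barrier before the end-of-epoch sweep. `EpochBarrierSketch` and `markInParallel` are hypothetical names.

```java
final class EpochBarrierSketch {
    /** Several workers race to write 1 into the same mark bytes; nobody ever writes 0. */
    static byte[] markInParallel(int vtableCount, int gcThreads) {
        final byte[] marks = new byte[vtableCount];
        Thread[] workers = new Thread[gcThreads];
        for (int t = 0; t < gcThreads; t++) {
            workers[t] = new Thread(() -> {
                // Every worker "visits" the even-numbered vtables, so writes overlap.
                for (int v = 0; v < vtableCount; v += 2) {
                    marks[v] = 1;
                }
            });
            workers[t].start();
        }
        try {
            // The barrier: join() makes all worker writes visible to the sweeping thread.
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return marks;
    }

    public static void main(String[] args) {
        byte[] marks = markInParallel(8, 4);
        System.out.println(java.util.Arrays.toString(marks));
    }
}
```

Because every writer stores the same value, the order of the stores does not affect the result; only the barrier before the sweep matters for correctness.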

Etienne

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support - tested one approach

Posted by Ivan Volosyuk <iv...@gmail.com>.

On 11/9/06, Etienne Gagnon <eg...@sablevm.org> wrote:
> Salikh Zakirov wrote:
> > Technically, it should not be too difficult to add an additional field to the VTable
> > structure, and require GC to write 1 there during object scanning.
> > However, if the VTable mark is located in the same cache line as gcmap,
> > it may severely hit parallel GC performance on a multiprocessor due to false sharing,
> > as writing VTable mark will invalidate the gcmap pointers loaded to caches of other
> > processors.
> >
> >    object            VTable                   gcmap
> >  +--------+        +-----------+            +------------------+
> >  | VT ptr |------->| gcmap ptr |----------->| offset of ref #1 |
> >  |  ...   |        |    ...    |            | offset of ref #2 |
> >  +--------+        +-----------+            |       ...        |
> >                                             |        0         |
> >                                             +------------------+
>
> If you go that far for every scanned object (!), then you could simply
> place the class unloading bit in the gc map (at index -1) to minimize
> disruption of current code...
>
>    object            VTable                   gcmap
>                                             +------------------+
>  +--------+        +-----------+            | cl.un. mark bit  |
>  | VT ptr |------->| gcmap ptr |----------->| offset of ref #1 |
>  |  ...   |        |    ...    |            | offset of ref #2 |
>  +--------+        +-----------+            |       ...        |
>                                             |        0         |
>                                             +------------------+
>
> This gets rid of the cache line hazard...

We will get rid of false sharing. That's true. But it will still be quite
expensive to write those '1' values, because of ping-ponging of the
cache line between processors. I see only one solution to this: use
separate mark bits in vtable per GC thread which should reside in
different cache lines and different from that word containing gcmap
pointer.
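
One way to picture the per-GC-thread marks, as a hedged Java sketch: each GC thread writes only to its own row of mark bytes, and the sweep ORs the rows together. This is a variation on the suggestion above (the marks sit in per-thread arrays rather than inside the vtable), `PerThreadMarks` is a hypothetical name, and Java gives no control over cache-line placement, so separate rows only approximate "different cache lines".

```java
import java.util.Arrays;

final class PerThreadMarks {
    private final byte[][] marks;   // marks[gcThread][vtableIndex]

    PerThreadMarks(int gcThreads, int vtableCount) {
        marks = new byte[gcThreads][vtableCount];
    }

    /** Called by GC thread gcThread when it scans an object of the given class. */
    void mark(int gcThread, int vtableIndex) {
        marks[gcThread][vtableIndex] = 1;   // only this thread ever writes this row
    }

    /** Sweep-time check: a class is live if any GC thread marked it. */
    boolean reachable(int vtableIndex) {
        for (byte[] row : marks) {
            if (row[vtableIndex] != 0) {
                return true;
            }
        }
        return false;
    }

    /** Start of a new epoch. */
    void clear() {
        for (byte[] row : marks) {
            Arrays.fill(row, (byte) 0);
        }
    }
}
```

The trade-off is space (one mark per class per GC thread) and a slightly more expensive sweep, in exchange for no cache line being written by more than one processor during marking.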

-- 
Ivan
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support - tested one approach

Posted by Etienne Gagnon <eg...@sablevm.org>.

Salikh Zakirov wrote:
> Technically, it should not be too difficult to add an additional field to the VTable
> structure, and require GC to write 1 there during object scanning.
> However, if the VTable mark is located in the same cache line as gcmap,
> it may severely hit parallel GC performance on a multiprocessor due to false sharing,
> as writing VTable mark will invalidate the gcmap pointers loaded to caches of other
> processors. 
> 
>    object            VTable                   gcmap
>  +--------+        +-----------+            +------------------+
>  | VT ptr |------->| gcmap ptr |----------->| offset of ref #1 |
>  |  ...   |        |    ...    |            | offset of ref #2 |
>  +--------+        +-----------+            |       ...        |
>                                             |        0         |
>                                             +------------------+

If you go that far for every scanned object (!), then you could simply
place the class unloading bit in the gc map (at index -1) to minimize
disruption of current code...

   object            VTable                   gcmap
                                            +------------------+
 +--------+        +-----------+            | cl.un. mark bit  |
 | VT ptr |------->| gcmap ptr |----------->| offset of ref #1 |
 |  ...   |        |    ...    |            | offset of ref #2 |
 +--------+        +-----------+            |       ...        |
                                            |        0         |
                                            +------------------+

This gets rid of the cache line hazard...
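
A small Java sketch of the "index -1" layout, under stated assumptions: the gc map is modelled as an `int[]` of reference-field offsets terminated by 0, as in the diagram, and since Java arrays have no negative indices, slot 0 plays the role of index -1 while the "gcmap ptr" corresponds to slot 1. `GcMapSketch` and its methods are hypothetical names, not gc_cc code.

```java
final class GcMapSketch {
    static final int FIRST_OFFSET = 1;   // "gcmap ptr" points here; slot 0 is "index -1"

    /** Build a map: [mark, offset #1, offset #2, ..., 0]. Offsets must be non-zero. */
    static int[] build(int... refOffsets) {
        int[] map = new int[refOffsets.length + 2];
        System.arraycopy(refOffsets, 0, map, FIRST_OFFSET, refOffsets.length);
        return map;   // mark and terminator are both 0 initially
    }

    /** Scan one object: set the class-unloading mark, then walk the offsets. */
    static int scan(int[] map) {
        map[FIRST_OFFSET - 1] = 1;        // unconditional write next to data being read anyway
        int refs = 0;
        for (int i = FIRST_OFFSET; map[i] != 0; i++) {
            refs++;                        // a real GC would trace the field at offset map[i]
        }
        return refs;
    }

    static boolean marked(int[] map) {
        return map[FIRST_OFFSET - 1] != 0;
    }
}
```

The existing scanning loop is unchanged apart from the one extra store, which is the "minimal disruption" point made above.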

Why don't you also investigate using SableVM's bidirectional object
layout?  Dayong Gu (a Ph.D. student that I co-supervise) found that it
was quite simple to implement in JikesRVM.  I don't see why it should be
harder to implement in drlvm...  It would save you this nasty
indirection for scanning objects!  [See my Ph.D. thesis for an
explanation of the layout.  You can get in touch with Dayong through the
coordinates at http://www.sable.mcgill.ca/people/ ].

Etienne

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support - tested one approach

Posted by Salikh Zakirov <Sa...@Intel.com>.

Robin Garner wrote:
> Etienne Gagnon wrote:
>> 3- Why would it be so hard to add an unconditional write operation
>> during collection (e.g. during copying or marking of an object) in
>> drlvm?  A detailed technical explanation is welcome. :-)
> 
> I actually believe that this should be implementable in a GC-neutral
> way, whether vtables are objects or not.  The GC will at some point ask
> the VM for the GC map of the object it is about to scan.  At this point
> the VM can write the mark of the vtable.
> 
> I guess I'm making an assumption about the GC -> VM interface here, but
> if it doesn't exist, it should :)

In the current GC-VM interface, which is used in DRLVM
(see vm/include/open/gc.h and vm/include/open/vm_gc.h),
the GC never asks the VM for the gcmap; instead, it builds the gcmap
itself as one of the class loading steps. The VM calls gc_class_prepared()
for each loaded class, and the GC uses various query functions to learn
about the types and offsets of object fields.

The gcmap pointer is stored in the VTable, in the several bytes reserved specifically
for the GC use.

Technically, it should not be too difficult to add an additional field to the VTable
structure, and require GC to write 1 there during object scanning.
However, if the VTable mark is located in the same cache line as gcmap,
it may severely hit parallel GC performance on a multiprocessor due to false sharing,
as writing VTable mark will invalidate the gcmap pointers loaded to caches of other
processors. 

   object            VTable                   gcmap
 +--------+        +-----------+            +------------------+
 | VT ptr |------->| gcmap ptr |----------->| offset of ref #1 |
 |  ...   |        |    ...    |            | offset of ref #2 |
 +--------+        +-----------+            |       ...        |
                                            |        0         |
                                            +------------------+

(* actually, in the current default collector "gc_cc",
 gcmap ptr also has some flags in lower 3 bits, and gcmap has some fields
before offsets array as well *)

That's why we would probably want the VTable mark to be well separated
from both the gcmap pointer and the gcmap itself.

>> By the way, what are the currently competing proposals?
>> 1- object vtables
>> 2- Robin/Gagnon proposal  (still finishing up some details ;-)
>> 3- Is there a 3rd?

Yes, as far as I heard from Aleksey Ignatenko, there was a 3rd prototype in the works,
which ran as a stop-the-world phase completely independent of the GC,
tracing the heap and marking classes and classloaders specially.
The tracing functionality was reimplemented within the VM without any GC changes.
The stop-the-world phase was piggy-backed onto some collections.

And before the 3rd prototype there was one more, which differed
in the tracing implementation: it used a GC->VM callback on each object scan.

Re: [drlvm] Class unloading support - tested one approach

Posted by Robin Garner <ro...@anu.edu.au>.

Etienne Gagnon wrote:
> 3- Why would it be so hard to add an unconditional write operation
> during collection (e.g. during copying or marking of an object) in
> drlvm?  A detailed technical explanation is welcome. :-)

I actually believe that this should be implementable in a GC-neutral 
way, whether vtables are objects or not.  The GC will at some point ask 
the VM for the GC map of the object it is about to scan.  At this point 
the VM can write the mark of the vtable.

I guess I'm making an assumption about the GC -> VM interface here, but 
if it doesn't exist, it should :)

>   So far, this latest point (3-) seems the sole argument in favor of
>   using the "object-vtables" approach.  Wouldn't the right fix, if
>   it's currently really impossible to implement an unconditional
>   write, be to extend the drlvm GC interface?  Isn't this a design
>   deficiency in the GC interface?  No other argument, so far, seems
>   to be in favor of the object vtable approach, unless I missed some.
> 
> 
> As for Robin's attempt to deal with weak/hard reference to the class
> loader object using a Reference Queue, I am not yet convinced of the
> correctness of the approach...  [off the top of my head: potential
> problem with static synchronized methods].  And, this would be probably
> more work intensive (so, less efficient) than my proposal in 1-[*]
> above.  It might also be tricky to identify all possible situations such
> as Object.getClass() where some special code is needed to deal with a
> problem situation.  I prefer clean, scope-limited code for dealing with
> class unloading.  It's easier, that way, to activate or deactivate class
> loading [dynamically!].

Glad to see I'm not the only one :)  It really was just off the top of 
my head.

> In summary, I would like to be convinced of the "completeness" and of
> the "correctness" of all competing approaches.  I personally am, so far,
> in favor of Robin's unconditional vtable bit/byte/word write idea, along
> with an "adapted" version of my proposal for dealing with class loader
> death (such as proposed in 1-[*] above).
> 
> Also, if somebody was able to find a "correctness" deficiency in my
> proposal, then please let us know, so that we make sure this deficiency
> is eliminated from all competing proposals.
> 
> By the way, what are the currently competing proposals?
> 1- object vtables
> 2- Robin/Gagnon proposal  (still finishing up some details ;-)
> 3- Is there a 3rd?
> 
> Which ones have existing implementations?  How "correct/complete" are
> they?  Do we have access to some "human readable" (i.e. non-code) full
> description of the algorithm?

And what is their performance hit ?

> Etienne
> 
> Robin wrote:
>> On Thu, 2006-11-09 at 02:01 +0300, Ivan Volosyuk wrote:
>>
>>> Robin,
>>>
>>> thank you for detailed description of the algorithm. IMHO, this was
>>> the most complicated place of the whole story: how to have a weak
>>> reference to classloader and still be able to get it alive again. This
>>> shouldn't be performance critical part and is quite doable. I
>>> absolutely agree with your estimations about tracing extra reference
>>> per object. The approach you propose is more efficient and quite
>>> elegant.
>>> --
>>> Ivan
>>
>> Thanks :)
>>
>>
>>> On 11/8/06, Robin Garner <ro...@anu.edu.au> wrote:
>>>
>>>> Robin Garner wrote:
>>>>
>>>>> Aleksey Ignatenko wrote:
>>>>>
>>>>>> Robin.
>>>>>>
>>>>>>
>>>>>>> OK, well how about keeping a weak reference to the j.l.ClassLoader
>>>>>>> object instead of a strong one.  When the reference becomes (strong)ly
>>>>>>> unreachable, invoke the class-unloading phase.
>>>>>>
>>>>>> If you have weak reference to j.l.Classloader - GC will collect it
>>>>>> (with all
>>>>>> appropriate jlClasses) as soon as there are no references to
>>>>>> j.l.Classloader and appropriate classes. But there is possible
>>>>>> situation when there are some
>>>>>> live objects of that classes and no references to jlClassloader and
>>>>>> jlClasses. This will lead to unpredictable consequences (crash, etc).
>>>>>>
>>>>>>
>>>>>>
>>>>>> I want to remind that there are 3 mandatory conditions of class unloading:
>>>>>>
>>>>>> 1. j.l.Classloader instance is unreachable.
>>>>>>
>>>>>> 2. Appropriate j.l.Class instances are unreachable.
>>>>>>
>>>>>> 3. No object of any class loaded by appropriate class loader exists.
>>>>> Let me repeat.  I offer an efficient solution to (3).  I don't purport
>>>>> to have a solution to (1) and (2).
>>>> Let me just add:  This is because I don't think (1) or (2) are
>>>> particularly difficult from a performance point of view, although I'm
>>>> happy to accept that there may still be some subtle engineering challenges.
>>>>
>>>> Now this is just off the top of my head, but what about this for a design:
>>>> - A j.l.ClassLoader maintains a collection of each of the classes it has
>>>> loaded
>>>> - A j.l.Class contains a pointer to its j.l.ClassLoader
>>>> - A j.l.Class maintains a collection of its vtable(s) (or a pointer if 1:1).
>>>> The point of this is that a class loader and its classes are a 'self
>>>> sustaining' data structure - if one element in it is reachable the whole
>>>> thing is reachable.
>>>>
>>>> The VM maintains a weak reference to all its j.l.ClassLoader instances,
>>>> and maintains a ReferenceQueue for weakly-reachable classloaders.
>>>> ClassLoaders are placed on the ReferenceQueue if and only if they are
>>>> unreachable from the heap (including via their j.l.Class objects).  Note
>>>> this is an irreversible condition: objects that are unreachable can
>>>> never become reachable again, except through very specific methods.
>>>>
>>>> When it sweeps the ReferenceQueue for unreachable classloaders, the VM
>>>> places the unreachable classloaders in a queue of classloaders that are
>>>> candidates for unloading.  This queue is part of the root set of the VM.
>>>>  A classloader in this queue is unreachable from the heap, and can be
>>>> unloaded when there are no objects of any class it has loaded.
>>>>
>>>> This is where my mechanism comes into play.
>>>>
>>>> If an object executes getClass() then its classloader is removed from
>>>> the unloadable classloader queue, its weak reference gets recreated  and
>>>> we're back at the initial state.  My guess is that this is a pretty
>>>> infrequent method call.
>>>>
>>>> I think this stage of the algorithm is easy in performance terms -
>>>> difficult in terms of proving correctness, but if you have an efficient
>>>> reachability mechanism for classes I think the building blocks are
>>>> there, and the subtleties are nothing that a talented engineer can't solve.
>>>>
>>>>
>>>> I'm not 100% sure what your counter-proposal is: I recall 2 approaches
>>>> from the mailing list:
>>>> 1) Each object has an additional word in its header that points back to
>>>>    its j.l.Class object, and we proceed from here.
>>>>
>>>> Given that the mean object size is ~28 bytes, this proposal adds 14% to
>>>> each object size.  This increases the frequency of GC by 14% and incurs
>>>> a 14% slowdown.  Of course this is an oversimplification but a 14%
>>>> slowdown is a pretty lousy starting point to argue from.
>>>>
>>>> 2) The existing pointer in the GC header is traced during GC time.
>>>>
>>>> The average number of pointers per object (excluding the vtable) is
>>>> between 1.5 and 2 for the majority of benchmarks I have looked at
>>>> (footnote: if you know something different, drop me a line) (geometric
>>>> mean 1.78 for {specJVM, pseudoJBB and DaCapo 20051009}).  Tracing one
>>>> additional reference per object will therefore increase the cost of GC
>>>> by ~60% on average.  Again oversimplification but indicative.  If we
>>>> assume that GC accounts for 10% of runtime (more or less depending on
>>>> heap size), this is a runtime overhead of 6%.
>>>>
>>>> My proposal has been measured at ~1% overhead in GC time, or 0.1% in
>>>> execution time (caveats as above).  If there is some complexity in
>>>> establishing classloader reachability from this basis, I would assume it
>>>> can easily be absorbed.
>>>>
>>>> Therefore I think my proposal, while not complete, can form the basis of
>>>> an efficient complete system for class unloading.
>>>>
>>>> (PS: I'd *love* to be proven wrong)
>>>>
>>>> cheers,
>>>> Robin
>>>>
>>>>
>>>>> Regards,
>>>>> Robin
>>>>>
>>>>>
>>>>>> Aleksey.
>>>>>>
>>>>>>
>>>>>> On 11/8/06, Robin Garner <ro...@anu.edu.au> wrote:
>>>>>>
>>>>>>> Pavel Pervov wrote:
>>>>>>>
>>>>>>>> Robin,
>>>>>>>>
>>>>>>>> The kind of model I had in mind was along the lines of:
>>>>>>>>
>>>>>>>>> - VM maintains a linked list (or other collection type) of the
>>>>>>> currently
>>>>>>>
>>>>>>>>> loaded classloaders, each of which in turn maintains the
>>>>>>> collection of
>>>>>>>
>>>>>>>>> classes loaded by that type.  The sweep of classloaders goes
>>>>>>> something
>>>>>>>
>>>>>>>>> like:
>>>>>>>>>
>>>>>>>>> for (ClassLoader cl : classLoaders)
>>>>>>>>>  for (Class c : cl.classes)
>>>>>>>>>    cl.reachable |= c.vtable.reachable
>>>>>>>>
>>>>>>>> This is not enough. There may be live j/l/Class'es and
>>>>>>>> j/l/Classloader's
>>>>>>>> in the heap. Even though no objects of any classes loaded by a
>>>>>>> particular
>>>>>>>
>>>>>>>> class loader are available in the heap, if we have live reference to
>>>>>>>> j/l/ClassLoader itself, it just can't be unloaded.
>>>>>>> OK, well how about keeping a weak reference to the j.l.ClassLoader
>>>>>>> object instead of a strong one.  When the reference becomes (strong)ly
>>>>>>> unreachable, invoke the class-unloading phase.
>>>>>>>
>>>>>>> To me the key issue from a performance POV is the reachability of
>>>>>>> classes from objects in the heap.  I don't pretend to have an answer to
>>>>>>> the other questions---the performance critical one is the one I have
>>>>>>> addressed, and I accept there may be many solutions to this part of the
>>>>>>> question.
>>>>>>>
>>>>>>>
>>>>>>>> I believe that a separate heap trace pass, different from the standard
>>>>>>>>
>>>>>>>>> GC, that visited vtables and reachable resources from there would
>>>>>>> also
>>>>>>>
>>>>>>>>> be a viable solution.  As mentioned in an earlier post, writing
>>>>>>> this in
>>>>>>>
>>>>>>>
>>>>>>>>> MMTk (where a heap trace operation is a class that you can easily
>>>>>>>>> subtype to do this) would be easy.
>>>>>>>>>
>>>>>>>>> One of the advantages of my other proposal is that it can be
>>>>>>> implemented
>>>>>>>
>>>>>>>>> in the VM independent of the GC to some extent.  This additional
>>>>>>>>> mark/scan phase may or may not be easy to implement, depending on the
>>>>>>>>> structure of DRLVM GCs, which is something I haven't explored.
>>>>>>>>
>>>>>>>> DRLVM may work with (potentially) any number of GCs. Designing class
>>>>>>>> unloading the way, which would require mark&scan cooperation from
>>>>>>> GC, is
>>>>>>>
>>>>>>>> not
>>>>>>>> generally a good idea (from my HPOV).
>>>>>>> That's what I gathered.  hence my proposal.
>>
>>
> 


-- 
Robin Garner
Dept. of Computer Science
Australian National University

Re: [drlvm] Class unloading support - tested one approach

Posted by Robin Garner <ro...@anu.edu.au>.

Etienne Gagnon wrote:
> I was making it more complex than it needs...
> 
> Here's an improvement...
> 
> 1- During normal operation, the VM keeps hard references to all class
> loader instances.  [This prevents any premature class loader death].
> 
> 2- At the start of an epoch (or just before), all vtable bits (or byte
> or word) are cleared. [From now on, I will use the "bit" terminology for
> simplicity.  The bit may reside in an otherwise unused byte or even
> word, for efficiency purpose].
> 
> 3- The end of an epoch happens "no sooner" than when all generations /
> heap parts have been collected at least once since the epoch start.
> [One can cheat and visit objects of uncollected parts/generations to
> mark their vtables].
> 
> 4- An "old generation" collection is chosen as the end of an epoch.
> This is "end of epoch collection".  [As class loaders/classes are likely
> to have moved to older generations, there's no point trying to kill them
> in young collections].

In fact classes and classloaders would be perfect targets for pretenuring.

> 5- Just before starting the "end of epoch collection", all the
> class-loader vtable lists are visited (and bits are cleared in preparation
> for the next epoch).  All vm references to [candidate] class loaders with
> no surviving objects (nor active methods) (e.g. no vtable bit set) are
> made "weak".
> 
> 6- The "end of epoch collection" is launched.
> 
> 7- There's actually no need for "rescuing" class loaders.  The vm
> reference to any surviving [candidate] class loader is made hard again.
>    Interesting fact: other candidate class loaders cannot have any
> instance (nor any active method) as GC doesn't create instances nor
> method calls.  So, there's no need for a rescuing dance!  The list of
> dying class loaders can be used for freeing related native resources.
> 
> IMO: simple, clean, efficient...

It has the downside of being inherently 'stop the world', though.  I 
don't see this as being a big disadvantage, because it shouldn't be hard 
(compared to the work of building a concurrent collector in the first 
place) to extend to a concurrent class-unloader.
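
The quoted steps can be walked through in a small, deterministic Java model. This is a sketch under heavy assumptions: `LoaderSketch` and `EpochSketch` are hypothetical, the collector is replaced by a predicate that says whether a weakened class loader object was still reachable from the heap, and active method frames are not modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical model of a class loader as seen by the VM. */
final class LoaderSketch {
    final String name;
    final boolean[] vtableBits;    // one mark per class (vtable) loaded by this loader
    boolean heldStrongly = true;   // step 1: hard reference during normal operation

    LoaderSketch(String name, int classCount) {
        this.name = name;
        this.vtableBits = new boolean[classCount];
    }

    boolean anyBitSet() {
        for (boolean b : vtableBits) {
            if (b) return true;
        }
        return false;
    }
}

final class EpochSketch {
    /** Step 2: clear every vtable bit at the start of an epoch. */
    static void startEpoch(List<LoaderSketch> loaders) {
        for (LoaderSketch l : loaders) Arrays.fill(l.vtableBits, false);
    }

    /** Step 5: weaken loaders with no bit set, and clear bits for the next epoch. */
    static List<LoaderSketch> weakenCandidates(List<LoaderSketch> loaders) {
        List<LoaderSketch> candidates = new ArrayList<>();
        for (LoaderSketch l : loaders) {
            if (!l.anyBitSet()) {
                l.heldStrongly = false;
                candidates.add(l);
            }
            Arrays.fill(l.vtableBits, false);
        }
        return candidates;
    }

    /** Step 7: surviving candidates become hard again; the rest can be freed. */
    static List<LoaderSketch> finishEpoch(List<LoaderSketch> candidates,
                                          Predicate<LoaderSketch> survivedCollection) {
        List<LoaderSketch> dead = new ArrayList<>();
        for (LoaderSketch l : candidates) {
            if (survivedCollection.test(l)) {
                l.heldStrongly = true;
            } else {
                dead.add(l);   // native resources of these loaders can be released
            }
        }
        return dead;
    }
}
```

In this model the vtable bits get set between `startEpoch` and `weakenCandidates` by whatever collections run during the epoch; the end-of-epoch collection (step 6) sits between `weakenCandidates` and `finishEpoch`.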

> Etienne
> 
> Etienne Gagnon wrote:
>>  If it does, then can somebody explain to me what's wrong with
>>  my proposal of setting, in normal operation, a "hard" reference to
>>  class loader objects, and then temporarily using weak, rescuable
>>  reference to enable class loader collection?  I don't see a performance
>>  hog there.  Rescuing a few class loaders (if any) and their related
>>  classes once per epoch shouldn't cost much!  I have yet to see a
>>  convincing explanation of how continuous collection of "object-vtables"
>>  would be more efficient...
>>
>>  Really, even with Robin's proposal, this would work.  If a class loader
>>  gets into an unloadable state, then most likely, the class loader and
>>  all classes it has loaded will have migrated to an old generation.  So,
>>  as long as we set the end of a class unloading epoch at an old
>>  generation collection, then we can simply "weaken" the class loader
>>  reference during collection (only when the bits of all related vtables
>>  are unset), then apply the finalization-like rescue dance to class
>>  loaders.
>>
>>  [*]This wouldn't affect any other operation, during other GC cycles, as
>>  Robin's unconditional bit/byte/word vtable write only serves to tell us
>>  whether a class loader has had living instances of its classes during
>>  the epoch.
> 


-- 
Robin Garner
Dept. of Computer Science
Australian National University

Re: [drlvm] Class unloading support - tested one approach

Posted by Etienne Gagnon <eg...@sablevm.org>.

Note:  For preventing collection of class loaders related to active
method frames, there are various solutions.  One could simply walk all
method frame stacks just before the end of epoch collection (my
preferred approach) and mark the bit of related vtables.  Another
approach would be to add an unconditional write on every method call
(that would be a big tax to pay!).  I'll let you imagine all the
variations on that theme. :-)
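
A minimal sketch of the stack-walking variant described in this note, assuming a simplified model in which each frame knows the vtable of the class declaring its method. The names `FrameWalkSketch`, `VTable`, `Frame` and `markActiveFrames` are hypothetical and do not come from drlvm or SableVM.

```java
import java.util.List;

// Hypothetical model: walk all frame stacks just before the end-of-epoch
// collection and set the vtable bit of every class with an active method.
final class FrameWalkSketch {
    static final class VTable {
        final String className;
        boolean liveInEpoch;               // the per-vtable "bit"
        VTable(String className) { this.className = className; }
    }

    static final class Frame {
        final VTable declaringClass;       // class whose method is executing in this frame
        Frame(VTable declaringClass) { this.declaringClass = declaringClass; }
    }

    /** Marks the vtable of every active frame; returns how many bits were newly set. */
    static int markActiveFrames(List<List<Frame>> threadStacks) {
        int newlyMarked = 0;
        for (List<Frame> stack : threadStacks) {
            for (Frame frame : stack) {
                if (!frame.declaringClass.liveInEpoch) {
                    frame.declaringClass.liveInEpoch = true;
                    newlyMarked++;
                }
            }
        }
        return newlyMarked;
    }
}
```

The walk costs one pass over the frames per epoch, whereas the alternative of a write on every method call would be paid on the hot path of normal execution.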

Etienne

Etienne Gagnon wrote:
> I was making it more complex than it needs...
> 
> Here's an improvement...
> 

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support - tested one approach

Posted by Etienne Gagnon <eg...@sablevm.org>.

I was making it more complex than it needs...

Here's an improvement...

1- During normal operation, the VM keeps hard references to all class
loader instances.  [This prevents any premature class loader death].

2- At the start of an epoch (or just before), all vtable bits (or byte
or word) are cleared. [From now on, I will use the "bit" terminology for
simplicity.  The bit may reside in an otherwise unused byte or even
word, for efficiency purposes].

3- The end of an epoch happens "no sooner" than when all generations /
heap parts have been collected at least once since the epoch start.
[One can cheat and visit objects of uncollected parts/generations to
mark their vtables].

4- An "old generation" collection is chosen as the end of an epoch.
This is "end of epoch collection".  [As class loaders/classes are likely
to have moved to older generations, there's no point trying to kill them
in young collections].

5- Just before starting the "end of epoch collection", all the
class-loader vtable lists are visited (and bits are cleared in preparation
for the next epoch).  All vm references to [candidate] class loaders with
no surviving objects (nor active methods) (i.e. no vtable bit set) are
made "weak".

6- The "end of epoch collection" is launched.

7- There's actually no need for "rescuing" class loaders.  The vm
reference to any surviving [candidate] class loader is made hard again.
   Interesting fact: other candidate class loaders cannot have any
instance (nor any active method) as GC doesn't create instances nor
method calls.  So, there's no need for a rescuing dance!  The list of
dying class loaders can be used for freeing related native resources.
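
The seven steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: `EpochSketch`, `Loader`, `VTable` and the `reachableFromHeap` flag are hypothetical names, not drlvm structures, and the "collection" is faked by consulting that flag instead of tracing a real heap.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the epoch scheme; none of these names exist in drlvm.
final class EpochSketch {
    static final class VTable {
        boolean bit;                        // set by the GC when it meets a live instance
    }

    static final class Loader {
        final String name;
        final List<VTable> vtables = new ArrayList<>();
        boolean hardRef = true;             // step 1: the VM normally holds a hard reference
        boolean reachableFromHeap;          // stand-in for what a real trace would discover

        Loader(String name, boolean reachableFromHeap) {
            this.name = name;
            this.reachableFromHeap = reachableFromHeap;
        }

        VTable defineClass() {
            VTable v = new VTable();
            vtables.add(v);
            return v;
        }
    }

    /** Step 2: clear every vtable bit at the start of an epoch. */
    static void startEpoch(List<Loader> loaders) {
        for (Loader l : loaders) {
            for (VTable v : l.vtables) {
                v.bit = false;
            }
        }
    }

    /** Steps 5 to 7: returns the loaders that died in the end-of-epoch collection. */
    static List<Loader> endEpoch(List<Loader> loaders) {
        // Step 5: visit the vtable lists, clear the bits, weaken loaders with no bit set.
        for (Loader l : loaders) {
            boolean anyBitSet = false;
            for (VTable v : l.vtables) {
                anyBitSet |= v.bit;
                v.bit = false;
            }
            l.hardRef = anyBitSet;
        }
        // Step 6: the (simulated) collection; a weakly held loader survives only
        // if something in the heap still reaches it.
        List<Loader> dying = new ArrayList<>();
        for (Loader l : loaders) {
            if (!l.hardRef && !l.reachableFromHeap) {
                dying.add(l);
            }
        }
        loaders.removeAll(dying);
        // Step 7: references to surviving loaders become hard again.
        for (Loader l : loaders) {
            l.hardRef = true;
        }
        return dying;                       // native resources of these can be freed
    }
}
```

In this model a loader dies only when both conditions hold: no vtable bit was set during the epoch, and nothing in the heap reaches the loader during the final collection. No separate rescue pass appears anywhere, which is the point of step 7.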

IMO: simple, clean, efficient...

Etienne

Etienne Gagnon wrote:
>  If it does, then can somebody explain to me what's wrong with
>  my proposal of setting, in normal operation, a "hard" reference to
>  class loader objects, and then temporarily using weak, rescuable
>  reference to enable class loader collection?  I don't see a performance
>  hog there.  Rescuing a few class loaders (if any) and their related
>  classes once per epoch shouldn't cost much!  I have yet to see a
>  convincing explanation of how continuous collection of "object-vtables"
>  would be more efficient...
> 
>  Really, even with Robin's proposal, this would work.  If a class loader
>  gets into an unloadable state, then most likely, the class loader and
>  all classes it has loaded will have migrated to an old generation.  So,
>  as long as we set the end of a class unloading epoch at an old
>  generation collection, then we can simply "weaken" the class loader
>  reference during collection (only when the bits of all related vtables
>  are unset), then apply the finalization-like rescue dance to class
>  loaders.
> 
>  [*]This wouldn't affect any other operation, during other GC cycles, as
>  Robin's unconditional bit/byte/word vtable write only serves to tell us
>  whether a class loader has had living instances of its classes during
>  the epoch.

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support - tested one approach

Posted by Etienne Gagnon <eg...@sablevm.org>.

[First, let me say that, as I am not contributing a class unloading
*implementation* to drlvm, I would understand if the project were more
inclined to choose an actually contributed piece of code over a design
without contributed implementation. :-)]

There was a "-1" vote...  Hmmm...  As I voted "+1", I will enter the
arena... :-)

Before getting further into this class unloading discussion, I would like
to get some clarifications about 3 things:

1- How does drlvm implement object finalization?  Doesn't this require
some "object" rescuing, much like my proposal for correctly
implementing class unloading (the class loader rescuing thing)?

 If it does, then can somebody explain to me what's wrong with
 my proposal of setting, in normal operation, a "hard" reference to
 class loader objects, and then temporarily using weak, rescuable
 reference to enable class loader collection?  I don't see a performance
 hog there.  Rescuing a few class loaders (if any) and their related
 classes once per epoch shouldn't cost much!  I have yet to see a
 convincing explanation of how continuous collection of "object-vtables"
 would be more efficient...

 Really, even with Robin's proposal, this would work.  If a class loader
 gets into an unloadable state, then most likely, the class loader and
 all classes it has loaded will have migrated to an old generation.  So,
 as long as we set the end of a class unloading epoch at an old
 generation collection, then we can simply "weaken" the class loader
 reference during collection (only when the bits of all related vtables
 are unset), then apply the finalization-like rescue dance to class
 loaders.

 [*]This wouldn't affect any other operation, during other GC cycles, as
 Robin's unconditional bit/byte/word vtable write only serves to tell us
 whether a class loader has had living instances of its classes during
 the epoch.
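
 To make the "unconditional write" concrete, here is a minimal sketch of a mark phase that stores to the vtable of every object it visits. It assumes a toy object graph; `MarkSketch`, `Obj`, `VTable` and `liveInEpoch` are hypothetical names and say nothing about how any drlvm collector is actually structured.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical toy collector: marking an object also flags its vtable.
final class MarkSketch {
    static final class VTable {
        boolean liveInEpoch;               // the bit/byte/word written by the collector
    }

    static final class Obj {
        final VTable vtable;
        final List<Obj> fields = new ArrayList<>();
        boolean marked;
        Obj(VTable vtable) { this.vtable = vtable; }
    }

    /** Marks everything reachable from the roots; returns the number of marked objects. */
    static int mark(List<Obj> roots) {
        Deque<Obj> work = new ArrayDeque<>(roots);
        int count = 0;
        while (!work.isEmpty()) {
            Obj o = work.pop();
            if (o.marked) {
                continue;                  // already visited (handles cycles)
            }
            o.marked = true;
            o.vtable.liveInEpoch = true;   // plain store: no read, no test, no branch
            count++;
            for (Obj f : o.fields) {
                work.push(f);
            }
        }
        return count;
    }
}
```

 The store is the only addition to an otherwise ordinary marking loop, and the vtable is never traced as a reference, so the shape of the traversal is unchanged.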

2- I would like to read some "full" description of the object-vtable
proposal.  In particular, how does this proposal deal with the following:
 a) Preventing unloading of a class which has an active static method.
 b) Preventing unloading of a class which has an unsynchronized instance
method active, which has overwritten local variable 0 (or for which the
liveness analysis has detected the death of the reference to "this").

3- Why would it be so hard to add an unconditional write operation
during collection (e.g. during copying or marking of an object) in
drlvm?  A detailed technical explanation is welcome. :-)

  So far, this latest point (3-) seems the sole argument in favor of
  using the "object-vtables" approach.  Wouldn't the right fix, if
  it's currently really impossible to implement an unconditional
  write, be to extend the drlvm GC interface?  Isn't this a design
  deficiency in the GC interface?  No other argument, so far, seems
  to be in favor of the object vtable approach, unless I missed some.

As for Robin's attempt to deal with weak/hard reference to the class
loader object using a Reference Queue, I am not yet convinced of the
correctness of the approach...  [off the top of my head: potential
problem with static synchronized methods].  And, this would probably be
more work-intensive (so, less efficient) than my proposal in 1-[*]
above.  It might also be tricky to identify all possible situations such
as Object.getClass() where some special code is needed to deal with a
problem situation.  I prefer clean, scope-limited code for dealing with
class unloading.  It's easier, that way, to activate or deactivate class
unloading [dynamically!].
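
For readers unfamiliar with the mechanism under discussion, here is a rough sketch of tracking class loaders through the standard `java.lang.ref` API. It is an application-level analogy only, not Robin's VM-internal design: a standard `WeakReference` is cleared before it is enqueued, so the loader cannot be brought back to life from the queue, which is exactly the step a VM would have to handle specially. `LoaderQueueSketch`, `LoaderRef` and `drainCandidates` are hypothetical names.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: weak references to loaders plus a queue of unload candidates.
final class LoaderQueueSketch {
    /** Weak reference that remembers which loader it tracked once the referent is gone. */
    static final class LoaderRef extends WeakReference<ClassLoader> {
        final String loaderName;

        LoaderRef(ClassLoader loader, String loaderName, ReferenceQueue<ClassLoader> queue) {
            super(loader, queue);
            this.loaderName = loaderName;
        }
    }

    /** Drains the queue and returns the names of loaders reported as unreachable. */
    static List<String> drainCandidates(ReferenceQueue<ClassLoader> queue) {
        List<String> candidates = new ArrayList<>();
        Reference<? extends ClassLoader> ref;
        while ((ref = queue.poll()) != null) {
            // Only LoaderRef instances are registered with this queue in the sketch.
            candidates.add(((LoaderRef) ref).loaderName);
        }
        return candidates;
    }
}
```

A caller would create one `LoaderRef` per loader, keep those references strongly reachable, and call `drainCandidates` after each collection. The candidates would then still have to pass the "no live instances, no active methods" test before any native resources are freed.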

In summary, I would like to be convinced of the "completeness" and of
the "correctness" of all competing approaches.  I personally am, so far,
in favor of Robin's unconditional vtable bit/byte/word write idea, along
with an "adapted" version of my proposal for dealing with class loader
death (such as proposed in 1-[*] above).

Also, if somebody is able to find a "correctness" deficiency in my
proposal, then please let us know, so that we make sure this deficiency
is eliminated from all competing proposals.

By the way, what are the currently competing proposals?
1- object vtables
2- Robin/Gagnon proposal  (still finishing up some details ;-)
3- Is there a 3rd?

Which ones have existing implementations?  How "correct/complete" are
they?  Do we have access to some "human readable" (i.e. non-code) full
description of the algorithm?

Etienne

Robin wrote:
> On Thu, 2006-11-09 at 02:01 +0300, Ivan Volosyuk wrote:
> 
>>Robin,
>>
>>thank you for detailed description of the algorithm. IMHO, this was
>>the most complicated place of the whole story: how to have a weak
>>reference to classloader and still be able to get it alive again. This
>>shouldn't be performance critical part and is quite doable. I
>>absolutely agree with your estimations about tracing extra reference
>>per object. The approach you propose is more efficient and quite
>>elegant.
>>--
>>Ivan
> 
> 
> Thanks :)
> 
> 
>>On 11/8/06, Robin Garner <ro...@anu.edu.au> wrote:
>>
>>>Robin Garner wrote:
>>>
>>>>Aleksey Ignatenko wrote:
>>>>
>>>>>Robin.
>>>>>
>>>>>
>>>>>>OK, well how about keeping a weak reference to the j.l.ClassLoader
>>>>>>object instead of a strong one.  When the reference becomes (strong)ly
>>>>>>unreachable, invoke the class-unloading phase.
>>>>>
>>>>>
>>>>>If you have a weak reference to j.l.Classloader, GC will collect it
>>>>>(with all the corresponding j.l.Class instances) as soon as there are no
>>>>>references to j.l.Classloader and its classes. But a situation is possible
>>>>>where there are some live objects of those classes and no references to
>>>>>j.l.Classloader and the j.l.Class instances. This will lead to
>>>>>unpredictable consequences (crash, etc).
>>>>>
>>>>>I want to remind you that there are 3 mandatory conditions for class unloading:
>>>>>
>>>>>1. j.l.Classloader instance is unreachable.
>>>>>
>>>>>2. Appropriate j.l.Class instances are unreachable.
>>>>>
>>>>>3. No object of any class loaded by appropriate class loader exists.
>>>>
>>>>Let me repeat.  I offer an efficient solution to (3).  I don't purport
>>>>to have a solution to (1) and (2).
>>>
>>>Let me just add:  This is because I don't think (1) or (2) are
>>>particularly difficult from a performance point of view, although I'm
>>>happy to accept that there may still be some subtle engineering challenges.
>>>
>>>Now this is just off the top of my head, but what about this for a design:
>>>- A j.l.ClassLoader maintains a collection of each of the classes it has
>>>loaded
>>>- A j.l.Class contains a pointer to its j.l.ClassLoader
>>>- A j.l.Class maintains a collection of its vtable(s) (or a pointer if 1:1).
>>>The point of this is that a class loader and its classes are a 'self
>>>sustaining' data structure - if one element in it is reachable the whole
>>>thing is reachable.
>>>
>>>The VM maintains a weak reference to all its j.l.ClassLoader instances,
>>>and maintains a ReferenceQueue for weakly-reachable classloaders.
>>>ClassLoaders are placed on the ReferenceQueue if and only if they are
>>>unreachable from the heap (including via their j.l.Class objects).  Note
>>>this is an irreversible condition: objects that are unreachable can
>>>never become reachable again, except through very specific methods.
>>>
>>>When it sweeps the ReferenceQueue for unreachable classloaders, the VM
>>>places the unreachable classloaders in a queue of classloaders that are
>>>candidates for unloading.  This queue is part of the root set of the VM.
>>>  A classloader in this queue is unreachable from the heap, and can be
>>>unloaded when there are no objects of any class it has loaded.
>>>
>>>This is where my mechanism comes into play.
>>>
>>>If an object executes getClass() then its classloader is removed from
>>>the unloadable classloader queue, its weak reference gets recreated  and
>>>we're back at the initial state.  My guess is that this is a pretty
>>>infrequent method call.
>>>
>>>I think this stage of the algorithm is easy in performance terms -
>>>difficult in terms of proving correctness, but if you have an efficient
>>>reachability mechanism for classes I think the building blocks are
>>>there, and the subtleties are nothing that a talented engineer can't solve.
>>>
>>>
>>>I'm not 100% sure what your counter-proposal is: I recall 2 approaches
>>>from the mailing list:
>>>1) Each object has an additional word in its header that points back to
>>>    its j.l.Class object, and we proceed from here.
>>>
>>>Given that the mean object size is ~28 bytes, this proposal adds 14% to
>>>each object size.  This increases the frequency of GC by 14% and incurs
>>>a 14% slowdown.  Of course this is an oversimplification but a 14%
>>>slowdown is a pretty lousy starting point to argue from.
>>>
>>>2) The existing pointer in the GC header is traced during GC time.
>>>
>>>The average number of pointers per object (excluding the vtable) is
>>>between 1.5 and 2 for the majority of benchmarks I have looked at
>>>(footnote: if you know something different, drop me a line) (geometric
>>>mean 1.78 for {specJVM, pseudoJBB and DaCapo 20051009}).  Tracing one
>>>additional reference per object will therefore increase the cost of GC
>>>by ~60% on average.  Again oversimplification but indicative.  If we
>>>assume that GC accounts for 10% of runtime (more or less depending on
>>>heap size), this is a runtime overhead of 6%.
>>>
>>>My proposal has been measured at ~1% overhead in GC time, or 0.1% in
>>>execution time (caveats as above).  If there is some complexity in
>>>establishing classloader reachability from this basis, I would assume it
>>>can easily be absorbed.
>>>
>>>Therefore I think my proposal, while not complete, can form the basis of
>>>an efficient complete system for class unloading.
>>>
>>>(PS: I'd *love* to be proven wrong)
>>>
>>>cheers,
>>>Robin
>>>
>>>
>>>>Regards,
>>>>Robin
>>>>
>>>>
>>>>>
>>>>>Aleksey.
>>>>>
>>>>>
>>>>>On 11/8/06, Robin Garner <ro...@anu.edu.au> wrote:
>>>>>
>>>>>>Pavel Pervov wrote:
>>>>>>
>>>>>>>Robin,
>>>>>>>
>>>>>>>The kind of model I had in mind was along the lines of:
>>>>>>>
>>>>>>>>- VM maintains a linked list (or other collection type) of the
>>>>>>
>>>>>>currently
>>>>>>
>>>>>>>>loaded classloaders, each of which in turn maintains the
>>>>>>
>>>>>>collection of
>>>>>>
>>>>>>>>classes loaded by that type.  The sweep of classloaders goes
>>>>>>
>>>>>>something
>>>>>>
>>>>>>>>like:
>>>>>>>>
>>>>>>>>for (ClassLoader cl : classLoaders)
>>>>>>>>  for (Class c : cl.classes)
>>>>>>>>    cl.reachable |= c.vtable.reachable
>>>>>>>
>>>>>>>
>>>>>>>This is not enough. There may be live j/l/Class'es and
>>>>>>>j/l/Classloader's in the heap. Even though no objects of any classes
>>>>>>>loaded by a particular class loader are available in the heap, if we
>>>>>>>have a live reference to j/l/ClassLoader itself, it just can't be unloaded.
>>>>>>
>>>>>>OK, well how about keeping a weak reference to the j.l.ClassLoader
>>>>>>object instead of a strong one.  When the reference becomes (strong)ly
>>>>>>unreachable, invoke the class-unloading phase.
>>>>>>
>>>>>>To me the key issue from a performance POV is the reachability of
>>>>>>classes from objects in the heap.  I don't pretend to have an answer to
>>>>>>the other questions---the performance critical one is the one I have
>>>>>>addressed, and I accept there may be many solutions to this part of the
>>>>>>question.
>>>>>>
>>>>>>
>>>>>>>I believe that a separate heap trace pass, different from the standard
>>>>>>>
>>>>>>>>GC, that visited vtables and reachable resources from there would
>>>>>>
>>>>>>also
>>>>>>
>>>>>>>>be a viable solution.  As mentioned in an earlier post, writing
>>>>>>
>>>>>>this in
>>>>>>
>>>>>>
>>>>>>>>MMTk (where a heap trace operation is a class that you can easily
>>>>>>>>subtype to do this) would be easy.
>>>>>>>>
>>>>>>>>One of the advantages of my other proposal is that it can be
>>>>>>
>>>>>>implemented
>>>>>>
>>>>>>>>in the VM independent of the GC to some extent.  This additional
>>>>>>>>mark/scan phase may or may not be easy to implement, depending on the
>>>>>>>>structure of DRLVM GCs, which is something I haven't explored.
>>>>>>>
>>>>>>>
>>>>>>>DRLVM may work with (potentially) any number of GCs. Designing class
>>>>>>>unloading the way, which would require mark&scan cooperation from
>>>>>>
>>>>>>GC, is
>>>>>>
>>>>>>>not
>>>>>>>generally a good idea (from my HPOV).
>>>>>>
>>>>>>That's what I gathered.  hence my proposal.
> 
> 
> 

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support - tested one approach

Posted by Robin <ro...@anu.edu.au>.

On Thu, 2006-11-09 at 02:01 +0300, Ivan Volosyuk wrote:
> Robin,
> 
> thank you for detailed description of the algorithm. IMHO, this was
> the most complicated place of the whole story: how to have a weak
> reference to classloader and still be able to get it alive again. This
> shouldn't be performance critical part and is quite doable. I
> absolutely agree with your estimations about tracing extra reference
> per object. The approach you propose is more efficient and quite
> elegant.
> --
> Ivan

Thanks :)

> On 11/8/06, Robin Garner <ro...@anu.edu.au> wrote:
> > Robin Garner wrote:
> > > Aleksey Ignatenko wrote:
> > >> Robin.
> > >>
> > >>> OK, well how about keeping a weak reference to the j.l.ClassLoader
> > >>> object instead of a strong one.  When the reference becomes (strong)ly
> > >>> unreachable, invoke the class-unloading phase.
> > >>
> > >>
> > >> If you have a weak reference to j.l.Classloader, GC will collect it
> > >> (with all the corresponding j.l.Class instances) as soon as there are no
> > >> references to j.l.Classloader and its classes. But a situation is possible
> > >> where there are some live objects of those classes and no references to
> > >> j.l.Classloader and the j.l.Class instances. This will lead to
> > >> unpredictable consequences (crash, etc).
> > >>
> > >> I want to remind you that there are 3 mandatory conditions for class unloading:
> > >>
> > >> 1. j.l.Classloader instance is unreachable.
> > >>
> > >> 2. Appropriate j.l.Class instances are unreachable.
> > >>
> > >> 3. No object of any class loaded by appropriate class loader exists.
> > >
> > > Let me repeat.  I offer an efficient solution to (3).  I don't purport
> > > to have a solution to (1) and (2).
> >
> > Let me just add:  This is because I don't think (1) or (2) are
> > particularly difficult from a performance point of view, although I'm
> > happy to accept that there may still be some subtle engineering challenges.
> >
> > Now this is just off the top of my head, but what about this for a design:
> > - A j.l.ClassLoader maintains a collection of each of the classes it has
> > loaded
> > - A j.l.Class contains a pointer to its j.l.ClassLoader
> > - A j.l.Class maintains a collection of its vtable(s) (or a pointer if 1:1).
> > The point of this is that a class loader and its classes are a 'self
> > sustaining' data structure - if one element in it is reachable the whole
> > thing is reachable.
> >
> > The VM maintains a weak reference to all its j.l.ClassLoader instances,
> > and maintains a ReferenceQueue for weakly-reachable classloaders.
> > ClassLoaders are placed on the ReferenceQueue if and only if they are
> > unreachable from the heap (including via their j.l.Class objects).  Note
> > this is an irreversible condition: objects that are unreachable can
> > never become reachable again, except through very specific methods.
> >
> > When it sweeps the ReferenceQueue for unreachable classloaders, the VM
> > places the unreachable classloaders in a queue of classloaders that are
> > candidates for unloading.  This queue is part of the root set of the VM.
> >   A classloader in this queue is unreachable from the heap, and can be
> > unloaded when there are no objects of any class it has loaded.
> >
> > This is where my mechanism comes into play.
> >
> > If an object executes getClass() then its classloader is removed from
> > the unloadable classloader queue, its weak reference gets recreated  and
> > we're back at the initial state.  My guess is that this is a pretty
> > infrequent method call.
> >
> > I think this stage of the algorithm is easy in performance terms -
> > difficult in terms of proving correctness, but if you have an efficient
> > reachability mechanism for classes I think the building blocks are
> > there, and the subtleties are nothing that a talented engineer can't solve.
> >
> >
> > I'm not 100% sure what your counter-proposal is: I recall 2 approaches
> > from the mailing list:
> > 1) Each object has an additional word in its header that points back to
> >     its j.l.Class object, and we proceed from here.
> >
> > Given that the mean object size is ~28 bytes, this proposal adds 14% to
> > each object size.  This increases the frequency of GC by 14% and incurs
> > a 14% slowdown.  Of course this is an oversimplification but a 14%
> > slowdown is a pretty lousy starting point to argue from.
> >
> > 2) The existing pointer in the GC header is traced during GC time.
> >
> > The average number of pointers per object (excluding the vtable) is
> > between 1.5 and 2 for the majority of benchmarks I have looked at
> > (footnote: if you know something different, drop me a line) (geometric
> > mean 1.78 for {specJVM, pseudoJBB and DaCapo 20051009}).  Tracing one
> > additional reference per object will therefore increase the cost of GC
> > by ~60% on average.  Again oversimplification but indicative.  If we
> > assume that GC accounts for 10% of runtime (more or less depending on
> > heap size), this is a runtime overhead of 6%.
> >
> > My proposal has been measured at ~1% overhead in GC time, or 0.1% in
> > execution time (caveats as above).  If there is some complexity in
> > establishing classloader reachability from this basis, I would assume it
> > can easily be absorbed.
> >
> > Therefore I think my proposal, while not complete, can form the basis of
> > an efficient complete system for class unloading.
> >
> > (PS: I'd *love* to be proven wrong)
> >
> > cheers,
> > Robin
> >
> > > Regards,
> > > Robin
> > >
> > >>
> > >>
> > >> Aleksey.
> > >>
> > >>
> > >> On 11/8/06, Robin Garner <ro...@anu.edu.au> wrote:
> > >>>
> > >>> Pavel Pervov wrote:
> > >>> > Robin,
> > >>> >
> > >>> > The kind of model I had in mind was along the lines of:
> > >>> >> - VM maintains a linked list (or other collection type) of the
> > >>> currently
> > >>> >> loaded classloaders, each of which in turn maintains the
> > >>> collection of
> > >>> >> classes loaded by that type.  The sweep of classloaders goes
> > >>> something
> > >>> >> like:
> > >>> >>
> > >>> >> for (ClassLoader cl : classLoaders)
> > >>> >>   for (Class c : cl.classes)
> > >>> >>     cl.reachable |= c.vtable.reachable
> > >>> >
> > >>> >
> > >>> > This is not enough. There may be live j/l/Class'es and
> > >>> > j/l/Classloader's in the heap. Even though no objects of any classes
> > >>> > loaded by a particular class loader are available in the heap, if we
> > >>> > have a live reference to j/l/ClassLoader itself, it just can't be unloaded.
> > >>>
> > >>> OK, well how about keeping a weak reference to the j.l.ClassLoader
> > >>> object instead of a strong one.  When the reference becomes (strong)ly
> > >>> unreachable, invoke the class-unloading phase.
> > >>>
> > >>> To me the key issue from a performance POV is the reachability of
> > >>> classes from objects in the heap.  I don't pretend to have an answer to
> > >>> the other questions---the performance critical one is the one I have
> > >>> addressed, and I accept there may be many solutions to this part of the
> > >>> question.
> > >>>
> > >>> > I believe that a separate heap trace pass, different from the standard
> > >>> >> GC, that visited vtables and reachable resources from there would
> > >>> also
> > >>> >> be a viable solution.  As mentioned in an earlier post, writing
> > >>> this in
> > >>>
> > >>> >> MMTk (where a heap trace operation is a class that you can easily
> > >>> >> subtype to do this) would be easy.
> > >>> >>
> > >>> >> One of the advantages of my other proposal is that it can be
> > >>> implemented
> > >>> >> in the VM independent of the GC to some extent.  This additional
> > >>> >> mark/scan phase may or may not be easy to implement, depending on the
> > >>> >> structure of DRLVM GCs, which is something I haven't explored.
> > >>> >
> > >>> >
> > >>> > DRLVM may work with (potentially) any number of GCs. Designing class
> > >>> > unloading the way, which would require mark&scan cooperation from
> > >>> GC, is
> > >>> > not
> > >>> > generally a good idea (from my HPOV).
> > >>>
> > >>> That's what I gathered.  hence my proposal.

Re: [drlvm] Class unloading support - tested one approach

Posted by Ivan Volosyuk <iv...@gmail.com>.

Robin,

thank you for detailed description of the algorithm. IMHO, this was
the most complicated place of the whole story: how to have a weak
reference to classloader and still be able to get it alive again. This
shouldn't be performance critical part and is quite doable. I
absolutely agree with your estimations about tracing extra reference
per object. The approach you propose is more efficient and quite
elegant.
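
The estimates referred to here (from Robin's message quoted below) can be reproduced with a few lines of arithmetic. This is a back-of-the-envelope sketch: the 4-byte header word is an assumption implied by "14% of ~28 bytes", the other inputs are the figures given in the quoted message, and `OverheadEstimate` and its method names are hypothetical.

```java
// Hypothetical helper reproducing the rough overhead estimates from the thread.
final class OverheadEstimate {
    /** Relative size growth from adding one header word to an object of the given mean size. */
    static double headerWordOverhead(double meanObjectBytes, double wordBytes) {
        return wordBytes / meanObjectBytes;
    }

    /** Relative growth in tracing work when one extra reference is traced per object. */
    static double extraTraceOverhead(double meanPointersPerObject) {
        return 1.0 / meanPointersPerObject;
    }

    /** Whole-program overhead, given a GC overhead and the fraction of runtime spent in GC. */
    static double runtimeOverhead(double gcOverhead, double gcFractionOfRuntime) {
        return gcOverhead * gcFractionOfRuntime;
    }

    public static void main(String[] args) {
        double header = headerWordOverhead(28.0, 4.0);       // assumes a 4-byte word
        double trace = extraTraceOverhead(1.78);              // geometric mean from the thread
        System.out.printf("extra header word: %.1f%%%n", 100 * header);
        System.out.printf("extra traced reference, GC cost: %.1f%%%n", 100 * trace);
        System.out.printf("extra traced reference, runtime: %.1f%%%n",
                100 * runtimeOverhead(trace, 0.10));
        System.out.printf("vtable write, runtime: %.1f%%%n",
                100 * runtimeOverhead(0.01, 0.10));
    }
}
```

With these inputs the extra header word comes to about 14.3%, the extra traced reference to about 56.2% of GC cost (rounded to ~60% in the thread) or about 5.6% of runtime, and the measured ~1% GC overhead of the vtable write to about 0.1% of runtime.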
--
Ivan

On 11/8/06, Robin Garner <ro...@anu.edu.au> wrote:
> Robin Garner wrote:
> > Aleksey Ignatenko wrote:
> >> Robin.
> >>
> >>> OK, well how about keeping a weak reference to the j.l.ClassLoader
> >>> object instead of a strong one.  When the reference becomes (strong)ly
> >>> unreachable, invoke the class-unloading phase.
> >>
> >>
> >> If you have a weak reference to j.l.Classloader, GC will collect it
> >> (with all the corresponding j.l.Class instances) as soon as there are no
> >> references to j.l.Classloader and its classes. But a situation is possible
> >> where there are some live objects of those classes and no references to
> >> j.l.Classloader and the j.l.Class instances. This will lead to
> >> unpredictable consequences (crash, etc).
> >>
> >> I want to remind you that there are 3 mandatory conditions for class unloading:
> >>
> >> 1. j.l.Classloader instance is unreachable.
> >>
> >> 2. Appropriate j.l.Class instances are unreachable.
> >>
> >> 3. No object of any class loaded by appropriate class loader exists.
> >
> > Let me repeat.  I offer an efficient solution to (3).  I don't purport
> > to have a solution to (1) and (2).
>
> Let me just add:  This is because I don't think (1) or (2) are
> particularly difficult from a performance point of view, although I'm
> happy to accept that there may still be some subtle engineering challenges.
>
> Now this is just off the top of my head, but what about this for a design:
> - A j.l.ClassLoader maintains a collection of each of the classes it has
> loaded
> - A j.l.Class contains a pointer to its j.l.ClassLoader
> - A j.l.Class maintains a collection of its vtable(s) (or a pointer if 1:1).
> The point of this is that a class loader and its classes are a 'self
> sustaining' data structure - if one element in it is reachable the whole
> thing is reachable.
>
> The VM maintains a weak reference to all its j.l.ClassLoader instances,
> and maintains a ReferenceQueue for weakly-reachable classloaders.
> ClassLoaders are placed on the ReferenceQueue if and only if they are
> unreachable from the heap (including via their j.l.Class objects).  Note
> this is an irreversible condition: objects that are unreachable can
> never become reachable again, except through very specific methods.
>
> When it sweeps the ReferenceQueue for unreachable classloaders, the VM
> places the unreachable classloaders in a queue of classloaders that are
> candidates for unloading.  This queue is part of the root set of the VM.
>   A classloader in this queue is unreachable from the heap, and can be
> unloaded when there are no objects of any class it has loaded.
>
> This is where my mechanism comes into play.
>
> If an object executes getClass() then its classloader is removed from
> the unloadable classloader queue, its weak reference gets recreated  and
> we're back at the initial state.  My guess is that this is a pretty
> infrequent method call.
>
> I think this stage of the algorithm is easy in performance terms -
> difficult in terms of proving correctness, but if you have an efficient
> reachability mechanism for classes I think the building blocks are
> there, and the subtleties are nothing that a talented engineer can't solve.
>
>
> I'm not 100% sure what your counter-proposal is: I recall 2 approaches
> from the mailing list:
> 1) Each object has an additional word in its header that points back to
>     its j.l.Class object, and we proceed from here.
>
> Given that the mean object size is ~28 bytes, this proposal adds 14% to
> each object size.  This increases the frequency of GC by 14% and incurs
> a 14% slowdown.  Of course this is an oversimplification but a 14%
> slowdown is a pretty lousy starting point to argue from.
>
> 2) The existing pointer in the GC header is traced during GC time.
>
> The average number of pointers per object (excluding the vtable) is
> between 1.5 and 2 for the majority of benchmarks I have looked at
> (footnote: if you know something different, drop me a line) (geometric
> mean 1.78 for {specJVM, pseudoJBB and DaCapo 20051009}).  Tracing one
> additional reference per object will therefore increase the cost of GC
> by ~60% on average.  Again oversimplification but indicative.  If we
> assume that GC accounts for 10% of runtime (more or less depending on
> heap size), this is a runtime overhead of 6%.
>
> My proposal has been measured at ~1% overhead in GC time, or 0.1% in
> execution time (caveats as above).  If there is some complexity in
> establishing classloader reachability from this basis, I would assume it
> can easily be absorbed.
>
> Therefore I think my proposal, while not complete, can form the basis of
> an efficient complete system for class unloading.
>
> (PS: I'd *love* to be proven wrong)
>
> cheers,
> Robin
>
> > Regards,
> > Robin
> >
> >>
> >>
> >> Aleksey.
> >>
> >>
> >> On 11/8/06, Robin Garner <ro...@anu.edu.au> wrote:
> >>>
> >>> Pavel Pervov wrote:
> >>> > Robin,
> >>> >
> >>> > The kind of model I had in mind was along the lines of:
> >>> >> - VM maintains a linked list (or other collection type) of the
> >>> currently
> >>> >> loaded classloaders, each of which in turn maintains the
> >>> collection of
> >>> >> classes loaded by that type.  The sweep of classloaders goes
> >>> something
> >>> >> like:
> >>> >>
> >>> >> for (ClassLoader cl : classLoaders)
> >>> >>   for (Class c : cl.classes)
> >>> >>     cl.reachable |= c.vtable.reachable
> >>> >
> >>> >
> >>> > This is not enough. There may be live j/l/Class'es and
> >>> > j/l/Classloader's
> >>> > in the heap. Even though no objects of any classes loaded by a
> >>> particular
> >>> > class loader are available in the heap, if we have a live reference to
> >>> > j/l/ClassLoader itself, it just can't be unloaded.
> >>>
> >>> OK, well how about keeping a weak reference to the j.l.ClassLoader
> >>> object instead of a strong one.  When the reference becomes (strong)ly
> >>> unreachable, invoke the class-unloading phase.
> >>>
> >>> To me the key issue from a performance POV is the reachability of
> >>> classes from objects in the heap.  I don't pretend to have an answer to
> >>> the other questions---the performance critical one is the one I have
> >>> addressed, and I accept there may be many solutions to this part of the
> >>> question.
> >>>
> >>> > I believe that a separate heap trace pass, different from the standard
> >>> >> GC, that visited vtables and reachable resources from there would
> >>> also
> >>> >> be a viable solution.  As mentioned in an earlier post, writing
> >>> this in
> >>>
> >>> >> MMTk (where a heap trace operation is a class that you can easily
> >>> >> subtype to do this) would be easy.
> >>> >>
> >>> >> One of the advantages of my other proposal is that it can be
> >>> implemented
> >>> >> in the VM independent of the GC to some extent.  This additional
> >>> >> mark/scan phase may or may not be easy to implement, depending on the
> >>> >> structure of DRLVM GCs, which is something I haven't explored.
> >>> >
> >>> >
> >>> > DRLVM may work with (potentially) any number of GCs. Designing class
> >>> > unloading in a way that would require mark&scan cooperation from
> >>> GC, is
> >>> > not
> >>> > generally a good idea (from my HPOV).
> >>>
> >>> That's what I gathered.  Hence my proposal.

Re: [drlvm] Class unloading support - tested one approach

Posted by Robin Garner <ro...@anu.edu.au>.

Etienne Gagnon wrote:
>> OK.  My latest proposal (a few messages ago) was assuming that the
>> nursery was empty when the "end of epoch collection" is launched.
>>
>> If it is not, you can do 2 things:
>>
>> a) do a minor collection to empty it, or
>>
>> b) i  - use a finalization-like list of references to class loader
>>         objects
>>    ii - launch gc, which might mark a previously unmarked vtable
>>    iii- do a finalization-like rescuing for resuscitated class loaders
>>
>> "b)" should really have a minimal performance impact.  As for its
>> "apparent complexity", I would say that this is a non-issue; similar
>> code must already exist in drlvm for implementing finalization.
> 
> Just for clarification: "b)" implies a combined "nursery + mature space"
> collection.
> 
> Actually, for the mature space part, you could get away with a smaller
> collection if you pretenure all class loaders and classes to a specific
> mature space area; then you only need to collect that space (in addition
> to the nursery).
> 
> Etienne
> 

This sounds rather 'stop the world' - while the barrier is more 
complicated I think it scales to concurrent collectors.

Also, don't forget an instance of a class in the nursery can pass a 
reference to its classloader to a mature-space object under suitably 
bizarre circumstances.  I guess you could have a write barrier on the 
class metadata space ...

... an XOR barrier could actually be an interesting solution ... but I'm 
sure it won't be necessary.

-- 
Robin Garner
Dept. of Computer Science
Australian National University
http://cs.anu.edu.au/people/Robin.Garner/

Re: [drlvm] Class unloading support - tested one approach

Posted by Etienne Gagnon <eg...@sablevm.org>.

> OK.  My latest proposal (a few messages ago) was assuming that the
> nursery was empty when the "end of epoch collection" is launched.
> 
> If it is not, you can do 2 things:
> 
> a) do a minor collection to empty it, or
> 
> b) i  - use a finalization-like list of references to class loader
>         objects
>    ii - launch gc, which might mark a previously unmarked vtable
>    iii- do a finalization-like rescuing for resuscitated class loaders
> 
> "b)" should really have a minimal performance impact.  As for its
> "apparent complexity", I would say that this is a non-issue; similar
> code must already exist in drlvm for implementing finalization.

Just for clarification: "b)" implies a combined "nursery + mature space"
collection.

Actually, for the mature space part, you could get away with a smaller
collection if you pretenure all class loaders and classes to a specific
mature space area; then you only need to collect that space (in addition
to the nursery).

Etienne

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support - tested one approach

Posted by Etienne Gagnon <eg...@sablevm.org>.

Alexey Varlamov wrote:
>> > My proposal already argued that vtable bit/byte/word marking is
>> > unnecessary for "nursery allocations".  You only need to mark the
>> vtable
>> > of objects that survive collection and pretenured objects.
> 
> 
> I may have missed it, but I only recall you argued that we just need
> to collect mature space for the *final unloading* as CL and classes
> are unlikely to die young, which I agree. But chances that a live
> object of a candidate class appeared in the nursery are higher.
> Otherwise I just do not grok how this algorithm can be proven for
> correctness.
> 

OK.  My latest proposal (a few messages ago) was assuming that the
nursery was empty when the "end of epoch collection" is launched.

If it is not, you can do 2 things:

a) do a minor collection to empty it, or

b) i  - use a finalization-like list of references to class loader
        objects
   ii - launch gc, which might mark a previously unmarked vtable
   iii- do a finalization-like rescuing for resuscitated class loaders

"b)" should really have a minimal performance impact.  As for its
"apparent complexity", I would say that this is a non-issue; similar
code must already exist in drlvm for implementing finalization.

Etienne

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support - tested one approach

Posted by Robin Garner <ro...@anu.edu.au>.

Alexey Varlamov wrote:
> 2006/11/9, Robin Garner <ro...@anu.edu.au>:
>> Etienne Gagnon wrote:
>> > Alexey Varlamov wrote:
>> >> Sorry if it was already discussed, but I believe this approach also
>> >> requires marking vtable bit/byte on each object allocation, until the
>> >> "unloading" GC pass is strictly stop-the-world full-heap collection.
>> >> Robin, did you include this particular overhead too in your
>> >> measurements?
>>
>> I didn't include it - having established that it's cheap during GC where
>> memory bandwidth is at a premium, I kind of took this for granted.
>>
>> > My proposal already argued that vtable bit/byte/word marking is
>> > unnecessary for "nursery allocations".  You only need to mark the 
>> vtable
>> > of objects that survive collection and pretenured objects.
> 
> I may have missed it, but I only recall you argued that we just need
> to collect mature space for the *final unloading* as CL and classes
> are unlikely to die young, which I agree. But chances that a live
> object of a candidate class appeared in the nursery are higher.
> Otherwise I just do not grok how this algorithm can be proven for 
> correctness.

There is definitely some kind of barrier required here.  If no
references to classes belonging to a c/l exist, but references to the
j.l.ClassLoader itself do, the classloader may get marked for collection.
Objects then get created (via reflection, in the nursery), the references to
the c/l are dropped, and the classloader unloads while it still has live
instances.

I believe a barrier in one or more of the reflective methods used to 
create objects from j.l.class/j.l.c/loader references is probably necessary.
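
To make the race above concrete, here is a minimal Python sketch of such a
barrier. It is a toy model, not DRLVM code: ToyLoader, ToyClass,
unload_candidates and reflective_new_instance are all hypothetical names, and
the "barrier" is just a hook on the reflective allocation path that pulls the
defining loader back out of the unload-candidate set.

```python
class ToyLoader:
    def __init__(self, name):
        self.name = name


class ToyClass:
    def __init__(self, name, loader):
        self.name = name
        self.loader = loader
        self.vtable_marked = False  # set once an instance is known to exist


# Loaders the VM currently believes have no live instances (hypothetical).
unload_candidates = set()


def reflective_new_instance(cls):
    """Toy stand-in for a reflective allocation with a barrier."""
    # Barrier: creating an instance makes the loader ineligible again.
    unload_candidates.discard(cls.loader)
    cls.vtable_marked = True
    return object()


def loaders_safe_to_unload():
    return set(unload_candidates)
```

Without the discard() call in reflective_new_instance, a loader that was
already queued as a candidate would still be reported as safe to unload after
the allocation, which is exactly the failure described above.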

Weak references can only be collected at the end of a reachability epoch 
in any case, so I think there may be some stronger guarantees that we 
can use, but I'm too sleepy to think of them right now :)

>> And this is a persuasive argument.  But I can probably find time to
>> measure it tomorrow if you aren't convinced.
> 
> That would be very kind, thank you.
> 
> -- 
> Alexey

-- 
Robin Garner
Dept. of Computer Science
Australian National University

Re: [drlvm] Class unloading support - tested one approach

Posted by Alexey Varlamov <al...@gmail.com>.

[snip]
> Alexey,
>
> it looks like what you are thinking about is a *concurrent* collector,
> and concurrent garbage collection brings substantial complexity
> even without class unloading.

Salikh,

You are correct. Maybe I'm running ahead of the train, but my concern
is that "scalability" of the unloading design is not the least important
criterion. The decision we make now should not strike back at us in a few
months.

> However, the design we were discussing was for *stop-the-world* garbage
> collectors, because this is the only thing currently supported by DRLVM,
> and all existing GCs are stop-the-world.

I'm kinda optimistic on gcv5 progress, feeling that concurrent
collection is not improbable to be workable before H2/2007 :)

>
> So, the correctness of unloading algorithm can easily be proved if we consider
> that the "final unloading" collection is a full heap collection,
> i.e. both nursery and mature space are collected.
Yes, things are more or less clear for the case of STW GC, so we can
concentrate on drafting a more detailed technical proposal...

[skip]

Re: [drlvm] Class unloading support - tested one approach

Posted by Etienne Gagnon <eg...@sablevm.org>.

Etienne Gagnon wrote:
> Salikh Zakirov wrote:
> 
>>I have another concern though: 
>>just before starting "final unloading" collection, we scan vtable marks and identify
>>the candidates for unloading. During the "final unloading" collection, the
>>candidate classloader roots are reported as weak. At the end of the trace,
>>we need to rescan vtable marks and "revive" the classloaders which were found
>>in possession of live objects. This technique is exactly the same as the one
>>used for object finalization. 
>>
>>However, in contrast with finalization, we will need to repeat reviving
>>classloaders which have non-0 vtable marks until the process converges, and no
>>new classloaders are revived. (* in finalization, both dead and live objects in finalization
>>queue are revived, and thus the revival converges in just 1 step *).

If you choose the finalization-like + revival way, then I don't see
any significant performance hit of multiple-step convergence!  For one
thing, you'll probably agree with me that it is quite unlikely to take
more than 1 step to converge in most cases, and the additional work in
the other cases is still quite insignificant relative to the remaining
collection work!

Etienne

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support - tested one approach

Posted by Salikh Zakirov <Sa...@Intel.com>.

Etienne Gagnon wrote:
>>   3) trace the heap 
>>   4) scan vtable marks and "revive" marked classloaders, by adding the strong root
>>      from the previously collected "unload list". Remove the revived classloaders from the unload list.
>>   5) repeat steps (3) and (4) until there are no classloaders to revive
> 
> As long as it is understood that the repeated (3) is not a full trace.
> It's only a trace starting from the revived roots.  [This is important
> in evaluating the total work done].

Exactly.

Re: [drlvm] Class unloading support - tested one approach

Posted by Etienne Gagnon <eg...@sablevm.org>.

Salikh Zakirov wrote:
> Ah, I think I got it.

Yep.

>   3) trace the heap 
>   4) scan vtable marks and "revive" marked classloaders, by adding the strong root
>      from the previously collected "unload list". Remove the revived classloaders from the unload list.
>   5) repeat steps (3) and (4) until there are no classloaders to revive

As long as it is understood that the repeated (3) is not a full trace.
It's only a trace starting from the revived roots.  [This is important
in evaluating the total work done].

> The voting definitely was premature, as it turns out that even the design under discussion
> can be elaborated to multiple substantially different designs.

Yes, you're right.

Etienne
-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support - tested one approach

Posted by Salikh Zakirov <Sa...@Intel.com>.

Etienne Gagnon wrote:
> Salikh Zakirov wrote:
>>   7) let the GC finish collection and reclaim unreachable objects -- this reclaims java objects
> 
> Just a bit of a warning...  This should be integrated within the
> weak/soft/phantom + finalization framework.  We definitely don't want
> the native resources of a class loader to be freed and later have
> finalization revive the class loader...  :-)

Agreed. "Revival" of classloaders should be done after "revival"
of objects in finalization queue.

I think this scheme can be implemented by introducing one additional GC->VM callback (vm_trace_complete),
which would be called right after GC completed the trace. The call sequence will be as follows:

       GC                                                        VM
        |---------------------------------------> vm_enumerate_root_set()
        | gc_add_root_set_entry()<-------------------------------|
        | gc_add_root_set_entry()<-------------------------------|
        | gc_add_root_set_entry()<-------------------------------|
        |<- - - - - - - - - - -return from vm_enumerate_root_set()
        |                                                        |
   [trace heap]                                                  |
        |                                                        |
        |---------------------------------------> vm_trace_complete()
        | gc_add_root_set_entry()<-------------------------------|
        | gc_add_root_set_entry()<-------------------------------|
        |< - - - - - - - - - - - return from vm_trace_complete()-|
        |                                                        |
   [trace heap from new roots,                                   |
     if there are any]                                           |
        |---------------------------------------> vm_trace_complete()
        |< - - - - - - - - - - - return from vm_trace_complete()-|
        |
   [no retrace, as no new roots were received]
        |
   [reclaim space]
        |

Additionally, even finalization itself can be moved out of GC responsibility,
using this interface and one additional function to query if the object
was already reached or not.
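
The call sequence above can be sketched as a small Python model. This is only
an illustration under stated assumptions: ToyVM and ToyGC are hypothetical
classes, the heap is a plain dict, and is_marked() stands in for the
"additional function to query if the object was already reached". Only the
names vm_enumerate_root_set, vm_trace_complete and gc_add_root_set_entry come
from the proposal; everything else is made up for the sketch.

```python
class ToyVM:
    """Hypothetical VM side: hands out extra roots after a trace."""

    def __init__(self, roots, late_roots):
        self.roots = list(roots)
        self.late_roots = list(late_roots)  # e.g. revived class loaders

    def enumerate_root_set(self, gc):
        for r in self.roots:
            gc.add_root_set_entry(r)

    def trace_complete(self, gc):
        # Report only roots the GC has not reached yet; later calls add none.
        pending = [r for r in self.late_roots if not gc.is_marked(r)]
        self.late_roots = []
        for r in pending:
            gc.add_root_set_entry(r)


class ToyGC:
    def __init__(self, graph):
        self.graph = graph      # object -> list of referenced objects
        self.marked = set()
        self.new_roots = []
        self.log = []

    def add_root_set_entry(self, obj):
        self.new_roots.append(obj)

    def is_marked(self, obj):
        return obj in self.marked

    def _trace(self):
        work, self.new_roots = self.new_roots, []
        while work:
            obj = work.pop()
            if obj not in self.marked:
                self.marked.add(obj)
                work.extend(self.graph.get(obj, []))

    def collect(self, vm):
        vm.enumerate_root_set(self)
        self.log.append("enumerate")
        self._trace()
        while True:
            vm.trace_complete(self)
            self.log.append("trace_complete")
            if not self.new_roots:
                break           # no new roots were received: no retrace
            self._trace()       # retrace from the new roots only
        return set(self.graph) - self.marked  # space that can be reclaimed
```

Because the mark set persists across retraces, each repeated trace only
visits objects that were not reached before, so the extra passes stay
proportional to the newly revived part of the heap.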

> [Luckily, nothing special needs to be done for JNI code;
> CallStatic<Type>Method does require a native reference to a class
> object.  Yay! ]

Unluckily, something needs to be done for JVMTI. It has a function IterateOverHeap
which is supposed to iterate over both reachable and unreachable objects by scanning
heap linearly.

Thus, if the respective capability (can_tag_objects) has been requested at VM startup,
the GC must run in a special mode and zero out all unreachable objects, because an unreachable
object may lose its descriptor (VTable) at any time, and the GC will then not even be able to determine its size.

This will prevent some optimizations, like not reclaiming short free areas in unmovable space,
and require some special attention from the GC developers. OTOH, gc_cc already has a special mode
(-Dgc.heap_iteration=1) to support iteration even before class unloading is implemented.

Re: [drlvm] Class unloading support - tested one approach

Posted by Etienne Gagnon <eg...@sablevm.org>.

Salikh Zakirov wrote:
>   7) let the GC finish collection and reclaim unreachable objects -- this reclaims java objects

Just a bit of a warning...  This should be integrated within the
weak/soft/phantom + finalization framework.  We definitely don't want
the native resources of a class loader to be freed and later have
finalization revive the class loader...  :-)

[Luckily, nothing special needs to be done for JNI code;
CallStatic<Type>Method does require a native reference to a class
object.  Yay! ]

Etienne

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support - tested one approach

Posted by Salikh Zakirov <Sa...@Intel.com>.

Etienne Gagnon wrote:

> "Revival" is only needed if you use the finalization-like approach.  If
> you only do class-unloading GC when the nursery is empty, then no
> revival is needed.  

Ah, I think I got it.

You mean running a minor collection, and then "class unloading" full heap collection
sequentially, without any mutator work in between?
Then, the correctness is observed easily:

  1) all mature objects have their vtable marks set to 1
  2) after minor collection, the nursery is empty
  => all live objects already have vtable marks == 1

  Thus, if we find a classloader with vtable marks == 0, then it has no object instances,
  and its reachability is defined solely by reachability of java.lang.ClassLoader instance
  and existence of the method frames, which can be checked, respectively, by
  enumerating class loader roots as weak roots, and scanning stacks.

  Note, that the class loader, which became eligible for unloading during epoch N,
  will not be unloaded until the end of the epoch N+1.
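
  A toy Python model of this invariant, as a sketch only: ToyHeap and its
  methods are hypothetical, liveness is passed in as a flag instead of being
  computed by a trace, and the per-epoch reset of the marks is left out.

```python
class ToyHeap:
    def __init__(self):
        self.nursery = []        # (class_name, is_live) pairs
        self.mature = []         # class names of promoted objects
        self.vtable_mark = {}    # class_name -> 0 or 1

    def load_class(self, name):
        self.vtable_mark[name] = 0

    def allocate(self, name, live=True):
        # Nursery allocation does not touch the vtable mark.
        self.nursery.append((name, live))

    def minor_collect(self):
        # Survivors are promoted, and promotion sets the vtable mark.
        for name, live in self.nursery:
            if live:
                self.mature.append(name)
                self.vtable_mark[name] = 1
        self.nursery = []

    def classes_without_instances(self):
        # Only meaningful right after minor_collect(), with no mutator
        # work in between: then every live object is mature and marked.
        assert not self.nursery
        return {n for n, m in self.vtable_mark.items() if m == 0}
```

  A class loader is then a candidate only if all of its classes are in the
  set returned by classes_without_instances(); whether the loader itself is
  still referenced is a separate check, as described above.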

However, in the case of a non-generational collector, the "minor collection followed
by unloading collection" becomes effectively two successive garbage collections.

On the other side, "finalization-like" design goes as follows:

  1) clean vtable marks before "class unloading" collection
  2) enumerate classloader roots as weak and collect array of user classloader pointers for later use
     -- let's call it "unload list"
  3) trace the heap 
  4) scan vtable marks and "revive" marked classloaders, by adding the strong root
     from the previously collected "unload list". Remove the revived classloaders from the unload list.
  5) repeat steps (3) and (4) until there are no classloaders to revive
  6) unload the classloaders pointed to by the "unload list" -- this reclaims native resources
  7) let the GC finish collection and reclaim unreachable objects -- this reclaims java objects

This design unloads classloaders at the end of the very same epoch when they became unloadable.
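
Steps 1-6 of the "finalization-like" design can be sketched in Python as
follows. This is a toy model under several assumptions: the heap is a dict of
object names, klass_of and loader_of are hypothetical lookup tables, tracing
an instance is what sets its class's vtable mark, and a loader object that is
reached through ordinary references is treated as alive as well.

```python
def unload_pass(graph, klass_of, loader_of, strong_roots, loader_objs):
    """graph: obj -> referenced objs; klass_of: obj -> class name;
    loader_of: class name -> loader object; loader_objs: loader objects
    reported as weak roots. Returns the loaders that can be unloaded."""
    marks = {k: 0 for k in loader_of}        # 1) clean vtable marks
    unload_list = set(loader_objs)           # 2) weak roots -> "unload list"
    marked = set()

    def trace(roots):                        # 3) trace the heap
        work = list(roots)
        while work:
            obj = work.pop()
            if obj in marked:
                continue
            marked.add(obj)
            if obj in klass_of:
                marks[klass_of[obj]] = 1     # an instance marks its vtable
            work.extend(graph.get(obj, []))

    trace(strong_roots)
    while True:
        # 4) revive loaders that were reached or that own a marked vtable
        revived = {l for l in unload_list
                   if l in marked
                   or any(marks[k] for k, ld in loader_of.items() if ld == l)}
        if not revived:                      # 5) converged
            break
        unload_list -= revived
        trace(revived)                       # retrace from revived roots only
    return unload_list                       # 6) native resources can be freed
```

The loop needs more than one round only when a revived loader keeps
instances of another candidate loader's classes alive, for example through a
static field.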

The voting definitely was premature, as it turns out that even the design under discussion
can be elaborated into multiple substantially different designs.

Re: [drlvm] Class unloading support - tested one approach

Posted by Etienne Gagnon <eg...@sablevm.org>.

Salikh Zakirov wrote:
> I have another concern though: 
> just before starting "final unloading" collection, we scan vtable marks and identify
> the candidates for unloading. During the "final unloading" collection, the
> candidate classloader roots are reported as weak. At the end of the trace,
> we need to rescan vtable marks and "revive" the classloaders which were found
> in possession of live objects. This technique is exactly the same as the one
> used for object finalization. 
> 
> However, in contrast with finalization, we will need to repeat reviving
> classloaders which have non-0 vtable marks until the process converges, and no
> new classloaders are revived. (* in finalization, both dead and live objects in finalization
> queue are revived, and thus the revival converges in just 1 step *).

"Revival" is only needed if you use the finalization-like approach.  If
you only do class-unloading GC when the nursery is empty, then no
revival is needed.  In this case, after GC you only need to revert weak
references to hard ones.  Nulled weak references relate to dead class
loaders for which you can definitely free the native resources.
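
A minimal Python sketch of the "revert weak references to hard ones" step. It
is only an analogy: CPython's cycle collector and the weakref module stand in
for the VM's GC and weak roots, and ToyLoader, ToyClass and
unloading_collection are hypothetical names.

```python
import gc
import weakref


class ToyLoader:
    def __init__(self, name):
        self.name = name
        self.classes = []          # loader -> classes (strong)


class ToyClass:
    def __init__(self, name, loader):
        self.name = name
        self.loader = loader       # class -> loader (strong), forming a cycle
        loader.classes.append(self)


def unloading_collection(strong_table):
    """strong_table: name -> ToyLoader, the VM's normal strong roots.
    Returns (surviving strong table, names of dead loaders)."""
    weak = {n: weakref.ref(l) for n, l in strong_table.items()}
    strong_table.clear()           # demote: the VM now holds only weak refs
    gc.collect()                   # stands in for the class-unloading GC
    survivors, dead = {}, []
    for name, ref in weak.items():
        loader = ref()
        if loader is None:
            dead.append(name)      # nulled weak ref: free native resources
        else:
            survivors[name] = loader   # revert the weak ref to a hard one
    return survivors, dead
```

A loader survives here only if something other than the VM's own table still
references it or one of its classes, which mirrors conditions (1) and (2);
condition (3) is assumed to have been established beforehand.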

Etienne

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support - tested one approach

Posted by Salikh Zakirov <Sa...@Intel.com>.

>> Etienne Gagnon wrote:
>> > My proposal already argued that vtable bit/byte/word marking is
>> > unnecessary for "nursery allocations".  You only need to mark the
>> vtable
>> > of objects that survive collection and pretenured objects.

Alexey Varlamov wrote:
> I may have missed it, but I only recall you argued that we just need
> to collect mature space for the *final unloading* as CL and classes
> are unlikely to die young, which I agree. But chances that a live
> object of a candidate class appeared in the nursery are higher.
> Otherwise I just do not grok how this algorithm can be proven for
> correctness.

Alexey, 

it looks like what you are thinking about is a *concurrent* collector,
and concurrent garbage collection brings substantial complexity
even without class unloading.

However, the design we were discussing was for *stop-the-world* garbage
collectors, because this is the only thing currently supported by DRLVM,
and all existing GCs are stop-the-world.

So, the correctness of unloading algorithm can easily be proved if we consider
that the "final unloading" collection is a full heap collection,
i.e. both nursery and mature space are collected.

I have another concern though: 
just before starting "final unloading" collection, we scan vtable marks and identify
the candidates for unloading. During the "final unloading" collection, the
candidate classloader roots are reported as weak. At the end of the trace,
we need to rescan vtable marks and "revive" the classloaders which were found
in possession of live objects. This technique is exactly the same as the one
used for object finalization. 

However, in contrast with finalization, we will need to repeat reviving
classloaders which have non-0 vtable marks until the process converges, and no
new classloaders are revived. (* in finalization, both dead and live objects in finalization
queue are revived, and thus the revival converges in just 1 step *).

Re: [drlvm] Class unloading support - tested one approach

Posted by Alexey Varlamov <al...@gmail.com>.

2006/11/9, Robin Garner <ro...@anu.edu.au>:
> Etienne Gagnon wrote:
> > Alexey Varlamov wrote:
> >> Sorry if it was already discussed, but I believe this approach also
> >> requires marking vtable bit/byte on each object allocation, until the
> >> "unloading" GC pass is strictly stop-the-world full-heap collection.
> >> Robin, did you include this particular overhead too in your
> >> measurements?
>
> I didn't include it - having established that it's cheap during GC where
> memory bandwidth is at a premium, I kind of took this for granted.
>
> > My proposal already argued that vtable bit/byte/word marking is
> > unnecessary for "nursery allocations".  You only need to mark the vtable
> > of objects that survive collection and pretenured objects.

I may have missed it, but I only recall you argued that we just need
to collect mature space for the *final unloading* as CL and classes
are unlikely to die young, with which I agree. But the chances that a live
object of a candidate class has appeared in the nursery are higher.
Otherwise I just do not grok how this algorithm can be proven correct.

> And this is a persuasive argument.  But I can probably find time to
> measure it tomorrow if you aren't convinced.

That would be very kind, thank you.

--
Alexey
>
> --
> Robin Garner
> Dept. of Computer Science
> Australian National University
>

Re: [drlvm] Class unloading support - tested one approach

Posted by Robin Garner <ro...@anu.edu.au>.

Etienne Gagnon wrote:
> Alexey Varlamov wrote:
>> Sorry if it was already discussed, but I believe this approach also
>> requires marking vtable bit/byte on each object allocation, until the
>> "unloading" GC pass is strictly stop-the-world full-heap collection.
>> Robin, did you include this particular overhead too in your
>> measurements?

I didn't include it - having established that it's cheap during GC where 
memory bandwidth is at a premium, I kind of took this for granted.

> My proposal already argued that vtable bit/byte/word marking is
> unnecessary for "nursery allocations".  You only need to mark the vtable
> of objects that survive collection and pretenured objects.

And this is a persuasive argument.  But I can probably find time to 
measure it tomorrow if you aren't convinced.

-- 
Robin Garner
Dept. of Computer Science
Australian National University

Re: [drlvm] Class unloading support - tested one approach

Posted by Etienne Gagnon <eg...@sablevm.org>.

Alexey Varlamov wrote:
> Sorry if it was already discussed, but I believe this approach also
> requires marking vtable bit/byte on each object allocation, until the
> "unloading" GC pass is strictly stop-the-world full-heap collection.
> Robin, did you include this particular overhead too in your
> measurements?

My proposal already argued that vtable bit/byte/word marking is
unnecessary for "nursery allocations".  You only need to mark the vtable
of objects that survive collection and pretenured objects.

Etienne

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support - tested one approach

Posted by Alexey Varlamov <al...@gmail.com>.

[snip]
> > > My proposal has been measured at ~1% overhead in GC time, or 0.1% in
> > > execution time (caveats as above).  If there is some complexity in
> > > establishing classloader reachability from this basis, I would assume it
> > > can easily be absorbed.

Sorry if it was already discussed, but I believe this approach also
requires marking vtable bit/byte on each object allocation, until the
"unloading" GC pass is strictly stop-the-world full-heap collection.
Robin, did you include this particular overhead too in your
measurements?

--
Regards,
Alexey

Re: [drlvm] Class unloading support - tested one approach

Posted by Alexey Varlamov <al...@gmail.com>.

Uhm, Etienne overtook me with earlier posts.
Seems we are beginning to converge with design.

2006/11/9, Alexey Varlamov <al...@gmail.com>:
> 2006/11/8, Robin Garner <ro...@anu.edu.au>:
> > Robin Garner wrote:
> > > Aleksey Ignatenko wrote:
> > >> Robin.
> > >>
> > >>> OK, well how about keeping a weak reference to the j.l.ClassLoader
> > >>> object instead of a strong one.  When the reference becomes (strong)ly
> > >>> unreachable, invoke the class-unloading phase.
> > >>
> > >>
> > >> If you have weak reference to j.l.Classloader - GC will collect it
> > >> (with all
> > >> appropriate jlClasses) as soon as there are no references to
> > >> j.l.Classloader and appropriate classes. But there is a possible
> > >> situation when there are some
> > >> live objects of those classes and no references to jlClassloader and
> > >> jlClasses. This will lead to unpredictable consequences (crash, etc).
> > >>
> > >>
> > >>
> > >> I want to remind you that there are 3 mandatory conditions for class unloading:
> > >>
> > >> 1. j.l.Classloader instance is unreachable.
> > >>
> > >> 2. Appropriate j.l.Class instances are unreachable.
> > >>
> > >> 3. No object of any class loaded by appropriate class loader exists.
> > >
> > > Let me repeat.  I offer an efficient solution to (3).  I don't purport
> > > to have a solution to (1) and (2).
> >
> > Let me just add:  This is because I don't think (1) or (2) are
> > particularly difficult from a performance point of view, although I'm
> > happy to accept that there may still be some subtle engineering challenges.
>
> Robin,
>
> While your idea to (3) looks brilliant and quite convincing, it only
> covers part of the whole mission. We really need to derive complete
> design solution (like Etienne did), and I feel the voting started in
> the neighbor thread is a bit premature.
> Some of considerations below are beyond of my understanding, could you
> please clarify them (inlined)?
>
> And yet, it would be nice to have a confirmation that the notion of
> "epoch of full-heap-collection" does not imply strict limitations on
> GC algorithms. Maybe this is something obvious for people with more
> decent GC background than me?
>
> >
> > Now this is just off the top of my head, but what about this for a design:
> > - A j.l.ClassLoader maintains a collection of each of the classes it has
> > loaded
> > - A j.l.Class contains a pointer to its j.l.ClassLoader
> > - A j.l.Class maintains a collection of its vtable(s) (or a pointer if 1:1).
> > The point of this is that a class loader and its classes are a 'self
> > sustaining' data structure - if one element in it is reachable the whole
> > thing is reachable.
> Right. The special case is for system classes which are always in VM
> root set so never reclaimed.
>
> > The VM maintains a weak reference to all its j.l.ClassLoader instances,
> > and maintains a ReferenceQueue for weakly-reachable classloaders.
> > ClassLoaders are placed on the ReferenceQueue if and only if they are
> > unreachable from the heap (including via their j.l.Class objects).
> Here: should it actually read as "WeakReference instances for
> weakly-reachable classloaders are placed on the ReferenceQueue"?
> Otherwise this sentence completely escapes my mind, sorry.
> If the former, then how could the VM obtain & rescue the referent CL objects (+
> their j.l.Class instances) after a GC pass - AFAIU references are cleared
> automatically before enqueuing? I suppose we are not going to
> introduce inter-phase communication between VM and GC...
>
> > Note this is an irreversible condition: objects that are unreachable can
> > never become reachable again, except through very specific methods.
> >
> > When it sweeps the ReferenceQueue for unreachable classloaders, the VM
> > places the unreachable classloaders in a queue of classloaders that are
> > candidates for unloading.  This queue is part of the root set of the VM.
> Strongly referenced now I suppose.
>
> >  A classloader in this queue is unreachable from the heap, and can be
> > unloaded when there are no objects of any class it has loaded.
> So if the VM decides it is time to try unloading, it should:
> 1) Check if the full epoch has passed;
> 2) for each unloadable CL, scan corresponding vtables;
> 3) if none of the vtables were marked reachable, drop the CL from root
> set completely and clean corresponding native structures; Java
> instances will be reclaimed at nearest GC iteration;
> 4) Reset "epoch marker" and vtable words.
>
> Do I get it right?
>
>
> >
> > This is where my mechanism comes into play.
> >
> > If an object executes getClass() then its classloader is removed from
> > the unloadable classloader queue, its weak reference gets recreated  and
> > we're back at the initial state.  My guess is that this is a pretty
> > infrequent method call.
> >
> > I think this stage of the algorithm is easy in performance terms -
> > difficult in terms of proving correctness, but if you have an efficient
> > reachability mechanism for classes I think the building blocks are
> > there, and the subtleties are nothing that a talented engineer can't solve.
>
> Yes, a bit complicated. Taking into account the issues with
> ReferenceQueue above, I'd rather suggest the following:
>
> 1) The j.l.Class and defining CL have mutual strong references, as said above.
> 2) Normally, the VM reports all CLs as strong roots thus preserving
> them from premature reclamation;
> 3) When the VM decides (by whatever heuristic) it is time to perform
> unloading, it checks epoch invariant and scans all vtables for all
> CLs;
> 4) if a CL has no "reachable" vtables, it is moved to
> unloading-candidates collection and reported as a weak root, otherwise
> it remain in the strong root set.
> 5) If the nearest GC clears some of the weak references above, do
> corresponding natives cleanup and return survived CLs to normal root
> set.
> 6) Reset all data: epoch/vtables/etc and return back to 2).
>
> I believe this is less disruptive to component interfaces and requires
> less support on GC side.
>
> >
> >
> > I'm not 100% sure what your counter-proposal is: I recall 2 approaches
> > from the mailing list:
> > 1) Each object has an additional word in its header that points back to
> >    its j.l.Class object, and we proceed from here.
> >
> > Given that the mean object size is ~28 bytes, this proposal adds 14% to
> > each object size.  This increases the frequency of GC by 14% and incurs
> > a 14% slowdown.  Of course this is an oversimplification but a 14%
> > slowdown is a pretty lousy starting point to argue from.
> >
> > 2) The existing pointer in the GC header is traced during GC time.
> >
> > The average number of pointers per object (excluding the vtable) is
> > between 1.5 and 2 for the majority of benchmarks I have looked at
> > (footnote: if you know something different, drop me a line) (geometric
> > mean 1.78 for {specJVM, pseudoJBB and DaCapo 20051009}).  Tracing one
> > additional reference per object will therefore increase the cost of GC
> > by ~60% on average.  Again oversimplification but indicative.  If we
> > assume that GC accounts for 10% of runtime (more or less depending on
> > heap size), this is a runtime overhead of 6%.
> Looks reasonable as upper estimation, it would be nice to look at a
> live data though. Aleksey?
>
> > My proposal has been measured at ~1% overhead in GC time, or 0.1% in
> > execution time (caveats as above).  If there is some complexity in
> > establishing classloader reachability from this basis, I would assume it
> > can easliy be absorbed.
> >
> > Therefore I think my proposal, while not complete, can form the basis of
> > an efficient complete system for class unloading.
>
> Nice thing about "automitic" approach is that it does not imply
> slightest limitation on GC policy and adopts to any future algorithms
> improvements. It's a pity the same wasn't (can't be?) said about the
> voted idea.
> Actually some tuning for the "automitic" approach is possible, like
> keeping all j.l.Class & VT instances in a special space which is
> collected only periodically, so GC does not need to trace VTs all the
> time.
>
> --
> Regards,
> Alexey
>
> >
> > (PS: I'd *love* to be proven wrong)
> >
> > cheers,
> > Robin
> >
> > > Regards,
> > > Robin
> > >
> > >>
> > >>
> > >> Aleksey.
> > >>
> > >>
> > >> On 11/8/06, Robin Garner <ro...@anu.edu.au> wrote:
> > >>>
> > >>> Pavel Pervov wrote:
> > >>> > Robin,
> > >>> >
> > >>> > The kind of model I had in mind was along the lines of:
> > >>> >> - VM maintains a linked list (or other collection type) of the
> > >>> currently
> > >>> >> loaded classloaders, each of which in turn maintains the
> > >>> collection of
> > >>> >> classes loaded by that type.  The sweep of classloaders goes
> > >>> something
> > >>> >> like:
> > >>> >>
> > >>> >> for (ClassLoader cl : classLoaders)
> > >>> >>   for (Class c : cl.classes)
> > >>> >>     cl.reachable |= c.vtable.reachable
> > >>> >
> > >>> >
> > >>> > This is not enough. There are may be live j/l/Class'es and
> > >>> > j/l/Classloader's
> > >>> > in the heap. Even though no objects of any classes loaded by a
> > >>> particual
> > >>> > class loader are available in the heap, if we have live reference to
> > >>> > j/l/ClassLoader itself, it just can't be unloaded.
> > >>>
> > >>> OK, well how about keeping a weak reference to the j.l.ClassLoader
> > >>> object instead of a strong one.  When the reference becomes (strong)ly
> > >>> unreachable, invoke the class-unloading phase.
> > >>>
> > >>> To me the key issue from a performance POV is the reachability of
> > >>> classes from objects in the heap.  I don't pretend to have an answer to
> > >>> the other questions---the performance critical one is the one I have
> > >>> addressed, and I accept there may be many solutions to this part of the
> > >>> question.
> > >>>
> > >>> > I believe that a separate heap trace pass, different from the standard
> > >>> >> GC, that visited vtables and reachable resources from there would
> > >>> also
> > >>> >> be a viable solution.  As mentioned in an earlier post, writing
> > >>> this in
> > >>>
> > >>> >> MMTk (where a heap trace operation is a class that you can easily
> > >>> >> subtype to do this) would be easy.
> > >>> >>
> > >>> >> One of the advantages of my other proposal is that it can be
> > >>> implemented
> > >>> >> in the VM independent of the GC to some extent.  This additional
> > >>> >> mark/scan phase may or may not be easy to implement, depending on the
> > >>> >> structure of DRLVM GCs, which is something I haven't explored.
> > >>> >
> > >>> >
> > >>> > DRLVM may work with (potentially) any number of GCs. Designing class
> > >>> > unloading the way, which would require mark&scan cooperation from
> > >>> GC, is
> > >>> > not
> > >>> > generally a good idea (from my HPOV).
> > >>>
> > >>> That's what I gathered.  hence my proposal.
> > >>>
> > >>> cheers
> > >>>
> > >>> --
> > >>> Robin Garner
> > >>> Dept. of Computer Science
> > >>> Australian National University
> > >>>
> > >>
> > >
> > >
> >
> >
> > --
> > Robin Garner
> > Dept. of Computer Science
> > Australian National University
> >
>

Re: [drlvm] Class unloading support - tested one approach

Posted by Alexey Varlamov <al...@gmail.com>.

2006/11/8, Robin Garner <ro...@anu.edu.au>:
> Robin Garner wrote:
> > Aleksey Ignatenko wrote:
> >> Robin.
> >>
> >>> OK, well how about keeping a weak reference to the >j.l.ClassLoader
> >>> object instead of a strong one.  When the reference >becomes (strong)ly
> >>> unreachable, invoke the class-unloading phase.
> >>
> >>
> >> If you have weak reference to j.l.Classloader - GC will collect it
> >> (with all
> >> appropriate jlClasses) as soon as there are no references to
> >> j.l.Classloaderand appropriate classes. But there is possible
> >> situation when there are some
> >> live objects of that classes and no references to jlClassloader and
> >> jlClasses. This will lead to unpredictable consequences (crash, etc).
> >>
> >>
> >>
> >> I want to remind that there 3 mandatory conditions of class unloading:
> >>
> >> 1. j.l.Classloader instance is unreachable.
> >>
> >> 2. Appropriate j.l.Class instances are unreachable.
> >>
> >> 3. No object of any class loaded by appropriate class loader exists.
> >
> > Let me repeat.  I offer an efficient solution to (3).  I don't purport
> > to have a solution to (1) and (2).
>
> Let me just add:  This is because I don't think (1) or (2) are
> particularly difficult from a performance point of view, although I'm
> happy to accept that there may still be some subtle engineering challenges.

Robin,

While your idea for (3) looks brilliant and quite convincing, it only
covers part of the whole mission. We really need to derive a complete
design solution (like Etienne did), and I feel the voting started in
the neighboring thread is a bit premature.
Some of the considerations below are beyond my understanding; could you
please clarify them (inlined)?

Also, it would be nice to have confirmation that the notion of an
"epoch of full-heap-collection" does not impose strict limitations on
GC algorithms. Maybe this is obvious to people with a stronger GC
background than mine?

>
> Now this is just off the top of my head, but what about this for a design:
> - A j.l.ClassLoader maintains a collection of each of the classes it has
> loaded
> - A j.l.Class contains a pointer to its j.l.ClassLoader
> - A j.l.Class maintains a collection of its vtable(s) (or a pointer if 1:1).
> The point of this is that a class loader and its classes are a 'self
> sustaining' data structure - if one element in it is reachable the whole
> thing is reachable.
Right. The special case is system classes, which are always in the VM
root set and so are never reclaimed.

> The VM maintains a weak reference to all its j.l.ClassLoader instances,
> and maintains a ReferenceQueue for weakly-reachable classloaders.
> ClassLoaders are placed on the ReferenceQueue if and only if they are
> unreachable from the heap (including via their j.l.Class objects).
Here: should it actually read as "WeakReference instances for
weakly-reachable classloaders are placed on the ReferenceQueue"?
Otherwise this sentence completely escapes me, sorry.
If the former, then how could the VM obtain and rescue the referent CL
objects (plus their j.l.Class instances) after a GC pass? AFAIU,
references are cleared automatically before enqueuing. I suppose we are
not going to introduce inter-phase communication between VM and GC...
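
To make the concern concrete, here is a minimal sketch using the standard
java.lang.ref API. The class name and the stand-in object are hypothetical,
and the explicit clear()/enqueue() calls only simulate what a collector
would do; this is not DRLVM code.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical demo class: shows that what comes off a ReferenceQueue is the
// WeakReference object itself, and that its referent is already gone.
class ClearedBeforeEnqueueDemo {
    /** Returns true if the queue yields the reference and the referent is lost. */
    static boolean referentLostAfterEnqueue() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object loaderStandIn = new Object();   // stands in for a j.l.ClassLoader
        WeakReference<Object> ref = new WeakReference<>(loaderStandIn, queue);

        // Simulate the collector: the reference is cleared, then enqueued.
        ref.clear();
        ref.enqueue();

        Object polled = queue.poll();          // the WeakReference, not the loader
        return polled == ref && ref.get() == null;
    }
}
```

So a VM that wants to "rescue" the loader after it was enqueued would need
some other handle on it, which is the point of the question above.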

> Note this is an irreversible condition: objects that are unreachable can
> never become reachable again, except through very specific methods.
>
> When it sweeps the ReferenceQueue for unreachable classloaders, the VM
> places the unreachable classloaders in a queue of classloaders that are
> candidates for unloading.  This queue is part of the root set of the VM.
Strongly referenced now I suppose.

>  A classloader in this queue is unreachable from the heap, and can be
> unloaded when there are no objects of any class it has loaded.
So if the VM decides it is time to try unloading, it should:
1) Check if the full epoch has passed;
2) for each unloadable CL, scan corresponding vtables;
3) if none of the vtables were marked reachable, drop the CL from the root
set completely and clean up the corresponding native structures; Java
instances will be reclaimed at the next GC iteration;
4) Reset "epoch marker" and vtable words.

Do I get it right?


>
> This is where my mechanism comes into play.
>
> If an object executes getClass() then its classloader is removed from
> the unloadable classloader queue, its weak reference gets recreated  and
> we're back at the initial state.  My guess is that this is a pretty
> infrequent method call.
>
> I think this stage of the algorithm is easy in performance terms -
> difficult in terms of proving correctness, but if you have an efficient
> reachability mechanism for classes I think the building blocks are
> there, and the subtleties are nothing that a talented engineer can't solve.

Yes, a bit complicated. Taking into account the issues with
ReferenceQueue above, I'd rather suggest the following:

1) The j.l.Class and its defining CL have mutual strong references, as said above.
2) Normally, the VM reports all CLs as strong roots, thus preserving
them from premature reclamation;
3) When the VM decides (by whatever heuristic) it is time to perform
unloading, it checks the epoch invariant and scans all vtables for all
CLs;
4) if a CL has no "reachable" vtables, it is moved to the
unloading-candidates collection and reported as a weak root; otherwise
it remains in the strong root set.
5) If the next GC clears some of the weak references above, do the
corresponding native cleanup and return the surviving CLs to the normal
root set.
6) Reset all data (epoch, vtables, etc.) and go back to 2).

I believe this is less disruptive to component interfaces and requires
less support on GC side.
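
The six steps above can be sketched roughly as follows. This is only a toy
model under stated assumptions: Loader, vtableMarked and referencedFromHeap
are hypothetical names, and the outcome of "the next GC" is simulated by a
boolean flag rather than by a real collector.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the proposed cycle; none of these names exist in DRLVM.
class UnloadCycleSketch {
    static class Loader {
        final String name;
        final boolean[] vtableMarked;      // one mark per vtable, set during the epoch
        final boolean referencedFromHeap;  // simulated: would the next GC keep the CL alive?
        boolean strongRoot = true;         // step 2: strong root by default

        Loader(String name, boolean[] vtableMarked, boolean referencedFromHeap) {
            this.name = name;
            this.vtableMarked = vtableMarked;
            this.referencedFromHeap = referencedFromHeap;
        }

        boolean anyVtableMarked() {
            for (boolean marked : vtableMarked) {
                if (marked) return true;
            }
            return false;
        }
    }

    /** Runs steps 3-6 once; returns the names of loaders whose natives would be cleaned. */
    static List<String> runCycle(List<Loader> loaders, boolean fullEpochPassed) {
        List<String> unloaded = new ArrayList<>();
        if (!fullEpochPassed) return unloaded;           // step 3: epoch invariant

        List<Loader> candidates = new ArrayList<>();
        for (Loader cl : loaders) {                      // steps 3-4: scan vtables
            if (!cl.anyVtableMarked()) {
                cl.strongRoot = false;                   // now reported as a weak root
                candidates.add(cl);
            }
        }
        for (Loader cl : candidates) {                   // step 5: simulated GC outcome
            if (cl.referencedFromHeap) {
                cl.strongRoot = true;                    // survived: back to the normal root set
            } else {
                unloaded.add(cl.name);                   // weak root cleared: native cleanup
            }
        }
        for (Loader cl : loaders) {                      // step 6: reset the marks
            Arrays.fill(cl.vtableMarked, false);
        }
        return unloaded;
    }
}
```

A loader with a marked vtable stays strong, an unmarked loader that is still
referenced from the heap is demoted and then restored, and only an unmarked,
unreferenced loader ends up in the returned list.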

>
>
> I'm not 100% sure what your counter-proposal is: I recall 2 approaches
> from the mailing list:
> 1) Each object has an additional word in its header that points back to
>    its j.l.Class object, and we proceed from here.
>
> Given that the mean object size is ~28 bytes, this proposal adds 14% to
> each object size.  This increases the frequency of GC by 14% and incurs
> a 14% slowdown.  Of course this is an oversimplification but a 14%
> slowdown is a pretty lousy starting point to argue from.
>
> 2) The existing pointer in the GC header is traced during GC time.
>
> The average number of pointers per object (excluding the vtable) is
> between 1.5 and 2 for the majority of benchmarks I have looked at
> (footnote: if you know something different, drop me a line) (geometric
> mean 1.78 for {specJVM, pseudoJBB and DaCapo 20051009}).  Tracing one
> additional reference per object will therefore increase the cost of GC
> by ~60% on average.  Again oversimplification but indicative.  If we
> assume that GC accounts for 10% of runtime (more or less depending on
> heap size), this is a runtime overhead of 6%.
Looks reasonable as an upper estimate; it would be nice to look at
live data, though. Aleksey?

> My proposal has been measured at ~1% overhead in GC time, or 0.1% in
> execution time (caveats as above).  If there is some complexity in
> establishing classloader reachability from this basis, I would assume it
> can easliy be absorbed.
>
> Therefore I think my proposal, while not complete, can form the basis of
> an efficient complete system for class unloading.

The nice thing about the "automatic" approach is that it does not impose
the slightest limitation on GC policy and adapts to any future algorithm
improvements. It's a pity the same wasn't (can't be?) said about the
voted idea.
Actually, some tuning of the "automatic" approach is possible, like
keeping all j.l.Class & VT instances in a special space which is
collected only periodically, so the GC does not need to trace VTs all the
time.

--
Regards,
Alexey


Re: [drlvm] Class unloading support - tested one approach

Posted by Robin Garner <ro...@anu.edu.au>.

Robin Garner wrote:
> Aleksey Ignatenko wrote:
>> Robin.
>>
>>> OK, well how about keeping a weak reference to the >j.l.ClassLoader
>>> object instead of a strong one.  When the reference >becomes (strong)ly
>>> unreachable, invoke the class-unloading phase.
>>
>>
>> If you have weak reference to j.l.Classloader - GC will collect it 
>> (with all
>> appropriate jlClasses) as soon as there are no references to
>> j.l.Classloaderand appropriate classes. But there is possible
>> situation when there are some
>> live objects of that classes and no references to jlClassloader and
>> jlClasses. This will lead to unpredictable consequences (crash, etc).
>>
>>
>>
>> I want to remind that there 3 mandatory conditions of class unloading:
>>
>> 1. j.l.Classloader instance is unreachable.
>>
>> 2. Appropriate j.l.Class instances are unreachable.
>>
>> 3. No object of any class loaded by appropriate class loader exists.
> 
> Let me repeat.  I offer an efficient solution to (3).  I don't purport 
> to have a solution to (1) and (2).

Let me just add:  This is because I don't think (1) or (2) are 
particularly difficult from a performance point of view, although I'm 
happy to accept that there may still be some subtle engineering challenges.

Now this is just off the top of my head, but what about this for a design:
- A j.l.ClassLoader maintains a collection of each of the classes it has 
loaded
- A j.l.Class contains a pointer to its j.l.ClassLoader
- A j.l.Class maintains a collection of its vtable(s) (or a pointer if 1:1).
The point of this is that a class loader and its classes are a 'self 
sustaining' data structure - if one element in it is reachable the whole 
thing is reachable.

The VM maintains a weak reference to all its j.l.ClassLoader instances, 
and maintains a ReferenceQueue for weakly-reachable classloaders. 
ClassLoaders are placed on the ReferenceQueue if and only if they are 
unreachable from the heap (including via their j.l.Class objects).  Note 
this is an irreversible condition: objects that are unreachable can 
never become reachable again, except through very specific methods.

When it sweeps the ReferenceQueue for unreachable classloaders, the VM 
places the unreachable classloaders in a queue of classloaders that are 
candidates for unloading.  This queue is part of the root set of the VM. 
  A classloader in this queue is unreachable from the heap, and can be 
unloaded when there are no objects of any class it has loaded.

This is where my mechanism comes into play.

If an object executes getClass() then its classloader is removed from 
the unloadable classloader queue, its weak reference gets recreated  and 
we're back at the initial state.  My guess is that this is a pretty 
infrequent method call.

I think this stage of the algorithm is easy in performance terms - 
difficult in terms of proving correctness, but if you have an efficient 
reachability mechanism for classes I think the building blocks are 
there, and the subtleties are nothing that a talented engineer can't solve.

I'm not 100% sure what your counter-proposal is: I recall 2 approaches 
from the mailing list:
1) Each object has an additional word in its header that points back to
    its j.l.Class object, and we proceed from here.

Given that the mean object size is ~28 bytes, this proposal adds 14% to 
each object size.  This increases the frequency of GC by 14% and incurs 
a 14% slowdown.  Of course this is an oversimplification but a 14% 
slowdown is a pretty lousy starting point to argue from.

2) The existing pointer in the GC header is traced during GC time.

The average number of pointers per object (excluding the vtable) is 
between 1.5 and 2 for the majority of benchmarks I have looked at 
(footnote: if you know something different, drop me a line) (geometric 
mean 1.78 for {specJVM, pseudoJBB and DaCapo 20051009}).  Tracing one 
additional reference per object will therefore increase the cost of GC 
by ~60% on average.  Again oversimplification but indicative.  If we 
assume that GC accounts for 10% of runtime (more or less depending on 
heap size), this is a runtime overhead of 6%.

My proposal has been measured at ~1% overhead in GC time, or 0.1% in 
execution time (caveats as above).  If there is some complexity in 
establishing classloader reachability from this basis, I would assume it 
can easily be absorbed.
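
For reference, the back-of-envelope arithmetic behind the three figures
above can be written out as below. The 4-byte header word is an assumption
(32-bit pointers); the other inputs are the numbers quoted in this message.

```latex
% Assumption: the extra header word is 4 bytes (32-bit pointers).
\begin{align*}
\text{(1) extra header word:}\quad
  & \frac{4\ \text{bytes}}{28\ \text{bytes}} \approx 0.143 \approx 14\% \\
\text{(2) tracing the vtable pointer:}\quad
  & \frac{1}{1.78} \approx 0.56,\ \text{rounded to} \sim 60\%\ \text{more GC work} \\
  & 0.60 \times 0.10 = 0.06 = 6\%\ \text{of runtime} \\
\text{(measured proposal):}\quad
  & 0.01 \times 0.10 = 0.001 = 0.1\%\ \text{of runtime}
\end{align*}
```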

Therefore I think my proposal, while not complete, can form the basis of 
an efficient complete system for class unloading.

(PS: I'd *love* to be proven wrong)

cheers,
Robin

-- 
Robin Garner
Dept. of Computer Science
Australian National University

Re: [drlvm] Class unloading support - tested one approach

Posted by Robin Garner <ro...@anu.edu.au>.

Aleksey Ignatenko wrote:
> Robin.
> 
>> OK, well how about keeping a weak reference to the >j.l.ClassLoader
>> object instead of a strong one.  When the reference >becomes (strong)ly
>> unreachable, invoke the class-unloading phase.
> 
> 
> If you have weak reference to j.l.Classloader - GC will collect it (with 
> all
> appropriate jlClasses) as soon as there are no references to
> j.l.Classloaderand appropriate classes. But there is possible
> situation when there are some
> live objects of that classes and no references to jlClassloader and
> jlClasses. This will lead to unpredictable consequences (crash, etc).
> 
> 
> 
> I want to remind that there 3 mandatory conditions of class unloading:
> 
> 1. j.l.Classloader instance is unreachable.
> 
> 2. Appropriate j.l.Class instances are unreachable.
> 
> 3. No object of any class loaded by appropriate class loader exists.

Let me repeat.  I offer an efficient solution to (3).  I don't purport 
to have a solution to (1) and (2).

Regards,
Robin


-- 
Robin Garner
Dept. of Computer Science
Australian National University

Re: [drlvm] Class unloading support - tested one approach

Posted by Aleksey Ignatenko <al...@gmail.com>.

Robin.

> OK, well how about keeping a weak reference to the j.l.ClassLoader
> object instead of a strong one.  When the reference becomes (strong)ly
> unreachable, invoke the class-unloading phase.


If you have only a weak reference to j.l.Classloader, GC will collect it
(with all its jlClasses) as soon as there are no references to the
j.l.Classloader and its classes. But it is possible that there are still
live objects of those classes while there are no references to the
jlClassloader and jlClasses. This will lead to unpredictable consequences
(crash, etc).



I want to remind you that there are 3 mandatory conditions for class unloading:

1. The j.l.Classloader instance is unreachable.

2. The corresponding j.l.Class instances are unreachable.

3. No object of any class loaded by that class loader exists.
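
The dangerous case can be shown with a toy heap graph. This is only a
sketch: Node, tracedRefs and untracedClass are hypothetical names, and it
assumes (as in the discussion) that the object-to-class link via the vtable
is not followed by the GC trace.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy heap model; it only illustrates conditions 1-3, it is not VM code.
class UntracedClassPointerSketch {
    static class Node {
        final String name;
        final List<Node> tracedRefs = new ArrayList<>(); // edges the GC follows
        Node untracedClass;                               // vtable link, NOT followed
        Node(String name) { this.name = name; }
    }

    /** Plain reachability trace over tracedRefs only. */
    static Set<Node> trace(List<Node> roots) {
        Set<Node> live = new HashSet<>();
        Deque<Node> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Node n = work.pop();
            if (live.add(n)) {
                work.addAll(n.tracedRefs);
            }
        }
        return live;
    }

    /** True when an instance is live while its class and loader look dead. */
    static boolean danglingInstanceScenario() {
        Node loader = new Node("loader");
        Node clazz = new Node("class");
        Node instance = new Node("instance");
        loader.tracedRefs.add(clazz);      // loader -> class
        clazz.tracedRefs.add(loader);      // class -> loader
        instance.untracedClass = clazz;    // instance -> class, invisible to the trace

        Set<Node> live = trace(Collections.singletonList(instance));
        return live.contains(instance) && !live.contains(clazz) && !live.contains(loader);
    }
}
```

Conditions 1 and 2 hold here, but condition 3 does not, so collecting the
loader at this point would leave the instance without its class.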



Aleksey.



Re: [drlvm] Class unloading support - tested one approach

Posted by Robin Garner <ro...@anu.edu.au>.

Pavel Pervov wrote:
> Robin,
> 
> The kind of model I had in mind was along the lines of:
>> - VM maintains a linked list (or other collection type) of the currently
>> loaded classloaders, each of which in turn maintains the collection of
>> classes loaded by that type.  The sweep of classloaders goes something
>> like:
>>
>> for (ClassLoader cl : classLoaders)
>>   for (Class c : cl.classes)
>>     cl.reachable |= c.vtable.reachable
> 
> 
> This is not enough. There may be live j/l/Class'es and j/l/ClassLoader's
> in the heap. Even though no objects of any classes loaded by a particular
> class loader are available in the heap, if we have a live reference to
> j/l/ClassLoader itself, it just can't be unloaded.

OK, well how about keeping a weak reference to the j.l.ClassLoader 
object instead of a strong one.  When the reference becomes (strong)ly 
unreachable, invoke the class-unloading phase.
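
A minimal sketch of the weak-reference idea in plain Java, for illustration only. DRLVM keeps its class loader records in native code, so the names here (LoaderRegistry, LoaderRef, nativeRecordName, runUnloadingPhase) are hypothetical stand-ins; only java.lang.ref.WeakReference and ReferenceQueue are real library classes. The registry holds each loader weakly, and the unloading phase drains whatever the GC has enqueued.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical registry: the VM side holds loaders weakly, not as strong roots.
class LoaderRegistry {
    // Weak handle that remembers which native record belongs to the loader.
    static final class LoaderRef extends WeakReference<ClassLoader> {
        final String nativeRecordName; // stand-in for the VM's native data

        LoaderRef(ClassLoader cl, String name, ReferenceQueue<ClassLoader> q) {
            super(cl, q);
            this.nativeRecordName = name;
        }
    }

    private final ReferenceQueue<ClassLoader> queue = new ReferenceQueue<>();
    private final Set<LoaderRef> live = new HashSet<>();

    LoaderRef register(ClassLoader cl, String name) {
        LoaderRef ref = new LoaderRef(cl, name, queue);
        live.add(ref); // keeps the handle alive, not the loader itself
        return ref;
    }

    // The "class-unloading phase": process handles whose loader is no
    // longer strongly reachable and report which records were released.
    List<String> runUnloadingPhase() {
        List<String> unloaded = new ArrayList<>();
        Reference<? extends ClassLoader> r;
        while ((r = queue.poll()) != null) {
            LoaderRef ref = (LoaderRef) r;
            live.remove(ref);
            unloaded.add(ref.nativeRecordName); // native resources would be freed here
        }
        return unloaded;
    }

    int liveCount() {
        return live.size();
    }
}
```

This covers only reachability of the j.l.ClassLoader object itself. Whether instances of its classes are still alive has to come from somewhere else, for example the vtable marks discussed in this thread.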

To me the key issue from a performance POV is the reachability of 
classes from objects in the heap.  I don't pretend to have an answer to 
the other questions---the performance critical one is the one I have 
addressed, and I accept there may be many solutions to this part of the 
question.

>> I believe that a separate heap trace pass, different from the standard
>> GC, that visited vtables and reachable resources from there would also
>> be a viable solution.  As mentioned in an earlier post, writing this in
>> MMTk (where a heap trace operation is a class that you can easily
>> subtype to do this) would be easy.
>>
>> One of the advantages of my other proposal is that it can be implemented
>> in the VM independent of the GC to some extent.  This additional
>> mark/scan phase may or may not be easy to implement, depending on the
>> structure of DRLVM GCs, which is something I haven't explored.
> 
> 
> DRLVM may work with (potentially) any number of GCs. Designing class
> unloading the way, which would require mark&scan cooperation from GC, is
> not generally a good idea (from my HPOV).

That's what I gathered; hence my proposal.

cheers

-- 
Robin Garner
Dept. of Computer Science
Australian National University

Re: [drlvm] Class unloading support - tested one approach

Posted by Pavel Pervov <pm...@gmail.com>.

 Robin,

> The kind of model I had in mind was along the lines of:
> - VM maintains a linked list (or other collection type) of the currently
> loaded classloaders, each of which in turn maintains the collection of
> classes loaded by that type.  The sweep of classloaders goes something
> like:
>
> for (ClassLoader cl : classLoaders)
>   for (Class c : cl.classes)
>     cl.reachable |= c.vtable.reachable


This is not enough. There may be live j/l/Class'es and j/l/ClassLoader's
in the heap. Even though no objects of any classes loaded by a particular
class loader are available in the heap, if we have a live reference to
j/l/ClassLoader itself, it just can't be unloaded.

> I believe that a separate heap trace pass, different from the standard
> GC, that visited vtables and reachable resources from there would also
> be a viable solution.  As mentioned in an earlier post, writing this in
> MMTk (where a heap trace operation is a class that you can easily
> subtype to do this) would be easy.
>
> One of the advantages of my other proposal is that it can be implemented
> in the VM independent of the GC to some extent.  This additional
> mark/scan phase may or may not be easy to implement, depending on the
> structure of DRLVM GCs, which is something I haven't explored.


DRLVM may work with (potentially) any number of GCs. Designing class
unloading in a way that requires mark&scan cooperation from the GC is not
generally a good idea (from my HPOV).

-- 
Pavel Pervov,
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support - tested one approach

Posted by Robin Garner <ro...@anu.edu.au>.

Aleksey Ignatenko wrote:
> Hi, Robin.
> I do really like this proposed idea of marking VTables from objects via
> additional word field in VTable.
> 
> But I have one question about detecting reachability of the classloaders
> ("sweep the vtables and check the reachability of the classloaders").
> Possibly I missed something, but here is my view of the current model of
> drlvm: all j.l.Classes and j.l.Classloaders are enumerated as strong roots
> (strong references). Therefore we meet situation when all j.l.Classes and
> j.l.Classloaders are always reachable (marked). And no sweep will help to
> detect classloaders reachability.
> I see the single way to distinguish if j.l.Classloader or j.l.Class was
> marked not by strong root from VM but by some reference from heap - is
> to write unique object value into VTable. Then we can detect if some
> jlClasloader was marked from rootset (strong root from VM) or from some 
> live
> object.

The kind of model I had in mind was along the lines of:
- VM maintains a linked list (or other collection type) of the currently 
loaded classloaders, each of which in turn maintains the collection of 
classes loaded by that loader.  The sweep of classloaders goes something like:

for (ClassLoader cl : classLoaders)
   for (Class c : cl.classes)
     cl.reachable |= c.vtable.reachable

Then for any classloader where (!reachable), free its native resources 
and remove its strong root.  The Java resources will be freed at the next GC.
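
The sweep above can be fleshed out into a runnable sketch. The record types (SweepVTable, LoadedClass, LoaderRecord) are hypothetical stand-ins for whatever structures the VM actually keeps, and removing a record from the list stands in for "free its native resources and remove its strong root".

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical VM-side records; field names follow the pseudocode above.
class SweepVTable {
    boolean reachable; // set by the GC when it scans an instance of the class
}

class LoadedClass {
    final SweepVTable vtable = new SweepVTable();
}

class LoaderRecord {
    final String name;
    final List<LoadedClass> classes = new ArrayList<>();
    boolean reachable;

    LoaderRecord(String name) {
        this.name = name;
    }

    LoadedClass define() {
        LoadedClass c = new LoadedClass();
        classes.add(c);
        return c;
    }
}

class LoaderSweep {
    // Removes every loader with no marked vtable and returns their names.
    static List<String> sweep(List<LoaderRecord> loaders) {
        List<String> unloaded = new ArrayList<>();
        for (Iterator<LoaderRecord> it = loaders.iterator(); it.hasNext();) {
            LoaderRecord cl = it.next();
            cl.reachable = false;
            for (LoadedClass c : cl.classes) {
                cl.reachable |= c.vtable.reachable;
            }
            if (!cl.reachable) {
                it.remove(); // stand-in for freeing native data and dropping the root
                unloaded.add(cl.name);
            }
        }
        return unloaded;
    }
}
```

Note that this sketch looks only at vtable marks. As raised elsewhere in the thread, a live reference to the j.l.ClassLoader or a j.l.Class object would also have to be ruled out before unloading.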

> I also want to say that 1-st proposed design from me assumed addtional
> mark&scan phase without enumeration of jlClasses and jlClassloaders to be
> able to detect their reachability.

I believe that a separate heap trace pass, different from the standard 
GC, that visited vtables and reachable resources from there would also 
be a viable solution.  As mentioned in an earlier post, writing this in 
MMTk (where a heap trace operation is a class that you can easily 
subtype to do this) would be easy.

One of the advantages of my other proposal is that it can be implemented 
in the VM independent of the GC to some extent.  This additional 
mark/scan phase may or may not be easy to implement, depending on the 
structure of DRLVM GCs, which is something I haven't explored.

In terms of runtime cost, I would expect an auxiliary scan of this type 
to be equivalent in cost to a full-heap GC.  The other solution costs 
~1% of every GC.  As a "back of a matchbox" calculation, if this is run 
less often than once every 100 (full-heap) GCs, then the auxiliary trace 
is a win; if not, my other solution is a win.
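
The comparison can be written out as a tiny cost model, measured in units of one full-heap GC. The 1% figure and the "one extra full-heap trace" assumption come from the paragraph above; the class and method names are illustrative only.

```java
// Amortised cost per GC cycle, in units of one full-heap GC.
class UnloadCostModel {
    // A separate trace costs about one full-heap GC, paid once every n GCs.
    static double auxiliaryTracePerGc(int gcsBetweenTraces) {
        return 1.0 / gcsBetweenTraces;
    }

    // The in-trace vtable mark costs a fixed fraction of every GC.
    static double inlineMarkPerGc(double overheadFraction) {
        return overheadFraction;
    }

    static boolean auxiliaryTraceWins(int gcsBetweenTraces, double overheadFraction) {
        return auxiliaryTracePerGc(gcsBetweenTraces) < inlineMarkPerGc(overheadFraction);
    }
}
```

With a 1% mark overhead, an auxiliary trace every 200 GCs costs 0.005 per GC and wins, while one every 50 GCs costs 0.02 per GC and loses.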

> Could you, please, clarify this moment.
> Thanks, Aleksey.

Hope this answers your questions
cheers,
Robin

> 
> On 11/3/06, Rana Dasgupta <rd...@gmail.com> wrote:
>>
>> On 11/2/06, Xiao-Feng Li <xi...@gmail.com> wrote:
>> >
>> > > Robin, thanks for all the clarifications. Now it seems clear to me and
>> > > I am convinced by this proposal. :-)
>>
>>
>> Yes, this proposal is the simplest and has the least perf impact. Thanks
>> Robin.
>>
>>
> 

-- 
Robin Garner
Dept. of Computer Science
Australian National University

Re: [drlvm] Class unloading support - tested one approach

Posted by Aleksey Ignatenko <al...@gmail.com>.

Hi, Robin.
I really like the proposed idea of marking VTables from objects via
an additional word field in the VTable.

But I have one question about detecting reachability of the classloaders
("sweep the vtables and check the reachability of the classloaders").
Possibly I missed something, but here is my view of the current model of
DRLVM: all j.l.Class and j.l.ClassLoader instances are enumerated as strong
roots (strong references). Therefore all j.l.Class and j.l.ClassLoader
instances are always reachable (marked), and no sweep will help to detect
classloader reachability.
The only way I see to distinguish whether a j.l.ClassLoader or j.l.Class was
marked not by a strong root from the VM but by some reference from the heap
is to write a unique object value into the VTable. Then we can detect whether
a j.l.ClassLoader was marked from the root set (strong root from the VM) or
from some live object.

I also want to say that the first design I proposed assumed an additional
mark&scan phase without enumeration of j.l.Class and j.l.ClassLoader
instances, so that their reachability can be detected.

Could you please clarify this point?
Thanks, Aleksey.

On 11/3/06, Rana Dasgupta <rd...@gmail.com> wrote:
>
> On 11/2/06, Xiao-Feng Li <xi...@gmail.com> wrote:
> >
> > > Robin, thanks for all the clarifications. Now it seems clear to me and
> > > I am convinced by this proposal. :-)
>
>
> Yes, this proposal is the simplest and has the least perf impact. Thanks
> Robin.
>
>

Re: [drlvm] Class unloading support - tested one approach

Posted by Rana Dasgupta <rd...@gmail.com>.

On 11/2/06, Xiao-Feng Li <xi...@gmail.com> wrote:
>
> > Robin, thanks for all the clarifications. Now it seems clear to me and
> > I am convinced by this proposal. :-)


Yes, this proposal is the simplest and has the least perf impact. Thanks
Robin.

Re: [drlvm] Class unloading support - tested one approach

Posted by Xiao-Feng Li <xi...@gmail.com>.

On 11/3/06, Robin Garner <ro...@anu.edu.au> wrote:
> Xiao-Feng Li wrote:
> > Robin, good idea.
> >
> > I understand the main difference between your design and Aleksey's
> > proposal 1 is, the tracing in your design stops as vtable, but
> > Aleksey's continues to classloader. On the other hand, your approach
> > requires an extra step to sweep the vtables in order to determine the
> > classloaders' reachability.
>
> Actually there are quite a few more differences:
> - This mark phase is built into the standard GC trace, like Aleksey's
> automatic class unloading proposal.
> - This approach requires no additional fields in headers or objects
> (except maybe something to allow enumeration of vtables if this doesn't
> already exist)
> - The additional mark comes at an extremely low cost as discussed
> previously.
>
> The operation to sweep vtables is very cheap, and only needs to be done
> when you believe there are classloaders that can be unloaded, rather
> than at every GC.  You might for example trigger class unloading every
> time a new classloader is loaded.
>
> > If this is true, why not just let the tracing to continue as a
> > complete step to determine the classloaders' reachability?
>
> Because that adds a large overhead to every GC, and requires vtables and
> classloader structures to be traced at every GC.  While the numbers of
> vtables is not large, the number of pointers to them is.  The particular
> flavour of mark in my proposal is much cheaper than the standard test
> and mark operation.
>
> > Another difference is to mark the reachability with an unconditional
> > write instead of a bit mask write. I think this can be applied to
> > either approach.
>
> Not really.
>
> If you use an unconditional mark, you lose the ability to test whether
> any particular mark is the first, and therefore enqueue an object for
> scanning only once, and therefore the heap trace can never complete.
> You can only use unconditional marks to process 'leaf' objects in the heap.
>
> You can always turn a bit map into a byte map and avoid synchronized
> update, but you can't eliminate the dependent load in a standard trace
> algorithm.  The difference in performance between a load-test-write and
> a load-test-mask-write is insignificant.
>
>
> Of course a separate trace of the heap is an attractive operation - in
> MMTk, this is simple to build because the transitive closure code can
> simply be subclassed (eg the sanity checker is ~250 lines of code).
> Depending on how reusable the DRLVM heap trace code is, this may or may
> not be a good option.

Robin, thanks for all the clarifications. Now it seems clear to me and
I am convinced by this proposal. :-)

Thanks,
xiaofeng

> cheers,
> Robin
>
>
> > Thanks,
> > xiaofeng
> >
> > On 11/1/06, Robin Garner <ro...@anu.edu.au> wrote:
> >> Actually, just thinking about how I would implement this in JikesRVM, I
> >> would use the reachability based algorithm, but piggyback on the
> >> existing GC mechanisms:
> >>
> >> - Allocate a byte (or word) in each vtable for the purpose of tracking
> >> class reachability.
> >> - Periodically, at a time when no GC is running (even the most
> >> aggressive concurrent GC algorithms have these, I believe), zero this
> >> flag across all vtables.  This is the beginning of a class-unloading
> >> epoch.
> >> - During each GC, when the GC is fetching the GC map for an object,
> >> *unconditionally* write a value to the class reachability byte.  It may
> >> make sense for this byte to be in either the first cache-line of the
> >> vtable, or the cache line that points to the GC map - just make sure the
> >> mark operation doesn't in general fetch an additional cache line.
> >> - At a point in the sufficiently far future, when all reachable objects
> >> are known to have been traced by the GC, sweep the vtables and check the
> >> reachability of the classloaders.
> >>
> >> The features of this approach are:
> >>
> >> - Minimal additional work at GC time.  The additional write will cause
> >> some additional memory traffic, but a) it's to memory that is already
> >> guaranteed to be in L1 cache, and b) it's an unconditional independent
> >> write, and c) multiple writes to common classes will be absorbed by the
> >> write buffer.
> >>
> >> - Space cost of at most 1 word per vtable.
> >>
> >> - This works whether vtables are objects or VM structures
> >>
> >> - If the relationship between a class and a vtable is not 1:1, this only
> >> need affect the periodic sweep process, which should be infrequent and
> >> small compared to a GC.
> >>
> >> - shouldn't need a stop-the-world at any point.
> >>
> >> I've implemented and tested the GC-relevant part of this in JikesRVM,
> >> and the GC time overhead appears to be just under 1% in the MMTk
> >> MarkSweep collector.
> >>
> >> cheers,
> >> Robin
> >>
>
>

Re: [drlvm] Class unloading support - tested one approach

Posted by Robin Garner <ro...@anu.edu.au>.

Xiao-Feng Li wrote:
> Robin, good idea.
> 
> I understand the main difference between your design and Aleksey's
> proposal 1 is, the tracing in your design stops as vtable, but
> Aleksey's continues to classloader. On the other hand, your approach
> requires an extra step to sweep the vtables in order to determine the
> classloaders' reachability.

Actually there are quite a few more differences:
- This mark phase is built into the standard GC trace, like Aleksey's 
automatic class unloading proposal.
- This approach requires no additional fields in headers or objects 
(except maybe something to allow enumeration of vtables if this doesn't 
already exist)
- The additional mark comes at an extremely low cost as discussed 
previously.

The operation to sweep vtables is very cheap, and only needs to be done 
when you believe there are classloaders that can be unloaded, rather 
than at every GC.  You might for example trigger class unloading every 
time a new classloader is loaded.

> If this is true, why not just let the tracing to continue as a
> complete step to determine the classloaders' reachability?

Because that adds a large overhead to every GC, and requires vtables and 
classloader structures to be traced at every GC.  While the number of 
vtables is not large, the number of pointers to them is.  The particular 
flavour of mark in my proposal is much cheaper than the standard test 
and mark operation.

> Another difference is to mark the reachability with an unconditional
> write instead of a bit mask write. I think this can be applied to
> either approach.

Not really.

If you use an unconditional mark, you lose the ability to test whether 
any particular mark is the first, and with it the ability to enqueue an 
object for scanning only once, so the heap trace can never complete. 
You can only use unconditional marks to process 'leaf' objects in the heap.

You can always turn a bit map into a byte map and avoid synchronized 
update, but you can't eliminate the dependent load in a standard trace 
algorithm.  The difference in performance between a load-test-write and 
a load-test-mask-write is insignificant.
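
To make the distinction concrete, here is a small sketch over a toy object graph. It is not DRLVM or MMTk code; Node, TypeInfo and MarkDemo are hypothetical names. Ordinary objects need test-and-mark so that each is enqueued once and the trace terminates even on cycles, while the per-type reachability byte is a leaf and can be stored to unconditionally.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Stand-in for a vtable: a leaf that the trace never enqueues.
class TypeInfo {
    byte reachable;
}

// Stand-in for a heap object with outgoing references.
class Node {
    final TypeInfo type;
    final List<Node> refs = new ArrayList<>();
    boolean marked;

    Node(TypeInfo type) {
        this.type = type;
    }
}

class MarkDemo {
    // Traces from root and returns how many objects were scanned.
    static int trace(Node root) {
        Deque<Node> work = new ArrayDeque<>();
        int scanned = 0;
        if (!root.marked) {
            root.marked = true;
            work.push(root);
        }
        while (!work.isEmpty()) {
            Node n = work.pop();
            scanned++;
            // Unconditional store: no load, no test. Safe because the
            // type record is a leaf and is never put on the work queue.
            n.type.reachable = 1;
            for (Node child : n.refs) {
                // Test-and-mark: the dependent load is what guarantees
                // each object is enqueued exactly once.
                if (!child.marked) {
                    child.marked = true;
                    work.push(child);
                }
            }
        }
        return scanned;
    }
}
```

Dropping the `if (!child.marked)` test would make the loop run forever on a cyclic graph, which is the non-termination described above.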

Of course a separate trace of the heap is an attractive operation - in 
MMTk, this is simple to build because the transitive closure code can 
simply be subclassed (e.g. the sanity checker is ~250 lines of code). 
Depending on how reusable the DRLVM heap trace code is, this may or may 
not be a good option.

cheers,
Robin

> Thanks,
> xiaofeng
> 
> On 11/1/06, Robin Garner <ro...@anu.edu.au> wrote:
>> Actually, just thinking about how I would implement this in JikesRVM, I
>> would use the reachability based algorithm, but piggyback on the
>> existing GC mechanisms:
>>
>> - Allocate a byte (or word) in each vtable for the purpose of tracking
>> class reachability.
>> - Periodically, at a time when no GC is running (even the most
>> aggressive concurrent GC algorithms have these, I believe), zero this
>> flag across all vtables.  This is the beginning of a class-unloading 
>> epoch.
>> - During each GC, when the GC is fetching the GC map for an object,
>> *unconditionally* write a value to the class reachability byte.  It may
>> make sense for this byte to be in either the first cache-line of the
>> vtable, or the cache line that points to the GC map - just make sure the
>> mark operation doesn't in general fetch an additional cache line.
>> - At a point in the sufficiently far future, when all reachable objects
>> are known to have been traced by the GC, sweep the vtables and check the
>> reachability of the classloaders.
>>
>> The features of this approach are:
>>
>> - Minimal additional work at GC time.  The additional write will cause
>> some additional memory traffic, but a) it's to memory that is already
>> guaranteed to be in L1 cache, and b) it's an unconditional independent
>> write, and c) multiple writes to common classes will be absorbed by the
>> write buffer.
>>
>> - Space cost of at most 1 word per vtable.
>>
>> - This works whether vtables are objects or VM structures
>>
>> - If the relationship between a class and a vtable is not 1:1, this only
>> need affect the periodic sweep process, which should be infrequent and
>> small compared to a GC.
>>
>> - shouldn't need a stop-the-world at any point.
>>
>> I've implemented and tested the GC-relevant part of this in JikesRVM,
>> and the GC time overhead appears to be just under 1% in the MMTk
>> MarkSweep collector.
>>
>> cheers,
>> Robin
>>

Re: [drlvm] Class unloading support - tested one approach

Posted by Xiao-Feng Li <xi...@gmail.com>.

Robin, good idea.

I understand the main difference between your design and Aleksey's
proposal 1 is that the tracing in your design stops at the vtable, but
Aleksey's continues to the classloader. On the other hand, your approach
requires an extra step to sweep the vtables in order to determine the
classloaders' reachability.

If this is true, why not just let the tracing continue as a
complete step to determine the classloaders' reachability?

Another difference is to mark the reachability with an unconditional
write instead of a bit mask write. I think this can be applied to
either approach.

Thanks,
xiaofeng

On 11/1/06, Robin Garner <ro...@anu.edu.au> wrote:
> Actually, just thinking about how I would implement this in JikesRVM, I
> would use the reachability based algorithm, but piggyback on the
> existing GC mechanisms:
>
> - Allocate a byte (or word) in each vtable for the purpose of tracking
> class reachability.
> - Periodically, at a time when no GC is running (even the most
> aggressive concurrent GC algorithms have these, I believe), zero this
> flag across all vtables.  This is the beginning of a class-unloading epoch.
> - During each GC, when the GC is fetching the GC map for an object,
> *unconditionally* write a value to the class reachability byte.  It may
> make sense for this byte to be in either the first cache-line of the
> vtable, or the cache line that points to the GC map - just make sure the
> mark operation doesn't in general fetch an additional cache line.
> - At a point in the sufficiently far future, when all reachable objects
> are known to have been traced by the GC, sweep the vtables and check the
> reachability of the classloaders.
>
> The features of this approach are:
>
> - Minimal additional work at GC time.  The additional write will cause
> some additional memory traffic, but a) it's to memory that is already
> guaranteed to be in L1 cache, and b) it's an unconditional independent
> write, and c) multiple writes to common classes will be absorbed by the
> write buffer.
>
> - Space cost of at most 1 word per vtable.
>
> - This works whether vtables are objects or VM structures
>
> - If the relationship between a class and a vtable is not 1:1, this only
> need affect the periodic sweep process, which should be infrequent and
> small compared to a GC.
>
> - shouldn't need a stop-the-world at any point.
>
> I've implemented and tested the GC-relevant part of this in JikesRVM,
> and the GC time overhead appears to be just under 1% in the MMTk
> MarkSweep collector.
>
> cheers,
> Robin
>

Re: [drlvm] Class unloading support - tested one approach

Posted by Etienne Gagnon <eg...@sablevm.org>.

Robin Garner wrote:
> - Allocate a byte (or word) in each vtable for the purpose of tracking
> class reachability.

Yep, there's no reason to keep the bits (or words, for performance) in
the class loader, even in the approach I've proposed.  They could be
moved to the vtable.

Etienne

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm] Class unloading support - tested one approach

Posted by Robin Garner <ro...@anu.edu.au>.

Actually, just thinking about how I would implement this in JikesRVM, I 
would use the reachability based algorithm, but piggyback on the 
existing GC mechanisms:

- Allocate a byte (or word) in each vtable for the purpose of tracking 
class reachability.
- Periodically, at a time when no GC is running (even the most 
aggressive concurrent GC algorithms have these, I believe), zero this 
flag across all vtables.  This is the beginning of a class-unloading epoch.
- During each GC, when the GC is fetching the GC map for an object, 
*unconditionally* write a value to the class reachability byte.  It may 
make sense for this byte to be in either the first cache-line of the 
vtable, or the cache line that points to the GC map - just make sure the 
mark operation doesn't in general fetch an additional cache line.
- At a point in the sufficiently far future, when all reachable objects 
are known to have been traced by the GC, sweep the vtables and check the 
reachability of the classloaders.
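
The steps above can be sketched as follows. This is not the JikesRVM change mentioned below, just an illustration in plain Java; VTableRecord and UnloadEpoch are hypothetical names, and the hook into the collector is reduced to a single method call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical vtable record carrying the extra reachability byte.
class VTableRecord {
    final String className;
    byte reachable;

    VTableRecord(String className) {
        this.className = className;
    }
}

class UnloadEpoch {
    private final List<VTableRecord> vtables = new ArrayList<>();

    VTableRecord newVTable(String className) {
        VTableRecord vt = new VTableRecord(className);
        vtables.add(vt);
        return vt;
    }

    // Start of an epoch: run while no GC is in progress.
    void beginEpoch() {
        for (VTableRecord vt : vtables) {
            vt.reachable = 0;
        }
    }

    // Called by the collector at the point where it fetches the GC map
    // for an object. The store is unconditional: no load, no test.
    static void onScanObject(VTableRecord vt) {
        vt.reachable = 1;
    }

    // End of an epoch: run once a full trace has completed since
    // beginEpoch(). Returns classes with no instance seen by the GC.
    List<String> unmarkedClasses() {
        List<String> result = new ArrayList<>();
        for (VTableRecord vt : vtables) {
            if (vt.reachable == 0) {
                result.add(vt.className);
            }
        }
        return result;
    }
}
```

The list returned by unmarkedClasses() is only an input to the classloader check; a loader is a candidate for unloading when all of its classes appear in it.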

The features of this approach are:

- Minimal additional work at GC time.  The additional write will cause 
some additional memory traffic, but a) it's to memory that is already 
guaranteed to be in L1 cache, and b) it's an unconditional independent 
write, and c) multiple writes to common classes will be absorbed by the 
write buffer.

- Space cost of at most 1 word per vtable.

- This works whether vtables are objects or VM structures.

- If the relationship between a class and a vtable is not 1:1, this only 
need affect the periodic sweep process, which should be infrequent and 
small compared to a GC.

- Shouldn't need a stop-the-world pause at any point.

I've implemented and tested the GC-relevant part of this in JikesRVM, 
and the GC time overhead appears to be just under 1% in the MMTk 
MarkSweep collector.

cheers,
Robin

Re: [drlvm] Class unloading support

Posted by Robin Garner <ro...@anu.edu.au>.

Ivan Volosyuk wrote:
> Robin, thank you for this information. I want to ask a few questions
> to check that I understand you correctly.
> 
> On 10/31/06, Robin Garner <ro...@anu.edu.au> wrote:
>> MMTk implements several algorithms for retaining the reachable objects
>> in a graph and recycling space used by unreachable ones.  It relies on
>> the host VM to provide a set of roots.  It supports several different
>> semantics of 'weak' references, including but not confined to those
>> required by Java.
>>
>> If you can implement class unloading using those (which the current
>> proposal does), then MMTk can help.
>>
>> If you want to put a pointer to the j.l.Class in the object header, MMTk
>> will not care, as it has no way of knowing.  If you put an additional
>> pointer into the body of every object, then MMTk will see it as just
>> another object to scan.
> 
> Does this mean that MMTk will not work with VM in which VTable pointer
> (a pointer in object header) points to other heap object?

If the GC map for the object includes this pointer, MMTk will trace it, 
otherwise not.    MMTk's view of an object is abstracted through the 
implementation-specific ObjectModel interface, which provides isolation 
from the implementation details.  When I talk about 'object header', 
more precisely I'm talking about the fields that MMTk doesn't see, since 
MMTk has no real concept of an object header.

In JikesRVM, the TIB is actually an Object[] that lives in the heap - we 
don't trace TIBs from objects, but (AFAIR) via roots from the VM.  If 
you want to trace them during GC, just give MMTk GC maps that include 
them, and it will.  The invariant is simply that the ObjectModel must be 
able to understand the vtables.

>>
>> Remember MMTk is a memory manager, not a Java VM!
>>
>>
>> Conversely, supporting some exotic class unloading mechanism in MMTk
>> shouldn't be hard and wouldn't deter me from trying it out.  If (as a
>> wild idea) you wanted to periodically scan the heap, and count all
>> references to each classloader, you could implement this with very
>> little work as a TraceLocal object, and then extend the GC plan you
>> wanted with an additional GC phase that would periodically do one of
>> these scans after a major GC (for example).
> 
> This looks similar to approach #2 discussed here, agree?
> 

If what you mean is Aleksey's 'Mark and scan' proposal, yes, that sounds 
right.  I'm not advocating it as 'the solution' because I don't know 
what's best here, just saying that implementing it in MMTk wouldn't 
necessarily be hard.

cheers

Re: [drlvm] Class unloading support

Posted by Ivan Volosyuk <iv...@gmail.com>.

Robin, thank you for this information. I want to ask a few questions
to check that I understand you correctly.

On 10/31/06, Robin Garner <ro...@anu.edu.au> wrote:
> MMTk implements several algorithms for retaining the reachable objects
> in a graph and recycling space used by unreachable ones.  It relies on
> the host VM to provide a set of roots.  It supports several different
> semantics of 'weak' references, including but not confined to those
> required by Java.
>
> If you can implement class unloading using those (which the current
> proposal does), then MMTk can help.
>
> If you want to put a pointer to the j.l.Class in the object header, MMTk
> will not care, as it has no way of knowing.  If you put an additional
> pointer into the body of every object, then MMTk will see it as just
> another object to scan.

Does this mean that MMTk will not work with a VM in which the VTable pointer
(a pointer in the object header) points to another heap object?

>
> Remember MMTk is a memory manager, not a Java VM!
>
>
> Conversely, supporting some exotic class unloading mechanism in MMTk
> shouldn't be hard and wouldn't deter me from trying it out.  If (as a
> wild idea) you wanted to periodically scan the heap, and count all
> references to each classloader, you could implement this with very
> little work as a TraceLocal object, and then extend the GC plan you
> wanted with an additional GC phase that would periodically do one of
> these scans after a major GC (for example).

This looks similar to approach #2 discussed here, agree?

-- 
Ivan
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Robin Garner <ro...@anu.edu.au>.

Weldon Washburn wrote:
> On 10/27/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>>
>>
>>
>> Weldon Washburn wrote:
>> > Steve Blackburn was in Portland Oregon today.  I mentioned the idea of
>> > adding a reference pointer from object to its j.l.Class instance.  MMTk
>> > was not designed with this idea in mind.  It looks like you will need to
>> > fix this part of MMTk and maintain it yourself.  Steve did not seem
>> > thrilled at adding this support to MMTk code base.

Actually I think the answer may have been a little garbled along the way 
here: MMTk is not a memory manager *for* Java, it is simply a memory 
manager.  We have been careful to eliminate language-specific features 
in the heap that it manages.  MMTk has been used to manage C# (in the 
Rotor VM) and was being incorporated into a Haskell runtime until I ran 
out of time.

Therefore, MMTk knows nothing about the concept of class unloading, or 
java.lang.Class.

>> How does MMTk support class unloading then?
> 
> 
> MMTk has no special support for class unloading.  This may have something
> to do with the entire system being written in Java, so class unloading comes
> along for free.  If there needs to be a modification to support special-case
> objects in DRLVM, someone will need to fix up MMTk and provide ongoing
> support of this patch in Harmony.  I have zero idea how big this effort
> would be.  It would also be good to hear what the impact will be on GCV5.

MMTk implements several algorithms for retaining the reachable objects 
in a graph and recycling space used by unreachable ones.  It relies on 
the host VM to provide a set of roots.  It supports several different 
semantics of 'weak' references, including but not confined to those 
required by Java.

If you can implement class unloading using those (which the current 
proposal does), then MMTk can help.

If you want to put a pointer to the j.l.Class in the object header, MMTk 
will not care, as it has no way of knowing.  If you put an additional 
pointer into the body of every object, then MMTk will see it as just 
another object to scan.

Remember MMTk is a memory manager, not a Java VM!

Conversely, supporting some exotic class unloading mechanism in MMTk 
shouldn't be hard and wouldn't deter me from trying it out.  If (as a 
wild idea) you wanted to periodically scan the heap, and count all 
references to each classloader, you could implement this with very 
little work as a TraceLocal object, and then extend the GC plan you 
wanted with an additional GC phase that would periodically do one of 
these scans after a major GC (for example).
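
As a rough picture of what such a counting scan computes, here is a sketch over a toy heap graph. It deliberately does not use MMTk's TraceLocal or any other MMTk API; HeapObj and LoaderRefCounter are hypothetical names, and a real implementation would walk the collector's own object model instead.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy heap object: a name, a flag saying whether it is a classloader,
// and its outgoing references.
class HeapObj {
    final String name;
    final boolean isLoader;
    final List<HeapObj> refs = new ArrayList<>();

    HeapObj(String name, boolean isLoader) {
        this.name = name;
        this.isLoader = isLoader;
    }
}

class LoaderRefCounter {
    // Counts references to each classloader from objects reachable
    // from the given roots. Loaders with no entry were never referenced.
    static Map<String, Integer> count(List<HeapObj> roots) {
        Map<String, Integer> counts = new HashMap<>();
        Set<HeapObj> visited =
                Collections.newSetFromMap(new IdentityHashMap<HeapObj, Boolean>());
        Deque<HeapObj> work = new ArrayDeque<>();
        for (HeapObj r : roots) {
            if (visited.add(r)) {
                work.push(r);
            }
        }
        while (!work.isEmpty()) {
            HeapObj o = work.pop();
            for (HeapObj target : o.refs) {
                if (target.isLoader) {
                    counts.merge(target.name, 1, Integer::sum);
                }
                if (visited.add(target)) {
                    work.push(target);
                }
            }
        }
        return counts;
    }
}
```

A loader that ends up with no entry in the map has no references from the live heap and would be a candidate for unloading under this scheme.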

cheers

Re: [drlvm] Class unloading support

Posted by Weldon Washburn <we...@gmail.com>.

On 10/27/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>
>
>
> Weldon Washburn wrote:
> > Steve Blackburn was in Portland Oregon today.  I mentioned the idea of
> > adding a reference pointer from object to its j.l.Class instance.  MMTk
> > was not designed with this idea in mind.  It looks like you will need to
> > fix this part of MMTk and maintain it yourself.  Steve did not seem
> > thrilled at adding this support to MMTk code base.
>
> How does MMTk support class unloading then?


MMTk has no special support for class unloading.  This may have something to
do with the entire system being written in Java, so class unloading comes
along for free.  If there needs to be a modification to support special-case
objects in DRLVM, someone will need to fix up MMTk and provide ongoing
support of this patch in Harmony.  I have zero idea how big this effort
would be.  It would also be good to hear what the impact will be on GCV5.

If the immediate goal for svn HEAD is stability, changing something as
fundamental as object layout may not be a good idea.

On the other hand, it's good to provide an environment where folks can
publicly experiment on things like object layout and different class
unloading schemes without dragging along GCV5, MMTk, etc.  Maybe a branch
makes sense?



>
> > Have we looked at other class unloading designs?  From what I have read
> > in open literature on object layout, I don't recall any special fields to
> > support class unloading.
> >
> >
> > On 10/26/06, Rana Dasgupta <rd...@gmail.com> wrote:
> >>
> >> Aleksey,
> >>   I had a couple of questions.
> >>   You state that DRLVM does not implement the class unloading
> >> optimization,
> >> and this may create memory pressure on some applications that load many
> >> classes. Do we have a real case / example where an application is stuck
> >> for
> >> insufficient memory because it uses a lot of classes initially and then
> >> stops using them, but these are not unloaded? One can imagine a web
> >> browser
> >> doing something like this. Is a web browser a typical use case for the
> >> Harmony JVM?
> >>
> >> Regarding your engineering choices, choice 2 seems nicer, but again I
> >> have
> >> some questions.
> >>
> >> 1. In the class registry, is the reference from the j.l.class instance
> to
> >> the j.l.CL instance a weak reference and the reverse not a weak
> reference?
> >> 2. I am missing something about the java vtable object. Is it  a first
> >> class
> >> java object with its own java class? In this case the vtable object
> would
> >> have its own vtable which is a java object, but that also would have a
> >> vtable and so on...??? In other words if every java object has a
> vtable,
> >> which is a also a java object.......
> >> 3. If I am misunderstanding the above(  I hope ), the vtable objects
> >> would
> >> need to be pinned to avoid patching virtual calls after GC, efficient
> >> dispatching etc. Does this not put a requirement on compatible GC's to
> be
> >> able to deal with pinned objects?
> >> 4. Why cannot one have a j.l.class reference in the object header, as
> >> Weldon
> >> mentions, instead of this new vtable java type? Is the performance
> impact
> >> known and do we understand it as compared to heap pressure due to the
> new
> >> vtable object?
> >>
> >> Thanks,
> >> Rana
> >>
> >>
> >>
> >>
> >> > On 10/24/06, Aleksey Ignatenko <al...@gmail.com> wrote:
> >> > >
> >> > > Egor,
> >> > > >But it has 1 more "cons" -- JIT should change it's devirtualizer
> >> > > >accordingly to the VTable change. Doable, of course.
> >> > > There is no need to change struct VTable structure - it could be
> >> simply
> >> > > inlined in pinned VTable object + 1 additional reference field to
> >> > > j.l.Class.
> >> > > So there won't be too much work to do on JIT side.
> >> > >
> >> > > >BTW, is it reasonable to "compress" or "enumerate" references to
> >> > > >j.l.Class in each object to reduce the footprint? How many classes
> >> are
> >> > > >alive in heavy-duty applications? not very much probably.
> >> > > We are to trace j.l.Class from every object via VTable to detect if
> >> > > there is
> >> > > any live object of that j.l.Class. This one of requirements of
> class
> >> > > unloading.
> >> > > As for footprint - there is already pointer to struct VTable in
> every
> >> > > object, so changing this pointer to reference to VTable Object will
> >> have
> >> > > no
> >> > > effect on footprint. Compressed VTable pointers will be changed to
> >> > > compressed references. The only effect is that VTable object is a
> >> full
> >> > > Java
> >> > > object and in its turn it is to have own VTable, so number of
> VTable
> >> > > objects
> >> > > will increase for every class. As Vtable is a small object
> footprint
> >> > > will
> >> > > increase only for tens of bytes for every loaded class, and as I
> >> know,
> >> > > there
> >> > > are loaded several thousands classes on Eclipse startup, therefore
> >> > > footprint
> >> > > increase is negligible.
> >> > >
> >> > > Aleksey Ignatenko,
> >> > > Intel Enterprise Solutions Software Division
> >> >
> >> > .
> >>
> >>
> >
> >
>



-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.


Weldon Washburn wrote:
> Steve Blackburn was in Portland Oregon today.  I mentioned the idea of
> adding a  reference pointer from object to its j.l.Class instance.  MMTk 
> was
> not designed with this idea in mind.  It looks like you will need to fix
> this part of MMTk and maintain it yourself.  Steve did not seem thrilled at
> adding this support to MMTk code base.

How does MMTk support class unloading then?

> 
> Have we looked at other class unloading designs?  From what I have read in
> open literature on object layout, I don't recall any special fields to
> support class unloading.
> 
> 
> On 10/26/06, Rana Dasgupta <rd...@gmail.com> wrote:
>>
>> Aleksey,
>>   I had a couple of questions.
>>   You state that DRLVM does not implement the class unloading
>> optimization,
>> and this may create memory pressure on some applications that load many
>> classes. Do we have a real case / example where an application is stuck
>> for
>> insufficient memory because it uses a lot of classes initially and then
>> stops using them, but these are not unloaded? One can imagine a web
>> browser
>> doing something like this. Is a web browser a typical use case for the
>> Harmony JVM?
>>
>> Regarding your engineering choices, choice 2 seems nicer, but again I 
>> have
>> some questions.
>>
>> 1. In the class registry, is the reference from the j.l.class instance to
>> the j.l.CL instance a weak reference and the reverse not a weak reference?
>> 2. I am missing something about the java vtable object. Is it  a first
>> class
>> java object with its own java class? In this case the vtable object would
>> have its own vtable which is a java object, but that also would have a
>> vtable and so on...??? In other words if every java object has a vtable,
>> which is a also a java object.......
>> 3. If I am misunderstanding the above(  I hope ), the vtable objects 
>> would
>> need to be pinned to avoid patching virtual calls after GC, efficient
>> dispatching etc. Does this not put a requirement on compatible GC's to be
>> able to deal with pinned objects?
>> 4. Why cannot one have a j.l.class reference in the object header, as
>> Weldon
>> mentions, instead of this new vtable java type? Is the performance impact
>> known and do we understand it as compared to heap pressure due to the new
>> vtable object?
>>
>> Thanks,
>> Rana
>>
>>
>>
>>
>> > On 10/24/06, Aleksey Ignatenko <al...@gmail.com> wrote:
>> > >
>> > > Egor,
>> > > >But it has 1 more "cons" -- JIT should change it's devirtualizer
>> > > >accordingly to the VTable change. Doable, of course.
>> > > There is no need to change struct VTable structure - it could be
>> simply
>> > > inlined in pinned VTable object + 1 additional reference field to
>> > > j.l.Class.
>> > > So there won't be too much work to do on JIT side.
>> > >
>> > > >BTW, is it reasonable to "compress" or "enumerate" references to
>> > > >j.l.Class in each object to reduce the footprint? How many classes
>> are
>> > > >alive in heavy-duty applications? not very much probably.
>> > > We are to trace j.l.Class from every object via VTable to detect if
>> > > there is
>> > > any live object of that j.l.Class. This one of requirements of class
>> > > unloading.
>> > > As for footprint - there is already pointer to struct VTable in every
>> > > object, so changing this pointer to reference to VTable Object will
>> have
>> > > no
>> > > effect on footprint. Compressed VTable pointers will be changed to
>> > > compressed references. The only effect is that VTable object is a 
>> full
>> > > Java
>> > > object and in its turn it is to have own VTable, so number of VTable
>> > > objects
>> > > will increase for every class. As Vtable is a small object footprint
>> > > will
>> > > increase only for tens of bytes for every loaded class, and as I
>> know,
>> > > there
>> > > are loaded several thousands classes on Eclipse startup, therefore
>> > > footprint
>> > > increase is negligible.
>> > >
>> > > Aleksey Ignatenko,
>> > > Intel Enterprise Solutions Software Division
>> >
>> > .
>>
>>
> 
>

Re: [drlvm] Class unloading support

Posted by Pavel Pervov <pm...@gmail.com>.

I think I know where the idea of additional reference to j/l/Class in every
object came from.

Let's look at the original proposal:

"...

*Automatic class unloading approach.*
...
To do that we need to provide two conditions:

  1. Introduce reference from object to its j.l.Class instance.
..."

Let's rephrase the latter the following way:

"1. Make j.l.Class reachable from all objects of that class."

As everybody agrees that adding one more pointer to every object's overhead
is not a good idea, the only way is to make j.l.Class reachable through
available data - in other words through VTable, which becomes a regular
Java object.

That is it.

Summary: no direct reference to j.l.Class in objects of that class, no
additional object overhead, no MMTk incompatibility.
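
The rephrased condition can be illustrated with a toy mark phase. This is
only a sketch, not DRLVM or MMTk code: Node, mark, loader_survives and demo
are made-up names, and a real collector would not mark recursively. It
assumes the chain discussed in this thread (instance -> VTable object ->
j.l.Class -> class loader) and shows that the loader stays reachable exactly
as long as some instance is reachable.

```cpp
#include <cassert>
#include <vector>

// Toy heap node: any managed object (instance, VTable object, j.l.Class,
// j.l.ClassLoader) is modelled as a node with outgoing references.
struct Node {
    std::vector<Node*> refs;
    bool marked = false;
};

// Plain recursive mark; good enough for a four-node illustration.
void mark(Node* n) {
    if (n == nullptr || n->marked) return;
    n->marked = true;
    for (Node* r : n->refs) mark(r);
}

// Builds instance -> vtable -> class -> loader and reports whether the
// loader is marked after tracing from the instance (if it is a root).
bool loader_survives(bool instance_is_rooted) {
    Node loader, klass, vtable, instance;
    klass.refs.push_back(&loader);     // j.l.Class references its loader
    vtable.refs.push_back(&klass);     // VTable object references j.l.Class
    instance.refs.push_back(&vtable);  // every object references its VTable
    if (instance_is_rooted) mark(&instance);
    return loader.marked;
}

// Usage: a live instance keeps the loader alive; no live instance, no mark.
void demo() {
    assert(loader_survives(true));
    assert(!loader_survives(false));
}
```

Note that nothing extra is stored per instance here: the only edge leaving
the instance is the VTable slot that every object already has.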

On 10/27/06, Weldon Washburn <we...@gmail.com> wrote:
>
> Steve Blackburn was in Portland Oregon today.  I mentioned the idea of
> adding a  reference pointer from object to its j.l.Class instance.  MMTk
> was
> not designed with this idea in mind.  It looks like you will need to fix
> this part of MMTk and maintain it yourself.  Steve did not seem thrilled
> at
> adding this support to MMTk code base.

<SNIP>

-- 
Pavel Pervov,
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Weldon Washburn <we...@gmail.com>.

Steve Blackburn was in Portland Oregon today.  I mentioned the idea of
adding a  reference pointer from object to its j.l.Class instance.  MMTk was
not designed with this idea in mind.  It looks like you will need to fix
this part of MMTk and maintain it yourself.  Steve did not seem thrilled at
adding this support to MMTk code base.

Have we looked at other class unloading designs?  From what I have read in
open literature on object layout, I don't recall any special fields to
support class unloading.


On 10/26/06, Rana Dasgupta <rd...@gmail.com> wrote:
>
> Aleksey,
>   I had a couple of questions.
>   You state that DRLVM does not implement the class unloading
> optimization,
> and this may create memory pressure on some applications that load many
> classes. Do we have a real case / example where an application is stuck
> for
> insufficient memory because it uses a lot of classes initially and then
> stops using them, but these are not unloaded? One can imagine a web
> browser
> doing something like this. Is a web browser a typical use case for the
> Harmony JVM?
>
> Regarding your engineering choices, choice 2 seems nicer, but again I have
> some questions.
>
> 1. In the class registry, is the reference from the j.l.class instance to
> the j.l.CL instance a weak reference and the reverse not a weak reference?
> 2. I am missing something about the java vtable object. Is it  a first
> class
> java object with its own java class? In this case the vtable object would
> have its own vtable which is a java object, but that also would have a
> vtable and so on...??? In other words if every java object has a vtable,
> which is a also a java object.......
> 3. If I am misunderstanding the above(  I hope ), the vtable objects would
> need to be pinned to avoid patching virtual calls after GC, efficient
> dispatching etc. Does this not put a requirement on compatible GC's to be
> able to deal with pinned objects?
> 4. Why cannot one have a j.l.class reference in the object header, as
> Weldon
> mentions, instead of this new vtable java type? Is the performance impact
> known and do we understand it as compared to heap pressure due to the new
> vtable object?
>
> Thanks,
> Rana
>
>
>
>
> > On 10/24/06, Aleksey Ignatenko <al...@gmail.com> wrote:
> > >
> > > Egor,
> > > >But it has 1 more "cons" -- JIT should change it's devirtualizer
> > > >accordingly to the VTable change. Doable, of course.
> > > There is no need to change struct VTable structure - it could be
> simply
> > > inlined in pinned VTable object + 1 additional reference field to
> > > j.l.Class.
> > > So there won't be too much work to do on JIT side.
> > >
> > > >BTW, is it reasonable to "compress" or "enumerate" references to
> > > >j.l.Class in each object to reduce the footprint? How many classes
> are
> > > >alive in heavy-duty applications? not very much probably.
> > > We are to trace j.l.Class from every object via VTable to detect if
> > > there is
> > > any live object of that j.l.Class. This one of requirements of class
> > > unloading.
> > > As for footprint - there is already pointer to struct VTable in every
> > > object, so changing this pointer to reference to VTable Object will
> have
> > > no
> > > effect on footprint. Compressed VTable pointers will be changed to
> > > compressed references. The only effect is that VTable object is a full
> > > Java
> > > object and in its turn it is to have own VTable, so number of VTable
> > > objects
> > > will increase for every class. As Vtable is a small object footprint
> > > will
> > > increase only for tens of bytes for every loaded class, and as I know,
> > > there
> > > are loaded several thousands classes on Eclipse startup, therefore
> > > footprint
> > > increase is negligible.
> > >
> > > Aleksey Ignatenko,
> > > Intel Enterprise Solutions Software Division
> >
> > .
>
>


-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Gregory Shimansky <gs...@gmail.com>.

On Saturday 28 October 2006 03:47 Xiao-Feng Li wrote:
> All, I think the problem now is mainly about the class unloading
> design not about whether class unloading happens in server
> environment.

Yes, the problem is easily reproducible in a client environment; Eclipse is
an example. It is not an Eclipse bug because the Java spec doesn't limit an
application in creating its own class loader instances.

> Class unloading is definitely a feature required in future; but with
> the significance of the required modifications in JVM by this class
> unloading design 2 (using Java object for Vtable), it is probably
> safer to move this work into a branch at the moment until all other
> components are ready for it, and after we have thorough evaluation on
> it since there are still issues to be resolved or discussed.
> 
> Or we can keep it in JIRA and keep the discussion and evaluation going
> on before we decide to support the special design (Java Vtable) in
> other components.
>
> How about it?

I haven't seen any patch proposal, just the design. It seems like everyone 
agreed with automatic collection (#2) which I also agree with. No code was 
presented to evaluate or test so far. I am +1 to maintain a patch in JIRA, 
I'll help to test it myself.

-- 
Gregory Shimansky, Intel Middleware Products Division

Re: [drlvm] Class unloading support

Posted by Aleksey Ignatenko <al...@gmail.com>.

Hi, everyone.

I'd like to summarize the discussion about unloading here. Almost everyone
agreed that class unloading is very important for the Harmony project and
that this work should be continued.

Three class unloading designs were discussed. Of the two initially proposed
designs, automatic class unloading was chosen as the best. The latest
discussion seems to lead to a possible third proposal, a reference counting
design. The best design is going to be chosen by the community on the basis
of performance, complexity, unification and some other criteria.

The discussion brought up one very important additional criterion for a
class unloading design: the modification effort needed to implement it.
This matters a great deal because it affects gc_cc, gc_gen, gcv4, mmtk
and possibly some future developments.

I've put an automatic class unloading implementation into JIRA (
issues.apache.org/jira/browse/HARMONY-2000) (see limitations: IA32 and gcv4;
gc_cc is still in progress).

The patch realizes two main ideas of the class unloading feature:

   1. Native class resources cleanup (which is common to any class
   unloading design).
   2. An implementation of the automatic class unloading mechanism.

*Native class resources cleanup.*

This cleanup is done when a j.l.Classloader has been unloaded. Class-related
structures and JIT code are to be cleaned up. There are also commonly used
collections (like Method_lookup_table), which should be updated to reflect
the changed number of classes (to avoid crashes and other unpredictable
behavior).
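
As a rough illustration of updating such a shared collection, here is a
sketch that drops every entry owned by an unloaded class loader. It is not
the real Method_lookup_table: ToyMethodInfo, ToyMethodLookupTable,
purge_loader and the integer loader_id are all hypothetical names chosen
for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical entry: which loader owns the method compiled at an address.
struct ToyMethodInfo {
    int loader_id;
    const char* name;
};

// Hypothetical table keyed by the start address of the compiled code.
using ToyMethodLookupTable = std::map<std::uintptr_t, ToyMethodInfo>;

// Removes all entries that belong to the unloaded loader and returns how
// many were removed, so stale code addresses can no longer be looked up.
std::size_t purge_loader(ToyMethodLookupTable& table, int loader_id) {
    std::size_t removed = 0;
    for (auto it = table.begin(); it != table.end();) {
        if (it->second.loader_id == loader_id) {
            it = table.erase(it);  // erase returns the next valid iterator
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}

// Usage: two methods of loader 1 disappear, the method of loader 2 stays.
void demo_purge() {
    ToyMethodLookupTable table{
        {0x1000, {1, "A.foo"}}, {0x2000, {2, "B.bar"}}, {0x3000, {1, "A.baz"}}};
    assert(purge_loader(table, 1) == 2);
    assert(table.size() == 1 && table.count(0x2000) == 1);
}
```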

I want to mention one very important cleanup that is implemented: JIT code.
See the mem_alloc.h/cpp files for the code pool which is used for code and
stub memory allocations. This is an archaic static pool which uses one lock
to synchronize all allocations. It does not allow freeing the code memory of
a particular unloaded class loader. The patch contains a new code pool
attached to the class loader; all memory allocations are done inside the
class loader's pool, so it can simply be destroyed. In addition, some
optimization of memory allocation in that pool was done.
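
The idea of a pool owned by the class loader can be sketched as a simple
bump allocator. This is an assumption-laden illustration, not the code from
the patch: CodePool, ToyClassLoader, the 64 KB chunk size and the 16-byte
rounding are invented here, and ordinary heap memory stands in for the
executable memory a real JIT code pool would need.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical per-loader pool: memory is carved out of chunks and is only
// released all at once, when the pool (and so its loader) is destroyed.
class CodePool {
public:
    explicit CodePool(std::size_t chunk_size = 64 * 1024)
        : chunk_size_(chunk_size) {}

    void* alloc(std::size_t size) {
        assert(size > 0);
        // Round the offset inside the current chunk up to 16 bytes.
        std::size_t start = (used_ + 15) & ~static_cast<std::size_t>(15);
        if (chunks_.empty() || start + size > current_size_) {
            // Start a new chunk; oversized requests get their own chunk.
            current_size_ = size > chunk_size_ ? size : chunk_size_;
            chunks_.push_back(std::make_unique<unsigned char[]>(current_size_));
            start = 0;
        }
        used_ = start + size;
        return chunks_.back().get() + start;
    }

    std::size_t chunk_count() const { return chunks_.size(); }

private:
    std::size_t chunk_size_;
    std::size_t current_size_ = 0;  // size of the chunk being filled
    std::size_t used_ = 0;          // bytes used in that chunk
    std::vector<std::unique_ptr<unsigned char[]>> chunks_;
};

// Hypothetical loader: destroying it frees all of its code memory.
struct ToyClassLoader {
    CodePool code_pool;
};
```

Because each loader has its own pool, allocations of different loaders no
longer have to share one global lock, and unloading a loader needs no
per-method bookkeeping: dropping the ToyClassLoader releases every chunk.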

*Introduced automatic class unloading mechanism implementation.*

Struct VTable becomes a VTable object. First of all, bearing in mind that
changing the object layout (changing VTable* to a reference to a VTable
object in the object header) requires a lot of changes on both the VM and GC
sides, no changes were made to the object layout. Now the implementation
specifics: struct VTable is inlined in the VTable object, meaning that the
VTable object contains struct VTable in its body, which gives the formula:
(ManagedObject* vtObj; VTable* vt;) vt = vtObj + object_header_size(). So a
VTable object differs from struct VTable only by the object header offset,
and we can simply convert a VTable object to struct VTable and vice versa.
The only change made to struct VTable is the added ManagedObject* jlC;
field, which is mapped to the reference field pointing to the appropriate
j.l.Class in the VTable object and is automatically traced by GC. As there
are no changes in the object layout, GC has to calculate the VTable object
vtObj from the struct VTable pointer vt (vtObj = vt - object_header_size())
and trace it for every object (see the mark_scan.cpp changes).
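
The two conversions can be written out as a small sketch. The types below
are simplified stand-ins, not the real DRLVM declarations: the header
fields of ManagedObject, the VTableObject wrapper and the helper names
vtable_from_object / object_from_vtable are assumptions made for the
example. The offsets in the formulas are byte offsets, so the arithmetic is
done on char pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified object header (the real header layout differs).
struct ManagedObject {
    void* vt_raw;        // VTable pointer of this object
    std::uint32_t info;  // lock / hash bits
};

// struct VTable with the one added field described above.
struct VTable {
    ManagedObject* jlC;  // reference to the j.l.Class, traced by GC
    // ... method slots would follow here
};

// VTable object = object header followed by the inlined struct VTable.
struct VTableObject {
    ManagedObject header;
    VTable body;
};

constexpr std::size_t object_header_size() {
    return offsetof(VTableObject, body);
}

// vt = vtObj + object_header_size()
inline VTable* vtable_from_object(ManagedObject* vtObj) {
    assert(vtObj != nullptr);
    return reinterpret_cast<VTable*>(
        reinterpret_cast<char*>(vtObj) + object_header_size());
}

// vtObj = vt - object_header_size(); this is what GC would compute from the
// VTable pointer stored in every object before tracing the VTable object.
inline ManagedObject* object_from_vtable(VTable* vt) {
    assert(vt != nullptr);
    return reinterpret_cast<ManagedObject*>(
        reinterpret_cast<char*>(vt) - object_header_size());
}
```

Since the VTable objects are pinned (as discussed earlier in the thread),
the raw VTable* kept in ordinary objects stays valid across collections.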

Why might a reference to the VTable object be needed in the object header
(an object layout change)? The answer is simple: it would spare the GC from
tracing VTable objects in a special way. We can declare an additional
reference field for every class at offset zero (where the VTable pointer is
located) to let the GC know that there is a reference field at offset zero.
Then the GC will trace the VTable object automatically. I can provide the
object layout changes, but I actually prefer not to do this, or to do it
incrementally (on top of the current patch), to avoid additional
implementation complications (because it affects VM/GC + JIT and stubs).

Conclusions:

- The automatic unloading design does not require changing the object layout.
Changing the object layout could be considered as an enhancement.

- All changes in the patch are made with the requirement of minimal code
change overhead.

- 50% of the class unloading task (native class resource cleanup) is done
independently of the class unloading mechanism.

Please, ask your questions,

Aleksey.

On 30 Oct 2006 11:40:36 +0600, Egor Pasko <eg...@gmail.com> wrote:
>
> On the 0x210 day of Apache Harmony Rana Dasgupta wrote:
> > I completely agree.
> >
> > +1 for branch if Aleksey wants to experiment
>
> are there any problems with Java VTables in any of the components? JIT
> has no problems, as we discussed. GC should not suffer too, I
> guess. If we meet some problems that take long to fix (and many people
> to do it all), then it may become reasonable to make a branch. But not
> earlier. AFAIR, nobody died of a couple of interdependent JIRAs yet.
>
> So, I would suggest holding off on branching until there is a strong
> reason.
>
> >
> > On 10/27/06, Xiao-Feng Li <xi...@gmail.com> wrote:
> > >
> > > All, I think the problem now is mainly about the class unloading
> > > design not about whether class unloading happens in server
> > > environment.
> > >
> > > Class unloading is definitely a feature required in future; but with
> > > the significance of the required modifications in JVM by this class
> > > unloading design 2 (using Java object for Vtable), it is probably
> > > safer to move this work into a branch at the moment until all other
> > > components are ready for it, and after we have thorough evaluation on
> > > it since there are still issues to be resolved or discussed.
> > >
> > > Or we can keep it in JIRA and keep the discussion and evaluation going
> > > on before we decide to support the special design (Java Vtable) in
> > > other components.
> > >
> > > How about it?
> > >
> > > Thanks,
> > > xiaofeng
> > >
> > > On 10/27/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
> > > >
> > > >
> > > > Rana Dasgupta wrote:
> > > > > My knowledge in this area is limited. But my understanding was
> that
> > > web
> > > > > servers and other similar hosts recycled processes periodically as
> > > > > standard procedure, thereby tearing down all associated resources.
> > > >
> > > > Yes, but that has nothing to do with what would be happening in the
> app
> > > > server the web server talked to, if one had an architecture where a
> web
> > > > server "fronted" for the app server.
> > > >
> > > > > So
> > > > > classes loaded, but not used for a while went away anyway;
> > > >
> > > > Nope - they aren't loaded in the context of the webserver (when
> using
> > > > httpd).
> > > >
> > > > > this level of
> > > > > resource management was not really urgent. I know that IIS does
> this,
> > > I
> > > > > am not sure about httpd. I am not sure about other host
> environments.
> > > >
> > > > But a process fork model (or thread model) of a webserver has
> nothing to
> > > > do with what's going on in the VM.
> > > >
> > > > I'm talking about servlet engines and app servers like Tomcat and
> > > > Geronimo which have nothing to do with httpd.  Architecturally, they
> are
> > > > separated from the web server (unless you don't use an external
> > > > webserver, and just use the httpd connector in tomcat) and are
> separate,
> > > > independent processes.
> > > >
> > > >      httpd  <------>  Tomcat
> > > >
> > > > The java-based app servers are long running processes, running for
> weeks
> > > > or months.  We need to do clean class unloading.
> > > >
> > > > geir
> > > >
> > > >
> > > > >
> > > > >
> > > > > On 10/27/06, *Geir Magnusson Jr.* <geir@pobox.com
> > > > > <ma...@pobox.com>> wrote:
> > > > >
> > > > >
> > > > >
> > > > >     Rana Dasgupta wrote:
> > > > >      > Aleksey,
> > > > >      >   I had a couple of questions.
> > > > >      >   You state that DRLVM does not implement the class
> unloading
> > > > >     optimization,
> > > > >      > and this may create memory pressure on some applications
> that
> > > > >     load many
> > > > >      > classes. Do we have a real case / example where an
> application
> > > is
> > > > >     stuck for
> > > > >      > insufficient memory because it uses a lot of classes
> initially
> > > > >     and then
> > > > >      > stops using them, but these are not unloaded? One can
> imagine a
> > > > >     web browser
> > > > >      > doing something like this. Is a web browser a typical use
> case
> > > > >     for the
> > > > >      > Harmony JVM?
> > > > >      >
> > > > >
> > > > >     If I understand what you're asking correctly, you'll find this
> > > pattern
> > > > >     in servlet engines or J2EE servers, where deployed apps can be
> > > dumped
> > > > >     and reloaded repeatedly either during development or during
> > > production
> > > > >     deployment, w/o taking the server down.
> > > > >
> > > > >     geir
> > > > >
> > > > >
> > > >
> > >
>
> --
> Egor Pasko, Intel Managed Runtime Division
>
>

Re: [drlvm] Class unloading support

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x210 day of Apache Harmony Rana Dasgupta wrote:
> I completely agree.
> 
> +1 for branch if Aleksey wants to experiment

are there any problems with Java VTables in any of the components? JIT
has no problems, as we discussed. GC should not suffer too, I
guess. If we meet some problems that take long to fix (and many people
to do it all), then it may become reasonable to make a branch. But not
earlier. AFAIR, nobody died of a couple of interdependent JIRAs yet.

So, I would suggest holding off on branching until there is a strong
reason.

> 
> On 10/27/06, Xiao-Feng Li <xi...@gmail.com> wrote:
> >
> > All, I think the problem now is mainly about the class unloading
> > design not about whether class unloading happens in server
> > environment.
> >
> > Class unloading is definitely a feature required in future; but with
> > the significance of the required modifications in JVM by this class
> > unloading design 2 (using Java object for Vtable), it is probably
> > safer to move this work into a branch at the moment until all other
> > components are ready for it, and after we have thorough evaluation on
> > it since there are still issues to be resolved or discussed.
> >
> > Or we can keep it in JIRA and keep the discussion and evaluation going
> > on before we decide to support the special design (Java Vtable) in
> > other components.
> >
> > How about it?
> >
> > Thanks,
> > xiaofeng
> >
> > On 10/27/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
> > >
> > >
> > > Rana Dasgupta wrote:
> > > > My knowledge in this area is limited. But my understanding was that
> > web
> > > > servers and other similar hosts recycled processes periodically as
> > > > standard procedure, thereby tearing down all associated resources.
> > >
> > > Yes, but that has nothing to do with what would be happening in the app
> > > server the web server talked to, if one had an architecture where a web
> > > server "fronted" for the app server.
> > >
> > > > So
> > > > classes loaded, but not used for a while went away anyway;
> > >
> > > Nope - they aren't loaded in the context of the webserver (when using
> > > httpd).
> > >
> > > > this level of
> > > > resource management was not really urgent. I know that IIS does this,
> > I
> > > > am not sure about httpd. I am not sure about other host environments.
> > >
> > > But a process fork model (or thread model) of a webserver has nothing to
> > > do with what's going on in the VM.
> > >
> > > I'm talking about servlet engines and app servers like Tomcat and
> > > Geronimo which have nothing to do with httpd.  Architecturally, they are
> > > separated from the web server (unless you don't use an external
> > > webserver, and just use the httpd connector in tomcat) and are separate,
> > > independent processes.
> > >
> > >      httpd  <------>  Tomcat
> > >
> > > The java-based app servers are long running processes, running for weeks
> > > or months.  We need to do clean class unloading.
> > >
> > > geir
> > >
> > >
> > > >
> > > >
> > > > On 10/27/06, *Geir Magnusson Jr.* <geir@pobox.com
> > > > <ma...@pobox.com>> wrote:
> > > >
> > > >
> > > >
> > > >     Rana Dasgupta wrote:
> > > >      > Aleksey,
> > > >      >   I had a couple of questions.
> > > >      >   You state that DRLVM does not implement the class unloading
> > > >     optimization,
> > > >      > and this may create memory pressure on some applications that
> > > >     load many
> > > >      > classes. Do we have a real case / example where an application
> > is
> > > >     stuck for
> > > >      > insufficient memory because it uses a lot of classes initially
> > > >     and then
> > > >      > stops using them, but these are not unloaded? One can imagine a
> > > >     web browser
> > > >      > doing something like this. Is a web browser a typical use case
> > > >     for the
> > > >      > Harmony JVM?
> > > >      >
> > > >
> > > >     If I understand what you're asking correctly, you'll find this
> > pattern
> > > >     in servlet engines or J2EE servers, where deployed apps can be
> > dumped
> > > >     and reloaded repeatedly either during development or during
> > production
> > > >     deployment, w/o taking the server down.
> > > >
> > > >     geir
> > > >
> > > >
> > >
> >

-- 
Egor Pasko, Intel Managed Runtime Division

Re: [drlvm] Class unloading support

Posted by Rana Dasgupta <rd...@gmail.com>.

I completely agree.

+1 for branch if Aleksey wants to experiment


On 10/27/06, Xiao-Feng Li <xi...@gmail.com> wrote:
>
> All, I think the problem now is mainly about the class unloading
> design not about whether class unloading happens in server
> environment.
>
> Class unloading is definitely a feature required in future; but with
> the significance of the required modifications in JVM by this class
> unloading design 2 (using Java object for Vtable), it is probably
> safer to move this work into a branch at the moment until all other
> components are ready for it, and after we have thorough evaluation on
> it since there are still issues to be resolved or discussed.
>
> Or we can keep it in JIRA and keep the discussion and evaluation going
> on before we decide to support the special design (Java Vtable) in
> other components.
>
> How about it?
>
> Thanks,
> xiaofeng
>
> On 10/27/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
> >
> >
> > Rana Dasgupta wrote:
> > > My knowledge in this area is limited. But my understanding was that
> web
> > > servers and other similar hosts recycled processes periodically as
> > > standard procedure, thereby tearing down all associated resources.
> >
> > Yes, but that has nothing to do with what would be happening in the app
> > server the web server talked to, if one had an architecture where a web
> > server "fronted" for the app server.
> >
> > > So
> > > classes loaded, but not used for a while went away anyway;
> >
> > Nope - they aren't loaded in the context of the webserver (when using
> > httpd).
> >
> > > this level of
> > > resource management was not really urgent. I know that IIS does this,
> I
> > > am not sure about httpd. I am not sure about other host environments.
> >
> > But a process fork model (or thread model) of a webserver has nothing to
> > do with what's going on in the VM.
> >
> > I'm talking about servlet engines and app servers like Tomcat and
> > Geronimo which have nothing to do with httpd.  Architecturally, they are
> > separated from the web server (unless you don't use an external
> > webserver, and just use the httpd connector in tomcat) and are separate,
> > independent processes.
> >
> >      httpd  <------>  Tomcat
> >
> > The java-based app servers are long running processes, running for weeks
> > or months.  We need to do clean class unloading.
> >
> > geir
> >
> >
> > >
> > >
> > > On 10/27/06, *Geir Magnusson Jr.* <geir@pobox.com
> > > <ma...@pobox.com>> wrote:
> > >
> > >
> > >
> > >     Rana Dasgupta wrote:
> > >      > Aleksey,
> > >      >   I had a couple of questions.
> > >      >   You state that DRLVM does not implement the class unloading
> > >     optimization,
> > >      > and this may create memory pressure on some applications that
> > >     load many
> > >      > classes. Do we have a real case / example where an application
> is
> > >     stuck for
> > >      > insufficient memory because it uses a lot of classes initially
> > >     and then
> > >      > stops using them, but these are not unloaded? One can imagine a
> > >     web browser
> > >      > doing something like this. Is a web browser a typical use case
> > >     for the
> > >      > Harmony JVM?
> > >      >
> > >
> > >     If I understand what you're asking correctly, you'll find this
> pattern
> > >     in servlet engines or J2EE servers, where deployed apps can be
> dumped
> > >     and reloaded repeatedly either during development or during
> production
> > >     deployment, w/o taking the server down.
> > >
> > >     geir
> > >
> > >
> >
>

Re: Re: [drlvm] Class unloading support

Posted by Rana Dasgupta <rd...@gmail.com>.

The point is not that it is unimportant because it is an optimization. It is
that 1) it seems to be something that is good to have, but is not urgent
immediately, and 2) given that the nature of our best solution (java tables
etc.) is risky, we may not want to experiment with it in the main branch. We
should also study other solutions.

On 10/28/06, Alex Blewitt <al...@gmail.com> wrote:
>
>  True, but then JIT is an optimisation that isn't mandated in the JLS
> > > either :-) There are also JVMs that don't depend on a JIT, but just
> > > because it isn't mandated as a standard doesn't make it any less
> > > important to implement it.
> > >
> > > For that matter, there's nothing in the JLS that mandates how GC
> > > works. It's quite possible to have a JVM that never does any GC, and
> > > just sucks memory until it can't suck any more, and throw an
> > > OutOfMemoryException. What the JLS does say is the order in which
> > > finalise methods should be called prior to the storage being
> > > reclaimed; they don't mandate that the storage must be reclaimed.
> > >
> > > So, just because it's not mandated doesn't mean it's not important to
> > > do :-)
> > >
> > > Alex.
> > >
> >

Re: Re: [drlvm] Class unloading support

Posted by Alex Blewitt <al...@gmail.com>.

On 28/10/06, Mikhail Fursov <mi...@gmail.com> wrote:
> On 10/29/06, Rana Dasgupta <rd...@gmail.com> wrote:
> >
> > From JLS:-
> >
> > ...
> >
> > And ...
> >
> > ..
> >
> > Anyway, I don't want to belabor this point forever, and my opinion is only
> > one among many :-)
> >
> > Good point! Thanks.
> I have never thought (as Java developer) about class unloading
> like an optimization. But if it is, as you pointed, the RI behaviour
> makes developers believe that it is always on :)

True, but then JIT is an optimisation that isn't mandated in the JLS
either :-) There are also JVMs that don't depend on a JIT, but just
because it isn't mandated as a standard doesn't make it any less
important to implement it.

For that matter, there's nothing in the JLS that mandates how GC
works. It's quite possible to have a JVM that never does any GC, and
just sucks memory until it can't suck any more, and then throws an
OutOfMemoryError. What the JLS does say is the order in which
finalise methods should be called prior to the storage being
reclaimed; it doesn't mandate that the storage must be reclaimed.

So, just because it's not mandated doesn't mean it's not important to do :-)

Alex.

Re: [drlvm] Class unloading support

Posted by Mikhail Fursov <mi...@gmail.com>.

On 10/29/06, Rana Dasgupta <rd...@gmail.com> wrote:
>
> From JLS:-
>
> ...
>
> And ...
>
> ..
>
> Anyway, I don't want to belabor this point forever, and my opinion is only
> one among many :-)
>
Good point! Thanks.
I have never thought (as a Java developer) about class unloading
as an optimization. But if it is, as you pointed out, the RI behaviour
makes developers believe that it is always on :)


-- 
Mikhail Fursov

Re: [drlvm] Class unloading support

Posted by Rana Dasgupta <rd...@gmail.com>.

On 10/28/06, Mikhail Fursov <mi...@gmail.com> wrote:
>
> On 10/28/06, Rana Dasgupta <rd...@gmail.com> wrote:
> >
> > On 10/27/06, Xiao-Feng Li <xi...@gmail.com> wrote:
> > >
> > >
> > > >Yes. That's also my opinion. The _design_ of class unloading is the
> > > >focus of the discussion.
> >
> >
> >
> > I think that before doing an optimization, it is a good idea to
> understand
> > why and where it is needed, and if the usage scenario fits what the
> > software
> > is intending to do. So this discussion is not misplaced. I am not very
> > comfortable with adding a solution before we have understood the
> problem.
> > Probably the first step is to create a use case that can be used to see
> if
> > class unloading is the solution.
>
>
> Rana, I do not understand why do you call class unloading 'an
> optimization'. In this case any GC is optimization too. Class
> unloading is fundamental feature of
> Java language. I would even say that if your application uses custom
> classloader and you never thought about class unloading the design of your
> application is not complete.
> We can never claim that Harmony supports Java 1.5 or even Java 1.2 if it
> does not support unloading for classloaders. As for me, this is very
> important to have this feature and Aleksey's patch is quite a good
> beginning. At least in JIT it does not require any changes at all.

From the JLS:

"Rationale: Class unloading is an optimization that helps reduce memory use.
Obviously, the semantics of a program should not depend on whether and how a
system chooses to implement an optimization such as class unloading...".

And ...

 "Strictly speaking, it was never essential that the issue of class
unloading be discussed by the Java Language Specification, as it is an
optimization. However, it is a subtle issue, and so it was mentioned by way
of clarification. Unfortunately, misunderstandings arose, aggravated by the
class unloading behavior of JDK 1.1. This behavior was not mandated by
the Java Language Specification. Indeed, it contradicted the specification;
it was simply a bug. The bug has been fixed in JDK 1.2, and the
specification clarified to avoid such misunderstandings in the future."

Anyway, I don't want to belabor this point forever, and my opinion is only
one among many :-)

Rana

Re: [drlvm] Class unloading support

Posted by Mikhail Fursov <mi...@gmail.com>.

On 10/28/06, Rana Dasgupta <rd...@gmail.com> wrote:
>
> On 10/27/06, Xiao-Feng Li <xi...@gmail.com> wrote:
> >
> >
> > >Yes. That's also my opinion. The _design_ of class unloading is the
> > >focus of the discussion.
>
>
>
> I think that before doing an optimization, it is a good idea to understand
> why and where it is needed, and if the usage scenario fits what the
> software
> is intending to do. So this discussion is not misplaced. I am not very
> comfortable with adding a solution before we have understood the problem.
> Probably the first step is to create a use case that can be used to see if
> class unloading is the solution.

Rana, I do not understand why you call class unloading 'an
optimization'. In that case any GC is an optimization too. Class
unloading is a fundamental feature of the Java language. I would even
say that if your application uses a custom classloader and you never
thought about class unloading, the design of your application is not
complete.
We can never claim that Harmony supports Java 1.5 or even Java 1.2 if it
does not support unloading for classloaders. As for me, it is very
important to have this feature, and Aleksey's patch is quite a good
beginning. At least in the JIT it does not require any changes at all.

-- 
Mikhail Fursov

Re: [drlvm] Class unloading support

Posted by Rana Dasgupta <rd...@gmail.com>.

On 10/27/06, Xiao-Feng Li <xi...@gmail.com> wrote:
>
>
> >Yes. That's also my opinion. The _design_ of class unloading is the
> >focus of the discussion.

 I think that before doing an optimization, it is a good idea to understand
why and where it is needed, and if the usage scenario fits what the software
is intending to do. So this discussion is not misplaced. I am not very
comfortable with adding a solution before we have understood the problem.
Probably the first step is to create a use case that can be used to see if
class unloading is the solution.

> > Class unloading is definitely a feature required in future; but with
> > > the significance of the required modifications in JVM by this class
> > > unloading design 2 (using Java object for Vtable), it is probably
> > > safer to move this work into a branch at the moment until all other
> > > components are ready for it, and after we have thorough evaluation on
> > > it since there are still issues to be resolved or discussed.
> >
> > I don't really agree.  It could be because of my background, but my
> > experience with java is in long-running, server-side processes, and
> > clean class-unloading is important.
>
> >I don't know what you "don't really agree". :-) My comment was that we
> >need thorough study to enable the special design so that the ongoing
> >development in other components are not impacted too much, and there
> >are still some issues to be discussed and resolved in this design.

I have the same question. At this point, solution 2 looks more sensible than
solution 1. But that's about all. The design is quite invasive. If we want
to prototype this, my suggestion would be to do this in a branch, create a
use case, demonstrate its value, run stability tests, and then merge it.

> > Or we can keep it in JIRA and keep the discussion and evaluation going
> > > on before we decide to support the special design (Java Vtable) in
> > > other components.
> >
> > That's a different story :)  I'm not advocating one design over another,
> > but we have to be able to dump classloaders w/o leaking memory.
>
>

Re: [drlvm] Class unloading support

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.


Xiao-Feng Li wrote:
> On 10/28/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>>
>>
>> Xiao-Feng Li wrote:
>> > All, I think the problem now is mainly about the class unloading
>> > design not about whether class unloading happens in server
>> > environment.
>>
>> I think the problem is if class unloading happens cleanly.  Period.  The
>> fact that it most often happens in a server environment is irrelevant, 
>> IMO.
> 
> Yes. That's also my opinion. The _design_ of class unloading is the
> focus of the discussion.
> 
>> > Class unloading is definitely a feature required in future; but with
>> > the significance of the required modifications in JVM by this class
>> > unloading design 2 (using Java object for Vtable), it is probably
>> > safer to move this work into a branch at the moment until all other
>> > components are ready for it, and after we have thorough evaluation on
>> > it since there are still issues to be resolved or discussed.
>>
>> I don't really agree.  It could be because of my background, but my
>> experience with java is in long-running, server-side processes, and
>> clean class-unloading is important.
> 
> I don't know what you "don't really agree". :-) 

Ah. Sorry - you said "it's required in future".  I think that it's 
required now, but that isn't actually at odds with what you said :)


> My comment was that we
> need thorough study to enable the special design so that the ongoing
> development in other components are not impacted too much, and there
> are still some issues to be discussed and resolved in this design.

Agreed.

> 
>> > Or we can keep it in JIRA and keep the discussion and evaluation going
>> > on before we decide to support the special design (Java Vtable) in
>> > other components.
>>
>> That's a different story :)  I'm not advocating one design over another,
>> but we have to be able to dump classloaders w/o leaking memory.
> 
> Agree again. :-)

So we're in violent agreement :)

geir

> 
>> geir
>>
>> >
>> > How about it?
>> >
>> > Thanks,
>> > xiaofeng
>> >
>> > On 10/27/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>> >>
>> >>
>> >> Rana Dasgupta wrote:
>> >> > My knowledge in this area is limited. But my understanding was 
>> that web
>> >> > servers and other similar hosts recycled processes periodically as
>> >> > standard procedure, thereby tearing down all associated resources.
>> >>
>> >> Yes, but that has nothing to do with what would be happening in the 
>> app
>> >> server the web server talked to, if one had an architecture where a 
>> web
>> >> server "fronted" for the app server.
>> >>
>> >> > So
>> >> > classes loaded, but not used for a while went away anyway;
>> >>
>> >> Nope - they aren't loaded in the context of the webserver (when using
>> >> httpd).
>> >>
>> >> > this level of
>> >> > resource management was not really urgent. I know that IIS does 
>> this, I
>> >> > am not sure about httpd. I am not sure about other host 
>> environments.
>> >>
>> >> But a process fork model (or thread model) of a webserver has 
>> nothing to
>> >> do with what's going on in the VM.
>> >>
>> >> I'm talking about servlet engines and app servers like Tomcat and
>> >> Geronimo which have nothing to do with httpd.  Architecturally, 
>> they are
>> >> separated from the web server (unless you don't use an external
>> >> webserver, and just use the httpd connector in tomcat) and are 
>> separate,
>> >> independent processes.
>> >>
>> >>      httpd  <------>  Tomcat
>> >>
>> >> The java-based app servers are long running processes, running for 
>> weeks
>> >> or months.  We need to do clean class unloading.
>> >>
>> >> geir
>> >>
>> >>
>> >> >
>> >> >
>> >> > On 10/27/06, *Geir Magnusson Jr.* <geir@pobox.com
>> >> > <ma...@pobox.com>> wrote:
>> >> >
>> >> >
>> >> >
>> >> >     Rana Dasgupta wrote:
>> >> >      > Aleksey,
>> >> >      >   I had a couple of questions.
>> >> >      >   You state that DRLVM does not implement the class unloading
>> >> >     optimization,
>> >> >      > and this may create memory pressure on some applications that
>> >> >     load many
>> >> >      > classes. Do we have a real case / example where an
>> >> application is
>> >> >     stuck for
>> >> >      > insufficient memory because it uses a lot of classes 
>> initially
>> >> >     and then
>> >> >      > stops using them, but these are not unloaded? One can 
>> imagine a
>> >> >     web browser
>> >> >      > doing something like this. Is a web browser a typical use 
>> case
>> >> >     for the
>> >> >      > Harmony JVM?
>> >> >      >
>> >> >
>> >> >     If I understand what you're asking correctly, you'll find this
>> >> pattern
>> >> >     in servlet engines or J2EE servers, where deployed apps can be
>> >> dumped
>> >> >     and reloaded repeatedly either during development or during
>> >> production
>> >> >     deployment, w/o taking the server down.
>> >> >
>> >> >     geir
>> >> >
>> >> >
>> >>
>> >
>>
>

Re: [drlvm] Class unloading support

Posted by Xiao-Feng Li <xi...@gmail.com>.

On 10/28/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>
>
> Xiao-Feng Li wrote:
> > All, I think the problem now is mainly about the class unloading
> > design not about whether class unloading happens in server
> > environment.
>
> I think the problem is if class unloading happens cleanly.  Period.  The
> fact that it most often happens in a server environment is irrelevant, IMO.

Yes. That's also my opinion. The _design_ of class unloading is the
focus of the discussion.

> > Class unloading is definitely a feature required in future; but with
> > the significance of the required modifications in JVM by this class
> > unloading design 2 (using Java object for Vtable), it is probably
> > safer to move this work into a branch at the moment until all other
> > components are ready for it, and after we have thorough evaluation on
> > it since there are still issues to be resolved or discussed.
>
> I don't really agree.  It could be because of my background, but my
> experience with java is in long-running, server-side processes, and
> clean class-unloading is important.

I don't know what you "don't really agree" with. :-) My comment was that
we need a thorough study to enable the special design so that the ongoing
development in other components is not impacted too much, and there
are still some issues to be discussed and resolved in this design.

> > Or we can keep it in JIRA and keep the discussion and evaluation going
> > on before we decide to support the special design (Java Vtable) in
> > other components.
>
> That's a different story :)  I'm not advocating one design over another,
> but we have to be able to dump classloaders w/o leaking memory.

Agree again. :-)

> geir
>
> >
> > How about it?
> >
> > Thanks,
> > xiaofeng
> >
> > On 10/27/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
> >>
> >>
> >> Rana Dasgupta wrote:
> >> > My knowledge in this area is limited. But my understanding was that web
> >> > servers and other similar hosts recycled processes periodically as
> >> > standard procedure, thereby tearing down all associated resources.
> >>
> >> Yes, but that has nothing to do with what would be happening in the app
> >> server the web server talked to, if one had an architecture where a web
> >> server "fronted" for the app server.
> >>
> >> > So
> >> > classes loaded, but not used for a while went away anyway;
> >>
> >> Nope - they aren't loaded in the context of the webserver (when using
> >> httpd).
> >>
> >> > this level of
> >> > resource management was not really urgent. I know that IIS does this, I
> >> > am not sure about httpd. I am not sure about other host environments.
> >>
> >> But a process fork model (or thread model) of a webserver has nothing to
> >> do with what's going on in the VM.
> >>
> >> I'm talking about servlet engines and app servers like Tomcat and
> >> Geronimo which have nothing to do with httpd.  Architecturally, they are
> >> separated from the web server (unless you don't use an external
> >> webserver, and just use the httpd connector in tomcat) and are separate,
> >> independent processes.
> >>
> >>      httpd  <------>  Tomcat
> >>
> >> The java-based app servers are long running processes, running for weeks
> >> or months.  We need to do clean class unloading.
> >>
> >> geir
> >>
> >>
> >> >
> >> >
> >> > On 10/27/06, *Geir Magnusson Jr.* <geir@pobox.com
> >> > <ma...@pobox.com>> wrote:
> >> >
> >> >
> >> >
> >> >     Rana Dasgupta wrote:
> >> >      > Aleksey,
> >> >      >   I had a couple of questions.
> >> >      >   You state that DRLVM does not implement the class unloading
> >> >     optimization,
> >> >      > and this may create memory pressure on some applications that
> >> >     load many
> >> >      > classes. Do we have a real case / example where an
> >> application is
> >> >     stuck for
> >> >      > insufficient memory because it uses a lot of classes initially
> >> >     and then
> >> >      > stops using them, but these are not unloaded? One can imagine a
> >> >     web browser
> >> >      > doing something like this. Is a web browser a typical use case
> >> >     for the
> >> >      > Harmony JVM?
> >> >      >
> >> >
> >> >     If I understand what you're asking correctly, you'll find this
> >> pattern
> >> >     in servlet engines or J2EE servers, where deployed apps can be
> >> dumped
> >> >     and reloaded repeatedly either during development or during
> >> production
> >> >     deployment, w/o taking the server down.
> >> >
> >> >     geir
> >> >
> >> >
> >>
> >
>

Re: [drlvm] Class unloading support

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.


Xiao-Feng Li wrote:
> All, I think the problem now is mainly about the class unloading
> design not about whether class unloading happens in server
> environment.

I think the problem is if class unloading happens cleanly.  Period.  The 
fact that it most often happens in a server environment is irrelevant, IMO.

> 
> Class unloading is definitely a feature required in future; but with
> the significance of the required modifications in JVM by this class
> unloading design 2 (using Java object for Vtable), it is probably
> safer to move this work into a branch at the moment until all other
> components are ready for it, and after we have thorough evaluation on
> it since there are still issues to be resolved or discussed.

I don't really agree.  It could be because of my background, but my 
experience with java is in long-running, server-side processes, and 
clean class-unloading is important.

> 
> Or we can keep it in JIRA and keep the discussion and evaluation going
> on before we decide to support the special design (Java Vtable) in
> other components.

That's a different story :)  I'm not advocating one design over another, 
but we have to be able to dump classloaders w/o leaking memory.

geir

> 
> How about it?
> 
> Thanks,
> xiaofeng
> 
> On 10/27/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>>
>>
>> Rana Dasgupta wrote:
>> > My knowledge in this area is limited. But my understanding was that web
>> > servers and other similar hosts recycled processes periodically as
>> > standard procedure, thereby tearing down all associated resources.
>>
>> Yes, but that has nothing to do with what would be happening in the app
>> server the web server talked to, if one had an architecture where a web
>> server "fronted" for the app server.
>>
>> > So
>> > classes loaded, but not used for a while went away anyway;
>>
>> Nope - they aren't loaded in the context of the webserver (when using
>> httpd).
>>
>> > this level of
>> > resource management was not really urgent. I know that IIS does this, I
>> > am not sure about httpd. I am not sure about other host environments.
>>
>> But a process fork model (or thread model) of a webserver has nothing to
>> do with what's going on in the VM.
>>
>> I'm talking about servlet engines and app servers like Tomcat and
>> Geronimo which have nothing to do with httpd.  Architecturally, they are
>> separated from the web server (unless you don't use an external
>> webserver, and just use the httpd connector in tomcat) and are separate,
>> independent processes.
>>
>>      httpd  <------>  Tomcat
>>
>> The java-based app servers are long running processes, running for weeks
>> or months.  We need to do clean class unloading.
>>
>> geir
>>
>>
>> >
>> >
>> > On 10/27/06, *Geir Magnusson Jr.* <geir@pobox.com
>> > <ma...@pobox.com>> wrote:
>> >
>> >
>> >
>> >     Rana Dasgupta wrote:
>> >      > Aleksey,
>> >      >   I had a couple of questions.
>> >      >   You state that DRLVM does not implement the class unloading
>> >     optimization,
>> >      > and this may create memory pressure on some applications that
>> >     load many
>> >      > classes. Do we have a real case / example where an 
>> application is
>> >     stuck for
>> >      > insufficient memory because it uses a lot of classes initially
>> >     and then
>> >      > stops using them, but these are not unloaded? One can imagine a
>> >     web browser
>> >      > doing something like this. Is a web browser a typical use case
>> >     for the
>> >      > Harmony JVM?
>> >      >
>> >
>> >     If I understand what you're asking correctly, you'll find this 
>> pattern
>> >     in servlet engines or J2EE servers, where deployed apps can be 
>> dumped
>> >     and reloaded repeatedly either during development or during 
>> production
>> >     deployment, w/o taking the server down.
>> >
>> >     geir
>> >
>> >
>>
>

Re: [drlvm] Class unloading support

Posted by Xiao-Feng Li <xi...@gmail.com>.

All, I think the problem now is mainly about the class unloading
design, not about whether class unloading happens in a server
environment.

Class unloading is definitely a feature required in the future; but given
how significant the JVM modifications required by class unloading
design 2 (using a Java object for the Vtable) are, it is probably
safer to move this work into a branch for the moment, until all other
components are ready for it and we have evaluated it thoroughly,
since there are still issues to be resolved or discussed.

Or we can keep it in JIRA and keep the discussion and evaluation going
before we decide to support the special design (Java Vtable) in
other components.

How about it?

Thanks,
xiaofeng

On 10/27/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>
>
> Rana Dasgupta wrote:
> > My knowledge in this area is limited. But my understanding was that web
> > servers and other similar hosts recycled processes periodically as
> > standard procedure, thereby tearing down all associated resources.
>
> Yes, but that has nothing to do with what would be happening in the app
> server the web server talked to, if one had an architecture where a web
> server "fronted" for the app server.
>
> > So
> > classes loaded, but not used for a while went away anyway;
>
> Nope - they aren't loaded in the context of the webserver (when using
> httpd).
>
> > this level of
> > resource management was not really urgent. I know that IIS does this, I
> > am not sure about httpd. I am not sure about other host environments.
>
> But a process fork model (or thread model) of a webserver has nothing to
> do with what's going on in the VM.
>
> I'm talking about servlet engines and app servers like Tomcat and
> Geronimo which have nothing to do with httpd.  Architecturally, they are
> separated from the web server (unless you don't use an external
> webserver, and just use the httpd connector in tomcat) and are separate,
> independent processes.
>
>      httpd  <------>  Tomcat
>
> The java-based app servers are long running processes, running for weeks
> or months.  We need to do clean class unloading.
>
> geir
>
>
> >
> >
> > On 10/27/06, *Geir Magnusson Jr.* <geir@pobox.com
> > <ma...@pobox.com>> wrote:
> >
> >
> >
> >     Rana Dasgupta wrote:
> >      > Aleksey,
> >      >   I had a couple of questions.
> >      >   You state that DRLVM does not implement the class unloading
> >     optimization,
> >      > and this may create memory pressure on some applications that
> >     load many
> >      > classes. Do we have a real case / example where an application is
> >     stuck for
> >      > insufficient memory because it uses a lot of classes initially
> >     and then
> >      > stops using them, but these are not unloaded? One can imagine a
> >     web browser
> >      > doing something like this. Is a web browser a typical use case
> >     for the
> >      > Harmony JVM?
> >      >
> >
> >     If I understand what you're asking correctly, you'll find this pattern
> >     in servlet engines or J2EE servers, where deployed apps can be dumped
> >     and reloaded repeatedly either during development or during production
> >     deployment, w/o taking the server down.
> >
> >     geir
> >
> >
>

Re: [drlvm] Class unloading support

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

Rana Dasgupta wrote:
> My knowledge in this area is limited. But my understanding was that web 
> servers and other similar hosts recycled processes periodically as 
> standard procedure, thereby tearing down all associated resources. 

Yes, but that has nothing to do with what would be happening in the app 
server the web server talked to, if one had an architecture where a web 
server "fronted" for the app server.

> So 
> classes loaded, but not used for a while went away anyway; 

Nope - they aren't loaded in the context of the webserver (when using 
httpd).

> this level of 
> resource management was not really urgent. I know that IIS does this, I 
> am not sure about httpd. I am not sure about other host environments.

But a process fork model (or thread model) of a webserver has nothing to 
do with what's going on in the VM.

I'm talking about servlet engines and app servers like Tomcat and 
Geronimo which have nothing to do with httpd.  Architecturally, they are 
separated from the web server (unless you don't use an external 
webserver, and just use the httpd connector in tomcat) and are separate, 
independent processes.

     httpd  <------>  Tomcat

The java-based app servers are long running processes, running for weeks 
or months.  We need to do clean class unloading.

geir

> 
>  
> On 10/27/06, *Geir Magnusson Jr.* <geir@pobox.com 
> <ma...@pobox.com>> wrote:
> 
> 
> 
>     Rana Dasgupta wrote:
>      > Aleksey,
>      >   I had a couple of questions.
>      >   You state that DRLVM does not implement the class unloading
>     optimization,
>      > and this may create memory pressure on some applications that
>     load many
>      > classes. Do we have a real case / example where an application is
>     stuck for
>      > insufficient memory because it uses a lot of classes initially
>     and then
>      > stops using them, but these are not unloaded? One can imagine a
>     web browser
>      > doing something like this. Is a web browser a typical use case
>     for the
>      > Harmony JVM?
>      >
> 
>     If I understand what you're asking correctly, you'll find this pattern
>     in servlet engines or J2EE servers, where deployed apps can be dumped
>     and reloaded repeatedly either during development or during production
>     deployment, w/o taking the server down.
> 
>     geir
> 
>

Re: [drlvm] Class unloading support

Posted by Rana Dasgupta <rd...@gmail.com>.

My knowledge in this area is limited. But my understanding was that web
servers and other similar hosts recycled processes periodically as standard
procedure, thereby tearing down all associated resources. So classes loaded,
but not used for a while went away anyway; this level of resource management
was not really urgent. I know that IIS does this, I am not sure about httpd.
I am not sure about other host environments.

On 10/27/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>
>
>
> Rana Dasgupta wrote:
> > Aleksey,
> >   I had a couple of questions.
> >   You state that DRLVM does not implement the class unloading
> optimization,
> > and this may create memory pressure on some applications that load many
> > classes. Do we have a real case / example where an application is stuck
> for
> > insufficient memory because it uses a lot of classes initially and then
> > stops using them, but these are not unloaded? One can imagine a web
> browser
> > doing something like this. Is a web browser a typical use case for the
> > Harmony JVM?
> >
>
> If I understand what you're asking correctly, you'll find this pattern
> in servlet engines or J2EE servers, where deployed apps can be dumped
> and reloaded repeatedly either during development or during production
> deployment, w/o taking the server down.
>
> geir
>
>

Re: [drlvm] Class unloading support

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

Rana Dasgupta wrote:
> Aleksey,
>   I had a couple of questions.
>   You state that DRLVM does not implement the class unloading optimization,
> and this may create memory pressure on some applications that load many
> classes. Do we have a real case / example where an application is stuck for
> insufficient memory because it uses a lot of classes initially and then
> stops using them, but these are not unloaded? One can imagine a web browser
> doing something like this. Is a web browser a typical use case for the
> Harmony JVM?
> 

If I understand what you're asking correctly, you'll find this pattern 
in servlet engines or J2EE servers, where deployed apps can be dumped 
and reloaded repeatedly either during development or during production 
deployment, w/o taking the server down.

geir
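
The redeploy pattern described above can be sketched in a few lines of Java.
This is only an illustration, not how Tomcat or Geronimo are implemented:
the RedeploySketch class, its deploy() and current() methods and the "shop"
app name are all made up for the example. The point is just that every
redeploy creates a brand-new classloader and drops the server's reference to
the previous one, so the old loader and its classes become garbage unless
something else still points at them.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy "app server": one classloader per deployed application.
class RedeploySketch {
    private final Map<String, ClassLoader> apps = new HashMap<>();

    // Deploys or redeploys an app; each deployment gets a fresh loader.
    ClassLoader deploy(String appName) {
        ClassLoader fresh =
            new URLClassLoader(new URL[0], RedeploySketch.class.getClassLoader());
        apps.put(appName, fresh); // the previous loader, if any, is dropped here
        return fresh;
    }

    // Returns the loader currently serving the given app.
    ClassLoader current(String appName) {
        return apps.get(appName);
    }

    public static void main(String[] args) {
        RedeploySketch server = new RedeploySketch();
        ClassLoader first = server.deploy("shop");
        ClassLoader second = server.deploy("shop");
        System.out.println(first != second);                  // prints true
        System.out.println(server.current("shop") == second); // prints true
    }
}
```

In a VM without class unloading, the native data behind every dropped loader
stays allocated, which is why a server that is redeployed many times over
weeks or months keeps growing.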

Re: Re: [drlvm] Class unloading support

Posted by Ivan Volosyuk <iv...@gmail.com>.

Alex,

What we call class unloading here is the process of detecting unused
classloaders that are eligible for GC and reclaiming their native resources.

-- 
Ivan
Intel Enterprise Solutions Software Division
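
As a rough illustration of the "detecting unused classloaders" part, here is
a toy reachability trace in Java. It is a sketch under assumptions, not
DRLVM code: Node, mark() and loaderIsLive() are invented names, and the toy
heap assumes that loaders and classes are NOT enumerated as roots (unlike
current DRLVM) and that every object references its class, every class its
defining loader, and every loader its classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class UnloadingSketch {
    // Toy heap node: a loader, a class, or an ordinary object.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Marks everything reachable from the roots with a depth-first trace.
    static Set<Node> mark(Collection<Node> roots) {
        Set<Node> live = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (live.add(n)) {
                stack.addAll(n.refs);
            }
        }
        return live;
    }

    // Builds loader <-> class <- instance and reports whether the loader was marked.
    static boolean loaderIsLive(boolean instanceStillRooted) {
        Node loader = new Node("CL");
        Node clazz = new Node("Foo.class");
        Node instance = new Node("a Foo");
        loader.refs.add(clazz);   // loader -> classes it defined
        clazz.refs.add(loader);   // class -> defining loader
        instance.refs.add(clazz); // object -> its class
        List<Node> roots = new ArrayList<>();
        if (instanceStillRooted) {
            roots.add(instance);
        }
        return mark(roots).contains(loader);
    }

    public static void main(String[] args) {
        System.out.println(loaderIsLive(true));  // prints true: one live object pins the loader
        System.out.println(loaderIsLive(false)); // prints false: the loader can be unloaded
    }
}
```

Once the loader comes out of the trace unmarked, the second half of the
definition applies: the VM can free the native structures that belong to it.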

On 10/27/06, Alex Blewitt <al...@gmail.com> wrote:
> As a fairly obvious comment, all JSPs that are translated into classes
> and then executed (or indeed, web apps that are deployed and then shut
> down again) are exactly the kind of place that new classloaders are
> created and then classes used. I'm not sure what you meant by 'class
> unloading', and whether that was unrelated classes in the same
> classloader, or when a classloader becomes eligible for GC() (and thus
> all of its contents do too).
>
> Alex.

Re: Re: [drlvm] Class unloading support

Posted by Alex Blewitt <al...@gmail.com>.

As a fairly obvious comment, all JSPs that are translated into classes
and then executed (or indeed, web apps that are deployed and then shut
down again) are exactly the kind of place where new classloaders are
created and their classes used. I'm not sure what you meant by 'class
unloading': unloading unrelated classes in the same classloader, or
the point at which a classloader becomes eligible for GC (and thus
all of its contents do too).

Alex.

Re: [drlvm] Class unloading support

Posted by Rana Dasgupta <rd...@gmail.com>.

Aleksey,
   I had a couple of questions.
   You state that DRLVM does not implement the class unloading optimization,
and this may create memory pressure on some applications that load many
classes. Do we have a real case / example where an application is stuck for
insufficient memory because it uses a lot of classes initially and then
stops using them, but these are not unloaded? One can imagine a web browser
doing something like this. Is a web browser a typical use case for the
Harmony JVM?

  Regarding your engineering choices, choice 2 seems nicer, but again I have
some questions.

1. In the class registry, is the reference from the j.l.Class instance to
the j.l.Classloader instance a weak reference, and the reverse not a weak
reference?
2. I am missing something about the Java vtable object. Is it a first-class
Java object with its own Java class? In that case the vtable object would
have its own vtable, which is a Java object, but that would also have a
vtable, and so on...? In other words, if every Java object has a vtable,
which is also a Java object...
3. If I am misunderstanding the above (I hope), the vtable objects would
need to be pinned to avoid patching virtual calls after GC, for efficient
dispatching, etc. Does this not put a requirement on compatible GCs to be
able to deal with pinned objects?
4. Why can one not have a j.l.Class reference in the object header, as Weldon
mentions, instead of this new vtable Java type? Is the performance impact
known, and do we understand how it compares to the heap pressure due to the
new vtable object?
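
On question 2, the thread does not say how DRLVM would end the regress. One
common way, sketched below purely as an assumption, is a single
self-describing meta-vtable, in the same way that java.lang.Class is itself
an instance of java.lang.Class. VTableObject, MetaVTableDemo and their
members are hypothetical names for illustration, not DRLVM types.

```java
/** Toy model: every object header points at a VTable object. */
class VTableObject {
    final String describes; // which class this vtable describes
    VTableObject vtable;    // header slot: the vtable of this vtable object

    VTableObject(String describes) {
        this.describes = describes;
    }
}

class MetaVTableDemo {
    /** One shared meta-vtable describes all vtable objects, itself included. */
    static final VTableObject META = new VTableObject("VTable");
    static {
        META.vtable = META; // the chain ends here, in a self-reference
    }

    static VTableObject vtableFor(String className) {
        VTableObject vt = new VTableObject(className);
        vt.vtable = META; // every ordinary vtable is an instance of the meta type
        return vt;
    }

    /** Follow header pointers until a fixed point; returns the hop count. */
    static int hopsToFixedPoint(VTableObject start) {
        int hops = 0;
        for (VTableObject v = start; v.vtable != v; v = v.vtable) {
            hops++;
        }
        return hops;
    }

    public static void main(String[] args) {
        // The same trick in the Java language: Class is an instance of Class.
        System.out.println(Class.class.getClass() == Class.class);
        System.out.println(hopsToFixedPoint(vtableFor("java.lang.String")));
    }
}
```

Under this assumption the cost is one extra object in total for the meta
level, not an unbounded chain per class.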

Thanks,
Rana

> On 10/24/06, Aleksey Ignatenko <al...@gmail.com> wrote:
> >
> > Egor,
> > >But it has 1 more "cons" -- JIT should change its devirtualizer
> > >accordingly to the VTable change. Doable, of course.
> > There is no need to change struct VTable structure - it could be simply
> > inlined in pinned VTable object + 1 additional reference field to
> > j.l.Class.
> > So there won't be too much work to do on JIT side.
> >
> > >BTW, is it reasonable to "compress" or "enumerate" references to
> > >j.l.Class in each object to reduce the footprint? How many classes are
> > >alive in heavy-duty applications? not very much probably.
> > We are to trace j.l.Class from every object via VTable to detect if
> > there is any live object of that j.l.Class. This is one of the
> > requirements of class unloading.
> > As for footprint - there is already pointer to struct VTable in every
> > object, so changing this pointer to reference to VTable Object will have
> > no effect on footprint. Compressed VTable pointers will be changed to
> > compressed references. The only effect is that VTable object is a full
> > Java object and in its turn it is to have own VTable, so number of VTable
> > objects will increase for every class. As VTable is a small object,
> > footprint will increase only by tens of bytes for every loaded class,
> > and as far as I know, several thousand classes are loaded on Eclipse
> > startup, therefore footprint increase is negligible.
> >
> > Aleksey Ignatenko,
> > Intel Enterprise Solutions Software Division

Re: [drlvm] Class unloading support

Posted by Aleksey Ignatenko <al...@gmail.com>.

Egor,
>But it has 1 more "cons" -- JIT should change its devirtualizer
>accordingly to the VTable change. Doable, of course.
There is no need to change the struct VTable layout - it could simply be
inlined in the pinned VTable object, plus 1 additional reference field to
j.l.Class. So there won't be too much work to do on the JIT side.
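
The reference chain this relies on (object -> pinned VTable object ->
j.l.Class -> j.l.Classloader, with the class registry linking loader and
classes in both directions) can be sketched as a toy reachability trace.
HeapNode and ReachabilitySketch are hypothetical names and the graph is an
illustration only, not DRLVM's object layout.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Toy heap node; the names are illustrative, not DRLVM types. */
class HeapNode {
    final String name;
    final List<HeapNode> refs = new ArrayList<>();

    HeapNode(String name) {
        this.name = name;
    }
}

class ReachabilitySketch {
    /** Plain mark phase: everything transitively reachable from the roots. */
    static Set<HeapNode> mark(List<HeapNode> roots) {
        Set<HeapNode> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<HeapNode> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            HeapNode n = work.pop();
            if (marked.add(n)) {
                work.addAll(n.refs);
            }
        }
        return marked;
    }

    /** Returns { loader, class, vtable, instance } wired as in the proposal. */
    static HeapNode[] buildGraph() {
        HeapNode loader = new HeapNode("j.l.Classloader CL");
        HeapNode klass = new HeapNode("j.l.Class Foo");
        HeapNode vtable = new HeapNode("VTable object for Foo");
        HeapNode instance = new HeapNode("instance of Foo");
        loader.refs.add(klass);    // class registry: loader -> class
        klass.refs.add(loader);    // class registry: class -> defining loader
        vtable.refs.add(klass);    // the additional reference field
        instance.refs.add(vtable); // header slot at offset zero
        return new HeapNode[] { loader, klass, vtable, instance };
    }

    public static void main(String[] args) {
        HeapNode[] g = buildGraph();
        // A single live instance keeps the whole loader alive.
        System.out.println(mark(Collections.singletonList(g[3])).contains(g[0]));
        // With no roots at all, the loader is unreachable and can be unloaded.
        System.out.println(mark(Collections.<HeapNode>emptyList()).contains(g[0]));
    }
}
```

Because the chain is made of ordinary references, an unmodified tracing
collector enforces the unloading conditions without any extra pass.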

>BTW, is it reasonable to "compress" or "enumerate" references to
>j.l.Class in each object to reduce the footprint? How many classes are
>alive in heavy-duty applications? not very much probably.
We are to trace j.l.Class from every object via its VTable to detect whether
there is any live object of that j.l.Class. This is one of the requirements
of class unloading.
As for footprint - there is already a pointer to struct VTable in every
object, so changing this pointer to a reference to a VTable object will have
no effect on per-object footprint. Compressed VTable pointers will be changed
to compressed references. The only effect is that a VTable object is a full
Java object and in its turn has to have its own VTable, so the number of
VTable objects will increase for every class. As a VTable is a small object,
footprint will increase by only tens of bytes for every loaded class and, as
far as I know, several thousand classes are loaded on Eclipse startup;
therefore the footprint increase is negligible.
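
As a rough back-of-the-envelope check of that estimate, the arithmetic can be
written out as below. Both inputs are assumptions chosen only to match "tens
of bytes" and "several thousand classes"; they are not measurements of DRLVM
or Eclipse, and VTableFootprint is a hypothetical name.

```java
class VTableFootprint {
    /** Extra heap bytes if each loaded class costs a fixed overhead. */
    static long extraBytes(int loadedClasses, int bytesPerVTableObject) {
        return (long) loadedClasses * bytesPerVTableObject;
    }

    public static void main(String[] args) {
        int loadedClasses = 5_000; // assumed: "several thousand classes"
        int perClass = 48;         // assumed: "tens of bytes" per class
        long extra = extraBytes(loadedClasses, perClass);
        System.out.println(extra + " bytes, about " + extra / 1024 + " KiB");
    }
}
```

With these assumed inputs the increase is 240000 bytes, roughly 234 KiB,
which is small next to a typical Java heap.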

Aleksey Ignatenko,
Intel Enterprise Solutions Software Division.

On 24 Oct 2006 23:02:41 +0700, Egor Pasko <eg...@gmail.com> wrote:

> I am +1 to (2)=(Automatic class unloading approach). Do not like
> stop-the-world.
>
> But it has 1 more "cons" -- JIT should change its devirtualizer
> accordingly to the VTable change. Doable, of course.
>
> BTW, is it reasonable to "compress" or "enumerate" references to
> j.l.Class in each object to reduce the footprint? How many classes are
> alive in heavy-duty applications? not very much probably.
>
> --
> Egor Pasko, Intel Managed Runtime Division
>
>

Re: [drlvm] Class unloading support

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x20C day of Apache Harmony Aleksey Ignatenko wrote:
> Hello all!
> 
> 
> 
> As you probably know current version of harmony DRLVM has no class unloading
> support. This leads to the fact that some Java applications accumulate
> memory leaks leading to memory overflow and crashes.
> 
> In this message I would like to describe two approaches for class unloading
> in DRLVM and propose to implement one of them as basic. Pros and cons for
> both approaches are presented below. Lets name these approaches:
> 
>    1. Mark and scan based approach.
>    2. Automatic class unloading approach.

I am +1 to (2)=(Automatic class unloading approach). Do not like
stop-the-world. 

But it has 1 more "cons" -- JIT should change its devirtualizer
accordingly to the VTable change. Doable, of course.

BTW, is it reasonable to "compress" or "enumerate" references to
j.l.Class in each object to reduce the footprint? How many classes are
alive in heavy-duty applications? not very much probably.
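
One possible reading of "enumerate" here, sketched as an assumption and not
as anything DRLVM does, is to store a small integer index in each object and
keep the classes in a side table. ClassIndexTable and its methods are
hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical side table mapping small indices to classes and back. */
class ClassIndexTable {
    private final List<Class<?>> byIndex = new ArrayList<>();
    private final Map<Class<?>, Integer> indexOf = new HashMap<>();

    /** Returns the small integer standing in for a full class reference. */
    int indexFor(Class<?> c) {
        Integer existing = indexOf.get(c);
        if (existing != null) {
            return existing;
        }
        int idx = byIndex.size();
        byIndex.add(c);
        indexOf.put(c, idx);
        return idx;
    }

    Class<?> classAt(int index) {
        return byIndex.get(index);
    }

    int size() {
        return byIndex.size();
    }

    public static void main(String[] args) {
        ClassIndexTable table = new ClassIndexTable();
        int s = table.indexFor(String.class);
        int i = table.indexFor(Integer.class);
        System.out.println(s + " " + i + " " + table.classAt(s).getName());
    }
}
```

Note that a table like this holds strong references to every class it has
seen, so on its own it would keep classes alive; an index is also not
something a collector can trace as a reference, so it could not by itself
show that a class still has live instances.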

> *Current DRLVM implementation specifics.*
> 
> 
> 
> All Java.lang.Class (j.l.Class) and java.lang.Classloader (j.l.Classloader)
> instances are enumerated as strong roots inside VM, which leads to the state
> when all j.l.Class and j.l.Classloader instances are always reachable.
> 
> 
> 
> To unload class loader CL three conditions are to be fulfilled (*):
> 
>    1. j.l.Classloader instance of CL is unreachable.
>    2. Classes (j.l.Class instances) loaded by CL are unreachable.
>    3. No object of any class loaded by CL exists.
> 
> 
> 
> Here is brief description for the both approaches:
> 
> 
> 
> *Mark and scan based approach.*
> 
> Java heap trace is performed by VM Core at the beginning of stop-the-world.
> If some class loader and its classes are unreachable and there is no object
> of these classes, then exclude this class loader from enumeration to make GC
> collect it. After GC happens and appropriate j.l.Classloader instance is
> collected - remove native resources from C heap: class loader and all
> classes loaded by it, jitted code and so on. Corresponding Java objects
> should already be collected by GC at this moment.
> 
> Pros:
> 
> - Simplicity - requires only additional mark&scan functionality on VM side
> to detect classes for unloading + few changes in enumeration algorithm.
> 
> Cons:
> 
> - Requires additional GC/VM functionality to trace j.l.Class and
> j.l.Classloader instances from each object.
> 
> - Duplicates mark&scan functionality on VM side.
> 
> - Affects every plugged GC.
> 
> - "Stop-the-world" state of VM is required, i.e. all threads except the one
> performing unloading should be suspended.
> 
> - Possibly some additional limitations on new GCs.
> 
> 
> 
> *Automatic class unloading approach.*
> 
> "Automatic class unloading" means that j.l.Classloader instance is unloaded
> automatically (w/o additional enumeration tricks or GC dependency) and after
> we detect that some class loader was unloaded we destroy its native
> resources. To do that we need to provide two conditions:
> 
>    1. Introduce reference from object to its j.l.Class instance.
>    2. Class registry - introduce references from j.l.Classes to its
>    defining j.l.Classloader and references from j.l.Classloader to
>    j.l.Classes loaded by it (unloading is to be done for
>    j.l.Classloader and corresponding j.l.Classes at once).
> 
> 
> 
> *Introduce reference from object to its j.l.Class instance.*
> 
> DRLVM has definite implementation specifics. Object is described with native
> VTable structure, which has pointers to class and other related data.
> VTables can have different sizes according to object class specifics. The
> main idea of referencing j.l.Class from object is to make VTable a special
> Java object with reference to appropriate j.l.Class instance, but give it a
> regular object view from GC point of view. VTable pointer is located in
> object by zero offset and therefore can be simply considered as reference
> field. Thus we can implement j.l.Class instance tracing from object via
> VTable object. VTable object is considered to be pinned for simplification.
> 
> 
> 
> In summary, having class registry and reference from object to its
> j.l.Class instance we guarantee that some class loader CL can be unloaded
> only if three conditions are fulfilled described above (*). To find out
> when Java part of class loader was unloaded j.l.Classloader instance
> should be enumerated as weak root. When this root becomes equal to null -
> destroy native memory of appropriate class loader.
> 
> 
> 
> Pros:
> 
> - Unification of unloading approach - no additional requirements from GC.
> 
> - Stop-the-world is not required.
> 
> - GC handles VTables automatically as regular objects.
> 
> Cons
> 
> - Number of objects to be increased.
> 
> - Memory footprint to be increased both for native and Java heaps (as VTable
> objects appear).
> 
> 
> 
> *Conclusion. *
> 
> I prefer automatic class unloading approach due to the described set of
> properties (see above). It is more flexible and perspective solution. Also
> JVM specification is mostly related to automatic class unloading approach
> while mark and scan based approach looks more like class unloading
> workaround.
> 
> 
> 
> 
> 
> Please, do not hesitate to ask questions.
> 
> Best regards,
> 
> Aleksey Ignatenko,
> 
> Intel Enterprise Solutions Software Division.

-- 
Egor Pasko, Intel Managed Runtime Division