Posted to dev@harmony.apache.org by Etienne Gagnon <eg...@sablevm.org> on 2006/10/30 15:49:36 UTC

[drlvm][sablevm] Design of Class Unloading Support

Hi all,

Here's a more structured proposal for a simple and effective
implementation of class unloading support.

In accordance with Section 2.17.8 of the JVM spec, class unloading (and
its related native resource cleanup) can only happen when the class
loader instance becomes unreachable.  For this to happen, we put in
place the following things:

1- Each class loader is represented by some VM internal structure.
 [We'll call it the "class loader structure"].

2- Each class loader internal structure, except (optionally) the
 bootstrap class loader, maintains a weak reference to an object
 instance of class ClassLoader (or some subclass).  The Java instance
 has some opaque pointer back to the internal VM structure.   The Java
 instance is usually created before the internal VM structure.  The
 instance constructor is usually in charge of creating the internal VM
 structure.  [We'll call it the "class loader instance"]

3- Each class loader instance maintains a collection of loaded classes.
 A class/interface is never removed from this collection.  This
 collection maintains "hard" (i.e. "not weak") references to
 classes/interfaces.

4- [Informative] A class loader instance is also most likely to maintain
 a collection of classes for which it has "initiated" class loading.
 This collection should use hard references (as weak references won't
 lead to earlier class unloading).

5- Each class loader instance maintains a hard reference to its parent
 class loader.  This reference is (optionally) null if the parent is the
 bootstrap class loader.

6- Each j.l.Class instance maintains a hard reference to the class
 loader instance of the class loader that has loaded it.  [This is not
 the "initiating" loaders, but really the "loading" loader].

7- Each class loader structure maintains a set of boolean flags, one
 flag per "non-nursery" garbage collected area (even when thread-local
 heaps are used).  The flag is set when an instance of a class loaded by
 this class loader is moved into the related GC-area.  The flag is unset
 when the GC-area is emptied, or (optionally) when it can be determined
 that no instance of a class loaded by this class loader remains in the
 GC-area.  This is best implemented as follows: a) use an unconditional
 write of "true" in the flag every time an object is moved into the
 GC-area by the garbage collector, b) unset the related flag in "all"
 class loader structures just before collecting a GC-area, then set
 the flag back when an object survives in the area.

8- Each method invocation frame maintains a hard reference to either its
 surrounding instance (in the case of instance methods, i.e.
 invokevirtual, invokeinterface, and invokespecial) or its surrounding
 class (invokestatic).  This is already required for synchronized methods
 (it's not a good idea to allow the instance to be collected before the
 end of a synchronized instance method call; yep, learned the hard way
 in SableVM...)  So, the "overhead" is quite minimal.  The importance of
 this is in the correctness of not letting a class loader die while a
 static/instance method of a class loaded by it is still active, which
 would lead to premature release of native resources (such as jitted
 code, etc.).

9- A little magic is required to prevent premature collection of a class
 loader instance and its loaded j.l.Class instances (see [3-] above), as
 object instances do not maintain a hard reference to their j.l.Class
 instance, yet we want to preserve the correctness of Object.getClass().

 So, the simplest approach is to maintain a hard reference in a class
 loader structure to its class loader instance (in addition to the weak
 reference in [2-] above).  This reference is kept always set (thus
 preventing collection of the class loader instance), except when *all*
 the following conditions are met:
  a) All nurseries are empty.
  b) All GC-area flags are unset.

 Actually, for making this practical and preserving correctness, it's a
 little trickier.  It requires a 2-step process, much like the
 object-finalization dance.  Here's a typical example:

 On a major collection, where all nurseries are collected, and some (but
 not necessarily all) other GC-areas are collected, we do the following
 sequence of actions:
  a) All class loader structures are visited.  All flags related to
   non-nursery GC-areas that we intend to collect are unset.  If this
   leaves *all* flags unset, the hard reference to the class
   loader instance is set to NULL (thus enabling, possibly, the
   collection of the class loader instance).

  b) The garbage collection cycle is started and proceeds as usual.
   Note that the work mandated in [7-] above is also done, which might
   lead to setting back some flags in class loader structures that had
   all their flags unset in [a)].

  c) After the initial garbage collection is applied, and just before
   the usual treatment of weak references (where they are set to NULL
   when pointing to a collected object), all class loader structures
   are visited again.  The hard pointer of every class loader structure
   that has any flag set is set back to point to the class loader
   instance if it was NULL (same as how object instances are preserved
   for finalization).

  d) If [c)] has triggered any change (i.e. it mandates the survival of
   additional class loader instances that were due to die), the garbage
   collection cycle is "extended" to rescue the additional class loader
   instances and all objects they can reach.

  e) Any additional work of the garbage collection cycle is done (e.g.
   soft, weak, and phantom references, finalization handling).

  f) All class loader structures are visited again.  Every structure for
   which the weak reference has NOT been set to NULL has its hard
   reference set to the weak reference target.  Every structure for
   which the weak reference has been set to NULL is now ready to be
   unloaded (i.e. release all of its native resources, including jitted
   code, class information, method information, vtables, and so on).
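
As an illustration of the sequence in [9-], here is a minimal sketch in
Python of steps a) to f).  It is a toy model under stated assumptions,
not VM code: LoaderStructure, major_collection, simulate, and the three
collector callbacks (run_gc, extend_gc, clear_weak_refs) are hypothetical
names, there is a single non-nursery area called "old", and a class
loader instance is assumed to be reachable only through its structure's
hard reference.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LoaderStructure:
    """VM-internal class loader structure (hypothetical model)."""
    name: str
    weak_ref: Optional[object]            # weak ref to the loader instance [2-]
    hard_ref: Optional[object] = None     # extra hard ref from [9-]
    area_flags: dict = field(default_factory=dict)  # GC-area -> flag [7-]
    unloaded: bool = False


def major_collection(loaders, areas, run_gc, extend_gc, clear_weak_refs):
    # a) Unset flags of the areas about to be collected; drop the hard
    #    reference of every structure left with no flag set.
    for ls in loaders:
        for area in areas:
            ls.area_flags[area] = False
        if not any(ls.area_flags.values()):
            ls.hard_ref = None
    # b) Usual collection; the collector sets flags back per [7-].
    run_gc(loaders)
    # c) Before weak references are cleared, restore dropped hard refs
    #    of structures that have a flag set again.
    rescued = [ls for ls in loaders
               if ls.hard_ref is None and any(ls.area_flags.values())]
    for ls in rescued:
        ls.hard_ref = ls.weak_ref
    # d) Extend the cycle to keep the rescued instances alive.
    if rescued:
        extend_gc(rescued)
    # e) Remaining reference processing (stand-in for soft/weak/phantom
    #    reference and finalization handling).
    clear_weak_refs(loaders)
    # f) Re-arm survivors; unload loaders whose weak ref was cleared.
    for ls in loaders:
        if ls.weak_ref is not None:
            ls.hard_ref = ls.weak_ref
        else:
            ls.hard_ref = None
            ls.unloaded = True   # here the VM would free native resources


def simulate(survivors):
    """Toy driver: `survivors` names loaders with a live instance in 'old'."""
    loaders = []
    for name in ("app", "plugin"):
        instance = object()
        loaders.append(LoaderStructure(name, instance, instance, {"old": True}))
    marked = set()   # loader instances the toy collector found reachable

    def run_gc(structs):
        for ls in structs:
            if ls.hard_ref is not None:
                marked.add(ls.name)
            if ls.name in survivors:
                ls.area_flags["old"] = True

    def extend_gc(rescued):
        marked.update(ls.name for ls in rescued)

    def clear_weak_refs(structs):
        for ls in structs:
            if ls.name not in marked:
                ls.weak_ref = None

    major_collection(loaders, ["old"], run_gc, extend_gc, clear_weak_refs)
    return {ls.name: ls.unloaded for ls in loaders}


print(simulate({"app"}))   # {'app': False, 'plugin': True}
```

In this run an instance of a class loaded by "app" survives in the "old"
area, so step c) rescues "app", while "plugin" has its weak reference
cleared and is unloaded in step f).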


In addition, I highly recommend using the approach proposed in Chapter 3
of http://sablevm.org/people/egagnon/gagnon-phd.pdf for managing
class-loader related memory.  It has many advantages:

1- No "header space" overhead for very small allocations.  [This is a
 typical "hidden" space overhead of malloc() implementations to allow
 for later free() calls].
2- Minimal memory fragmentation.  [Allocation only happens in large
   blocks].
3- Simple and very efficient allocation.  [No overhead for complex
   management of freeing small areas later].
4- Efficient freeing of large memory blocks on class unloading.
5- Possibility of clever usage of this memory; see Chapter 4 of the same
   document for the implementation of sparse interface virtual tables
   enabling invokeinterface at the simple cost of invokevirtual.  :-)
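
To illustrate advantages 1- to 4-, here is a rough Python sketch of a
per-class-loader memory manager that hands out memory from large blocks
and releases all of them at once on unloading.  It is not taken from the
thesis: LoaderArena, alloc, and release_all are hypothetical names, the
block size is arbitrary, and alignment is ignored to keep it short.

```python
class LoaderArena:
    """Bump-pointer arena owned by one class loader (hypothetical name)."""

    def __init__(self, block_size=64 * 1024):
        self.block_size = block_size
        self.blocks = []      # large blocks obtained from the system
        self.offset = 0       # next free byte in the most recent block

    def alloc(self, size):
        """Return (block_index, offset); no per-allocation header is kept."""
        if size > self.block_size:
            raise ValueError("request larger than a block")
        if not self.blocks or self.offset + size > self.block_size:
            # Current block is full (or none exists): grab a new large block.
            self.blocks.append(bytearray(self.block_size))
            self.offset = 0
        start = self.offset
        self.offset += size   # allocation is just a pointer bump
        return len(self.blocks) - 1, start

    def release_all(self):
        """On class unloading: drop every block at once, return the count."""
        freed = len(self.blocks)
        self.blocks.clear()
        self.offset = 0
        return freed


arena = LoaderArena(block_size=32)
first = arena.alloc(8)     # (0, 0)
second = arena.alloc(8)    # (0, 8)
third = arena.alloc(24)    # does not fit in block 0, so (1, 0)
```

Individual allocations are never freed in this sketch, which is what
removes the need for per-allocation headers; the memory comes back only
when release_all() runs for the whole class loader.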


I hope this is useful to both projects [drlvm][sablevm]  :-)

Etienne

(C) 2006 by Etienne M. Gagnon <eg...@sablevm.org>
This text is licensed under the Apache License, Version 2.0.

[You may add this document in svn;  I am willing to sign the required
Apache agreement to make it so, if you intend to use it in drlvm's
implementation].

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by Weldon Washburn <we...@gmail.com>.
On 11/9/06, Robin Garner <ro...@anu.edu.au> wrote:
>
> Geir Magnusson Jr. wrote:
> >
> >
> > Weldon Washburn wrote:
> >>
> >>
> >> On 11/8/06, Geir Magnusson Jr. <geir@pobox.com> wrote:
> >>
> >>     Weldon Washburn wrote:
> >>      > On 11/7/06, Ivan Volosyuk <ivan.volosyuk@gmail.com> wrote:
> >>      >>
> >>      >> On 07 Nov 2006 14:35:55 +0600, Egor Pasko <egor.pasko@gmail.com> wrote:
> >>      >> > > I already have one idea how to benefit from movable vtables.
> >>      >
> >>      > There would have to be a very compelling argument for making
> >>      > vtables movable.  Like a business workload that Harmony needs
> >>      > to run within the next 12 months.
> >>
> >>     How would a business workload need this directly?
> >>
> >> That's the point.  I can't figure out any compelling story for moving
> >> vtables.  As far as I can tell, it's over-engineering.   I would love
> >> to be proven wrong.
> >
> > But isn't this simply an implementation detail of something that is
> > important, namely the class unloading?
> >
> > geir


I have no problem calling it an implementation detail.  It's an important
implementation detail that somehow got mixed into the design conversation.
Worth noting is that ultimately the committer is on the hook for committing
an implementation.  It would be good to have the discussion on the
moving-vtable implementation before someone spends a bunch of time on it.

> While it did come up as an issue in the class-unloading talks I think
> most of us believe it to be orthogonal.
>
> cheers
>
> --
> Robin Garner
> Dept. of Computer Science
> Australian National University
>



-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by Robin Garner <ro...@anu.edu.au>.
Geir Magnusson Jr. wrote:
> 
> 
> Weldon Washburn wrote:
>>
>>
>> On 11/8/06, Geir Magnusson Jr. <geir@pobox.com> wrote:
>>
>>     Weldon Washburn wrote:
>>      > On 11/7/06, Ivan Volosyuk <ivan.volosyuk@gmail.com> wrote:
>>      >>
>>      >> On 07 Nov 2006 14:35:55 +0600, Egor Pasko <egor.pasko@gmail.com> wrote:
>>      >> > > I already have one idea how to benefit from movable vtables.
>>      >
>>      > There would have to be a very compelling argument for making
>>      > vtables movable.  Like a business workload that Harmony needs
>>      > to run within the next 12 months.
>>
>>     How would a business workload need this directly?
>>
>> That's the point.  I can't figure out any compelling story for moving
>> vtables.  As far as I can tell, it's over-engineering.   I would love
>> to be proven wrong.
> 
> But isn't this simply an implementation detail of something that is 
> important, namely the class unloading?
> 
> geir

While it did come up as an issue in the class-unloading talks I think 
most of us believe it to be orthogonal.

cheers

-- 
Robin Garner
Dept. of Computer Science
Australian National University

Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

Weldon Washburn wrote:
> 
> 
> On 11/8/06, Geir Magnusson Jr. <geir@pobox.com> wrote:
>
>     Weldon Washburn wrote:
>      > On 11/7/06, Ivan Volosyuk <ivan.volosyuk@gmail.com> wrote:
>      >>
>      >> On 07 Nov 2006 14:35:55 +0600, Egor Pasko <egor.pasko@gmail.com> wrote:
>      >> > > I already have one idea how to benefit from movable vtables.
>      >
>      > There would have to be a very compelling argument for making
>      > vtables movable.  Like a business workload that Harmony needs
>      > to run within the next 12 months.
>
>     How would a business workload need this directly?
>
> That's the point.  I can't figure out any compelling story for moving
> vtables.  As far as I can tell, it's over-engineering.   I would love to
> be proven wrong.

But isn't this simply an implementation detail of something that is 
important, namely the class unloading?

geir


Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by Weldon Washburn <we...@gmail.com>.
On 11/8/06, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>
>
>
> Weldon Washburn wrote:
> > On 11/7/06, Ivan Volosyuk <iv...@gmail.com> wrote:
> >>
> >> On 07 Nov 2006 14:35:55 +0600, Egor Pasko <eg...@gmail.com> wrote:
> >> > > I already have one idea how to benefit from movable vtables.
> >
> >
> > There would have to be a very compelling argument for making vtables
> > movable.  Like a business workload that Harmony needs to run within the
> > next 12 months.
>
> How would a business workload need this directly?


That's the point.  I can't figure out any compelling story for moving
vtables.  As far as I can tell, it's over-engineering.   I would love to be
proven wrong.

> geir
>
>


-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

Weldon Washburn wrote:
> On 11/7/06, Ivan Volosyuk <iv...@gmail.com> wrote:
>>
>> On 07 Nov 2006 14:35:55 +0600, Egor Pasko <eg...@gmail.com> wrote:
>> > > I already have one idea how to benefit from movable vtables.
> 
> 
> There would have to be a very compelling argument for making vtables
> movable.  Like a business workload that Harmony needs to run within the
> next 12 months.

How would a business workload need this directly?

geir


Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by Robin Garner <ro...@anu.edu.au>.
Weldon Washburn wrote:
> On 11/7/06, Ivan Volosyuk <iv...@gmail.com> wrote:
>>
>> On 07 Nov 2006 14:35:55 +0600, Egor Pasko <eg...@gmail.com> wrote:
>> > > I already have one idea how to benefit from movable vtables.
> 
> 
> There would have to be a very compelling argument for making vtables
> movable.  Like a business workload that Harmony needs to run within the
> next 12 months.

The cost of moving vtables would be huge.  It would have to be a very 
hefty optimization :)

-- 
Robin Garner
Dept. of Computer Science
Australian National University

Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by Weldon Washburn <we...@gmail.com>.
On 11/7/06, Ivan Volosyuk <iv...@gmail.com> wrote:
>
> On 07 Nov 2006 14:35:55 +0600, Egor Pasko <eg...@gmail.com> wrote:
> > > I already have one idea how to benefit from movable vtables.


There would have to be a very compelling argument for making vtables
movable.  Like a business workload that Harmony needs to run within the next
12 months.

>
> > in GCV4.1? :)
>
> Yes
>
> --
> Ivan
> Intel Enterprise Solutions Software Division
>



-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by Ivan Volosyuk <iv...@gmail.com>.
On 07 Nov 2006 14:35:55 +0600, Egor Pasko <eg...@gmail.com> wrote:
> > I already have one idea how to benefit from movable vtables.
>
> in GCV4.1? :)

Yes

-- 
Ivan
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by Egor Pasko <eg...@gmail.com>.
On the 0x217 day of Apache Harmony Ivan Volosyuk wrote:
> In current GCv4.1 implementation there is an assumption that vtables
> will not move. It is used in compaction algorithm. Strictly speaking,
> the only thing I need is to distinguish objects and vtables during
> allocation. If so, one of GC algorithms may treat vtables as pinned
> objects, while another could make use of the ability to move the
> vtables. 

Ivan, thank you for making it clear!

> I already have one idea how to benefit from movable vtables.

in GCV4.1? :)


-- 
Egor Pasko


Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by Ivan Volosyuk <iv...@gmail.com>.
In the current GCv4.1 implementation there is an assumption that vtables
will not move. It is used in the compaction algorithm. Strictly speaking,
the only thing I need is to distinguish objects and vtables during
allocation. If so, one GC algorithm may treat vtables as pinned
objects, while another could make use of the ability to move the
vtables. I already have one idea how to benefit from movable vtables.
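
A small Python sketch of that idea: if every allocation is tagged as
either an object or a vtable, one compaction policy can leave vtables
where they are while another slides them like any other cell.  This is
only a toy model, not GCv4.1 code; Kind, Cell, compact, and sample_heap
are hypothetical names, and every cell occupies one unit-sized slot.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    OBJECT = "object"
    VTABLE = "vtable"


@dataclass
class Cell:
    name: str
    kind: Kind       # recorded at allocation time
    address: int     # unit-sized slots keep the sketch short
    live: bool = True


def compact(cells, move_vtables):
    """Slide live cells toward address 0; optionally keep vtables pinned."""
    live = sorted((c for c in cells if c.live), key=lambda c: c.address)
    pinned = set()
    if not move_vtables:
        pinned = {c.address for c in live if c.kind is Kind.VTABLE}
    free = 0
    for cell in live:
        if cell.address in pinned:
            continue              # pinned vtable: stays where it is
        while free in pinned:
            free += 1             # never overwrite a pinned slot
        cell.address = free
        free += 1
    return {c.name: c.address for c in live}


def sample_heap():
    return [
        Cell("a", Kind.OBJECT, 0, live=False),
        Cell("V", Kind.VTABLE, 1),
        Cell("b", Kind.OBJECT, 2, live=False),
        Cell("c", Kind.OBJECT, 3),
        Cell("W", Kind.VTABLE, 5),
        Cell("d", Kind.OBJECT, 6),
    ]


print(compact(sample_heap(), move_vtables=False))
print(compact(sample_heap(), move_vtables=True))
```

With pinning, V and W keep addresses 1 and 5 and the objects fill the
gaps around them; with movable vtables, all four live cells end up
packed into addresses 0 to 3.
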
--
Ivan


Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by Egor Pasko <eg...@gmail.com>.
On the 0x214 day of Apache Harmony Aleksey Ignatenko wrote:
> Egor,
> 
> Vtable objects pinning is required not only by JIT, this is also required by
> GC, which relies on that VTables are non movable. So this not a way to
> disable guarded devirtualization. Pinning is required anyway.

Sorry, but I am not aware of places where pinning is required other
than for JIT. If you mention one or two, that would be great for
understanding and the next step to beat my ignorance in this subject :)


-- 
Egor Pasko


Re: [drlvm][sablevm] Design of Class Unloading Support

Posted by Aleksey Ignatenko <al...@gmail.com>.
Egor,

Vtable object pinning is required not only by the JIT; it is also required
by the GC, which relies on VTables being non-movable. So disabling guarded
devirtualization is not a way around this. Pinning is required anyway.
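
For the JIT side of this, a toy Python sketch of why a guard that embeds
the vtable address depends on pinning, next to the
object->vtable->class comparison suggested earlier in the thread.  It
models the idea only; VTable, Obj, make_address_guard, and
make_class_guard are hypothetical names, and the class identifier is
assumed to be a stable value that the collector never moves.

```python
class VTable:
    def __init__(self, class_id, address):
        self.class_id = class_id   # assumed stable class identity
        self.address = address     # where the vtable currently lives


class Obj:
    def __init__(self, vtable):
        self.vtable = vtable


def make_address_guard(vtable):
    expected = vtable.address      # constant baked into compiled code
    return lambda obj: obj.vtable.address == expected


def make_class_guard(vtable):
    expected = vtable.class_id     # costs one extra load per check
    return lambda obj: obj.vtable.class_id == expected


vt = VTable(class_id="Foo", address=0x1000)
obj = Obj(vt)
by_address = make_address_guard(vt)
by_class = make_class_guard(vt)

before = (by_address(obj), by_class(obj))   # both guards pass
vt.address = 0x2000                         # a moving GC relocates the vtable
after = (by_address(obj), by_class(obj))    # address guard is now stale
print(before, after)
```

After the move the address-based guard fails for an object of the right
class unless the embedded constant is patched, while the class-based
guard still passes.  This says nothing about the GC-side reliance on
non-movable vtables.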

Aleksey.


On 01 Nov 2006 10:37:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
>
> On the 0x214 day of Apache Harmony Rana Dasgupta wrote:
> > On 10/31/06, Etienne Gagnon <egagnon@sablevm.org> wrote:
> >
> > > >Yet:
> > > >
> > > >1- You do need pinning, so you rule out some of the simplest GCs
> > > >(e.g. simple, non-generational copying without pinning.)
> > > >[Apparently, for some very large heaps, simple copying can be quite
> > > >difficult to beat, efficiency wise, if you believe some relatively
> > > >recent JikesRVM related paper...]
> >
> > Yes, this was one of my concerns about the vtable object approach. This
> > is limiting, but this is one specific GC requirement. (Maybe for GCs
> > that don't support pinning, the JIT can compare object->vtable->class
> > for guarded devirtualization, or even not do guarded devirtualization,
> > sort of support the GC in downlevel mode). For the refcounting method we
> > need to hand off between GC and VM before and after processing weak
> > references, update the generational or semispace related CL flags, and
> > also use the GC to undo or rescue CL instances that may come alive due
> > to the generational flag processing.
> >
> > > >2- You do have overhead even on minor collections.  With my approach,
> > > >you could limit the (quite similar to yours, if you put a
> > > >class-loader/NULL pointer in the vtable) overhead only to selected GC
> > > >cycles.
> >
> > I think the main advantage of the vtable object approach is that it is
> > somewhat elegant and natural, if one can get past the idea of non C
> > vtables :-). Special casing to avoid object->vtable scans during minor
> > collections etc. just breaks that. Relying on GC all the way forces a
> > class unloading overhead to every GC cycle (even for the young
> > generation collections). There is also a space overhead that I can't
> > really estimate (proportional to class ....etc. etc.). As I understood
> > it, there is no impact on MMTk based GCs, but I may be wrong.
> > If class unloading is done at specific moments only, the refcounting
> > approach does not add a perf overhead to each GC cycle, there is no heap
> > overhead of the method either. But the former implies yet another
> > secondary heuristic to optimally choose the class unloading triggers,
> > this depends on the application profile and is not really once an
> > hour/day etc.
> > My guess (humbly) would be that the refcounting method "may" be somewhat
> > more time/space efficient, but that's probably not the only issue. There
> > is the issue of implementation correctness, existing code, etc. And I
> > don't know what's the best way to go to the next step.
> > A suggestion could be to take Harmony-2000, review it, put it in a
> > branch,
>
> an alternative: JIT can disable guarded devirtualization via an
> option. Commit the unloading, use/tune GCV5 with that option until it
> supports pinning. No branch required.
>
> > tune and test it, wait for GCV5 to start supporting pinning, try with
> > MMTk, and then integrate. If we do this, the refcounting approach would
> > be a fallback for DRLVM.
> > We need to decide on next steps, we cannot debate the algorithm forever
> > :-)
>
> --
> Egor Pasko, Intel Managed Runtime Division
>
>

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Ivan Volosyuk <iv...@gmail.com>.
On 11/2/06, Xiao-Feng Li <xi...@gmail.com> wrote:
> On 11/2/06, Robin Garner <ro...@anu.edu.au> wrote:
> > Xiao-Feng Li wrote:
> > > On 11/1/06, Mikhail Fursov <mi...@gmail.com> wrote:
> > >> On 01 Nov 2006 16:05:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
> > >> >
> > >> > On the 0x214 day of Apache Harmony Mikhail Fursov wrote:
> > >> > > On 01 Nov 2006 15:56:28 +0600, Egor Pasko <eg...@gmail.com>
> > >> wrote:
> > >> > > >
> > >> > > > agreed. not patching .. just reporting 'golden' VTable refs to
> > >> GC, am
> > >> > > > I right?
> > >> > > >
> > >> > > Yes, and everytime we report it to GC and GC moves an object - it
> > >> > patches
> > >> > > the address we report.
> > >> >
> > >> > so, by saying "patching" you insist to put immediate operands into
> > >> > instructions? That's probably worth it, but it extends the JIT<->GC
> > >> > interface. How about making a simple operand (reg/mem) as the first
> > >> step?
> > >>
> > >>
> > >> Egor, I thinks this is slightly more complicated problem. If vtable
> > >> object
> > >> is moved we must update all devirtualization points in every method
> > >> compiled
> > >> before. It can require an extension of JIT<->VM<->GC interface.
> > >> Another solution I see is to collect info about all devirtualization
> > >> points
> > >> in JIT (code addrs) and report these addresses as enumeration roots.
> > >> This is
> > >> JIT-only solution, and disadvantage is a significant (~hot methods count)
> > >> increase of number of objects reported.
> > >>
> > >> On the other hand I see no reasons to unpin vtables in the nearest future
> > >> (Let's GC guru correct me). If you use special (freelist-type ?)
> > >> allocator
> > >> in GC the memory fragmentation when unloading pinned vtable objects
> > >> could be
> > >> low.
> > >
> > > Yes, vtable should never be moved except for very weird reason. And
> > > yes, to pin certain amount of objects is not a big performance issue
> > > (in both temporal and spatial wise).
> > >
> > > -xiaofeng
> > >
> > >> --
> > >> Mikhail Fursov
> > >>
> > >>
> >
> > In MMTk, this kind of 'pinning' is an allocation-time policy decision of
> > the type I was advocating in the GC helpers thread.  Once a GC allows
> > for the idea of supporting multiple collection policies (which
> > generational GC requires in any case), then adding a non-moving space to
> > a memory manager is easy.
> >
> > Most memory managers will have a non-moving large object space no matter
> >   what the primary collection policy is.  The DRLVM collectors have this
> > too, don't they ?
> > Pinning an object after allocation is a harder problem, but not
> > something required in this case.
>
> Yes, I agree with all what you said. And DRLVM GCv4/v5 doesn't move
> large objects at the moment.

GCv4.1 does. There is no problem supporting pinned allocation here anyway.

-- 
Ivan
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Xiao-Feng Li <xi...@gmail.com>.
On 11/2/06, Robin Garner <ro...@anu.edu.au> wrote:
> Xiao-Feng Li wrote:
> > On 11/1/06, Mikhail Fursov <mi...@gmail.com> wrote:
> >> On 01 Nov 2006 16:05:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
> >> >
> >> > On the 0x214 day of Apache Harmony Mikhail Fursov wrote:
> >> > > On 01 Nov 2006 15:56:28 +0600, Egor Pasko <eg...@gmail.com>
> >> wrote:
> >> > > >
> >> > > > agreed. not patching .. just reporting 'golden' VTable refs to
> >> GC, am
> >> > > > I right?
> >> > > >
> >> > > Yes, and everytime we report it to GC and GC moves an object - it
> >> > patches
> >> > > the address we report.
> >> >
> >> > so, by saying "patching" you insist to put immediate operands into
> >> > instructions? That's probably worth it, but it extends the JIT<->GC
> >> > interface. How about making a simple operand (reg/mem) as the first
> >> step?
> >>
> >>
> >> Egor, I thinks this is slightly more complicated problem. If vtable
> >> object
> >> is moved we must update all devirtualization points in every method
> >> compiled
> >> before. It can require an extension of JIT<->VM<->GC interface.
> >> Another solution I see is to collect info about all devirtualization
> >> points
> >> in JIT (code addrs) and report these addresses as enumeration roots.
> >> This is
> >> JIT-only solution, and disadvantage is a significant (~hot methods count)
> >> increase of number of objects reported.
> >>
> >> On the other hand I see no reasons to unpin vtables in the nearest future
> >> (Let's GC guru correct me). If you use special (freelist-type ?)
> >> allocator
> >> in GC the memory fragmentation when unloading pinned vtable objects
> >> could be
> >> low.
> >
> > Yes, vtable should never be moved except for very weird reason. And
> > yes, to pin certain amount of objects is not a big performance issue
> > (in both temporal and spatial wise).
> >
> > -xiaofeng
> >
> >> --
> >> Mikhail Fursov
> >>
> >>
>
> In MMTk, this kind of 'pinning' is an allocation-time policy decision of
> the type I was advocating in the GC helpers thread.  Once a GC allows
> for the idea of supporting multiple collection policies (which
> generational GC requires in any case), then adding a non-moving space to
> a memory manager is easy.
>
> Most memory managers will have a non-moving large object space no matter
>   what the primary collection policy is.  The DRLVM collectors have this
> too, don't they ?
> Pinning an object after allocation is a harder problem, but not
> something required in this case.

Yes, I agree with everything you said. And DRLVM GCv4/v5 doesn't move
large objects at the moment.

Thanks,
xiaofeng

> cheers
>
>

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Robin Garner <ro...@anu.edu.au>.
Xiao-Feng Li wrote:
> On 11/1/06, Mikhail Fursov <mi...@gmail.com> wrote:
>> On 01 Nov 2006 16:05:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
>> >
>> > On the 0x214 day of Apache Harmony Mikhail Fursov wrote:
>> > > On 01 Nov 2006 15:56:28 +0600, Egor Pasko <eg...@gmail.com> 
>> wrote:
>> > > >
>> > > > agreed. not patching .. just reporting 'golden' VTable refs to 
>> GC, am
>> > > > I right?
>> > > >
>> > > Yes, and everytime we report it to GC and GC moves an object - it
>> > patches
>> > > the address we report.
>> >
>> > so, by saying "patching" you insist to put immediate operands into
>> > instructions? That's probably worth it, but it extends the JIT<->GC
>> > interface. How about making a simple operand (reg/mem) as the first 
>> step?
>>
>>
>> Egor, I thinks this is slightly more complicated problem. If vtable 
>> object
>> is moved we must update all devirtualization points in every method 
>> compiled
>> before. It can require an extension of JIT<->VM<->GC interface.
>> Another solution I see is to collect info about all devirtualization 
>> points
>> in JIT (code addrs) and report these addresses as enumeration roots. 
>> This is
>> JIT-only solution, and disadvantage is a significant (~hot methods count)
>> increase of number of objects reported.
>>
>> On the other hand I see no reasons to unpin vtables in the nearest future
>> (Let's GC guru correct me). If you use special (freelist-type ?) 
>> allocator
>> in GC the memory fragmentation when unloading pinned vtable objects 
>> could be
>> low.
> 
> Yes, vtable should never be moved except for very weird reason. And
> yes, to pin certain amount of objects is not a big performance issue
> (in both temporal and spatial wise).
> 
> -xiaofeng
> 
>> -- 
>> Mikhail Fursov
>>
>>

In MMTk, this kind of 'pinning' is an allocation-time policy decision of 
the type I was advocating in the GC helpers thread.  Once a GC allows 
for the idea of supporting multiple collection policies (which 
generational GC requires in any case), then adding a non-moving space to 
a memory manager is easy.

Most memory managers will have a non-moving large object space no matter
what the primary collection policy is.  The DRLVM collectors have this
too, don't they?

Pinning an object after allocation is a harder problem, but not 
something required in this case.
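
A minimal sketch of pinning as an allocation-time decision, in Python purely for brevity. This is not MMTk or DRLVM code: the `Heap` class, the `pinned` flag, and the fixed 16-byte bump are all hypothetical. It only illustrates that an object placed in a non-moving space at allocation never appears in the forwarding map of a later copying collection.

```python
class Heap:
    """Toy heap with a moving space and a non-moving space (hypothetical model)."""

    def __init__(self):
        self.moving = {}       # address -> object; a copying GC may relocate these
        self.non_moving = {}   # address -> object; never relocated
        self._next = 0x1000    # bump pointer shared by both spaces for simplicity

    def alloc(self, obj, pinned=False):
        # The space is chosen once, at allocation time.
        addr = self._next
        self._next += 16
        (self.non_moving if pinned else self.moving)[addr] = obj
        return addr

    def copy_collect(self):
        """Relocate every object in the moving space; return the old->new map."""
        forwarding = {}
        relocated = {}
        for old, obj in self.moving.items():
            new = self._next
            self._next += 16
            relocated[new] = obj
            forwarding[old] = new
        self.moving = relocated
        return forwarding


heap = Heap()
vtable_addr = heap.alloc("vtable:Foo", pinned=True)
obj_addr = heap.alloc("instance of Foo")
forwarding = heap.copy_collect()
```

Pinning an already-allocated object would instead require the collector to skip or evacuate around it, which is the harder problem mentioned above.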

cheers


Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Xiao-Feng Li <xi...@gmail.com>.
On 11/1/06, Mikhail Fursov <mi...@gmail.com> wrote:
> On 01 Nov 2006 16:05:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
> >
> > On the 0x214 day of Apache Harmony Mikhail Fursov wrote:
> > > On 01 Nov 2006 15:56:28 +0600, Egor Pasko <eg...@gmail.com> wrote:
> > > >
> > > > agreed. not patching .. just reporting 'golden' VTable refs to GC, am
> > > > I right?
> > > >
> > > Yes, and everytime we report it to GC and GC moves an object - it
> > patches
> > > the address we report.
> >
> > so, by saying "patching" you insist to put immediate operands into
> > instructions? That's probably worth it, but it extends the JIT<->GC
> > interface. How about making a simple operand (reg/mem) as the first step?
>
>
> Egor, I thinks this is slightly more complicated problem. If vtable object
> is moved we must update all devirtualization points in every method compiled
> before. It can require an extension of JIT<->VM<->GC interface.
> Another solution I see is to collect info about all devirtualization points
> in JIT (code addrs) and report these addresses as enumeration roots. This is
> JIT-only solution, and disadvantage is a significant (~hot methods count)
> increase of number of objects reported.
>
> On the other hand I see no reasons to unpin vtables in the nearest future
> (Let's GC guru correct me). If you use special (freelist-type ?) allocator
> in GC the memory fragmentation when unloading pinned vtable objects could be
> low.

Yes, a vtable should never be moved except for some very unusual reason. And
yes, pinning a certain number of objects is not a big performance issue
(in either time or space).

-xiaofeng

> --
> Mikhail Fursov
>
>

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Mikhail Fursov <mi...@gmail.com>.
On 01 Nov 2006 16:05:41 +0600, Egor Pasko <eg...@gmail.com> wrote:
>
> On the 0x214 day of Apache Harmony Mikhail Fursov wrote:
> > On 01 Nov 2006 15:56:28 +0600, Egor Pasko <eg...@gmail.com> wrote:
> > >
> > > agreed. not patching .. just reporting 'golden' VTable refs to GC, am
> > > I right?
> > >
> > Yes, and everytime we report it to GC and GC moves an object - it
> patches
> > the address we report.
>
> so, by saying "patching" you insist to put immediate operands into
> instructions? That's probably worth it, but it extends the JIT<->GC
> interface. How about making a simple operand (reg/mem) as the first step?


Egor, I think this is a slightly more complicated problem. If a vtable object
is moved, we must update all devirtualization points in every method compiled
before. That may require an extension of the JIT<->VM<->GC interface.
Another solution I see is to collect info about all devirtualization points
in the JIT (code addresses) and report these addresses as enumeration roots. This is
a JIT-only solution, and its disadvantage is a significant (~hot methods count)
increase in the number of objects reported.

On the other hand, I see no reason to unpin vtables in the near future
(GC gurus, please correct me). If you use a special (freelist-type?) allocator
in the GC, the memory fragmentation when unloading pinned vtable objects could be
low.
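
A rough sketch of the "report devirtualization points as roots" alternative, in Python for brevity. Everything here is hypothetical (`CodeSlot`, `Jit.emit_guard`, `gc_move`); it is not the DRLVM JIT or GC interface. It shows why the number of reported roots grows with the number of guards, and how a moving GC would patch each reported slot.

```python
class CodeSlot:
    """One devirtualization point: a place in compiled code holding a vtable address."""

    def __init__(self, method_name, vtable_addr):
        self.method_name = method_name
        self.vtable_addr = vtable_addr


class Jit:
    def __init__(self):
        self.devirt_points = []

    def emit_guard(self, method_name, vtable_addr):
        # Remember every guard so it can be reported to the GC later.
        slot = CodeSlot(method_name, vtable_addr)
        self.devirt_points.append(slot)
        return slot

    def enumerate_roots(self):
        # One extra root per guard: this is the cost mentioned above.
        return list(self.devirt_points)


def gc_move(roots, forwarding):
    """Patch each reported slot whose target moved; return the number patched."""
    patched = 0
    for slot in roots:
        new_addr = forwarding.get(slot.vtable_addr)
        if new_addr is not None:
            slot.vtable_addr = new_addr
            patched += 1
    return patched


jit = Jit()
a = jit.emit_guard("hot1", 0x1000)
b = jit.emit_guard("hot2", 0x1000)
c = jit.emit_guard("hot3", 0x2000)
patched = gc_move(jit.enumerate_roots(), {0x1000: 0x5000})
```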

-- 
Mikhail Fursov

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Egor Pasko <eg...@gmail.com>.
On the 0x214 day of Apache Harmony Mikhail Fursov wrote:
> On 01 Nov 2006 15:56:28 +0600, Egor Pasko <eg...@gmail.com> wrote:
> >
> > agreed. not patching .. just reporting 'golden' VTable refs to GC, am
> > I right?
> >
> Yes, and everytime we report it to GC and GC moves an object - it patches
> the address we report.

So, by "patching" you mean putting immediate operands into
instructions? That's probably worth it, but it extends the JIT<->GC
interface. How about using a simple operand (reg/mem) as the first step?

-- 
Egor Pasko, Intel Managed Runtime Division


Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Mikhail Fursov <mi...@gmail.com>.
On 01 Nov 2006 15:56:28 +0600, Egor Pasko <eg...@gmail.com> wrote:
>
> agreed. not patching .. just reporting 'golden' VTable refs to GC, am
> I right?
>
Yes, and every time we report it to the GC and the GC moves an object, it patches
the address we report.


-- 
Mikhail Fursov

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Ivan Volosyuk <iv...@gmail.com>.
On 10/31/06, Xiao-Feng Li <xi...@gmail.com> wrote:
> On 10/31/06, Pavel Pervov <pm...@gmail.com> wrote:
> > > 7- Each class loader structure maintains a set of boolean flags, one
> > > flag per "non-nursery" garbage collected area (even when thread-local
> > > heaps are used).  The flag is set when an instance of a class loaded by
> > > this class leader is moved into the related GC-area.  The flag is unset
> > > when the GC-area is emptied, or (optionally) when it can be determined
> > > that no instance of a class loaded by this class loader remains in the
> > > GC-area.  This is best implemented as follows: a) use an unconditional
> > > write of "true" in the flag every time an object is moved into the
> > > GC-area by the garbage collector, b) unset the related flag in "all"
> > > class loader structures just before collecting a GC-area, then setting
> > > the flag back when an object survives in the area.
> >
> >
> > Requires identification of object' class type during GC. Will most
> > probably degrade GC performance.
>
> Yes, this is also my concern.

Yes, tracing and marking of vtable objects can be cheaper than tracing
object->vtable->class->classloader for each object.

Even proposal #2 will degrade performance, but this approach will
degrade it even more.
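
For reference, a small sketch of the per-area flag scheme from item 7 quoted above, in Python for brevity. The names (`ClassLoaderStruct`, `collect_area`, the `"loader"` key) are hypothetical, and the direct `obj["loader"]` lookup stands in for the object->vtable->class->classloader chain whose cost is the concern here.

```python
class ClassLoaderStruct:
    """VM-internal class loader structure with one flag per non-nursery GC area."""

    def __init__(self, name, areas):
        self.name = name
        self.in_area = {area: False for area in areas}


def collect_area(area, objects_in_area, is_live, loaders):
    # (b) Clear the flag in all class loader structures before collecting the area.
    for cl in loaders:
        cl.in_area[area] = False
    survivors = []
    for obj in objects_in_area:
        if is_live(obj):
            # (a) Unconditional write of True for every surviving object.
            # In a real VM this is the per-object loader lookup.
            obj["loader"].in_area[area] = True
            survivors.append(obj)
    return survivors


app = ClassLoaderStruct("app", ["old"])
plugin = ClassLoaderStruct("plugin", ["old"])
objs = [
    {"id": 1, "loader": app, "live": True},
    {"id": 2, "loader": plugin, "live": False},
]
for o in objs:                       # promotion into the "old" area sets the flag
    o["loader"].in_area["old"] = True
survivors = collect_area("old", objs, lambda o: o["live"], [app, plugin])
```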

--
Ivan

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Egor Pasko <eg...@gmail.com>.
On the 0x214 day of Apache Harmony Mikhail Fursov wrote:
> On 11/1/06, Rana Dasgupta <rd...@gmail.com> wrote:
> >
> > Maybe for GC's that don't support pinning, the JIT can compare
> > object->vtable->class for guarded
> > devirtiualization, or even not do guarded devirtualization, sort of
> > support
> > the GC in downlevel mode
> 
> 
> I think this is not a long term solution for a JIT. IMO the best solutions
> for a JIT with unpinned vtables would be
> 1) Short term: turn devirtualization off (As Egor has proposed)
> 2) Long term: patch devirtualization calls when GC moves object (usual
> enumeration routine)

Agreed. Not patching, just reporting 'golden' vtable refs to the GC, am
I right?

> Storing vtable in the object without additional indirection in memory is
> important from the performance POV.

-- 
Egor Pasko, Intel Managed Runtime Division


Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Mikhail Fursov <mi...@gmail.com>.
On 11/1/06, Rana Dasgupta <rd...@gmail.com> wrote:
>
> Maybe for GC's that don't support pinning, the JIT can compare
> object->vtable->class for guarded
> devirtiualization, or even not do guarded devirtualization, sort of
> support
> the GC in downlevel mode


I think this is not a long-term solution for a JIT. IMO the best solutions
for a JIT with unpinned vtables would be:
1) Short term: turn devirtualization off (as Egor has proposed)
2) Long term: patch devirtualization calls when the GC moves an object (usual
enumeration routine)

Storing the vtable in the object without additional indirection in memory is
important from the performance POV.
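
A minimal sketch of what a guarded devirtualized call site does, in Python for brevity. All names are hypothetical, and the `devirtualize` switch merely stands in for the "turn it off" option. The identity comparison `obj.vtable is expected_vtable` models the address comparison in compiled code, which is why a moved vtable would invalidate a guard that embeds the old address.

```python
class VTable:
    def __init__(self, class_name, methods):
        self.class_name = class_name
        self.methods = methods  # method name -> callable


class Obj:
    def __init__(self, vtable):
        self.vtable = vtable    # stored directly in the object, no extra indirection


def make_call_site(method, expected_vtable, direct_target, devirtualize=True):
    """Return (call, stats) for one call site; stats counts fast and slow paths."""
    stats = {"fast": 0, "slow": 0}

    def call(obj):
        # Guard: a single identity (address) comparison against the expected vtable.
        if devirtualize and obj.vtable is expected_vtable:
            stats["fast"] += 1
            return direct_target(obj)
        # Fallback: ordinary virtual dispatch through the vtable.
        stats["slow"] += 1
        return obj.vtable.methods[method](obj)

    return call, stats


foo_vt = VTable("Foo", {"name": lambda o: "Foo"})
bar_vt = VTable("Bar", {"name": lambda o: "Bar"})

call, stats = make_call_site("name", foo_vt, lambda o: "Foo")
r1 = call(Obj(foo_vt))
r2 = call(Obj(bar_vt))

off_call, off_stats = make_call_site("name", foo_vt, lambda o: "Foo", devirtualize=False)
r3 = off_call(Obj(foo_vt))
```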

-- 
Mikhail Fursov

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Weldon Washburn <we...@gmail.com>.
On 10/31/06, Rana Dasgupta <rd...@gmail.com> wrote:
>
> On 10/31/06, Etienne Gagnon <egagnon@sablevm.org > wrote:
>
> > >Yet:
> >
> > >1- You do need pinning, so you rule out some of the simplest GCs (e.g.
> > >simple, non-generational copying without pinning.)  [Apparently, for
> > >some very large heaps, simple copying a can be quite difficult to beat,
> > >efficiency wise, if you believe some relatively recent JikesRVM related
> > >paper...]
>
>
> Yes, this was one of my  concerns about the vtable object approach. This
> is
> limiting, but this is one specific GC requirement. (Maybe for GC's that
> don't support pinning, the JIT can compare object->vtable->class for
> guarded
> devirtiualization, or even not do guarded devirtualization, sort of
> support
> the GC in downlevel mode). For the refcounting method we need to hand off
> between  GC and VM before and after processing weak references, update the
> generational or semispace related CL flags, and also use the GC to undo or
> rescue CL instances that may come alive due to the generational flag
> processing.


> >2- You do have overhead even on minor collections.  With my approach,
> > >you could limit the (quite similar to yours, if you put a
> > >class-loader/NULL pointer in the vtable) overhead only to selected GC
> > >cycles.
>
>
> I think the main advantage of the vtable object approach is that it is
> somewhat elegant and natural, if one can get past the idea of non C
> vtables
> :-). Special casing to avoid object->vtable scans during minor collections
> etc. just breaks that. Relying on GC all the way forces a class unloading
> overhead to every GC cycle( even for the young generation collections ).
> There is also a space overhead that I can't really estimate( proportional
> to
> class ....etc. etc.). As I understood it, there is no impact on MMTk based
> GC's, but I may be wrong.


Actually Robin Garner in the other class unloading thread ([drlvm] classs
unloading support) said minor mods to MMTk might be required.

> If class unloading is done at specific moments only, the refcounting
> approach does not add a perf overhead to each GC cycle, there is no heap
> overhead of the method either. But the former implies yet another
> secondary heuristic to optimally choose the class unloading triggers, this
> depends on the application profile and is not really once an hour/day etc.
> My guess( humbly ) would be that the refcounting method "may" be somewhat
> more time/space efficient, but that's probably not the only issue. There
> is
> the issue of implementation correctness, existing code, etc. And I don't
> know what's the best way to go to the next step.
> A suggestion could be to take Harmony-2000, review it, put it in a branch,
> tune and test it , wait for GCV5 to start supporting pinning, try with
> MMTk,
> and then integrate.



+1
I can't really visualize the changes to 40 files by looking at a diff file.
It seems inefficient for all of us to battle applying the patch simply to be
able to look at the code and set breakpoints with the debugger.


> If we do this, the refcounting approach would be a
> fallback for DRLVM.
> We need to decide on next steps, we cannot debate the algorithm forever
> :-)







-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Egor Pasko <eg...@gmail.com>.
On the 0x214 day of Apache Harmony Rana Dasgupta wrote:
> On 10/31/06, Etienne Gagnon <egagnon@sablevm.org > wrote:
> 
> > >Yet:
> >
> > >1- You do need pinning, so you rule out some of the simplest GCs (e.g.
> > >simple, non-generational copying without pinning.)  [Apparently, for
> > >some very large heaps, simple copying a can be quite difficult to beat,
> > >efficiency wise, if you believe some relatively recent JikesRVM related
> > >paper...]
> 
> 
> Yes, this was one of my  concerns about the vtable object approach. This is
> limiting, but this is one specific GC requirement. (Maybe for GC's that
> don't support pinning, the JIT can compare object->vtable->class for guarded
> devirtiualization, or even not do guarded devirtualization, sort of support
> the GC in downlevel mode). For the refcounting method we need to hand off
> between  GC and VM before and after processing weak references, update the
> generational or semispace related CL flags, and also use the GC to undo or
> rescue CL instances that may come alive due to the generational flag
> processing.
> 
> 
> 
> > >2- You do have overhead even on minor collections.  With my approach,
> > >you could limit the (quite similar to yours, if you put a
> > >class-loader/NULL pointer in the vtable) overhead only to selected GC
> > >cycles.
> 
> 
> I think the main advantage of the vtable object approach is that it is
> somewhat elegant and natural, if one can get past the idea of non C vtables
> :-). Special casing to avoid object->vtable scans during minor collections
> etc. just breaks that. Relying on GC all the way forces a class unloading
> overhead to every GC cycle( even for the young generation collections ).
> There is also a space overhead that I can't really estimate( proportional to
> class ....etc. etc.). As I understood it, there is no impact on MMTk based
> GC's, but I may be wrong.
> If class unloading is done at specific moments only, the refcounting
> approach does not add a perf overhead to each GC cycle, there is no heap
> overhead of the method either. But the former implies yet another
> secondary heuristic to optimally choose the class unloading triggers, this
> depends on the application profile and is not really once an hour/day etc.
> My guess( humbly ) would be that the refcounting method "may" be somewhat
> more time/space efficient, but that's probably not the only issue. There is
> the issue of implementation correctness, existing code, etc. And I don't
> know what's the best way to go to the next step.
> A suggestion could be to take Harmony-2000, review it, put it in a
> branch,

An alternative: the JIT can disable guarded devirtualization via an
option. Commit the unloading, use/tune GCV5 with that option until it
supports pinning. No branch required.

> tune and test it , wait for GCV5 to start supporting pinning, try with MMTk,
> and then integrate. If we do this, the refcounting approach would be a
> fallback for DRLVM.
> We need to decide on next steps, we cannot debate the algorithm forever :-)

-- 
Egor Pasko, Intel Managed Runtime Division


Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Rana Dasgupta <rd...@gmail.com>.
On 10/31/06, Etienne Gagnon <egagnon@sablevm.org > wrote:

> >Yet:
>
> >1- You do need pinning, so you rule out some of the simplest GCs (e.g.
> >simple, non-generational copying without pinning.)  [Apparently, for
> >some very large heaps, simple copying a can be quite difficult to beat,
> >efficiency wise, if you believe some relatively recent JikesRVM related
> >paper...]


Yes, this was one of my concerns about the vtable object approach. This is
limiting, but this is one specific GC requirement. (Maybe for GCs that
don't support pinning, the JIT can compare object->vtable->class for guarded
devirtualization, or even not do guarded devirtualization, in effect supporting
the GC in downlevel mode.) For the refcounting method we need to hand off
between GC and VM before and after processing weak references, update the
generational or semispace related CL flags, and also use the GC to undo or
rescue CL instances that may come alive due to the generational flag
processing.



> >2- You do have overhead even on minor collections.  With my approach,
> >you could limit the (quite similar to yours, if you put a
> >class-loader/NULL pointer in the vtable) overhead only to selected GC
> >cycles.


I think the main advantage of the vtable object approach is that it is
somewhat elegant and natural, if one can get past the idea of non-C vtables
:-). Special-casing to avoid object->vtable scans during minor collections
etc. just breaks that. Relying on the GC all the way forces a class unloading
overhead on every GC cycle (even for young generation collections).
There is also a space overhead that I can't really estimate (proportional to
class ....etc. etc.). As I understood it, there is no impact on MMTk-based
GCs, but I may be wrong.

If class unloading is done at specific moments only, the refcounting
approach does not add a perf overhead to each GC cycle, and it has no heap
overhead either. But the former implies yet another
secondary heuristic to optimally choose the class unloading triggers; this
depends on the application profile and is not really once an hour/day etc.
My guess (humbly) would be that the refcounting method "may" be somewhat
more time/space efficient, but that's probably not the only issue. There is
the issue of implementation correctness, existing code, etc. And I don't
know what's the best way to go to the next step.

A suggestion could be to take Harmony-2000, review it, put it in a branch,
tune and test it, wait for GCV5 to start supporting pinning, try with MMTk,
and then integrate. If we do this, the refcounting approach would be a
fallback for DRLVM.
We need to decide on next steps, we cannot debate the algorithm forever :-)

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Weldon Washburn <we...@gmail.com>.
On 10/31/06, Etienne Gagnon <eg...@sablevm.org> wrote:
>
> Yet:
>
> 1- You do need pinning, so you rule out some of the simplest GCs (e.g.
> simple, non-generational copying without pinning.)  [Apparently, for
> some very large heaps, simple copying a can be quite difficult to beat,
> efficiency wise, if you believe some relatively recent JikesRVM related
> paper...]
>
> 2- You do have overhead even on minor collections.  With my approach,
> you could limit the (quite similar to yours, if you put a
> class-loader/NULL pointer in the vtable) overhead only to selected GC
> cycles.
>
> Of course, I am sure that all of the proposed approaches have their
> benefits/drawbacks.  I was simply contributing to the ongoing
> discussion.  I have no special reason to try very hard to convince you
> that "my idea is better than yours"!  I'm only joining the debate for
> trying find the most suitable solution.  I've already gained knowledge,
> from the discussion so far, that I'll be able to apply eventually in
> SableVM. :-)
>
> Maybe the best solution lies in mixing some of the various ideas
> proposed so far...


I too learned a lot from this thread.  I also suspect a better solution will
emerge from these kinds of discussions.

> Etienne
>
> Ivan Volosyuk wrote:
> > Actually, no need to add the overhead to _all_ cycles. We don't need
> > to trace the vtables everytime. On minor collections all the pinned
> > vtables can be linearly scanned, thus most expensive tracing from
> > object to vtable can be avoided in this case.
> >
>
> --
> Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
> SableVM:                                       http://www.sablevm.org/
> SableCC:                                       http://www.sablecc.org/
>
>
>


-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Etienne Gagnon <eg...@sablevm.org>.
Yet:

1- You do need pinning, so you rule out some of the simplest GCs (e.g.
simple, non-generational copying without pinning.)  [Apparently, for
some very large heaps, simple copying can be quite difficult to beat,
efficiency-wise, if you believe some relatively recent JikesRVM-related
paper...]

2- You do have overhead even on minor collections.  With my approach,
you could limit the (quite similar to yours, if you put a
class-loader/NULL pointer in the vtable) overhead only to selected GC
cycles.

Of course, I am sure that all of the proposed approaches have their
benefits/drawbacks.  I was simply contributing to the ongoing
discussion.  I have no special reason to try very hard to convince you
that "my idea is better than yours"!  I'm only joining the debate to
try to find the most suitable solution.  I've already gained knowledge,
from the discussion so far, that I'll be able to apply eventually in
SableVM. :-)

Maybe the best solution lies in mixing some of the various ideas
proposed so far...

Etienne

Ivan Volosyuk wrote:
> Actually, no need to add the overhead to _all_ cycles. We don't need
> to trace the vtables everytime. On minor collections all the pinned
> vtables can be linearly scanned, thus most expensive tracing from
> object to vtable can be avoided in this case.
> 

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Ivan Volosyuk <iv...@gmail.com>.
Actually, there is no need to add the overhead to _all_ cycles. We don't need
to trace the vtables every time. On minor collections all the pinned
vtables can be linearly scanned, so the most expensive tracing, from
object to vtable, can be avoided in this case.
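
One possible reading of this, sketched in Python for brevity; the function names and the dictionary-based object model are hypothetical, and the real GC would differ. On a minor collection every pinned vtable is visited once in a linear pass and kept, with no per-object vtable lookup. Only a major collection marks vtables through live objects and finds the unreferenced ones.

```python
def minor_collection(young_objects, is_reachable, pinned_vtables):
    """Keep all pinned vtables via one linear pass; never follow object->vtable."""
    survivors = [o for o in young_objects if is_reachable(o)]
    kept_vtables = list(pinned_vtables)   # cost: O(number of vtables)
    return survivors, kept_vtables


def major_collection(all_objects, is_reachable, pinned_vtables):
    """Mark vtables through live objects; unmarked ones are unloading candidates."""
    marked = set()
    survivors = []
    for o in all_objects:
        if is_reachable(o):
            survivors.append(o)
            marked.add(o["vtable"])       # cost: one extra trace step per live object
    unload_candidates = [vt for vt in pinned_vtables if vt not in marked]
    return survivors, unload_candidates


vtables = ["vt:Foo", "vt:Bar"]
objs = [
    {"id": 1, "vtable": "vt:Foo", "live": True},
    {"id": 2, "vtable": "vt:Bar", "live": False},
]
minor_survivors, kept = minor_collection(objs, lambda o: o["live"], vtables)
major_survivors, candidates = major_collection(objs, lambda o: o["live"], vtables)
```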

-- 
Ivan
Intel Enterprise Solutions Software Division

On 10/31/06, Etienne Gagnon <eg...@sablevm.org> wrote:
> Actually, I think that Java vtables would be more expensive than my
> proposed approach (when you take my proposed improvements in my reply to
> Pavel Pervov), as you add overhead to all GC cycles!  [Unless you don't
> "trace" from every visited object to its vtable?]
>
> I really don't like much the idea of an "object" vtable.  It requires
> things such as "pinning", etc.  Looks more expensive than my solution.
>
> Etienne
>
> Rana Dasgupta wrote:
> > Etienne,
> >  This is a good design, thanks. Conceptually, reference counting in the VM
> > is somewhat similar to Aleksey's proposal 1, if I understand correctly.
> > This
> > design also requires quite a few hand-offs between the VM and GC. In DRLVM,
> > the problem is that we have quite a few GC's, not all within our control.
> >  However, it seems to me that we can either desire to make unloading
> > automatic, in which case, we will need things like java vtables etc and
> > leave most things to the GC. Or we can do refcounting or tracing in the VM,
> > and work lock step with the GC(s). I am not sure which is the better way.

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Etienne Gagnon <eg...@sablevm.org>.
Actually, I think that Java vtables would be more expensive than my
proposed approach (when you take my proposed improvements in my reply to
Pavel Pervov), as you add overhead to all GC cycles!  [Unless you don't
"trace" from every visited object to its vtable?]

I really don't much like the idea of an "object" vtable.  It requires
things such as "pinning", etc.  It looks more expensive than my solution.

Etienne

Rana Dasgupta wrote:
> Etienne,
>  This is a good design, thanks. Conceptually, reference counting in the VM
> is somewhat similar to Aleksey's proposal 1, if I understand correctly.
> This
> design also requires quite a few hand-offs between the VM and GC. In DRLVM,
> the problem is that we have quite a few GC's, not all within our control.
>  However, it seems to me that we can either desire to make unloading
> automatic, in which case, we will need things like java vtables etc and
> leave most things to the GC. Or we can do refcounting or tracing in the VM,
> and work lock step with the GC(s). I am not sure which is the better way.

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Rana Dasgupta <rd...@gmail.com>.
Etienne,
  This is a good design, thanks. Conceptually, reference counting in the VM
is somewhat similar to Aleksey's proposal 1, if I understand correctly. This
design also requires quite a few hand-offs between the VM and GC. In DRLVM,
the problem is that we have quite a few GCs, not all within our control.
  However, it seems to me that we can either make unloading automatic, in
which case we will need things like Java vtables and leave most things to
the GC, or we can do refcounting or tracing in the VM and work in lockstep
with the GC(s). I am not sure which is the better way.

Thanks,
Rana


On 10/30/06, Etienne Gagnon <eg...@sablevm.org> wrote:
>
> Hi all,
>
> Here's a more structured proposal for a simple and effective
> implementation of class unloading support.
>
> In accordance with Section 2.17.8 of the JVM spec, class unloading (and
> its related native resource cleanup) can only happen when the class
> loader instance becomes unreachable.  For this to happen, we put in
> place the following things:
>
> 1- Each class loader is represented by some VM internal structure.
> [We'll call it the "class loader structure"].
>
> 2- Each class loader internal structure, except (optionally) the
> bootstrap class loader, maintains a weak reference to an object
> instance of class ClassLoader (or some subclass).  The Java instance
> has some opaque pointer back to the internal VM structure.   The Java
> instance is usually created before the internal VM structure.  The
> instance constructor is usually in charge of creating the internal VM
> structure.  [We'll call it the "class loader instance"]
>
> 3- Each class loader instance maintains a collection of loaded classes.
> A class/interface is never removed from this collection.  This
> collection maintains "hard" (i.e. "not weak") references to
> classes/interfaces.
>
> 4- [Informative] A class loader instance is also most likely to maintain
> a collection of classes for which it has "initiated" class loading.
> This collection should use hard references (as weak references won't
> lead to earlier class loading).
>
> 5- Each class loader instance maintains a hard reference to its parent
> class loader.  This reference is (optionally) null if the parent is the
> bootstrap class loader.
>
> 6- Each j.l.Class instance maintains a hard reference to the class
> loader instance of the class loader that has loaded it.  [This is not
> the "initiating" loaders, but really the "loading" loader].
>
> 7- Each class loader structure maintains a set of boolean flags, one
> flag per "non-nursery" garbage collected area (even when thread-local
> heaps are used).  The flag is set when an instance of a class loaded by
> this class loader is moved into the related GC-area.  The flag is unset
> when the GC-area is emptied, or (optionally) when it can be determined
> that no instance of a class loaded by this class loader remains in the
> GC-area.  This is best implemented as follows: a) use an unconditional
> write of "true" in the flag every time an object is moved into the
> GC-area by the garbage collector, b) unset the related flag in "all"
> class loader structures just before collecting a GC-area, then setting
> the flag back when an object survives in the area.
>
> 8- Each method invocation frame maintains a hard reference to either its
> surrounding instance (in case of instance methods, i.e. (invokevirtual,
> invokeinterface, and invokespecial) or its surrounding class
> (invokestatic).  This is already required for synchronized methods
> (it's not a good idea to allow the instance to be collected before the
> end of a synchronized instance method call; yep, learned the hard way
> in SableVM...)  So, the "overhead" is quite minimal.  The importance of
> this is in the correctness of not letting a class loader to die while a
> static/instance method of a class loaded by it is still active, leading
> to premature release of native resources (such as jitted code, etc.).
>
> 9- A little magic is required to prevent premature collection of a class
> loader instance and its loaded j.l.Class instances (see [3-] above), as
> object instances do not maintain a hard reference to their j.l.Class
> instance, yet we want to preserve the correctness of Object.getClass().
>
> So, the simplest approach is to maintain a hard reference in a class
> loader structure to its class loader instance (in addition to the weak
> reference in [2-] above).  This reference is kept always set (thus
> preventing collection of the class loader instance), except when *all*
> the following conditions are met:
> a) All nurseries are empty.
> b) All GC-area flags are unset.
>
> Actually, for making this practical and preserving correctness, it's a
> little trickier.  It requires a 2-step process, much like the
> object-finalization dance.  Here's a typical example:
>
> On a major collection, where all nurseries are collected, and some (but
> not necessarily all) other GC-areas are collected, we do the following
> sequence of actions:
> a) All class loader structures are visited.  All flags related to
>   non-nursery GC-areas that we intend to collect are unset.  If this
>   leads to *all* flags to be unset, the hard reference to the class
>   loader instance is set to NULL (thus enabling, possibly, the
>   collection of the class loader instance).
>
> b) The garbage collection cycle is started and proceeds as usual.
>   Note that the work mandated in [7-] above is also done, which might
>   lead to setting back some flags in class loader structures that had
>   all their flags unset in [a)].
>
> c) After the initial garbage collection is applied, and just before
>   the usual treatment of weak references (where they are set to NULL
>   when pointing to a collected object), all class loader structures
>   are visited again.  The hard pointer of every class loader structure
>   that has any flag set is set back to point to the class loader
>   instance if it was NULL (same as how object instances are preserved
>   for finalization).
>
> d) If [c)] has triggered any change (i.e. it mandates the survival of
>   additional class loader instances that were due to die), the garbage
>   collection cycle is "extended" to rescue the additional class loader
>   instances and all objects they can reach.
>
> e) Any additional work of the garbage collection cycle is done (e.g.
>   soft, weak, and phantom references, finalization handling).
>
> f) All class loader structures are visited again.  Every structure for
>   which the weak reference has NOT been set to NULL has its hard
>   reference set to the weak reference target.  Every structure for
>   which the weak reference has been set to NULL is now ready to be
>   unloaded ( i.e. release all of its native resources, including jitted
>   code, class information, method information, vtables, and so on).
>
>
> In addition, I highly recommend using the approach proposed in Chapter 3
> of http://sablevm.org/people/egagnon/gagnon-phd.pdf for managing
> class-loader related memory.  It has many advantages:
>
> 1- No "header space" overhead for very small allocations.  [This is a
> typical "hidden" space overhead of malloc() implementations to allow
> for later free() calls].
> 2- Minimal memory fragmentation.  [Allocation only happens in large
>   blocks].
> 3- Simple and very efficient allocation.  [No overhead for complex
>   management of freeing small areas later].
> 4- Efficient freeing of large memory blocks on class unloading.
> 5- Possibility of clever usage of this memory; see Chapter 4 of the same
>   document for the implementation of sparse interface virtual tables
>   enabling invokeinterface at the simple cost of invokevirtual.  :-)
>
>
> I hope this is useful to both projects [drlvm][sablevm]  :-)
>
> Etienne
>
> (C) 2006 by Etienne M. Gagnon <eg...@sablevm.org>
> This text is licensed under the Apache License, Version 2.0.
>
> [You may add this document in svn;  I am willing to sign the required
> Apache agreement to make it so, if you intend to use it in drlvm's
> implementation].
>
> --
> Etienne M. Gagnon, Ph.D.             http://www.info2.uqam.ca/~egagnon/
> SableVM:                                       http://www.sablevm.org/
> SableCC:                                       http://www.sablecc.org/
>
>
>
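
The collection sequence a) to f) in the quoted proposal can be sketched in
C roughly as follows.  This is only an illustration under stated
assumptions, not SableVM or DRLVM code: the struct, its field names, the
three function names and GC_AREA_COUNT are all hypothetical, every entry
is assumed to be an unloadable (non-bootstrap) loader, and the tracing and
weak-reference steps b), d) and e) are left to the collector, which is
expected to set flags for survivors and to clear weak_instance when the
loader instance dies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GC_AREA_COUNT 3   /* hypothetical number of non-nursery GC areas */

/* Hypothetical "class loader structure" (items 2, 7 and 9). */
typedef struct class_loader_info {
    void *weak_instance;             /* weak ref to the ClassLoader object */
    void *hard_instance;             /* hard ref; NULL permits collection  */
    bool  area_flags[GC_AREA_COUNT]; /* one flag per non-nursery GC area   */
    bool  unloaded;                  /* native resources already released  */
} class_loader_info;

static bool any_flag_set(const class_loader_info *cl)
{
    for (size_t i = 0; i < GC_AREA_COUNT; i++)
        if (cl->area_flags[i])
            return true;
    return false;
}

/* Step a: unset the flags of the areas about to be collected and drop the
 * hard reference of every loader left with no flag set. */
void unloading_before_trace(class_loader_info *cls, size_t n,
                            const bool collected[GC_AREA_COUNT])
{
    assert(cls != NULL || n == 0);
    for (size_t k = 0; k < n; k++) {
        for (size_t i = 0; i < GC_AREA_COUNT; i++)
            if (collected[i])
                cls[k].area_flags[i] = false;
        if (!any_flag_set(&cls[k]))
            cls[k].hard_instance = NULL;
    }
}

/* Step c: after tracing, but before weak references are cleared, restore
 * the hard reference of loaders that got a flag set again.  A nonzero
 * result means the trace may have to be extended (step d). */
size_t unloading_after_trace(class_loader_info *cls, size_t n)
{
    size_t rescued = 0;
    for (size_t k = 0; k < n; k++) {
        if (cls[k].hard_instance == NULL && any_flag_set(&cls[k])) {
            cls[k].hard_instance = cls[k].weak_instance;
            rescued++;
        }
    }
    return rescued;
}

/* Step f: after reference processing, re-arm survivors and unload the
 * loaders whose weak reference was cleared.  Returns the unload count. */
size_t unloading_after_references(class_loader_info *cls, size_t n)
{
    size_t unloaded = 0;
    for (size_t k = 0; k < n; k++) {
        if (cls[k].unloaded)
            continue;
        if (cls[k].weak_instance != NULL) {
            cls[k].hard_instance = cls[k].weak_instance;
        } else {
            /* A real VM would release jitted code, vtables, etc. here. */
            cls[k].unloaded = true;
            unloaded++;
        }
    }
    return unloaded;
}
```

Splitting the work into three entry points keeps the collector's own
tracing code untouched; the collector only has to call them at the right
moments of a major collection.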

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Etienne Gagnon <eg...@sablevm.org>.
>> 4- [Informative] A class loader instance is also most likely to maintain
>>...
> This is not true. Look for the thread "[drlvm] Non-bug difference
> HARMONY-1688?", where Eugene Ostrovsky described initiating loaders in
> detail, with links to the specification.

OK.

>> 7- Each class loader structure maintains a set of boolean flags, one
>>...
> Requires identification of an object's class type during GC. Will most
> probably degrade GC performance.

Not necessarily.  It really depends whether you want to "always" care
about class unloading, or if you only care about it when doing "major"
collections.  Maybe you only want to unload classes on "full
collections", when all generations are collected.  In such a case, you
would not do anything special (e.g. not maintain these bits) during any
other collection than full ones.

As for type identification, this is not necessarily required.  You only
need to add a pointer in the vtable header (That's a 1-liner in SableVM)
that points to:

1- NULL for any class of an "unloadable class loader" (e.g. bootstrap,
system?)

2- ClassLoader structure, for ones that we wish to unload (user class
loader).

Maybe that's the "big" change to the vtable that was argued about in
this thread?  If yes, the "bigness" of it was quite misleading to me;
such a change is a trivial one, to me.  In SableVM, it's really just the
following change:

1- Add a field in the vtable "struct" in file type.h  (1 line)
2- Initialize the field to non-zero for classes of non-bootstrap loader
(1 line).

No big deal...
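
A minimal sketch of that vtable change, with the GC-side flag write from
item 7 added to show why no class type identification is needed.  The
struct layouts and the names class_loader_info, vtable, object,
gc_note_object_in_area and GC_AREA_COUNT are assumptions made for this
illustration; they are not copied from SableVM's type.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GC_AREA_COUNT 4   /* hypothetical number of non-nursery GC areas */

/* Hypothetical VM-internal "class loader structure". */
typedef struct class_loader_info {
    bool area_flags[GC_AREA_COUNT];   /* one flag per non-nursery GC area */
} class_loader_info;

/* Hypothetical vtable header; only the `loader` field is the addition. */
typedef struct vtable {
    class_loader_info *loader;   /* NULL for bootstrap/system classes */
    /* ... method pointers would follow in a real vtable ... */
} vtable;

/* Hypothetical object header: the first word points at the vtable. */
typedef struct object {
    const vtable *vt;
} object;

/* Called by the GC when it moves (or keeps) an object in `area`.
 * One extra load and one NULL test; no lookup of the object's class. */
static inline void gc_note_object_in_area(const object *obj, size_t area)
{
    assert(area < GC_AREA_COUNT);
    class_loader_info *loader = obj->vt->loader;
    if (loader != NULL)
        loader->area_flags[area] = true;   /* unconditional write */
}
```

Objects of bootstrap classes cost one NULL check; objects of user-loaded
classes cost that check plus a single store.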

As an additional optimization(???), one could set a bit in the object
header when the pointer (in the vtable) is not NULL, yet parsing the
bits might cost more than dereferencing the vtable pointer and checking
the field against NULL.  [I know, this is most probably a very bad idea!]


You could even go further and only do class unloading when a special
request is made for it.  This way, you don't do anything special during
normal collection.  When the special request is done, you do a full GC
and unload any class (and loader) you can...

I guess that some of these ideas have already been discussed on this
thread; I likely misunderstood some of the few messages I read.


>> 8- Each method invocation frame maintains a hard reference to either its
>>...
> Not generally true for optimizing JITs. "This" (or "class") can be omitted
> from enumeration if it is not used anywhere in the code. Generally, this
> technique reduces number of registers used in the code ("register pressure"
> they call it :)).

OK.  Yet, for correctness, whenever you unload classes, you must make
sure that you take into account the classes of active methods.  This can
be achieved in various ways; I was proposing one that was natural to
SableVM. :-)
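
One of those other ways, sketched under assumptions: instead of relying on
"this" being enumerated, the VM walks the stack before unloading and marks
the loader of every active method.  The frame layout and the names
class_loader_info, method_info, frame and mark_loaders_of_active_methods
are hypothetical; a JIT that inlines methods would also have to report the
inlined methods through its own metadata, which is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "class loader structure". */
typedef struct class_loader_info {
    bool has_active_method;   /* set while the stacks are scanned */
} class_loader_info;

/* Hypothetical per-method metadata. */
typedef struct method_info {
    class_loader_info *loader;   /* defining loader; NULL = bootstrap */
} method_info;

/* Hypothetical stack frame, linked to its caller. */
typedef struct frame {
    const method_info *method;
    const struct frame *caller;   /* NULL at the bottom of the stack */
} frame;

/* Walk one thread's stack and pin the loader of every active method, so
 * that its native resources are not released while code is running. */
void mark_loaders_of_active_methods(const frame *top)
{
    for (const frame *f = top; f != NULL; f = f->caller) {
        assert(f->method != NULL);
        class_loader_info *loader = f->method->loader;
        if (loader != NULL)
            loader->has_active_method = true;
    }
}
```

A loader with has_active_method set would simply be skipped by the
unloading pass, whatever the flags and references say.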

>> 9- A little magic is required to prevent premature collection of a class
>>...
> This requires more involvement of a GC in unloading process and affects GC
> code more. In DRLVM, GC is designed to be a replaceable component.
> Moreover,
> we already have 3 different working GCs and MMTk on the way. So, including
> GC into the design is not a good idea for DRLVM.

There is a dependency between GC and class unloading.  Somehow, you must
be aware of whether there are still instances of needed classes around.
You don't need to "always" care about class unloading while doing GC; as
I said above, you could reduce the overhead to well-defined moments.
[You could have rules such as: at full collections only, and no more than
once per 1 hour | 10 minutes | ...]
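
Such a rule could look like the following sketch.  The names
unload_policy and should_try_unloading, and the idea of passing the
current time in as a parameter, are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical policy state kept by the VM. */
typedef struct unload_policy {
    double min_interval_seconds;   /* e.g. 600.0 for ten minutes */
    time_t last_attempt;           /* valid once attempted_before is set */
    bool   attempted_before;
} unload_policy;

/* Decide whether this collection should also attempt class unloading:
 * only on full collections, and at most once per configured interval. */
bool should_try_unloading(unload_policy *p, bool is_full_collection,
                          time_t now)
{
    assert(p != NULL);
    if (!is_full_collection)
        return false;
    if (p->attempted_before &&
        difftime(now, p->last_attempt) < p->min_interval_seconds)
        return false;
    p->last_attempt = now;
    p->attempted_before = true;
    return true;
}
```

When the function returns false, the collector would skip all flag and
reference bookkeeping for that cycle.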


-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Xiao-Feng Li <xi...@gmail.com>.
On 10/31/06, Pavel Pervov <pm...@gmail.com> wrote:
> Ignatenko vs Gagnon proposal checklist follows. :)
>
>
> > In accordance with Section 2.17.8 of the JVM spec, class unloading (and
> > its related native resource cleanup) can only happen when the class
> > loader instance becomes unreachable.  For this to happen, we put in
> > place the following things:
>
> > 1- Each class loader is represented by some VM internal structure.
> > [We'll call it the "class loader structure"].
>
>
> This is true.
>
>
>
> > 2- Each class loader internal structure, except (optionally) the
> > bootstrap class loader, maintains a weak reference to an object
> > instance of class ClassLoader (or some subclass).  The Java instance
> > has some opaque pointer back to the internal VM structure.   The Java
> > instance is usually created before the internal VM structure.  The
> > instance constructor is usually in charge of creating the internal VM
> > structure.  [We'll call it the "class loader instance"]
>
>
> This is true.
>
>
>
> > 3- Each class loader instance maintains a collection of loaded classes.
> > A class/interface is never removed from this collection.  This
> > collection maintains "hard" (i.e. "not weak") references to
> > classes/interfaces.
>
>
> This is true.
>
>
>
> > 4- [Informative] A class loader instance is also most likely to maintain
> > a collection of classes for which it has "initiated" class loading.
> > This collection should use hard references (as weak references won't
> > lead to earlier class loading).
>
>
> This is not true. Look for the thread "[drlvm] Non-bug difference
> HARMONY-1688?", where Eugene Ostrovsky described initiating loaders in
> detail, with links to the specification.
>
>
> > 5- Each class loader instance maintains a hard reference to its parent
> > class loader.  This reference is (optionally) null if the parent is the
> > bootstrap class loader.
>
>
> This is true. This is actually a part of delegation framework.
>
>
>
> > 6- Each j.l.Class instance maintains a hard reference to the class
> > loader instance of the class loader that has loaded it.  [This is not
> > the "initiating" loaders, but really the "loading" loader].
>
>
> This is true. AFAIU, this class loader is called "defining" loader for a
> class.
>
>
>
> > 7- Each class loader structure maintains a set of boolean flags, one
> > flag per "non-nursery" garbage collected area (even when thread-local
> > heaps are used).  The flag is set when an instance of a class loaded by
> > this class loader is moved into the related GC-area.  The flag is unset
> > when the GC-area is emptied, or (optionally) when it can be determined
> > that no instance of a class loaded by this class loader remains in the
> > GC-area.  This is best implemented as follows: a) use an unconditional
> > write of "true" in the flag every time an object is moved into the
> > GC-area by the garbage collector, b) unset the related flag in "all"
> > class loader structures just before collecting a GC-area, then setting
> > the flag back when an object survives in the area.
>
>
> Requires identification of an object's class type during GC. Will most
> probably degrade GC performance.

Yes, this is also my concern.

Thanks,
xiaofeng

> > 8- Each method invocation frame maintains a hard reference to either its
> > surrounding instance (in case of instance methods, i.e. (invokevirtual,
> > invokeinterface, and invokespecial) or its surrounding class
> > (invokestatic).  This is already required for synchronized methods
> > (it's not a good idea to allow the instance to be collected before the
> > end of a synchronized instance method call; yep, learned the hard way
> > in SableVM...)  So, the "overhead" is quite minimal.  The importance of
> > this is in the correctness of not letting a class loader to die while a
> > static/instance method of a class loaded by it is still active, leading
> > to premature release of native resources (such as jitted code, etc.).
>
>
> Not generally true for optimizing JITs. "This" (or "class") can be omitted
> from enumeration if it is not used anywhere in the code. Generally, this
> technique reduces number of registers used in the code ("register pressure"
> they call it :)).
>
>
>
> > 9- A little magic is required to prevent premature collection of a class
> > loader instance and its loaded j.l.Class instances (see [3-] above), as
> > object instances do not maintain a hard reference to their j.l.Class
> > instance, yet we want to preserve the correctness of Object.getClass().
> >
> > So, the simplest approach is to maintain a hard reference in a class
> > loader structure to its class loader instance (in addition to the weak
> > reference in [2-] above).  This reference is kept always set (thus
> > preventing collection of the class loader instance), except when *all*
> > the following conditions are met:
> > a) All nurseries are empty.
> > b) All GC-area flags are unset.
>
>
> This requires more involvement of a GC in unloading process and affects GC
> code more. In DRLVM, GC is designed to be a replaceable component. Moreover,
> we already have 3 different working GCs and MMTk on the way. So, including
> GC into the design is not a good idea for DRLVM.
>
>
> <SNIP>
>
> > In addition, I highly recommend using the approach proposed in Chapter 3
> > of http://sablevm.org/people/egagnon/gagnon-phd.pdf for managing
> > class-loader related memory.  It has many advantages:
>
>
> It is also true. Per class loader memory allocation is already used for part
> of data allocated for this class loader. Look in HARMONY-2000 which brings
> per-class loader pools to the extent.
>
> <SNIP>
>
> --
> Pavel Pervov,
> Intel Enterprise Solutions Software Division
>
>

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Weldon Washburn <we...@gmail.com>.
On 10/30/06, Pavel Pervov <pm...@gmail.com> wrote:
>
> Ignatenko vs Gagnon proposal checklist follows. :)
>
>
> > In accordance with Section 2.17.8 of the JVM spec, class unloading (and
> > its related native resource cleanup) can only happen when the class
> > loader instance becomes unreachable.  For this to happen, we put in
> > place the following things:
>
> > 1- Each class loader is represented by some VM internal structure.
> > [We'll call it the "class loader structure"].
>
>
> This is true.
>
>
>
> > 2- Each class loader internal structure, except (optionally) the
> > bootstrap class loader, maintains a weak reference to an object
> > instance of class ClassLoader (or some subclass).  The Java instance
> > has some opaque pointer back to the internal VM structure.   The Java
> > instance is usually created before the internal VM structure.  The
> > instance constructor is usually in charge of creating the internal VM
> > structure.  [We'll call it the "class loader instance"]
>
>
> This is true.
>
>
>
> > 3- Each class loader instance maintains a collection of loaded classes.
> > A class/interface is never removed from this collection.  This
> > collection maintains "hard" (i.e. "not weak") references to
> > classes/interfaces.
>
>
> This is true.
>
>
>
> > 4- [Informative] A class loader instance is also most likely to maintain
> > a collection of classes for which it has "initiated" class loading.
> > This collection should use hard references (as weak references won't
> > lead to earlier class loading).
>
>
> This is not true. Look for the thread "[drlvm] Non-bug difference
> HARMONY-1688?", where Eugene Ostrovsky described initiating loaders in
> detail, with links to the specification.
>
>
> > 5- Each class loader instance maintains a hard reference to its parent
> > class loader.  This reference is (optionally) null if the parent is the
> > bootstrap class loader.
>
>
> This is true. This is actually a part of delegation framework.
>
>
>
> > 6- Each j.l.Class instance maintains a hard reference to the class
> > loader instance of the class loader that has loaded it.  [This is not
> > the "initiating" loaders, but really the "loading" loader].
>
>
> This is true. AFAIU, this class loader is called "defining" loader for a
> class.
>
>
>
> > 7- Each class loader structure maintains a set of boolean flags, one
> > flag per "non-nursery" garbage collected area (even when thread-local
> > heaps are used).  The flag is set when an instance of a class loaded by
> > this class loader is moved into the related GC-area.  The flag is unset
> > when the GC-area is emptied, or (optionally) when it can be determined
> > that no instance of a class loaded by this class loader remains in the
> > GC-area.  This is best implemented as follows: a) use an unconditional
> > write of "true" in the flag every time an object is moved into the
> > GC-area by the garbage collector, b) unset the related flag in "all"
> > class loader structures just before collecting a GC-area, then setting
> > the flag back when an object survives in the area.
>
>
> Requires identification of an object's class type during GC. Will most
> probably degrade GC performance.


Good point.  The performance impact would have to be measured to get an
idea of how large it is.

> > 8- Each method invocation frame maintains a hard reference to either its
> > surrounding instance (in case of instance methods, i.e. (invokevirtual,
> > invokeinterface, and invokespecial) or its surrounding class
> > (invokestatic).  This is already required for synchronized methods
> > (it's not a good idea to allow the instance to be collected before the
> > end of a synchronized instance method call; yep, learned the hard way
> > in SableVM...)  So, the "overhead" is quite minimal.  The importance of
> > this is in the correctness of not letting a class loader to die while a
> > static/instance method of a class loaded by it is still active, leading
> > to premature release of native resources (such as jitted code, etc.).
>
>
> Not generally true for optimizing JITs. "This" (or "class") can be omitted
> from enumeration if it is not used anywhere in the code. Generally, this
> technique reduces number of registers used in the code ("register
> pressure"
> they call it :)).


Good point.  If a JIT inlines a method that makes zero reference to "this",
there may not be a way of identifying the class involved.

> > 9- A little magic is required to prevent premature collection of a class
> > loader instance and its loaded j.l.Class instances (see [3-] above), as
> > object instances do not maintain a hard reference to their j.l.Class
> > instance, yet we want to preserve the correctness of Object.getClass().
> >
> > So, the simplest approach is to maintain a hard reference in a class
> > loader structure to its class loader instance (in addition to the weak
> > reference in [2-] above).  This reference is kept always set (thus
> > preventing collection of the class loader instance), except when *all*
> > the following conditions are met:
> > a) All nurseries are empty.
> > b) All GC-area flags are unset.
>
>
> This requires more involvement of a GC in unloading process and affects GC
> code more. In DRLVM, GC is designed to be a replaceable component.
> Moreover,
> we already have 3 different working GCs and MMTk on the way. So, including
> GC into the design is not a good idea for DRLVM.


Good point.

<SNIP>
>
> > In addition, I highly recommend using the approach proposed in Chapter 3
> > of http://sablevm.org/people/egagnon/gagnon-phd.pdf for managing
> > class-loader related memory.  It has many advantages:
>
>
> It is also true. Per class loader memory allocation is already used for
> part
> of data allocated for this class loader. Look in HARMONY-2000 which brings
> per-class loader pools to the extent.
>
> <SNIP>
>
> --
> Pavel Pervov,
> Intel Enterprise Solutions Software Division
>
>


-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Pavel Pervov <pm...@gmail.com>.
Ignatenko vs Gagnon proposal checklist follows. :)


> In accordance with Section 2.17.8 of the JVM spec, class unloading (and
> its related native resource cleanup) can only happen when the class
> loader instance becomes unreachable.  For this to happen, we put in
> place the following things:

> 1- Each class loader is represented by some VM internal structure.
> [We'll call it the "class loader structure"].


This is true.



> 2- Each class loader internal structure, except (optionally) the
> bootstrap class loader, maintains a weak reference to an object
> instance of class ClassLoader (or some subclass).  The Java instance
> has some opaque pointer back to the internal VM structure.   The Java
> instance is usually created before the internal VM structure.  The
> instance constructor is usually in charge of creating the internal VM
> structure.  [We'll call it the "class loader instance"]


This is true.



> 3- Each class loader instance maintains a collection of loaded classes.
> A class/interface is never removed from this collection.  This
> collection maintains "hard" (i.e. "not weak") references to
> classes/interfaces.


This is true.



> 4- [Informative] A class loader instance is also most likely to maintain
> a collection of classes for which it has "initiated" class loading.
> This collection should use hard references (as weak references won't
> lead to earlier class loading).


This is not true. Look for the thread "[drlvm] Non-bug difference
HARMONY-1688?", where Eugene Ostrovsky described initiating loaders in
detail, with links to the specification.


> 5- Each class loader instance maintains a hard reference to its parent
> class loader.  This reference is (optionally) null if the parent is the
> bootstrap class loader.


This is true. This is actually a part of delegation framework.



> 6- Each j.l.Class instance maintains a hard reference to the class
> loader instance of the class loader that has loaded it.  [This is not
> the "initiating" loaders, but really the "loading" loader].


This is true. AFAIU, this class loader is called "defining" loader for a
class.



> 7- Each class loader structure maintains a set of boolean flags, one
> flag per "non-nursery" garbage collected area (even when thread-local
> heaps are used).  The flag is set when an instance of a class loaded by
> this class loader is moved into the related GC-area.  The flag is unset
> when the GC-area is emptied, or (optionally) when it can be determined
> that no instance of a class loaded by this class loader remains in the
> GC-area.  This is best implemented as follows: a) use an unconditional
> write of "true" in the flag every time an object is moved into the
> GC-area by the garbage collector, b) unset the related flag in "all"
> class loader structures just before collecting a GC-area, then setting
> the flag back when an object survives in the area.


Requires identification of an object's class type during GC. Will most
probably degrade GC performance.



> 8- Each method invocation frame maintains a hard reference to either its
> surrounding instance (in case of instance methods, i.e. (invokevirtual,
> invokeinterface, and invokespecial) or its surrounding class
> (invokestatic).  This is already required for synchronized methods
> (it's not a good idea to allow the instance to be collected before the
> end of a synchronized instance method call; yep, learned the hard way
> in SableVM...)  So, the "overhead" is quite minimal.  The importance of
> this is in the correctness of not letting a class loader to die while a
> static/instance method of a class loaded by it is still active, leading
> to premature release of native resources (such as jitted code, etc.).


Not generally true for optimizing JITs. "This" (or "class") can be omitted
from enumeration if it is not used anywhere in the code. Generally, this
technique reduces number of registers used in the code ("register pressure"
they call it :)).



> 9- A little magic is required to prevent premature collection of a class
> loader instance and its loaded j.l.Class instances (see [3-] above), as
> object instances do not maintain a hard reference to their j.l.Class
> instance, yet we want to preserve the correctness of Object.getClass().
>
> So, the simplest approach is to maintain a hard reference in a class
> loader structure to its class loader instance (in addition to the weak
> reference in [2-] above).  This reference is kept always set (thus
> preventing collection of the class loader instance), except when *all*
> the following conditions are met:
> a) All nurseries are empty.
> b) All GC-area flags are unset.


This requires more involvement of the GC in the unloading process and
affects GC code more. In DRLVM, the GC is designed to be a replaceable
component. Moreover, we already have 3 different working GCs and MMTk on
the way. So, including the GC in the design is not a good idea for DRLVM.


<SNIP>

> In addition, I highly recommend using the approach proposed in Chapter 3
> of http://sablevm.org/people/egagnon/gagnon-phd.pdf for managing
> class-loader related memory.  It has many advantages:


This is also true. Per-class-loader memory allocation is already used for
part of the data allocated for a class loader. Look at HARMONY-2000, which
takes per-class-loader pools further.
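
For readers unfamiliar with the idea, a per-class-loader pool can be
sketched as a simple bump allocator.  This is a generic illustration, not
the code from HARMONY-2000 or from the cited thesis; the names
loader_arena, arena_alloc and arena_release, the 64 KB block size and the
8-byte alignment are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ARENA_BLOCK_SIZE (64u * 1024u)   /* hypothetical block size */
#define ARENA_ALIGN      8u              /* assumed alignment       */

/* Block header; the payload follows it directly in memory. */
typedef struct arena_block {
    struct arena_block *next;
    size_t used;               /* payload bytes already handed out */
} arena_block;

/* One arena per class loader structure. */
typedef struct loader_arena {
    arena_block *head;
} loader_arena;

/* Bump allocation: no per-allocation header and no individual free(). */
void *arena_alloc(loader_arena *a, size_t size)
{
    assert(a != NULL);
    if (size > ARENA_BLOCK_SIZE)
        return NULL;           /* oversized requests are not handled here */
    size = (size + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);
    arena_block *b = a->head;
    if (b == NULL || b->used + size > ARENA_BLOCK_SIZE) {
        b = malloc(sizeof *b + ARENA_BLOCK_SIZE);
        if (b == NULL)
            return NULL;
        b->next = a->head;
        b->used = 0;
        a->head = b;
    }
    void *p = (unsigned char *)(b + 1) + b->used;
    b->used += size;
    return p;
}

/* On class unloading, everything the loader allocated goes away at once. */
void arena_release(loader_arena *a)
{
    assert(a != NULL);
    arena_block *b = a->head;
    while (b != NULL) {
        arena_block *next = b->next;
        free(b);
        b = next;
    }
    a->head = NULL;
}
```

The trade-off is that memory is only returned when the whole loader is
unloaded, which fits data that lives exactly as long as its classes.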

<SNIP>

-- 
Pavel Pervov,
Intel Enterprise Solutions Software Division

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by Weldon Washburn <we...@gmail.com>.
I like it.  I don't fully understand the fine details yet.  But overall it
seems to be a clean design.  Maybe it makes sense for someone to prototype
this in drlvm.

On 10/30/06, Etienne Gagnon <eg...@sablevm.org> wrote:
>
> Hi all,
>
> Here's a more structured proposal for a simple and effective
> implementation of class unloading support.
>
> In accordance with Section 2.17.8 of the JVM spec, class unloading (and
> its related native resource cleanup) can only happen when the class
> loader instance becomes unreachable.  For this to happen, we put in
> place the following things:
>
> 1- Each class loader is represented by some VM internal structure.
> [We'll call it the "class loader structure"].
>
> 2- Each class loader internal structure, except (optionally) the
> bootstrap class loader, maintains a weak reference to an object
> instance of class ClassLoader (or some subclass).  The Java instance
> has some opaque pointer back to the internal VM structure.   The Java
> instance is usually created before the internal VM structure.  The
> instance constructor is usually in charge of creating the internal VM
> structure.  [We'll call it the "class loader instance"]
>
> 3- Each class loader instance maintains a collection of loaded classes.
> A class/interface is never removed from this collection.  This
> collection maintains "hard" (i.e. "not weak") references to
> classes/interfaces.
>
> 4- [Informative] A class loader instance is also most likely to maintain
> a collection of classes for which it has "initiated" class loading.
> This collection should use hard references (as weak references won't
> lead to earlier class loading).
>
> 5- Each class loader instance maintains a hard reference to its parent
> class loader.  This reference is (optionally) null if the parent is the
> bootstrap class loader.
>
> 6- Each j.l.Class instance maintains a hard reference to the class
> loader instance of the class loader that has loaded it.  [This is not
> the "initiating" loaders, but really the "loading" loader].
>
> 7- Each class loader structure maintains a set of boolean flags, one
> flag per "non-nursery" garbage collected area (even when thread-local
> heaps are used).  The flag is set when an instance of a class loaded by
> this class loader is moved into the related GC-area.  The flag is unset
> when the GC-area is emptied, or (optionally) when it can be determined
> that no instance of a class loaded by this class loader remains in the
> GC-area.  This is best implemented as follows: a) use an unconditional
> write of "true" in the flag every time an object is moved into the
> GC-area by the garbage collector, b) unset the related flag in "all"
> class loader structures just before collecting a GC-area, then set
> the flag back when an object survives in the area.
>
> 8- Each method invocation frame maintains a hard reference to either its
> surrounding instance (in case of instance methods, i.e. invokevirtual,
> invokeinterface, and invokespecial) or its surrounding class
> (invokestatic).  This is already required for synchronized methods
> (it's not a good idea to allow the instance to be collected before the
> end of a synchronized instance method call; yep, learned the hard way
> in SableVM...)  So, the "overhead" is quite minimal.  The importance of
> this is in the correctness of not letting a class loader die while a
> static/instance method of a class loaded by it is still active, leading
> to premature release of native resources (such as jitted code, etc.).
>
> 9- A little magic is required to prevent premature collection of a class
> loader instance and its loaded j.l.Class instances (see [3-] above), as
> object instances do not maintain a hard reference to their j.l.Class
> instance, yet we want to preserve the correctness of Object.getClass().
>
> So, the simplest approach is to maintain a hard reference in a class
> loader structure to its class loader instance (in addition to the weak
> reference in [2-] above).  This reference is kept always set (thus
> preventing collection of the class loader instance), except when *all*
> the following conditions are met:
> a) All nurseries are empty.
> b) All GC-area flags are unset.
>
> Actually, for making this practical and preserving correctness, it's a
> little trickier.  It requires a 2-step process, much like the
> object-finalization dance.  Here's a typical example:
>
> On a major collection, where all nurseries are collected, and some (but
> not necessarily all) other GC-areas are collected, we do the following
> sequence of actions:
> a) All class loader structures are visited.  All flags related to
>   non-nursery GC-areas that we intend to collect are unset.  If this
>   leads to *all* flags being unset, the hard reference to the class
>   loader instance is set to NULL (thus enabling, possibly, the
>   collection of the class loader instance).
>
> b) The garbage collection cycle is started and proceeds as usual.
>   Note that the work mandated in [7-] above is also done, which might
>   lead to setting back some flags in class loader structures that had
>   all their flags unset in [a)].
>
> c) After the initial garbage collection is applied, and just before
>   the usual treatment of weak references (where they are set to NULL
>   when pointing to a collected object), all class loader structures
>   are visited again.  The hard pointer of every class loader structure
>   that has any flag set is set back to point to the class loader
>   instance if it was NULL (same as how object instances are preserved
>   for finalization).
>
> d) If [c)] has triggered any change (i.e. it mandates the survival of
>   additional class loader instances that were due to die), the garbage
>   collection cycle is "extended" to rescue the additional class loader
>   instances and all objects they can reach.
>
> e) Any additional work of the garbage collection cycle is done (e.g.
>   soft, weak, and phantom references, finalization handling).
>
> f) All class loader structures are visited again.  Every structure for
>   which the weak reference has NOT been set to NULL has its hard
>   reference set to the weak reference target.  Every structure for
>   which the weak reference has been set to NULL is now ready to be
>   unloaded (i.e. release all of its native resources, including jitted
>   code, class information, method information, vtables, and so on).
>
>
> In addition, I highly recommend using the approach proposed in Chapter 3
> of http://sablevm.org/people/egagnon/gagnon-phd.pdf for managing
> class-loader related memory.  It has many advantages:
>
> 1- No "header space" overhead for very small allocations.  [This is a
> typical "hidden" space overhead of malloc() implementations to allow
> for later free() calls].
> 2- Minimal memory fragmentation.  [Allocation only happens in large
>   blocks].
> 3- Simple and very efficient allocation.  [No overhead for complex
>   management of freeing small areas later].
> 4- Efficient freeing of large memory blocks on class unloading.
> 5- Possibility of clever usage of this memory; see Chapter 4 of the same
>   document for the implementation of sparse interface virtual tables
>   enabling invokeinterface at the simple cost of invokevirtual.  :-)
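
Advantages 1- to 4- above are those of allocating class-loader memory
from large blocks that are only ever freed together.  A generic bump
allocator illustrates the idea; it is a sketch under assumptions, not a
reproduction of the design in the cited thesis.  The names
(class_loader_arena, arena_alloc, arena_release) and the 64 KiB block
size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define ARENA_BLOCK_SIZE ((size_t)64 * 1024)  /* hypothetical block size */

typedef struct arena_block {
    struct arena_block *next;  /* previously filled block, or NULL */
    size_t used;               /* bytes already handed out from data[] */
    size_t capacity;           /* usable bytes in data[] */
    max_align_t data[];        /* payload, aligned for any object type */
} arena_block;

/* One arena per class loader structure. */
typedef struct class_loader_arena {
    arena_block *current;      /* block currently being filled */
} class_loader_arena;

/* Allocates size bytes by bumping an offset; no per-allocation header is
   stored, so individual allocations cannot be freed.  Returns NULL on
   failure. */
void *arena_alloc(class_loader_arena *arena, size_t size)
{
    const size_t align = _Alignof(max_align_t);
    if (size > SIZE_MAX - (align - 1))
        return NULL;                      /* rounding up would overflow */
    size = (size + align - 1) & ~(align - 1);

    arena_block *b = arena->current;
    if (b == NULL || b->capacity - b->used < size) {
        /* Start a new large block; oversized requests get their own. */
        size_t capacity = size > ARENA_BLOCK_SIZE ? size : ARENA_BLOCK_SIZE;
        if (capacity > SIZE_MAX - offsetof(arena_block, data))
            return NULL;
        b = malloc(offsetof(arena_block, data) + capacity);
        if (b == NULL)
            return NULL;
        b->next = arena->current;
        b->used = 0;
        b->capacity = capacity;
        arena->current = b;
    }
    void *p = (unsigned char *)b->data + b->used;
    b->used += size;
    return p;
}

/* Called on class unloading: frees every block in one pass and returns
   the number of blocks released. */
size_t arena_release(class_loader_arena *arena)
{
    size_t freed = 0;
    arena_block *b = arena->current;
    while (b != NULL) {
        arena_block *next = b->next;
        free(b);
        freed++;
        b = next;
    }
    arena->current = NULL;
    return freed;
}
```

With this layout a class loader structure would own one
class_loader_arena, serve all of its vtable, method and class
information allocations from arena_alloc(), and call arena_release()
once when the loader is unloaded.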
>
>
> I hope this is useful to both projects [drlvm][sablevm]  :-)
>
> Etienne
>
> (C) 2006 by Etienne M. Gagnon <eg...@sablebm.org>
> This text is licensed under the Apache License, Version 2.0.
>
> [You may add this document in svn;  I am willing to sign the required
> Apache agreement to make it so, if you intend to use it in drlvm's
> implementation].
>
> --
> Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
> SableVM:                                       http://www.sablevm.org/
> SableCC:                                       http://www.sablecc.org/
>
>
>


-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [admin] ICLA / ACQ (Was: [drlvm][sablevm] Desing of Class Unloading Support)

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.
The ICLA should be faxed to the number on the document.  The ACQ can be sent scanned.

Etienne Gagnon wrote:
> Geir Magnusson Jr. wrote:
>> However, it would be great if you had an ICLA and ACQ on file to save
>> you the trouble of typing this in the future :)  "Better living through
>> paperwork!"
> 
> OK; I should have done this a while ago...  Can they be submitted by
> email (where?) as "scanned" documents in PDF format?  This is usually
> accepted here as much as Fax documents.  [Much better quality, actually!]
> 
> Etienne
> 

[admin] ICLA / ACQ (Was: [drlvm][sablevm] Desing of Class Unloading Support)

Posted by Etienne Gagnon <eg...@sablevm.org>.
Geir Magnusson Jr. wrote:
> However, it would be great if you had an ICLA and ACQ on file to save
> you the trouble of typing this in the future :)  "Better living through
> paperwork!"

OK; I should have done this a while ago...  Can they be submitted by
email (where?) as "scanned" documents in PDF format?  This is usually
accepted here as much as Fax documents.  [Much better quality, actually!]

Etienne

-- 
Etienne M. Gagnon, Ph.D.            http://www.info2.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/

Re: [drlvm][sablevm] Desing of Class Unloading Support

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.

Etienne Gagnon wrote:

[SNIP]

> 
> I hope this is useful to both projects [drlvm][sablevm]  :-)

This was really great - I need to go back and read it carefully.  Thanks
so much!

> 
> Etienne
> 
> (C) 2006 by Etienne M. Gagnon <eg...@sablebm.org>
> This text is licensed under the Apache License, Version 2.0.
> 
> [You may add this document in svn;  I am willing to sign the required
> Apache agreement to make it so, if you intend to use it in drlvm's
> implementation].

This isn't really necessary - by the terms of this list, anything
submitted is considered a contribution under the terms of the Apache
license and ICLA, unless noted otherwise as "NOT A CONTRIBUTION".

However, it would be great if you had an ICLA and ACQ on file to save
you the trouble of typing this in the future :)  "Better living through 
paperwork!"

geir