Posted to dev@harmony.apache.org by "Zakirov, Salikh" <sa...@intel.com> on 2006/02/15 13:45:59 UTC
[vm] finalization subsystem design

Hi,

I would like to discuss the design of a finalization subsystem.
Note that these are only my thoughts; they have not really been
implemented yet. Some of the design issues are still not clear,
so I am looking forward to hearing your opinions.

Thanks a lot!

==================

0. Introduction. 

The detailed design I am going to discuss is based on the following
high-level design:

	* at allocation time, all finalizable objects are registered with
	  the live finalization queue. The dead finalization queue is
	  initially empty.

	* all finalizable objects are kept alive while referenced from
	  a finalization queue, either the live or the dead one.

	* every once in a while, the GC initiates the process of
	  finalization detection. The GC scans the objects in the live
	  finalization queue one by one, and moves those found to be
	  unreachable to the dead finalization queue.

	* a dedicated finalization thread (or several threads)
	  asynchronously checks whether the dead finalization queue is
	  non-empty, and runs the finalize() method for the objects in it.
	  The reference to each object is then dropped from the dead
	  finalization queue.

	* finalized objects that are no longer referenced from either a
	  finalization queue or static fields are reclaimed by the GC in
	  exactly the same way as ordinary java objects.

0. Data structure: Finalization queue

The finalizable object queue may be implemented in a number of ways,
and the most appealing one is to keep it entirely within the java heap.
Keeping the finalization queue within the java heap gives the following
"for free":

	* a shortage of memory for the finalization queue automatically
	  results in OutOfMemoryError

	* memory availability in the java heap is more predictable
	  than native heap memory availability

	* reclamation of the memory used by the finalization queue and
	  finalizable objects is done automatically by the GC.

For example, the following java class may be used to keep both the live
finalization queue and dead finalization queue:

	class FinalizationQueue {
		FinalizationQueue next;  // null-terminated linked list
		Object[] array;          // the storage of the queued objects
		int live;                // array[0..live-1] are live objects
		int dead;                // array[live..live+dead-1] are dead objects
	}
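
The array layout above suggests how registration, promotion and removal could work within one chunk. The following is a minimal sketch under that assumption only; the class name FinalizationChunk and the method names addLive, promoteToDead and popDead are hypothetical and not part of the proposal.

```java
// Sketch of one chunk, following the layout described in the post:
// array[0..live-1] holds live objects, array[live..live+dead-1] holds dead ones.
// Class and method names are hypothetical.
class FinalizationChunk {
    FinalizationChunk next;   // null-terminated linked list of chunks
    final Object[] array;     // the storage of the queued objects
    int live;                 // array[0..live-1] are live objects
    int dead;                 // array[live..live+dead-1] are dead objects

    FinalizationChunk(int capacity) {
        array = new Object[capacity];
    }

    /** Registers a newly allocated finalizable object; returns false if the chunk is full. */
    boolean addLive(Object obj) {
        if (live + dead == array.length) {
            return false;
        }
        // Move the first dead object (if any) to the end of the dead region,
        // which frees slot `live` for the new live object.
        array[live + dead] = array[live];
        array[live] = obj;
        live++;
        return true;
    }

    /** Moves the live object at index i into the dead region. */
    void promoteToDead(int i) {
        if (i < 0 || i >= live) {
            throw new IndexOutOfBoundsException("not a live slot: " + i);
        }
        Object obj = array[i];
        array[i] = array[live - 1]; // fill the hole with the last live object
        array[live - 1] = obj;      // this slot becomes the first dead slot
        live--;
        dead++;
    }

    /** Removes and returns one dead object, or null if there is none. */
    Object popDead() {
        if (dead == 0) {
            return null;
        }
        int last = live + dead - 1;
        Object obj = array[last];
        array[last] = null; // drop the queue's reference to the object
        dead--;
        return obj;
    }
}
```

Note that promoteToDead(i) swaps another live object into slot i, so a scan over the live region has to re-examine index i after a promotion instead of advancing.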

Generally, I do not expect the sequential organization of the
finalization queue to cause any performance issues, as most
operations on the finalization queue are sequential anyway.

1. GC interface
The finalization subsystem interacts with the GC on three occasions:
	* allocation of a finalizable object
	* promotion of a finalizable object from the live to the dead queue
	* enumeration of the root set of the finalization queue

These interactions can be implemented with the following interface:

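	/* allocation of a finalizable object */
	ManagedObject* f_object_allocated(ManagedObject* obj);

	/* promotion from the live to the dead queue */
	void f_handle_finalization_queue();
	bool gc_is_object_live(ManagedObject* obj);

	/* enumeration of the root set */
	void f_enumerate_roots();
	void gc_add_root_set_entry(ManagedObject** root);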

The prefix "f_" denotes functions provided by the finalization subsystem
that the GC will call, and the prefix "gc_" denotes GC callbacks.

The context for these functions is the GC interface suggested earlier
by Weldon Washburn in

	http://wiki.apache.org/harmony-data/attachments/HarmonyArchitecture/attachments/gc_interface.txt.

The GC will interact with the finalization subsystem in the following way:

	- On object allocation, the GC checks whether the object's class
	  overrides the finalize() method by calling class_is_finalizable().
	  If the GC finds that the just-allocated object is finalizable,
	  it calls f_object_allocated().
	- The finalization subsystem receives the reference to the newly
	  allocated finalizable object and registers it with the live
	  finalization queue. Registration in the queue may require
	  allocating other objects, which in turn may cause garbage
	  collection. The finalization subsystem returns the object pointer
	  after it registers the object successfully. It may return NULL
	  if the registration failed due to a shortage of memory.
	  Some care must be taken to protect the pointer to the allocated
	  object while other allocations take place, because any allocation
	  may cause garbage collection, and garbage collection may move
	  objects.
	- The finalization subsystem returns the object pointer to the GC,
	  and the GC passes it on to the caller of the allocation function.

	- Later, in the middle of the next garbage collection cycle,
	  after the GC has completed tracing and marking of strongly
	  reachable objects, the GC calls f_handle_finalization_queue().
	- The finalization subsystem iterates over the live finalization
	  queue and calls gc_is_object_live() for each live finalizable
	  object.
	- The GC returns false from gc_is_object_live() if the object was
	  found to be provably unreachable, and true otherwise.
	  Thus, finalization may be postponed indefinitely in incremental
	  or conservative collectors.
	- The finalization subsystem moves provably unreachable finalizable
	  objects from the live to the dead queue.
	- The finalization subsystem returns from
	  f_handle_finalization_queue() after completing the iteration over
	  the live finalization queue.

	- After the GC has called f_handle_finalization_queue() and the
	  call has returned, the GC calls f_enumerate_roots() to get the
	  root set for f-reachable objects, including the live and dead
	  finalization queues.
	- The finalization subsystem responds by calling
	  gc_add_root_set_entry() once for each root.
	  If the finalization queue is implemented as described above
	  (class FinalizationQueue), the root set will consist of just one
	  pointer to the head of the queue.
	- The finalization subsystem returns from f_enumerate_roots() once
	  it completes the f-roots enumeration.
	- The GC traces the heap from the newly obtained f-roots to make
	  sure that no f-reachable objects are reclaimed.
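
The exchange described in this section can be modelled in plain Java. This is only a toy sketch: the class and method names are made up to mirror f_object_allocated(), f_handle_finalization_queue() and f_enumerate_roots(), and the two GC callbacks are passed in as a Predicate and a Consumer, which is an assumption of the sketch. Unlike the FinalizationQueue design, where the root set is a single pointer to the head of the queue, the sketch reports every queued object as a root, because it uses ordinary collections instead of heap-resident chunks.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Toy model of the finalization subsystem's side of the GC interface.
// All names are hypothetical Java counterparts of the f_* functions.
class FinalizationSubsystem {
    private final List<Object> liveQueue = new ArrayList<>();
    private final ArrayDeque<Object> deadQueue = new ArrayDeque<>();

    // Counterpart of f_object_allocated(): register the object and hand it back.
    Object objectAllocated(Object obj) {
        liveQueue.add(obj);
        return obj;
    }

    // Counterpart of f_handle_finalization_queue(): called by the GC after it
    // has marked all strongly reachable objects. The predicate stands in for
    // the gc_is_object_live() callback.
    void handleFinalizationQueue(Predicate<Object> gcIsObjectLive) {
        Iterator<Object> it = liveQueue.iterator();
        while (it.hasNext()) {
            Object obj = it.next();
            if (!gcIsObjectLive.test(obj)) {
                it.remove();        // provably unreachable:
                deadQueue.add(obj); // move from the live to the dead queue
            }
        }
    }

    // Counterpart of f_enumerate_roots(): the consumer stands in for the
    // gc_add_root_set_entry() callback.
    void enumerateRoots(Consumer<Object> gcAddRootSetEntry) {
        liveQueue.forEach(gcAddRootSetEntry);
        deadQueue.forEach(gcAddRootSetEntry);
    }

    int liveQueueSize() { return liveQueue.size(); }
    int deadQueueSize() { return deadQueue.size(); }
}
```

A caller playing the role of the GC would first invoke handleFinalizationQueue() with its mark information and then enumerateRoots(), in that order, so that objects just moved to the dead queue are still reported as roots.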

2. Finalizer thread interface

According to the JVM specification, in particular the requirement that
the thread running a finalizer hold no user-visible locks, finalization
itself must not occur on user threads.

      (* Note: I assume a 1:1 mapping of java threads
         to OS threads and a straightforward implementation
         of java monitors as mutexes/condition variables
         or other OS synchronization primitives *)

Thus, one or more dedicated finalization threads are used.

The finalization thread is created either during the VM initialization
or when the finalization queue size reaches some threshold.

The finalization thread runs an infinite work loop, going to sleep
if no work is available. The following interface can be used for
interaction of the finalizer thread with the finalization queue:

        ManagedObject* f_pop_dead_object();
        unsigned f_dead_queue_size();
        unsigned f_live_queue_size();
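
The work loop of such a thread might look roughly like the sketch below. It is written under several assumptions: a BlockingQueue stands in for the dead finalization queue (its take() combines f_pop_dead_object() with going to sleep when the queue is empty), and the Finalizable interface is a hypothetical stand-in for calling finalize(), since ordinary Java code cannot invoke the protected Object.finalize() on arbitrary objects.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for an object that has a finalize() method.
interface Finalizable {
    void runFinalizer();
}

// Sketch of a dedicated finalizer thread with an infinite work loop.
class FinalizerThread extends Thread {
    // Plays the role of the dead finalization queue in this sketch.
    private final BlockingQueue<Finalizable> deadQueue = new LinkedBlockingQueue<>();

    FinalizerThread() {
        super("finalizer");
        setDaemon(true); // do not keep the VM alive just for finalization
    }

    // Called when an object has been promoted to the dead queue.
    void enqueueDead(Finalizable obj) {
        deadQueue.add(obj);
    }

    int deadQueueSize() {
        return deadQueue.size();
    }

    @Override
    public void run() {
        try {
            while (true) {
                // Blocks (sleeps) while there is no work available.
                Finalizable obj = deadQueue.take();
                try {
                    obj.runFinalizer();
                } catch (Throwable t) {
                    // An exception thrown by a finalizer is ignored;
                    // finalization of that object simply ends.
                }
                // The queue no longer references obj at this point.
            }
        } catch (InterruptedException e) {
            // Interrupted: leave the work loop and let the thread exit.
        }
    }
}
```

Because finalizers run only on this thread, they never run while a user thread holds a lock on their behalf, which is the property the 1:1 threading note above is concerned with.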

This interface may as well be implemented in java, though the
interaction with garbage collection will involve some tricks, as a
garbage collection may start at the very moment when the finalization
thread is in the middle of getting an object from the queue.

A possible solution is to use atomic operations from
java.util.concurrent.atomic on an instance variable to lock individual
chunks of the finalization queue, and to modify the linked list of
chunks.
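
As a rough illustration of that idea, the sketch below guards each chunk with an AtomicBoolean used as a try-lock and updates the head of the chunk list with compareAndSet on an AtomicReference. The class names LockedChunk and ChunkList and all method names are hypothetical, the chunk is simplified to hold dead objects only, and the question of how a garbage collection should treat a chunk that is locked at the moment it starts is left open.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

// Sketch: a chunk whose contents are guarded by a CAS-based try-lock.
class LockedChunk {
    private final AtomicBoolean locked = new AtomicBoolean(false);
    final Object[] array;
    int dead;          // array[0..dead-1] are dead objects; guarded by the lock
    LockedChunk next;  // written before the chunk is published via CAS on the head

    LockedChunk(int capacity) {
        array = new Object[capacity];
    }

    boolean tryLock() {
        return locked.compareAndSet(false, true);
    }

    void unlock() {
        locked.set(false);
    }

    // Adds a dead object; returns false if the chunk is full or currently locked.
    boolean tryAddDead(Object obj) {
        if (!tryLock()) {
            return false;
        }
        try {
            if (dead == array.length) {
                return false;
            }
            array[dead++] = obj;
            return true;
        } finally {
            unlock();
        }
    }
}

// Sketch: a linked list of chunks whose head is modified with compare-and-set.
class ChunkList {
    private final AtomicReference<LockedChunk> head = new AtomicReference<>();

    // Lock-free insertion of a new chunk at the head of the list.
    void push(LockedChunk chunk) {
        LockedChunk oldHead;
        do {
            oldHead = head.get();
            chunk.next = oldHead;
        } while (!head.compareAndSet(oldHead, chunk));
    }

    // Pops a dead object from the first chunk that can be locked and is
    // non-empty; returns null if no such chunk exists right now.
    Object popDead() {
        for (LockedChunk c = head.get(); c != null; c = c.next) {
            if (!c.tryLock()) {
                continue; // another thread is working on this chunk, skip it
            }
            try {
                if (c.dead > 0) {
                    Object obj = c.array[--c.dead];
                    c.array[c.dead] = null; // drop the queue's reference
                    return obj;
                }
            } finally {
                c.unlock();
            }
        }
        return null;
    }
}
```

Skipping a locked chunk instead of waiting for it keeps the finalizer thread from blocking, at the cost of possibly returning null while work still exists in a chunk that is busy.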

--
Salikh Zakirov, Intel Middleware Products Division