You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@harmony.apache.org by Alexei Fedotov <al...@gmail.com> on 2008/04/14 22:23:10 UTC

Re: [harmony-gc-5] updates on Parrot GC [was Re: Comparison of Harmony GC_Gen vs. Parrot]

Hello Senaka,
That's a good progress in understanding how Parrot is built.

Generally, I would suggest keeping build systems of Parrot and GC as
is, with minimal changes. For example, you may try the following:

1. Build Parrot as is using a Parrot build system..
2. Build GC DLL using DRLVM build system.
3. Copy GC DLL to Parrot.

You may want to ask few questions:
1. How Parrot would know about this new DLL? You will need to change
Parrot command line parsing to understand a new option.
2. How it would know which functions to call to collect garbage?
<answer mentions header files>
3. How it would be possible to maintain all these changes in order?
<answer was in the correspondence>
4. Etc...

Thanks.


On Tue, Apr 15, 2008 at 12:09 AM, Senaka Fernando <se...@gmail.com> wrote:
> Hi all,
>
>  I have finally made it possible to build Parrot in a C++ environment. And,
>  have managed to uncover most differences between Parrot's C and C++ build
>  streams. I have also contacted Mark (from Parrot) regarding being my
>  co-mentor and he's interested. I also do get a great deal of support from
>  the Parrot community. The Parrot developer meeting will be held tomorrow at
>  18.30 GMT. I hope that there would be a discussion on the GC.
>
>  Regards,
>  Senaka
>
>  On Sun, Apr 13, 2008 at 5:41 PM, Senaka Fernando <se...@gmail.com>
>
>
> wrote:
>
>  > Hi Xiao-Feng,
>  >
>  > Here is a detailed answer to your questions. I have copy-pasted some info
>  > from Parrot documentation on GC.
>  >
>  > Objects on the heap are laid out as PMCs (PObjects), and buffers.
>  > Allocation is pool based. The Arenas structure holds function pointers for
>  > the core defined interface of the currently active GC subsystem:
>  > "init_pool", "do_gc_mark",        "finalize_gc_system". It holds various
>  > accounting information for the GC subsystem, including how many GC runs have
>  > been completed, amount of memory allocated since the last run, and total
>  > memory allocated. This accounting information is updated by the GC system.
>  > The current block level for GC mark and sweep phases is stored in the Arenas
>  > structure.
>  >
>  > The Memory_Pool structure is a simple memory pool. It contains a pointer
>  > to the top block of the allocated pool, the total allocated size of the
>  > pool, the block size, and some details on the reclamation characteristics of
>  > the pool.
>  >
>  > The Small_Object_Pool structure is a richer memory pool for object
>  > allocation.  It tracks details like the number of allocated and free objects
>  > in the pool, a list of free objects, and for the generational GC
>  > implementation maintains linked lists of white, black, and gray PMCs. It
>  > contains a pointer to a simple Memory_Pool (the base storage of the pool).
>  > It holds function pointers for adding and retrieving free objects in the
>  > pool, and for allocating objects.
>  >
>  > Each PMC/Buffer will contain a set of flags that will govern the behavior
>  > and state of it in the presence of the GC.
>  > Ex:- PObj_active_destroy_FLAG, PObj_data_is_PMC_array_FLAG, PObj_live_FLAG
>  > etc.
>  >
>  > Each PObject has a header which is of type UnionVal, a union of various
>  > fields, in addition to flags. A PMC has a Vtable in it. Thus, each allocated
>  > object will have header info within it.
>  >
>  > Each GC core defines 4 function pointers stored in the Small_Object_Pool
>  > structures. One to allocate new objects, another to add a freed object to
>  > the free list, another to retrieve a free object from the free list and one
>  > to reallocate for additional objects. If a Small_Object_Pool is full (no
>  > free objects) a new one will needed to be created. Thus, each object on a
>  > small object pool is like a place holder for a new instance.
>  >
>  > Heap is laid out as arenas, having two memory pools and six small object
>  > pools.
>  >
>  > There are two marking phases, for PMCs
>  >
>  > 1. Initial Marking
>  > Each PMC has a "flags" member which, among other things, facilitates
>  > garbage collection. At the beginning of the mark phase, the
>  > "PObj_is_live_FLAG" and "PObj_is_fully_marked_FLAG" are both unset, which
>  > flags the PMC as presumed dead (white). The initial mark phase of the
>  > collection cycle goes through each PMC in the root set and sets the
>  > Obj_is_live_FLAG" bit in the "flags" member (the PMC is gray).  It does not
>  > set the "PObj_is_fully_marked_FLAG" bit (changing the PMC to black), because
>  > in the initial mark, the PMCs or buffers contained by a PMC are not marked.
>  > It also appends the PMC to the end of a list used for further marking.
>  > However, if the PMC has already been marked as black, the current end of
>  > list is returned (instead of appending the already processed PMC) to prevent
>  > endless looping.
>  >
>  > 2. Incremental Marking
>  > After the root set of PMCs have been marked, a series of incremental mark
>  > runs are performed. These may be performed frequently, between other
>  > operations.  The incremental mark runs work to move gray PMCs to black. They
>  > take a PMC from the list for further marking, mark any PMCs or buffers it
>  > contains as gray (the "PObj_is_live_FLAG" is set and the
>  > "PObj_is_fully_marked_FLAG" is left unset), and add the contained PMCs or
>  > buffers to the list for further marking.  If the PMC has a custom mark
>  > function in its vtable, it is called at this point.
>  >
>  > For Buffers, no incremental marking is involved.
>  > The initial marking phase also marks the root set of buffers. Because
>  > buffers cannot contain other buffers, they are immediately marked as black
>  > and not added to the list for further marking. Because PMCs may contain
>  > buffers, the buffer collection phase can't run until the incremental marking
>  > of PMCs is completed.
>  >
>  > When the list for further marking is empty (all gray PMCs have changed to
>  > black), the collection stage is started. First, PMCs are collected, followed
>  > by buffers. In both cases (PMC and buffer), the "live" and "fully_marked"
>  > flags are reset after examination for reclamation.
>  >
>  > To collect PMCs, each PMC arena is examined from the most recently created
>  > backwards.  Each PMC is examined to see if it is live, already on the free
>  > list, or constant.  If it is not, then it is added to the free list and
>  > marked as being on the free list with the "PObj_on_free_list_FLAG".
>  >
>  > To collect buffers, each Buffer arena is examined from the most recently
>  > created backwards.  If the buffer is not live, not already on the free list
>  > and it is not a constant or copy on write, then it is added to the free pool
>  > for reuse and marked with the "PObj_on_free_list_FLAG".
>  >
>  > Thus, the objects are scanned during the mark phase and then identified as
>  > live, and collected. The collection process is triggered after the marking
>  > is complete.
>  >
>  > Allocation of objects is handled by pool structures.
>  >
>  > I can relate the necessary source code portions to this discussion if
>  > required. However, some of the above mentioned features are not fully
>  > implemented. But, according to the Parrot community, this will be their
>  > future direction.
>  >
>  > I also have fixed most build errors with Parrot & C++ and they are really
>  > happy about it. Now am in the process of resolving some linking conflicts on
>  > Parrot's C++ build.
>  >
>  > Regards,
>  > Senaka
>  >
>  > On Mon, Apr 7, 2008 at 10:25 AM, Senaka Fernando <se...@gmail.com>
>  > wrote:
>  >
>  > > Hi Xiao-Feng,
>  > >
>  > > Thanks for these questions. I believe that they'd be really helpful in
>  > > understanding VM <-> GC assumptions on the Parrot end.
>  > >
>  > > Will work on answering these, and perhaps a comparison with Harmony.
>  > >
>  > > Regards,
>  > > Senaka
>  > >
>  > >
>  > > On Mon, Apr 7, 2008 at 5:11 AM, Xiao-Feng Li <xi...@gmail.com>
>  > > wrote:
>  > >
>  > > > Senaka, thanks for the page. I think the most important things are
>  > > > related to the VM <-> GC protocol. Some questions that may help you:
>  > > > 1. How Parrot layout/encode an object/array? fields, size, object
>  > > > header info, etc.
>  > >
>  > >
>  > > > 2. How Parrot layout/arrange the heap? free list? pages?
>  > > > 3. What's the process of an object creation? When and how?
>  > > > 4. How is a collection process triggered?
>  > > > 5. How does Parrot GC trace live objects and collect them?
>  > > >
>  > > > Some of the questions above might be GC internals, so it's more
>  > > > desirable if you can understand the Parrot VM's assumptions on GC.
>  > > > I.e., does it assume the heap is laid out in certain way, does it
>  > > > assume the objects are encoded in certain way, does it assume the
>  > > > roots are enumerated in certain way, etc.? Depending on your progress,
>  > > > more details might be needed later on.
>  > > >
>  > > > For this project, you might have to understand the Parrot current
>  > > > status for the above questions. It helps you and us to identify the
>  > > > key issues to resolve, and the main efforts to be focused on. For GC
>  > > > porting over different VMs, it's not like porting an application over
>  > > > different OSes, because of the implicit assumptions between VM and GC.
>  > > > I would expect some redesign work required for GC porting, hence you
>  > > > have to understand the Parrot design in certain depth.
>  > > >
>  > > > Thanks,
>  > > > xiaofeng
>  > > >
>  > > > On Mon, Apr 7, 2008 at 4:26 AM, Alexei Fedotov <
>  > > > alexei.fedotov@gmail.com> wrote:
>  > > > > Good job, Senaka.
>  > > > >
>  > > > >  The general perception was that internal and external GC interfaces
>  > > > >  were mixed, which maked this document less usable for harmony-gc-5
>  > > > >  project than it could be. For example, sweeping, marking and
>  > > > >  reclaiming are internal interfaces while allocation, stack
>  > > > enumeration
>  > > > >  (please take a look at vm_enumerate_root_set_all_threads) and gc
>  > > > >  invocation are external interfaces. We should pay more attention to
>  > > > >  external interfaces for harmony-gc-5 project.
>  > > > >
>  > > > >  Thanks.
>  > > > >
>  > > > >
>  > > > >
>  > > > >
>  > > > >
>  > > > >
>  > > > >
>  > > > >  On Sun, Apr 6, 2008 at 8:35 PM, Senaka Fernando <
>  > > > senakafdo@gmail.com> wrote:
>  > > > >  >
>  > > > >  > Hi all,
>  > > > >  >
>  > > > >  >  I have almost finished comparing the two interfaces of Harmony
>  > > > and Parrot.
>  > > > >  >  However, I'm not 100% sure on whether I got everything right,
>  > > > but I believe
>  > > > >  >  that most of it is. Therefore, it would be really great if you
>  > > > could review
>  > > > >  >  the wiki page and let me know whether it is correct and precise.
>  > > > >  >
>  > > > >  >  Sections marked as TBD are not yet finalized. And, I have
>  > > > omitted some
>  > > > >  >  instances of where Harmony supports something and Parrot doesn't
>  > > > for
>  > > > >  >  simplicity.
>  > > > >  >
>  > > > >  >  The wiki page is found at [1]
>  > > > >  >
>  > > > >  >  [1]
>  > > > http://wiki.apache.org/harmony/gc_comparison/gc_gen_harmony_vs_parrot
>  > > > >  >
>  > > > >  >  Regards,
>  > > > >  >  Senaka
>  > > > >  >
>  > > > >
>  > > > >
>  > > > >
>  > > > >  --
>  > > > >  With best regards,
>  > > > >  Alexei
>  > > > >
>  > > >
>  > > >
>  > > >
>  > > > --
>  > > > http://xiao-feng.blogspot.com
>  > > >
>  > >
>  > >
>  >
>



-- 
With best regards,
Alexei

Re: [harmony-gc-5] updates on Parrot GC [was Re: Comparison of Harmony GC_Gen vs. Parrot]

Posted by Senaka Fernando <se...@gmail.com>.
Hi Alexei,

I have sent the patches, bug fixes through RT (their issue tracking system)
and most of them have been committed. Well yes, you first send a mail to
open a ticket and then you can correspond through the web interface.

Regards,
Senaka

On Tue, Apr 15, 2008 at 2:12 AM, Alexei Fedotov <al...@gmail.com>
wrote:

> Great. Thanks for explaining. Don't forget to submit the patch (it is
> tricky for Parrot: you have to attach it to the mail with a specific
> subject).
>
> On Tue, Apr 15, 2008 at 12:31 AM, Senaka Fernando <se...@gmail.com>
> wrote:
> > Hi Alexei,
> >
> >  Actually it was the Parrot folks who asked me whether I could get the
> C++
> >  build fixed, as they use some C++ compilers too. I have started on
> question
> >  1. Which involves creating somewhat like a plug-in which will probably
> be
> >  used by Parrot to connect any other GC implementation in the future.
> Because
> >  even they do say that they support 3rd party GCs it is not a part of
> their
> >  system as yet.
> >
> >  Regards,
> >  Senaka
> >
> >  On Tue, Apr 15, 2008 at 1:53 AM, Alexei Fedotov <
> alexei.fedotov@gmail.com>
> >
> >
> > wrote:
> >
> >  > Hello Senaka,
> >  > That's a good progress in understanding how Parrot is built.
> >  >
> >  > Generally, I would suggest keeping build systems of Parrot and GC as
> >  > is, with minimal changes. For example, you may try the following:
> >  >
> >  > 1. Build Parrot as is using a Parrot build system..
> >  > 2. Build GC DLL using DRLVM build system.
> >  > 3. Copy GC DLL to Parrot.
> >  >
> >  > You may want to ask few questions:
> >  > 1. How Parrot would know about this new DLL? You will need to change
> >  > Parrot command line parsing to understand a new option.
> >  > 2. How it would know which functions to call to collect garbage?
> >  > <answer mentions header files>
> >  > 3. How it would be possible to maintain all these changes in order?
> >  > <answer was in the correspondence>
> >  > 4. Etc...
> >  >
> >  > Thanks.
> >  >
> >  >
> >  > On Tue, Apr 15, 2008 at 12:09 AM, Senaka Fernando <
> senakafdo@gmail.com>
> >  > wrote:
> >  > > Hi all,
> >  > >
> >  > >  I have finally made it possible to build Parrot in a C++
> environment.
> >  > And,
> >  > >  have managed to uncover most differences between Parrot's C and
> C++
> >  > build
> >  > >  streams. I have also contacted Mark (from Parrot) regarding being
> my
> >  > >  co-mentor and he's interested. I also do get a great deal of
> support
> >  > from
> >  > >  the Parrot community. The Parrot developer meeting will be held
> >  > tomorrow at
> >  > >  18.30 GMT. I hope that there would be a discussion on the GC.
> >  > >
> >  > >  Regards,
> >  > >  Senaka
> >  > >
> >  > >  On Sun, Apr 13, 2008 at 5:41 PM, Senaka Fernando <
> senakafdo@gmail.com>
> >  > >
> >  > >
> >  > > wrote:
> >  > >
> >  > >  > Hi Xiao-Feng,
> >  > >  >
> >  > >  > Here is a detailed answer to your questions. I have copy-pasted
> some
> >  > info
> >  > >  > from Parrot documentation on GC.
> >  > >  >
> >  > >  > Objects on the heap are laid out as PMCs (PObjects), and
> buffers.
> >  > >  > Allocation is pool based. The Arenas structure holds function
> >  > pointers for
> >  > >  > the core defined interface of the currently active GC subsystem:
> >  > >  > "init_pool", "do_gc_mark",        "finalize_gc_system". It holds
> >  > various
> >  > >  > accounting information for the GC subsystem, including how many
> GC
> >  > runs have
> >  > >  > been completed, amount of memory allocated since the last run,
> and
> >  > total
> >  > >  > memory allocated. This accounting information is updated by the
> GC
> >  > system.
> >  > >  > The current block level for GC mark and sweep phases is stored
> in the
> >  > Arenas
> >  > >  > structure.
> >  > >  >
> >  > >  > The Memory_Pool structure is a simple memory pool. It contains a
> >  > pointer
> >  > >  > to the top block of the allocated pool, the total allocated size
> of
> >  > the
> >  > >  > pool, the block size, and some details on the reclamation
> >  > characteristics of
> >  > >  > the pool.
> >  > >  >
> >  > >  > The Small_Object_Pool structure is a richer memory pool for
> object
> >  > >  > allocation.  It tracks details like the number of allocated and
> free
> >  > objects
> >  > >  > in the pool, a list of free objects, and for the generational GC
> >  > >  > implementation maintains linked lists of white, black, and gray
> PMCs.
> >  > It
> >  > >  > contains a pointer to a simple Memory_Pool (the base storage of
> the
> >  > pool).
> >  > >  > It holds function pointers for adding and retrieving free
> objects in
> >  > the
> >  > >  > pool, and for allocating objects.
> >  > >  >
> >  > >  > Each PMC/Buffer will contain a set of flags that will govern the
> >  > behavior
> >  > >  > and state of it in the presence of the GC.
> >  > >  > Ex:- PObj_active_destroy_FLAG, PObj_data_is_PMC_array_FLAG,
> >  > PObj_live_FLAG
> >  > >  > etc.
> >  > >  >
> >  > >  > Each PObject has a header which is of type UnionVal, a union of
> >  > various
> >  > >  > fields, in addition to flags. A PMC has a Vtable in it. Thus,
> each
> >  > allocated
> >  > >  > object will have header info within it.
> >  > >  >
> >  > >  > Each GC core defines 4 function pointers stored in the
> >  > Small_Object_Pool
> >  > >  > structures. One to allocate new objects, another to add a freed
> >  > object to
> >  > >  > the free list, another to retrieve a free object from the free
> list
> >  > and one
> >  > >  > to reallocate for additional objects. If a Small_Object_Pool is
> full
> >  > (no
> >  > >  > free objects) a new one will needed to be created. Thus, each
> object
> >  > on a
> >  > >  > small object pool is like a place holder for a new instance.
> >  > >  >
> >  > >  > Heap is laid out as arenas, having two memory pools and six
> small
> >  > object
> >  > >  > pools.
> >  > >  >
> >  > >  > There are two marking phases, for PMCs
> >  > >  >
> >  > >  > 1. Initial Marking
> >  > >  > Each PMC has a "flags" member which, among other things,
> facilitates
> >  > >  > garbage collection. At the beginning of the mark phase, the
> >  > >  > "PObj_is_live_FLAG" and "PObj_is_fully_marked_FLAG" are both
> unset,
> >  > which
> >  > >  > flags the PMC as presumed dead (white). The initial mark phase
> of the
> >  > >  > collection cycle goes through each PMC in the root set and sets
> the
> >  > >  > Obj_is_live_FLAG" bit in the "flags" member (the PMC is gray).
>  It
> >  > does not
> >  > >  > set the "PObj_is_fully_marked_FLAG" bit (changing the PMC to
> black),
> >  > because
> >  > >  > in the initial mark, the PMCs or buffers contained by a PMC are
> not
> >  > marked.
> >  > >  > It also appends the PMC to the end of a list used for further
> >  > marking.
> >  > >  > However, if the PMC has already been marked as black, the
> current end
> >  > of
> >  > >  > list is returned (instead of appending the already processed
> PMC) to
> >  > prevent
> >  > >  > endless looping.
> >  > >  >
> >  > >  > 2. Incremental Marking
> >  > >  > After the root set of PMCs have been marked, a series of
> incremental
> >  > mark
> >  > >  > runs are performed. These may be performed frequently, between
> other
> >  > >  > operations.  The incremental mark runs work to move gray PMCs to
> >  > black. They
> >  > >  > take a PMC from the list for further marking, mark any PMCs or
> >  > buffers it
> >  > >  > contains as gray (the "PObj_is_live_FLAG" is set and the
> >  > >  > "PObj_is_fully_marked_FLAG" is left unset), and add the
> contained
> >  > PMCs or
> >  > >  > buffers to the list for further marking.  If the PMC has a
> custom
> >  > mark
> >  > >  > function in its vtable, it is called at this point.
> >  > >  >
> >  > >  > For Buffers, no incremental marking is involved.
> >  > >  > The initial marking phase also marks the root set of buffers.
> Because
> >  > >  > buffers cannot contain other buffers, they are immediately
> marked as
> >  > black
> >  > >  > and not added to the list for further marking. Because PMCs may
> >  > contain
> >  > >  > buffers, the buffer collection phase can't run until the
> incremental
> >  > marking
> >  > >  > of PMCs is completed.
> >  > >  >
> >  > >  > When the list for further marking is empty (all gray PMCs have
> >  > changed to
> >  > >  > black), the collection stage is started. First, PMCs are
> collected,
> >  > followed
> >  > >  > by buffers. In both cases (PMC and buffer), the "live" and
> >  > "fully_marked"
> >  > >  > flags are reset after examination for reclamation.
> >  > >  >
> >  > >  > To collect PMCs, each PMC arena is examined from the most
> recently
> >  > created
> >  > >  > backwards.  Each PMC is examined to see if it is live, already
> on the
> >  > free
> >  > >  > list, or constant.  If it is not, then it is added to the free
> list
> >  > and
> >  > >  > marked as being on the free list with the
> "PObj_on_free_list_FLAG".
> >  > >  >
> >  > >  > To collect buffers, each Buffer arena is examined from the most
> >  > recently
> >  > >  > created backwards.  If the buffer is not live, not already on
> the
> >  > free list
> >  > >  > and it is not a constant or copy on write, then it is added to
> the
> >  > free pool
> >  > >  > for reuse and marked with the "PObj_on_free_list_FLAG".
> >  > >  >
> >  > >  > Thus, the objects are scanned during the mark phase and then
> >  > identified as
> >  > >  > live, and collected. The collection process is triggered after
> the
> >  > marking
> >  > >  > is complete.
> >  > >  >
> >  > >  > Allocation of objects is handled by pool structures.
> >  > >  >
> >  > >  > I can relate the necessary source code portions to this
> discussion if
> >  > >  > required. However, some of the above mentioned features are not
> fully
> >  > >  > implemented. But, according to the Parrot community, this will
> be
> >  > their
> >  > >  > future direction.
> >  > >  >
> >  > >  > I also have fixed most build errors with Parrot & C++ and they
> are
> >  > really
> >  > >  > happy about it. Now am in the process of resolving some linking
> >  > conflicts on
> >  > >  > Parrot's C++ build.
> >  > >  >
> >  > >  > Regards,
> >  > >  > Senaka
> >  > >  >
> >  > >  > On Mon, Apr 7, 2008 at 10:25 AM, Senaka Fernando <
> senakafdo@gmail.com
> >  > >
> >  > >  > wrote:
> >  > >  >
> >  > >  > > Hi Xiao-Feng,
> >  > >  > >
> >  > >  > > Thanks for these questions. I believe that they'd be really
> helpful
> >  > in
> >  > >  > > understanding VM <-> GC assumptions on the Parrot end.
> >  > >  > >
> >  > >  > > Will work on answering these, and perhaps a comparison with
> >  > Harmony.
> >  > >  > >
> >  > >  > > Regards,
> >  > >  > > Senaka
> >  > >  > >
> >  > >  > >
> >  > >  > > On Mon, Apr 7, 2008 at 5:11 AM, Xiao-Feng Li <
> xiaofeng.li@gmail.com
> >  > >
> >  > >  > > wrote:
> >  > >  > >
> >  > >  > > > Senaka, thanks for the page. I think the most important
> things
> >  > are
> >  > >  > > > related to the VM <-> GC protocol. Some questions that may
> help
> >  > you:
> >  > >  > > > 1. How Parrot layout/encode an object/array? fields, size,
> object
> >  > >  > > > header info, etc.
> >  > >  > >
> >  > >  > >
> >  > >  > > > 2. How Parrot layout/arrange the heap? free list? pages?
> >  > >  > > > 3. What's the process of an object creation? When and how?
> >  > >  > > > 4. How is a collection process triggered?
> >  > >  > > > 5. How does Parrot GC trace live objects and collect them?
> >  > >  > > >
> >  > >  > > > Some of the questions above might be GC internals, so it's
> more
> >  > >  > > > desirable if you can understand the Parrot VM's assumptions
> on
> >  > GC.
> >  > >  > > > I.e., does it assume the heap is laid out in certain way,
> does it
> >  > >  > > > assume the objects are encoded in certain way, does it
> assume the
> >  > >  > > > roots are enumerated in certain way, etc.? Depending on your
> >  > progress,
> >  > >  > > > more details might be needed later on.
> >  > >  > > >
> >  > >  > > > For this project, you might have to understand the Parrot
> current
> >  > >  > > > status for the above questions. It helps you and us to
> identify
> >  > the
> >  > >  > > > key issues to resolve, and the main efforts to be focused
> on. For
> >  > GC
> >  > >  > > > porting over different VMs, it's not like porting an
> application
> >  > over
> >  > >  > > > different OSes, because of the implicit assumptions between
> VM
> >  > and GC.
> >  > >  > > > I would expect some redesign work required for GC porting,
> hence
> >  > you
> >  > >  > > > have to understand the Parrot design in certain depth.
> >  > >  > > >
> >  > >  > > > Thanks,
> >  > >  > > > xiaofeng
> >  > >  > > >
> >  > >  > > > On Mon, Apr 7, 2008 at 4:26 AM, Alexei Fedotov <
> >  > >  > > > alexei.fedotov@gmail.com> wrote:
> >  > >  > > > > Good job, Senaka.
> >  > >  > > > >
> >  > >  > > > >  The general perception was that internal and external GC
> >  > interfaces
> >  > >  > > > >  were mixed, which maked this document less usable for
> >  > harmony-gc-5
> >  > >  > > > >  project than it could be. For example, sweeping, marking
> and
> >  > >  > > > >  reclaiming are internal interfaces while allocation,
> stack
> >  > >  > > > enumeration
> >  > >  > > > >  (please take a look at vm_enumerate_root_set_all_threads)
> and
> >  > gc
> >  > >  > > > >  invocation are external interfaces. We should pay more
> >  > attention to
> >  > >  > > > >  external interfaces for harmony-gc-5 project.
> >  > >  > > > >
> >  > >  > > > >  Thanks.
> >  > >  > > > >
> >  > >  > > > >
> >  > >  > > > >
> >  > >  > > > >
> >  > >  > > > >
> >  > >  > > > >
> >  > >  > > > >
> >  > >  > > > >  On Sun, Apr 6, 2008 at 8:35 PM, Senaka Fernando <
> >  > >  > > > senakafdo@gmail.com> wrote:
> >  > >  > > > >  >
> >  > >  > > > >  > Hi all,
> >  > >  > > > >  >
> >  > >  > > > >  >  I have almost finished comparing the two interfaces of
> >  > Harmony
> >  > >  > > > and Parrot.
> >  > >  > > > >  >  However, I'm not 100% sure on whether I got everything
> >  > right,
> >  > >  > > > but I believe
> >  > >  > > > >  >  that most of it is. Therefore, it would be really
> great if
> >  > you
> >  > >  > > > could review
> >  > >  > > > >  >  the wiki page and let me know whether it is correct
> and
> >  > precise.
> >  > >  > > > >  >
> >  > >  > > > >  >  Sections marked as TBD are not yet finalized. And, I
> have
> >  > >  > > > omitted some
> >  > >  > > > >  >  instances of where Harmony supports something and
> Parrot
> >  > doesn't
> >  > >  > > > for
> >  > >  > > > >  >  simplicity.
> >  > >  > > > >  >
> >  > >  > > > >  >  The wiki page is found at [1]
> >  > >  > > > >  >
> >  > >  > > > >  >  [1]
> >  > >  > > >
> >  > http://wiki.apache.org/harmony/gc_comparison/gc_gen_harmony_vs_parrot
> >  > >  > > > >  >
> >  > >  > > > >  >  Regards,
> >  > >  > > > >  >  Senaka
> >  > >  > > > >  >
> >  > >  > > > >
> >  > >  > > > >
> >  > >  > > > >
> >  > >  > > > >  --
> >  > >  > > > >  With best regards,
> >  > >  > > > >  Alexei
> >  > >  > > > >
> >  > >  > > >
> >  > >  > > >
> >  > >  > > >
> >  > >  > > > --
> >  > >  > > > http://xiao-feng.blogspot.com
> >  > >  > > >
> >  > >  > >
> >  > >  > >
> >  > >  >
> >  > >
> >  >
> >  >
> >  >
> >  > --
> >  > With best regards,
> >  > Alexei
> >  >
> >
>
>
>
> --
> With best regards,
> Alexei
>

Re: [harmony-gc-5] updates on Parrot GC [was Re: Comparison of Harmony GC_Gen vs. Parrot]

Posted by Alexei Fedotov <al...@gmail.com>.
Great. Thanks for explaining. Don't forget to submit the patch (it is
tricky for Parrot: you have to attach it to the mail with a specific
subject).

On Tue, Apr 15, 2008 at 12:31 AM, Senaka Fernando <se...@gmail.com> wrote:
> Hi Alexei,
>
>  Actually it was the Parrot folks who asked me whether I could get the C++
>  build fixed, as they use some C++ compilers too. I have started on question
>  1. Which involves creating somewhat like a plug-in which will probably be
>  used by Parrot to connect any other GC implementation in the future. Because
>  even they do say that they support 3rd party GCs it is not a part of their
>  system as yet.
>
>  Regards,
>  Senaka
>
>  On Tue, Apr 15, 2008 at 1:53 AM, Alexei Fedotov <al...@gmail.com>
>
>
> wrote:
>
>  > Hello Senaka,
>  > That's a good progress in understanding how Parrot is built.
>  >
>  > Generally, I would suggest keeping build systems of Parrot and GC as
>  > is, with minimal changes. For example, you may try the following:
>  >
>  > 1. Build Parrot as is using a Parrot build system..
>  > 2. Build GC DLL using DRLVM build system.
>  > 3. Copy GC DLL to Parrot.
>  >
>  > You may want to ask few questions:
>  > 1. How Parrot would know about this new DLL? You will need to change
>  > Parrot command line parsing to understand a new option.
>  > 2. How it would know which functions to call to collect garbage?
>  > <answer mentions header files>
>  > 3. How it would be possible to maintain all these changes in order?
>  > <answer was in the correspondence>
>  > 4. Etc...
>  >
>  > Thanks.
>  >
>  >
>  > On Tue, Apr 15, 2008 at 12:09 AM, Senaka Fernando <se...@gmail.com>
>  > wrote:
>  > > Hi all,
>  > >
>  > >  I have finally made it possible to build Parrot in a C++ environment.
>  > And,
>  > >  have managed to uncover most differences between Parrot's C and C++
>  > build
>  > >  streams. I have also contacted Mark (from Parrot) regarding being my
>  > >  co-mentor and he's interested. I also do get a great deal of support
>  > from
>  > >  the Parrot community. The Parrot developer meeting will be held
>  > tomorrow at
>  > >  18.30 GMT. I hope that there would be a discussion on the GC.
>  > >
>  > >  Regards,
>  > >  Senaka
>  > >
>  > >  On Sun, Apr 13, 2008 at 5:41 PM, Senaka Fernando <se...@gmail.com>
>  > >
>  > >
>  > > wrote:
>  > >
>  > >  > Hi Xiao-Feng,
>  > >  >
>  > >  > Here is a detailed answer to your questions. I have copy-pasted some
>  > info
>  > >  > from Parrot documentation on GC.
>  > >  >
>  > >  > Objects on the heap are laid out as PMCs (PObjects), and buffers.
>  > >  > Allocation is pool based. The Arenas structure holds function
>  > pointers for
>  > >  > the core defined interface of the currently active GC subsystem:
>  > >  > "init_pool", "do_gc_mark",        "finalize_gc_system". It holds
>  > various
>  > >  > accounting information for the GC subsystem, including how many GC
>  > runs have
>  > >  > been completed, amount of memory allocated since the last run, and
>  > total
>  > >  > memory allocated. This accounting information is updated by the GC
>  > system.
>  > >  > The current block level for GC mark and sweep phases is stored in the
>  > Arenas
>  > >  > structure.
>  > >  >
>  > >  > The Memory_Pool structure is a simple memory pool. It contains a
>  > pointer
>  > >  > to the top block of the allocated pool, the total allocated size of
>  > the
>  > >  > pool, the block size, and some details on the reclamation
>  > characteristics of
>  > >  > the pool.
>  > >  >
>  > >  > The Small_Object_Pool structure is a richer memory pool for object
>  > >  > allocation.  It tracks details like the number of allocated and free
>  > objects
>  > >  > in the pool, a list of free objects, and for the generational GC
>  > >  > implementation maintains linked lists of white, black, and gray PMCs.
>  > It
>  > >  > contains a pointer to a simple Memory_Pool (the base storage of the
>  > pool).
>  > >  > It holds function pointers for adding and retrieving free objects in
>  > the
>  > >  > pool, and for allocating objects.
>  > >  >
>  > >  > Each PMC/Buffer will contain a set of flags that will govern the
>  > behavior
>  > >  > and state of it in the presence of the GC.
>  > >  > Ex:- PObj_active_destroy_FLAG, PObj_data_is_PMC_array_FLAG,
>  > PObj_live_FLAG
>  > >  > etc.
>  > >  >
>  > >  > Each PObject has a header which is of type UnionVal, a union of
>  > various
>  > >  > fields, in addition to flags. A PMC has a Vtable in it. Thus, each
>  > allocated
>  > >  > object will have header info within it.
>  > >  >
>  > >  > Each GC core defines 4 function pointers stored in the
>  > Small_Object_Pool
>  > >  > structures. One to allocate new objects, another to add a freed
>  > object to
>  > >  > the free list, another to retrieve a free object from the free list
>  > and one
>  > >  > to reallocate for additional objects. If a Small_Object_Pool is full
>  > (no
>  > >  > free objects) a new one will needed to be created. Thus, each object
>  > on a
>  > >  > small object pool is like a place holder for a new instance.
>  > >  >
>  > >  > Heap is laid out as arenas, having two memory pools and six small
>  > object
>  > >  > pools.
>  > >  >
>  > >  > There are two marking phases, for PMCs
>  > >  >
>  > >  > 1. Initial Marking
>  > >  > Each PMC has a "flags" member which, among other things, facilitates
>  > >  > garbage collection. At the beginning of the mark phase, the
>  > >  > "PObj_is_live_FLAG" and "PObj_is_fully_marked_FLAG" are both unset,
>  > which
>  > >  > flags the PMC as presumed dead (white). The initial mark phase of the
>  > >  > collection cycle goes through each PMC in the root set and sets the
>  > >  > Obj_is_live_FLAG" bit in the "flags" member (the PMC is gray).  It
>  > does not
>  > >  > set the "PObj_is_fully_marked_FLAG" bit (changing the PMC to black),
>  > because
>  > >  > in the initial mark, the PMCs or buffers contained by a PMC are not
>  > marked.
>  > >  > It also appends the PMC to the end of a list used for further
>  > marking.
>  > >  > However, if the PMC has already been marked as black, the current end
>  > of
>  > >  > list is returned (instead of appending the already processed PMC) to
>  > prevent
>  > >  > endless looping.
>  > >  >
>  > >  > 2. Incremental Marking
>  > >  > After the root set of PMCs have been marked, a series of incremental
>  > mark
>  > >  > runs are performed. These may be performed frequently, between other
>  > >  > operations.  The incremental mark runs work to move gray PMCs to
>  > black. They
>  > >  > take a PMC from the list for further marking, mark any PMCs or
>  > buffers it
>  > >  > contains as gray (the "PObj_is_live_FLAG" is set and the
>  > >  > "PObj_is_fully_marked_FLAG" is left unset), and add the contained
>  > PMCs or
>  > >  > buffers to the list for further marking.  If the PMC has a custom
>  > mark
>  > >  > function in its vtable, it is called at this point.
>  > >  >
>  > >  > For Buffers, no incremental marking is involved.
>  > >  > The initial marking phase also marks the root set of buffers. Because
>  > >  > buffers cannot contain other buffers, they are immediately marked as
>  > black
>  > >  > and not added to the list for further marking. Because PMCs may
>  > contain
>  > >  > buffers, the buffer collection phase can't run until the incremental
>  > marking
>  > >  > of PMCs is completed.
>  > >  >
>  > >  > When the list for further marking is empty (all gray PMCs have
>  > changed to
>  > >  > black), the collection stage is started. First, PMCs are collected,
>  > followed
>  > >  > by buffers. In both cases (PMC and buffer), the "live" and
>  > "fully_marked"
>  > >  > flags are reset after examination for reclamation.
>  > >  >
>  > >  > To collect PMCs, each PMC arena is examined from the most recently
>  > created
>  > >  > backwards.  Each PMC is examined to see if it is live, already on the
>  > free
>  > >  > list, or constant.  If it is not, then it is added to the free list
>  > and
>  > >  > marked as being on the free list with the "PObj_on_free_list_FLAG".
>  > >  >
>  > >  > To collect buffers, each Buffer arena is examined from the most
>  > recently
>  > >  > created backwards.  If the buffer is not live, not already on the
>  > free list
>  > >  > and it is not a constant or copy on write, then it is added to the
>  > free pool
>  > >  > for reuse and marked with the "PObj_on_free_list_FLAG".
>  > >  >
>  > >  > Thus, the objects are scanned during the mark phase and then
>  > identified as
>  > >  > live, and collected. The collection process is triggered after the
>  > marking
>  > >  > is complete.
>  > >  >
>  > >  > Allocation of objects is handled by pool structures.
>  > >  >
>  > >  > I can relate the necessary source code portions to this discussion if
>  > >  > required. However, some of the above mentioned features are not fully
>  > >  > implemented. But, according to the Parrot community, this will be
>  > their
>  > >  > future direction.
>  > >  >
>  > >  > I also have fixed most build errors with Parrot & C++ and they are
>  > really
>  > >  > happy about it. Now am in the process of resolving some linking
>  > conflicts on
>  > >  > Parrot's C++ build.
>  > >  >
>  > >  > Regards,
>  > >  > Senaka
>  > >  >
>  > >  > On Mon, Apr 7, 2008 at 10:25 AM, Senaka Fernando <senakafdo@gmail.com
>  > >
>  > >  > wrote:
>  > >  >
>  > >  > > Hi Xiao-Feng,
>  > >  > >
>  > >  > > Thanks for these questions. I believe that they'd be really helpful
>  > in
>  > >  > > understanding VM <-> GC assumptions on the Parrot end.
>  > >  > >
>  > >  > > Will work on answering these, and perhaps a comparison with
>  > Harmony.
>  > >  > >
>  > >  > > Regards,
>  > >  > > Senaka
>  > >  > >
>  > >  > >
>  > >  > > On Mon, Apr 7, 2008 at 5:11 AM, Xiao-Feng Li <xiaofeng.li@gmail.com
>  > >
>  > >  > > wrote:
>  > >  > >
>  > >  > > > Senaka, thanks for the page. I think the most important things
>  > are
>  > >  > > > related to the VM <-> GC protocol. Some questions that may help
>  > you:
>  > >  > > > 1. How Parrot layout/encode an object/array? fields, size, object
>  > >  > > > header info, etc.
>  > >  > >
>  > >  > >
>  > >  > > > 2. How Parrot layout/arrange the heap? free list? pages?
>  > >  > > > 3. What's the process of an object creation? When and how?
>  > >  > > > 4. How is a collection process triggered?
>  > >  > > > 5. How does Parrot GC trace live objects and collect them?
>  > >  > > >
>  > >  > > > Some of the questions above might be GC internals, so it's more
>  > >  > > > desirable if you can understand the Parrot VM's assumptions on
>  > GC.
>  > >  > > > I.e., does it assume the heap is laid out in certain way, does it
>  > >  > > > assume the objects are encoded in certain way, does it assume the
>  > >  > > > roots are enumerated in certain way, etc.? Depending on your
>  > progress,
>  > >  > > > more details might be needed later on.
>  > >  > > >
>  > >  > > > For this project, you might have to understand the Parrot current
>  > >  > > > status for the above questions. It helps you and us to identify
>  > the
>  > >  > > > key issues to resolve, and the main efforts to be focused on. For
>  > GC
>  > >  > > > porting over different VMs, it's not like porting an application
>  > over
>  > >  > > > different OSes, because of the implicit assumptions between VM
>  > and GC.
>  > >  > > > I would expect some redesign work required for GC porting, hence
>  > you
>  > >  > > > have to understand the Parrot design in certain depth.
>  > >  > > >
>  > >  > > > Thanks,
>  > >  > > > xiaofeng
>  > >  > > >
>  > >  > > > On Mon, Apr 7, 2008 at 4:26 AM, Alexei Fedotov <
>  > >  > > > alexei.fedotov@gmail.com> wrote:
>  > >  > > > > Good job, Senaka.
>  > >  > > > >
>  > >  > > > >  The general perception was that internal and external GC
>  > interfaces
>  > >  > > > >  were mixed, which maked this document less usable for
>  > harmony-gc-5
>  > >  > > > >  project than it could be. For example, sweeping, marking and
>  > >  > > > >  reclaiming are internal interfaces while allocation, stack
>  > >  > > > enumeration
>  > >  > > > >  (please take a look at vm_enumerate_root_set_all_threads) and
>  > gc
>  > >  > > > >  invocation are external interfaces. We should pay more
>  > attention to
>  > >  > > > >  external interfaces for harmony-gc-5 project.
>  > >  > > > >
>  > >  > > > >  Thanks.
>  > >  > > > >
>  > >  > > > >
>  > >  > > > >
>  > >  > > > >
>  > >  > > > >
>  > >  > > > >
>  > >  > > > >
>  > >  > > > >  On Sun, Apr 6, 2008 at 8:35 PM, Senaka Fernando <
>  > >  > > > senakafdo@gmail.com> wrote:
>  > >  > > > >  >
>  > >  > > > >  > Hi all,
>  > >  > > > >  >
>  > >  > > > >  >  I have almost finished comparing the two interfaces of
>  > Harmony
>  > >  > > > and Parrot.
>  > >  > > > >  >  However, I'm not 100% sure on whether I got everything
>  > right,
>  > >  > > > but I believe
>  > >  > > > >  >  that most of it is. Therefore, it would be really great if
>  > you
>  > >  > > > could review
>  > >  > > > >  >  the wiki page and let me know whether it is correct and
>  > precise.
>  > >  > > > >  >
>  > >  > > > >  >  Sections marked as TBD are not yet finalized. And, I have
>  > >  > > > omitted some
>  > >  > > > >  >  instances of where Harmony supports something and Parrot
>  > doesn't
>  > >  > > > for
>  > >  > > > >  >  simplicity.
>  > >  > > > >  >
>  > >  > > > >  >  The wiki page is found at [1]
>  > >  > > > >  >
>  > >  > > > >  >  [1]
>  > >  > > >
>  > http://wiki.apache.org/harmony/gc_comparison/gc_gen_harmony_vs_parrot
>  > >  > > > >  >
>  > >  > > > >  >  Regards,
>  > >  > > > >  >  Senaka
>  > >  > > > >  >
>  > >  > > > >
>  > >  > > > >
>  > >  > > > >
>  > >  > > > >  --
>  > >  > > > >  With best regards,
>  > >  > > > >  Alexei
>  > >  > > > >
>  > >  > > >
>  > >  > > >
>  > >  > > >
>  > >  > > > --
>  > >  > > > http://xiao-feng.blogspot.com
>  > >  > > >
>  > >  > >
>  > >  > >
>  > >  >
>  > >
>  >
>  >
>  >
>  > --
>  > With best regards,
>  > Alexei
>  >
>



-- 
With best regards,
Alexei

Re: [harmony-gc-5] updates on Parrot GC [was Re: Comparison of Harmony GC_Gen vs. Parrot]

Posted by Senaka Fernando <se...@gmail.com>.
Hi Alexei,

Actually it was the Parrot folks who asked me whether I could get the C++
build fixed, as they use some C++ compilers too. I have started on question
1. Which involves creating somewhat like a plug-in which will probably be
used by Parrot to connect any other GC implementation in the future. Because
even they do say that they support 3rd party GCs it is not a part of their
system as yet.

Regards,
Senaka

On Tue, Apr 15, 2008 at 1:53 AM, Alexei Fedotov <al...@gmail.com>
wrote:

> Hello Senaka,
> That's a good progress in understanding how Parrot is built.
>
> Generally, I would suggest keeping build systems of Parrot and GC as
> is, with minimal changes. For example, you may try the following:
>
> 1. Build Parrot as is using a Parrot build system..
> 2. Build GC DLL using DRLVM build system.
> 3. Copy GC DLL to Parrot.
>
> You may want to ask few questions:
> 1. How Parrot would know about this new DLL? You will need to change
> Parrot command line parsing to understand a new option.
> 2. How it would know which functions to call to collect garbage?
> <answer mentions header files>
> 3. How it would be possible to maintain all these changes in order?
> <answer was in the correspondence>
> 4. Etc...
>
> Thanks.
>
>
> On Tue, Apr 15, 2008 at 12:09 AM, Senaka Fernando <se...@gmail.com>
> wrote:
> > Hi all,
> >
> >  I have finally made it possible to build Parrot in a C++ environment.
> And,
> >  have managed to uncover most differences between Parrot's C and C++
> build
> >  streams. I have also contacted Mark (from Parrot) regarding being my
> >  co-mentor and he's interested. I also do get a great deal of support
> from
> >  the Parrot community. The Parrot developer meeting will be held
> tomorrow at
> >  18.30 GMT. I hope that there would be a discussion on the GC.
> >
> >  Regards,
> >  Senaka
> >
> >  On Sun, Apr 13, 2008 at 5:41 PM, Senaka Fernando <se...@gmail.com>
> >
> >
> > wrote:
> >
> >  > Hi Xiao-Feng,
> >  >
> >  > Here is a detailed answer to your questions. I have copy-pasted some
> info
> >  > from Parrot documentation on GC.
> >  >
> >  > Objects on the heap are laid out as PMCs (PObjects), and buffers.
> >  > Allocation is pool based. The Arenas structure holds function
> pointers for
> >  > the core defined interface of the currently active GC subsystem:
> >  > "init_pool", "do_gc_mark",        "finalize_gc_system". It holds
> various
> >  > accounting information for the GC subsystem, including how many GC
> runs have
> >  > been completed, amount of memory allocated since the last run, and
> total
> >  > memory allocated. This accounting information is updated by the GC
> system.
> >  > The current block level for GC mark and sweep phases is stored in the
> Arenas
> >  > structure.
> >  >
> >  > The Memory_Pool structure is a simple memory pool. It contains a
> pointer
> >  > to the top block of the allocated pool, the total allocated size of
> the
> >  > pool, the block size, and some details on the reclamation
> characteristics of
> >  > the pool.
> >  >
> >  > The Small_Object_Pool structure is a richer memory pool for object
> >  > allocation.  It tracks details like the number of allocated and free
> objects
> >  > in the pool, a list of free objects, and for the generational GC
> >  > implementation maintains linked lists of white, black, and gray PMCs.
> It
> >  > contains a pointer to a simple Memory_Pool (the base storage of the
> pool).
> >  > It holds function pointers for adding and retrieving free objects in
> the
> >  > pool, and for allocating objects.
> >  >
> >  > Each PMC/Buffer will contain a set of flags that will govern the
> behavior
> >  > and state of it in the presence of the GC.
> >  > Ex:- PObj_active_destroy_FLAG, PObj_data_is_PMC_array_FLAG,
> PObj_live_FLAG
> >  > etc.
> >  >
> >  > Each PObject has a header which is of type UnionVal, a union of
> various
> >  > fields, in addition to flags. A PMC has a Vtable in it. Thus, each
> allocated
> >  > object will have header info within it.
> >  >
> >  > Each GC core defines 4 function pointers stored in the
> Small_Object_Pool
> >  > structures. One to allocate new objects, another to add a freed
> object to
> >  > the free list, another to retrieve a free object from the free list
> and one
> >  > to reallocate for additional objects. If a Small_Object_Pool is full
> (no
> >  > free objects) a new one will needed to be created. Thus, each object
> on a
> >  > small object pool is like a place holder for a new instance.
> >  >
> >  > Heap is laid out as arenas, having two memory pools and six small
> object
> >  > pools.
> >  >
> >  > There are two marking phases, for PMCs
> >  >
> >  > 1. Initial Marking
> >  > Each PMC has a "flags" member which, among other things, facilitates
> >  > garbage collection. At the beginning of the mark phase, the
> >  > "PObj_is_live_FLAG" and "PObj_is_fully_marked_FLAG" are both unset,
> which
> >  > flags the PMC as presumed dead (white). The initial mark phase of the
> >  > collection cycle goes through each PMC in the root set and sets the
> >  > Obj_is_live_FLAG" bit in the "flags" member (the PMC is gray).  It
> does not
> >  > set the "PObj_is_fully_marked_FLAG" bit (changing the PMC to black),
> because
> >  > in the initial mark, the PMCs or buffers contained by a PMC are not
> marked.
> >  > It also appends the PMC to the end of a list used for further
> marking.
> >  > However, if the PMC has already been marked as black, the current end
> of
> >  > list is returned (instead of appending the already processed PMC) to
> prevent
> >  > endless looping.
> >  >
> >  > 2. Incremental Marking
> >  > After the root set of PMCs have been marked, a series of incremental
> mark
> >  > runs are performed. These may be performed frequently, between other
> >  > operations.  The incremental mark runs work to move gray PMCs to
> black. They
> >  > take a PMC from the list for further marking, mark any PMCs or
> buffers it
> >  > contains as gray (the "PObj_is_live_FLAG" is set and the
> >  > "PObj_is_fully_marked_FLAG" is left unset), and add the contained
> PMCs or
> >  > buffers to the list for further marking.  If the PMC has a custom
> mark
> >  > function in its vtable, it is called at this point.
> >  >
> >  > For Buffers, no incremental marking is involved.
> >  > The initial marking phase also marks the root set of buffers. Because
> >  > buffers cannot contain other buffers, they are immediately marked as
> black
> >  > and not added to the list for further marking. Because PMCs may
> contain
> >  > buffers, the buffer collection phase can't run until the incremental
> marking
> >  > of PMCs is completed.
> >  >
> >  > When the list for further marking is empty (all gray PMCs have
> changed to
> >  > black), the collection stage is started. First, PMCs are collected,
> followed
> >  > by buffers. In both cases (PMC and buffer), the "live" and
> "fully_marked"
> >  > flags are reset after examination for reclamation.
> >  >
> >  > To collect PMCs, each PMC arena is examined from the most recently
> created
> >  > backwards.  Each PMC is examined to see if it is live, already on the
> free
> >  > list, or constant.  If it is not, then it is added to the free list
> and
> >  > marked as being on the free list with the "PObj_on_free_list_FLAG".
> >  >
> >  > To collect buffers, each Buffer arena is examined from the most
> recently
> >  > created backwards.  If the buffer is not live, not already on the
> free list
> >  > and it is not a constant or copy on write, then it is added to the
> free pool
> >  > for reuse and marked with the "PObj_on_free_list_FLAG".
> >  >
> >  > Thus, the objects are scanned during the mark phase and then
> identified as
> >  > live, and collected. The collection process is triggered after the
> marking
> >  > is complete.
> >  >
> >  > Allocation of objects is handled by pool structures.
> >  >
> >  > I can relate the necessary source code portions to this discussion if
> >  > required. However, some of the above mentioned features are not fully
> >  > implemented. But, according to the Parrot community, this will be
> their
> >  > future direction.
> >  >
> >  > I also have fixed most build errors with Parrot & C++ and they are
> really
> >  > happy about it. Now am in the process of resolving some linking
> conflicts on
> >  > Parrot's C++ build.
> >  >
> >  > Regards,
> >  > Senaka
> >  >
> >  > On Mon, Apr 7, 2008 at 10:25 AM, Senaka Fernando <senakafdo@gmail.com
> >
> >  > wrote:
> >  >
> >  > > Hi Xiao-Feng,
> >  > >
> >  > > Thanks for these questions. I believe that they'd be really helpful
> in
> >  > > understanding VM <-> GC assumptions on the Parrot end.
> >  > >
> >  > > Will work on answering these, and perhaps a comparison with
> Harmony.
> >  > >
> >  > > Regards,
> >  > > Senaka
> >  > >
> >  > >
> >  > > On Mon, Apr 7, 2008 at 5:11 AM, Xiao-Feng Li <xiaofeng.li@gmail.com
> >
> >  > > wrote:
> >  > >
> >  > > > Senaka, thanks for the page. I think the most important things
> are
> >  > > > related to the VM <-> GC protocol. Some questions that may help
> you:
> >  > > > 1. How Parrot layout/encode an object/array? fields, size, object
> >  > > > header info, etc.
> >  > >
> >  > >
> >  > > > 2. How Parrot layout/arrange the heap? free list? pages?
> >  > > > 3. What's the process of an object creation? When and how?
> >  > > > 4. How is a collection process triggered?
> >  > > > 5. How does Parrot GC trace live objects and collect them?
> >  > > >
> >  > > > Some of the questions above might be GC internals, so it's more
> >  > > > desirable if you can understand the Parrot VM's assumptions on
> GC.
> >  > > > I.e., does it assume the heap is laid out in certain way, does it
> >  > > > assume the objects are encoded in certain way, does it assume the
> >  > > > roots are enumerated in certain way, etc.? Depending on your
> progress,
> >  > > > more details might be needed later on.
> >  > > >
> >  > > > For this project, you might have to understand the Parrot current
> >  > > > status for the above questions. It helps you and us to identify
> the
> >  > > > key issues to resolve, and the main efforts to be focused on. For
> GC
> >  > > > porting over different VMs, it's not like porting an application
> over
> >  > > > different OSes, because of the implicit assumptions between VM
> and GC.
> >  > > > I would expect some redesign work required for GC porting, hence
> you
> >  > > > have to understand the Parrot design in certain depth.
> >  > > >
> >  > > > Thanks,
> >  > > > xiaofeng
> >  > > >
> >  > > > On Mon, Apr 7, 2008 at 4:26 AM, Alexei Fedotov <
> >  > > > alexei.fedotov@gmail.com> wrote:
> >  > > > > Good job, Senaka.
> >  > > > >
> >  > > > >  The general perception was that internal and external GC
> interfaces
> >  > > > >  were mixed, which maked this document less usable for
> harmony-gc-5
> >  > > > >  project than it could be. For example, sweeping, marking and
> >  > > > >  reclaiming are internal interfaces while allocation, stack
> >  > > > enumeration
> >  > > > >  (please take a look at vm_enumerate_root_set_all_threads) and
> gc
> >  > > > >  invocation are external interfaces. We should pay more
> attention to
> >  > > > >  external interfaces for harmony-gc-5 project.
> >  > > > >
> >  > > > >  Thanks.
> >  > > > >
> >  > > > >
> >  > > > >
> >  > > > >
> >  > > > >
> >  > > > >
> >  > > > >
> >  > > > >  On Sun, Apr 6, 2008 at 8:35 PM, Senaka Fernando <
> >  > > > senakafdo@gmail.com> wrote:
> >  > > > >  >
> >  > > > >  > Hi all,
> >  > > > >  >
> >  > > > >  >  I have almost finished comparing the two interfaces of
> Harmony
> >  > > > and Parrot.
> >  > > > >  >  However, I'm not 100% sure on whether I got everything
> right,
> >  > > > but I believe
> >  > > > >  >  that most of it is. Therefore, it would be really great if
> you
> >  > > > could review
> >  > > > >  >  the wiki page and let me know whether it is correct and
> precise.
> >  > > > >  >
> >  > > > >  >  Sections marked as TBD are not yet finalized. And, I have
> >  > > > omitted some
> >  > > > >  >  instances of where Harmony supports something and Parrot
> doesn't
> >  > > > for
> >  > > > >  >  simplicity.
> >  > > > >  >
> >  > > > >  >  The wiki page is found at [1]
> >  > > > >  >
> >  > > > >  >  [1]
> >  > > >
> http://wiki.apache.org/harmony/gc_comparison/gc_gen_harmony_vs_parrot
> >  > > > >  >
> >  > > > >  >  Regards,
> >  > > > >  >  Senaka
> >  > > > >  >
> >  > > > >
> >  > > > >
> >  > > > >
> >  > > > >  --
> >  > > > >  With best regards,
> >  > > > >  Alexei
> >  > > > >
> >  > > >
> >  > > >
> >  > > >
> >  > > > --
> >  > > > http://xiao-feng.blogspot.com
> >  > > >
> >  > >
> >  > >
> >  >
> >
>
>
>
> --
> With best regards,
> Alexei
>

Re: [harmony-gc-5] updates on Parrot GC [was Re: Comparison of Harmony GC_Gen vs. Parrot]

Posted by Senaka Fernando <se...@gmail.com>.
Hi Xiao-Feng,

I will get these answered one-to-one. Today should be a good day to get help
as there should be many folks around. I will try to answer them on my own
and perhaps post it on a wiki page so that they can then let me know whether
I'm on the right track.

Regards,
Senaka

2008/4/15 Xiao-Feng Li <xi...@gmail.com>:

> On Tue, Apr 15, 2008 at 4:23 AM, Alexei Fedotov
> <al...@gmail.com> wrote:
> > Hello Senaka,
> >  That's a good progress in understanding how Parrot is built.
> >
> >  Generally, I would suggest keeping build systems of Parrot and GC as
> >  is, with minimal changes. For example, you may try the following:
> >
> >  1. Build Parrot as is using a Parrot build system..
> >  2. Build GC DLL using DRLVM build system.
> >  3. Copy GC DLL to Parrot.
> >
> >  You may want to ask few questions:
> >  1. How Parrot would know about this new DLL? You will need to change
> >  Parrot command line parsing to understand a new option.
> >  2. How it would know which functions to call to collect garbage?
> >  <answer mentions header files>
> >  3. How it would be possible to maintain all these changes in order?
> >  <answer was in the correspondence>
> >  4. Etc...
>
> Good suggestions. We need know if Parrot can connect with a DLL GC. If
> it can't, probably it's good for it to improve. To use DLL GC will
> force the Parrot VM to consider modularity and interface clearly.
>
> Btw, I had read the doc of Parrot GC. Algorithm-wise, I guess it is a
> single-thread (sequential) stop-the-world mark-sweep GC. And I guess
> it is trying to add incremental mark-sweep to it.
>
> PMC looks like objects, and Buffer looks like variable-length stings.
> The doc says PMC may contain PMC and Buffer, while Buffer contains
> nothing. I don't know if the "contain" here means "reference".
>
> To make GC into DLL, somebody really needs to answer following VM <->
> GC interactions in Parrot as in Harmony:
>
> Mainly the followings are agreed between VM and GC
> 1. Partially revealed obj and vtable definitions (object layout)
> 2. Object header bits left for GC usage
> 3. GC ↔ VM interfaces in open/gc.h, vm_gc.h
> 4. GC asks VM to suspend/resume mutators, Include GC safe-point support in
> VM
> 5. GC asks VM to enumerate root references, Include stack frame
> unwinding support in VM
> 6. Misc (not critical): finalizer/weakref, class unloading, etc.
>
> Basically they tell how GC works in the system
> 1. How VM asks GC to allocate objects
> 2. How VM triggers collection
> 3. How GC asks VM to suspend mutators
> 4. How GC asks VM to enumerate root references
> 5. How GC traces object connection graph
>
> These are the key points for GC porting or developing
>
> Thanks,
> xiaofeng
>
> >  Thanks.
> >
> >
> >
> >
> >  On Tue, Apr 15, 2008 at 12:09 AM, Senaka Fernando <se...@gmail.com>
> wrote:
> >  > Hi all,
> >  >
> >  >  I have finally made it possible to build Parrot in a C++
> environment. And,
> >  >  have managed to uncover most differences between Parrot's C and C++
> build
> >  >  streams. I have also contacted Mark (from Parrot) regarding being my
> >  >  co-mentor and he's interested. I also do get a great deal of support
> from
> >  >  the Parrot community. The Parrot developer meeting will be held
> tomorrow at
> >  >  18.30 GMT. I hope that there would be a discussion on the GC.
> >  >
> >  >  Regards,
> >  >  Senaka
> >  >
> >  >  On Sun, Apr 13, 2008 at 5:41 PM, Senaka Fernando <
> senakafdo@gmail.com>
> >  >
> >  >
> >  > wrote:
> >  >
> >  >  > Hi Xiao-Feng,
> >  >  >
> >  >  > Here is a detailed answer to your questions. I have copy-pasted
> some info
> >  >  > from Parrot documentation on GC.
> >  >  >
> >  >  > Objects on the heap are laid out as PMCs (PObjects), and buffers.
> >  >  > Allocation is pool based. The Arenas structure holds function
> pointers for
> >  >  > the core defined interface of the currently active GC subsystem:
> >  >  > "init_pool", "do_gc_mark",        "finalize_gc_system". It holds
> various
> >  >  > accounting information for the GC subsystem, including how many GC
> runs have
> >  >  > been completed, amount of memory allocated since the last run, and
> total
> >  >  > memory allocated. This accounting information is updated by the GC
> system.
> >  >  > The current block level for GC mark and sweep phases is stored in
> the Arenas
> >  >  > structure.
> >  >  >
> >  >  > The Memory_Pool structure is a simple memory pool. It contains a
> pointer
> >  >  > to the top block of the allocated pool, the total allocated size
> of the
> >  >  > pool, the block size, and some details on the reclamation
> characteristics of
> >  >  > the pool.
> >  >  >
> >  >  > The Small_Object_Pool structure is a richer memory pool for object
> >  >  > allocation.  It tracks details like the number of allocated and
> free objects
> >  >  > in the pool, a list of free objects, and for the generational GC
> >  >  > implementation maintains linked lists of white, black, and gray
> PMCs. It
> >  >  > contains a pointer to a simple Memory_Pool (the base storage of
> the pool).
> >  >  > It holds function pointers for adding and retrieving free objects
> in the
> >  >  > pool, and for allocating objects.
> >  >  >
> >  >  > Each PMC/Buffer will contain a set of flags that will govern the
> behavior
> >  >  > and state of it in the presence of the GC.
> >  >  > Ex:- PObj_active_destroy_FLAG, PObj_data_is_PMC_array_FLAG,
> PObj_live_FLAG
> >  >  > etc.
> >  >  >
> >  >  > Each PObject has a header which is of type UnionVal, a union of
> various
> >  >  > fields, in addition to flags. A PMC has a Vtable in it. Thus, each
> allocated
> >  >  > object will have header info within it.
> >  >  >
> >  >  > Each GC core defines 4 function pointers stored in the
> Small_Object_Pool
> >  >  > structures. One to allocate new objects, another to add a freed
> object to
> >  >  > the free list, another to retrieve a free object from the free
> list and one
> >  >  > to reallocate for additional objects. If a Small_Object_Pool is
> full (no
> >  >  > free objects) a new one will needed to be created. Thus, each
> object on a
> >  >  > small object pool is like a place holder for a new instance.
> >  >  >
> >  >  > Heap is laid out as arenas, having two memory pools and six small
> object
> >  >  > pools.
> >  >  >
> >  >  > There are two marking phases, for PMCs
> >  >  >
> >  >  > 1. Initial Marking
> >  >  > Each PMC has a "flags" member which, among other things,
> facilitates
> >  >  > garbage collection. At the beginning of the mark phase, the
> >  >  > "PObj_is_live_FLAG" and "PObj_is_fully_marked_FLAG" are both
> unset, which
> >  >  > flags the PMC as presumed dead (white). The initial mark phase of
> the
> >  >  > collection cycle goes through each PMC in the root set and sets
> the
> >  >  > Obj_is_live_FLAG" bit in the "flags" member (the PMC is gray).  It
> does not
> >  >  > set the "PObj_is_fully_marked_FLAG" bit (changing the PMC to
> black), because
> >  >  > in the initial mark, the PMCs or buffers contained by a PMC are
> not marked.
> >  >  > It also appends the PMC to the end of a list used for further
> marking.
> >  >  > However, if the PMC has already been marked as black, the current
> end of
> >  >  > list is returned (instead of appending the already processed PMC)
> to prevent
> >  >  > endless looping.
> >  >  >
> >  >  > 2. Incremental Marking
> >  >  > After the root set of PMCs have been marked, a series of
> incremental mark
> >  >  > runs are performed. These may be performed frequently, between
> other
> >  >  > operations.  The incremental mark runs work to move gray PMCs to
> black. They
> >  >  > take a PMC from the list for further marking, mark any PMCs or
> buffers it
> >  >  > contains as gray (the "PObj_is_live_FLAG" is set and the
> >  >  > "PObj_is_fully_marked_FLAG" is left unset), and add the contained
> PMCs or
> >  >  > buffers to the list for further marking.  If the PMC has a custom
> mark
> >  >  > function in its vtable, it is called at this point.
> >  >  >
> >  >  > For Buffers, no incremental marking is involved.
> >  >  > The initial marking phase also marks the root set of buffers.
> Because
> >  >  > buffers cannot contain other buffers, they are immediately marked
> as black
> >  >  > and not added to the list for further marking. Because PMCs may
> contain
> >  >  > buffers, the buffer collection phase can't run until the
> incremental marking
> >  >  > of PMCs is completed.
> >  >  >
> >  >  > When the list for further marking is empty (all gray PMCs have
> changed to
> >  >  > black), the collection stage is started. First, PMCs are
> collected, followed
> >  >  > by buffers. In both cases (PMC and buffer), the "live" and
> "fully_marked"
> >  >  > flags are reset after examination for reclamation.
> >  >  >
> >  >  > To collect PMCs, each PMC arena is examined from the most recently
> created
> >  >  > backwards.  Each PMC is examined to see if it is live, already on
> the free
> >  >  > list, or constant.  If it is not, then it is added to the free
> list and
> >  >  > marked as being on the free list with the
> "PObj_on_free_list_FLAG".
> >  >  >
> >  >  > To collect buffers, each Buffer arena is examined from the most
> recently
> >  >  > created backwards.  If the buffer is not live, not already on the
> free list
> >  >  > and it is not a constant or copy on write, then it is added to the
> free pool
> >  >  > for reuse and marked with the "PObj_on_free_list_FLAG".
> >  >  >
> >  >  > Thus, the objects are scanned during the mark phase and then
> identified as
> >  >  > live, and collected. The collection process is triggered after the
> marking
> >  >  > is complete.
> >  >  >
> >  >  > Allocation of objects is handled by pool structures.
> >  >  >
> >  >  > I can relate the necessary source code portions to this discussion
> if
> >  >  > required. However, some of the above mentioned features are not
> fully
> >  >  > implemented. But, according to the Parrot community, this will be
> their
> >  >  > future direction.
> >  >  >
> >  >  > I also have fixed most build errors with Parrot & C++ and they are
> really
> >  >  > happy about it. Now am in the process of resolving some linking
> conflicts on
> >  >  > Parrot's C++ build.
> >  >  >
> >  >  > Regards,
> >  >  > Senaka
> >  >  >
> >  >  > On Mon, Apr 7, 2008 at 10:25 AM, Senaka Fernando <
> senakafdo@gmail.com>
> >  >  > wrote:
> >  >  >
> >  >  > > Hi Xiao-Feng,
> >  >  > >
> >  >  > > Thanks for these questions. I believe that they'd be really
> helpful in
> >  >  > > understanding VM <-> GC assumptions on the Parrot end.
> >  >  > >
> >  >  > > Will work on answering these, and perhaps a comparison with
> Harmony.
> >  >  > >
> >  >  > > Regards,
> >  >  > > Senaka
> >  >  > >
> >  >  > >
> >  >  > > On Mon, Apr 7, 2008 at 5:11 AM, Xiao-Feng Li <
> xiaofeng.li@gmail.com>
> >  >  > > wrote:
> >  >  > >
> >  >  > > > Senaka, thanks for the page. I think the most important things
> are
> >  >  > > > related to the VM <-> GC protocol. Some questions that may
> help you:
> >  >  > > > 1. How Parrot layout/encode an object/array? fields, size,
> object
> >  >  > > > header info, etc.
> >  >  > >
> >  >  > >
> >  >  > > > 2. How Parrot layout/arrange the heap? free list? pages?
> >  >  > > > 3. What's the process of an object creation? When and how?
> >  >  > > > 4. How is a collection process triggered?
> >  >  > > > 5. How does Parrot GC trace live objects and collect them?
> >  >  > > >
> >  >  > > > Some of the questions above might be GC internals, so it's
> more
> >  >  > > > desirable if you can understand the Parrot VM's assumptions on
> GC.
> >  >  > > > I.e., does it assume the heap is laid out in certain way, does
> it
> >  >  > > > assume the objects are encoded in certain way, does it assume
> the
> >  >  > > > roots are enumerated in certain way, etc.? Depending on your
> progress,
> >  >  > > > more details might be needed later on.
> >  >  > > >
> >  >  > > > For this project, you might have to understand the Parrot
> current
> >  >  > > > status for the above questions. It helps you and us to
> identify the
> >  >  > > > key issues to resolve, and the main efforts to be focused on.
> For GC
> >  >  > > > porting over different VMs, it's not like porting an
> application over
> >  >  > > > different OSes, because of the implicit assumptions between VM
> and GC.
> >  >  > > > I would expect some redesign work required for GC porting,
> hence you
> >  >  > > > have to understand the Parrot design in certain depth.
> >  >  > > >
> >  >  > > > Thanks,
> >  >  > > > xiaofeng
> >  >  > > >
> >  >  > > > On Mon, Apr 7, 2008 at 4:26 AM, Alexei Fedotov <
> >  >  > > > alexei.fedotov@gmail.com> wrote:
> >  >  > > > > Good job, Senaka.
> >  >  > > > >
> >  >  > > > >  The general perception was that internal and external GC
> interfaces
> >  >  > > > >  were mixed, which maked this document less usable for
> harmony-gc-5
> >  >  > > > >  project than it could be. For example, sweeping, marking
> and
> >  >  > > > >  reclaiming are internal interfaces while allocation, stack
> >  >  > > > enumeration
> >  >  > > > >  (please take a look at vm_enumerate_root_set_all_threads)
> and gc
> >  >  > > > >  invocation are external interfaces. We should pay more
> attention to
> >  >  > > > >  external interfaces for harmony-gc-5 project.
> >  >  > > > >
> >  >  > > > >  Thanks.
> >  >  > > > >
> >  >  > > > >
> >  >  > > > >
> >  >  > > > >
> >  >  > > > >
> >  >  > > > >
> >  >  > > > >
> >  >  > > > >  On Sun, Apr 6, 2008 at 8:35 PM, Senaka Fernando <
> >  >  > > > senakafdo@gmail.com> wrote:
> >  >  > > > >  >
> >  >  > > > >  > Hi all,
> >  >  > > > >  >
> >  >  > > > >  >  I have almost finished comparing the two interfaces of
> Harmony
> >  >  > > > and Parrot.
> >  >  > > > >  >  However, I'm not 100% sure on whether I got everything
> right,
> >  >  > > > but I believe
> >  >  > > > >  >  that most of it is. Therefore, it would be really great
> if you
> >  >  > > > could review
> >  >  > > > >  >  the wiki page and let me know whether it is correct and
> precise.
> >  >  > > > >  >
> >  >  > > > >  >  Sections marked as TBD are not yet finalized. And, I
> have
> >  >  > > > omitted some
> >  >  > > > >  >  instances of where Harmony supports something and Parrot
> doesn't
> >  >  > > > for
> >  >  > > > >  >  simplicity.
> >  >  > > > >  >
> >  >  > > > >  >  The wiki page is found at [1]
> >  >  > > > >  >
> >  >  > > > >  >  [1]
> >  >  > > >
> http://wiki.apache.org/harmony/gc_comparison/gc_gen_harmony_vs_parrot
> >  >  > > > >  >
> >  >  > > > >  >  Regards,
> >  >  > > > >  >  Senaka
> >  >  > > > >  >
> >  >  > > > >
> >  >  > > > >
> >  >  > > > >
> >  >  > > > >  --
> >  >  > > > >  With best regards,
> >  >  > > > >  Alexei
> >  >  > > > >
> >  >  > > >
> >  >  > > >
> >  >  > > >
> >  >  > > > --
> >  >  > > > http://xiao-feng.blogspot.com
> >  >  > > >
> >  >  > >
> >  >  > >
> >  >  >
> >  >
> >
> >
> >
> >  --
> >  With best regards,
> >  Alexei
> >
>
>
>
> --
> http://xiao-feng.blogspot.com
>

Re: [harmony-gc-5] updates on Parrot GC [was Re: Comparison of Harmony GC_Gen vs. Parrot]

Posted by Xiao-Feng Li <xi...@gmail.com>.
On Tue, Apr 15, 2008 at 4:23 AM, Alexei Fedotov
<al...@gmail.com> wrote:
> Hello Senaka,
>  That's a good progress in understanding how Parrot is built.
>
>  Generally, I would suggest keeping build systems of Parrot and GC as
>  is, with minimal changes. For example, you may try the following:
>
>  1. Build Parrot as is using a Parrot build system..
>  2. Build GC DLL using DRLVM build system.
>  3. Copy GC DLL to Parrot.
>
>  You may want to ask few questions:
>  1. How Parrot would know about this new DLL? You will need to change
>  Parrot command line parsing to understand a new option.
>  2. How it would know which functions to call to collect garbage?
>  <answer mentions header files>
>  3. How it would be possible to maintain all these changes in order?
>  <answer was in the correspondence>
>  4. Etc...

Good suggestions. We need know if Parrot can connect with a DLL GC. If
it can't, probably it's good for it to improve. To use DLL GC will
force the Parrot VM to consider modularity and interface clearly.

Btw, I had read the doc of Parrot GC. Algorithm-wise, I guess it is a
single-thread (sequential) stop-the-world mark-sweep GC. And I guess
it is trying to add incremental mark-sweep to it.

PMC looks like objects, and Buffer looks like variable-length stings.
The doc says PMC may contain PMC and Buffer, while Buffer contains
nothing. I don't know if the "contain" here means "reference".

To make GC into DLL, somebody really needs to answer following VM <->
GC interactions in Parrot as in Harmony:

Mainly the followings are agreed between VM and GC
1. Partially revealed obj and vtable definitions (object layout)
2. Object header bits left for GC usage
3. GC ↔ VM interfaces in open/gc.h, vm_gc.h
4. GC asks VM to suspend/resume mutators, Include GC safe-point support in VM
5. GC asks VM to enumerate root references, Include stack frame
unwinding support in VM
6. Misc (not critical): finalizer/weakref, class unloading, etc.

Basically they tell how GC works in the system
1. How VM asks GC to allocate objects
2. How VM triggers collection
3. How GC asks VM to suspend mutators
4. How GC asks VM to enumerate root references
5. How GC traces object connection graph

These are the key points for GC porting or developing

Thanks,
xiaofeng

>  Thanks.
>
>
>
>
>  On Tue, Apr 15, 2008 at 12:09 AM, Senaka Fernando <se...@gmail.com> wrote:
>  > Hi all,
>  >
>  >  I have finally made it possible to build Parrot in a C++ environment. And,
>  >  have managed to uncover most differences between Parrot's C and C++ build
>  >  streams. I have also contacted Mark (from Parrot) regarding being my
>  >  co-mentor and he's interested. I also do get a great deal of support from
>  >  the Parrot community. The Parrot developer meeting will be held tomorrow at
>  >  18.30 GMT. I hope that there would be a discussion on the GC.
>  >
>  >  Regards,
>  >  Senaka
>  >
>  >  On Sun, Apr 13, 2008 at 5:41 PM, Senaka Fernando <se...@gmail.com>
>  >
>  >
>  > wrote:
>  >
>  >  > Hi Xiao-Feng,
>  >  >
>  >  > Here is a detailed answer to your questions. I have copy-pasted some info
>  >  > from Parrot documentation on GC.
>  >  >
>  >  > Objects on the heap are laid out as PMCs (PObjects), and buffers.
>  >  > Allocation is pool based. The Arenas structure holds function pointers for
>  >  > the core defined interface of the currently active GC subsystem:
>  >  > "init_pool", "do_gc_mark",        "finalize_gc_system". It holds various
>  >  > accounting information for the GC subsystem, including how many GC runs have
>  >  > been completed, amount of memory allocated since the last run, and total
>  >  > memory allocated. This accounting information is updated by the GC system.
>  >  > The current block level for GC mark and sweep phases is stored in the Arenas
>  >  > structure.
>  >  >
>  >  > The Memory_Pool structure is a simple memory pool. It contains a pointer
>  >  > to the top block of the allocated pool, the total allocated size of the
>  >  > pool, the block size, and some details on the reclamation characteristics of
>  >  > the pool.
>  >  >
>  >  > The Small_Object_Pool structure is a richer memory pool for object
>  >  > allocation.  It tracks details like the number of allocated and free objects
>  >  > in the pool, a list of free objects, and for the generational GC
>  >  > implementation maintains linked lists of white, black, and gray PMCs. It
>  >  > contains a pointer to a simple Memory_Pool (the base storage of the pool).
>  >  > It holds function pointers for adding and retrieving free objects in the
>  >  > pool, and for allocating objects.
>  >  >
>  >  > Each PMC/Buffer will contain a set of flags that will govern the behavior
>  >  > and state of it in the presence of the GC.
>  >  > Ex:- PObj_active_destroy_FLAG, PObj_data_is_PMC_array_FLAG, PObj_live_FLAG
>  >  > etc.
>  >  >
>  >  > Each PObject has a header which is of type UnionVal, a union of various
>  >  > fields, in addition to flags. A PMC has a Vtable in it. Thus, each allocated
>  >  > object will have header info within it.
>  >  >
>  >  > Each GC core defines 4 function pointers stored in the Small_Object_Pool
>  >  > structures. One to allocate new objects, another to add a freed object to
>  >  > the free list, another to retrieve a free object from the free list and one
>  >  > to reallocate for additional objects. If a Small_Object_Pool is full (no
>  >  > free objects) a new one will needed to be created. Thus, each object on a
>  >  > small object pool is like a place holder for a new instance.
>  >  >
>  >  > Heap is laid out as arenas, having two memory pools and six small object
>  >  > pools.
>  >  >
>  >  > There are two marking phases, for PMCs
>  >  >
>  >  > 1. Initial Marking
>  >  > Each PMC has a "flags" member which, among other things, facilitates
>  >  > garbage collection. At the beginning of the mark phase, the
>  >  > "PObj_is_live_FLAG" and "PObj_is_fully_marked_FLAG" are both unset, which
>  >  > flags the PMC as presumed dead (white). The initial mark phase of the
>  >  > collection cycle goes through each PMC in the root set and sets the
>  >  > Obj_is_live_FLAG" bit in the "flags" member (the PMC is gray).  It does not
>  >  > set the "PObj_is_fully_marked_FLAG" bit (changing the PMC to black), because
>  >  > in the initial mark, the PMCs or buffers contained by a PMC are not marked.
>  >  > It also appends the PMC to the end of a list used for further marking.
>  >  > However, if the PMC has already been marked as black, the current end of
>  >  > list is returned (instead of appending the already processed PMC) to prevent
>  >  > endless looping.
>  >  >
>  >  > 2. Incremental Marking
>  >  > After the root set of PMCs have been marked, a series of incremental mark
>  >  > runs are performed. These may be performed frequently, between other
>  >  > operations.  The incremental mark runs work to move gray PMCs to black. They
>  >  > take a PMC from the list for further marking, mark any PMCs or buffers it
>  >  > contains as gray (the "PObj_is_live_FLAG" is set and the
>  >  > "PObj_is_fully_marked_FLAG" is left unset), and add the contained PMCs or
>  >  > buffers to the list for further marking.  If the PMC has a custom mark
>  >  > function in its vtable, it is called at this point.
>  >  >
>  >  > For Buffers, no incremental marking is involved.
>  >  > The initial marking phase also marks the root set of buffers. Because
>  >  > buffers cannot contain other buffers, they are immediately marked as black
>  >  > and not added to the list for further marking. Because PMCs may contain
>  >  > buffers, the buffer collection phase can't run until the incremental marking
>  >  > of PMCs is completed.
>  >  >
>  >  > When the list for further marking is empty (all gray PMCs have changed to
>  >  > black), the collection stage is started. First, PMCs are collected, followed
>  >  > by buffers. In both cases (PMC and buffer), the "live" and "fully_marked"
>  >  > flags are reset after examination for reclamation.
>  >  >
>  >  > To collect PMCs, each PMC arena is examined from the most recently created
>  >  > backwards.  Each PMC is examined to see if it is live, already on the free
>  >  > list, or constant.  If it is not, then it is added to the free list and
>  >  > marked as being on the free list with the "PObj_on_free_list_FLAG".
>  >  >
>  >  > To collect buffers, each Buffer arena is examined from the most recently
>  >  > created backwards.  If the buffer is not live, not already on the free list
>  >  > and it is not a constant or copy on write, then it is added to the free pool
>  >  > for reuse and marked with the "PObj_on_free_list_FLAG".
>  >  >
>  >  > Thus, the objects are scanned during the mark phase and then identified as
>  >  > live, and collected. The collection process is triggered after the marking
>  >  > is complete.
>  >  >
>  >  > Allocation of objects is handled by pool structures.
>  >  >
>  >  > I can relate the necessary source code portions to this discussion if
>  >  > required. However, some of the above mentioned features are not fully
>  >  > implemented. But, according to the Parrot community, this will be their
>  >  > future direction.
>  >  >
>  >  > I also have fixed most build errors with Parrot & C++ and they are really
>  >  > happy about it. Now am in the process of resolving some linking conflicts on
>  >  > Parrot's C++ build.
>  >  >
>  >  > Regards,
>  >  > Senaka
>  >  >
>  >  > On Mon, Apr 7, 2008 at 10:25 AM, Senaka Fernando <se...@gmail.com>
>  >  > wrote:
>  >  >
>  >  > > Hi Xiao-Feng,
>  >  > >
>  >  > > Thanks for these questions. I believe that they'd be really helpful in
>  >  > > understanding VM <-> GC assumptions on the Parrot end.
>  >  > >
>  >  > > Will work on answering these, and perhaps a comparison with Harmony.
>  >  > >
>  >  > > Regards,
>  >  > > Senaka
>  >  > >
>  >  > >
>  >  > > On Mon, Apr 7, 2008 at 5:11 AM, Xiao-Feng Li <xi...@gmail.com>
>  >  > > wrote:
>  >  > >
>  >  > > > Senaka, thanks for the page. I think the most important things are
>  >  > > > related to the VM <-> GC protocol. Some questions that may help you:
>  >  > > > 1. How Parrot layout/encode an object/array? fields, size, object
>  >  > > > header info, etc.
>  >  > >
>  >  > >
>  >  > > > 2. How Parrot layout/arrange the heap? free list? pages?
>  >  > > > 3. What's the process of an object creation? When and how?
>  >  > > > 4. How is a collection process triggered?
>  >  > > > 5. How does Parrot GC trace live objects and collect them?
>  >  > > >
>  >  > > > Some of the questions above might be GC internals, so it's more
>  >  > > > desirable if you can understand the Parrot VM's assumptions on GC.
>  >  > > > I.e., does it assume the heap is laid out in certain way, does it
>  >  > > > assume the objects are encoded in certain way, does it assume the
>  >  > > > roots are enumerated in certain way, etc.? Depending on your progress,
>  >  > > > more details might be needed later on.
>  >  > > >
>  >  > > > For this project, you might have to understand the Parrot current
>  >  > > > status for the above questions. It helps you and us to identify the
>  >  > > > key issues to resolve, and the main efforts to be focused on. For GC
>  >  > > > porting over different VMs, it's not like porting an application over
>  >  > > > different OSes, because of the implicit assumptions between VM and GC.
>  >  > > > I would expect some redesign work required for GC porting, hence you
>  >  > > > have to understand the Parrot design in certain depth.
>  >  > > >
>  >  > > > Thanks,
>  >  > > > xiaofeng
>  >  > > >
>  >  > > > On Mon, Apr 7, 2008 at 4:26 AM, Alexei Fedotov <
>  >  > > > alexei.fedotov@gmail.com> wrote:
>  >  > > > > Good job, Senaka.
>  >  > > > >
>  >  > > > >  The general perception was that internal and external GC interfaces
>  >  > > > >  were mixed, which maked this document less usable for harmony-gc-5
>  >  > > > >  project than it could be. For example, sweeping, marking and
>  >  > > > >  reclaiming are internal interfaces while allocation, stack
>  >  > > > enumeration
>  >  > > > >  (please take a look at vm_enumerate_root_set_all_threads) and gc
>  >  > > > >  invocation are external interfaces. We should pay more attention to
>  >  > > > >  external interfaces for harmony-gc-5 project.
>  >  > > > >
>  >  > > > >  Thanks.
>  >  > > > >
>  >  > > > >
>  >  > > > >
>  >  > > > >
>  >  > > > >
>  >  > > > >
>  >  > > > >
>  >  > > > >  On Sun, Apr 6, 2008 at 8:35 PM, Senaka Fernando <
>  >  > > > senakafdo@gmail.com> wrote:
>  >  > > > >  >
>  >  > > > >  > Hi all,
>  >  > > > >  >
>  >  > > > >  >  I have almost finished comparing the two interfaces of Harmony
>  >  > > > and Parrot.
>  >  > > > >  >  However, I'm not 100% sure on whether I got everything right,
>  >  > > > but I believe
>  >  > > > >  >  that most of it is. Therefore, it would be really great if you
>  >  > > > could review
>  >  > > > >  >  the wiki page and let me know whether it is correct and precise.
>  >  > > > >  >
>  >  > > > >  >  Sections marked as TBD are not yet finalized. And, I have
>  >  > > > omitted some
>  >  > > > >  >  instances of where Harmony supports something and Parrot doesn't
>  >  > > > for
>  >  > > > >  >  simplicity.
>  >  > > > >  >
>  >  > > > >  >  The wiki page is found at [1]
>  >  > > > >  >
>  >  > > > >  >  [1]
>  >  > > > http://wiki.apache.org/harmony/gc_comparison/gc_gen_harmony_vs_parrot
>  >  > > > >  >
>  >  > > > >  >  Regards,
>  >  > > > >  >  Senaka
>  >  > > > >  >
>  >  > > > >
>  >  > > > >
>  >  > > > >
>  >  > > > >  --
>  >  > > > >  With best regards,
>  >  > > > >  Alexei
>  >  > > > >
>  >  > > >
>  >  > > >
>  >  > > >
>  >  > > > --
>  >  > > > http://xiao-feng.blogspot.com
>  >  > > >
>  >  > >
>  >  > >
>  >  >
>  >
>
>
>
>  --
>  With best regards,
>  Alexei
>



-- 
http://xiao-feng.blogspot.com