Posted to dev@harmony.apache.org by Steve Blackburn <St...@anu.edu.au> on 2005/05/27 23:30:54 UTC

Work items

Hi all,

I imagine I'm not alone in thinking that there are a great many
concrete things people can be working on right away, even while
discussion on key design issues continues.  It would be good to have a
work list of concrete work items that eager folks can go to.  So I'll
stick my neck out and throw out a list of a few tasks that can be
worked on right away.  These include prototyping, research and
improvements to existing infrastructure. I have not included broad
design issues on the list, since design issues are not "projects" but the
subject of large-scale discussion.

Among the items listed below, I have included items from MMTk and
Jikes RVM and I'm hoping others will do the same and add items from
other projects that work towards the goal of contributing to the
Harmony project.  I hope everyone feels free to add to the list with
any work items they can think of (and argue for the removal of items
if they want).  This is just a start.  The list definitely needs
improving and growing, but at least there is no reason for anyone to
be sitting on their hands ;-)

What do people think?

--Steve


. prototype backend generator [prototype]

  Explore Ertl & Gregg's work to develop a "backend generator" which
  leverages the portability of gcj to automatically generate backends
  for a simple JIT.  The semantics of Java bytecodes are expressed
  using Java code, and then gcj is used to generate code fragments
  which are then captured for use by a simple JIT (Ertl & Gregg used C
  and gcc, but with vmmagic support, Java and gcj would be nicer).
  See also

  http://www.csc.uvic.ca/~csc586a/papers/ertlgregg04.pdf
  http://mail-archives.apache.org/mod_mbox/incubator-harmony-dev/200505.mbox/%3c20050523.084854.343163557.shudo@aist.go.jp%3e
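
  As a rough illustration of "semantics expressed using Java code", the
  sketch below writes each bytecode as a tiny Java method over an operand
  stack.  All names here are hypothetical and come from neither Ertl &
  Gregg's paper nor gcj.  In a real backend generator each method would be
  compiled ahead of time and its machine code captured as a fragment; this
  sketch simply calls the methods in sequence to stand in for the stitched
  fragments.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: these names are illustrative and do not come from
// Ertl & Gregg's paper, gcj, or any Harmony code.
final class ToyBytecodeSemantics {
    // Each method states the meaning of one bytecode over an operand stack.
    // A backend generator would compile each method ahead of time and capture
    // the resulting machine code as a fragment for the JIT to copy.
    static void iconst(Deque<Integer> stack, int value) {
        stack.push(value);
    }

    static void iadd(Deque<Integer> stack) {
        int right = stack.pop();
        int left = stack.pop();
        stack.push(left + right);
    }

    static void imul(Deque<Integer> stack) {
        int right = stack.pop();
        int left = stack.pop();
        stack.push(left * right);
    }

    // Stands in for JIT output: the fragments for "(2 + 3) * 4" laid end to end.
    static int example() {
        Deque<Integer> stack = new ArrayDeque<>();
        iconst(stack, 2);
        iconst(stack, 3);
        iadd(stack);
        iconst(stack, 4);
        imul(stack);
        return stack.pop();
    }
}
```

  Keeping one method per bytecode means the same source could, in
  principle, serve both an interpreter loop and a fragment-copying JIT.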

. vmmagic support for gcj [gcj]

  Implement the vmmagic unboxed types in gcj, so they are compiled to
  unboxed native operations (eg Address.loadInt() will perform an
  integer load from an address value, and all instances of Address
  will appear not as object instances but as 32/64 bit primitives).

  http://jikesrvm.sourceforge.net/api/org/vmmagic/unboxed/package-summary.html

  This will allow gcj to be used to build Java code which uses the
  vmmagic primitives.  Thus gcj could be used to build a virtual
  machine image or an interpreter, or to automatically generate
  a compiler back end.
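
  To give a feel for what code written against an unboxed Address type
  looks like, here is a minimal sketch.  ToyAddress is a hypothetical
  stand-in, not the org.vmmagic.unboxed.Address class: it is an ordinary
  object backed by a ByteBuffer so that the example runs on any JVM.  With
  real vmmagic support the compiler would treat such a value as a raw
  machine word and loadInt() as a single load instruction.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for an unboxed address type; NOT the vmmagic class.
final class ToyAddress {
    // A small buffer plays the role of raw memory (an assumption of this sketch).
    private static final ByteBuffer MEMORY =
            ByteBuffer.allocate(64).order(ByteOrder.nativeOrder());

    private final int offset;

    private ToyAddress(int offset) {
        this.offset = offset;
    }

    static ToyAddress fromOffset(int offset) {
        return new ToyAddress(offset);
    }

    // Address arithmetic: a compiler with magic support would emit a plain add.
    ToyAddress plus(int bytes) {
        return new ToyAddress(offset + bytes);
    }

    // A compiler with magic support would emit a single integer load here.
    int loadInt() {
        return MEMORY.getInt(offset);
    }

    void store(int value) {
        MEMORY.putInt(offset, value);
    }

    // Example client code: reads two adjacent ints starting at 'base'.
    static int sumTwoInts(ToyAddress base) {
        return base.loadInt() + base.plus(4).loadInt();
    }
}
```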

. bytecode optimization [research]

  On architectures where there is no high performance JIT, the
  performance of the interpreter or simple JIT (see backend generator
  project) will be the limiting factor.  (High performance JITs are
  heavyweight and inevitably will only exist on the architectures where
  there has been demonstrated demand for them.)  Since a simple
  interpreter or simple JIT has very limited opportunity for
  optimization, it may be very important to perform bytecode
  optimization.  There are a number of toolkits including bloat and
  soot.

  http://www.sable.mcgill.ca/soot/
  http://www.cs.purdue.edu/s3/projects/bloat/
  http://portal.acm.org/citation.cfm?id=781995.782008
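
  One of the simplest bytecode-level optimizations is constant folding in
  a peephole pass.  The sketch below is only an illustration over a toy
  instruction format ("push <n>" and "add" as strings).  It does not use
  the soot or bloat APIs, and the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical peephole pass over a toy instruction list (not real bytecode).
final class ToyPeephole {
    // Replaces "push a, push b, add" with "push (a+b)" wherever it appears.
    static List<String> foldConstants(List<String> code) {
        List<String> out = new ArrayList<>();
        for (String insn : code) {
            int n = out.size();
            if (insn.equals("add") && n >= 2
                    && out.get(n - 1).startsWith("push ")
                    && out.get(n - 2).startsWith("push ")) {
                // Both operands are known constants: fold them now.
                int right = Integer.parseInt(out.remove(n - 1).substring(5));
                int left = Integer.parseInt(out.remove(n - 2).substring(5));
                out.add("push " + (left + right));
            } else {
                out.add(insn);
            }
        }
        return out;
    }
}
```

  Because folded results are pushed back onto the output list, chains such
  as "push 1, push 2, add, push 3, add" collapse in a single pass.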

. dynamic loading VM components and libraries [research]

  Whether built in Java, C/C++ or a mix, harmony should probably have
  strong support for dynamic loading and sharing of precompiled
  "libraries" designed in from the beginning.  This is a very
  interesting problem space, possibly made more interesting by
  JSR-121.  Jikes RVM, for example, mmaps a monolithic image
  corresponding to a lot of precompiled and initialized classes.  ORP
  dynamically loads components such as the JIT.  Generalizing these
  ideas would be enormously helpful to harmony. Minimizing the
  footprint of the VM, particularly in the context of multiple VMs, is
  interesting.  There are complex issues associated with class
  initialization and related code optimization issues (a static final
  field that is initialized in a context-sensitive way can't be constant
  folded by an optimizing compiler except in a context-specific
  way---in other words the optimization would have to be ignored or
  else separate code would have to be generated for each instance).
  This also stirs up questions of heap and memory layout, the location
  of class fields versus code etc etc.  There is a lot here.  It is
  very important and very interesting.  As far as JSR-121 goes, it is
  important to note that Doug Lea is involved in both Harmony and the
  JSR proposal.

  http://jcp.org/aboutJava/communityprocess/pr/jsr121/index.html
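
  The static final problem can be made concrete with a small sketch.  The
  class and the "toy.bufferSize" property name are hypothetical.  The
  field's value is only known once the class is initialized in a
  particular context, so precompiled code shared between contexts cannot
  simply bake in one value.

```java
// Hypothetical example of a context-sensitive static final field.
final class ToyIsolateConfig {
    // The value depends on a system property read at class initialization,
    // so two contexts (e.g. two isolates) may legitimately see different values.
    static final int BUFFER_SIZE = Integer.getInteger("toy.bufferSize", 4096);

    static int doubledBufferSize() {
        // Within one context a compiler may fold this to a constant.
        // Code shared across contexts must either reload the field or be
        // compiled once per context.
        return BUFFER_SIZE * 2;
    }
}
```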

. benchmarks [dacapo]

  Good benchmarks are essential.  A major effort has been underway in
  the DaCapo group to put together a new benchmark suite.  Plans are
  to relicense this under the Apache license.  The suite badly needs
  new packaging and a new harness and is looking for someone to pick
  up the ball and run with it.  Many of the constituent benchmarks are
  drawn from the Jakarta project.

  http://www-ali.cs.umass.edu/DaCapo/gcbm.html

. Generalize object model abstraction [mmtk,ovm]

  There are many different object models (ways of laying out an
  object, its header and associated metadata).  A good GC toolkit
  should be able to abstract over the various object models a host VM
  may implement.  Providing a suitably high level abstraction without
  performance penalty is an important item on the MMTk todo list.
  This sort of generality is something the OVM project has focussed
  on.

  http://www.ovmj.org/
  http://www.ovmj.org/ovmir.ps
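
  A minimal sketch of what abstracting over object models might mean: the
  allocator below never hard-codes a header size or alignment, but asks an
  interface supplied by the host VM.  The interface and class names are
  hypothetical and are not the MMTk or OVM APIs.  Achieving this without a
  performance penalty would presumably rely on the compiler inlining the
  interface calls.

```java
// Hypothetical interface: not MMTk's or OVM's actual object model API.
interface ToyObjectModel {
    int headerBytes();    // bytes of header placed before the fields
    int alignmentBytes(); // required object alignment, a power of two
}

final class ToyBumpAllocator {
    private final ToyObjectModel model;
    private int cursor = 0; // next free offset in an imaginary heap

    ToyBumpAllocator(ToyObjectModel model) {
        this.model = model;
    }

    // Returns the offset of the new object; layout details come from the model.
    int allocate(int fieldBytes) {
        int align = model.alignmentBytes();
        // Round header + fields up to the next multiple of the alignment.
        int size = (model.headerBytes() + fieldBytes + align - 1) & ~(align - 1);
        int start = cursor;
        cursor += size;
        return start;
    }
}

// Two made-up layouts that a host VM might choose between.
final class ToyModels {
    static final ToyObjectModel TWO_WORD_HEADER = new ToyObjectModel() {
        public int headerBytes() { return 8; }
        public int alignmentBytes() { return 4; }
    };
    static final ToyObjectModel ONE_WORD_HEADER = new ToyObjectModel() {
        public int headerBytes() { return 4; }
        public int alignmentBytes() { return 8; }
    };
}
```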

. MMTk as .so [mmtk]

  Build MMTk as a .so, ideally adding new pragmas so that barriers and
  allocation sequences are materialized as inlined C in mmtk.h.
  Builds on prior work by Robin Garner and Andrew Gray.  Will allow
  MMTk to be linked into C-based VMs.

  http://cs.anu.edu.au/~Andrew.Gray/rmtk/

. jikes rvm opt modularization [jikesrvm]

  The Jikes RVM optimizing compiler is one of the valuable aspects of
  Jikes RVM.  If it is to be reused by harmony it needs to be properly
  modularized.  This project will involve "normalizing" the code
  (getting rid of Jikes RVM-specific idioms and idiosyncrasies) and
  clearly defining the module.  In addition to extracting one of the
  most important

  http://jikesrvm.sourceforge.net/userguide/HTML/optdetails.html

. remove dependency on jburg in jikesrvm [jikesrvm]

  iburg is a widely used bottom-up rewrite tool used to produce code
  generators.  Jikes RVM added some extensions, known within Jikes RVM
  as jburg.  There also exists a tool called jburg which is a Java
  implementation of the original BURS work (not directly derived from
  iburg, but written from scratch on the basis of the original Fraser,
  Hanson and Proebsting paper).

  http://jburg.sourceforge.net/
  http://cvs.sourceforge.net/viewcvs.py/jikesrvm/rvm/src/tools/jburg/
  http://www.cs.princeton.edu/software/iburg/
  http://portal.acm.org/citation.cfm?id=151640.151642
  http://www.cs.manchester.ac.uk/mscprojects/projects.05/jam.html
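
  For readers unfamiliar with bottom-up rewriting, the core idea can be
  sketched as a labelling pass that visits children first and records the
  cheapest rule for each nonterminal.  The grammar, node kinds and costs
  below are invented for illustration and are not the input format of
  iburg or of either jburg.

```java
// Hypothetical sketch of bottom-up labelling with costs; the grammar, node
// kinds and costs are invented and are not iburg's or jburg's input format.
final class ToyBurs {
    static final int INF = 1_000_000;

    static final class Node {
        final String op;        // "REG", "CONST" or "ADD"
        final Node left, right; // null for leaves
        int regCost = INF;      // cheapest cost to produce the value in a register
        int immCost = INF;      // cheapest cost to use the value as an immediate

        Node(String op, Node left, Node right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
    }

    // Visit children first, then record the cheapest rule for each nonterminal.
    static void label(Node n) {
        if (n.op.equals("REG")) {
            n.regCost = 0; // rule  reg: REG           (cost 0)
        } else if (n.op.equals("CONST")) {
            n.immCost = 0; // rule  imm: CONST         (cost 0)
            n.regCost = 1; // rule  reg: imm           (cost 1, load immediate)
        } else {           // "ADD"
            label(n.left);
            label(n.right);
            int regReg = n.left.regCost + n.right.regCost + 1; // reg: ADD(reg, reg)
            int regImm = n.left.regCost + n.right.immCost + 1; // reg: ADD(reg, imm)
            n.regCost = Math.min(regReg, regImm);
        }
    }
}
```

  For ADD(REG, CONST) the labeller prefers the add-immediate rule, because
  loading the constant into a register first would cost one more.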


Re: Work items

Posted by Steve Blackburn <St...@anu.edu.au>.
Tom Tromey wrote:

>Don't forget hacking on Classpath :-)
>  
>
Gosh no!!  ;-)  Obviously that should very much be on the work list.

I suspect that, for the work items at peer projects (classpath, mmtk,
gcj, jikesrvm etc), the worklist should, when possible, just provide links
to work items maintained by the associated project.  Dave Grove has
started constructing a list for the opt compiler, so once that's 
available we can remove the jikes rvm items and replace them with a link 
to the jikes rvm list.

>Steve> . bytecode optimization [research]
>
>Something interesting sort of related to this area is "vmgen".  This
>seems like a nice way to build interpreters.
>
Thanks for pointing that out!

This is actually the same work as the first item on the list I sent out 
(prototype backend generator), only I was unaware of vmgen (Anton 
Ertl).  I only knew about the paper with Gregg.  vmgen appears to be 
GPLed.  It looks interesting.

http://www.complang.tuwien.ac.at/anton/vmgen/
http://www.complang.tuwien.ac.at/anton/
http://www.complang.tuwien.ac.at/projects/backends.html

--Steve

Re: Work items

Posted by Tom Tromey <tr...@redhat.com>.
>>>>> "Steve" == Steve Blackburn <St...@anu.edu.au> writes:

Steve> I imagine I'm not alone in thinking that there are a great many
Steve> concrete things people can be working on right away, even while
Steve> discussion on key design issues continues.

Don't forget hacking on Classpath :-)

Steve> . bytecode optimization [research]

Something interesting sort of related to this area is "vmgen".  This
seems like a nice way to build interpreters.  It seems to let you
focus on writing the logic, and not worry about the details (stack
caching, superinstructions, scheduling, whatever new stuff they've
come up with...)

Tom

Re: Work items

Posted by "Geir Magnusson Jr." <ge...@apache.org>.
Thanks Steve!

Looking at these, they seem to be split between working at the  
respective projects, and working here.

For stuff elsewhere, how do we want to keep in touch?  Cross-posting  
might be too much, but maybe people that do go work on these ideas  
elsewhere can report back from time to time?

For the stuff that can work here, let's try to establish informal
subject keys (ex [dynaload] ........ ) so we can separate what is
what on the dev list.  Also, I'll set up SVN and start the committer
policy discussion.  We're going to need to highly segment the code  
repository for reasons that will be made clear.

geir


On May 27, 2005, at 5:30 PM, Steve Blackburn wrote:

> Hi all,
>
> I imagine I'm not alone in thinking that there are a great many
> concrete things people can be working on right away, even while
> discussion on key design issues continues.

-- 
Geir Magnusson Jr                                  +1-203-665-6437
geirm@apache.org



Re: Work items

Posted by "Geir Magnusson Jr." <ge...@apache.org>.
On May 27, 2005, at 6:09 PM, Raffaele Castagno wrote:

> 2005/5/27, Steve Blackburn <St...@anu.edu.au>:
>
>>
>> Hi all,
>>
>> I imagine I'm not alone in thinking that there are a great many
>> concrete things people can be working on right away, even while
>> discussion on key design issues continues. It would be good to have a
>>
>>
>
> My 2 (euro)cents:
>
> There's not only the need to start implementing code.
>
> I'm (slowly) translating some of the wiki documentation to Italian, but
> someone could also create a webpage for the Incubator site,

We'll put up a webpage to track incubator status, but we also need to  
decide on how we want to do the project website.

Ideas?  I've used simple xdoc, maven and forrest, and I think that  
simple xdoc is the most straightforward.

> or sort
> the "People" page alphabetically, organize the reference documentation, or
> simply change the layout of the wiki to make it more accessible and good
> looking. These are tasks that anyone could take on, but they are
> important all the same.

indeed.  thanks

geir

>
> Best Regards
>
> Raffaele
>
> -- 
> If you want a GMail account, send me an E-Mail.
>

-- 
Geir Magnusson Jr                                  +1-203-665-6437
geirm@apache.org



Re: Work items

Posted by Steve Blackburn <St...@anu.edu.au>.
Raffaele Castagno wrote:

>There's not only the need to start implementing code.
>
>I'm (slowly) translating some of the wiki documentation to Italian, but
>someone could also create a webpage for the Incubator site, or sort
>the "People" page alphabetically, organize the reference documentation, or
>simply change the layout of the wiki to make it more accessible and good
>looking. These are tasks that anyone could take on, but they are
>important all the same.
>
>  
>
Absolutely!!!  This is invaluable to the project!

--Steve

Re: Work items

Posted by Raffaele Castagno <ra...@gmail.com>.
2005/5/27, Steve Blackburn <St...@anu.edu.au>:
> 
> Hi all,
> 
> I imagine I'm not alone in thinking that there are a great many
> concrete things people can be working on right away, even while
> discussion on key design issues continues. It would be good to have a
> 

My 2 (euro)cents:

There's not only the need to start implementing code.

I'm (slowly) translating some of the wiki documentation to Italian, but
someone could also create a webpage for the Incubator site, or sort
the "People" page alphabetically, organize the reference documentation, or
simply change the layout of the wiki to make it more accessible and good
looking. These are tasks that anyone could take on, but they are
important all the same.

Best Regards

Raffaele

-- 
If you want a GMail account, send me an E-Mail.

Re: Work items

Posted by Steve Blackburn <St...@anu.edu.au>.
Rodrigo Kumpera wrote:

>Next time I'm going to implement something in line with jikesRVM's vmmagic
>(more like stealing the whole concept)
>
That's the idea!  vmmagic has emerged from at least three separate 
projects (OVM, JikesRVM and jnode).  If you're interested in the whole 
magic concept, there is discussion of it in the original Jalapeno papers 
and here's a tech report written by OVM people a couple of years back: 
http://www.ovmj.org/www_papers/idioms.ps

Perhaps we can start a new thread with [exec] to discuss execution 
(interpretation and JIT).  I think it would be great if people like you 
were to discuss your experiences some more.  We may need separate 
[interp] and [jit] threads, but to start with there are a bunch of 
questions about how the two ideas relate, how they interact, the various 
pros and cons, so maybe we just start with the broader thread of [exec].

> and maybe give a try with 
>self-hosting. Steve, do you have some pointers about how jikesRVM or OVM 
>does that?
>
In a nutshell, Jikes RVM is first generation; it was the first to do
this. OVM came later, and from what I hear did a nicer job and
introduced greater flexibility.  BTW, I'm not quite sure what you mean 
by "self-hosting".  One interpretation is that you only need Jikes RVM 
to build Jikes RVM.  That is currently not true because of some 
relatively minor limitations in Jikes RVM.  So we depend on a third 
party VM to build (Sun, IBM, or kaffe).

Here are some of the original papers describing the java-in-java approach:
  http://jikesrvm.sourceforge.net/info/pubs.shtml#oopsla99_jvm
  http://jikesrvm.sourceforge.net/info/pubs.shtml#wcsss99

Note that JikesRVM has moved a fair way since then; however, the boot
image writing and the key java-in-java issues have not changed much at all.

I'd also suggest you take a look at the OVM sources:

http://www.ovmj.org/software.htm

With some luck some of the ovm'ers will chime in with something more 
helpful.

--Steve

Re: Work items

Posted by Rodrigo Kumpera <ku...@gmail.com>.
Lately I've been playing with a toy JITer written in Java (like the jikesRVM
baseline compiler) for x86 on Windows. It works in a single pass and performs
no optimizations at all, but the generated code is correct.
The parts missing are the hard ones: object allocation and exception
handling.

Next time I'm going to implement something in line with jikesRVM's vmmagic
(more like stealing the whole concept) and maybe give a try with 
self-hosting. Steve, do you have some pointers about how jikesRVM or OVM 
does that? 


Rodrigo


On 5/27/05, Steve Blackburn <St...@anu.edu.au> wrote:
> 
> Hi all,
> 
> . prototype backend generator [prototype]
> 
> Explore Ertl & Gregg's work to develop a "backend generator" which
> leverages the portability of gcj to automatically generate backends
> for a simple JIT. The semantics of Java bytecodes are expressed
> using Java code, and then gcj is used to generate code fragments
> which are then captured for use by a simple JIT (Ertl & Gregg used C
> and gcc, but with vmmagic support, Java and gcj would be nicer).
> See also
> 
> http://www.csc.uvic.ca/~csc586a/papers/ertlgregg04.pdf
> http://mail-archives.apache.org/mod_mbox/incubator-harmony-dev/200505.mbox/%3c20050523.084854.343163557.shudo@aist.go.jp%3e
> 
>