Posted to dev@harmony.apache.org by Doug Lea <dl...@cs.oswego.edu> on 2005/05/22 17:04:07 UTC

Harmonizing on modularity

No matter whether you think you are starting with a JVM written in
Java or a micro-kernel-ish one in C (which seem to be the leading
options), you will probably discover that you end up writing most of
it in Java.  For just about every major subsystem, you will find that
some of it has to be in Java anyway, and most of it turns out to be
better to write in Java.  Steve Blackburn mentioned some of the
technical issues with respect to GC. Similar ones arise with
concurrency support, reflection, IO, verification, code generation,
and so on, as has been discovered by people developing both
research and commercial JVMs.

One of the challenges here is that the Java programming language
currently does not make it possible to distinguish those classes
sitting in the JRE library (usually, the stuff in "rt.jar") that are
logically part of the JVM from the normal APIs that normal Java
programmers are supposed to use. (Most existing JVMs dynamically
enforce inaccessibility of some APIs by exploiting rules about
bootclasspaths in some special cases, but this is not a general
solution.)
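
As a rough illustration of the bootclasspath trick, and not a
description of any particular JVM's code: on common implementations,
Class.getClassLoader() returns null for classes defined by the
bootstrap loader, so a library can refuse callers that were not loaded
from the boot class path. The names below (BootClassCheck,
isBootstrapClass, requireBootstrapCaller) are made up for this sketch.

```java
// Sketch only: a guard that admits bootstrap-loaded callers and rejects others.
// All names here are hypothetical; no existing JVM API is being reproduced.
final class BootClassCheck {
    private BootClassCheck() {}

    /** True if the class was defined by the bootstrap class loader. */
    static boolean isBootstrapClass(Class<?> c) {
        // Assumption: this implementation represents the bootstrap loader as
        // null, which the Class.getClassLoader() contract permits.
        return c.getClassLoader() == null;
    }

    /** Throws SecurityException unless the caller came from the boot class path. */
    static void requireBootstrapCaller(Class<?> caller) {
        if (!isBootstrapClass(caller)) {
            throw new SecurityException(
                caller.getName() + " is not part of the boot class path");
        }
    }

    public static void main(String[] args) {
        // A core class versus an ordinary application class.
        System.out.println(isBootstrapClass(String.class));
        System.out.println(isBootstrapClass(BootClassCheck.class));
    }
}
```

The check works per class loader, not per API, which is one way to see
why it is not a general solution: everything on the boot class path is
trusted equally, and nothing finer-grained can be expressed.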

As one consequence, some people have been led to make all sorts of
now-well-known complaints about developers programming to the
platform-specific APIs of existing JVMs. This must be avoided in
Harmony.

Independently, there has been a lot of discussion lately of possible
language enhancements resulting in some kind of module support for
J2SE 7.0 (yes, the one after Mustang) to provide a more general
solution to the need for semi-private interfaces among subsystem-level
components, to similar issues that arise in layered middleware, and to
higher-level escape-from-jar-hell issues.  Some people would like to
see a first-class module system (see for example
MJ http://www.research.ibm.com/people/d/dgrove/papers/oopsla03.html,
as well as similar work at Utah http://www.cs.utah.edu/flux/). Some
people would like something more like a saner version of C++
"friends". I don't think that any of these have been subjected to
enough thought and empirical experience to make a decision about which
way to go.

So it seems to me that this would be a good opportunity to harmonize:
Help work on possible forms of language-based module support, work
within JCP and with other JVM providers to eventually propel this into
the standard, and help shape the basic structure of a second/third
generation JVM.

(Digression: The issues underlying all of this have long been a big
concern to me, but I spent a long time fruitlessly dealing with them
the wrong way. JVM providers don't want to standardize on the contents
of the APIs for intrinsics etc. But they should be pleased to
standardize on the way in which they are expressed.)


-Doug

Re: Harmonizing on modularity

Posted by Jakob Praher <jp...@yahoo.de>.
Hi Renaud,

Renaud BECHADE wrote:
> >I think this discussion soon gets into a java language/system debate,
> >because one could argue why we need to do this tight bundling between
> >the bunch of classes in rt.jar and the vm version. For instance: Why do
> >I have to wait for JVM 6 to fix that bug in Swing, which I need now in
> >my implementation. On the other hand this "expected behavior" is what
> >makes Java very appealing to integrators. ....
> 
> You are kind of pinpointing subtle incompatibilities that /will/ exist and
> require some packaging effort to get users to, well... use the VM with an
> ergonomic perception ("it just works"). If we consider an OS analogy, this
> is just like FreeBSD vs. Linux: both are POSIX and you should not have to
> patch code to have it run on the other (you can even run Linux ELF code on
> FreeBSD), but in practice some adaptations are required (for instance on my
> machines the Sun-Linux-JDK crashes with great ease...). Hence FreeBSD has
> its own ports system.

Sure, these are minor issues. But on the other hand Java is more than an
operating system. In current operating systems, the developer works with
3rd party libs most of the time. The only direct access to POSIX stuff
is through core libraries which don't change much. The Java runtime
packages many things - from java.lang.String to Swing to Xalan and
Xerces. So I wouldn't say that rt.jar makes the Java runtime - rt.jar is
just a packaging model suitable to Sun. One could take other directions
on that. But to remain compatible (and thus be competitive) one has
to make sure that it works like in rt.jar. Java's way of binary
compatibility makes this easier than it is in C/C++ for instance.

-- Jakob


Re: Harmonizing on modularity

Posted by "Richard S. Hall" <he...@ungoverned.org>.
These issues were touched on in a prior thread named "Class Library
Modularity".

I think this is a very interesting area. A lot of this type of 
modularity stuff has been investigated in the OSGi framework for quite 
some time. It is possible to go a long way toward decent modularity just 
by defining a sane class loader runtime, which is the OSGi framework 
approach.

I am the implementer of Oscar, an open source OSGi framework 
implementation. I have been experimenting with capabilities that improve 
the current OSGi v3 capabilities for dealing with versions and the 
sharing of implementation as well as specification packages.

I have just coded (and will make available this week) features that 
allow a JAR file to declare exported package "filters" so that it is 
possible for a JAR file to declare the visibility of certain classes in 
its exported packages, while maintaining full visibility within the JAR 
file. Check back at the following link later this week for details:

    http://oscar.objectweb.org/oscar-alpha.html
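
To make the idea of an exported-package "filter" a little more
concrete, here is a minimal sketch. It is not Oscar's or OSGi's actual
API: the class ExportFilter, its constructor arguments and the
prefix-based hiding rule are all assumptions chosen for illustration.
It only shows the visibility decision a class loader could make:
classes inside the JAR see everything, while outside requesters see an
exported package minus the classes the filter hides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an exported-package filter; names and rules are
// illustrative assumptions, not taken from Oscar or the OSGi specification.
final class ExportFilter {
    private final String exportedPackage;
    private final List<String> hiddenPrefixes;

    /**
     * @param exportedPackage     the one package this JAR exports
     * @param hiddenClassPrefixes simple-class-name prefixes hidden from outsiders
     */
    ExportFilter(String exportedPackage, List<String> hiddenClassPrefixes) {
        this.exportedPackage = exportedPackage;
        this.hiddenPrefixes = new ArrayList<String>(hiddenClassPrefixes);
    }

    /** Decides whether a fully qualified class name is visible to the requester. */
    boolean isVisible(String className, boolean requesterInsideJar) {
        if (requesterInsideJar) {
            return true; // full visibility is kept within the JAR file
        }
        int dot = className.lastIndexOf('.');
        String pkg = dot < 0 ? "" : className.substring(0, dot);
        String simpleName = className.substring(dot + 1);
        if (!pkg.equals(exportedPackage)) {
            return false; // the package is not exported at all
        }
        for (String prefix : hiddenPrefixes) {
            if (simpleName.startsWith(prefix)) {
                return false; // exported package, but this class is filtered out
            }
        }
        return true;
    }
}
```

With a filter for a made-up package "org.example.vm" that hides the
prefix "Impl", an outside requester would see
org.example.vm.Scheduler but not org.example.vm.ImplScheduler, while
code inside the JAR would see both.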

This is some pretty interesting stuff and does not require any changes 
to the Java language. I envision that it would be possible to use such 
an approach to bundle up various JVM class libraries to not only hide 
their implementation APIs, but to also make them independently 
deployable. Starting a JVM "from scratch" would make it pretty easy to 
follow the proper rules right from the beginning.

There is a lot that can be done following this approach. Future OSGi 
specifications may have more to say about these kinds of issues too.

-> richard


RE: Harmonizing on modularity

Posted by Renaud BECHADE <re...@numerix.com>.
>I think this discussion soon gets into a java language/system debate,
>because one could argue why we need to do this tight bundling between
>the bunch of classes in rt.jar and the vm version. For instance: Why do
>I have to wait for JVM 6 to fix that bug in Swing, which I need now in
>my implementation. On the other hand this "expected behavior" is what
>makes Java very appealing to integrators. ....

You are kind of pinpointing subtle incompatibilities that /will/ exist and
require some packaging effort to get users to, well... use the VM with an
ergonomic perception ("it just works"). If we consider an OS analogy, this
is just like FreeBSD vs. Linux: both are POSIX and you should not have to
patch code to have it run on the other (you can even run Linux ELF code on
FreeBSD), but in practice some adaptations are required (for instance on my
machines the Sun-Linux-JDK crashes with great ease...). Hence FreeBSD has
its own ports system.

RB

-----Original Message-----
From: news [mailto:news@sea.gmane.org] On Behalf Of Jakob Praher
Sent: Monday, May 23, 2005 1:04 AM
To: harmony-dev@incubator.apache.org
Subject: Re: Harmonizing on modularity



Re: Harmonizing on modularity

Posted by Jakob Praher <jp...@yahoo.de>.
Hi Doug,

thanks for joining the discussion.

Doug Lea wrote:
> 
> No matter whether you think you are starting with a JVM written in
> Java or a micro-kernel-ish one in C (which seem to be the leading
> options), you will probably discover that you end up writing most of
> it in Java.  
I think that a SystemJava dialect, like in Jikes (where the compiler
does the magic), is very interesting. I was and am a big Self/Smalltalk
fan, where this debate is much larger, but I also think writing the core
in C/C++ has some interesting advantages too: some kind of language
agnosticism and the ability to export some features more easily to
non-Java systems. And when it comes to address manipulation, I think C
is much more handy. [this is just my opinion]

As I said the most interesting part is that of a lower level common
infrastructure

a) to extend the vm more safely
b) to build bridges to other runtimes (parley)
c) to make pluggable extensions.

Implementing a middle layer in Java makes a lot of sense to me.

> For just about every major subsystem, you will find that
> some of it has to be in Java anyway, 
I would agree with you, but I wouldn't call it Java here, because it
is mostly a very restricted sub-dialect of Java ("no" method
invocation costs, etc.).

> and most of it turns out to be
> better to write in Java.  Steve Blackburn mentioned some of the
> technical issues with respect to GC. Similar ones arise with
> concurrency support, reflection, IO, verification, code generation,
> and so on, as has been discovered by people developing both
> research and commercial JVMs.

What are your results from the OVM project?

Probably true. But I would favor:

* A system language for the very low-level infrastructure, to express
raw access (pointers, explicit memory layout) to objects for the rare
high-performance uses.

* Special Java (JikesRVM-like) to provide a low-level interface that
abstracts away much of the very low-level stuff (implemented on top
of/in terms of that)
	-> GC details
Perhaps even restrict much of Java's dynamic behavior for that.
Having multiple linkages would be interesting for that (implemented via
attributes for instance).

* Java for much of the rest

AFAIK working with LLVM is also a joy (it is written in C++) and this
has many advantages too.

> 
> One of the challenges here is that the Java programming language
> currently does not make it possible to distinguish those classes
> sitting in the JRE library (usually, the stuff in "rt.jar") that are
> logically part of the JVM vs the normal APIs that normal Java
> programmers are supposed to use. (Although most existing JVMs
> dynamically enforce inaccessibility of some APIs by exploiting rules
> about bootclasspaths in some special cases, but this is not a general
> solution.)
Good point. I would see it a bit differently. One has to question which
classes are "system" and which are not. rt.jar has kind of grown out of
Sun's implementation. I think they have chosen this big rt.jar because
you can mmap it and get faster access to all the "core" Java APIs. I
think one should make a distinction between classes that are closer to
the VM (for instance AtomicXxx would be of that kind) and others that
are not (Xerces, Xalan classes). rt.jar is much too coarse-grained for
that.

But making system classes cross-VM implies that there exists some
special API for that, which is probably hard to achieve (as you said
below). But having such system classes would at least make explicit
treatment of special classes much easier.

I think this discussion soon gets into a java language/system debate,
because one could argue why we need to do this tight bundling between
the bunch of classes in rt.jar and the vm version. For instance: Why do
I have to wait for JVM 6 to fix that bug in Swing, which I need now in
my implementation. On the other hand this "expected behavior" is what
makes Java very appealing to integrators. ....

> 
> Independently, there has been a lot of discussion lately of possible
> language enhancements resulting in some kind of module support for
> J2SE 7.0 (Yes, the one after Mustang) to provide a more general
> solution to the need for semi-private interfaces among subsystem-level
> components, as well as for similar issues that arise in layered
> middleware, as well as higher-level escape-from-jar-hell issues.  
A powerful friend-like approach would be very cool. I would like to join
the discussion on that.

> Some
> people would like to see a first class module system (see for example
> MJ http://www.research.ibm.com/people/d/dgrove/papers/oopsla03.html,
> as well as similar work at Utah http://www.cs.utah.edu/flux/) Some
> people would like something more like a saner version of C++
> "friends". I don't think that any of these have been subjected to
> enough thought and empirical experience to make a decision about which
> way to go.
:-) - I started a modjava project at sf.net (modjava.sf.net) some
time ago, but doing it without the power of the VM was not that
interesting to me. (Security issues, and not being able to be faster
than a jar-based approach, were kind of frustrating.) Also I haven't
found that momentum in the open source communities.
> 
> So it seems to me that this would be a good opportunity to harmonize:
> Help work on possible forms of language-based module support, work
> within JCP and with other JVM providers to eventually propel this into
> the standard, and help shape the basic structure of a second/third
> generation JVM.
Good advice. Looking forward to a JSR on that.
> 
> (Digression: The issues underlying all of this have long been a big
> concern to me, but I spent a long time fruitlessly dealing with them
> the wrong way. JVM providers don't want to standardize on the contents
> of the APIs for intrinsics etc. But they should be pleased to
> standardize on the way in which they are expressed.)
Having a standardized middle-layer IR, with a well-defined low-level
type system and an export format to store and transform the IR, would
be a big step forward for me. Every advanced adaptable VM compiler does
a transformation of Java bytecodes to at least a MIR before generating
processor code. If the format were standardized (like some kind of LLVM)
this would be *very* interesting. Also, they (LLVM) have the ability to
implement intrinsics via shared objects to extend the IR without
introducing new opcodes.

-- Jakob