Posted to dev@harmony.apache.org by Mark Hindess <ma...@googlemail.com> on 2009/08/13 15:17:45 UTC

Re: [GSOC] The code for smallest class set for customer application in now on JIRA

In message <2c...@mail.gmail.com>,
Daniel Gong writes:
> 
> Hi all,
> I have my code attached in issue HARMONY-6291 on JIRA. I'd like to call it
> MinJre Toolkit.

Nice work Daniel.

A couple of comments . . .

When running on Linux I have to set jdk.dir (or JAVA_HOME), origin.dir
and target.dir.  It would be nice if I could just set one property
(assuming people are running with Harmony and if not they deserve to
have to set more properties ;-) and have the others default to something
sensible.  So something like:

  <property name="jdk.dir" location="${env.JAVA_HOME}" />
  <property name="origin.dir" location="${jdk.dir}/jre" />
  <property name="target.dir" location="${origin.dir}-min" />

You will also note that I changed the properties to use location="..."
rather than value="..." since location="..." converts them to a full
path.  Without this I had a problem because g++ was executed in a
different working directory, which breaks a relative path.

With these changes I can just do:

  JAVA_HOME=../target/hdk/jdk ant

and it uses Harmony to create a new minimal jre in "../target/hdk/jdk/jre-min".
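
To illustrate why the full path matters, here is a minimal sketch of the
resolution that location="..." performs, written in Python rather than
Ant.  The base directory /work/minjre is made up for the example, and
resolve_location is a hypothetical helper, not part of Ant or the toolkit.

```python
import posixpath

def resolve_location(path, basedir):
    """Anchor a relative path to the project base directory, as location= does.

    An absolute path is returned unchanged (apart from normalisation).
    """
    return posixpath.normpath(posixpath.join(basedir, path))

basedir = "/work/minjre"        # assumed directory containing build.xml
jdk_dir = "../target/hdk/jdk"   # the relative JAVA_HOME from the command above

# value= would keep the string as typed, so its meaning changes with the
# working directory of whatever process reads it later (g++ in my case).
as_value = jdk_dir

# location= gives an absolute path that stays valid in any working directory.
as_location = resolve_location(jdk_dir, basedir)
origin_dir = as_location + "/jre"
target_dir = origin_dir + "-min"

print(as_value)
print(as_location)
print(target_dir)
```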

In addition to the class lists, it would be nice if the cns file had some
information about why a class was required.  For instance, I'd like to know
why the static analysis decided that org.apache.bcel.generic.LoadClass was
required for Hello.class.
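
One possible way to record the "why" is to remember, for each class, the
class that first referenced it during the dependency walk.  The following
is only a sketch in Python: I do not know how the toolkit's analysis is
structured, and the dependency edges below are made up for illustration.

```python
from collections import deque

def explain_required(deps, roots):
    """Breadth-first walk of a class dependency graph.

    Returns a dict mapping each reachable class to the class that first
    referenced it (None for the root classes).
    """
    parent = {root: None for root in roots}
    queue = deque(roots)
    while queue:
        cls = queue.popleft()
        for ref in deps.get(cls, ()):
            if ref not in parent:
                parent[ref] = cls
                queue.append(ref)
    return parent

def chain(parent, cls):
    """Follow the recorded references back from cls to a root class."""
    path = []
    while cls is not None:
        path.append(cls)
        cls = parent[cls]
    return list(reversed(path))

# Made-up dependency edges, not the real output of any analysis.
deps = {
    "Hello": ["java.lang.Object", "java.lang.System"],
    "java.lang.System": ["java.io.PrintStream"],
    "java.io.PrintStream": ["java.lang.String"],
}
parent = explain_required(deps, ["Hello"])
print(" -> ".join(chain(parent, "java.lang.String")))
```

Writing one such chain per class next to the class list would answer
the question for any entry.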

Similarly, it would be nice to have a log file created that shows why
each item in jre-min was copied from the jre.  The default should be to
copy almost nothing but the launcher and justify everything else you
copy.  I think this is a good approach since it would avoid copying:

1) Artifacts for which there is no justification.  For example,
security-kernel-stubs.jar which is a jar of empty method stubs used
for satisfying dependencies during compilation.

2) Artifacts for which the justification is dependent on another artifact.
For example, without awt.jar there is no point in having the awt artifacts
such as:

  bin/libFL.so
  bin/libgl.so
  bin/libjpegdecoder.so
  bin/liblcmm.so
  bin/liblinuxfont.so
  bin/liboglwrapper.so
  bin/libX11Wrapper.so
  lib/fonts
  lib/cmm

Similarly for the DLLs for natives corresponding to jars.  You'd also
probably not end up with empty directories such as lib/boot/yoko-1.0 or
manifests with no corresponding jar such as lib/boot/asm-3.1.
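
The "justify everything you copy" idea could look roughly like the
sketch below.  It is Python rather than the toolkit's own code, the
mapping from artifact to justifying jar is an assumption that would have
to come from the Harmony build, and libexample.so / example.jar are
placeholder names.

```python
def select_artifacts(candidates, kept_jars):
    """Decide which artifacts to copy and record a reason for each decision.

    candidates maps an artifact path to the jar that justifies it, or to
    None when nothing justifies it.
    """
    copied, log = [], []
    for path, needs in sorted(candidates.items()):
        if needs is None:
            log.append(f"skip {path}: no justification")
        elif needs in kept_jars:
            copied.append(path)
            log.append(f"copy {path}: required by {needs}")
        else:
            log.append(f"skip {path}: {needs} is not in the minimal jre")
    return copied, log

# Assumed mapping for illustration only.
candidates = {
    "bin/libFL.so": "awt.jar",
    "lib/fonts": "awt.jar",
    "lib/boot/security-kernel-stubs.jar": None,
    "bin/libexample.so": "example.jar",   # placeholder names
}
copied, log = select_artifacts(candidates, kept_jars={"example.jar"})
print("\n".join(log))
```

The log lines double as the build log: every copied item carries its
reason, and every skipped item says what was missing.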


Since it is simple to automate, you might want to comment out any jars
you remove from the bootclasspath.properties file to avoid the VM having
to look for them at all.
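
A sketch of that automation, in Python.  I am assuming only that the
file consists of key=value lines whose value names a jar and that '#'
starts a comment; the keys in the sample are invented and are not the
real contents of bootclasspath.properties.

```python
def comment_out_removed(lines, removed_jars):
    """Prefix '#' to every key=value line whose value names a removed jar."""
    out = []
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            value = stripped.split("=", 1)[1].strip()
            # Compare on the file name only, so a value with a directory matches too.
            jar_name = value.replace("\\", "/").rsplit("/", 1)[-1]
            if jar_name in removed_jars:
                out.append("#" + line)
                continue
        out.append(line)
    return out

# Invented sample lines; the key names are placeholders.
sample = [
    "# boot class path",
    "example.1=beans.jar",
    "example.2=luni.jar",
    "example.3=bcel-5.2/bcel-5.2.jar",
]
result = comment_out_removed(sample, {"beans.jar", "bcel-5.2.jar"})
print("\n".join(result))
```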


As an experiment, I took the jre-min for the example Hello class and
removed the unused DLLs, cmm, font and jar files[0].  Excluding the jvm
in jre/bin/default, that reduced the size from 19.4k to 10.4k, which is
53.7% of the original minimal jre size.

Taking the -verbose:class output, I then removed the classes from the
jars that were not listed in the -verbose:class output.  This reduced
the size to 8.6k, which is 44.2% of the original minimal jre size.  This
is totally crazy for a real-world application but gives some idea of the
absolute minimal boot class set.
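
The jar pruning step can be sketched as follows, in Python.  This is not
the script I used; it assumes the class names have already been
extracted from the -verbose:class output (the line format is VM-specific,
so that part is left out), and the entries in the demo jar are made up.

```python
import io
import zipfile

def prune_jar(src, dst, loaded_classes):
    """Copy the jar src to dst, dropping .class entries that were never loaded.

    Non-class entries (manifest, resources) are copied unchanged.
    Returns the names of the entries that were kept.
    """
    keep = {name.replace(".", "/") + ".class" for name in loaded_classes}
    kept = []
    with zipfile.ZipFile(src) as zin, \
         zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            if info.filename.endswith(".class") and info.filename not in keep:
                continue
            zout.writestr(info, zin.read(info.filename))
            kept.append(info.filename)
    return kept

# Build a tiny in-memory jar with made-up entries to exercise the function.
src = io.BytesIO()
with zipfile.ZipFile(src, "w") as z:
    z.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    z.writestr("java/lang/Object.class", b"")
    z.writestr("example/Unused.class", b"")

dst = io.BytesIO()
kept = prune_jar(src, dst, ["java.lang.Object"])
print(kept)
```

A class that is only loaded on some code path not exercised by the run
would be dropped as well, which is why this is unsafe for a real
application.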

Thanks again for your interesting work.

Regards,
 Mark.

[0] I basically wrote a script which did 'chmod 0' on each file, ran
the Hello class, then either removed the file or reverted the chmod
depending on the success or otherwise of the Hello run.  The list of
files I removed is:

bin/libaccessors.so
bin/libFL.so
bin/libgl.so
bin/libhyauth.so
bin/libhyinstrument.so
bin/libhyniochar.so[1]
bin/libhysecurity.so
bin/libjpegdecoder.so
bin/libjpegencoder.so
bin/liblcmm.so
bin/liblinuxfont.so
bin/liboglwrapper.so
bin/libpngencoder.so
bin/libX11Wrapper.so
lib/boot/archive.jar
lib/boot/asm-3.1/META-INF/MANIFEST.MF
lib/boot/auth.jar
lib/boot/bcel-5.2/bcel-5.2.jar
lib/boot/beans.jar
lib/boot/concurrent.jar
lib/boot/crypto.jar
lib/boot/instrument.jar
lib/boot/lang-management.jar
lib/boot/logging.jar
lib/boot/math.jar
lib/boot/misc.jar
lib/boot/mx4j_3.0.2/META-INF/MANIFEST.MF
lib/boot/mx4j_3.0.2/mx4j.jar
lib/boot/mx4j_3.0.2/mx4j-remote.jar
lib/boot/regex.jar
lib/boot/rmi.jar
lib/boot/security-kernel-stubs.jar
lib/boot/suncompat.jar
lib/boot/text.jar
lib/boot/xalan-j_2.7.0/META-INF/MANIFEST.MF
lib/boot/xerces_2.9.1/META-INF/MANIFEST.MF
lib/boot/xerces_2.9.1/xml-apis.jar
lib/cmm/CIEXYZ.pf
lib/cmm/GRAY.pf
lib/cmm/LINEAR_RGB.pf
lib/cmm/sRGB.pf
lib/fonts/DejaVuSans-BoldOblique.ttf
lib/fonts/DejaVuSans-Bold.ttf
lib/fonts/DejaVuSans-Oblique.ttf
lib/fonts/DejaVuSans.ttf
lib/fonts/DejaVuSerif-BoldItalic.ttf
lib/fonts/DejaVuSerif-Bold.ttf
lib/fonts/DejaVuSerif-Italic.ttf
lib/fonts/DejaVuSerif.ttf
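
For anyone wanting to repeat the experiment, the probing loop in [0] can
be sketched like this.  It is a Python rendering of the idea, not the
script I actually ran; still_works is a hypothetical callable that runs
the application and reports success.  Note that 'chmod 0' does not hide
a file from root, so this needs to run as an ordinary user.

```python
import os
import stat

def probe_removable(files, still_works):
    """Return the files that can be removed without breaking the application.

    still_works is a callable that runs the application and returns True
    on success.
    """
    removed = []
    for path in files:
        mode = stat.S_IMODE(os.stat(path).st_mode)
        os.chmod(path, 0)            # make the file inaccessible
        if still_works():
            os.remove(path)          # the run succeeded without it: delete
            removed.append(path)
        else:
            os.chmod(path, mode)     # the run failed: restore the permissions
    return removed

# Hypothetical usage; the launcher path and file list are placeholders:
#
#   import subprocess
#   run = lambda: subprocess.run(["jre-min/bin/java", "Hello"]).returncode == 0
#   print(probe_removable(candidate_files, run))
```

Because each file is tested with the earlier removals already in place,
the result depends on the order in which the files are tried.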

[1] Removing libhyniochar.so means that the ICU4J implementation is used
for the charset providers (rather than the native version) which may not
be a good idea for all applications.

[2] Using an IBM VME you can also remove:

bin/libhyarchive.so
lib/boot/annotation.jar

If you remove these with DRLVM, then you get a SIGABRT crash dump.  I
can't help wondering why we don't handle these errors a little more
gracefully.



Re: [GSOC] The code for smallest class set for customer application in now on JIRA

Posted by Daniel Gong <da...@gmail.com>.
Thank you Mark.
It is true that what the current version of MinJreToolkit produces is not a
true minimum, and I believe it is not always possible to cut a jre down to
the minimum because of the possible dynamic behavior of a Java application.

However, in my opinion, the MinJreToolkit is designed to analyze a custom
application automatically in several different ways and integrate the
results into the final result. The current version of MinJreToolkit defines
the format of the result and contains two tools which can reduce the size of
the jre markedly, but it's still not enough. The current version is for GSoC,
and of course I will keep improving it.

I've also tried the same kind of experiments as you did. The problem is how
to find automatic algorithms that detect the dependencies, rather than
analyzing with a large configuration file specific to a certain
implementation of the jre. In my future plan, I will try to find such
algorithms and optimize the existing algorithm to reduce the size of the jre
further. Moreover, your comments remind me that I should offer a build log
for those who want to know what classes and files are kept in the newly
generated jre and why. The cns file is for the jre generator, not for the
user, so I think it will be better to place this information in a log file.

Thanks very much again for all your comments! By the way, I like the word
"interesting" you use to describe my work:) It is really an interesting
project and a challenge for me, and I've learnt a lot:)
