Posted to dev@harmony.apache.org by Mark Hindess <ma...@googlemail.com> on 2010/10/01 17:43:38 UTC

Optimizing jars

Our modular structure means we have many jars that need to be read at
startup; this is always going to give us a disadvantage in terms of
startup time compared to implementations with fewer jars.

I was reading:

  http://blog.mozilla.com/tglek/2010/09/14/firefox-4-jar-jar-jar/

and wondered if perhaps some of these techniques could be employed to
help us reduce the disadvantage a little.

So I took a federated build and tried:

  target/hdk/jdk/jre/bin/java -verbose:class -cp ~/hy/java HelloWorld \
    >trace.log 2>&1

  # create one log file per jar in trace subdir, one class filename per line
  class-load-by-jar trace.log

  cp -pr target target.new

  python optimizejars.py --optimize trace \
         target{,.new}/hdk/jdk/jre/lib/boot

  python optimizejars.py --optimize trace \
         target{,.new}/hdk/jdk/jre/bin/default

so target.new now contains optimised versions of the jars needed to run
HelloWorld.
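
As a rough illustration of the entry-reordering idea only (this is not
the actual optimizejars.py, whose internals are not shown here), the
following Python sketch rewrites a jar so that the entries named in a
load-order list come first.  The name reorder_jar and its parameters are
hypothetical, and META-INF/ entries are kept at the front on the
assumption that some jar readers expect the manifest early.

```python
import zipfile


def reorder_jar(src_path, dst_path, load_order):
    """Copy the jar at src_path to dst_path, placing loaded entries first.

    Hypothetical helper for illustration; load_order is a list of entry
    names such as "java/lang/Object.class".
    """
    with zipfile.ZipFile(src_path) as src:
        names = src.namelist()
        present = set(names)
        # Keep META-INF/ entries (including the manifest) at the front.
        meta = [n for n in names if n.startswith("META-INF/")]
        placed = set(meta)
        # Entries that were actually loaded, in load order, without duplicates.
        loaded = []
        for name in load_order:
            if name in present and name not in placed:
                loaded.append(name)
                placed.add(name)
        # Everything else keeps its original relative order.
        rest = [n for n in names if n not in placed]
        with zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
            for name in meta + loaded + rest:
                # Reusing the original ZipInfo keeps the entry's name,
                # timestamp and compression type.
                dst.writestr(src.getinfo(name), src.read(name))
```

The load_order list could be taken from one of the per-jar trace logs,
one entry name per line.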

I then obtained some timing data.

The figures I got back weren't really conclusive - on one machine they
showed an improvement but not on another.  I tried the same experiment
with a j9 vm but the zip implementation is obviously more fragile as
it crashed complaining about not finding java.lang.Object.  Ironically
given what I started thinking about, I suspect that the one big jar case
(e.g. the RI) will probably see more benefit than the multiple jar case.[1]

Anyway, I was just experimenting but I thought I'd write down what I
found in case anyone had similar/better ideas.

Regards,
 Mark.

[1] Since you can't really reduce the disk seeks between multiple jars but
    you can within a single big jar.

class-load-by-jar is:

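#!/usr/bin/perl
# Split a -verbose:class trace into one log file per jar under trace/,
# writing one class filename per line.
use strict;
use warnings;

my %fh;    # jar name => open file handle
mkdir 'trace', 0755;
while (<>) {
  # Matches both "Loaded X from ...jar" and "class load: X from: ...jar".
  my ($class, $jar) =
    m!(?:Loaded|class load:) (\S+) from:? .*\/([^\/\.]+\.jar)! or next;
  my $fh = file_handle($jar);
  print $fh $class, ".class\n";
}
# Close the file handles themselves (the hash values), not the jar names.
close $_ foreach (values %fh);

# Return the file handle for a jar's log, opening it on first use.
sub file_handle {
  my $jar = shift;
  return $fh{$jar} if (exists $fh{$jar});
  open $fh{$jar}, '>', 'trace/'.$jar.'.log'
    or die "Failed to open trace/$jar.log: $!\n";
  $fh{$jar};
}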
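
The per-jar logs written by the script above (trace/<jar>.log, one class
filename per line) can also be summarised to see which jars supplied the
most classes.  A minimal Python sketch, where rank_jars is a
hypothetical helper and the only assumption is the log layout just
described:

```python
import os


def rank_jars(trace_dir):
    """Return (jar name, class count) pairs, most classes first."""
    counts = {}
    for filename in os.listdir(trace_dir):
        if not filename.endswith(".log"):
            continue
        with open(os.path.join(trace_dir, filename)) as f:
            # One class filename per line; blank lines are ignored.
            jar = filename[:-len(".log")]
            counts[jar] = sum(1 for line in f if line.strip())
    # Sort by count descending, then by name so ties have a fixed order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Calling rank_jars("trace") after running class-load-by-jar gives a list
that could guide the order of jars on the boot classpath.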



Re: Optimizing jars

Posted by Mark Hindess <ma...@googlemail.com>.
In message <4C...@gmail.com>, Tim Ellison writes:
>
> On 01/Oct/2010 16:43, Mark Hindess wrote:
> >
> > Our modular structure means we have many jars that need to be read
> > at startup; this is always going to give us a disadvantage in terms
> > of startup time compared to implementations with fewer jars.
>
> Do you have some measurements?

Yes.

> Why is it always going to be a disadvantage?

For startup, I think so.  A significant proportion of the startup time
is spent reading jars and their manifests.  We read all the manifests of
all jars on our bootclasspath.
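
To get a rough feel for how much manifest data that involves, the
following Python sketch reads META-INF/MANIFEST.MF from every jar in a
directory and reports the total size and elapsed time.  It uses
Python's zipfile module rather than Harmony's own manifest code, so the
timing is only indicative; time_manifest_reads is a hypothetical name.

```python
import glob
import os
import time
import zipfile

MANIFEST = "META-INF/MANIFEST.MF"


def time_manifest_reads(jar_dir):
    """Read the manifest of every *.jar in jar_dir.

    Returns (total manifest bytes, elapsed seconds).  Jars without a
    manifest are skipped.
    """
    total_bytes = 0
    start = time.perf_counter()
    for path in sorted(glob.glob(os.path.join(jar_dir, "*.jar"))):
        with zipfile.ZipFile(path) as jar:
            if MANIFEST in jar.namelist():
                total_bytes += len(jar.read(MANIFEST))
    return total_bytes, time.perf_counter() - start
```

For example, time_manifest_reads("target/hdk/jdk/jre/lib/boot") would
cover the boot jars of the build described in this thread.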

However, if we were to make use of the OSGi manifests at runtime then
the class lookups would be more targeted and ultimately having multiple
jars may be advantageous for non-trivial programs.

> > I was reading:
> > 
> >   http://blog.mozilla.com/tglek/2010/09/14/firefox-4-jar-jar-jar/
> > 
> > and wondered if perhaps some of these techniques could be employed
> > to help us reduce the disadvantage a little.
> > 
> > So I took a federated build and tried:
> > 
> >   target/hdk/jdk/jre/bin/java -verbose:class -cp ~/hy/java HelloWorld \
> >     >trace.log 2>&1
> > 
> >   # create one log file per jar in trace subdir, one class filename per
> >   # line
> >   class-load-by-jar trace.log
> > 
> >   cp -pr target target.new
> > 
> >   python optimizejars.py --optimize trace \
> >          target{,.new}/hdk/jdk/jre/lib/boot
> > 
> >   python optimizejars.py --optimize trace \
> >          target{,.new}/hdk/jdk/jre/bin/default
> > 
> > so target.new now contains optimised versions of the jars needed to
> > run HelloWorld.
> > 
> > I then obtained some timing data.
> > 
> > The figures I got back weren't really conclusive - on one machine
> > they showed an improvement but not on another.
>
> What were you measuring, and what improvements did you see?

I was measuring HelloWorld execution time.  I re-ran my tests under
better conditions and I tried a few more options.

 Original: 1.317806 +/- 0.006%
     Trim: 1.295730 +/- 0.015%
  Reorder: 1.293020 +/- 0.029%
      One: 1.290560 +/- 0.016%
 Optimize: 1.295950 +/- 0.010%
FullOrder: 1.317230 +/- 0.017%

Original is the raw build.

Trim is the raw build with the bootclasspath.properties file edited to
exclude unnecessary jars.

Reorder is Trim but with the bootclasspath.properties file sorted so
that the jars with the most used classes come first.

One is the result of combining all of the used jars into one big jar and
having only that entry on the boot classpath.

Optimize is One with the Mozilla optimization applied.  (Yes, it makes
it slower!)

FullOrder is Original but with a sorted bootclasspath.properties file.
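
To put the figures above in proportion, this small Python sketch
computes each variant's improvement relative to Original.  It uses only
the mean values from the table; improvement_percent is a hypothetical
helper and the +/- figures are not taken into account.

```python
# Mean HelloWorld times as reported above (units as in the original figures).
times = {
    "Original": 1.317806,
    "Trim": 1.295730,
    "Reorder": 1.293020,
    "One": 1.290560,
    "Optimize": 1.295950,
    "FullOrder": 1.317230,
}


def improvement_percent(times, baseline="Original"):
    """Return the percentage reduction of each time relative to the baseline."""
    base = times[baseline]
    return {name: 100.0 * (base - t) / base for name, t in times.items()}


for name, pct in improvement_percent(times).items():
    print(f"{name:>9}: {pct:.2f}% faster than Original")
```

The largest gain, for One, comes to about 2%.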

> > I tried the same experiment with a j9 vm but the zip implementation
> > is obviously more fragile as it crashed complaining about not
> > finding java.lang.Object.  Ironically given what I started thinking
> > about, I suspect that the one big jar case (e.g. the RI) will
> > probably see more benefit than the multiple jar case.[1]
> > 
> > Anyway, I was just experimenting but I thought I'd write down what I
> > found in case anyone had similar/better ideas.
> > 
> > Regards,
> >  Mark.
> > 
> > [1] Since you can't really reduce the disk seeks between multiple
> >     jars but you can within a single big jar.
> 
> You mean we can't influence how close separate JAR files are on a
> disk, but we can influence the layout of an individual JAR file?

Yes.

> I'd like to see some numbers on where the time is actually being spent
> during start-up.


> Is it really in searching and seeking?  It's not at all clear to me
> that we can navigate the directory structure in the ZIP file format
> any faster than an OS can navigate the file system directories.

I'd say that searching/seeking isn't significant.  I'd speculate that
this was because our jar files - even if we combined them - are still
relatively small.

For my HelloWorld startup the hot methods seem to be:

 22.08% void java.util.jar.InitManifest.readValue()
  8.86% void java.util.zip.ZipEntry.<init>(java.util.zip.ZipEntry$LittleEndianReader, java.io.InputStream)
  7.53% java.nio.charset.CoderResult org.apache.harmony.niochar.charset.ISO_8859_1$Decoder.decodeLoop(java.nio.ByteBuffer, java.nio.CharBuffer)
  5.32% void java.util.jar.InitManifest.readName()
  3.70% int java.util.jar.Attributes$Name.hashCode()
  2.29% void java.util.jar.InitManifest.initEntries(java.util.Map, java.util.Map)
  1.75% int java.io.BufferedInputStream.read(byte[], int, int)
  1.70% java.lang.Object java.util.LinkedHashMap.putImpl(java.lang.Object, java.lang.Object)
  1.37% boolean org.apache.harmony.archive.util.Util.asciiEqualsIgnoreCase(byte[], byte[])
  1.25% void java.util.jar.InitManifest.decode(int, int, boolean)
  1.12% compileme.java/lang/Object.<init>()V
  1.04% boolean java.util.jar.Attributes$Name.equals(java.lang.Object)
  1.00% void java.util.zip.ZipEntry.myReadFully(java.io.InputStream, byte[])
  0.91% void java.util.LinkedHashMap.linkEntry(java.util.LinkedHashMap$LinkedHashMapEntry)
  0.83% java.util.HashMap$Entry java.util.LinkedHashMap.createHashedEntry(java.lang.Object, int, int)
  0.79% void java.util.zip.ZipFile.readCentralDir()
  0.75% void java.util.zip.ZipEntry.<init>(java.util.zip.ZipEntry)
  0.67% boolean java.util.jar.InitManifest.readHeader()
  0.58% byte[] java.util.jar.InitManifest.wrap(int, int)
  0.50% void java.util.LinkedHashMap$AbstractMapIterator.makeNext()
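
Adding up the rows attributed to java.util.jar.InitManifest gives a
rough sense of how much of the listed time is manifest parsing.  A small
worked sum in Python, using only the figures above (total_percent is a
hypothetical helper):

```python
# (percent, method) pairs copied from the InitManifest rows of the profile.
init_manifest = [
    (22.08, "readValue"),
    (5.32, "readName"),
    (2.29, "initEntries"),
    (1.25, "decode"),
    (0.67, "readHeader"),
    (0.58, "wrap"),
]


def total_percent(rows):
    """Sum the percentage column, rounded to two decimal places."""
    return round(sum(percent for percent, _ in rows), 2)


print(total_percent(init_manifest))
```

The InitManifest methods alone account for 32.19% of the profile, just
under a third, before counting the decoder and map calls they trigger.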

> (You will remember this discussion from 2008 and associated work, e.g.
> HARMONY-6002)

Will re-read that.

Regards,
 Mark.



Re: Optimizing jars

Posted by Tim Ellison <t....@gmail.com>.
On 01/Oct/2010 16:43, Mark Hindess wrote:
> Our modular structure means we have many jars that need to be read at
> startup; this is always going to give us a disadvantage in terms of
> startup time compared to implementations with fewer jars.

Do you have some measurements?  Why is it always going to be a disadvantage?

> I was reading:
> 
>   http://blog.mozilla.com/tglek/2010/09/14/firefox-4-jar-jar-jar/
> 
> and wondered if perhaps some of these techniques could be employed to
> help us reduce the disadvantage a little.
> 
> So I took a federated build and tried:
> 
>   target/hdk/jdk/jre/bin/java -verbose:class -cp ~/hy/java HelloWorld \
>     >trace.log 2>&1
> 
>   # create one log file per jar in trace subdir, one class filename per line
>   class-load-by-jar trace.log
> 
>   cp -pr target target.new
> 
>   python optimizejars.py --optimize trace \
>          target{,.new}/hdk/jdk/jre/lib/boot
> 
>   python optimizejars.py --optimize trace \
>          target{,.new}/hdk/jdk/jre/bin/default
> 
> so target.new now contains optimised versions of the jars needed to run
> HelloWorld.
> 
> I then obtained some timing data.
> 
> The figures I got back weren't really conclusive - on one machine they
> showed an improvement but not on another.

What were you measuring, and what improvements did you see?

> I tried the same experiment
> with a j9 vm but the zip implementation is obviously more fragile as
> it crashed complaining about not finding java.lang.Object.  Ironically
> given what I started thinking about, I suspect that the one big jar case
> (e.g. the RI) will probably see more benefit than the multiple jar case.[1]
> 
> Anyway, I was just experimenting but I thought I'd write down what I
> found in case anyone had similar/better ideas.
> 
> Regards,
>  Mark.
> 
> [1] Since you can't really reduce the disk seeks between multiple jars but
>     you can within a single big jar.

You mean we can't influence how close separate JAR files are on a disk,
but we can influence the layout of an individual JAR file?

I'd like to see some numbers on where the time is actually being spent
during start-up.

Is it really in searching and seeking?  It's not at all clear to me that
we can navigate the directory structure in the ZIP file format any
faster than an OS can navigate the file system directories.

(You will remember this discussion from 2008 and associated work, e.g.
HARMONY-6002)

Regards,
Tim