Posted to dev@mahout.apache.org by Robin Anil <ro...@gmail.com> on 2009/09/04 13:00:04 UTC

About Javac compiler optimization

I saw that the maven compile:compile plugin didn't have optimize=true
enabled for Mahout, so I ran an experiment. Forget about the machine
specs; both runs were on the same system.
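For reference, the change is one line in the compiler plugin configuration,
something like this (a sketch, not a paste from our pom; the source/target
elements are shown only for context):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <configuration>
    <source>1.6</source>
    <target>1.6</target>
    <optimize>true</optimize>
  </configuration>
</plugin>
```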

Before enabling optimize:
Compile Time
Core: 50 secs
Examples: 10 secs
FPGrowth  50K transactions with minSupport 20%
Time taken: 1:40 min

After enabling optimize:
Compile Time
Core: 1:08 min
Examples: 20 secs
FPGrowth  50K transactions with minSupport 20%
Time taken: 1:28 min

There seems to be some improvement in run time (1:40 down to 1:28), though compile time went up.


Any thoughts or comments on this? Or on Java optimizations in general?

Robin

Wishing Java had a -O3

Re: About Javac compiler optimization

Posted by Ted Dunning <te...@gmail.com>.
If only it were just a few seconds. In my experience, really exploring
optimization settings takes hours of scanning the parameter space.

That same amount of time could easily double the speed of our sparse
vectors.

On Fri, Sep 4, 2009 at 11:31 AM, Sean Owen <sr...@gmail.com> wrote:

> 10% is certainly nothing to
> sneeze at IMHO, for a couple seconds of change to a script. Right?
>



-- 
Ted Dunning, CTO
DeepDyve

Re: About Javac compiler optimization

Posted by Sean Owen <sr...@gmail.com>.
Agree. Just don't want this to lead to the conclusion we *shouldn't*
somehow help recommend good JVM settings or compile-time optimization.
And I don't think, in the end, anyone is saying that.

On Sat, Sep 5, 2009 at 12:28 AM, Ted Dunning<te...@gmail.com> wrote:
> I am sure that our Sparse Vector stuff is at least twice as slow as it
> should be.
>
> My preliminary experiments indicate that we should be able to get within
> about a factor of 2 of the speed of a DenseVector (ish).  I am pretty sure
> our current SparseVector is a long ways from there.  Shashi's experience
> indicates that replacing SV can give us a near doubling of overall
> performance in some cases.

Re: About Javac compiler optimization

Posted by Ted Dunning <te...@gmail.com>.
I am sure that our Sparse Vector stuff is at least twice as slow as it
should be.

My preliminary experiments indicate that we should be able to get within
about a factor of 2 of the speed of a DenseVector (ish).  I am pretty sure
our current SparseVector is a long ways from there.  Shashi's experience
indicates that replacing SV can give us a near doubling of overall
performance in some cases.

Remember how much difference the distance optimization made?  By using
sparsity well, we can probably get 100x speedup in k-means for some datasets
because the distance computation uses cached values for the squared
magnitude of the centroids and only iterates over the non-zero data values.
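The trick is the expansion ||x - c||^2 = ||x||^2 - 2(x . c) + ||c||^2: cache
||c||^2 once per centroid and you only ever touch the non-zeros of x. A rough
sketch (made-up names, not Mahout's actual API):

```java
// Sketch of the cached-norm distance trick.  Only the non-zero entries of
// the sparse point x are visited; the centroid's squared magnitude is
// computed once and reused for every point assigned in that iteration.
public class SparseDistance {
    static double sqDistance(int[] indices, double[] values,
                             double[] centroid, double centroidSqMag) {
        double xSq = 0.0;
        double dot = 0.0;
        for (int i = 0; i < indices.length; i++) {
            double v = values[i];
            xSq += v * v;                    // accumulate ||x||^2
            dot += v * centroid[indices[i]]; // accumulate x . c on non-zeros only
        }
        return xSq - 2.0 * dot + centroidSqMag;
    }

    public static void main(String[] args) {
        // x = (1, 0, 2) stored sparsely; centroid c = (1, 1, 1), so ||c||^2 = 3
        double d = sqDistance(new int[] {0, 2}, new double[] {1.0, 2.0},
                              new double[] {1.0, 1.0, 1.0}, 3.0);
        System.out.println(d); // (1-1)^2 + (0-1)^2 + (2-1)^2 = 2.0
    }
}
```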

I also expect that there *has* to be code that we have that conses way too
much.

And I am sure you remember how much difference it makes to have better data
storage in Taste.  For many applications, we can go further and have binary
sparse matrices.  These don't even need to store the data because if we know
a value is non-zero, we also know its value.  With integer compression and
no data storage cost, we might be able to store 10x more data in memory and
get an astronomical speed up for large data sets.
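Something like this, that is (a sketch with made-up names, not a proposal for
the actual API): a binary sparse row is just its sorted index array, and a dot
product degenerates to an intersection count.

```java
// Sketch of a binary sparse vector: store only the sorted indices of the
// non-zero entries.  Since every stored value is known to be 1, a dot
// product is just the size of the intersection of two index arrays.
public class BinarySparse {
    static int dot(int[] a, int[] b) { // both arrays sorted ascending
        int i = 0, j = 0, count = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) { count++; i++; j++; }
            else if (a[i] < b[j]) { i++; }
            else { j++; }
        }
        return count;
    }

    public static void main(String[] args) {
        // overlap at indices 3 and 5
        System.out.println(dot(new int[] {1, 3, 5}, new int[] {3, 5, 7})); // 2
    }
}
```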

And then there are on-line algorithms like vowpal.  Using these often
results in many orders of magnitude speedup (5 orders is not impossible for
some problems).  Even if the number of operations for an on-line algorithm
is twice as large, the effect of passing through the data completely
sequentially can make four orders of magnitude difference in speed.

So, my answer to this is that I can't imagine that we don't have 2-10x
speedups in just the low-hanging fruit.

On Fri, Sep 4, 2009 at 3:54 PM, Sean Owen <sr...@gmail.com> wrote:

>
> (Do you really think some implementations are twice as slow as they
> need be? can't tell if that's good or bad news. )




-- 
Ted Dunning, CTO
DeepDyve

Re: About Javac compiler optimization

Posted by Sean Owen <sr...@gmail.com>.
-server is the default on "server"-class machines. But as you point
out, even "non-server" machines are pretty powerful. The server
defaults are appropriate, so I think it is right to select -server
explicitly regardless.

I agree it's hard to figure out the right GC settings. That is all the
more reason to provide a good starting point. For example, the young
generation is by default about 25% of the heap. The CF code needs little
memory for young, transient objects, so with the default settings you
waste most of that 25% of the heap. When your heap is even
2GB, that's huge. I can't imagine not telling people to set NewRatio
high. (Whether it's 8, or 10, sure, up to your testing.)
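To put numbers on that: NewRatio is the ratio of old generation to young
generation, so the young generation gets heap / (NewRatio + 1). An illustrative
sketch of the arithmetic (not real tooling):

```java
// The arithmetic behind -XX:NewRatio: it is the old/young size ratio,
// so the young generation's share of the heap is heap / (NewRatio + 1).
public class NewRatioMath {
    static long youngGenBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    public static void main(String[] args) {
        long heap = 2L * 1024 * 1024 * 1024; // a 2GB heap
        System.out.println(youngGenBytes(heap, 3)); // 25% of the heap
        System.out.println(youngGenBytes(heap, 9)); // 10% with NewRatio=9
    }
}
```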


I think writing tight code and configuring the JVM are separate things
and both are important. We can write the best code ever but the JVM
behavior will still be vital to performance. It's not a tradeoff; we
need to do well on both fronts.


(Do you really think some implementations are twice as slow as they
need be? can't tell if that's good or bad news. )


On Fri, Sep 4, 2009 at 10:47 PM, Grant Ingersoll<gs...@apache.org> wrote:
> Of course, but it often becomes a black hole figuring out how all those
> parameters work and which ones help and hurt.  Not to say you shouldn't do
> it and that it isn't welcome, just that it is often hard to give general
> recs for those things given all the platforms that Java runs on.  Instead,
> I'd suggest we could give resources to people on how they can determine
> them.

Re: About Javac compiler optimization

Posted by Grant Ingersoll <gs...@apache.org>.
On Sep 4, 2009, at 1:31 PM, Sean Owen wrote:

> (just to make sure everyone's on the same page, given the title of
> this thread -- these are not javac optimizations)
>
> I do think it's worth cataloging recommended JVM flags for runtime
> use. My current recommendations are:
>
> -server
> -da -dsa (I think these are redundant since assertions are off by
> default but...)
> -XX:NewRatio=9 (for CF code this tunes for more long-lived objects)
> -XX:+UseParallelGC -XX:+UseParallelOldGC (nice win for  
> multiprocessor machines)

But even -server is redundant on server class machines, which even  
some desktop machines with a few cores qualify as these days.  IMO,  
garbage collection parameters are best set after observing your  
application under load based on log analysis and typical usage  
patterns.  Until then, just use the defaults.  Moreover, the number of  
options available to the GC is mostly overwhelming and leads to a  
whole big mess of parameters with most people never knowing what they  
really need and often just making things worse.  Time and time again I  
see people making assumptions about performance without being  
pragmatic and doing the work to test (Paging in Lucene is the classic  
example).  I've seen clients spend weeks on JVM tuning when some  
changes (often simple) in their application gave 10-20% improvements  
and massive reductions in memory all at very little cost.  None of  
those gains would have been feasible w/o pragmatic testing under load  
and an analysis of their application.  As Ted said, time is better  
spent on algorithms and data structures, especially in these early  
stages.

>
> Of course the right flags vary according to usage. It's important
> enough to performance to publish recommendations.
>
> While it may be true that one could get a 2x speedup by better
> algorithms, I don't see that it follows that a 10% win just by adding
> the right JVM flags is somehow not useful. 10% is certainly nothing to
> sneeze at IMHO, for a couple seconds of change to a script. Right?

Of course, but it often becomes a black hole figuring out how all  
those parameters work and which ones help and hurt.  Not to say you  
shouldn't do it and that it isn't welcome, just that it is often hard  
to give general recs for those things given all the platforms that  
Java runs on.  Instead, I'd suggest we could give resources to people  
on how they can determine them.

-Grant

Re: About Javac compiler optimization

Posted by Sean Owen <sr...@gmail.com>.
(just to make sure everyone's on the same page, given the title of
this thread -- these are not javac optimizations)

I do think it's worth cataloging recommended JVM flags for runtime
use. My current recommendations are:

-server
-da -dsa (I think these are redundant since assertions are off by
default but...)
-XX:NewRatio=9 (for CF code this tunes for more long-lived objects)
-XX:+UseParallelGC -XX:+UseParallelOldGC (nice win for multiprocessor machines)
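A launch script could apply them like this (jar path and main class are
placeholders, not real Mahout artifacts; note that NewRatio takes a value, so
the syntax is -XX:NewRatio=9, without the '+' used for boolean flags):

```shell
# Sketch of a launch wrapper applying the recommended flags.
JVM_FLAGS="-server -da -dsa -XX:NewRatio=9 -XX:+UseParallelGC -XX:+UseParallelOldGC"
echo java $JVM_FLAGS -cp mahout-core.jar org.example.SomeDriver
```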

Of course the right flags vary according to usage. But they matter
enough to performance that it's worth publishing recommendations.

While it may be true that one could get a 2x speedup by better
algorithms, I don't see that it follows that a 10% win just by adding
the right JVM flags is somehow not useful. 10% is certainly nothing to
sneeze at IMHO, for a couple seconds of change to a script. Right?

On Fri, Sep 4, 2009 at 7:15 PM, Ted Dunning<te...@gmail.com> wrote:
> In java 1.6 (recent releases) the AggressiveOpts flag does some escape
> analysis and other optimizations that could help some kinds of code.
>
> See
> http://java.sun.com/performance/reference/whitepapers/tuning.html#section4.2.4
>
> My own feeling is that changes less than 2x in performance are not very
> exciting since there are definitely 2-10x opportunities available to us by
> algorithmic and structural changes.  Changes with 10% or less impact
> are distractions at this point.  Remember 10% is just a few months of
> Moore's law.

Re: About Javac compiler optimization

Posted by Ted Dunning <te...@gmail.com>.
In java 1.6 (recent releases) the AggressiveOpts flag does some escape
analysis and other optimizations that could help some kinds of code.

See
http://java.sun.com/performance/reference/whitepapers/tuning.html#section4.2.4

My own feeling is that changes less than 2x in performance are not very
exciting since there are definitely 2-10x opportunities available to us by
algorithmic and structural changes.  Changes with 10% or less impact
are distractions at this point.  Remember 10% is just a few months of
Moore's law.

On Fri, Sep 4, 2009 at 4:00 AM, Robin Anil <ro...@gmail.com> wrote:

> Any thoughts or comments on this. Or on java optimizations in general




-- 
Ted Dunning, CTO
DeepDyve

Re: About Javac compiler optimization

Posted by Grant Ingersoll <gs...@apache.org>.
On Sep 4, 2009, at 6:03 AM, Sean Owen wrote:

> Yeah I tried a load test with and without and saw virtually zero
> difference in run time.
>
> Returning briefly to the issue of optimization -- would still somehow
> like to work in ProGuard. Certainly a subset of its optimizations
> would not even interfere with debugging symbols. For instance, if
> you're only doing, say, loop unrolling and peephole-style
> optimizations, should be no reason that messes with the line number
> table.

It might be worth asking on legal-discuss@.  While Proguard itself is  
GPL, it says the resulting output can still be whatever you want.  We  
could always ship two versions.



Re: About Javac compiler optimization

Posted by Sean Owen <sr...@gmail.com>.
Yeah I tried a load test with and without and saw virtually zero
difference in run time.

Returning briefly to the issue of optimization -- would still somehow
like to work in ProGuard. Certainly a subset of its optimizations
would not even interfere with debugging symbols. For instance, if
you're only doing, say, loop unrolling and peephole-style
optimizations, should be no reason that messes with the line number
table.

On Fri, Sep 4, 2009 at 12:17 PM, Robin Anil<ro...@gmail.com> wrote:
> Could you check it on a computation intensive task in CF

Re: About Javac compiler optimization

Posted by Robin Anil <ro...@gmail.com>.
On Fri, Sep 4, 2009 at 4:39 PM, Sean Owen <sr...@gmail.com> wrote:

> This flag actually has no effect in javac other than letting the
> compiler inline constants. (It's JIT in the JVM at runtime that does
> most of the optimization.) I'm actually surprised it had any
> observable effect here, and wonder if this is consistently
> reproducible?
>
Could you check it on a computation-intensive task in CF?


>
> I certainly favor setting this flag; there's no real reason not to.
>
Hadoop is using it

>
> Real byte-code optimization comes from ProGuard. I use it constantly
> but had to remove it from our build because I don't think Maven plays
> nice with it and maybe there are license issues. It's a real shame to
> lose performance because of stuff like that. The speed bump varied,
> but was up to about 10% IIRC. Not bad.
>
Yep.

>
>
> On Fri, Sep 4, 2009 at 12:00 PM, Robin Anil<ro...@gmail.com> wrote:
> > I saw that the maven compile:compile plugin didnt have the optimize=true
> > enabled for mahout
> > so i ran an experiment. Forget about the machine specs. Its run on the
> same
> > system
> >
> > Before enabling optimize:
> > Compile Time
> > Core : 50 secs
> > Examples 10 secs
> > FPGrowth  50K transactions with minSupport 20%
> > Time taken: 1:40 min
> >
> > After enabling optimize
> > Compile TIme
> > Core: 1:08 min
> > Examples: 20 secs
> > FPGrowth  50K transactions with minSupport 20%
> > Time taken: 1:28 min
> >
> > There seem to be some improvement in speed.
> >
> >
> > Any thoughts or comments on this. Or on java optimizations in general
> >
> > Robin
> >
> > Wishing Java had a -O3
> >
>

Re: About Javac compiler optimization

Posted by Sean Owen <sr...@gmail.com>.
Yeah it definitely interferes. Debug symbols don't work. You wouldn't
use it in development, only in your 'bug-free' production release.
Stack traces still 'work' though you might find some surprises due to inlining.

On Fri, Sep 4, 2009 at 12:17 PM, Shalin Shekhar
Mangar<sh...@gmail.com> wrote:
> Wouldn't that mess up or remove debug info, leaving no way to debug a production
> application? Does it change the line numbers and stack traces?

Re: About Javac compiler optimization

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
On Fri, Sep 4, 2009 at 4:39 PM, Sean Owen <sr...@gmail.com> wrote:

>
> Real byte-code optimization comes from ProGuard. I use it constantly
> but had to remove it from our build because I don't think Maven plays
> nice with it and maybe there are license issues. It's a real shame to
> lose performance because of stuff like that. The speed bump varied,
> but was up to about 10% IIRC. Not bad.
>
>
Wouldn't that mess up or remove debug info, leaving no way to debug a production
application? Does it change the line numbers and stack traces?

-- 
Regards,
Shalin Shekhar Mangar.

Re: About Javac compiler optimization

Posted by Sean Owen <sr...@gmail.com>.
This flag actually has no effect in javac other than letting the
compiler inline constants. (It's JIT in the JVM at runtime that does
most of the optimization.) I'm actually surprised it had any
observable effect here, and wonder if this is consistently
reproducible?
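To illustrate the constant inlining, which is the one thing javac does do
regardless of that flag (class names here are made up):

```java
// A static final compile-time constant is inlined into the bytecode of any
// class that uses it -- no field read happens at runtime.
public class InlineDemo {
    static final int MAX = 100; // compile-time constant

    static int limit() {
        return MAX; // javac compiles this as "return 100"
    }

    public static void main(String[] args) {
        System.out.println(limit()); // 100
    }
}
```

(This is also why changing such a constant requires recompiling its users, not just the defining class.)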

I certainly favor setting this flag; there's no real reason not to.

Real byte-code optimization comes from ProGuard. I use it constantly
but had to remove it from our build because I don't think Maven plays
nice with it and maybe there are license issues. It's a real shame to
lose performance because of stuff like that. The speed bump varied,
but was up to about 10% IIRC. Not bad.



On Fri, Sep 4, 2009 at 12:00 PM, Robin Anil<ro...@gmail.com> wrote:
> I saw that the maven compile:compile plugin didnt have the optimize=true
> enabled for mahout
> so i ran an experiment. Forget about the machine specs. Its run on the same
> system
>
> Before enabling optimize:
> Compile Time
> Core : 50 secs
> Examples 10 secs
> FPGrowth  50K transactions with minSupport 20%
> Time taken: 1:40 min
>
> After enabling optimize
> Compile TIme
> Core: 1:08 min
> Examples: 20 secs
> FPGrowth  50K transactions with minSupport 20%
> Time taken: 1:28 min
>
> There seem to be some improvement in speed.
>
>
> Any thoughts or comments on this. Or on java optimizations in general
>
> Robin
>
> Wishing Java had a -O3
>

Re: About Javac compiler optimization

Posted by Sean Owen <sr...@gmail.com>.
No, it's definitely still there, and it does have some tiny effects.

On Fri, Sep 4, 2009 at 12:15 PM, Shalin Shekhar
Mangar<sh...@gmail.com> wrote:
> I thought Sun had removed that flag?
>
> Anyways, most of the useful optimization is done by the JVM at runtime.
> There's a reason it is called a "Hotspot" VM.

Re: About Javac compiler optimization

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
On Fri, Sep 4, 2009 at 4:30 PM, Robin Anil <ro...@gmail.com> wrote:

> I saw that the maven compile:compile plugin didnt have the optimize=true
> enabled for mahout
> so i ran an experiment. Forget about the machine specs. Its run on the same
> system
>
> Before enabling optimize:
> Compile Time
> Core : 50 secs
> Examples 10 secs
> FPGrowth  50K transactions with minSupport 20%
> Time taken: 1:40 min
>
> After enabling optimize
> Compile TIme
> Core: 1:08 min
> Examples: 20 secs
> FPGrowth  50K transactions with minSupport 20%
> Time taken: 1:28 min
>
> There seem to be some improvement in speed.
>
>
> Any thoughts or comments on this. Or on java optimizations in general
>
> Robin
>
> Wishing Java had a -O3
>

I thought Sun had removed that flag?

Anyways, most of the useful optimization is done by the JVM at runtime.
There's a reason it is called a "Hotspot" VM.

-- 
Regards,
Shalin Shekhar Mangar.