Posted to oro-dev@jakarta.apache.org by "Daniel F. Savarese" <df...@savarese.org> on 2002/11/18 23:47:26 UTC

Re: ORO performance tuning efforts

>I spent that last couple of (long) days profiling and tuning ORO.  My

Thanks for sharing the results of your hard work.

>The single biggest thing that I found was the performance of the _toLower
>method in Perl5Matcher.  There is a big comment there indicating that this
>needs to be refactored, presumably for performance.  For case insensitive
..
>3-4% of this, and Character.isUpperCase( ) was most of the balance.  As per
>the comment that is already there, this method really just needs to go away.
>I think there are solid performance related reasons to make that happen
>sooner than later.

Yes, efficient case folding has been a todo item for quite some time.  It's
good to have it confirmed that the admittedly klugey current implementation
is indeed a performance hit.

>On a related note, there was some discussion on the list about updating ORO
>to use CharSequence for input, in part to eliminate the toCharArray( ) that
...
>going to CharSequence exclusively.  I seems like a reasonably
>straightforward change for someone who knows the code.  The big question is
>probably whether a change that made ORO require JDK 1.4 was cool.  We

I don't think we could go to CharSequence exclusively.  We should wrap
a CharSequence in some other interface or only make it available to
J2SE 1.4 so that we can continue to support earlier JVMs.  My main
motivation in saying that is really just so we can support J2ME.
The more time goes by, the less important it is to support the other
JDKs (people can just use older releases of the code).  I think it's
OK to slow down performance on an earlier JVM by eliminating direct
char[] indexing and using a method call instead that will be inlined
by a J2SE 1.4 JVM (and just delegate to CharSequence) in order to
make it easy to support the latest J2SE platform and J2ME from
the same code base.  The Velocity-based conditional compilation
mechanism that Didge threw together doesn't appear to have found a
home yet, and since we only really need the level of conditional
compilation supported by CPP, I'll finally look at the Ant CPP task
and see about introducing different JVM targets to the build process
in preparation for making this change.
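
A minimal sketch of the kind of wrapper interface described above.  The
names MatchInput, CharArrayInput, CharSequenceInput and InputDemo are
hypothetical and do not exist in jakarta-oro; the sketch only assumes
that the engine reads input through a small charAt-style method instead
of indexing a char[] directly.

```java
// Hypothetical sketch; none of these types exist in jakarta-oro.
interface MatchInput {
    int length();
    char charAt(int index);
}

// Usable on any target, J2ME included: a window onto a plain char[].
final class CharArrayInput implements MatchInput {
    private final char[] chars;
    private final int offset;
    private final int length;

    CharArrayInput(char[] chars, int offset, int length) {
        if (offset < 0 || length < 0 || offset > chars.length - length) {
            throw new IndexOutOfBoundsException();
        }
        this.chars = chars;
        this.offset = offset;
        this.length = length;
    }

    public int length() { return length; }

    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException();
        }
        return chars[offset + index];
    }
}

// Would be compiled only for a J2SE 1.4 (or later) target, since
// CharSequence does not exist on earlier JVMs.
final class CharSequenceInput implements MatchInput {
    private final CharSequence seq;

    CharSequenceInput(CharSequence seq) { this.seq = seq; }

    public int length() { return seq.length(); }

    public char charAt(int index) { return seq.charAt(index); }
}

// Stand-in for engine code: it sees only MatchInput, never char[].
final class InputDemo {
    static int countChar(MatchInput in, char c) {
        int count = 0;
        for (int i = 0; i < in.length(); i++) {
            if (in.charAt(i) == c) {
                count++;
            }
        }
        return count;
    }
}
```

With this shape, only CharSequenceInput would need to sit behind the
conditional compilation step; everything else builds for every target.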

>will reduce the number of match/contains entrypoints significantly and I
>think will completely eliminate the need for the PatternMatcherInput (though
>I could be wrong about that).

PatternMatcherInput would indeed never have been devised had there been
such a thing as a standard CharSequence interface, but I think we're
stuck with PatternMatcherInput for the sake of J2ME, although it may look
different under the hood when compiled for J2SE 1.4.

>I found one other change that produced significant improved performance in
>my tests.  It seems that the Character static methods, which are called a
...
>performance decreases by 1% to 2%.  If the table is initialized, it will use
>128k bytes and will take a little startup time (not that much really), but
>yield 10% to 20% better performance.  For us, using the static table is a no
>brainer.  But I have no idea how attractive this change would be to the ORO
>community at large.  I wrote it in a way that doesn't hurt (and may even

I'm not averse to making this change, since it is up to the programmer
to decide whether or not to incur the additional memory overhead.  I'd
want to move the code out of Perl5Matcher though so it could be used
by other (future) engines.
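
A rough sketch of what an optional lookup table living outside
Perl5Matcher could look like.  CharTypeTable and its flag layout are
assumptions made for illustration, not the code from the patch under
discussion; a short[] with one entry per char happens to come to the
128k bytes mentioned above.

```java
// Hypothetical sketch; CharTypeTable is not an ORO class.
final class CharTypeTable {
    static final int UPPER = 1;
    static final int LOWER = 2;
    static final int DIGIT = 4;
    static final int LETTER = 8;
    static final int SPACE = 16;

    // Stays null until the programmer opts in.
    // 65536 entries * 2 bytes = 128 KB once initialized.
    private static volatile short[] table;

    private CharTypeTable() { }

    // Pays the startup time and memory cost once, on request.
    static void initialize() {
        short[] t = new short[Character.MAX_VALUE + 1];
        for (int c = 0; c <= Character.MAX_VALUE; c++) {
            char ch = (char) c;
            int flags = 0;
            if (Character.isUpperCase(ch)) flags |= UPPER;
            if (Character.isLowerCase(ch)) flags |= LOWER;
            if (Character.isDigit(ch)) flags |= DIGIT;
            if (Character.isLetter(ch)) flags |= LETTER;
            if (Character.isWhitespace(ch)) flags |= SPACE;
            t[c] = (short) flags;
        }
        table = t;
    }

    static boolean isInitialized() { return table != null; }

    // Falls back to java.lang.Character when the table was not built.
    static boolean isUpperCase(char c) {
        short[] t = table;
        return t != null ? (t[c] & UPPER) != 0 : Character.isUpperCase(c);
    }

    static boolean isDigit(char c) {
        short[] t = table;
        return t != null ? (t[c] & DIGIT) != 0 : Character.isDigit(c);
    }
}
```

An application that wants the speedup would call
CharTypeTable.initialize() once at startup; one that cannot spare the
memory simply never calls it and gets the Character methods instead.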

>The one other thing that I found was that when using high thread counts (10
>or more), there was significant time spent on Perl5Repetition object
>allocation (in both places where these objects are created).  I'm doing
>583,000 regexes in about a minute, so a lot of these objects get allocated.

I tried pretty hard way back when to allocate as few objects as possible,
but Sun was preaching (and the HotSpot team especially is right now) that
we should not pool objects and should let the JVM do its thing.  There's no
getting around
the need to create Perl5Repetition instances, but they don't have to go
away if there are measurable performance gains to be made by pooling
them.  Do you have any numbers indicating how much time percentagewise
is being sucked up in new Perl5Repetition()?
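
For reference, pooling could be as simple as a free list owned by each
matcher instance.  The sketch below uses a stand-in Repetition class
with made-up fields (the real Perl5Repetition fields are not shown in
this thread) and assumes one pool per matcher, so no synchronization is
needed as long as a matcher is not shared between threads.

```java
// Hypothetical stand-in for Perl5Repetition; the fields are made up.
final class Repetition {
    int min;
    int max;
    int count;

    void reset() {
        min = 0;
        max = 0;
        count = 0;
    }
}

// One pool per matcher instance; deliberately not thread-safe.
final class RepetitionPool {
    private Repetition[] free = new Repetition[16];
    private int size;

    // Reuses a released instance if one is available, else allocates.
    Repetition acquire() {
        if (size == 0) {
            return new Repetition();
        }
        Repetition r = free[--size];
        free[size] = null;
        return r;
    }

    // Clears the instance and keeps it for the next acquire().
    void release(Repetition r) {
        r.reset();
        if (size == free.length) {
            Repetition[] bigger = new Repetition[size * 2];
            System.arraycopy(free, 0, bigger, 0, size);
            free = bigger;
        }
        free[size++] = r;
    }
}
```

Whether this beats plain allocation is exactly what the profiling
numbers would have to show; on a JVM with cheap allocation it may not.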

>As with anything in Java that needs to be fast, object creation can hurt
>(the fact that ORO generally doesn't use objects during matching is a big
>reason it beats Sun, IMO).  I'm not sure if there is a way to effectively

That was my intuition, but after Jeffrey Friedl ran his mysterious
benchmarks, I began to believe the ORO/Perl/Henry Spencer approach
was outdated.  Even if your application turns out to be a special case
where ORO does especially well, given the choice of using a synthetic
benchmark as a guide and a real production application, I have to choose
the real production application.  I also know that with ORO, we tried to
make it possible to match hundreds of thousands of patterns efficiently
(as far as Java allows) instead of making it convenient to match a couple
of patterns here and there.

>I'm not entirely sure how to move forward from here.  Comments appreciated.

If possible, I'd like to see some of the profiling data you generated
and match that up with the regular expressions being used.
We need a sanity check to make sure we don't embark on a set of changes
that only benefits your special case.  I don't think that's going to
be the case, because everything you've said coincides with intuitions
I've never taken the time to test and verify the way you have.

We've settled into a reactive maintenance mode on this project.  If
there's a great enough need, we push forward and someone submits a
patch or makes a strong case for a change and a committer (OK, I'll
be honest, that means me) makes it.  We've all agreed we don't want
to make major changes without getting our testing framework in order,
but we never seem to get our testing framework in order.  However,
I don't see that working on any of what you've brought up is likely to
break anything in a big way the way working on some Perl 5.6/6.0
features would. 

I would suggest the following:
 1. We vote (counting Mark Murphy and Takashi's votes since they've
    submitted patches that have been applied in one way or another)
    on releasing v2.0.7 so the latest crop of fixes are available.
 2. For you (Bob) to put together some performance unit tests we can
    run so that we know the changes we're making are having their
    intended effect.
 3. Prioritize this crop of changes.  My bias is:
      1. Conditional compilation supporting a J2ME target
      2. Optional table-based character type lookup
      3. Theoretically inlinable input iteration abstraction,
         using CharSequence for J2SE 1.4
      4. Proper case folding.
      5. Possibly pool Perl5Repetition objects or something else
         to reduce impact of memory allocation.
    This order is based on dependencies that will minimize work as well
    as complexity.  You need 1 before you can do 3 if you're going to
    support multiple JVMs.  You want to do 3 before you do 4 if 4 might
    affect code that iterates through input; also 3 is easier to implement
    and less likely to introduce a bug than 4.  5 we don't know if we need
    to do yet.
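
As a starting point for item 2, a performance test could be little more
than a timing harness that runs the same workload on a configurable
number of threads.  PerfHarness below is a hypothetical sketch, not an
existing ORO test class; the Runnable stands in for whatever matching
workload the real tests would use.

```java
// Hypothetical sketch of a multi-threaded timing harness.
final class PerfHarness {
    private PerfHarness() { }

    // Runs `task` `iterations` times on each of `threads` threads and
    // returns the elapsed wall-clock time in nanoseconds.
    static long time(Runnable task, int threads, int iterations) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < iterations; n++) {
                    task.run();
                }
            });
        }
        long start = System.nanoTime();
        for (Thread w : workers) {
            w.start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return System.nanoTime() - start;
    }
}
```

Running the same workload at 1 thread and at 10 or more threads would
show whether the allocation cost in item 5 grows with thread count.  A
real test would also want a few untimed warm-up runs before measuring.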

Number 4 may not be a quick change to make, but the rest aren't large
time sinks.  If Mark could get us started with just one functionality
unit test and Bob could get us started with some performance tests,
I think there will be sufficient grounds to nominate you both as
committers (Jon and I just have to dig up one of the inactive initial
committers to provide a third vote), which will make it easier for
each of you to support your respective company's use of jakarta-oro.
This may just be the thing to kick some life back into development
and keep my time constraints from being such a bottleneck.  As I
recall, Bob, you also were hoping for group-local modifiers.  That's
something we can tackle if we successfully make it through the above.

As a side note, for bug fixes I'm comfortable with just making the
fixes as necessary.  But for changes that impact the overall API
or implementation, I think the httpd group's original approach of reviewing
code before commit, or at least discussing and agreeing on the implementation
beforehand, is the best way to go (and very manageable for this
project since it's not a lot of code).  So, even though I've implicitly
assigned myself the implementation of some of these changes, I'm not
going to have at them all without discussion.  For example, I'll propose
a way to reimplement the input traversal to support the use of
CharSequence and the list can criticize it and counter propose.

daniel



--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>

Re: ORO performance tuning efforts

Posted by "Bob Dickinson (BSL)" <bo...@brutesquadlabs.com>.

Dan,

    I think that your plan sounds great.  I'll be happy to put together some
performance tests.  It will probably take a week or so.  I'm in the middle
of something right now and I have to rework the tests to require as little
support code as possible (right now they use a bunch of our util stuff that
I'd rather not include and will just complicate things).

    Regarding the Perl5Repetition object overhead, with one thread it was
negligible (less than 1%, maybe a lot less than that).  With 10 threads it
was 40% (about 20% in each place).  I didn't verify this outside of the
profiler, and I didn't run the thread profiler to actually see the heap
contention, so I can't swear that the profiler was telling the truth here.
We have definitely run into degenerate object creation performance with high
thread counts (50+) and moderate to high object allocation in our app, since
regardless of the number of processors or threads, Java can only create one
object at a time.  I can't swear that this was the problem I saw with the
Perl5Repetition.  I'll try to nail it down a little better.

Bob

