Posted to oro-user@jakarta.apache.org by "Daniel F. Savarese" <df...@savarese.org> on 2001/05/10 21:14:54 UTC

Re: Differences between oro and regexp (was RE: jakarta-oro 2.0.2 released)

>Basically, the regexp package is smaller and has a reduced feature set.
>In fact, the regexp package jar file is less than half the size of the oro
>package jar.

The feeling that jakarta-oro is large is a common misconception.  The size
of what used to be OROMatcher is very small.  All you need for
regular expressions is the org.apache.oro.text.regex package, not all
of the other stuff.  To dispel this misconception, we're going to
provide a jakarta-oro jar that has everything and then separate jars for
strictly those slices that people want, roughly corresponding to the old
OROMatcher, PerlTools, AwkTools, and TextTools packages.

>Initially, regexp handles matching (and rejecting matches) more quickly. But,
>after a few hundred matches, the time required by the regexp package
>(especially in rejecting matches) increases considerably when compared to
>the oro package.

This is also another misconception, although not directly in relation to
the regexp package.  The jakarta-oro package has 4 different regular
expression packages.  So when you compare performance, you have to
specify which one.  Also, a lot of times people talk about jakarta-oro
when they really mean the Perl5Util class, which is a convenience
wrapper around the org.apache.oro.text.regex package.  Perl5Util will
always be slow (although we can improve its performance) because it
does a higher level set of parsing so that you can use Perl-specific
syntactic sugar like 's/foobar/barfoo/g' instead of the allegedly
more cumbersome approach of directly using the org.apache.oro.text.regex
classes.  Furthermore, most people blatantly misuse the
org.apache.oro.text.regex package by constantly reinstantiating
Perl5Compiler and Perl5Matcher and constantly recompiling
regular expressions.  Hopefully this will stop after we write a new
user's guide explaining how to make proper use of the package.
A valid performance comparison can only be made by posting the code used
to make the comparison.  I don't know how you reached the assessment you
made.  All performance evaluation code is welcome on oro-dev because
even though the primary goal for at least the Perl-related stuff is to
achieve compatibility with Perl, the secondary goal is to be as fast
as possible within the constraints of Perl's regex syntax and Java's
runtime performance.
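
As a rough illustration of the misuse described above, here is a minimal
sketch of the difference between recompiling an expression for every input
and compiling it once and reusing it.  To stay self-contained it uses
java.util.regex from the JDK as a stand-in, not the jakarta-oro API; the
class name, method names, and sample data are all hypothetical.  The timing
loop in main is only a crude indication, not a rigorous benchmark.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: java.util.regex stands in for a regex engine here.
public class CompileOnceSketch {

    // Anti-pattern: the expression is recompiled for every single input.
    static int countRecompiling(String regex, String[] inputs) {
        int hits = 0;
        for (String input : inputs) {
            if (Pattern.compile(regex).matcher(input).find()) {
                hits++;
            }
        }
        return hits;
    }

    // Preferred: compile once, then reuse the pattern and one matcher.
    static int countReusing(String regex, String[] inputs) {
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher("");
        int hits = 0;
        for (String input : inputs) {
            // reset() points the existing matcher at the next input.
            if (matcher.reset(input).find()) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample data, repeated to give the loops some work.
        String[] inputs = new String[30000];
        String[] samples = {"foobar", "barfoo", "xfoobarx"};
        for (int i = 0; i < inputs.length; i++) {
            inputs[i] = samples[i % samples.length];
        }

        long t0 = System.nanoTime();
        int a = countRecompiling("foobar", inputs);
        long t1 = System.nanoTime();
        int b = countReusing("foobar", inputs);
        long t2 = System.nanoTime();

        System.out.println("recompiling: " + a + " hits, " + (t1 - t0) + " ns");
        System.out.println("reusing:     " + b + " hits, " + (t2 - t1) + " ns");
    }
}
```

In jakarta-oro terms, the analogous approach would be to hold on to one
Perl5Compiler, one Perl5Matcher, and each compiled pattern rather than
creating them anew for every match.  Any comparison posted to the list
should include the full code in this spirit so others can reproduce it.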

daniel



Re: Differences between oro and regexp (was RE: jakarta-oro 2.0.2

Posted by Michael McCallum <mi...@spinsoftware.com>.
> :-)
> 
> If anything, we should just combine the two projects. I don't have the
> time/energy to make that happen, so, it is really up to our users to
> evaluate which product they want to use and go with it based on their own
> analysis...regardless of misconceptions...

I was thinking the same thing.
How would we go about that?

I kept having ideas for improving regexp but they were already in oro so there seemed no point.

Michael


Re: Differences between oro and regexp (was RE: jakarta-oro 2.0.2 released)

Posted by Jon Stevens <jo...@latchkey.com>.
on 5/10/01 12:14 PM, "Daniel F. Savarese" <df...@savarese.org> wrote:

> This is also another misconception,

Regardless of misconceptions, regexp came first because at the time you
wanted a GPL license for ORO so I went out and found another package
(regexp) and released it. Soon after, you decided to give us ORO.

:-)

If anything, we should just combine the two projects. I don't have the
time/energy to make that happen, so, it is really up to our users to
evaluate which product they want to use and go with it based on their own
analysis...regardless of misconceptions...

:-)

-jon

-- 
If you come from a Perl or PHP background, JSP is a way to take
your pain to new levels. --Anonymous
<http://jakarta.apache.org/velocity/ymtd/ymtd.html>