Posted to general@jakarta.apache.org by Edwin Martin <ed...@bitstorm.nl> on 2001/06/07 00:36:58 UTC

Regexp 1.2 weirdness

[I already posted this problem to the regexp mailing list,
  without results. Maybe somebody on this list can help?]

Hello,

I stumbled upon a problem with regexp 1.2 which I can't reconcile
with any general regex documentation, either old or new.

In short: the character class "[a-z0-9-]" doesn't match just lowercase alphanumerics and '-'.

Here's a JSP test page I made:

---------------- begin retest.jsp ----------------

<%@ page import="org.apache.regexp.*" %>

<h2>RE test</h2>

<%

// Test address; the angle brackets are HTML entities so the browser shows them.
String s = "&lt;john.doe-001.002@my.com&gt;";

out.print(s);

// 1: lowercase letters and digits only
out.print("<p>1<br>");
RE emailRE1 = new RE("([a-z0-9]+)@");
if ( emailRE1.match( s ) )
    out.print( emailRE1.getParen(1) );

// 2: letters, digits and dot
out.print("<p>2<br>");
RE emailRE2 = new RE("([a-z0-9.]+)@");
if ( emailRE2.match( s ) )
    out.print( emailRE2.getParen(1) );

// 3: letters, digits, dot and a trailing minus
out.print("<p>3<br>");
RE emailRE3 = new RE("([a-z0-9.-]+)@");
if ( emailRE3.match( s ) )
    out.print( emailRE3.getParen(1) );

// 4: letters, digits and a trailing minus (no dot)
out.print("<p>4<br>");
RE emailRE4 = new RE("([a-z0-9-]+)@");
if ( emailRE4.match( s ) )
    out.print( emailRE4.getParen(1) );

// Second test address: dot and minus swapped in the local part
s = "&lt;john.doe.001-002@my.com&gt;";

out.print("<hr>");
out.print(s);

// 5: same pattern as 4
out.print("<p>5<br>");
RE emailRE5 = new RE("([a-z0-9-]+)@");
if ( emailRE5.match( s ) )
    out.print( emailRE5.getParen(1) );

// 6: same pattern as 3
out.print("<p>6<br>");
RE emailRE6 = new RE("([a-z0-9.-]+)@");
if ( emailRE6.match( s ) )
    out.print( emailRE6.getParen(1) );

// 7: same pattern as 2
out.print("<p>7<br>");
RE emailRE7 = new RE("([a-z0-9.]+)@");
if ( emailRE7.match( s ) )
    out.print( emailRE7.getParen(1) );
%>

---------------- end retest.jsp ----------------

This is the output:

---------------- begin output  ----------------
RE test
<jo...@my.com>
1
002
2
001.002
3
001.002
4
<john.doe-001.002
-------------------------------------------------
<jo...@my.com>
5
<john.doe.001-002
6
002
7
002
---------------- end output ----------------

Points 1 and 2 are as expected.

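Point 3 should match "john.doe-001.002", but only "001.002" is
returned: the match stops at the '-'.

Point 4 (the same class with the dot removed) matches everything,
including the dots and the leading "<"!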

Points 5, 6 and 7 were added to see what happens when
the dot and minus are swapped in the address. The same
strange behavior :-(
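
For comparison, here is a minimal sketch of the results the general regex
documentation would lead one to expect for points 3 to 6. It is written
against java.util.regex, not org.apache.regexp, so it says nothing about how
regexp 1.2 works internally. The class name HyphenClassCheck and the helper
firstGroup are made up for illustration. It only shows the conventional
reading, in which a '-' placed last in a character class is a literal minus.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison harness; uses java.util.regex, NOT org.apache.regexp.
public class HyphenClassCheck {

    // Returns capture group 1 of the first (leftmost) match, or null if there is none.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Same test strings as in retest.jsp.
        String s1 = "&lt;john.doe-001.002@my.com&gt;";
        String s2 = "&lt;john.doe.001-002@my.com&gt;";

        System.out.println("3: " + firstGroup("([a-z0-9.-]+)@", s1)); // john.doe-001.002
        System.out.println("4: " + firstGroup("([a-z0-9-]+)@", s1));  // 002
        System.out.println("5: " + firstGroup("([a-z0-9-]+)@", s2));  // 001-002
        System.out.println("6: " + firstGroup("([a-z0-9.-]+)@", s2)); // john.doe.001-002
    }
}
```

These are the values I expected from points 3 to 6 of the test page.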

Am I overlooking something?

Bye,
Edwin Martin.


---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@jakarta.apache.org
For additional commands, e-mail: general-help@jakarta.apache.org


Re: Regexp 1.2 weirdness

Posted by Jon Stevens <jo...@latchkey.com>.
on 6/6/01 3:36 PM, "Edwin Martin" <ed...@bitstorm.nl> wrote:

> [I already posted this problem to the regexp mailing list,
> without results. Maybe somebody on this list can help?]

Posting here isn't appropriate.

-jon

-- 
"Open source is not available to commercial companies."
            -Steve Balmer, CEO Microsoft
<http://www.suntimes.com/output/tech/cst-fin-micro01.html>

