Posted to regexp-dev@jakarta.apache.org by bu...@apache.org on 2002/12/17 22:38:21 UTC

DO NOT REPLY [Bug 15461] New: - Word boundary (\b) not matching using org.apache.regexp.RE

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=15461

Word boundary (\b) not matching using org.apache.regexp.RE

           Summary: Word boundary (\b) not matching using
                    org.apache.regexp.RE
           Product: Regexp
           Version: unspecified
          Platform: PC
        OS/Version: Other
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: Other
        AssignedTo: regexp-dev@jakarta.apache.org
        ReportedBy: curtis.paris@metro1.com


The word boundary escape \b does not appear to work in the Jakarta-Regexp-1.2
binary release: a pattern that begins with \b still matches where there is no
word boundary.  Perl5 handles the same pattern correctly, for comparison.

Simplest Example:

   Regexp: \b(eek.*)
   Match against: "week"

   Jakarta returns "true" for a match, while Perl returns "false".

Workaround:

   Instead of \b I've been using (^|\s), but the extra group changes my
parenCount and shifts the paren indexes I use for retrieving data.
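
The group shift caused by the workaround can be sketched as follows. This is
only an illustration and makes one loud assumption: it uses the JDK's
java.util.regex package rather than org.apache.regexp.RE, so it shows how
capture groups are numbered in general, not how Jakarta Regexp behaves. The
class name WorkaroundGroupShift and the helper dataGroup are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WorkaroundGroupShift {
    // Workaround pattern: the leading (^|\s) becomes group 1, so the data
    // that \b(eek.*) captured in group 1 now lands in group 2.
    // Assumption: java.util.regex stands in for org.apache.regexp.RE here.
    static final Pattern WORKAROUND = Pattern.compile("(^|\\s)(eek.*)");

    /** Returns the shifted data group (group 2), or null if nothing matches. */
    static String dataGroup(String input) {
        Matcher m = WORKAROUND.matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // The pattern now has two capturing groups instead of one.
        System.out.println("groups: " + WORKAROUND.matcher("").groupCount());

        String[] inputs = { "eekeek", "week", "my eeker", "the meeker" };
        for (String input : inputs) {
            String data = dataGroup(input);
            System.out.println(input + " -> " + (data == null ? "No Match" : data));
        }
    }
}
```

Note that (^|\s) is also narrower than \b: it only accepts the start of the
input or whitespace before the word, so a word preceded by punctuation, such
as "(eek", would not match.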

Sample JSP:

<%@page import="org.apache.regexp.*"%>
<%
	// "eekeek" and "my eeker" should match; "week" and "the meeker" should not.
	String[] stringList = new String[] { "eekeek", "week", "my eeker", "the meeker" };

	String regExp = "\\b(eek.*)";
	RE testRE = new RE(regExp);

	out.write("<PRE>Testing RE Engine for word boundry: (" + regExp + ")\n");
	for (int pos = 0; pos < stringList.length; pos++)
	{
		if (testRE.match(stringList[pos]))
		{
			// Locate group 1 so the matched text can be highlighted.
			int start = testRE.getParenStart(1);
			int end = testRE.getParenLength(1) + start;

			out.write("Matched " + stringList[pos].substring(0, start)
				+ "<B><U>" + stringList[pos].substring(start, end) + "</U></B>"
				+ stringList[pos].substring(end) + "\n");
		} else {
			out.write("No Match: " + stringList[pos] + "\n");
		}
	}
%>

Result:

Testing RE Engine for word boundry: (\b(eek.*))
Matched eekeek
Matched week
Matched my eeker
Matched the meeker
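
For comparison, the behavior expected under Perl-style \b semantics can be
sketched with the JDK's java.util.regex package. This is an assumption-laden
stand-in: it does not use org.apache.regexp.RE, and the class name
WordBoundaryExpected and the helper firstCapture are hypothetical. It runs the
same pattern against the same four strings as the JSP above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordBoundaryExpected {
    // Same pattern as in the report. \b only succeeds at a transition
    // between a word character and a non-word character (or the string edge).
    // Assumption: java.util.regex stands in for Perl-style matching here.
    static final Pattern WORD_START_EEK = Pattern.compile("\\b(eek.*)");

    /** Returns the text captured by group 1, or null if there is no match. */
    static String firstCapture(String input) {
        Matcher m = WORD_START_EEK.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] inputs = { "eekeek", "week", "my eeker", "the meeker" };
        for (String input : inputs) {
            String captured = firstCapture(input);
            System.out.println(captured == null
                    ? "No Match: " + input
                    : "Matched " + input + " (group 1: " + captured + ")");
        }
    }
}
```

Under these semantics only "eekeek" and "my eeker" match. In "week" and "the
meeker" the text "eek" is preceded by a word character, so there is no
boundary in front of it.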
