Posted to regexp-dev@jakarta.apache.org by bu...@apache.org on 2003/08/28 22:08:16 UTC
DO NOT REPLY [Bug 22804] New: - java.lang.ArrayIndexOutOfBoundsException on negated classes
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=22804
Summary: java.lang.ArrayIndexOutOfBoundsException on negated classes
Product: Regexp
Version: unspecified
Platform: PC
OS/Version: All
Status: NEW
Severity: Major
Priority: Other
Component: Other
AssignedTo: regexp-dev@jakarta.apache.org
ReportedBy: fernando@mecon.gov.ar
I use this code as a "sanitizer" (i.e., it filters bad input from users) on JDK 1.3.1:
String allowed= "a-zA-Z0-9_@.: ñÑáéíóúÁÉÍÓÚ\r\n\\-";
RE r= new RE("[^"+allowed+"]");
output= r.subst(input, "_", RE.REPLACE_ALL);
When running:
sanitize("aé$.JOla^|-+_")
I get:
java.lang.ArrayIndexOutOfBoundsException: 16
    at org.apache.regexp.RECompiler$RERange.delete(Unknown Source)
    at org.apache.regexp.RECompiler$RERange.remove(Unknown Source)
    at org.apache.regexp.RECompiler$RERange.include(Unknown Source)
    at org.apache.regexp.RECompiler$RERange.include(Unknown Source)
    at org.apache.regexp.RECompiler.characterClass(Unknown Source)
    at org.apache.regexp.RECompiler.terminal(Unknown Source)
    at org.apache.regexp.RECompiler.closure(Unknown Source)
    at org.apache.regexp.RECompiler.branch(Unknown Source)
    at org.apache.regexp.RECompiler.expr(Unknown Source)
    at org.apache.regexp.RECompiler.compile(Unknown Source)
    at org.apache.regexp.RE.<init>(Unknown Source)
    at org.apache.regexp.RE.<init>(Unknown Source)
This happens with both 1.2 and 1.3-dev (CVS as of 28/Aug/2003).
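For comparison, here is a minimal sketch of the same sanitizer written against java.util.regex, which ships with JDK 1.4 and later and is not the Jakarta Regexp library this report is about. The class name JdkRegexSanitizer is made up for illustration, and the accented letters are written as Unicode escapes only to sidestep source-encoding problems. It shows the result the failing call was presumably meant to produce.

```java
import java.util.regex.Pattern;

// Hypothetical comparison class; NOT the org.apache.regexp code from the report.
class JdkRegexSanitizer {
    // Same allowed set as in the report: a-z A-Z 0-9 _ @ . : space,
    // the accented letters (as Unicode escapes), CR, LF and a literal '-'.
    private static final String ALLOWED =
        "a-zA-Z0-9_@.: \u00f1\u00d1\u00e1\u00e9\u00ed\u00f3\u00fa"
        + "\u00c1\u00c9\u00cd\u00d3\u00da\r\n\\-";

    // Negated character class: matches any single character that is not allowed.
    private static final Pattern DISALLOWED = Pattern.compile("[^" + ALLOWED + "]");

    // Replaces every disallowed character with an underscore.
    static String sanitize(String input) {
        return DISALLOWED.matcher(input).replaceAll("_");
    }

    public static void main(String[] args) {
        // Same input as in the report; '$', '^', '|' and '+' become '_'.
        System.out.println(sanitize("a\u00e9$.JOla^|-+_")); // aé_.JOla__-__
    }
}
```

This only illustrates the intended behaviour; it says nothing about where the exception in RECompiler$RERange comes from.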
Everything works if I use:
String allowed= "a-zA-Z0-9_@.: ñÑáéíóúÁÉÍÓÚ\r\\-"; (removed \n)
The same happens with other characters inside the [^].
If any other info is needed, please let me know.
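Until the problem is tracked down, one possible workaround is to skip the regular expression entirely. The sketch below is an assumption about what such a workaround could look like, not a fix for the bug: the class WhitelistSanitizer and its method names are hypothetical, and it uses StringBuffer so that it should also compile on JDK 1.3.1. The accented letters are again written as Unicode escapes.

```java
// Hypothetical regex-free workaround; does not use org.apache.regexp at all.
class WhitelistSanitizer {
    // Characters allowed in addition to a-z, A-Z and 0-9:
    // _ @ . : space, the accented letters, CR, LF and '-'.
    private static final String EXTRA_ALLOWED =
        "_@.: \u00f1\u00d1\u00e1\u00e9\u00ed\u00f3\u00fa"
        + "\u00c1\u00c9\u00cd\u00d3\u00da\r\n-";

    // True if the character belongs to the allowed set from the report.
    static boolean isAllowed(char c) {
        return (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9')
            || EXTRA_ALLOWED.indexOf(c) >= 0;
    }

    // Copies the input, replacing every character outside the allowed set with '_'.
    static String sanitize(String input) {
        StringBuffer out = new StringBuffer(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            out.append(isAllowed(c) ? c : '_');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("a\u00e9$.JOla^|-+_")); // aé_.JOla__-__
    }
}
```

Because nothing is compiled into a character class here, the code path that throws in RECompiler$RERange is never reached.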