Posted to regexp-user@jakarta.apache.org by Keith Kyzivat <kk...@iconverse.com> on 2000/08/08 21:00:54 UTC
Array index out of bounds on RE creation...
Hello there...
Jakarta Regexp (v1.1) has worked quite well up until now. I have a
complicated multi-part regular expression that matches numbers with
comma-separated triplets, which works fine when tested in Vim (after
appropriate syntactic changes):
Jakarta regexp format:
(^\s*(\d{1,3}(,\d\d\d)*)??\.\d+\s*$|^\s*\d{1,3}(,\d\d\d)*(\.\d+)??\s*$)
Vim format:
\(^\s*\(\d{1,3}\(,\d\d\d\)*\)\=\.\d+\s*$\|^\s*\d{1,3}\(,\d\d\d\)*\(\.\d+\)\=\s*$\)
But, when I pass the expression to the constructor of a RE, it comes back
with an ArrayIndexOutOfBoundsException at index 65809 (as seen below). I
traced it a bit, but couldn't quite follow it. Attached is a test that
should show it. (it's extremely easy to reproduce)
65809
java.lang.ArrayIndexOutOfBoundsException: 65809
at org.apache.regexp.RECompiler.setNextOfEnd(RECompiler.java:207)
at org.apache.regexp.RECompiler.branch(RECompiler.java:1160)
at org.apache.regexp.RECompiler.expr(RECompiler.java:1217)
at org.apache.regexp.RECompiler.terminal(RECompiler.java:866)
at org.apache.regexp.RECompiler.closure(RECompiler.java:942)
at org.apache.regexp.RECompiler.branch(RECompiler.java:1151)
at org.apache.regexp.RECompiler.expr(RECompiler.java:1203)
at org.apache.regexp.RECompiler.compile(RECompiler.java:1281)
at org.apache.regexp.RE.<init>(RE.java:490)
at org.apache.regexp.RE.<init>(RE.java:475)
at Foo.main(Foo.java:13)
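
For reference, the intent of the expression above can be checked independently of Jakarta Regexp. The sketch below is not the attached Foo.java; it swaps in the JDK's java.util.regex engine (where "??" is a reluctant 0-or-1 quantifier) purely to confirm which inputs the pattern is meant to accept. The class name, method name, and sample inputs are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper: checks the number pattern with java.util.regex,
// NOT with org.apache.regexp, so it does not reproduce the exception.
class NumberPatternDemo {
    // Same expression as in the post, escaped for a Java string literal.
    static final Pattern GROUPED_NUMBER = Pattern.compile(
        "(^\\s*(\\d{1,3}(,\\d\\d\\d)*)??\\.\\d+\\s*$"
        + "|^\\s*\\d{1,3}(,\\d\\d\\d)*(\\.\\d+)??\\s*$)");

    // True if the input looks like a number with comma-separated triplets.
    static boolean isGroupedNumber(String input) {
        return GROUPED_NUMBER.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] samples = {"1,234.56", ".5", "1,234", "  1,234,567.89  ", "1234", "12,34"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isGroupedNumber(s));
        }
    }
}
```

If the pattern behaves as intended, the first four samples are accepted and the last two are rejected, which suggests the expression itself is well formed and the failure is specific to compiling it.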
RE: Array index out of bounds on RE creation...
Posted by Keith Kyzivat <kk...@iconverse.com>.
More info on the problem:
I put the expression into the REDemo tester app, and played with it for a
while.
I concluded that the ArrayIndexOutOfBoundsException happens with any atom
after a "??" (0 or 1 match) if that next atom is a complex atom (not an
individual character), or an individual character with a repetition
operator (e.g. a*, a??, or a{3,4}).
This behavior is definitely incorrect.
Here's a simpler regular expression that shows the problem:
a??b*
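
A minimal probe for narrowing this down might look like the sketch below. It is an assumption-laden harness, not the attached Foo.java: it loads org.apache.regexp.RE by reflection so that it still runs (and reports a skip) when the Jakarta Regexp jar is not on the classpath, and it relies only on the RE(String) constructor seen in the stack trace. The class name, method name, and the extra comparison patterns are hypothetical.

```java
import java.lang.reflect.InvocationTargetException;

// Hypothetical probe: tries to construct org.apache.regexp.RE for a pattern
// and reports what happened, without a compile-time dependency on the jar.
class ReCompileProbe {
    static String tryCompile(String pattern) {
        try {
            Class<?> reClass = Class.forName("org.apache.regexp.RE");
            // RE(String) is the constructor that appears in the stack trace.
            reClass.getConstructor(String.class).newInstance(pattern);
            return "compiled";
        } catch (ClassNotFoundException e) {
            return "skipped: org.apache.regexp.RE not on classpath";
        } catch (InvocationTargetException e) {
            // The constructor itself threw; report the underlying exception.
            return "failed: " + e.getCause();
        } catch (ReflectiveOperationException e) {
            return "error: " + e;
        }
    }

    public static void main(String[] args) {
        // "a??b*" is the failing case from the post; the others are
        // comparison cases chosen for illustration.
        String[] patterns = {"a?b*", "a??b", "a??b*", "a??(bc)"};
        for (String p : patterns) {
            System.out.println(p + " -> " + tryCompile(p));
        }
    }
}
```

Run with the Jakarta Regexp 1.1 jar on the classpath, the output should show which of the variants trigger the exception.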
> -----Original Message-----
> From: Keith Kyzivat [mailto:kkyzivat@iconverse.com]
> Sent: Tuesday, August 08, 2000 3:01 PM
> To: Apache Regexp-User
> Subject: Array index out of bounds on RE creation...
>
> << File: Foo.java >>