Posted to regexp-dev@jakarta.apache.org by bu...@apache.org on 2002/05/13 16:16:21 UTC
DO NOT REPLY [Bug 9035] New: - big Latitude Longitude RE causes IndexOutOfBoundsException
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=9035
Summary: big Latitude Longitude RE causes IndexOutOfBoundsException
Product: Regexp
Version: unspecified
Platform: All
OS/Version: Linux
Status: NEW
Severity: Major
Priority: Other
Component: Other
AssignedTo: regexp-dev@jakarta.apache.org
ReportedBy: mnewcomb@tacintel.com
I have two fairly big REs dealing with latitude and longitude. When I use them
separately, there are no problems. However, when I combine the two REs so that
I can pass a single latitude-longitude string, matching bombs out with an
exception (detailed below).
Here is the test program. Refer to the example run for usage:
import java.io.*;
import java.util.*;
import org.apache.regexp.*;

public class LatLonREBug
{
    private static final String LATITUDE_RE_STRING =
        "-?(([0-8]?[0-9]((\\.[0-9]+)|((([0-5][0-9])|60)((([0-5][0-9])|60))?))?)|90)[nNsS]";
    private static final String LONGITUDE_RE_STRING =
        "-?(((([0-9]?[0-9])|(1[0-7][0-9]))((\\.[0-9]+)|((([0-5][0-9])|60)(([0-5][0-9])|60)?))?)|180)[eEwW]";

    public static final String LATITUDE_LONGITUDE_RE_STRING =
        "^" + LATITUDE_RE_STRING + LONGITUDE_RE_STRING + "$";

    public static void main(String[] args)
        throws Throwable
    {
        // Compile the combined expression and the two anchored parts.
        RE latlonRE = new RE(LATITUDE_LONGITUDE_RE_STRING);
        System.out.println("LATITUDE_LONGITUDE_RE_STRING: " +
                           LATITUDE_LONGITUDE_RE_STRING);

        RE latRE = new RE("^" + LATITUDE_RE_STRING + "$");
        System.out.println("LATITUDE_RE_STRING: " + LATITUDE_RE_STRING);

        RE lonRE = new RE("^" + LONGITUDE_RE_STRING + "$");
        System.out.println("LONGITUDE_RE_STRING: " + LONGITUDE_RE_STRING);

        // Read "<command> <value>" lines from stdin until EOF, "quit" or "exit".
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        String line = br.readLine();
        while (line != null && !line.equals("quit") && !line.equals("exit"))
        {
            StringTokenizer st = new StringTokenizer(line);
            int tokens = st.countTokens();

            if (tokens > 1)
            {
                String command = st.nextToken();

                // Report the actual result of match() instead of ignoring it.
                if (command.equalsIgnoreCase("lat"))
                {
                    String lat = st.nextToken();
                    boolean ok = latRE.match(lat);
                    System.out.println(lat + (ok ? " is" : " is not") + " a properly formatted latitude");
                }
                else if (command.equalsIgnoreCase("lon"))
                {
                    String lon = st.nextToken();
                    boolean ok = lonRE.match(lon);
                    System.out.println(lon + (ok ? " is" : " is not") + " a properly formatted longitude");
                }
                else if (command.equalsIgnoreCase("latlon"))
                {
                    String latlon = st.nextToken();
                    boolean ok = latlonRE.match(latlon);
                    System.out.println(latlon + (ok ? " is" : " is not") + " a properly formatted lat-lon");
                }
                else
                {
                    System.out.println("unknown command: " + command);
                }
            }
            else
            {
                System.out.println("invalid line: " + line);
            }

            line = br.readLine();
        }
    }
}
Here is an example run of the test-case. As you will see, when just doing
latitude or longitude, the REs match as expected. But, when I do a 'latlon'
string, it pukes...
[mnewcomb@localhost sandbox]$ java -classpath /usr/local/regexp/jakarta-regexp-1.2.jar:. LatLonREBug
LATITUDE_LONGITUDE_RE_STRING:
^-?(([0-8]?[0-9]((\.[0-9]+)|((([0-5][0-9])|60)((([0-5][0-9])|60))?))?)|90)[nNsS]-?(((([0-9]?[0-9])|(1[0-7][0-9]))((\.[0-9]+)|((([0-5][0-9])|60)(([0-5][0-9])|60)?))?)|180)[eEwW]$
LATITUDE_RE_STRING:
-?(([0-8]?[0-9]((\.[0-9]+)|((([0-5][0-9])|60)((([0-5][0-9])|60))?))?)|90)[nNsS]
LONGITUDE_RE_STRING:
-?(((([0-9]?[0-9])|(1[0-7][0-9]))((\.[0-9]+)|((([0-5][0-9])|60)(([0-5][0-9])|60)?))?)|180)[eEwW]
lat 55N
55N is a properly formatted latitude
lat 55.454N
55.454N is a properly formatted latitude
lat 5545N
5545N is a properly formatted latitude
lon 123E
123E is a properly formatted longitude
lon 5E
5E is a properly formatted longitude
lon 123.444E
123.444E is a properly formatted longitude
lon 1784532W
1784532W is a properly formatted longitude
latlon 55N44E
55N44E is a properly formatted lat-lon
latlon 55N44.33E
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
        at org.apache.regexp.RE.getParenEnd(RE.java:724)
        at org.apache.regexp.RE.matchNodes(RE.java:942)
        at org.apache.regexp.RE.matchNodes(RE.java:933)
        at org.apache.regexp.RE.matchNodes(RE.java:1376)
        at org.apache.regexp.RE.matchNodes(RE.java:1376)
        at org.apache.regexp.RE.matchNodes(RE.java:910)
        at org.apache.regexp.RE.matchNodes(RE.java:1376)
        at org.apache.regexp.RE.matchNodes(RE.java:910)
        at org.apache.regexp.RE.matchNodes(RE.java:1376)
        at org.apache.regexp.RE.matchNodes(RE.java:933)
        at org.apache.regexp.RE.matchNodes(RE.java:933)
        at org.apache.regexp.RE.matchNodes(RE.java:1376)
        at org.apache.regexp.RE.matchNodes(RE.java:910)
        at org.apache.regexp.RE.matchNodes(RE.java:1376)
        at org.apache.regexp.RE.matchNodes(RE.java:910)
        at org.apache.regexp.RE.matchNodes(RE.java:910)
        at org.apache.regexp.RE.matchNodes(RE.java:1376)
        at org.apache.regexp.RE.matchNodes(RE.java:910)
        at org.apache.regexp.RE.matchNodes(RE.java:1376)
        at org.apache.regexp.RE.matchNodes(RE.java:933)
        at org.apache.regexp.RE.matchNodes(RE.java:933)
        at org.apache.regexp.RE.matchNodes(RE.java:1376)
        at org.apache.regexp.RE.matchNodes(RE.java:1376)
        at org.apache.regexp.RE.matchNodes(RE.java:910)
        at org.apache.regexp.RE.matchNodes(RE.java:1376)
        at org.apache.regexp.RE.matchNodes(RE.java:910)
        at org.apache.regexp.RE.matchNodes(RE.java:1376)
        at org.apache.regexp.RE.matchAt(RE.java:1448)
        at org.apache.regexp.RE.match(RE.java:1498)
        at org.apache.regexp.RE.match(RE.java:1468)
        at org.apache.regexp.RE.match(RE.java:1561)
        at LatLonREBug.main(LatLonREBug.java:54)
[mnewcomb@localhost sandbox]$
Any help will be greatly appreciated.
Thanks,
Michael
--
To unsubscribe, e-mail: <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>
Re: - IndexOutOfBoundsException: clarification
Posted by Holger Stratmann <Ho...@cheerful.com>.
Actually, you can write much simpler RE's to reproduce this :-))
I had wanted to file a bug report for this (along with a few others):
RegExp does not "support" more than 16 parenthesized sub-expressions.
As soon as you have more than 16 '(...)', you get ArrayIndexOutOfBoundsExceptions :-(
(Actually, I had seen that while taking a look at the sources and then confirmed the problem by trying it ;)
That's why your two expressions work separately, but not combined.
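To make the 16-group limit concrete, here is a rough sketch that counts the parenthesized sub-expressions in Michael's two patterns. It is only an illustration: it uses java.util.regex as a stand-in parser (an assumption, not the Jakarta Regexp compiler), and the class name GroupCountCheck and the helper countGroups are made up for this example.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original report. It relies on
// java.util.regex (not Jakarta Regexp) purely to count capturing groups.
class GroupCountCheck
{
    static final String LATITUDE_RE_STRING =
        "-?(([0-8]?[0-9]((\\.[0-9]+)|((([0-5][0-9])|60)((([0-5][0-9])|60))?))?)|90)[nNsS]";
    static final String LONGITUDE_RE_STRING =
        "-?(((([0-9]?[0-9])|(1[0-7][0-9]))((\\.[0-9]+)|((([0-5][0-9])|60)(([0-5][0-9])|60)?))?)|180)[eEwW]";

    // Number of capturing '(...)' groups in the given expression.
    static int countGroups(String regex)
    {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args)
    {
        String combined = "^" + LATITUDE_RE_STRING + LONGITUDE_RE_STRING + "$";
        System.out.println("latitude groups:  " + countGroups(LATITUDE_RE_STRING));
        System.out.println("longitude groups: " + countGroups(LONGITUDE_RE_STRING));
        System.out.println("combined groups:  " + countGroups(combined));
    }
}
```

Counting by hand gives 10 groups for latitude and 12 for longitude, so each stays under 16 on its own while the combined expression has 22.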
I guess I'll write a fix for that, but considering I didn't have time to file a bug report...
A "workaround" in this case (just as a temporary help for Michael):
Your RE has two clearly defined parts... You can probably use one more general expression to find potential matches and then check the two parts separately. Not nice...
Fixing the problem may actually be faster :-))
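For what it's worth, the two-step workaround could look roughly like the sketch below. It is written against java.util.regex so that it runs on its own; that is an assumption for illustration, and with Jakarta Regexp the same idea would presumably use RE.match and RE.getParen instead. The class name SplitLatLonCheck, the SPLIT expression and the isLatLon helper are all invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the workaround: split first, then validate each part.
class SplitLatLonCheck
{
    static final Pattern LATITUDE = Pattern.compile(
        "^-?(([0-8]?[0-9]((\\.[0-9]+)|((([0-5][0-9])|60)((([0-5][0-9])|60))?))?)|90)[nNsS]$");
    static final Pattern LONGITUDE = Pattern.compile(
        "^-?(((([0-9]?[0-9])|(1[0-7][0-9]))((\\.[0-9]+)|((([0-5][0-9])|60)(([0-5][0-9])|60)?))?)|180)[eEwW]$");

    // Loose splitter with only two groups: everything up to the first
    // N/S letter, then everything up to a final E/W letter.
    static final Pattern SPLIT = Pattern.compile("^([^nNsS]*[nNsS])([^eEwW]*[eEwW])$");

    static boolean isLatLon(String s)
    {
        Matcher m = SPLIT.matcher(s);
        if (!m.matches())
        {
            return false; // no recognizable latitude/longitude halves
        }
        // Each half is checked with its own, smaller expression.
        return LATITUDE.matcher(m.group(1)).matches()
            && LONGITUDE.matcher(m.group(2)).matches();
    }

    public static void main(String[] args)
    {
        String[] samples = { "55N44E", "55N44.33E", "95N44E" };
        for (String sample : samples)
        {
            System.out.println(sample + " -> " + isLatLon(sample));
        }
    }
}
```

Each expression compiled here stays at or below 12 groups, so none of them would come near the 16-group limit.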
I had an estimate of 1-3 hours for fixing the code, but I'd need to find out something about the process [of submitting code] first and that would probably take longer...
Cheers,
Holger