Posted to commits@harmony.apache.org by "Anton Ivanov (JIRA)" <ji...@apache.org> on 2006/10/11 14:39:28 UTC
[jira] Updated: (HARMONY-688) java.util.regex.Matcher does not support Unicode supplementary characters
[ http://issues.apache.org/jira/browse/HARMONY-688?page=all ]
Anton Ivanov updated HARMONY-688:
---------------------------------
Attachment: patch_src_corrected.txt
I corrected the patch (patch_src.txt) and attached it to the issue (patch_src_corrected.txt).
I verified that regex and luni tests pass normally with the patch applied.
The bug was in the newly created class SupplRangeSet.java.
Its matches() method contained the following code:
...
if (stringIndex < strLength) {
    char high = testString.charAt(stringIndex++);
    if (contains(high) &&
            next.matches(stringIndex, testString, matchResult) > 0) {
        return 1;
    }
...
But it is wrong simply to return 1, even though the comments on the matches() method in AbstractSet.java read:
"Checks if this node matches in given position and recursively call
next node matches on positive self match. Returns positive integer if
entire match succeed, negative otherwise
return -1 if match fails or n > 0;"
In fact, matches() does not return just any positive n > 0: on a successful match attempt, n is an offset.
All of the existing java.util.regex classes take this into account, but I overlooked it in SupplRangeSet.java.
So I corrected method matches() of the SupplRangeSet class as follows:
...
int offset = -1;
if (stringIndex < strLength) {
    char high = testString.charAt(stringIndex++);
    if (contains(high) &&
            (offset = next.matches(stringIndex, testString, matchResult)) > 0) {
        return offset;
    }
...
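To illustrate why the returned value has to be passed through unchanged, here is a minimal, self-contained sketch. It is not Harmony code: the Node interface, charNode() and END are hypothetical stand-ins, and the sketch assumes that the terminal node reports the index at which the match ended. The "propagate" flag switches between the old behaviour (return 1) and the corrected one (return offset).

```java
public class OffsetSketch {
    // Hypothetical stand-in for a regex node; NOT Harmony's AbstractSet.
    public interface Node {
        // Returns the offset reported by the rest of the chain, or -1 on failure.
        int matches(int stringIndex, CharSequence testString);
    }

    // Terminal node. Assumption for this sketch: it reports the end index of the match.
    static final Node END = (stringIndex, testString) -> stringIndex;

    // Matches one char, then delegates to 'next'. 'propagate' toggles the fix.
    static Node charNode(char expected, Node next, boolean propagate) {
        return (stringIndex, testString) -> {
            int offset = -1;
            if (stringIndex < testString.length()
                    && testString.charAt(stringIndex) == expected
                    && (offset = next.matches(stringIndex + 1, testString)) > 0) {
                // Returning the constant 1 discards what the rest of the chain reported.
                return propagate ? offset : 1;
            }
            return -1;
        };
    }

    // Builds the chain 'a' -> 'b' -> 'c' -> END and runs it against "abc".
    public static int run(boolean propagate) {
        Node chain = charNode('a',
                charNode('b',
                        charNode('c', END, propagate), propagate), propagate);
        return chain.matches(0, "abc");
    }

    public static void main(String[] args) {
        System.out.println("return 1:      " + run(false)); // prints 1
        System.out.println("return offset: " + run(true));  // prints 3
    }
}
```

With the constant return value the caller only learns that the match succeeded; with the propagated value it also learns where the match ended, which is what the other nodes in the chain rely on.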
Thanks,
Anton
> java.util.regex.Matcher does not support Unicode supplementary characters
> -------------------------------------------------------------------------
>
> Key: HARMONY-688
> URL: http://issues.apache.org/jira/browse/HARMONY-688
> Project: Harmony
> Issue Type: Bug
> Components: Classlib
> Reporter: Richard Liang
> Assigned To: Tim Ellison
> Attachments: patch_src.txt, patch_src_corrected.txt, patch_tests.txt
>
>
> Hello Nikolay,
> The following test case passes on the RI but fails on Harmony. Would you please have a look at this issue? Thanks a lot.
>     public void test_matcher() {
>         Pattern p = Pattern.compile("\\p{javaLowerCase}");
>         Matcher matcher = p.matcher("\uD801\uDC28");
>         assertTrue(matcher.find());
>     }
> Best regards,
> Richard
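For context on the quoted test case, the following sketch shows why the input needs supplementary-character support. It uses only standard java.lang.Character and String methods; the class and method names are made up for illustration. The two chars in "\uD801\uDC28" form one surrogate pair encoding a single code point, and it is that code point, not either surrogate on its own, that is lower case.

```java
public class SupplementarySketch {
    // The string used in the test case: one code point stored as two chars.
    static final String INPUT = "\uD801\uDC28";

    // Combines the surrogate pair at the start of s into a single code point.
    public static int firstCodePoint(String s) {
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        int cp = firstCodePoint(INPUT);
        System.out.println(INPUT.length());                             // 2 chars
        System.out.println(INPUT.codePointCount(0, INPUT.length()));    // 1 code point
        System.out.println(Integer.toHexString(cp));                    // 10428
        System.out.println(Character.isSupplementaryCodePoint(cp));     // true
        System.out.println(Character.isLowerCase(cp));                  // true
        // Looking at the high surrogate alone, as a char-based matcher would:
        System.out.println(Character.isLowerCase(INPUT.charAt(0)));     // false
    }
}
```

A matcher that tests each char separately therefore never sees a lower-case character in this input, which is consistent with find() failing before the patch.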