Posted to commits@harmony.apache.org by "Nikolay Kuznetsov (JIRA)" <ji...@apache.org> on 2006/06/28 20:51:30 UTC

[jira] Commented: (HARMONY-688) java.util.regex.Matcher does not support Unicode supplementary characters

    [ http://issues.apache.org/jira/browse/HARMONY-688?page=comments#action_12418290 ] 

Nikolay Kuznetsov commented on HARMONY-688:
-------------------------------------------

Yes, we do not support supplementary characters. The main reason is that such support breaks the quantifier optimizations over character classes of fixed length (we support length 1 :-)). Now I think that I can support two different types of character classes: one of fixed width, 1 (or 2), and a second of unknown width (1 or 2; \\p{javaLowerCase}, for instance).
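
To illustrate the width problem: a supplementary character occupies two Java char units, so a class such as \p{javaLowerCase} may consume either one or two units per match. The following is only a sketch using standard java.lang.Character calls; the class name WidthSketch and its helper methods are hypothetical and are not Harmony code.

```java
final class WidthSketch {
    /** Number of char units (1 or 2) occupied by the code point starting at index. */
    static int matchWidth(CharSequence input, int index) {
        int codePoint = Character.codePointAt(input, index);
        return Character.charCount(codePoint);
    }

    /** Tests the full code point at index, not just the single char unit found there. */
    static boolean isLowerCaseAt(CharSequence input, int index) {
        return Character.isLowerCase(Character.codePointAt(input, index));
    }

    public static void main(String[] args) {
        String bmp = "a";
        String supplementary = "\uD801\uDC28"; // U+10428, stored as a surrogate pair

        System.out.println(matchWidth(bmp, 0));
        System.out.println(matchWidth(supplementary, 0));
        System.out.println(isLowerCaseAt(supplementary, 0));
        // Looking at the first char unit alone loses the information:
        System.out.println(Character.isLowerCase(supplementary.charAt(0)));
    }
}
```

A class whose members are all BMP characters always has width 1, which is what a fixed-length quantifier optimization can rely on; a property class has to ask for the width at each position.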

BTW, am I right that, if we do not take Unicode normalization support into account, this problem affects only the behaviour of character classes and ranges? In all other cases it is impossible to construct a pattern which will work incorrectly; if that is not so, could you please give me an example?
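
Two candidate cases that might be worth checking here, offered as a sketch rather than a confirmed answer: the dot (if it is not counted as a character class) and a quantifier that directly follows a supplementary literal. The class name BeyondClassesSketch and its methods are hypothetical; only the standard java.util.regex.Pattern API is used, and the comments describe what an engine that works on code points is expected to do.

```java
import java.util.regex.Pattern;

final class BeyondClassesSketch {
    // U+10428: one code point, stored as two char units.
    static final String LONG_I = "\uD801\uDC28";

    /** Dot: a code-point-aware engine consumes the whole pair as one character. */
    static boolean dotMatchesWholePair() {
        return Pattern.matches(".", LONG_I);
    }

    /** '+' after a supplementary literal should repeat the pair, not only its second unit. */
    static boolean plusRepeatsWholePair() {
        return Pattern.matches(LONG_I + "+", LONG_I + LONG_I);
    }

    public static void main(String[] args) {
        System.out.println("dot matches whole pair:  " + dotMatchesWholePair());
        System.out.println("plus repeats whole pair: " + plusRepeatsWholePair());
    }
}
```

An engine that treats the pattern as a sequence of char units would presumably apply '+' to the low surrogate only, so the second check is the one that does not involve a character class at all.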

Thanks.
   Nik.

> java.util.regex.Matcher does not support Unicode supplementary characters
> -------------------------------------------------------------------------
>
>          Key: HARMONY-688
>          URL: http://issues.apache.org/jira/browse/HARMONY-688
>      Project: Harmony
>         Type: Bug
>   Components: Classlib
>     Reporter: Richard Liang
>
> Hello Nikolay,
> The following test case passes on the RI but fails on Harmony. Would you please have a look at this issue? Thanks a lot.
>     public void test_matcher() {
>         Pattern p = Pattern.compile("\\p{javaLowerCase}");
>         Matcher matcher = p.matcher("\uD801\uDC28");
>         assertTrue(matcher.find());
>     }
> Best regards,
> Richard
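
For readers who want to try the reported case without a JUnit harness, here is a minimal standalone sketch of the same check. The class name SupplementaryFindCheck and the method names are hypothetical; the pattern and input are taken from the test case above. The extra end() check is an addition that shows whether the match consumed both char units of the surrogate pair.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SupplementaryFindCheck {
    // Same input as the reported test: U+10428 as a surrogate pair.
    static final String INPUT = "\uD801\uDC28";

    /** Mirrors the reported test: does find() succeed at all? */
    static boolean findsLowerCase() {
        Pattern p = Pattern.compile("\\p{javaLowerCase}");
        return p.matcher(INPUT).find();
    }

    /** Returns the end index of the first match, or -1 if nothing matched. */
    static int endOfFirstMatch() {
        Matcher m = Pattern.compile("\\p{javaLowerCase}").matcher(INPUT);
        return m.find() ? m.end() : -1;
    }

    public static void main(String[] args) {
        System.out.println("find():  " + findsLowerCase());
        System.out.println("end():   " + endOfFirstMatch());
    }
}
```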



Re: [jira] Commented: (HARMONY-688) java.util.regex.Matcher does not support Unicode supplementary characters

Posted by Richard Liang <ri...@gmail.com>.

Nikolay Kuznetsov (JIRA) wrote:
>     [ http://issues.apache.org/jira/browse/HARMONY-688?page=comments#action_12418290 ] 
>
> Nikolay Kuznetsov commented on HARMONY-688:
> -------------------------------------------
>
> Yes, we do not support supplementary characters. The main reason is that such support breaks the quantifier optimizations over character classes of fixed length (we support length 1 :-)). Now I think that I can support two different types of character classes: one of fixed width, 1 (or 2), and a second of unknown width (1 or 2; \\p{javaLowerCase}, for instance).
>
>   
Great! I'm looking forward to this feature. Thanks a lot. ;-)
> BTW, am I right that, if we do not take Unicode normalization support into account, this problem affects only the behaviour of character classes and ranges?
Yes, I think so.
> In all other cases it is impossible to construct a pattern which will work incorrectly; if that is not so, could you please give me an example?
>   
I'm not sure. At least, I cannot give an example. ;-)
> Thanks.
>    Nik.

-- 
Richard Liang
China Software Development Lab, IBM