Posted to dev@harmony.apache.org by Richard Liang <ri...@gmail.com> on 2006/06/29 10:40:26 UTC

Re: [jira] Commented: (HARMONY-688) java.util.regex.Matcher does not support Unicode supplementary characters


Nikolay Kuznetsov (JIRA) wrote:
>     [ http://issues.apache.org/jira/browse/HARMONY-688?page=comments#action_12418290 ] 
>
> Nikolay Kuznetsov commented on HARMONY-688:
> -------------------------------------------
>
> Yes, we do not support supplementary characters. The main reason is that such support breaks the quantifier optimizations over character classes of fixed length (we support length 1 :-)). Now I think I can support two different types of character classes: one of fixed width 1 (2), and a second of unknown width (1 or 2; \\p{javaLowerCase}, for instance).
>
>   
Great! I'm eager to have this feature. Thanks a lot. ;-)
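
To make the width issue concrete, here is a minimal sketch. The class and method names are hypothetical and this is not Harmony code; it only shows that a single code point accepted by a class such as \p{javaLowerCase} may occupy one or two chars in the input, which is why a fixed width of 1 cannot be assumed.

```java
class SupplementaryWidth {
    // Number of UTF-16 chars occupied by the code point that starts at index.
    static int widthAt(String s, int index) {
        return Character.charCount(s.codePointAt(index));
    }

    public static void main(String[] args) {
        String bmp = "a";
        // Surrogate pair for U+10428, the character used in the HARMONY-688 test case.
        String supplementary = "\uD801\uDC28";

        System.out.println(widthAt(bmp, 0));           // 1
        System.out.println(widthAt(supplementary, 0)); // 2
        // Both are lowercase letters, so one class can match inputs of either width.
        System.out.println(Character.isLowerCase(bmp.codePointAt(0)));
        System.out.println(Character.isLowerCase(supplementary.codePointAt(0)));
    }
}
```

A quantifier optimization that advances by exactly one char per match would stop in the middle of the surrogate pair in the second case.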
> BTW, am I right that, if we leave Unicode normalization support aside, this problem affects only the behaviour of character classes and ranges?
Yes, I think so.
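
As a rough illustration of the character class and range cases, the sketch below lists a few patterns that I would expect to match on an implementation that treats a surrogate pair as one code point (the RI does). The class name and the matchesWhole helper are made up for this example; the range endpoints U+10400 and U+1044F are simply chosen to enclose U+10428.

```java
import java.util.regex.Pattern;

class SupplementaryClasses {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Surrogate pair for U+10428.
        String ch = "\uD801\uDC28";

        // Predefined class: must consume both chars of the pair.
        System.out.println(matchesWhole("\\p{javaLowerCase}", ch));

        // Range whose endpoints are supplementary: U+10400 to U+1044F.
        System.out.println(matchesWhole("[\uD801\uDC00-\uD801\uDC4F]", ch));

        // Negated class: the pair counts as a single character that is not 'a'.
        System.out.println(matchesWhole("[^a]", ch));
    }
}
```

Each call requires the whole two-char input to be consumed by a single class match, so an implementation that only ever advances one char will report false.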
> In all other cases it seems impossible to construct a pattern that would work incorrectly. If that is not so, could you please give me an example?
>   
I'm not sure. At least, I cannot think of an example. ;-)
> Thanks.
>    Nik.
>
>   
>> java.util.regex.Matcher does not support Unicode supplementary characters
>> -------------------------------------------------------------------------
>>
>>          Key: HARMONY-688
>>          URL: http://issues.apache.org/jira/browse/HARMONY-688
>>      Project: Harmony
>>         Type: Bug
>>   Components: Classlib
>>     Reporter: Richard Liang
>>
>> Hello Nikolay,
>> The following test case passes on the RI but fails on Harmony. Would you please have a look at this issue? Thanks a lot.
>>     public void test_matcher() {
>>         Pattern p = Pattern.compile("\\p{javaLowerCase}");
>>         Matcher matcher = p.matcher("\uD801\uDC28");
>>         assertTrue(matcher.find());
>>     }
>> Best regards,
>> Richard

-- 
Richard Liang
China Software Development Lab, IBM