Posted to commits@harmony.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2010/09/15 16:20:34 UTC

[jira] Commented: (HARMONY-6649) String.toLowerCase/toUpperCase incorrect for supplementary characters

    [ https://issues.apache.org/jira/browse/HARMONY-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909744#action_12909744 ] 

Robert Muir commented on HARMONY-6649:
--------------------------------------

I started looking at this to try to produce a patch, but I found more problems.

For example, the locale-sensitive toLowerCase does not appear to handle Greek final sigma correctly, nor various other rules
from SpecialCasing (http://www.unicode.org/Public/4.0-Update/SpecialCasing-4.0.0.txt).

Implementing the casing algorithms from Unicode chapter 3, section 3.13 here is quite complex. Is it possible
for String.toLowerCase(Locale)/toUpperCase(Locale) to somehow use the ICU static methods in UCharacter?
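
To illustrate why a plain per-character mapping cannot get final sigma right, here is a rough sketch. It is not the Harmony or ICU code, and the class and method names are made up for illustration. It also simplifies the Final_Sigma condition from section 3.13: it treats any letter as "cased" and ignores case-ignorable characters, which the real rule has to skip over.

```java
// Hypothetical illustration only; not taken from Harmony or ICU.
final class FinalSigmaSketch {
    static final int CAPITAL_SIGMA = 0x03A3;     // GREEK CAPITAL LETTER SIGMA
    static final int SMALL_SIGMA = 0x03C3;       // GREEK SMALL LETTER SIGMA
    static final int SMALL_FINAL_SIGMA = 0x03C2; // GREEK SMALL LETTER FINAL SIGMA

    private FinalSigmaSketch() {}

    /**
     * Lowercases s, choosing final sigma when a capital sigma follows a letter
     * and is not itself followed by a letter. Simplifying assumption: "letter"
     * stands in for "cased letter", and case-ignorable characters are not skipped.
     */
    static String toLowerCase(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            int next = i + Character.charCount(cp);
            if (cp == CAPITAL_SIGMA) {
                boolean letterBefore = i > 0 && Character.isLetter(s.codePointBefore(i));
                boolean letterAfter = next < s.length() && Character.isLetter(s.codePointAt(next));
                out.appendCodePoint(letterBefore && !letterAfter ? SMALL_FINAL_SIGMA : SMALL_SIGMA);
            } else {
                // Simple one-to-one mapping; no other SpecialCasing rules are applied.
                out.appendCodePoint(Character.toLowerCase(cp));
            }
            i = next;
        }
        return out.toString();
    }
}
```

The point is that the result for sigma depends on its neighbours, so Character.toLowerCase alone (which always maps U+03A3 to U+03C3) is not enough, and the full rule set in SpecialCasing has more cases like this.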


> String.toLowerCase/toUpperCase incorrect for supplementary characters
> ---------------------------------------------------------------------
>
>                 Key: HARMONY-6649
>                 URL: https://issues.apache.org/jira/browse/HARMONY-6649
>             Project: Harmony
>          Issue Type: Bug
>          Components: Classlib
>    Affects Versions: 5.0M15
>            Reporter: Robert Muir
>
> Simple testcase:
> {code}
>     assertEquals("\uD801\uDC44", "\uD801\uDC1C".toLowerCase());
> {code}
> Looking at modules/luni/src/main/java/java/lang/String.java, the problem is that these methods iterate over code units (char) rather than code points (int),
> and use Character.toLowerCase(char) and Character.toUpperCase(char) instead of Character.toLowerCase(int) and Character.toUpperCase(int).
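
A minimal sketch of the code-point iteration described above, assuming only the standard java.lang API. The class and method names are hypothetical, and this is not a proposed patch for String.java: it applies only the simple one-to-one mapping and ignores locale and SpecialCasing rules.

```java
// Hypothetical illustration only; not the Harmony String.java code.
final class CodePointCasing {
    private CodePointCasing() {}

    /** Lowercases s one code point (int) at a time instead of one code unit (char). */
    static String toLowerCaseByCodePoint(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            // codePointAt joins a valid surrogate pair into a single supplementary code point.
            int cp = s.codePointAt(i);
            out.appendCodePoint(Character.toLowerCase(cp));
            // Advance by 1 char for BMP code points, 2 for supplementary ones.
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

On a class library whose Character.toLowerCase(int) covers the Deseret block, this maps the test case input "\uD801\uDC1C" (U+1041C) to "\uD801\uDC44" (U+10444), whereas a loop over char values leaves each surrogate unchanged.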
