Posted to issues@commons.apache.org by "Henri Yandell (JIRA)" <ji...@apache.org> on 2011/01/17 06:51:43 UTC
[jira] Updated: (LANG-607) StringUtils methods do not handle
Unicode 2.0+ supplementary characters correctly.
[ https://issues.apache.org/jira/browse/LANG-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Henri Yandell updated LANG-607:
-------------------------------
Moving to 3.1, as this is not a backwards-incompatible change.
> StringUtils methods do not handle Unicode 2.0+ supplementary characters correctly.
> ----------------------------------------------------------------------------------
>
> Key: LANG-607
> URL: https://issues.apache.org/jira/browse/LANG-607
> Project: Commons Lang
> Issue Type: Bug
> Components: lang.*
> Affects Versions: 2.5
> Environment: java version "1.6.0_16"
> Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
> Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)
> Microsoft Windows [Version 6.0.6002]
> Apache Maven 2.2.1 (r801777; 2009-08-06 12:16:01-0700)
> Java version: 1.6.0_16
> Java home: C:\Program Files\Java\jdk1.6.0_16\jre
> Default locale: en_US, platform encoding: Cp1252
> OS name: "windows vista" version: "6.0" arch: "amd64" Family: "windows"
> Reporter: Gary Gregory
> Assignee: Gary Gregory
> Priority: Minor
> Fix For: 3.1
>
> Attachments: LANG-607.diff
>
>
> The StringUtils.containsAny methods incorrectly match Unicode 2.0+ supplementary characters.
> For example, define test fixtures for the Unicode characters U+20000 and U+20001, which are written in Java source as the surrogate pairs "\uD840\uDC00" and "\uD840\uDC01":
> private static final String CharU20000 = "\uD840\uDC00";
> private static final String CharU20001 = "\uD840\uDC01";
> The JRE handles Unicode supplementary characters correctly in this call:
> assertEquals(-1, CharU20000.indexOf(CharU20001));
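> But these assertions fail, because containsAny compares individual UTF-16 chars and both strings share the high surrogate \uD840: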
> assertEquals(false, StringUtils.containsAny(CharU20000, CharU20001));
> assertEquals(false, StringUtils.containsAny(CharU20001, CharU20000));
> These assertions pass, because StringUtils.contains calls the JRE to perform the match:
> assertEquals(true, StringUtils.contains(CharU20000 + CharU20001, CharU20000));
> assertEquals(true, StringUtils.contains(CharU20000 + CharU20001, CharU20001));
> assertEquals(true, StringUtils.contains(CharU20000, CharU20000));
> assertEquals(false, StringUtils.contains(CharU20000, CharU20001));
> More than you want to know:
> - http://java.sun.com/developer/technicalArticles/Intl/Supplementary/
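
For illustration, the false match described above can be reproduced without Commons Lang. The sketch below is not the attached LANG-607.diff and is not the library's code; the class name and both method names are hypothetical. It assumes the faulty behaviour comes from scanning one UTF-16 char at a time, and it contrasts that with a scan that walks whole code points. It also assumes well-formed input, so unpaired surrogates get no special handling.

```java
// Hypothetical demo class; names are illustrative, not part of Commons Lang.
final class SupplementaryContainsAny {

    // Fixtures from the report: U+20000 and U+20001 as surrogate pairs.
    static final String CHAR_U20000 = "\uD840\uDC00";
    static final String CHAR_U20001 = "\uD840\uDC01";

    // Naive scan: treats every UTF-16 code unit as a full character,
    // so a shared high surrogate alone counts as a match.
    static boolean naiveContainsAny(String str, String searchChars) {
        for (int i = 0; i < str.length(); i++) {
            if (searchChars.indexOf(str.charAt(i)) >= 0) {
                return true;
            }
        }
        return false;
    }

    // Code-point-aware scan: reads a whole code point at each step and
    // relies on String.indexOf(int), which accepts supplementary code points.
    static boolean codePointContainsAny(String str, String searchChars) {
        for (int i = 0; i < str.length(); ) {
            int codePoint = str.codePointAt(i);
            if (searchChars.indexOf(codePoint) >= 0) {
                return true;
            }
            // Advance by 1 char for BMP characters, 2 for supplementary ones.
            i += Character.charCount(codePoint);
        }
        return false;
    }

    public static void main(String[] args) {
        // Each fixture is one code point stored in two chars.
        System.out.println(CHAR_U20000.length());
        System.out.println(CHAR_U20000.codePointCount(0, CHAR_U20000.length()));

        // The naive scan reports a match; the code-point scan does not.
        System.out.println(naiveContainsAny(CHAR_U20000, CHAR_U20001));
        System.out.println(codePointContainsAny(CHAR_U20000, CHAR_U20001));
    }
}
```

With the fixtures from the report, the naive scan returns true for U+20000 against U+20001 because both begin with the same high surrogate, while the code-point scan returns false, which is the result the failing assertions expect.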