Posted to issues@commons.apache.org by "Taro Yabuki (JIRA)" <ji...@apache.org> on 2011/07/16 10:47:59 UTC
[jira] [Created] (LANG-729) StringEscapeUtils.unescapeXml(str) does
not support supplemental characters.
StringEscapeUtils.unescapeXml(str) does not support supplemental characters.
----------------------------------------------------------------------------
Key: LANG-729
URL: https://issues.apache.org/jira/browse/LANG-729
Project: Commons Lang
Issue Type: Improvement
Components: lang.*
Affects Versions: 2.6
Reporter: Taro Yabuki
Priority: Trivial
Attachments: lang_2_6_unescapexml_20110716.diff
StringEscapeUtils.unescapeXml(str) does not unescape numeric character references for supplementary characters (code points above U+FFFF):
String str2 = StringEscapeUtils.unescapeXml("&#144308;");
System.out.println(str2.codePointAt(0));
// prints 38, the code point of '&'
The output should be 144308.
Currently, StringEscapeUtils.unescapeXml(StringEscapeUtils.escapeXml(str)) is equal to str, so nothing appears to be wrong. However, as we reported in LANG-728, StringEscapeUtils.escapeXml(str) has a bug. Once that bug is fixed, StringEscapeUtils.unescapeXml(StringEscapeUtils.escapeXml(str)) will no longer be equal to str for such input, which is not what we expect. (Of course, we do not expect StringEscapeUtils.unescapeXml(StringEscapeUtils.escapeXml(str)) to be equal to str in every case.)
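For illustration only, here is a minimal sketch of the behaviour requested above, written against the plain JDK. It is not the attached patch and not the Commons Lang implementation; the class name NumericReferenceSketch and the method unescapeDecimalReferences are hypothetical. It handles decimal references only and does not check whether a code point is legal in XML.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Commons Lang.
class NumericReferenceSketch {

    // Decimal references such as &#144308; (hexadecimal forms are ignored in this sketch).
    private static final Pattern DECIMAL_REFERENCE = Pattern.compile("&#(\\d{1,7});");

    static String unescapeDecimalReferences(String input) {
        Matcher matcher = DECIMAL_REFERENCE.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (matcher.find()) {
            // Copy the text between the previous reference and this one.
            out.append(input, last, matcher.start());
            int codePoint = Integer.parseInt(matcher.group(1));
            if (Character.isValidCodePoint(codePoint)) {
                // appendCodePoint writes a surrogate pair for code points above U+FFFF.
                out.appendCodePoint(codePoint);
            } else {
                // Out-of-range values are left exactly as they were.
                out.append(matcher.group());
            }
            last = matcher.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String result = unescapeDecimalReferences("&#144308;");
        System.out.println(result.codePointAt(0)); // 144308
        System.out.println(result.length());       // 2 (one surrogate pair)
    }
}
```

The key point is that the decoded value is treated as an int code point rather than cast to a single char, so values above 65535 are not rejected or truncated.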
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (LANG-729) StringEscapeUtils.unescapeXml(str) does
not support supplemental characters.
Posted by "Henri Yandell (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Henri Yandell closed LANG-729.
------------------------------
Resolution: Fixed
Fix Version/s: 3.0 (was: 3.0.1)
Per LANG-728, this is resolved in Lang 3.0. Note that, unlike the escape code, the unescape code is not specialized; the following works as expected:
{code:java}
assertEquals("Supplementary character must be represented using a single escape", "\uD84C\uDFB4",
        StringEscapeUtils.unescapeXml("&#144308;") );
{code}
Resolving as Fixed in 3.0.
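As a cross-check on the values in the assertion above, the following JDK-only sketch shows that code point 144308 (U+233B4) is the surrogate pair D84C DFB4 in UTF-16. The class and method names are hypothetical and exist only for this illustration.

```java
// Hypothetical helper for illustration; uses only java.lang.
class SurrogatePairSketch {

    // Returns the UTF-16 code units of a code point as space-separated hex.
    static String utf16Units(int codePoint) {
        StringBuilder hex = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%04X", (int) unit));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(utf16Units(144308));            // D84C DFB4
        System.out.println("\uD84C\uDFB4".codePointAt(0)); // 144308
    }
}
```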
> StringEscapeUtils.unescapeXml(str) does not support supplemental characters.
> ----------------------------------------------------------------------------
>
> Key: LANG-729
> URL: https://issues.apache.org/jira/browse/LANG-729
> Project: Commons Lang
> Issue Type: Improvement
> Components: lang.*
> Affects Versions: 2.6
> Reporter: Taro Yabuki
> Priority: Trivial
> Labels: patch
> Fix For: 3.0
>
> Attachments: lang_2_6_unescapexml_20110716.diff
[jira] [Updated] (LANG-729) StringEscapeUtils.unescapeXml(str) does
not support supplemental characters.
Posted by "Taro Yabuki (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Taro Yabuki updated LANG-729:
-----------------------------
Attachment: lang_2_6_unescapexml_20110716.diff
Test code and patch for org/apache/commons/lang/Entities.java.
[jira] [Updated] (LANG-729) StringEscapeUtils.unescapeXml(str) does
not support supplemental characters.
Posted by "Henri Yandell (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Henri Yandell updated LANG-729:
-------------------------------
Fix Version/s: 3.0.1