Posted to issues@commons.apache.org by "Michael Konietzka (JIRA)" <ji...@apache.org> on 2010/11/13 15:32:14 UTC
[jira] Created: (LANG-658) Some Entitys like Ö are not matched
properly against its ISO8859-1 representation
Some Entitys like Ö are not matched properly against its ISO8859-1 representation
--------------------------------------------------------------------------------------
Key: LANG-658
URL: https://issues.apache.org/jira/browse/LANG-658
Project: Commons Lang
Issue Type: Bug
Components: lang.text.translate.*
Affects Versions: 3.0
Reporter: Michael Konietzka
Fix For: 3.0
In EntityArrays, in
private static final String[][] ISO8859_1_ESCAPE
some of the mappings are wrong, for example
{"\u00D7", "&Ouml;"}, // Ö - uppercase O, umlaut
{"\u00D8", "&times;"}, // multiplication sign
but these must be
{"\u00D6", "&Ouml;"}, // Ö - uppercase O, umlaut
{"\u00D7", "&times;"}, // multiplication sign
according to http://www.fileformat.info/info/unicode/block/latin_supplement/list.htm
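To make the effect of the shifted keys concrete, here is a minimal sketch. The escape helper and the class name are hypothetical and only stand in for a table-driven escaper; they are not the Commons Lang implementation. The two tables contain just the entries quoted in this report.

```java
import java.util.HashMap;
import java.util.Map;

class ShiftedEntryDemo {
    // Entries as quoted in the report: each key is one code point too high.
    static final String[][] REPORTED = {
        {"\u00D7", "&Ouml;"},
        {"\u00D8", "&times;"},
    };

    // Entries as the report says they should be.
    static final String[][] EXPECTED = {
        {"\u00D6", "&Ouml;"},
        {"\u00D7", "&times;"},
    };

    // Hypothetical helper: replaces each character that has a table entry
    // with its entity and copies every other character unchanged.
    static String escape(String input, String[][] table) {
        Map<String, String> lookup = new HashMap<>();
        for (String[] pair : table) {
            lookup.put(pair[0], pair[1]);
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            String ch = String.valueOf(input.charAt(i));
            out.append(lookup.getOrDefault(ch, ch));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String multiplicationSign = "\u00D7";
        // With the reported table the multiplication sign becomes an O umlaut entity.
        System.out.println(escape(multiplicationSign, REPORTED)); // prints &Ouml;
        System.out.println(escape(multiplicationSign, EXPECTED)); // prints &times;
    }
}
```

With the reported table, the character Ö itself has no entry at all in this excerpt, so it would pass through unescaped.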
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (LANG-658) Some Entitys like Ö
are not matched properly against its ISO8859-1 representation
Posted by "Sebb (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931693#action_12931693 ]
Sebb edited comment on LANG-658 at 11/13/10 11:19 AM:
------------------------------------------------------
Another duplicate entry:
{noformat}
{"\u00F1", "&ntilde;"}, // ñ - lowercase n, tilde
{"\u00F3", "&ograve;"}, // ò - lowercase o, grave accent
{"\u00F3", "&oacute;"}, // ó - lowercase o, acute accent
{noformat}
The first F3 entry should be F2.
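Duplicate keys like this one can be found mechanically. The following is a sketch only: the class and method names are made up for illustration, and the table is the three-entry excerpt quoted in this comment rather than the real array.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DuplicateKeyCheck {
    // Returns every key that occurs more than once, formatted as U+XXXX.
    // Assumes each key is a single character, as in the quoted entries.
    static List<String> duplicateKeys(String[][] table) {
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (String[] pair : table) {
            if (!seen.add(pair[0])) {
                duplicates.add(String.format("U+%04X", (int) pair[0].charAt(0)));
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        String[][] reported = {
            {"\u00F1", "&ntilde;"},
            {"\u00F3", "&ograve;"},
            {"\u00F3", "&oacute;"},
        };
        System.out.println(duplicateKeys(reported)); // prints [U+00F3]
    }
}
```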
was (Author: sebb@apache.org):
Another duplicate entry:
{"\u00F1", "&ntilde;"}, // ñ - lowercase n, tilde
{"\u00F3", "&ograve;"}, // ò - lowercase o, grave accent
{"\u00F3", "&oacute;"}, // ó - lowercase o, acute accent
The first F3 entry should be F2.
[jira] Updated: (LANG-658) Some Entitys like Ö are not matched
properly against its ISO8859-1 representation
Posted by "Sebb (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebb updated LANG-658:
----------------------
Description:
In EntityArrays, in
private static final String[][] ISO8859_1_ESCAPE
some of the mappings are wrong, for example
{noformat}
{"\u00D7", "&Ouml;"}, // Ö - uppercase O, umlaut
{"\u00D8", "&times;"}, // multiplication sign
{noformat}
but these must be
{noformat}
{"\u00D6", "&Ouml;"}, // Ö - uppercase O, umlaut
{"\u00D7", "&times;"}, // multiplication sign
{noformat}
according to http://www.fileformat.info/info/unicode/block/latin_supplement/list.htm
First look:
The entry for U+00CA is missing from the array, so every following entry is off by one.
Found on http://stackoverflow.com/questions/4172784/bug-in-apache-commons-stringescapeutil/4172915#4172915
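A skipped key such as the one described above can be located by walking the table and checking that each key is exactly one code point above its predecessor. This is a sketch under two assumptions: that the table is meant to list a contiguous range of code points in ascending order, and that the four-entry excerpt below, which is constructed from the report's description and is not copied from the real array, resembles the faulty region. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SequenceGapCheck {
    // Reports each place where a key is not exactly one code point
    // above the key of the previous entry.
    static List<String> findBreaks(String[][] table) {
        List<String> breaks = new ArrayList<>();
        for (int i = 1; i < table.length; i++) {
            int previous = table[i - 1][0].charAt(0);
            int current = table[i][0].charAt(0);
            if (current != previous + 1) {
                breaks.add(String.format("U+%04X -> U+%04X", previous, current));
            }
        }
        return breaks;
    }

    public static void main(String[] args) {
        // Illustrative excerpt: the key for U+00CA is skipped, so the
        // entities that follow sit one code point too high.
        String[][] excerpt = {
            {"\u00C8", "&Egrave;"},
            {"\u00C9", "&Eacute;"},
            {"\u00CB", "&Ecirc;"},
            {"\u00CC", "&Euml;"},
        };
        System.out.println(findBreaks(excerpt)); // prints [U+00C9 -> U+00CB]
    }
}
```

The same walk also flags duplicated keys, since a repeated key is not one above its predecessor.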
was:
In EntityArrays, in
private static final String[][] ISO8859_1_ESCAPE
some of the mappings are wrong, for example
{"\u00D7", "&Ouml;"}, // Ö - uppercase O, umlaut
{"\u00D8", "&times;"}, // multiplication sign
but these must be
{"\u00D6", "&Ouml;"}, // Ö - uppercase O, umlaut
{"\u00D7", "&times;"}, // multiplication sign
according to http://www.fileformat.info/info/unicode/block/latin_supplement/list.htm
First look:
The entry for U+00CA is missing from the array, so every following entry is off by one.
Found on http://stackoverflow.com/questions/4172784/bug-in-apache-commons-stringescapeutil/4172915#4172915
[jira] Resolved: (LANG-658) Some Entitys like Ö are not
matched properly against its ISO8859-1 representation
Posted by "Sebb (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebb resolved LANG-658.
-----------------------
Resolution: Fixed
Fix Version/s: 3.0
Now hopefully fixed.
[jira] Commented: (LANG-658) Some Entitys like Ö are not
matched properly against its ISO8859-1 representation
Posted by "Sebb (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931709#action_12931709 ]
Sebb commented on LANG-658:
---------------------------
Note: ran a check comparing the values against the ones from lang2 Entities, and the two implementations now seem to agree.
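The thread does not show how that check was written. One way such a comparison could look is sketched below; the reference map is a hypothetical stand-in for values taken from the lang2 Entities class, and all names are invented for illustration. The sample data is the duplicated E5 entry reported elsewhere in this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TableComparison {
    // Lists every table entry whose entity differs from the reference value
    // for the same character. Only checks table entries against the reference,
    // not the other direction.
    static List<String> mismatches(String[][] table, Map<String, String> reference) {
        List<String> problems = new ArrayList<>();
        for (String[] pair : table) {
            String expected = reference.get(pair[0]);
            if (!pair[1].equals(expected)) {
                problems.add(String.format("U+%04X: %s vs %s",
                        (int) pair[0].charAt(0), pair[1], expected));
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Hypothetical reference values standing in for the older implementation.
        Map<String, String> reference = new HashMap<>();
        reference.put("\u00E4", "&auml;");
        reference.put("\u00E5", "&aring;");

        String[][] reported = {
            {"\u00E5", "&auml;"},
            {"\u00E5", "&aring;"},
        };
        System.out.println(mismatches(reported, reference)); // prints [U+00E5: &auml; vs &aring;]
    }
}
```

A fuller check would also confirm that every reference key appears in the table, which would catch the missing U+00E4 entry in this excerpt.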
[jira] Updated: (LANG-658) Some Entitys like Ö are not matched
properly against its ISO8859-1 representation
Posted by "Michael Konietzka (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Konietzka updated LANG-658:
-----------------------------------
Fix Version/s: (was: 3.0)
Description:
In EntityArrays, in
private static final String[][] ISO8859_1_ESCAPE
some of the mappings are wrong, for example
{"\u00D7", "&Ouml;"}, // Ö - uppercase O, umlaut
{"\u00D8", "&times;"}, // multiplication sign
but these must be
{"\u00D6", "&Ouml;"}, // Ö - uppercase O, umlaut
{"\u00D7", "&times;"}, // multiplication sign
according to http://www.fileformat.info/info/unicode/block/latin_supplement/list.htm
First look:
The entry for U+00CA is missing from the array, so every following entry is off by one.
was:
In EntityArrays, in
private static final String[][] ISO8859_1_ESCAPE
some of the mappings are wrong, for example
{"\u00D7", "&Ouml;"}, // Ö - uppercase O, umlaut
{"\u00D8", "&times;"}, // multiplication sign
but these must be
{"\u00D6", "&Ouml;"}, // Ö - uppercase O, umlaut
{"\u00D7", "&times;"}, // multiplication sign
according to http://www.fileformat.info/info/unicode/block/latin_supplement/list.htm
[jira] Closed: (LANG-658) Some Entitys like Ö are not matched
properly against its ISO8859-1 representation
Posted by "Michael Konietzka (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Konietzka closed LANG-658.
----------------------------------
[jira] Updated: (LANG-658) Some Entitys like Ö are not matched
properly against its ISO8859-1 representation
Posted by "Michael Konietzka (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Konietzka updated LANG-658:
-----------------------------------
Description:
In EntityArrays, in
private static final String[][] ISO8859_1_ESCAPE
some of the mappings are wrong, for example
{"\u00D7", "&Ouml;"}, // Ö - uppercase O, umlaut
{"\u00D8", "&times;"}, // multiplication sign
but these must be
{"\u00D6", "&Ouml;"}, // Ö - uppercase O, umlaut
{"\u00D7", "&times;"}, // multiplication sign
according to http://www.fileformat.info/info/unicode/block/latin_supplement/list.htm
First look:
The entry for U+00CA is missing from the array, so every following entry is off by one.
Found on http://stackoverflow.com/questions/4172784/bug-in-apache-commons-stringescapeutil/4172915#4172915
was:
In EntityArrays, in
private static final String[][] ISO8859_1_ESCAPE
some of the mappings are wrong, for example
{"\u00D7", "&Ouml;"}, // Ö - uppercase O, umlaut
{"\u00D8", "&times;"}, // multiplication sign
but these must be
{"\u00D6", "&Ouml;"}, // Ö - uppercase O, umlaut
{"\u00D7", "&times;"}, // multiplication sign
according to http://www.fileformat.info/info/unicode/block/latin_supplement/list.htm
First look:
The entry for U+00CA is missing from the array, so every following entry is off by one.
[jira] Commented: (LANG-658) Some Entitys like Ö are not
matched properly against its ISO8859-1 representation
Posted by "Sebb (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931691#action_12931691 ]
Sebb commented on LANG-658:
---------------------------
Later on, there are two instances of E5:
{"\u00E5", "&auml;"}, // ä - lowercase a, umlaut
{"\u00E5", "&aring;"}, // å - lowercase a, ring
The latter is correct, and subsequent entries seem OK.
[jira] Commented: (LANG-658) Some Entitys like Ö are not
matched properly against its ISO8859-1 representation
Posted by "Sebb (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931693#action_12931693 ]
Sebb commented on LANG-658:
---------------------------
Another duplicate entry:
{"\u00F1", "&ntilde;"}, // ñ - lowercase n, tilde
{"\u00F3", "&ograve;"}, // ò - lowercase o, grave accent
{"\u00F3", "&oacute;"}, // ó - lowercase o, acute accent
The first F3 entry should be F2.
[jira] Issue Comment Edited: (LANG-658) Some Entitys like Ö
are not matched properly against its ISO8859-1 representation
Posted by "Sebb (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LANG-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931691#action_12931691 ]
Sebb edited comment on LANG-658 at 11/13/10 11:19 AM:
------------------------------------------------------
Later on, there are two instances of E5:
{noformat}
{"\u00E5", "&auml;"}, // ä - lowercase a, umlaut
{"\u00E5", "&aring;"}, // å - lowercase a, ring
{noformat}
The latter is correct, and subsequent entries seem OK.
was (Author: sebb@apache.org):
Later on, there are two instances of E5:
{"\u00E5", "&auml;"}, // ä - lowercase a, umlaut
{"\u00E5", "&aring;"}, // å - lowercase a, ring
The latter is correct, and subsequent entries seem OK.