Posted to user@commons.apache.org by Burt Leung <bu...@gmail.com> on 2010/12/18 09:07:49 UTC

Bug with StringEscapeUtilities' escapeHTML/unescapeHTML for certain characters

Hello,

I recently used StringEscapeUtils to encode/decode a character
into its equivalent HTML entity. Although I haven't used it much, I
noticed that a couple of cases in particular seem "wrong".

case1: StringEscapeUtils.escapeHtml4("ä")
This appears to give "&atilde;". This should actually be "&auml;".

case2: StringEscapeUtils.escapeHtml4("å");
This gives "&aring;" but should actually be "&atilde;".

Using the unescape function also gives the (incorrect) reverse results.

Is this an actual bug or am I missing something?

Thanks,
Burt

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org


Re: Bug with StringEscapeUtilities' escapeHTML/unescapeHTML for certain characters

Posted by Henri Yandell <fl...@gmail.com>.
Yes, the 3.0-beta is from the beginning of August.

There have been 20 fixes since then:

https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&&pid=12310481&updated%3Aafter=4%2FAug%2F10&status=5&status=6&fixfor=12311714&resolution=1&sorter/field=updated&sorter/order=DESC

Hen

On Mon, Dec 20, 2010 at 8:50 AM, Burt Leung <bu...@gmail.com> wrote:
> Hi Sebb,
>
> I observed this error ("ä" equated with "&atilde;") in the JAR file
> from http://mirrors.axint.net/apache//commons/lang/binaries/commons-lang3-3.0-beta-bin.tar.gz.
> This is the latest 3.0-beta download that I can see on
> http://commons.apache.org/lang/download_lang.cgi.
>
> Could it be that the fix wasn't merged into this build yet?
>
> Thanks,
> Burt
>
>
> On Sat, Dec 18, 2010 at 4:38 AM, sebb <se...@gmail.com> wrote:
>> On 18 December 2010 08:07, Burt Leung <bu...@gmail.com> wrote:
>>> Hello,
>>>
>>> I recently used StringEscapeUtils to encode/decode a character
>>> into its equivalent HTML entity. Although I haven't used it much, I
>>> noticed that a couple of cases in particular seem "wrong".
>>>
>>> case1: StringEscapeUtils.escapeHtml4("ä")
>>> This appears to give "&atilde;". This should actually be "&auml;".
>>>
>>> case2: StringEscapeUtils.escapeHtml4("å");
>>> This gives "&aring;" but should actually be "&atilde;".
>>>
>>> Using the unescape function also gives the (incorrect) reverse results.
>>>
>>> Is this an actual bug or am I missing something?
>>
>> See:
>> https://issues.apache.org/jira/browse/LANG-658
>>
>>> Thanks,
>>> Burt



Re: Bug with StringEscapeUtilities' escapeHTML/unescapeHTML for certain characters

Posted by Burt Leung <bu...@gmail.com>.
Hi Sebb,

I observed this error ("ä" equated with "&atilde;") in the JAR file
from http://mirrors.axint.net/apache//commons/lang/binaries/commons-lang3-3.0-beta-bin.tar.gz.
This is the latest 3.0-beta download that I can see on
http://commons.apache.org/lang/download_lang.cgi.

Could it be that the fix wasn't merged into this build yet?

Thanks,
Burt


On Sat, Dec 18, 2010 at 4:38 AM, sebb <se...@gmail.com> wrote:
> On 18 December 2010 08:07, Burt Leung <bu...@gmail.com> wrote:
>> Hello,
>>
>> I recently used StringEscapeUtils to encode/decode a character
>> into its equivalent HTML entity. Although I haven't used it much, I
>> noticed that a couple of cases in particular seem "wrong".
>>
>> case1: StringEscapeUtils.escapeHtml4("ä")
>> This appears to give "&atilde;". This should actually be "&auml;".
>>
>> case2: StringEscapeUtils.escapeHtml4("å");
>> This gives "&aring;" but should actually be "&atilde;".
>>
>> Using the unescape function also gives the (incorrect) reverse results.
>>
>> Is this an actual bug or am I missing something?
>
> See:
> https://issues.apache.org/jira/browse/LANG-658
>
>> Thanks,
>> Burt



Re: Bug with StringEscapeUtilities' escapeHTML/unescapeHTML for certain characters

Posted by sebb <se...@gmail.com>.
On 18 December 2010 08:07, Burt Leung <bu...@gmail.com> wrote:
> Hello,
>
> I recently used StringEscapeUtils to encode/decode a character
> into its equivalent HTML entity. Although I haven't used it much, I
> noticed that a couple of cases in particular seem "wrong".
>
> case1: StringEscapeUtils.escapeHtml4("ä")
> This appears to give "&atilde;". This should actually be "&auml;".
>
> case2: StringEscapeUtils.escapeHtml4("å");
> This gives "&aring;" but should actually be "&atilde;".
>
> Using the unescape function also gives the (incorrect) reverse results.
>
> Is this an actual bug or am I missing something?

See:
https://issues.apache.org/jira/browse/LANG-658

> Thanks,
> Burt



Re: Bug with StringEscapeUtilities' escapeHTML/unescapeHTML for certain characters

Posted by Dennis Lundberg <de...@apache.org>.
On 2010-12-18 09:07, Burt Leung wrote:
> Hello,
> 
> I recently used StringEscapeUtils to encode/decode a character
> into its equivalent HTML entity. Although I haven't used it much, I
> noticed that a couple of cases in particular seem "wrong".
>
> case1: StringEscapeUtils.escapeHtml4("ä")
> This appears to give "&atilde;". This should actually be "&auml;".
>
> case2: StringEscapeUtils.escapeHtml4("å");
> This gives "&aring;" but should actually be "&atilde;".

This one is correct: "å" should give "&aring;".
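
For reference, the HTML 4 Latin-1 entity set maps ã (U+00E3) to
"&atilde;", ä (U+00E4) to "&auml;" and å (U+00E5) to "&aring;". Below is
a minimal sketch of those three expected mappings. It is not the Commons
Lang implementation: the class Latin1EntitySketch and its escape/unescape
methods are hypothetical names made up for illustration, and the table
holds only the three characters discussed in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT part of Commons Lang. It hard-codes a
// three-entry slice of the HTML 4 Latin-1 entity table so the expected
// mappings can be checked without any library on the classpath.
class Latin1EntitySketch {
    private static final Map<Character, String> ENTITIES = new LinkedHashMap<>();
    static {
        ENTITIES.put('\u00E3', "&atilde;"); // a with tilde
        ENTITIES.put('\u00E4', "&auml;");   // a with diaeresis
        ENTITIES.put('\u00E5', "&aring;");  // a with ring above
    }

    // Replaces each known character with its entity; others pass through.
    static String escape(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String entity = ENTITIES.get(c);
            out.append(entity != null ? entity : String.valueOf(c));
        }
        return out.toString();
    }

    // Replaces each known entity with its character; other text passes through.
    static String unescape(String input) {
        String result = input;
        for (Map.Entry<Character, String> e : ENTITIES.entrySet()) {
            result = result.replace(e.getValue(), String.valueOf(e.getKey()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(escape("\u00E4")); // prints &auml;
        System.out.println(escape("\u00E5")); // prints &aring;
        // Print the code point in hex to avoid console encoding issues.
        System.out.println(Integer.toHexString(unescape("&atilde;").charAt(0))); // prints e3
    }
}
```

Comparing the output of a given commons-lang3 build against a table like
this is one way to tell whether that build still has the wrong mapping.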

> Using the unescape function also gives the (incorrect) reverse results.
> 
> Is this an actual bug or am I missing something?
> 
> Thanks,
> Burt


-- 
Dennis Lundberg
