Posted to issues@commons.apache.org by "Rob Tompkins (JIRA)" <ji...@apache.org> on 2017/02/18 14:53:44 UTC

[jira] [Commented] (TEXT-40) Escape HTML characters only once

    [ https://issues.apache.org/jira/browse/TEXT-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873194#comment-15873194 ] 

Rob Tompkins commented on TEXT-40:
----------------------------------

Interesting point. I think I may have a mechanism that can be used to keep the escaping process idempotent. Either way, though, it seems like a good topic for a dev list discussion.

I'll see if I can formulate the mechanics of how to maintain idempotency and bubble that up to the dev list. Personally, I'm relatively indifferent about keeping the functionality; I can, however, see an argument both ways: keeping vs. removing.

> Escape HTML characters only once
> --------------------------------
>
>                 Key: TEXT-40
>                 URL: https://issues.apache.org/jira/browse/TEXT-40
>             Project: Commons Text
>          Issue Type: Improvement
>            Reporter: Sampanna Kahu
>            Assignee: Rob Tompkins
>            Priority: Minor
>              Labels: features, newbie
>
> If already escaped HTML characters are in the input text, they get escaped again by StringEscapeUtils.escapeHtml4().
> For example:
> If the input is:
> 100kg &lt; 1000kg
> Then the output of escapeHtml4() becomes:
> 100kg &amp;lt; 1000kg
> At my workplace, we felt the need for a method in StringEscapeUtils which does not escape already escaped characters.
> I have attempted to create this method. Creating a pull request soon.
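
One possible way to get the "escape only once" behaviour from the issue description is sketched below. It is not the Commons Text API and not the pull request mentioned above: the class name EscapeOnce, the method name escapeHtmlOnce, and the entity pattern are all hypothetical, chosen for illustration. The sketch handles only the four basic characters and leaves untouched any '&' that already starts something shaped like a named or numeric character reference.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of Commons Text: escapes &, <, > and "
// while leaving text that already looks like an HTML entity untouched.
final class EscapeOnce {

    // Assumption: anything shaped like &name; &#123; or &#x1F; counts as
    // "already escaped". The name is not checked against the HTML entity list.
    private static final Pattern ENTITY =
            Pattern.compile("&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);");

    private EscapeOnce() { }

    static String escapeHtmlOnce(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        Matcher m = ENTITY.matcher(input);
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '&') {
                // Check whether an entity starts exactly at position i.
                m.region(i, input.length());
                if (m.lookingAt()) {
                    // Copy the existing entity through unchanged.
                    out.append(input, i, m.end());
                    i = m.end();
                    continue;
                }
                out.append("&amp;");
            } else if (c == '<') {
                out.append("&lt;");
            } else if (c == '>') {
                out.append("&gt;");
            } else if (c == '"') {
                out.append("&quot;");
            } else {
                out.append(c);
            }
            i++;
        }
        return out.toString();
    }
}
```

With this sketch, applying escapeHtmlOnce to its own output returns the same string, which is the idempotency property discussed in the comment. The trade-off is that input which is meant to display a literal entity (for example, documentation about HTML escaping) can no longer be distinguished from input that was already escaped, which is one argument for discussing the feature on the dev list first.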



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)