Posted to issues@commons.apache.org by "Sebb (JIRA)" <ji...@apache.org> on 2017/02/06 17:35:41 UTC
[jira] [Reopened] (TEXT-40) Escape HTML characters only once
[ https://issues.apache.org/jira/browse/TEXT-40?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebb reopened TEXT-40:
----------------------
How does one know whether the input has already been escaped or not?
If the input only contains unescaped characters then it has not been escaped.
But if there is a mixture, or if the entire input consists of escaped entities, how can one know?
For example the input could be text explaining how to escape an ampersand.
I don't think these methods make any sense; the only way to be sure whether input has been escaped or not is to keep track of the state in the application.
As an analogy, how about a method that multiplies numbers by 10 unless they have already been multiplied by 10?
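To make the ambiguity concrete, here is a minimal sketch. It does not use Commons Text: escape() is a hand-rolled stand-in for escapeHtml4() that handles only &, < and >, and escapeOnce() is a hypothetical "skip what already looks like an entity" heuristic written for illustration, not the method proposed in the pull request.

```java
public class EscapeOnceDemo {

    // Hypothetical stand-in for escapeHtml4(): escapes only &, < and >.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // Hypothetical "escape only once" heuristic: an '&' that already
    // starts one of the three known entities is left untouched.
    static String escapeOnce(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '&') {
                boolean looksEscaped = s.startsWith("&amp;", i)
                        || s.startsWith("&lt;", i)
                        || s.startsWith("&gt;", i);
                out.append(looksEscaped ? "&" : "&amp;");
            } else if (c == '<') {
                out.append("&lt;");
            } else if (c == '>') {
                out.append("&gt;");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Raw, not-yet-escaped text that explains how to escape an ampersand.
        String raw = "Write &amp; to show an ampersand";

        // Plain escaping keeps the literal text intact when rendered.
        System.out.println(escape(raw));

        // The heuristic assumes the input was already escaped and leaves it
        // alone, so a browser would render "Write & to show an ampersand".
        System.out.println(escapeOnce(raw));
    }
}
```

The heuristic cannot tell this raw input apart from input that was escaped earlier; both are the same string. Only the caller knows which one it holds.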
> Escape HTML characters only once
> --------------------------------
>
> Key: TEXT-40
> URL: https://issues.apache.org/jira/browse/TEXT-40
> Project: Commons Text
> Issue Type: Improvement
> Reporter: Sampanna Kahu
> Assignee: Rob Tompkins
> Priority: Minor
> Labels: features, newbie
>
> If already-escaped HTML characters are in the input text, they get escaped again by StringEscapeUtils.escapeHtml4().
> For example:
> If the input is:
> 100kg & l t ; 1000kg <without the spaces>
> Then the output of escapeHtml4() becomes:
> 100kg & amp ; l t ; 1000kg <without the spaces>
> At my workplace, we felt the need for a method in StringEscapeUtils which does not escape already escaped characters.
> I have attempted to create this method. Creating a pull request soon.