Posted to notifications@ofbiz.apache.org by "Jacques Le Roux (JIRA)" <ji...@apache.org> on 2017/11/27 15:04:00 UTC

[jira] [Created] (OFBIZ-10023) Replace org.apache.commons.lang.StringEscapeUtils.unescapeHtml() method by org.jsoup.parser.Parser.unescapeEntities()

Jacques Le Roux created OFBIZ-10023:
---------------------------------------

             Summary: Replace org.apache.commons.lang.StringEscapeUtils.unescapeHtml() method by org.jsoup.parser.Parser.unescapeEntities()
                 Key: OFBIZ-10023
                 URL: https://issues.apache.org/jira/browse/OFBIZ-10023
             Project: OFBiz
          Issue Type: Improvement
          Components: framework
    Affects Versions: Trunk
            Reporter: Jacques Le Roux
            Assignee: Jacques Le Roux
             Fix For: Upcoming Release


[~mleila] from Nereide ran into an issue using org.apache.commons.lang.StringEscapeUtils.unescapeHtml() and wrote a custom method instead. While reviewing her code, I saw from https://stackoverflow.com/questions/599634/convert-html-character-back-to-text-using-java-standard-library that jsoup could be used, and I asked her to try jsoup instead. She told me that her method was inspired by https://stackoverflow.com/questions/994331/java-how-to-unescape-html-character-entities-in-java, confirmed that jsoup worked, and replaced her custom method with a call to org.jsoup.parser.Parser.unescapeEntities().

So I added my two cents on Stack Overflow and decided to replace the org.apache.commons.lang.StringEscapeUtils.unescapeHtml() method with org.jsoup.parser.Parser.unescapeEntities() in OFBiz.

After reading https://jsoup.org/apidocs/org/jsoup/parser/Parser.html#unescapeEntities-java.lang.String-boolean- and https://www.programcreek.com/java-api-examples/index.php?source_dir=CN1ML-NetbeansModule-master/CN1MLParser/jsoup/src/main/java/org/jsoup/parser/Parser.java#unescapeEntities-String-string-boolean-inAttribute, I decided to use strict mode in all cases in OFBiz, because we may run into cases like the one mentioned at the top of WidgetWorker.buildHyperlinkUrl(), or labels like the following (there are a few other cases):

{code}
    <property key="ScrumTab">
        <value xml:lang="en">&#160;&#160;&#160;&#160;&#160;</value>
    </property>
{code}

BTW, I really wonder about this one. I guess using & nbsp; did not work, so & #160; was used. I did not check whether it is the best way to achieve what it is used for...
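
A possible explanation for the guess above: XML predefines only five named entities (amp, lt, gt, quot, apos), so a label file with no DTD declaring nbsp cannot use the named form, while the numeric character reference for U+00A0 is always allowed. Below is a minimal sketch that checks this with the JDK's own XML parser. The class name NbspInXmlLabel and the helper textOrNull() are hypothetical, made up for this illustration; they are not part of OFBiz, and this is not how OFBiz loads its labels.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of OFBiz.
class NbspInXmlLabel {

    // Returns the text content of the root element, or null when the
    // snippet is not well-formed XML.
    static String textOrNull(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Numeric character reference, as used in the ScrumTab label.
        String numeric = textOrNull("<value xml:lang=\"en\">&#160;&#160;</value>");
        // Named HTML entity, which no DTD declares in this snippet.
        String named = textOrNull("<value xml:lang=\"en\">&nbsp;&nbsp;</value>");

        System.out.println("numeric reference parsed: "
                + "\u00A0\u00A0".equals(numeric));
        System.out.println("named entity parsed: " + (named != null));
    }
}
```

With the JDK's default parser, the first line should report true and the second false; the parser may also print a fatal error about the undeclared entity on stderr. Whether non-breaking spaces are the best way to get the spacing the ScrumTab label wants is a separate question that this sketch does not answer.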



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)