Posted to issues@commons.apache.org by "Henri Yandell (JIRA)" <ji...@apache.org> on 2009/07/07 09:16:14 UTC
[jira] Commented: (LANG-293) StringEscapeUtils.unescape* can be faster
[ https://issues.apache.org/jira/browse/LANG-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727952#action_12727952 ]
Henri Yandell commented on LANG-293:
------------------------------------
This needs to be rethought after the rewrite of the Entities class into text.translate. However, the idea still holds. One possibility is an optimization for LookupTranslators: they could optionally define a set of characters whose absence from the input makes them short-circuit. For example, if the input contains no & and no ; then the translator short-circuits.
Alternatively, that might be a different translator, an OptimizationUnlessTranslator: if it can't find the passed-in characters, it passes the whole string through unchanged.
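A rough sketch of the wrapper idea, just to make it concrete. Everything here is hypothetical: ShortCircuitTranslator, slowUnescape and UNESCAPE are made-up names, and the sketch uses java.util.function.UnaryOperator rather than the real text.translate classes, so it says nothing about how the actual API would look. It also assumes that the absence of any one of the required characters is enough to skip the work, which holds for entities because each one needs both & and ;.

```java
import java.util.function.UnaryOperator;

final class ShortCircuitSketch {

    /**
     * Hypothetical wrapper: if any required character is missing from the
     * input, return the input untouched instead of calling the delegate.
     */
    static final class ShortCircuitTranslator implements UnaryOperator<String> {
        private final UnaryOperator<String> delegate;
        private final char[] required;

        ShortCircuitTranslator(UnaryOperator<String> delegate, char... required) {
            this.delegate = delegate;
            this.required = required.clone();
        }

        @Override
        public String apply(String input) {
            for (char c : required) {
                if (input.indexOf(c) < 0) {
                    // Nothing to translate: hand back the same instance, no copying.
                    return input;
                }
            }
            return delegate.apply(input);
        }
    }

    /** Toy stand-in for a lookup-based unescaper; handles only three entities. */
    static String slowUnescape(String s) {
        // &amp; is replaced last so that "&amp;lt;" becomes "&lt;" and not "<".
        return s.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&");
    }

    /** The toy unescaper guarded by the two characters every entity needs. */
    static final UnaryOperator<String> UNESCAPE =
            new ShortCircuitTranslator(ShortCircuitSketch::slowUnescape, '&', ';');

    public static void main(String[] args) {
        String plain = "no entities here";
        // The guard returns the very same String object when it short-circuits.
        System.out.println(UNESCAPE.apply(plain) == plain);
        System.out.println(UNESCAPE.apply("a &lt; b"));
    }
}
```

The check costs one indexOf scan per required character, which should be cheap next to a char-by-char copy, but that would need measuring against the real translators.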
> StringEscapeUtils.unescape* can be faster
> -----------------------------------------
>
> Key: LANG-293
> URL: https://issues.apache.org/jira/browse/LANG-293
> Project: Commons Lang
> Issue Type: Improvement
> Affects Versions: Nightly Builds
> Reporter: Stepan Koltsov
> Fix For: 3.0
>
> Attachments: commons-lang-unescape-performace2-stepancheg-2006-10-31.diff, EntitiesPerformance2TestSecret.java
>
>
> A typical string that needs to be unescaped contains almost no XML entities, so copying the input string to the output buffer char by char is slow.
> I've refactored Entities.unescape() so it works faster. I'm going to submit a patch and tests.
> The patch contains both the hacked and original versions of unescape, so the tests can be run.
> Tests show that performance remains the same on short strings and is much better on large strings with rare entities.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.