Posted to issues@commons.apache.org by "Henri Yandell (JIRA)" <ji...@apache.org> on 2009/07/07 09:16:14 UTC

[jira] Commented: (LANG-293) StringEscapeUtils.unescape* can be faster

    [ https://issues.apache.org/jira/browse/LANG-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727952#action_12727952 ] 

Henri Yandell commented on LANG-293:
------------------------------------

This needs to be rethought after the rewrite of the Entities class into text.translate. However, the idea still holds. One possibility is an optimization for LookupTranslators: they could optionally define a set of characters whose absence from the input makes them short-circuit. For example, if the input contains no & and ; then the translator short-circuits.

Alternatively, that might be a separate translator, say an OptimizationUnlessTranslator: if it can't find the passed-in characters, it passes the whole string through unchanged.
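
A minimal sketch of that wrapper idea follows. It is deliberately independent of Commons Lang: the names ShortCircuitSketch, UnlessTranslator and toyUnescape are hypothetical, and the delegate is modelled as a plain UnaryOperator<String> rather than a real translator from text.translate. It also assumes one particular reading of the proposal, namely that every listed character must be present before the delegate is worth running (an XML entity needs both & and ;).

```java
import java.util.function.UnaryOperator;

/** Hypothetical illustration only; none of these types exist in Commons Lang. */
public class ShortCircuitSketch {

    /**
     * Wraps a translation function and skips it entirely when the input
     * lacks any of the required characters.
     */
    static final class UnlessTranslator implements UnaryOperator<String> {
        private final UnaryOperator<String> delegate;
        private final char[] required;

        UnlessTranslator(UnaryOperator<String> delegate, char... required) {
            this.delegate = delegate;
            this.required = required.clone();
        }

        @Override
        public String apply(String input) {
            for (char c : required) {
                if (input.indexOf(c) < 0) {
                    // Assumption: a missing required character means the
                    // delegate has nothing to do, so return the input
                    // itself without copying it.
                    return input;
                }
            }
            return delegate.apply(input);
        }
    }

    /** Toy stand-in for a real unescaper; handles only three entities. */
    static String toyUnescape(String s) {
        // "&amp;" is replaced last so that "&amp;lt;" becomes "&lt;", not "<".
        return s.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&");
    }
}
```

With new ShortCircuitSketch.UnlessTranslator(ShortCircuitSketch::toyUnescape, '&', ';'), a string without both characters comes back as the same String instance, and only strings that could contain an entity reach the delegate. Whether this belongs inside LookupTranslator or in a separate wrapper is the open design question raised above.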

> StringEscapeUtils.unescape* can be faster
> -----------------------------------------
>
>                 Key: LANG-293
>                 URL: https://issues.apache.org/jira/browse/LANG-293
>             Project: Commons Lang
>          Issue Type: Improvement
>    Affects Versions: Nightly Builds
>            Reporter: Stepan Koltsov
>             Fix For: 3.0
>
>         Attachments: commons-lang-unescape-performace2-stepancheg-2006-10-31.diff, EntitiesPerformance2TestSecret.java
>
>
> A typical string that needs to be unescaped contains almost no XML entities, so copying the input string to the output buffer char by char is slow.
> I've refactored Entities.unescape() so that it works faster. I'm going to submit a patch and tests.
> The patch contains both the hacked and the original versions of unescape, so that the tests can be run.
> The tests show that performance remains the same on short strings and is much better on large strings with rare entities.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.