Posted to mime4j-dev@james.apache.org by "Benoit Tellier (Jira)" <mi...@james.apache.org> on 2021/04/16 03:10:00 UTC

[jira] [Updated] (MIME4J-283) DecoderUtil performance fix

     [ https://issues.apache.org/jira/browse/MIME4J-283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benoit Tellier updated MIME4J-283:
----------------------------------
    Fix Version/s: 0.8.3

> DecoderUtil performance fix
> ---------------------------
>
>                 Key: MIME4J-283
>                 URL: https://issues.apache.org/jira/browse/MIME4J-283
>             Project: James Mime4j
>          Issue Type: Improvement
>          Components: parser (core)
>    Affects Versions: master, 0.8.2
>            Reporter: Dmitry Potapov
>            Priority: Minor
>             Fix For: 0.8.3
>
>         Attachments: patch
>
>
> DecoderUtil currently uses the following regex pattern for RFC 2047 encoded words:
> {code:java}
> "(.*?)=\\?(.+?)\\?(\\w)\\?(.*?)\\?="
> {code}
> The first capturing group, {{(.*?)}}, is very expensive: because it is reluctant, the regex engine tries to match the next pattern node at every input character. As a result, decoding a 4 KB input (a {{To:}} field with 40-80 recipients) takes up to 200 ms on modern CPUs.
> However, this capturing group is used only to store the separator text between encoded words. The proposed patch reuses the existing {{tailIndex}} to extract the separator text, and decoding the same input now takes only 1-2 ms.
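
For illustration, the idea described above can be sketched roughly as follows. This is a minimal sketch and not the attached patch: the class {{EncodedWordScan}} and both method names are hypothetical, and only the pattern string and the name {{tailIndex}} come from the issue. Any flags the real pattern is compiled with are omitted here. The sketch drops the leading {{(.*?)}} group, lets {{Matcher.find()}} locate each encoded word, and takes the separator text from the gap between the previous match end and the current match start. A second method uses the original pattern so the two results can be compared.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; this is NOT the Mime4j DecoderUtil source.
final class EncodedWordScan {
    // Pattern quoted in the issue: group 1 captures the separator text.
    static final Pattern WITH_PREFIX =
            Pattern.compile("(.*?)=\\?(.+?)\\?(\\w)\\?(.*?)\\?=");

    // Same pattern without the leading reluctant group (assumed shape of the fix).
    static final Pattern WITHOUT_PREFIX =
            Pattern.compile("=\\?(.+?)\\?(\\w)\\?(.*?)\\?=");

    // Returns separator, encoded word, separator, ..., trailing text.
    static List<String> split(String input) {
        List<String> parts = new ArrayList<>();
        int tailIndex = 0; // end offset of the previous encoded word
        Matcher m = WITHOUT_PREFIX.matcher(input);
        while (m.find()) {
            // Separator text is whatever lies between the two matches.
            parts.add(input.substring(tailIndex, m.start()));
            parts.add(m.group()); // the raw encoded word, left undecoded
            tailIndex = m.end();
        }
        parts.add(input.substring(tailIndex)); // text after the last encoded word
        return parts;
    }

    // Same splitting, but the separator comes from capturing group 1.
    static List<String> splitWithPrefix(String input) {
        List<String> parts = new ArrayList<>();
        int tailIndex = 0;
        Matcher m = WITH_PREFIX.matcher(input);
        while (m.find()) {
            parts.add(m.group(1));
            parts.add(input.substring(m.end(1), m.end()));
            tailIndex = m.end();
        }
        parts.add(input.substring(tailIndex));
        return parts;
    }
}
```

Both methods should return the same pieces for single-line input such as a {{To:}} header value. The sketch only splits the input and does not decode the encoded words or measure timing.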



--
This message was sent by Atlassian Jira
(v8.3.4#803005)