Posted to mime4j-dev@james.apache.org by "Stefano Bagnara (JIRA)" <mi...@james.apache.org> on 2011/06/21 12:38:47 UTC

[jira] [Commented] (MIME4J-116) Avoid duplicate parsing of header fields

    [ https://issues.apache.org/jira/browse/MIME4J-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052463#comment-13052463 ] 

Stefano Bagnara commented on MIME4J-116:
----------------------------------------

Hi Oleg, I finally had time to review this code. I can't understand why FieldParser is needed throughout the AbstractEntity/MimeEntity chain, as it is used only once, just before the MutableBodyDescriptor.addField() call. Since MutableBodyDescriptor is already pluggable, why don't we simply leave the parsing job to that object?

This way the FieldParser interface can still live in the dom package, together with the rest of the pluggable field-parsing code and the "advanced" MutableBodyDescriptor implementations.

In fact, using DefaultBodyDescriptor with a FieldParser doesn't currently make sense: the field is parsed once by the FieldParser, and then the parsed data is ignored by, for example, the DefaultBodyDescriptor.parseContentType method, which recreates a RawField from the parsed field.

If we move the FieldParser logic into the body descriptor, the design becomes clearer and the code sits where it is actually used. It also lets MutableBodyDescriptor.addField be tied more closely to the RawField type, since we know it works on raw data (it doesn't make sense to accept a Field and then wrap a non-RawField in a new RawField, giving the user a false sense of optimization).

Also, moving FieldParser to dom will let us change its signature from "FieldParser<T extends Field>" to the stricter "FieldParser<T extends ParsedField>".
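
To make the idea more concrete, here is a minimal sketch of a body descriptor that owns the parsing step and accepts only raw fields. All types below are simplified stand-ins written for illustration (ParsingBodyDescriptor, ContentTypeField, SketchDemo and the toy Content-Type parser are hypothetical names); they are not the actual mime4j classes or signatures, and the real change may look different.

```java
import java.util.Locale;

// Hypothetical stand-in for a raw, unparsed header field (not the mime4j class).
final class RawField {
    private final String name;
    private final String body;
    RawField(String name, String body) { this.name = name; this.body = body; }
    String getName() { return name; }
    String getBody() { return body; }
}

// Hypothetical stand-in for a field whose body has already been parsed.
interface ParsedField {
    String getName();
}

// Hypothetical parsed representation of a Content-Type header.
final class ContentTypeField implements ParsedField {
    private final String mimeType;
    ContentTypeField(String mimeType) { this.mimeType = mimeType; }
    public String getName() { return "Content-Type"; }
    String getMimeType() { return mimeType; }
}

// The stricter bound: a parser can only produce ParsedField subtypes.
interface FieldParser<T extends ParsedField> {
    T parse(RawField raw);
}

// The descriptor takes RawField only and does the parsing itself, exactly once.
final class ParsingBodyDescriptor {
    private final FieldParser<ContentTypeField> contentTypeParser;
    private String mimeType = "text/plain"; // assumed default for this sketch

    ParsingBodyDescriptor(FieldParser<ContentTypeField> contentTypeParser) {
        this.contentTypeParser = contentTypeParser;
    }

    void addField(RawField raw) {
        if ("content-type".equalsIgnoreCase(raw.getName())) {
            mimeType = contentTypeParser.parse(raw).getMimeType();
        }
    }

    String getMimeType() { return mimeType; }
}

final class SketchDemo {
    // Toy parser: keeps the text before the first ';' and lowercases it.
    static FieldParser<ContentTypeField> simpleContentTypeParser() {
        return raw -> {
            String body = raw.getBody();
            int semicolon = body.indexOf(';');
            String type = (semicolon >= 0 ? body.substring(0, semicolon) : body).trim();
            return new ContentTypeField(type.toLowerCase(Locale.ROOT));
        };
    }

    public static void main(String[] args) {
        ParsingBodyDescriptor descriptor =
                new ParsingBodyDescriptor(simpleContentTypeParser());
        descriptor.addField(new RawField("Content-Type", "Text/HTML; charset=utf-8"));
        System.out.println(descriptor.getMimeType()); // prints "text/html"
    }
}
```

With this shape the entity chain never needs to see a FieldParser: it hands raw fields to the descriptor, and whoever plugs in the descriptor decides which parser (if any) is used.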

I tried to generate a diff, but it is hard to read. I will try to put the change in a branch to better show what I mean and to let you review it.

> Avoid duplicate parsing of header fields
> ----------------------------------------
>
>                 Key: MIME4J-116
>                 URL: https://issues.apache.org/jira/browse/MIME4J-116
>             Project: JAMES Mime4j
>          Issue Type: Improvement
>    Affects Versions: 0.6
>            Reporter: Markus Wiederkehr
>             Fix For: 0.7
>
>
> Currently some header fields are parsed twice when building a DOM. Once by DefaultBodyDescriptor or MaximalBodyDescriptor and a second time by MessageBuilder using Field.parse().
> Also different parsers are used in both stages. The body descriptors use handcrafted parsers whereas Field.parse uses JavaCC generated parsers. The handcrafted version does not seem to handle comments in a header correctly.
> The situation should be improved by parsing a header field only once and passing that already parsed field to a content handler. Also only one sort of field parser should be used; either handcrafted or generated. My personal opinion is that it might be easier for a handcrafted parser to be more tolerant of malformed header fields.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira