Posted to server-dev@james.apache.org by Stefano Bagnara <ap...@bago.org> on 2008/07/14 12:59:08 UTC

[mime4j] boundaries and parsing

Niklas submitted this message as part of MIME4J-52
==========================
Content-Type: multipart/mixed; boundary="outer-boundary"

Outer preamble

--outer-boundary
Content-Type: text/plain

Foo

--outer-boundary
Content-Type: multipart/alternative; boundary="inner-boundary"

AAA

--outer-boundary--
Outer epilouge
============================

What is the expected result for this?

Should the parser recognize that the inner multipart never starts and 
that it has instead found another outer boundary?

At the moment mime4j gets stuck in a loop.
With the fix I proposed in a comment on that issue, the result is that 
mime4j considers:
=============================
AAA

--outer-boundary--
Outer epilouge
=============================
to be the preamble of the inner multipart, ignoring the 
"--outer-boundary--" of the outer multipart.


What is the correct/best way to handle bad multiparts like this?


Stefano

PS1: I found this discussion related to boundaries, MUAs and malformed 
mime messages:
http://mail-archives.apache.org/mod_mbox/spamassassin-dev/200409.mbox/%3C20040906010538.3CC27838B8@bugzilla.spamassassin.org%3E
This suggests that, in non-strict mode, it would be better to tolerate 
trailing spaces after boundaries, because other MUAs do that too.
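
A minimal sketch of what tolerating trailing spaces could look like, written in Python for brevity. It is not mime4j code: the function name classify_boundary_line and the strict flag are hypothetical and only illustrate the idea from the PS above.

```python
def classify_boundary_line(line, boundary, strict=False):
    """Return 'delimiter', 'close', or None for a single line.

    Hypothetical helper, not part of mime4j. In non-strict mode,
    trailing spaces and tabs after the boundary are ignored.
    """
    body = line.rstrip("\r\n")      # drop the line terminator
    if not strict:
        body = body.rstrip(" \t")   # tolerate trailing whitespace
    if body == "--" + boundary:
        return "delimiter"          # starts a new body part
    if body == "--" + boundary + "--":
        return "close"              # ends the multipart
    return None                     # ordinary content line


print(classify_boundary_line("--outer-boundary  \r\n", "outer-boundary"))
print(classify_boundary_line("--outer-boundary  \r\n", "outer-boundary", strict=True))
```

With strict=True the same padded line is treated as ordinary content, which is the behaviour the linked discussion argues against for lenient parsing.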

PS2: I remembered that we had more messages, which we removed because of 
licensing issues (https://issues.apache.org/jira/browse/MIME4J-11). I 
retrieved them and compared the mime4j 0.2 expected results with what we 
return now, and I found 3 differences (frag.msg, multi-2gifs-base64.msg, 
multi-frag.msg). I hope I will soon find the time to create simple 
example messages that reproduce the differences, so we can decide whether 
the new behaviour is the expected one or we have bugs.

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org


Re: [mime4j] boundaries and parsing

Posted by Stefano Bagnara <ap...@bago.org>.
Niklas Therning wrote:
> Stefano Bagnara wrote:
>> Niklas submitted this message as part of MIME4J-52
>> ==========================
>> Content-Type: multipart/mixed; boundary="outer-boundary"
>>
>> Outer preamble
>>
>> --outer-boundary
>> Content-Type: text/plain
>>
>> Foo
>>
>> --outer-boundary
>> Content-Type: multipart/alternative; boundary="inner-boundary"
>>
>> AAA
>>
>> --outer-boundary--
>> Outer epilouge
>> ============================
>>
>> What is the expected result for this?
>>
>> Should the parser recognize that the inner multipart never starts and 
>> that it has instead found another outer boundary?
>>
>> At the moment mime4j gets stuck in a loop.
>> With the fix I proposed in a comment on that issue, the result is 
>> that mime4j considers:
>> =============================
>> AAA
>>
>> --outer-boundary--
>> Outer epilouge
>> =============================
>> to be the preamble of the inner multipart, ignoring the 
>> "--outer-boundary--" of the outer multipart.
>>
>>
>> What is the correct/best way to handle bad multiparts like this?
> With the old version of Mime4j the result would be:
> 
> ===============================
> AAA
> 
> ===============================
> 
> this would be reported as the inner multipart's preamble while
> 
> ===============================
> Outer epilogue
> ===============================
> 
> would be reported as the outer multipart's epilogue.
> 
> Furthermore, IIRC, the old Mime4j would report an empty bodypart and 
> empty epilogue for the inner multipart. I think this is a nice feature 
> since the events generated by Mime4j would always represent a valid MIME 
> message even if the input isn't entirely valid.

I think that what you reported is also what the RFC mandates, so we 
should really revert to that behaviour.

I just tested the nested MimeBoundaryInputStream approach I tried 
locally (mentioned in MIME4J-52), and its result is equivalent to what 
you reported.
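
The nesting idea can be sketched as follows, in Python for brevity. This is not mime4j's MimeBoundaryInputStream, only an illustration under stated assumptions: split_multipart and split_part are hypothetical helpers, the input is modelled on the sample message body, and the blank line before each delimiter is left attached to the preceding content for simplicity. The inner multipart only ever sees the lines that the outer boundary has already delimited, so it cannot run past "--outer-boundary--".

```python
def split_multipart(lines, boundary):
    """Split body lines into (preamble, parts, epilogue) on one boundary."""
    delimiter = "--" + boundary
    close = "--" + boundary + "--"
    preamble, parts, epilogue = [], [], []
    current = preamble
    closed = False
    for line in lines:
        stripped = line.rstrip()
        if not closed and stripped == close:
            closed = True            # everything after this is epilogue
            current = epilogue
        elif not closed and stripped == delimiter:
            parts.append([])         # start collecting a new body part
            current = parts[-1]
        else:
            current.append(line)
    return preamble, parts, epilogue


def split_part(lines):
    """Split a body part into (headers, body) at the first blank line."""
    for i, line in enumerate(lines):
        if line.strip() == "":
            return lines[:i], lines[i + 1:]
    return lines, []


MESSAGE_BODY = """Outer preamble

--outer-boundary
Content-Type: text/plain

Foo

--outer-boundary
Content-Type: multipart/alternative; boundary="inner-boundary"

AAA

--outer-boundary--
Outer epilogue
""".splitlines()

# Outer level first: the inner part is bounded before it is parsed.
outer_preamble, outer_parts, outer_epilogue = split_multipart(
    MESSAGE_BODY, "outer-boundary")

# Inner level: parse only the lines belonging to the second outer part.
inner_headers, inner_body = split_part(outer_parts[1])
inner_preamble, inner_parts, inner_epilogue = split_multipart(
    inner_body, "inner-boundary")

print("inner preamble:", inner_preamble)
print("outer epilogue:", outer_epilogue)
```

In this sketch "AAA" ends up in the inner preamble and "Outer epilogue" in the outer epilogue. The sketch returns no inner body parts at all; whether a parser should instead report an empty body part and empty epilogue for the inner multipart is a separate design choice.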

Stefano



Re: [mime4j] boundaries and parsing

Posted by Niklas Therning <ni...@trillian.se>.
Stefano Bagnara wrote:
> Niklas submitted this message as part of MIME4J-52
> ==========================
> Content-Type: multipart/mixed; boundary="outer-boundary"
>
> Outer preamble
>
> --outer-boundary
> Content-Type: text/plain
>
> Foo
>
> --outer-boundary
> Content-Type: multipart/alternative; boundary="inner-boundary"
>
> AAA
>
> --outer-boundary--
> Outer epilouge
> ============================
>
> What is the expected result for this?
>
> Should the parser recognize that the inner multipart never starts and 
> that it has instead found another outer boundary?
>
> At the moment mime4j gets stuck in a loop.
> With the fix I proposed in a comment on that issue, the result is 
> that mime4j considers:
> =============================
> AAA
>
> --outer-boundary--
> Outer epilouge
> =============================
> to be the preamble of the inner multipart, ignoring the 
> "--outer-boundary--" of the outer multipart.
>
>
> What is the correct/best way to handle bad multiparts like this?
With the old version of Mime4j the result would be:

===============================
AAA

===============================

this would be reported as the inner multipart's preamble while

===============================
Outer epilogue
===============================

would be reported as the outer multipart's epilogue.

Furthermore, IIRC, the old Mime4j would report an empty bodypart and 
empty epilogue for the inner multipart. I think this is a nice feature 
since the events generated by Mime4j would always represent a valid MIME 
message even if the input isn't entirely valid.


-- 
Niklas Therning
www.spamdrain.net