Posted to mime4j-dev@james.apache.org by Alejandro Valdez <al...@gmail.com> on 2009/08/18 15:03:21 UTC

Soft line breaks in quoted printable decoding

Hi list, I noticed that the class QuotedPrintableInputStream expects
that the input stream is encoded using CRLF as line terminator (in
particular, when dealing with quoted printable soft line breaks), but
I have found that some e-mails are encoded using LF as line
terminator. In such cases QuotedPrintableInputStream leaves in the
decoded output a =LF for each line with a soft line break.

I tried to find which RFC specifies that a MIME e-mail should use CRLF
as line terminator, but I could not find it...

I patched my QuotedPrintableInputStream class to identify =LF as a
soft line break. It works OK, but maybe I am missing something.
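
For illustration, here is a minimal sketch of the kind of change described
above, written as a standalone decoder rather than as a patch to mime4j. The
class name LenientQpDecoder and its decode method are hypothetical and are not
part of the mime4j API. The sketch assumes the whole input fits in a byte
array, and it passes malformed escapes through unchanged.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for illustration only; not part of mime4j.
final class LenientQpDecoder {

    private LenientQpDecoder() {}

    /**
     * Decodes quoted-printable bytes, treating both "=CRLF" and "=LF"
     * as soft line breaks. Hard line breaks are copied through as-is.
     */
    static byte[] decode(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length);
        int i = 0;
        while (i < in.length) {
            int b = in[i] & 0xFF;
            if (b != '=') {
                // Ordinary byte (including CR and LF of hard line breaks).
                out.write(b);
                i++;
                continue;
            }
            // Soft line break, CRLF form: drop "=\r\n".
            if (i + 2 < in.length && in[i + 1] == '\r' && in[i + 2] == '\n') {
                i += 3;
                continue;
            }
            // Soft line break, bare LF form: drop "=\n".
            if (i + 1 < in.length && in[i + 1] == '\n') {
                i += 2;
                continue;
            }
            // Escaped octet: "=" followed by two hex digits
            // (lower-case digits are accepted as a leniency).
            if (i + 2 < in.length) {
                int hi = Character.digit(in[i + 1], 16);
                int lo = Character.digit(in[i + 2], 16);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo);
                    i += 3;
                    continue;
                }
            }
            // Malformed escape: keep the "=" literally.
            out.write(b);
            i++;
        }
        return out.toByteArray();
    }
}
```

With this approach the inputs "foo=LFbar" and "foo=CRLFbar" both decode to
"foobar", so a message stored with LF line endings no longer leaves =LF in
the output.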

Re: Soft line breaks in quoted printable decoding

Posted by Robert Burrell Donkin <ro...@gmail.com>.
On Tue, Aug 18, 2009 at 9:43 PM, Oleg Kalnichevski<ol...@apache.org> wrote:
> Markus Wiederkehr wrote:
>>
>> On Tue, Aug 18, 2009 at 3:03 PM, Alejandro
>> Valdez<al...@gmail.com> wrote:
>>>
>>> Hi list, I noticed that the class QuotedPrintableInputStream expects
>>> that the input stream is encoded using CRLF as line terminator (in
>>> particular, when dealing with quoted printable soft line breaks), but
>>> I have found that some e-mails are encoded using LF as line
>>> terminator. In such cases QuotedPrintableInputStream leaves in the
>>> decoded output a =LF for each line with a soft line break.
>>>
>>> I tried to find which RFC specifies that a MIME e-mail should use CRLF
>>> as line terminator, but I could not find it...
>>>
>>> I patched my QuotedPrintableInputStream class to identify =LF as a
>>> soft line break. It works OK, but maybe I am missing something.
>>
>> Lines have to be terminated with CRLF if a message is transmitted via
>> SMTP (see RFC 5321 section 2.3.8.).
>>
>> As far as I know this only applies when the message is transferred. It
>> does not apply to the message itself, e.g. when it is stored on disk.
>>
>> So QuotedPrintableInputStream should probably accept other line endings I
>> guess.
>>
>> Other opinions?
>>
>> Markus
>
>
> I think QuotedPrintableInputStream should be lenient about line breaks and
> treat both CRLF and lone LF as valid line breaks.
>
> I am still planning to review (and possibly rewrite most of)
> QuotedPrintableInputStream class in order to address MIME4J-103 [1]. I could
> tackle the line break issue at the same time. I just do not know when I may
> get around to doing that.

cool

are there any volunteers out there who might be able to find time to
take this on (with oleg's help, of course ;-)?

- robert

Re: Soft line breaks in quoted printable decoding

Posted by Oleg Kalnichevski <ol...@apache.org>.
Markus Wiederkehr wrote:
> On Tue, Aug 18, 2009 at 3:03 PM, Alejandro
> Valdez<al...@gmail.com> wrote:
>> Hi list, I noticed that the class QuotedPrintableInputStream expects
>> that the input stream is encoded using CRLF as line terminator (in
>> particular, when dealing with quoted printable soft line breaks), but
>> I have found that some e-mails are encoded using LF as line
>> terminator. In such cases QuotedPrintableInputStream leaves in the
>> decoded output a =LF for each line with a soft line break.
>>
>> I tried to find which RFC specifies that a MIME e-mail should use CRLF
>> as line terminator, but I could not find it...
>>
>> I patched my QuotedPrintableInputStream class to identify =LF as a
>> soft line break. It works OK, but maybe I am missing something.
> 
> Lines have to be terminated with CRLF if a message is transmitted via
> SMTP (see RFC 5321 section 2.3.8.).
> 
> As far as I know this only applies when the message is transferred. It
> does not apply to the message itself, e.g. when it is stored on disk.
> 
> So QuotedPrintableInputStream should probably accept other line endings I guess.
> 
> Other opinions?
> 
> Markus


I think QuotedPrintableInputStream should be lenient about line breaks 
and treat both CRLF and lone LF as valid line breaks.

I am still planning to review (and possibly rewrite most of) 
QuotedPrintableInputStream class in order to address MIME4J-103 [1]. I 
could tackle the line break issue at the same time. I just do not know 
when I may get around to doing that.

Oleg

[1] https://issues.apache.org/jira/browse/MIME4J-103

Re: Soft line breaks in quoted printable decoding

Posted by Markus Wiederkehr <ma...@gmail.com>.
On Tue, Aug 18, 2009 at 3:03 PM, Alejandro
Valdez<al...@gmail.com> wrote:
> Hi list, I noticed that the class QuotedPrintableInputStream expects
> that the input stream is encoded using CRLF as line terminator (in
> particular, when dealing with quoted printable soft line breaks), but
> I have found that some e-mails are encoded using LF as line
> terminator. In such cases QuotedPrintableInputStream leaves in the
> decoded output a =LF for each line with a soft line break.
>
> I tried to find which RFC specifies that a MIME e-mail should use CRLF
> as line terminator, but I could not find it...
>
> I patched my QuotedPrintableInputStream class to identify =LF as a
> soft line break. It works OK, but maybe I am missing something.

Lines have to be terminated with CRLF if a message is transmitted via
SMTP (see RFC 5321 section 2.3.8.).

As far as I know this only applies when the message is transferred. It
does not apply to the message itself, e.g. when it is stored on disk.

So QuotedPrintableInputStream should probably accept other line endings I guess.
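
Until the decoder itself accepts bare LF, one possible workaround for messages
read from disk is to normalize line endings before decoding. The following is
only a sketch under that assumption: CrLfNormalizingInputStream is a
hypothetical class, not part of mime4j, and it rewrites every LF that is not
preceded by CR into CRLF. It would do the same to bare LFs inside binary
content, so it only makes sense for line-oriented text, and skip() and
available() are not adjusted in this sketch.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper for illustration only; not part of mime4j.
final class CrLfNormalizingInputStream extends FilterInputStream {

    private int previous = -1;   // last byte handed to the caller
    private boolean pendingLf;   // true right after a synthesized CR

    CrLfNormalizingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        if (pendingLf) {
            // Second half of a synthesized CRLF pair.
            pendingLf = false;
            previous = '\n';
            return '\n';
        }
        int b = in.read();
        if (b == '\n' && previous != '\r') {
            // Bare LF: return CR now and the LF on the next call.
            pendingLf = true;
            previous = '\r';
            return '\r';
        }
        previous = b;
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        // Route bulk reads through read() so normalization always applies.
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {
            int b = read();
            if (b < 0) {
                break;
            }
            buf[off + n++] = (byte) b;
        }
        return n == 0 ? -1 : n;
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would bypass the state kept above
    }
}
```

Such a stream would be wrapped around the raw input before it is handed to a
decoder that only understands CRLF. Existing CRLF pairs are left untouched, so
applying it to an already well-formed message changes nothing.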

Other opinions?

Markus