Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2014/08/13 20:27:12 UTC

[Bug 7073] New: Oddly-formed MIME-Version header prevents base64 decoding

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7073

            Bug ID: 7073
           Summary: Oddly-formed MIME-Version header prevents base64
                    decoding
           Product: Spamassassin
           Version: 3.4.0
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Libraries
          Assignee: dev@spamassassin.apache.org
          Reporter: dave@pifke.org

Created attachment 5227
  --> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5227&action=edit
False negative due to weird MIME-Version header

The attached message does not trigger any body or uri rules because it is not
being base64-decoded.

Removing the line after the MIME-Version: 1.0 header causes it to be properly
base64-decoded, and the rules trigger as expected.

I suspect the header is invalid according to the RFCs, however the original
message renders correctly in both alpine and the Android email client.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7073] Oddly-formed MIME-Version header prevents base64 decoding

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7073

Mark Martinec <Ma...@ijs.si> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #10 from Mark Martinec <Ma...@ijs.si> ---
(In reply to Mark Martinec from comment #1)
> The header section is supposed to end with an empty line - but, like several
> other mail message parsers, SpamAssassin also considers a header to end
> when a header field starting with '---' is encountered. This is not strictly
> correct, but on the other hand avoids a case where a separator line is
> missing and an entire message body is thus considered part of a header,
> and a body would then be considered empty.
> [...]
> So, garbage-in, garbage-out. Fixing one case would break the other.

Tentatively closing as WONTFIX.
Please re-open if disagreeing.


[Bug 7073] Oddly-formed MIME-Version header prevents base64 decoding

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7073

--- Comment #8 from Dave Pifke <da...@pifke.org> ---
Created attachment 5230
  --> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5230&action=edit
Additional sample #2


[Bug 7073] Oddly-formed MIME-Version header prevents base64 decoding

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7073

--- Comment #9 from Karsten Bräckelmann <gu...@rudersport.de> ---
(In reply to Dave Pifke from comment #6)
> >  From jpeterson@olxagmzmkt.com Wed Aug 13 13:47:10 2014
> >  >From dave  Wed Aug 13 13:47:10 2014

> This is indeed a raw, unharmed sample.  The relevant Exim configuration that
> wrote it is:
> 
> procmail_pipe:
>         driver = pipe
>         command = /usr/bin/procmail -d ${local_part}
>         user = ${local_part}
>         check_string = "From "
>         escape_string = ">From "

This seems to explain the weird first two lines. I am by no means an Exim
expert, but a glance at the docs for the check_string and escape_string
options tells me this:

Together, those form a string match-and-replace operation, performed on each
line.

This particular substitution matches the common From_ escape: In mbox format, a
line beginning with "From " marks the start of a new message. Thus, such string
in the body of a message must be escaped, to differentiate it from the mbox
format begin-of-message marker.

In a pipe to the procmail command for delivery, From_ escaping is

(a) unnecessary, since procmail does the escaping if (and only if) the delivery
target is in mbox format (no need for other formats), and

(b) slightly harmful in this case: The check/escape string option does not
differentiate between message headers and body, thus invalidating the already
existing From_ line, forcing procmail to add one of its own.
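
The effect described in (b) can be sketched in a few lines of Python. This is
only an illustration of a per-line prefix substitution, not Exim's code: the
function name escape_lines is hypothetical, and the body line starting with
"From here on" is made up for the example.

```python
def escape_lines(message: str,
                 check_string: str = "From ",
                 escape_string: str = ">From ") -> str:
    """Replace check_string with escape_string at the start of every line.

    Assumption: models the check_string/escape_string pair as a plain prefix
    replacement applied to all lines, headers and body alike.
    """
    out = []
    for line in message.split("\n"):
        if line.startswith(check_string):
            line = escape_string + line[len(check_string):]
        out.append(line)
    return "\n".join(out)


# The From_ line is taken from the sample; the body line is hypothetical.
MESSAGE = "\n".join([
    "From dave  Wed Aug 13 13:47:10 2014",
    "Envelope-to: dave@pifke.org",
    "",
    "From here on, this is body text.",
])

escaped = escape_lines(MESSAGE).split("\n")
for line in escaped:
    print(line)
```

Both the leading From_ line and the body line come out prefixed with ">",
because the substitution has no notion of where the header ends.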


So, again, that explains the weird first two lines. And it is entirely
unrelated to any other format breakage. It is in particular unrelated to the
broken MIME structure and headers which led to this bug report.

Herring. Red. But at least we know it's smelly fish, not spam...


[Bug 7073] Oddly-formed MIME-Version header prevents base64 decoding

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7073

Kevin A. McGrail <km...@pccc.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kmcgrail@pccc.com

--- Comment #3 from Kevin A. McGrail <km...@pccc.com> ---
Have we seen this broken behavior with ham, or is it a clear sign of spam, so
that a rule for this issue would be a spam indicator?

Regards,
KAM


[Bug 7073] Oddly-formed MIME-Version header prevents base64 decoding

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7073

--- Comment #2 from Karsten Bräckelmann <gu...@rudersport.de> ---
That's a badly broken MIME message. Some relevant headers and MIME structure
from attachment 5227:

  MIME-version: 1.0;
  ----_64617665407069666b652e6f7267_Content-type: text/html; charset=UTF-8
  Content-Transfer-Encoding: base64

  ----_64617665407069666b652e6f7267_
  Content-type: text/calendar; charset=UTF-8
  Content-Transfer-Encoding: base64

  ----_64617665407069666b652e6f7267_--

That first line, the MIME-Version, actually is (and is meant to be) part of the
general mail headers. The following line is utterly broken, being a mixture of
supposed overall mail headers, MIME structure and MIME-part headers. The first
two lines of the above should have been:

  Mime-Version: 1.0
  Content-Type: multipart/mixed; boundary="----_64617665407069666b652e6f7267_"

  ----_64617665407069666b652e6f7267_
  Content-type: text/html; charset=UTF-8

What is missing, then: a Content-Type header declaring the MIME boundary
(possibly with more headers following, like a forged Mailer), an empty line
(double newline) to separate the mail headers from the body, and a newline
after the boundary, where the sample's second line runs straight on.

That boundary is not supposed to be part of the general mail headers, but of
the MIME structure. The following Content-Type is the first MIME part's header.


I can see how dropping (or outright ignoring, in the case of the mentioned
MUAs) that line would base64 decode the payload, due to the following C-T-E
header. Also, those MUAs ignore bad trailing content at the end of the base64
blob, and ignoring it is entirely valid.

The base64 encoded blob does not in fact contain HTML, but plain text content.
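
This is easy to check against the one line of the blob quoted in comment #1.
A minimal sketch using Python's standard base64 module; only that single line
is decoded here, not the full attachment:

```python
import base64

# First line of the base64 payload, as quoted in comment #1.
FIRST_LINE = (
    "LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NCkV2ZXJ5b25lIGRlc2VydmVz"
)

decoded = base64.b64decode(FIRST_LINE)
text = decoded.decode("ascii")

# repr() makes the CRLF line ending visible.
print(repr(text))
```

The result is a row of dashes, a CRLF, and the words "Everyone deserves",
with no HTML markup in sight.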

At this point, SA simply seems to be stricter about that one broken header
than the MUAs mentioned.


However, there's another oddity in the attached sample (besides missing
Received, X-Spam and other headers) that makes me wonder if this is indeed a
raw, unharmed sample. The first four lines:

  From jpeterson@olxagmzmkt.com Wed Aug 13 13:47:10 2014
  >From dave  Wed Aug 13 13:47:10 2014
  Return-path: <jp...@olxagmzmkt.com>
  Envelope-to: dave@pifke.org

That second line should just not be there. Even though its formatting (double
space between address and date) looks better than the first one's...


Dave, is this the only instance of such headers you encountered? Can you (or
anyone else) dig up more samples or confirm? [1]

Can you explain the bad first two headers, and the missing ones? How much, if
at all, has this sample possibly been re-formatted, or transferred back and
forth? In particular with those latter oddities in mind.


[1] FWIW, the second MIME part (with the structure corrected) is an empty
text/calendar part, as recently mentioned on the users@ list. IIRC there was
no raw sample, so it is unclear whether the MIME structure was similarly
broken there, too.


[Bug 7073] Oddly-formed MIME-Version header prevents base64 decoding

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7073

--- Comment #4 from AXB <ax...@gmail.com> ---
(In reply to Kevin A. McGrail from comment #3)
> Have we seen this broken behavior with ham, or is it a clear sign of spam,
> so that a rule for this issue would be a spam indicator?
> 
> Regards,
> KAM

It's a broken msg indicator. Not something which happens regularly for either
spam or ham, and it doesn't rate a rule.


[Bug 7073] Oddly-formed MIME-Version header prevents base64 decoding

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7073

--- Comment #7 from Dave Pifke <da...@pifke.org> ---
Created attachment 5229
  --> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5229&action=edit
Additional sample #1


[Bug 7073] Oddly-formed MIME-Version header prevents base64 decoding

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7073

--- Comment #1 from Mark Martinec <Ma...@ijs.si> ---
  Date: Wed, 13 Aug 2014 10:39:11 -0400
  MIME-version: 1.0;
  ----_64617665407069666b652e6f7267_Content-type: text/html; charset=UTF-8
  Content-Transfer-Encoding: base64

  LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NCkV2ZXJ5b25lIGRlc2VydmVz

The header section is supposed to end with an empty line - but, like several
other mail message parsers, SpamAssassin also considers a header to end
when a header field starting with '---' is encountered. This is not strictly
correct, but on the other hand avoids a case where a separator line is
missing and an entire message body is thus considered part of a header,
and a body would then be considered empty.

In the above case the 'Content-Transfer-Encoding: base64' is no longer
considered part of a header section, so the rest of the body is not
decoded according to base64.
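
The effect can be sketched as follows. This is a simplified model of the
heuristic described above, not SpamAssassin's actual parser; the function name
split_header_body is hypothetical.

```python
def split_header_body(raw: str):
    """Split a raw message into (header_lines, body_lines).

    Assumption: the header ends at the first empty line, or just before the
    first line that starts with '---', whichever comes first.
    """
    lines = raw.split("\n")
    for i, line in enumerate(lines):
        if line == "":
            # Proper separator: drop it, the rest is body.
            return lines[:i], lines[i + 1:]
        if line.startswith("---"):
            # Heuristic: this line already belongs to the body.
            return lines[:i], lines[i:]
    return lines, []


SAMPLE = "\n".join([
    "Date: Wed, 13 Aug 2014 10:39:11 -0400",
    "MIME-version: 1.0;",
    "----_64617665407069666b652e6f7267_Content-type: text/html; charset=UTF-8",
    "Content-Transfer-Encoding: base64",
    "",
    "LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NCkV2ZXJ5b25lIGRlc2VydmVz",
])

headers, body = split_header_body(SAMPLE)
has_cte = any(
    h.lower().startswith("content-transfer-encoding:") for h in headers
)
print(headers)
print(has_cte)
```

With this model the header stops after the MIME-version line, so the
Content-Transfer-Encoding line lands in the body and no transfer encoding is
known for the payload.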

> I suspect the header is invalid according to the RFCs, however the original
> message renders correctly in both alpine and the Android email client.

So, garbage-in, garbage-out. Fixing one case would break the other.


[Bug 7073] Oddly-formed MIME-Version header prevents base64 decoding

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7073

Dave Pifke <da...@pifke.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dave@pifke.org

--- Comment #6 from Dave Pifke <da...@pifke.org> ---
Re #1:

> The header section is supposed to end with an empty line - but, like several
> other mail message parsers, SpamAssassin also considers a header to end
> when a header field starting with '---' is encountered. This is not
> strictly correct, but on the other hand avoids a case where a separator
> line is missing and an entire message body is thus considered part of a
> header, and a body would then be considered empty.

I was unable to reproduce this parsing behavior in any MUA to which I have
access.  A Subject: header occurring after a header like "---X-Foo: bar" was
still rendered in Gmail, Thunderbird, alpine, and the Android email client.
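
For comparison, the parser in Python's standard email package is similarly
lenient. It is not one of the MUAs tested above, and the addresses and header
values in this sketch are made up; it only shows that a general-purpose parser
may accept such a line as an ordinary header field and carry on.

```python
from email import message_from_string

# Hypothetical test message with a header field name starting with '---'.
RAW = "\n".join([
    "From: sender@example.com",
    "---X-Foo: bar",
    "Subject: test",
    "",
    "body",
])

msg = message_from_string(RAW)
subject = msg["Subject"]
odd_header = msg["---X-Foo"]

print(subject)
print(odd_header)
```

Here the Subject: header after the odd line is still found, and the odd line
itself is kept as a header field named "---X-Foo".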

Re #2:

> However, there's another oddity in the attached sample (besides missing
> Received, X-Spam and other headers) that makes me wonder if this is
> indeed a raw, unharmed sample. The first four lines:
>
>  From jpeterson@olxagmzmkt.com Wed Aug 13 13:47:10 2014
>  >From dave  Wed Aug 13 13:47:10 2014
>  Return-path: <jp...@olxagmzmkt.com>
>  Envelope-to: dave@pifke.org
>
> That second line should just not be there. Even though its formatting
> (double space between address and date) looks better than the first
> one's...

This is indeed a raw, unharmed sample.  The relevant Exim configuration that
wrote it is:

procmail_pipe:
        driver = pipe
        command = /usr/bin/procmail -d ${local_part}
        user = ${local_part}
        check_string = "From "
        escape_string = ">From "
        delivery_date_add
        envelope_to_add
        return_path_add

The above configuration hasn't been touched in 5+ years and was probably
cargo-culted from somewhere else, so I couldn't tell you if the extra lines
serve any valid purpose, but the samples are all ripped right out of my mail
spool.

> Dave, is this the only instance of such headers you encountered? Can you
> (or anyone else) dig up more samples or confirm?

I have two more samples from last week, will attach.


[Bug 7073] Oddly-formed MIME-Version header prevents base64 decoding

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7073

--- Comment #5 from AXB <ax...@gmail.com> ---

iirc, we have 

body MISSING_MIME_HB_SEP    eval:check_msg_parse_flags('missing_mime_head_body_separator')
describe MISSING_MIME_HB_SEP    Missing blank line between MIME header and body

score MISSING_MIME_HB_SEP 0.001 0.001 0.001 0.001

for such cases
